分享
 
 
 

RFC677 - Maintenance of duplicate databases

王朝other·作者佚名  2008-05-31
窄屏简体版  字體: |||超大  

Network Working Group Paul R. Johnson (BBN-TENEX)

RFC# 677 Robert H. Thomas (BBN-TENEX)

NIC # 31507 January 27, 1975

The Maintenance of Duplicate Databases

Preface:

This RFCis a working paper on the problem of maintaining duplicated

databases in an ARPA-like network. It briefly discusses the general

duplicate database problem, and then outlines in some detail a solution

for a particular type of duplicate database. The concepts developed

here were used in the design of the User Identification Database for the

TIP user authentication and accounting system. We believe that these

concepts are generally applicable to distributed database problems.

IntrodUCtion

There are a number of motivations for maintaining redundant,

duplicate copies of databases in a distributed network environment. Two

important motivations are:

- to increase reliability of data Access.

The accessibility of critical data can be increased by redundantly

maintaining it. The database used for TIP login and accounting is

redundantly distributed to achieve highly reliable access.

- to insure efficiency of data access.

Data can be more quickly and efficiently accessed when it is "near"

the accessing process. A copy of the TIP user ID database is

maintained at each site supporting the TIP login service to insure

rapid, efficient access. (Reliability considerations dictate that

this database be redundantly maintained, and efficiency

considerations dictate that a copy be maintained at each

authentication site.)

The design of a system to maintain redundant, duplicate databases is

a challenging task because of the inherent communication delay between

copies of the database, as well as the real world constraints of system

crashes, operator error, communication channel failure, etc. This paper

discusses some of the problems we encountered in designing such a

system, and outlines a system design for maintaining a particular type

of database which solves those problems.

The Model

A system for supporting duplicate copies of a database can be

modeled by a group of independent database management processes (DBMPs)

each maintaining its own copy of the database. These processes

communicate with each other over network communication paths. Each DBMP

has complete control over its copy of the database. It handles all

accesses to and modifications of the database in response to requests

from other processes. Though the DBMPs act only upon requests, in the

following they will often be said to be actually causing or originating

the modifications.

An important design consideration is that the communication paths

between the DBMPs are subject to failures. Thus one DBMP may have its

interactions with other DBMPs interrupted and/or have to wait until

communication paths are re-established before it can communicate with

other DBMPs. An assumption made in this paper about the communication

paths is that messages from one process to another are delivered in the

same order that they are sent. This is true of the ARPANET. For networks

that make no such guarantee, communication protocols between the DBMPs

can be used to correctly order the messages.

In order to proceed further, it is necessary to be more precise

about the nature of the duplicated database and the operations allowed

on it. A constant, read-only database is at one extreme. The task of the

DBMPs is trivial in this case. They simply respond to value retrieval

requests. At the other extreme is a general shared database where

functional modification requests (such as "X := f(X,Y,Z)") are allowed

and/or where it is necessary to completely restrict access to entries

while they are being modified. In this case all the problems of shared

databases on a single computer system arise (e.g., the need for

synchronization mechanisms and the resulting potential deadlock

situations), as well as those unique to having multiple database copies

distributed among independent computers. For example, a completely

general system must deal with the possibility of communication failures

which cause the network to become partitioned into two or more sub-

networks. Any solution which relies on locking an element of the

database for synchronized modification must cope with the possibility of

processes in non-communicating sub-networks attempting to lock the same

element. Either they both must be allowed to do so (which violates the

lock discipline), or they both must wait till the partition ceases

(which may take arbitrarily long), or some form of centralized or

hierarchical control must be used, with a resulting dependency of some

DBMPs on others for all modifications and perhaps accesses as well.

The Database

The type of database to be examined in this paper can be represented

as a collection of entries which are (Selector,Value) pairs. Each

selector is unique and the values are atomic entities as far as the

DBMPs are concerned. The mechanisms to be presented may be extended to

handle databases with greater structure - where the values may

themselves be collections of (selector,value) pairs - but this extension

will not be considered further here.

Four operations are to be allowed on the database:

1) Selection - given a selector, the current associated value is

returned.

2) Assignment - a selector and a value are given and the given value

replaces the old value associated with the selector.

3) Creation - a new selector and an initial value are given and a

new (selector,initial value) entry is added to the database.

4) Deletion- a selector is given and the existing (selector,value)

entry is removed from the database.

Note that value modification is limited to assignment. Functional

modification requests - such as "Change X to be Factorial(X)" - are

specifically ruled out. Allowing them would force the use of system wide

synchronization interlocks.

Consistency

The extent to which the copies of the database can be kept

"identical" must be examined. Because of the inherent delay in

communications between DBMPs, it is impossible to guarantee that the

data bases are identical at all times. Rather, our goal is to guarantee

that the copies are "consistent" with each other. By this we mean that

given a cessation of update activity to any entry, and enough time for

each DBMP to communicate with all other DBMPs, then the state of that

entry (its existence and value) will be identical in all copies of the

database.

Timestamps

We permit modifications to the database to originate at any of the

DBMPs maintaining it. These changes must, of course, be communicated to

the other DBMPs. To insure consistency, all of the DBMPs must make the

same decision as to which modification to a particular entry is to be

considered "final". It is desirable to select the "most recent" change.

However, since there is no way to absolutely determine the time sequence

of events in a distributed system without a universal, always accessible

sequence number generator (a network time standard should suffice),

"most recent" can only be approximated. We accomplish the approximation

by associating a timestamp with each modification and with each entry,

the latter being the timestamp of the modification which set its current

value.(1) Since the uniqueness of timestamps given out at more than one

----------

(1) Time is useful in this context because it has the desired properties

of being monotonically increasing, and of being available with a

reasonable degree of accuracy. Any other sequence numbering scheme with

these properties can be used, "time-of-day" was chosen because it is

simple to oBTain. Its main faults are that it is often manually set (and

thus prone to error), and it may stop during system service

interruptions. As computer systems learn to adapt to a network

environment, the use of an independent time source should become more

common. This is beginning to happen with the TENEX sites on the ARPANET.

DBMP can not be guaranteed, a "DBMP of origin" is included as part of

each timestamp. By (arbitrarily) ordering the DBMPs, we thus have a

means of uniquely ordering timestamps. Each

timestamp is a pair (T,D): T is a time, D is a DBMP identifier. For two

timestamps (T1,D1) and (T2,D2) we have the following:

(T1,D1)>(T2,D2) <=> (T1>T2) or (T1=T2 and D1>D2)

(T1,D1) is said to be "more recent" than (T2,D2)

If D1=D2 and T1=T2 it is assumed that the two modifications are

really two copies of the same modification request.

In order to insure the uniqueness of timestamps, it is necessary

that each individual DBMP associate different times with different

modifications. This is certainly possible to do, though the fineness of

the unit of time may restrict the frequency of modifications at a single

DBMP site.

Each entry in the data base is now a triple:

E ::= (S,V,T), where

S is the selector

V is the associated value

T is the timestamp (a Time,DBMP pair) of the last change to the

entry

The task of each DBMP is to keep its copy of the database up-to-

date, given the information on modifications that it has received so

far. At the same time it must insure that each of its entries stays

consistent with those of all the other DBMPs. This must be done despite

the fact that the order in which it receives modifications may be very

different from the order in which they are received by other DBMPs. In

the remainder of this paper we examine the allowable database operations

with respect to their effect on DBMP operation.

Assignment

Consider the case of assignment to an existing entry. When the

assignment is initiated (by a person or process) the DBMP involved makes

the change locally, and creates a copy of the modified entry and an

associated list of DBMPs to which the change must be sent. As the

modification is delivered to the other DBMPs, they are removed from the

list until no DBMPs remain. The copy of the modification is then

deleted. This distribution mechanism must be error free (i.e., receipt

of a modification must be positively confirmed before removing a DBMP

from the list of recipients).(2)

When a DBMP receives an assignment modification (either locally or

from another DBMP) it compares the timestamp of the modification with

the timestamp of the copy of the entry in its database and keeps

whichever is more "recent" as defined by the ordering given above. Thus

when all existing assignments to a given entry have been distributed to

all the DBMPs, they are guaranteed have the same "latest" value

associated with that entry.

Creation

Creation and deletion of entries pose more of a problem. Note that

the ability to create new, previously unknown entries requires that a

DBMP be able to handle assignments to unknown entries. For example,

consider the case of an entry XYZ created by DBMP A, and the following

sequence of events: DBMP A tells DBMP B about the new entry, and

subsequently B assigns a new value to XYZ; DBMP B then tells DBMP C

about the assignment before C has heard from A about the creation. DBMP

C must either save the assignment to XYZ until it hears about the

creation, or simply assume the creation will be coming and use the "new"

entry right away. The latter is more in the spirit of trying to keep the

database as "up-to-date" as possible and leads to no inconsistencies.

Deletion

Deletion of entries is the main problem for this database system.

If deletion is taken to mean immediate removal from the database, then

problems arise. Consider the following scenario:

XYZ is an entry known by all DBMPs.

XYZ is deleted at DBMP A.

XYZ is modified at DBMP B (before B is notified of the deletion

by A).

Now, consider a third DBMP, C. C may hear from DBMP B before DBMP A, in

which case XYZ ends up deleted at DBMP C. C may however hear from DBMP A

----------

(2) This same process (local modification and queuing for remote

distribution) is, of course, performed for the other possible operations

on the database. The details of how the local modification is done

safely, how the messages are queued, how confirmation of delivery is

done, etc., though important, will not be discussed here. The use of an

addressee list attached to the modification to be delivered is

conceptually easy to work with and not difficult to implement in

practice.

before DBMP B. In this case, if C removes XYZ from its database, then

the assignment to XYZ initiated by DBMP B will result in the re-creation

of XYZ at DBMP C. To prevent this C must remember that XYZ has been

deleted until it is "safe" to completely forget about XYZ.

One approach to this problem is to never actually remove an entry

from the database. Deletion just marks the entry as being deleted by

setting a "deleted" flag associated each entry. However, the problem of

receiving assignments to deleted entries still exists. For example, DBMP

A may receive an assignment from DBMP B to an entry which A has marked

as deleted. DBMP A can not tell whether B has not heard about the

deletion, or has heard about it but has also received a re-creation

request for the entry, which hasn't reached DBMP A. To enable A to

resolve such situations we include another timestamp in all entries: the

timestamp of the entry's creation. Thus in the above example, DBMP A can

compare the creation timestamps of the assignment and the deleted entry.

The one with the later creation timestamp is kept. Indeed whenever a

modification with an old creation timestamp is received it can be

ignored.

We now have a 5-tuple for entries and modifications:

E ::= (S,V,F,CT,T)

S is the selector

V is the associated value

F is the deleted/not-deleted flag

CT is the timestamp of creation

T is the timestamp of this (last) modification

Note that the values of the F, CT, and T components of a

modification uniquely specify the type of modification. Thus only the

5-tuple to become the new entry for a selector, not the type of

modification, need be communicated:

F = not deleted, CT = T => creation

F = not deleted, CT < T => assignment

F = deleted => deletion

The mechanism described above handles all the desired operations on

the distributed database in a way that guarantees the consistency of all

copies. A modification to the database will take effect at each DBMP as

soon as it receives the request from the DBMP originating the change.

A deficiency with this scheme is that deleted entries are never

removed from the database. A method which permits "garbage collection"

of deleted entries is discussed below.

Removal of Deleted Entries

The basic constraint is that a DBMP should not remove a deleted

entry until it will never receive any assignments with the same selector

(S) and the same or older create time (CT). If it fails to do this, then

it will be unable to distinguish these "out of date" assignments from

assignments to a newly created entry for the same S. To be able do

this, each DBMP must know for each deleted entry whether all other DBMPs

have heard about the deletion. To accomplish this, each DBMP could

notify the other DBMPs whenever it hears about a deletion. If these

notifications are transmitted in order with the "normal" sequence of

modifications, then upon receipt of such a notification a DBMP can be

sure that the sending DBMP has delivered any outstanding assignments to

the deleted entry, has marked it as deleted, and will not generate any

new assignments to it. Keeping track of, and exchanging messages about,

each individual deleted entry in this manner is, however, somewhat more

elaborate than necessary.

Having each DBMP deliver all its own modifications in sequential

order (by timestamp) allows the following simplification. We have all

DBMPs maintain a table of the timestamps of the last modification

received from each other DBMP. Any DBMP, say A, is then guaranteed to

have received all modifications originating at another DBMP, say B.

which have timestamps earlier than (or equal to) the entry for B in A s

copy of this table. If this table is exchanged between DBMPs, then all

DBMPs would have a second N*N (N= number of DBMPs) table where entry

(I,J) would be the timestamp of the last modification received by DBMP I

from DBMP J. Thus DBMP A can remove a deleted entry whose deletion

originated at DBMP K when all entries in the Kth column of this table at

DBMP A are later than or equal to the timestamp of the deleted entry.

When a DBMP receives a deletion modification, in addition to acting on

it and acknowledging receipt of it, the DBMP should also send its table

of last timestamps received to all other DBMPs. This is sent in a

timestamped message which is queued with the "normal" modification

messages.

A refinement to this approach, which reduces the amount of

information exchanged and the size of the tables, is to have the DBMPs

exchange only the oldest of the entries in the first table (of

timestamps of last modifications received from other DBMPs). These would

then be saved in a 1*N table, replacing the N*N table. A DBMP receiving

a modification with a timestamp equal to or older than the oldest

timestamp in its second table knows that the modification has been

confirmed as being received by all other DBMPs. A deleted entry can thus

be removed when its timestamp satisfies this condition. A DBMP would,

upon receipt of a deletion modification, queue up a message with this

"timestamp of oldest last modification received" for delivery to all

other DBMPs.

Summary of solution:

An entry in the database is a 5-tuple:

(S,V,F,CT,T) where

S is an selector used in all references to this entry.

V is its present value.

F is a deleted/undeleted flag.

CT is the timestamp of the creation of this entry.

T is the timestamp of the modification which set the current V

and/or F of the entry.

A timestamp is a pair (time,DBMP) where the DBMP identifies the

site at which the time was generated, and the DBMPs are

(arbitrarily) ordered, so that timestamps are completely

ordered.

A modification is a pair (Set-of-DBMPs,Entry) where Set-of-DBMPs is

the set of DBMPs to which the Entry has yet to be delivered.

An ordered (by timestamp) list of modifications is kept at each

DBMP. The DBMP periodically attempts to deliver modification

requests to those DBMPs which remain in the Set-of-DBMPs associated

with each modification. Modification entries are removed from this

list when they have been delivered to all DBMPs.

When a DBMP receives a modification request from another DBMP, it

compares the timestamps of the request with the timestamps of the

corresponding entry (if any) in its database, and acts upon or

disregards the new entry accordingly.

Each DBMP keeps a vector of the timestamp (T) of the last

modification received by it from each other DBMP.

When a DBMP receives a deletion modification, it sends a timestamped

message to all other DBMPs containing the oldest timestamp in its

vector of timestamps of last modification received. Each DBMP keeps

a second vector of the last of these timestamps received from each

other DBMP.

A deleted entry may be removed from the database when its timestamp

(T) is older than all the timestamps in this second vector of

timestamps received from other DBMPs.

Conclusion

This paper has presented techniques by which a number of loosely

coupled processes can maintain duplicate copies of a database, despite

the unreliability of their only means of communication. The copies of

the database can be kept "consistent'. However it is possible for

seemingly anomalous behavior to occur. For example a user may assign a

value to a selector using one DBMP, later use another DBMP and assign a

new value, and still later find that the first value is the one that

remains in the database. This can occur if the clocks used by the two

DBMPs for their timestamps are sufficiently out of synchrony that the

second assignment is timestamped as having taken place before the first

assignment. To the extent that the communication paths can be made

reliable, and the clocks used by the processes kept close to synchrony,

the probability of seemingly strange behavior can be made very small.

However, the distributed nature of the system dictates that this

probability can never be zero.

The major innovation presented here is the eXPlicit representation

of the time sequence of modifications through timestamps for both

modifications and entries. This enables the each DBMP to select the same

"most-recent" modification of an entry. In addition, timestamps enable

the DBMPs to decide when a deleted entry is no longer referenced (i.e.,

still active anywhere) and can be deallocated. These techniques should

have broader utility in building and modeling other systems of

concurrent, cooperating processes.

[ This RFCwas put into machine readable form for entry ]

[ into the online RFCarchives by Alex McKenzie with ]

[ support from GTE, formerly BBN Corp. 12/99 ]

 
 
 
免责声明:本文为网络用户发布,其观点仅代表作者个人观点,与本站无关,本站仅提供信息存储服务。文中陈述内容未经本站证实,其真实性、完整性、及时性本站不作任何保证或承诺,请读者仅作参考,并请自行核实相关内容。
2023年上半年GDP全球前十五强
 百态   2023-10-24
美众议院议长启动对拜登的弹劾调查
 百态   2023-09-13
上海、济南、武汉等多地出现不明坠落物
 探索   2023-09-06
印度或要将国名改为“巴拉特”
 百态   2023-09-06
男子为女友送行,买票不登机被捕
 百态   2023-08-20
手机地震预警功能怎么开?
 干货   2023-08-06
女子4年卖2套房花700多万做美容:不但没变美脸,面部还出现变形
 百态   2023-08-04
住户一楼被水淹 还冲来8头猪
 百态   2023-07-31
女子体内爬出大量瓜子状活虫
 百态   2023-07-25
地球连续35年收到神秘规律性信号,网友:不要回答!
 探索   2023-07-21
全球镓价格本周大涨27%
 探索   2023-07-09
钱都流向了那些不缺钱的人,苦都留给了能吃苦的人
 探索   2023-07-02
倩女手游刀客魅者强控制(强混乱强眩晕强睡眠)和对应控制抗性的关系
 百态   2020-08-20
美国5月9日最新疫情:美国确诊人数突破131万
 百态   2020-05-09
荷兰政府宣布将集体辞职
 干货   2020-04-30
倩女幽魂手游师徒任务情义春秋猜成语答案逍遥观:鹏程万里
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案神机营:射石饮羽
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案昆仑山:拔刀相助
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案天工阁:鬼斧神工
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案丝路古道:单枪匹马
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案镇郊荒野:与虎谋皮
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案镇郊荒野:李代桃僵
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案镇郊荒野:指鹿为马
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案金陵:小鸟依人
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案金陵:千金买邻
 干货   2019-11-12
 
推荐阅读
 
 
 
>>返回首頁<<
 
靜靜地坐在廢墟上,四周的荒凉一望無際,忽然覺得,淒涼也很美
© 2005- 王朝網路 版權所有