Implementing Consistency - Protocols for Data Replication and Cache Coherence

Various concrete consistency models have been described in [1], now it's time to discuss how to implement these models.

Consistency semantic is divided into two categories in [1] - "Coherence and Replication
are very similar concepts and deal with the same problem, but the
former is often used in hardware system (for example, in SMP Cache
system) and the later is often used in software system (for example, in
distributed files system and database system)".

Some other differences:
1. Coherence deals with data item at client side, while Replication deals with data item at server side
Coherence deals with relicated data item that has a corresponding one
in a reliable storage backend, but Replication doesn't have that.

We will discuss protocols for Data Replication and Cache Coherence in different sections.

Part 1 - Data Replication

keep replicated data item consistent in distributed storage system,
there are some core problems to think about: where and how to store
various replicas, what, when and how to propagate update to replicas?
how to keep consistency?

I - Replica Server & Replica Content Placement
- As close to client as possible to improve performance
- Adjust repica placement dynamicly according to access history

II - General Update Propagation Problems

1. What to Propagate?
- Notification (Invalid Message)
- Command (operation & parameters)
- Data (updated region)
- Change Log (to accumulate several updates into one)

2. When to Propagate?
- Pull (each replica asks for updates)
- Push (primary replica send updates to all secondaries)

3. How to Propagate?
- Unicasting (send updates to each replica one by one)
- Multicasting (send update message to all replicas at one time)

III - Replication Protocols

replication protocol aims to implement some specific consistency model,
the detail of the protocol is highly dependent on the target model.
Here we only describe protocols which implement models that are popular
in practical systems.

1. Sequential Consistency

1.1 Primary Based Protocol
- all write operations to a data item x is served by one special
replica called primary replica, this replica is responsible for
updating other replicas, client only interact with this special replica.

1. 2. Replicated Write Protocol - write operations are sent to each replica to execute:
- Active Replication, a total order of all write operations is required in order to make each replica execute the same order of write commands.
- Quorum Based, write operations only need to be executed on part of all replicas before return. It use votes to prevent write-write confilict and write-read conflict:

  • Suppose each data item has N replicas
  • To read a value, client must contact with at least Nr replicas
  • To write a data item, client must contact with at least Nw replicas
  • To ensure no WW and WR conflicts, Nr + Nw > N and Nw + Nw > N should be statisfied

2. Eventual Consistency Protocol

Two requirements should be meet for this kind of protocol:
- All update operations to a data item should reach and be executed all replicas at some time
- These operatioins should be executed in the same order

Some popular methods to ensure these requirements are:
- Use write set and read set (use Nw, Nr to control how many porcesses should be involved in write or read operation), thus update operations are ordered.
Expose data item version number to client, so when client accesses
data, it can pass known latest version number to server to implement
some client centric consistency model
- Limit update operation execution process, so write-write conflicts can be solved easily

- All upper protocols ignore process/node & communication failure, which would occur often in practical distributed system.
- Replication protocols than deal with failure, such as Paxos, Two-Phase Commit, are much more complicated.

Part 2 - Cache Coherence

Coherence Protocol is used to keep client side replicas consistent in
the context that a reliable data item exists in storage backend (it's
true for smp cache like hardware system and for distributed file system
cache like software system).

I. Core problems of Cache Coherence Protocol:

1. Coherence Detection Strategy
That is to say, when inconsistencies are actually detected. For example, in distributed database system, this detection can be performed:
- At the beginning of a transaction
- Parallel with the on going transaction but before it commits
- When transaction commits

2. Coherence Enforcement Strategy
That is to say, how all caches are kept consistent with each other. Generally, there are two methods
- Invalidating: if a data item is modified in one cache, other caches that hold the same data item will receive a invalidation notification.
- Propagating:
if a data item is modified in one cache, the update is propagated to
other caches that hold the same data item. So all replicas in cache are
update to the same version.

3. Cache-Server Consistency
That is to say, how to keep data items in cache and in storage server consistent with each other. There are serveral policies:
- Use Read Only Cache, modification can only be made on items in storage server, cache pull these updates
- Write Back, modifications to data items are accumulated at cache side, and are written to storage server at some other time.
- Write Through,
modifications are made at cache, and are also propagated to storage
server. In SMP system, bus or directory may be used to serialize all
operations. While in distributed file systems, an exclusive lock may be
needed for a data cache that can perform modification operations to
avoid write-write confliction.

II Implementing Cache Coherence

1. Implementation Mechanism

1.1 Directory Based
- In a directory-based system, some information about the data being
shared is placed in a common directory. The directory acts as a filter
through which the processor must ask permission to load an entry from
the primary memory to its cache. When an entry is changed the directory
either updates or invalidates the other caches with that entry.[2] It
is not as fast as snooping but more scalable.

1.2 Snooping
- In such system, the individual caches monitor address lines for
accesses to memory locations that they have cached. When a write
operation is observed to a location that a cache has a copy of, the
cache controller invalidates or update its own copy of the snooped
memory location.[2] Since it needs broadcasting, not very scalable.

2. Protocols

The various cache coherence protocols use Directory or Snooping mechanism to solve the three core problems described above:

2.1 Invalidation Protocols
- Write-Once
- Berkeley
- Illinois/MESI

2.2 Propagating Protocols
- Firefly
- Dragon

Most of the cache protocols in multiprocessors are supporting sequential consistency model, while in software distributed shared memory more popular are models supporting release consistency or weak consistency.

[5] Shared Memory Architecture, 2001, Chinese Higher Education Press, Weiwu Hu
([5] 共享存储系统体系结构, 2001, 中国高等教育出版社, 胡伟武)