3.1.1 Abstract Data Model

This section describes a conceptual model of possible data organization that an implementation maintains to participate in this protocol. The described organization is provided to facilitate the explanation of how the protocol behaves. This document does not mandate that implementations adhere to this model as long as their external behavior is consistent with that described in this document.

Figure 1: Abstract Data Model

The figure shows components as follows:

  • Similar shapes represent similar objects.

  • Shaded elements are leaf nodes.

  • Shaded boxes are objects that store file data.

  • Shaded circles are objects that contain pointers to other objects.

The protocol server maintains a tree of several kinds of data nodes to represent the current state of the file. The nodes of the tree collectively store information about the data elements of this protocol. A data element in this context is a portion of a file, or data about a file, that can be synchronized as a unit. All data elements are immutable.

The types of nodes used in this data model include:

File: The root item of the tree of nodes that collectively represent the current state of a file available on the server. This node maintains the current knowledge information as specified in section 2.2.1.13, and also refers to the Storage Index node (section 2.2.1.12.2).

  • Data Element Knowledge: The current Cell Knowledge (section 2.2.1.13.2) of the file.

  • Waterline Knowledge: The current Waterline Knowledge (section 2.2.1.13.4) of the file.

Storage Index: The node of the tree that maintains data for the Storage Index data element (section 2.2.1.12.2), and which refers to the current Storage Manifest data element (section 2.2.1.12.3) of the file. There is only one Storage Index data element for a file.

  • Storage Index DEID: The Extended GUID (section 2.2.1.7) of the Storage Index data element.

  • Storage Index SN: The current Serial Number (section 2.2.1.9) of the Storage Index.

  • Cell Manifest Mapping: The map of Cell IDs (section 2.2.1.10) to the data element identifiers of their Cell Manifests. When nodes refer to a Cell ID, this mapping allows that reference to be resolved to the Cell Manifest data element (section 2.2.1.12.4) that contains the data for the Cell.

  • Revision Manifest Mapping: The map of revision identifiers to the data element identifiers of their Revision Manifests. When other nodes refer to a revision identifier, this mapping allows that reference to be resolved to the Revision Manifest data element (section 2.2.1.12.5) that contains the data for the revision.
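
The two mappings held by the Storage Index can be pictured as simple lookup tables. The following is a minimal sketch, not a required implementation. The class name, field names, and method names are hypothetical, and identifiers and serial numbers are simplified to plain strings and integers rather than the Extended GUID and Serial Number structures defined elsewhere in this document.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class StorageIndex:
    # All identifiers are opaque strings here (an illustrative simplification).
    deid: str                   # Storage Index DEID
    serial_number: int          # Storage Index SN, simplified to an integer
    storage_manifest_deid: str  # reference to the current Storage Manifest
    # Cell ID -> data element identifier of the Cell Manifest
    cell_manifest_mapping: Dict[str, str] = field(default_factory=dict)
    # Revision identifier -> data element identifier of the Revision Manifest
    revision_manifest_mapping: Dict[str, str] = field(default_factory=dict)

    def resolve_cell(self, cell_id: str) -> Optional[str]:
        """Resolve a Cell ID reference, or return None if it is unknown."""
        return self.cell_manifest_mapping.get(cell_id)

    def resolve_revision(self, revision_id: str) -> Optional[str]:
        """Resolve a revision identifier reference, or return None if unknown."""
        return self.revision_manifest_mapping.get(revision_id)


# Hypothetical sample data.
index = StorageIndex(deid="si-1", serial_number=7, storage_manifest_deid="sm-1")
index.cell_manifest_mapping["cell-A"] = "cm-10"
index.revision_manifest_mapping["rev-3"] = "rm-42"
```

Because every other node refers to cells and revisions only by identifier, replacing a Cell Manifest or Revision Manifest in this sketch requires updating a single mapping entry rather than every referring node.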

Storage Manifest: The node of the tree that maintains data for the current Storage Manifest data element of the file. There is only one Storage Manifest data element for a file.

  • Storage Manifest DEID: The Extended GUID of the Storage Manifest data element.

  • Storage Manifest SN: The current Serial Number of the Storage Manifest.

  • Schema GUID: The GUID that identifies the schema of the user data and the organization of the nodes in the file. Users of this protocol would generally assign a unique Schema GUID for each different kind of document or file format, so that applications will know how to interpret the contents of the file.

  • Root Cell Set: The ordered set of root Cell ID references to the root cells for this file. This is the set of cells that are required directly by the file. The Cell Manifest Mapping of the Storage Index is used to resolve each Cell ID reference into the data element identifier of the Cell Manifest for the cell. It is possible, through references from objects, for additional cells to be used by a file even though they are not members of this Root Cell Set.

Cell Manifest: The Cell Manifest data element refers to a set of revision data for the cell. Users of this protocol would generally use cells to represent major divisions of the file that have limited interdependency.

  • Cell Manifest DEID: The Extended GUID of the Cell Manifest data element for this cell.

  • Cell Manifest SN: The current Serial Number of the Cell Manifest for this cell.

  • Current Revision: A Revision ID reference to the current Revision Manifest for this cell. The Revision Manifest Mapping of the Storage Index is used to resolve each revision identifier reference into the data element identifier of the Revision Manifest.

Revision Manifest: The Revision Manifest data element is part of a linked list which represents state information for a cell. This node also refers to Object Groups and root objects for the revision. Users of this protocol would generally build a linked list of revisions to represent the state of a cell as a series of incremental changes.

  • Revision Manifest DEID: The Extended GUID of the Revision Manifest data element for this revision.

  • Revision Manifest SN: The current Serial Number of the Revision Manifest for this revision.

  • Root Object Set: The set of root object identifier references to the root object nodes for this revision. The data pertaining to each object identifier is found in an Object Group Set of this same Revision Manifest, or in an Object Group Set of a prior Revision Manifest along the linked list. It is possible, through references from other objects, for an object to be used by a revision even though it is not a member of this Root Object Set.

  • Object Group Set: The set of data element identifier references to the Object Group (section 2.2.1.12.6) nodes for this revision.

  • Base Revision: A revision identifier reference to the prior revision node in the linked list. The Revision Manifest Mapping of the Storage Index is used to resolve each revision identifier reference into the data element identifier of the Revision Manifest of the prior revision. If all of the data for an object is unchanged, it is possible to rely on the data for that object stored by a prior revision somewhere along the linked list. This allows the protocol to be used for incremental updates to the file.
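
The lookup of object data along the linked list of revisions can be sketched as follows. This is an illustrative sketch under stated assumptions: the names `RevisionManifest`, `find_object`, and all sample identifiers are hypothetical, identifiers are plain strings, each Object Group is reduced to a dictionary from object identifier to data, and the cycle guard is a defensive addition that this document does not describe.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class RevisionManifest:
    revision_id: str
    base_revision: Optional[str]  # None marks the end of the linked list
    object_group_set: List[str] = field(default_factory=list)  # Object Group DEIDs
    root_object_set: List[str] = field(default_factory=list)


def find_object(object_id: str,
                revision_id: Optional[str],
                revision_mapping: Dict[str, str],
                manifests: Dict[str, RevisionManifest],
                object_groups: Dict[str, Dict[str, bytes]]
                ) -> Optional[Tuple[str, bytes]]:
    """Return (revision_id, data) from the newest revision that stores object_id."""
    seen = set()  # defensive guard against a malformed, cyclic list
    while revision_id is not None and revision_id not in seen:
        seen.add(revision_id)
        # Resolve the revision identifier through the Revision Manifest Mapping.
        manifest = manifests[revision_mapping[revision_id]]
        for group_deid in manifest.object_group_set:
            objects = object_groups[group_deid]
            if object_id in objects:
                return revision_id, objects[object_id]
        # Not stored by this revision: fall back to the prior revision.
        revision_id = manifest.base_revision
    return None


# Hypothetical sample data: rev-2 changes only the root object.
revision_mapping = {"rev-1": "rm-1", "rev-2": "rm-2"}
manifests = {
    "rm-1": RevisionManifest("rev-1", None, ["og-1"], ["obj-root"]),
    "rm-2": RevisionManifest("rev-2", "rev-1", ["og-2"], ["obj-root"]),
}
object_groups = {
    "og-1": {"obj-root": b"root v1", "obj-child": b"child v1"},
    "og-2": {"obj-root": b"root v2"},
}
```

In this sample, the unchanged object `obj-child` is found in the base revision even though the lookup starts at `rev-2`, which mirrors how an incremental update stores only the data that changed.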

Object Group: A data element that identifies a group of objects that can be referenced by revisions or other objects. Users of this protocol would generally group objects together when they tend to be modified together.

  • Object Group DEID: The Extended GUID of this Object Group data element.

  • Object Group SN: The current Serial Number of this Object Group.

  • Object Set: The set of object nodes included in this Object Group’s data element. Object nodes do not represent separate data elements, but are included in their containing Object Groups. There are two types of object nodes: those that store their user data internally, and those that refer to a separate data element for their user data.

Object with User Data stored internally: A node that is part of an Object Group's Object Set, and which contains user data and reference information for an Object Partition of an object. Users of this protocol would generally use this kind of object to enable references to other objects and cells.

  • Object ID: The Extended GUID (section 2.2.1.7) by which other nodes reference this object.

  • Object Partition ID: An 8-bit unsigned integer that identifies this Object Partition of the object. Users of this protocol would generally use Object Partitions to divide objects into pieces that tend to be updated at different times.

  • User Data: The stream of data for this Object Partition of this object. Contents are opaque to this protocol. Users of this protocol would generally use this stream to store portions of the contents of their file format.

  • Object Reference Set: The ordered set of object identifier references to other objects that this Object Partition of this object depends on. All of the objects in this set MUST be in the same cell as this object. The data pertaining to each object identifier is found in an Object Group Set of the Revision Manifest of the cell, or in an Object Group Set of a prior Revision Manifest along the linked list.

  • Cell Reference Set: The ordered set of Cell ID references to other cells that this Object Partition of this object depends on. The Cell Manifest Mapping of the Storage Index is used to resolve each cell identifier reference into the data element identifier of the Cell Manifest for the cell.
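
Because objects can refer to cells outside the Root Cell Set, the full set of cells used by a file is the transitive closure of the Root Cell Set over the Cell Reference Sets. The following sketch illustrates one way to compute it. The function name and the `cell_references` input are hypothetical: the input is assumed to be a precomputed union, per cell, of the Cell Reference Sets of the objects in that cell, and Cell IDs are plain strings.

```python
from typing import Dict, Iterable, List, Set


def reachable_cells(root_cell_set: Iterable[str],
                    cell_references: Dict[str, List[str]]) -> List[str]:
    """Return every cell used by the file, in breadth-first discovery order."""
    ordered: List[str] = []
    seen: Set[str] = set()
    pending = list(root_cell_set)  # start from the Root Cell Set
    while pending:
        cell_id = pending.pop(0)
        if cell_id in seen:
            continue  # already visited; also prevents loops on cyclic references
        seen.add(cell_id)
        ordered.append(cell_id)
        # Follow the Cell Reference Sets of the objects in this cell.
        pending.extend(cell_references.get(cell_id, []))
    return ordered


# Hypothetical sample data: cell-C and cell-D are not root cells.
cell_references = {"cell-A": ["cell-C"], "cell-C": ["cell-A", "cell-D"]}
used = reachable_cells(["cell-A", "cell-B"], cell_references)
```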

Object with User Data stored externally: A node that is part of an Object Group's Object Set, and which references user data for an Object Partition of an object. Users of this protocol would generally use this kind of object when the user data is large and there is no need for references to other objects or cells.

  • Object ID: The Extended GUID by which other nodes reference this object.

  • Object Partition ID: An 8-bit unsigned integer that identifies this Object Partition of the object. Users of this protocol would generally use Object Partitions to divide objects into pieces that tend to be updated at different times.

  • Data Blob Reference: A data element identifier reference to the Data Blob (section 2.2.1.12.8) node that contains the user data for this Object Partition of this object.

Data Blob: A data element that contains user data for an Object Partition of an object.

  • Data Blob DEID: The Extended GUID of this Data Blob data element.

  • Data Blob SN: The current Serial Number of this Data Blob.

  • User Data: The stream of data for an Object Partition of an object. Contents are opaque to this protocol. Users of this protocol would generally use this stream to store portions of the contents of their file format.
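
The difference between the two kinds of object nodes can be sketched as follows. This is a minimal illustration, not a prescribed layout. All class, field, and function names are hypothetical, identifiers are simplified to strings, and the Data Blob is reduced to an entry in a dictionary keyed by its data element identifier.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


def check_partition_id(value: int) -> None:
    """Object Partition IDs are 8-bit unsigned integers."""
    if not 0 <= value <= 0xFF:
        raise ValueError("Object Partition ID must be in the range 0..255")


@dataclass(frozen=True)
class InternalObject:
    """Object whose user data is stored inside its Object Group."""
    object_id: str
    partition_id: int
    user_data: bytes                           # opaque to the protocol
    object_reference_set: Tuple[str, ...] = ()  # objects in the same cell
    cell_reference_set: Tuple[str, ...] = ()    # other cells

    def __post_init__(self) -> None:
        check_partition_id(self.partition_id)


@dataclass(frozen=True)
class ExternalObject:
    """Object whose user data is stored in a separate Data Blob."""
    object_id: str
    partition_id: int
    data_blob_deid: str  # Data Blob Reference

    def __post_init__(self) -> None:
        check_partition_id(self.partition_id)


def read_user_data(obj: Union[InternalObject, ExternalObject],
                   data_blobs: Dict[str, bytes]) -> bytes:
    """Return the user data for an Object Partition, wherever it is stored."""
    if isinstance(obj, InternalObject):
        return obj.user_data
    return data_blobs[obj.data_blob_deid]


# Hypothetical sample data.
data_blobs = {"blob-9": b"large payload"}
inline = InternalObject("obj-1", 0, b"small payload",
                        object_reference_set=("obj-2",))
external = ExternalObject("obj-2", 1, "blob-9")
```

The frozen dataclasses reflect the statement that data elements are immutable: in this sketch a change is modeled by creating a new node rather than by modifying an existing one.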