Tuesday, 27 August 2013

Synchronization algorithm for exchanging data in the “Client – Server” model via REST API

Many mobile applications require to sync data with a server if they operate in the client – server model to exchange data with the central repository.

If the server serves up resources through a REST API, then all sync logic can be handled on the client side.
The sync logic is able to handle bi-directional sync between central server and multiple clients where only incremental changes apply on both sides with some conflict detection.

Each table (both server and client) participating in the sync process should include two additional fields: “Ts” as Number, “Deleted” as Boolean.

The Ts field is maintained by the server side. For example SQL Server automatically generates a unique incremental value on each insert or update. The Ts field is used to determine whether one record in the table was modified more recently than another so to download only the incremental changes. Also it will help to identify the new records created on the client as they will have no Ts value.
Please note: Ts in this algorithm is not an instant in time, typically expressed as a date and time of day. Rather, the Ts should be considered as an Update Sequence Number e.g. the number of data element that identifies the order in which the element was last modified. The Ts within a table start at “1” and then increase monotonically every time a record is created, modified, or deleted.

The Deleted field is used to mark records for deletion so the clients would remove such records from local repositories, whereas clients with initial sync would simply skip them during download.
Additionally the client data storage should include the “Dirty” field as Boolean.
The Dirty field designates the records changed locally. Once those records sent successfully to the server the Dirty flag reset.

The sync process will be initiated by the client either manually (pressing some Sync button) or periodically or by combination of push notifications about server changes and client change triggers.
The sync process first download the server changes then upload the client changes.
The sync procedure will process the table changes sequentially and in correct order (where parent child relationship exist) to avoid referential integrity conflicts.

Below is the sync algorithm:
  1. Initialize the list of tables required to sync
  2. /******* Download server changes **********/
    For each sync table:
  3. Get the latest Ts, e.g. using sqlite “select ifnull(ts, 0) from Table order by ts desc limit 1”
  4. Call the relevant RESTful service method to get server changes as collection for the table based on the provided latest timestamp (Ts)
  5. /* Insert or update the received changes into local storage */
  6. Iterate through each record in the server changes collection
    • Check if the server record with the same id already exists in the local storage
      • If so then check if the existing local record is dirty
        • If so then call some conflict resolution function
        • Reset dirty flag
      • Else replace the local record with the server record and reset the dirty flag
    • If the local storage doesn’t have the server record, then create it
    /******* Upload client changes **********/
    For each sync table:
  7. Get the list of local storage changes for inserts on the server
  8. Iterate through each record in the local changes for inserts
    • Insert the record on the server (using POST)
    • Get the Ts for the inserted record from the response
    • Update the record in the local storage with new Ts
    • Reset the dirty flag
  9. Get the list of local storage changes for updates on the server
  10. Iterate through each record in the local changes for updates
    • Update the record on the server (using PUT)
    • Get the updated Ts for the updated record from the response
    • Update the record in the local storage with new Ts
    • Reset the dirty flag
    /******* Post sync operation ************/
    For each sync table:
  11. Delete records where Deleted=true and Dirty=false

For the conflict resolution the easiest way is to update the records with the server version (server always wins) and at the end of sync show the conflict log.