Synchronisation Session
Automatic Deletion
The synchronisation framework also supports automatic deletion of users or documents that were not synchronised during a synchronisation session. This means clients of the API do not have to keep track of what data they have previously added via the API. A client only needs to provide the most recent snapshot (users and docs) of the data that should be associated with the connector, and the framework automatically deletes any previously synchronised users or docs that were not provided in that latest snapshot (synchronisation session).
This automatic deletion feature requires a synchronisation session to be started, synchronisation of content to be performed, and then the synchronisation session to be ended. When the session ends, the framework checks for users or docs that were previously synchronised with their "DeletionMode" set to "Session". Any of these that were not synchronised during the current session are deleted.
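The start, synchronise, end sequence can be illustrated with a minimal in-memory sketch. The `SyncedItem` and `SyncStore` names and their methods are hypothetical and are not the framework's API; only the "DeletionMode" values come from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class SyncedItem:
    """Hypothetical stand-in for a synchronised user or document."""
    item_id: str
    deletion_mode: str = "Session"  # "Session" or "Explicit"


@dataclass
class SyncStore:
    """Hypothetical in-memory store (assumption: the real framework persists this)."""
    items: dict = field(default_factory=dict)
    seen: set = field(default_factory=set)
    session_open: bool = False

    def start_session(self) -> None:
        # Forget what was seen in any earlier session.
        self.seen = set()
        self.session_open = True

    def sync(self, item: SyncedItem) -> None:
        # Create or update the item and record that this session touched it.
        self.items[item.item_id] = item
        if self.session_open:
            self.seen.add(item.item_id)

    def end_session(self) -> list:
        """Delete Session-mode items not seen in this session; return their ids."""
        stale = sorted(
            item_id
            for item_id, item in self.items.items()
            if item.deletion_mode == "Session" and item_id not in self.seen
        )
        for item_id in stale:
            del self.items[item_id]
        self.session_open = False
        return stale


store = SyncStore()

# First session: the snapshot contains a, b and c.
store.start_session()
for doc_id in ("a", "b", "c"):
    store.sync(SyncedItem(doc_id))
store.end_session()

# Second session: b is missing from the latest snapshot.
store.start_session()
store.sync(SyncedItem("a"))
store.sync(SyncedItem("c"))
deleted = store.end_session()
```

In this sketch the client never issues a delete for "b". Leaving it out of the second snapshot is enough for it to be removed when the session ends.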
State and Explicit Deletion
Some data sources may have more advanced mechanisms for keeping track of content.
For example, the SharePoint Online native connector internally takes full advantage of the Microsoft Graph API and its delta feature. Whenever a synchronisation is performed, only the changes to Site Drive (Document Library) files and folders since the most recent delta are synchronised.
The delta token is stored in the state of the synchronised Site Drive (Document Library) document, which is persisted by the synchronisation framework. The next time that Site Drive is synchronised, the delta token is retrieved from state and a delta sync is performed.
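A rough sketch of the store-and-retrieve cycle for the delta token follows. It assumes, for illustration only, that state is a JSON string on the drive document and that `fetch_changes` is a caller-supplied function wrapping the source system's delta query. Neither is confirmed by the framework, and no real Microsoft Graph calls are made.

```python
import json
from typing import Callable, Optional


def sync_drive(drive_doc: dict, fetch_changes: Callable[[Optional[str]], tuple]) -> list:
    """Run one sync of a drive document and persist the new delta token.

    drive_doc: hypothetical dict with a "state" entry holding a JSON string or None.
    fetch_changes: hypothetical callable, token -> (changes, new_token).
    """
    state = json.loads(drive_doc.get("state") or "{}")
    token = state.get("delta_token")  # None on the first run, so a full listing
    changes, new_token = fetch_changes(token)
    state["delta_token"] = new_token
    drive_doc["state"] = json.dumps(state)  # stored for the next sync
    return changes


def fake_fetch(token: Optional[str]) -> tuple:
    # Stand-in for the source system: full listing without a token, delta with one.
    if token is None:
        return (["file-1", "file-2"], "token-1")
    return (["file-3"], "token-2")


drive = {"id": "drive-1", "state": None}
first = sync_drive(drive, fake_fetch)   # no token yet: everything is returned
second = sync_drive(drive, fake_fetch)  # token-1 is read back: only the delta
```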
Session vs Explicit Deletion Mode
It's important that any Site Drive files or folders synchronised during a delta sync are saved in the framework with the "DeletionMode" set to "Explicit". This means that those files or folders are not automatically deleted if they are not synchronised during a subsequent synchronisation session. Instead, deletion is handled by processing the delta and explicitly deleting those items (via the API) when the delta indicates that they should be deleted.
Sometimes the synchronisation state of an external system can become unknown. For example, it is documented that a Microsoft Graph API delta token could become invalid at any time. When that happens, it's no longer possible to rely on the delta sync to delete previously synchronised content. A full resync using session deletion mode is therefore required, at the end of which a new valid delta token is available for further syncs.
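The two paths described here (explicit deletion driven by the delta, and a fallback to a full resync when the token is rejected) might look roughly like this. `DeltaTokenInvalid`, `apply_delta`, `sync_with_fallback` and the shape of each change record are all hypothetical names chosen for illustration.

```python
class DeltaTokenInvalid(Exception):
    """Hypothetical error raised when the source rejects a stored delta token."""


def apply_delta(store: dict, changes: list) -> None:
    """Apply delta changes to a dict of id -> record (assumed change shape)."""
    for change in changes:
        if change.get("deleted"):
            # Explicit deletion: the delta itself says the item is gone.
            store.pop(change["id"], None)
        else:
            store[change["id"]] = {"id": change["id"], "deletion_mode": "Explicit"}


def sync_with_fallback(store: dict, token, fetch_delta, full_resync):
    """Return the token to keep for the next sync."""
    try:
        changes, new_token = fetch_delta(token)
    except DeltaTokenInvalid:
        # The delta can no longer be trusted; a full resync yields a fresh token.
        return full_resync()
    apply_delta(store, changes)
    return new_token


def fetch_ok(token):
    return ([{"id": "f1", "deleted": True}, {"id": "f3", "deleted": False}], "t2")


def fetch_bad(token):
    raise DeltaTokenInvalid(token)


store = {
    "f1": {"id": "f1", "deletion_mode": "Explicit"},
    "f2": {"id": "f2", "deletion_mode": "Explicit"},
}
new_token = sync_with_fallback(store, "t1", fetch_ok, lambda: "fresh")
fallback_token = sync_with_fallback(store, "stale", fetch_bad, lambda: "fresh")
```

Note that "f2" survives the delta sync even though it was not mentioned in the delta, which is the behaviour the "Explicit" deletion mode is meant to give.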
To support this scenario, the synchronisation framework provides a mechanism for resetting the synchronisation state of a document and all of its children (regardless of the Deletion Mode). Resetting the synchronisation state of a document (and its children) sets the following:
- Date Last Updated = null
  - So that any date comparison checks are ignored and all items are fully updated
- Deletion Mode = Session
  - So that once a follow up full resync session has finished, any documents that are no longer synchronised are automatically deleted
- State = null
  - So that any delta tokens or other state are not used (therefore a default full resync)
- Sync Session Deletion Candidate = true
  - The document and its children are all eligible for automatic deletion if the document is not synchronised during the synchronisation session
A follow-up full resync can then be performed. When that synchronisation session has completed, the document should have valid sync state (a delta token) to use for subsequent synchronisation sessions.
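As an illustration of the reset, here is a minimal sketch that applies the four property changes to a document and its children. The `Doc` class and `reset_sync_state` function are hypothetical, and locating children by materialised path prefix is an assumption borrowed from the child-deletion behaviour the framework describes, not a confirmed detail of how the reset is implemented.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Doc:
    """Hypothetical document record with the properties named in the text."""
    materialised_path: str
    date_last_updated: Optional[datetime] = None
    deletion_mode: str = "Explicit"
    state: Optional[str] = None
    sync_session_deletion_candidate: bool = False


def reset_sync_state(docs: list, root_path: str) -> int:
    """Reset the document at root_path and everything beneath it; return the count."""
    count = 0
    for doc in docs:
        if doc.materialised_path.startswith(root_path):
            doc.date_last_updated = None    # skip date comparison checks
            doc.deletion_mode = "Session"   # allow automatic deletion
            doc.state = None                # drop delta tokens and other state
            doc.sync_session_deletion_candidate = True
            count += 1
    return count


docs = [
    Doc("1/2/", datetime(2024, 1, 1), "Explicit", '{"delta_token": "t1"}'),
    Doc("1/2/3/", datetime(2024, 1, 2), "Explicit"),
    Doc("1/9/", datetime(2024, 1, 3), "Explicit", '{"delta_token": "t9"}'),
]
reset_count = reset_sync_state(docs, "1/2/")
```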
Automatic Deletion of Child Documents
The automatic deletion performed at the end of a synchronisation session also uses the "Materialised Path" and "Is Parent" properties to automatically delete candidate child documents of a parent ("Is Parent" = true) that is going to be deleted, i.e. a parent that has its Deletion Mode set to Session and is a Sync Session Deletion Candidate (it was not synchronised during the current synchronisation session).
This only applies to documents which have their "Is Parent" and "Materialised Path" properties set. Child documents do not have a direct reference to parent documents (i.e. there is no Parent Id property). Instead, the materialised path is used to calculate which documents are children of the parent document. This is more performant for the synchronisation framework, and also means that parent documents do not need to be synchronised in the framework before child documents.
The client is responsible for populating the materialised paths of documents in a unique and meaningful way. The materialised path should uniquely represent the hierarchical path to the document. For example, at its simplest, it can be a list of concatenated globally unique ids:
- "1001/2002/3003/4004/5005/6006/"
Where 1001 is the Id of the root document, 6006 is the Id of the current document, and the other Ids represent the hierarchical parents of each document in between.
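A client could build such a path with a small helper like the one below. The function name `build_materialised_path` is hypothetical, and rejecting ids that contain "/" is an assumption made here so that the separator stays unambiguous.

```python
def build_materialised_path(ancestor_ids: list, doc_id: str) -> str:
    """Join ancestor ids (root first) and the document's own id, each followed by "/"."""
    parts = [*ancestor_ids, doc_id]
    for part in parts:
        if not part or "/" in part:
            # Assumption: ids are non-empty and never contain the separator.
            raise ValueError(f"invalid id for a materialised path: {part!r}")
    return "".join(f"{part}/" for part in parts)


path = build_materialised_path(["1001", "2002", "3003", "4004", "5005"], "6006")
parent_path = build_materialised_path(["1001", "2002"], "3003")
```

Ending every segment with "/" means a path such as "1001/2002/3003/" is not a string prefix of an unrelated sibling such as "1001/2002/30039/".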
To continue the example, let's say that the synchronisation framework has detected that the document with id 3003 needs to be deleted (along with its children). The materialised path for the document with id 3003 is "1001/2002/3003/", so all documents with a materialised path that starts with "1001/2002/3003/" will automatically be deleted. E.g. "1001/2002/3003/4004/5005/6006/" will be deleted.
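The prefix match in this example can be sketched as follows. `paths_to_delete` is a hypothetical helper operating on plain strings; how the framework actually queries its store is not described here.

```python
def paths_to_delete(all_paths: list, parent_path: str) -> list:
    """Return every path equal to or nested under parent_path."""
    return [p for p in all_paths if p.startswith(parent_path)]


all_paths = [
    "1001/",
    "1001/2002/",
    "1001/2002/3003/",
    "1001/2002/3003/4004/",
    "1001/2002/3003/4004/5005/6006/",
    "1001/2002/7007/",
]
doomed = paths_to_delete(all_paths, "1001/2002/3003/")
remaining = [p for p in all_paths if p not in doomed]
```

Here `doomed` holds the parent itself and its two descendants, while the root, "1001/2002/" and the sibling "1001/2002/7007/" remain.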