Connector Framework

The connector framework is core to all the connector types, including Custom connectors and Native connectors.

Permissions

A critical feature of the connector framework is ensuring that users only see search results for content that they can actually access (that is, the same access they have in the external system).

This relies on two key components:

  • A user's External Identities (identity and group assignments)
  • A doc and the External Identities that have access to it (user identities and groups)

To support this in a generic way for all connectors, we have introduced a new External Identities concept for users. This does not interfere with any other existing permissions or groups (main organisations, additional organisations, power users, etc.) within Interact.

External Identities are also stored against indexed documents. As with users, this does not interfere with any existing permissions or groups within Interact.

When searches are performed, the existing security query (validating permissions and access) has been extended to take into account the External Identities of the user performing the query.

Search results are only returned for documents which have one or more of the external identities of the user (e.g. their user id or the id of a group/role that they are in).

Note: External identities are automatically scoped to a specific Workplace Search Connector, so they remain distinct from the external identities of any other Workplace Search App. This provides an increased level of isolation between connectors.
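
The matching rule described above can be sketched as follows. This is a minimal illustration only, not the framework's actual security query: the ExternalIdentity and IndexedDoc types, the connector_id field, and the visible_docs function are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExternalIdentity:
    connector_id: str  # hypothetical: scopes the identity to one connector
    value: str         # e.g. a user id, or the id of a group/role


@dataclass
class IndexedDoc:
    doc_id: str
    allowed: set = field(default_factory=set)  # set of ExternalIdentity


def visible_docs(user_identities, docs):
    """Return ids of docs that share at least one external identity with the user."""
    return [d.doc_id for d in docs if d.allowed & user_identities]


user = {
    ExternalIdentity("sharepoint", "user-42"),
    ExternalIdentity("sharepoint", "group-hr"),
}
docs = [
    IndexedDoc("policy", {ExternalIdentity("sharepoint", "group-hr")}),
    IndexedDoc("payroll", {ExternalIdentity("sharepoint", "group-finance")}),
    # Same raw value but a different connector, so it must not match.
    IndexedDoc("other", {ExternalIdentity("drive", "group-hr")}),
]

print(visible_docs(user, docs))
```

Because the connector is part of the identity itself, a group id from one connector can never grant access to a document indexed by another.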

Synchronised State

The connector framework allows state to be recorded against users or docs (within the context of a connector), which the connector can use to optimise the synchronisation process. For example, a connector can record a delta token so that it can query for changes to an item rather than fetching all content each time a synchronisation is performed.

User

For users, a collection of External Identities (for each connector) is associated with them, as well as various state values used by the framework.

Docs

For docs, there are several session values recorded against any document. Each of these values can be set at a document level and it is up to the connector implementation (or Custom Connector) to set these appropriately.

  • Deletion Mode
    • Session or Explicit
  • State
    • Arbitrary string content used by the connector to streamline synchronisation processes
  • Sync Session Deletion Candidate
    • Used by the begin/end synchronisation session logic to identify which previously synchronised documents are no longer being synchronised and can be removed from the search index
  • Is Parent
    • Used internally to improve the performance of the connector when a parent document (such as a list or library) is no longer returned and the framework needs to efficiently delete all of its child documents as well.
  • Materialized Path
    • Used internally to improve the performance of the connector when a parent document (such as a list or library) is no longer returned and the framework needs to efficiently delete all of its child documents as well. A child document's materialized path will always start with its parent's materialized path, making it easy to discover large quantities of child documents.
  • Allowed/Disallowed External Identities
    • Used to limit access to those users with the correct External Identities
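
To illustrate how Is Parent and Materialized Path can work together, here is a minimal sketch of a prefix-based cascade delete. The DocRecord type, the "/"-separated path format, and the delete_with_children function are assumptions made for the example. The framework's real storage and path format are not described here.

```python
from dataclasses import dataclass


@dataclass
class DocRecord:
    materialized_path: str          # assumed format: "/"-separated segments
    is_parent: bool = False
    deletion_mode: str = "Session"  # "Session" or "Explicit"
    state: str = ""                 # arbitrary connector-defined string


def delete_with_children(index, parent_id):
    """Remove a doc and, if it is a parent, every doc beneath its path."""
    parent = index[parent_id]
    removed = [parent_id]
    if parent.is_parent:
        # The trailing separator stops "/site/lib" matching "/site/library2".
        prefix = parent.materialized_path + "/"
        removed += [
            doc_id
            for doc_id, doc in index.items()
            if doc.materialized_path.startswith(prefix)
        ]
    for doc_id in removed:
        del index[doc_id]
    return removed


index = {
    "lib": DocRecord("/site/lib", is_parent=True),
    "a": DocRecord("/site/lib/a.docx"),
    "b": DocRecord("/site/lib/folder/b.docx"),
    "c": DocRecord("/site/library2/c.docx"),
}
removed = delete_with_children(index, "lib")
print(removed, sorted(index))
```

A single prefix match finds every descendant, however deeply nested, without walking the hierarchy level by level.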

Change Detection

Native connectors typically use one of two change detection and synchronisation strategies or, in the case of the SharePoint Online connector, a combination of both.

Basic

Native Connectors that utilise the basic change detection strategy use the Last Updated date to track whether documents have changed. Typically, documents synchronised using this strategy will also have their Deletion Mode set to Session. This means that any documents which have not been received during a synchronisation session (even if their Last Updated date has not changed) will be automatically removed once the synchronisation session is complete.
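
As a rough sketch of the basic strategy, the function below compares Last Updated dates and works out which documents would be added or updated and which would be removed at the end of the session. The plan_basic_sync name and the dictionary shapes are hypothetical, and this is not the framework's implementation.

```python
from datetime import datetime


def plan_basic_sync(indexed, received):
    """Plan a basic, session-mode sync.

    Both arguments map a document id to its Last Updated date.
    """
    to_upsert = sorted(
        doc_id
        for doc_id, updated in received.items()
        if doc_id not in indexed or updated > indexed[doc_id]
    )
    # Session deletion: anything not received this session is removed.
    to_delete = sorted(doc_id for doc_id in indexed if doc_id not in received)
    return to_upsert, to_delete


indexed = {
    "a": datetime(2024, 1, 1),
    "b": datetime(2024, 1, 1),
    "c": datetime(2024, 1, 1),
}
received = {
    "a": datetime(2024, 1, 1),  # unchanged
    "b": datetime(2024, 1, 5),  # modified since the last sync
    "d": datetime(2024, 1, 3),  # new document
}
to_upsert, to_delete = plan_basic_sync(indexed, received)
print(to_upsert, to_delete)
```

Note that "c" is removed even though its Last Updated date never changed, simply because it was not received during the session.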

Advanced (Delta)

Native Connectors which use the advanced change detection strategy will typically implement their own change detection. In this case, a connector will typically set the Deletion Mode of documents to Explicit. This means that the connector itself explicitly tells the connector framework when to remove a document from the index, and the document is not automatically removed when the synchronisation session ends. This allows for efficient usage of delta queries (just tracking changes) for connectors which support them, e.g. SharePoint Online and Drives.
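
The sketch below shows the general shape of a delta-based sync with explicit deletions. Everything in it is illustrative: FEED and fake_delta_query stand in for an external system's change feed, and the token is just a position in that feed. Real delta tokens are opaque values defined by the external system.

```python
from dataclasses import dataclass


@dataclass
class Change:
    doc_id: str
    deleted: bool = False


# Hypothetical change feed held by the external system.
FEED = [Change("a"), Change("b")]


def fake_delta_query(token):
    """Stand-in for an external delta API: return changes since `token`."""
    start = int(token) if token else 0
    return FEED[start:], str(len(FEED))


def apply_delta(index, token):
    """Apply only the changes since the last token; deletions are explicit."""
    changes, new_token = fake_delta_query(token)
    for change in changes:
        if change.deleted:
            index.discard(change.doc_id)  # explicit removal from the index
        else:
            index.add(change.doc_id)      # add or update
    return new_token                      # recorded as state for the next sync


index = set()
token = apply_delta(index, None)  # first sync sees everything
FEED.extend([Change("a", deleted=True), Change("c")])
token = apply_delta(index, token)  # later sync sees only the two new changes
print(sorted(index), token)
```

Because "b" is never re-sent in the second run, Session deletion would wrongly remove it, which is why Explicit mode is the natural fit here.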

Digestion

Digestion is the process of synchronising relevant content associated with a Workplace Search Connector.

For any tenant, each native connector is digested sequentially.

Digestion for a native connector is done in the following stages:

  1. Initialisation
    • Setup connection, etc.
  2. Begin Synchronisation Session for Users' External Identities
    • Flag any users that have Deletion Mode set to Session as Synchronisation Session Deletion Candidates
  3. Digest Users' External Identities
  4. End Synchronisation Session for Users' External Identities
    • Delete any users' external identities that are session deletion candidates
  5. Begin Synchronisation Session for Docs
    • Flag any docs that have Deletion Mode set to Session as Synchronisation Session Deletion Candidates
  6. Digest Docs
  7. End Synchronisation Session for Docs
    • Delete any session deletion candidate docs
  8. Finalise
    • Close connections, etc.
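
The begin/digest/end pattern used for both users' external identities and docs can be sketched as below. The Item type and the three functions are hypothetical. The sketch also assumes that receiving an item during digestion clears its candidate flag, which the stages above imply but do not state.

```python
from dataclasses import dataclass


@dataclass
class Item:
    deletion_mode: str = "Session"   # "Session" or "Explicit"
    deletion_candidate: bool = False


def begin_session(store):
    """Flag every Session-mode item as a deletion candidate."""
    for item in store.values():
        if item.deletion_mode == "Session":
            item.deletion_candidate = True


def digest(store, received):
    """Store received items; a fresh record is no longer a candidate (assumed)."""
    for item_id, mode in received.items():
        store[item_id] = Item(deletion_mode=mode)


def end_session(store):
    """Delete everything still flagged, i.e. not received this session."""
    stale = sorted(i for i, item in store.items() if item.deletion_candidate)
    for item_id in stale:
        del store[item_id]
    return stale


store = {"a": Item(), "b": Item(), "c": Item("Explicit")}
begin_session(store)
digest(store, {"a": "Session"})  # only "a" is returned by the connector
deleted = end_session(store)
print(deleted, sorted(store))
```

Here "b" is removed because it was not received, while "c" survives because its Deletion Mode is Explicit and it was never flagged.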

Initialisation and Finalise

The initialisation and finalise stages can be used to create any services in advance (and open/test any connections) that will be used during the digestion stages. They can also be used to handle any synchronisation state management that the connector wants to handle explicitly itself.

Digest Users

This stage is used to return users and their external identities.

Pairing neutrino users with connector users

Users are paired between external systems and neutrino via the same mechanisms that are used for other synchronisation features - based on usernames and emails.
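
As an illustration only, pairing on usernames and emails might look like the sketch below. The pair_users function, the dictionary field names, the case-insensitive comparison, and the choice to try the username before the email are all assumptions. The actual matching rules are those of the existing synchronisation features.

```python
def pair_users(internal_users, external_users):
    """Map external user ids to internal user ids by username, then by email."""
    by_username = {u["username"].lower(): u["id"] for u in internal_users}
    by_email = {u["email"].lower(): u["id"] for u in internal_users}

    pairs = {}
    for ext in external_users:
        match = by_username.get(ext["username"].lower())
        if match is None:
            match = by_email.get(ext["email"].lower())
        if match is not None:
            pairs[ext["id"]] = match
    return pairs


internal_users = [
    {"id": "n1", "username": "jsmith", "email": "j.smith@example.com"},
    {"id": "n2", "username": "akhan", "email": "a.khan@example.com"},
]
external_users = [
    {"id": "x1", "username": "JSmith", "email": "john@other.example"},
    {"id": "x2", "username": "amina", "email": "A.Khan@example.com"},
    {"id": "x3", "username": "nobody", "email": "nobody@example.com"},
]
pairs = pair_users(internal_users, external_users)
print(pairs)
```

External users with no match (such as "x3") are simply left unpaired in this sketch.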

External Identities

A user may have multiple external identities associated with them (user id, email, username, role id, group id, etc).

It is up to the connector to evaluate these comprehensively per user - while also evaluating the corresponding external identities for documents.

Digest Docs

This stage is used to return the docs and a relevant synchronisation result as either an Add/Update or Delete result.

During docs synchronisation, a connector also has a limited ability to influence the existing state of items beyond just returning the Add/Update or Delete results.

  • Error Message
    • Displayed in logs and useful for diagnosing synchronisation problems
  • Deletion Mode
    • Session or Explicit
  • Synchronise State
    • An arbitrary string that can be used by the connector however it likes. Can be used during future synchronisation jobs.

Any doc that is returned will include a list of External Identities that are allowed to access it.
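
To make the shape of a docs synchronisation result more concrete, here is a hypothetical sketch. The DocSyncResult type, its field names, and the source item keys ("readers", "etag", "removed") are invented for the example and do not reflect the framework's real types.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class ResultKind(Enum):
    ADD_OR_UPDATE = "AddOrUpdate"
    DELETE = "Delete"


@dataclass
class DocSyncResult:
    doc_id: str
    kind: ResultKind
    allowed_external_identities: list = field(default_factory=list)
    deletion_mode: str = "Session"       # "Session" or "Explicit"
    sync_state: Optional[str] = None     # arbitrary connector-defined string
    error_message: Optional[str] = None  # surfaced in logs for diagnosis


def digest_docs(source_items):
    """Yield one synchronisation result per item from the external system."""
    for item in source_items:
        if item.get("removed"):
            yield DocSyncResult(item["id"], ResultKind.DELETE)
        else:
            yield DocSyncResult(
                item["id"],
                ResultKind.ADD_OR_UPDATE,
                allowed_external_identities=list(item["readers"]),
                sync_state=item.get("etag"),
            )


results = list(
    digest_docs(
        [
            {"id": "doc-1", "readers": ["user-42", "group-hr"], "etag": "v7"},
            {"id": "doc-2", "removed": True},
        ]
    )
)
for result in results:
    print(result.doc_id, result.kind.value, result.allowed_external_identities)
```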