SharePoint Online
The SharePoint Online Native Connector can automatically discover and synchronise content from all of the common SharePoint areas:
- Site Collections
- Sites
- Site Drives (Document Libraries)
- Site Drive Items (Files and Folders)
- Lists
- List Items
- List Item Attachments
Configuration Overview
The SharePoint Online configuration gives you granular control over what is synchronised. These controls filter which content is synchronised to the search index and is therefore discoverable by users in search. Importantly, by default, SharePoint permissions (users, groups, etc.) are applied to all synchronised content.
Any synchronised content can also be displayed in content widgets and block editor content listing blocks. A newly introduced Advanced Filter allows the content to be filtered further for display purposes in the content widget or the content listing block (including directory level filters etc.). This filtering affects display only: all content is still discoverable via search if users have permission to see it.
For more detailed information about how to configure the connector see:
Security
What is synchronised
Content is synchronised to the search index for the purpose of search discoverability only. This ensures that a search can cover all the relevant fields of synchronised documents (title, description, content).
The connector will only synchronise the SharePoint Sites specified in the config, and it is also possible to select which Lists or Document Libraries to include from each configured Site.
The connector will not synchronise personal OneDrives.
What is viewable
Only the basic synchronised information, such as title and summary, is presented in search results within Search, Content Widgets and Content Listing Blocks. Intranet users cannot load the entire document content within the intranet. Instead, when clicking a result, they are directed to the external system, which allows that system to perform its own security checks again before any document is accessed.
SharePoint Permissions (Groups, Shares)
Users will not be able to see synchronised documents in Search, Content Widgets and Content Listing Blocks if they do not have access to those documents in SharePoint Online.
Permissions are applied to synchronised SharePoint Online content so that it behaves in the same way as when it is accessed directly within SharePoint Online. This means that within the intranet quick/general search, content widgets and content listing blocks on pages, users will only see the content that they would be able to see and open within SharePoint Online. This includes content that has been explicitly shared with them or with a group that they are part of.
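The visibility rule described above can be pictured as a simple filter over search results. This is a minimal sketch only: the `SyncedDocument` type, its fields and the `visible_results` function are hypothetical names chosen for illustration, and the connector's real permission model is not documented here.

```python
from dataclasses import dataclass, field


@dataclass
class SyncedDocument:
    """Hypothetical index entry carrying the principals copied from SharePoint."""
    title: str
    allowed_users: set[str] = field(default_factory=set)
    allowed_groups: set[str] = field(default_factory=set)


def visible_results(docs, user_id, user_groups):
    """Keep only documents shared with the user directly or via one of their groups."""
    return [
        doc for doc in docs
        if user_id in doc.allowed_users or doc.allowed_groups & user_groups
    ]


docs = [
    SyncedDocument("Policy", allowed_groups={"hr"}),
    SyncedDocument("Budget", allowed_users={"u2"}),
    SyncedDocument("Memo", allowed_users={"u1"}),
]
# User "u1" is a member of "hr": sees "Policy" (group) and "Memo" (direct share).
print([doc.title for doc in visible_results(docs, "u1", {"hr"})])
```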
Microsoft APIs
The majority of the HTTP requests made use the Graph API.
However, not all of the functionality the connector requires is available via the Graph API alone, so the connector also makes REST requests to the SharePoint Online API for the purpose of:
- Identifying if a Site has unique role assignments or inherits them
- For each Site being synchronised fetch the role assignments (groups, users)
- Identifying if a Site List has unique role assignments or inherits them
- For each Site List being synchronised fetch the role assignments (groups, users)
- Identifying if a Site List Item has unique role assignments or inherits them
- For each Site List Item being synchronised fetch the role assignments (groups, users)
- Identifying if a Site Page has unique role assignments or inherits them
- For each Site Page being synchronised fetch the role assignments (groups, users)
- Fetching a list of Site List Item Attachments
- Downloading Site List Item Attachments
- Fetching a list of Site Pages
- For any site group role assignment, identify which users are part of that group
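To make the list above more concrete, the sketch below builds URLs for a few of these lookups using the standard SharePoint REST paths. The exact requests the connector issues are not documented here, so treat the choice of endpoints and query options as assumptions; the function names and the example site URL are hypothetical.

```python
def site_permission_urls(site_url: str) -> dict[str, str]:
    """URLs for checking and reading role assignments on a site (web)."""
    base = site_url.rstrip("/") + "/_api/web"
    return {
        "has_unique": f"{base}/HasUniqueRoleAssignments",
        "role_assignments": f"{base}/roleassignments?$expand=Member,RoleDefinitionBindings",
    }


def list_item_urls(site_url: str, list_id: str, item_id: int) -> dict[str, str]:
    """URLs for a list item's permission check and its attachments."""
    item = f"{site_url.rstrip('/')}/_api/web/lists(guid'{list_id}')/items({item_id})"
    return {
        "has_unique": f"{item}/HasUniqueRoleAssignments",
        "attachments": f"{item}/AttachmentFiles",
    }


def group_members_url(site_url: str, group_id: int) -> str:
    """URL listing the users that belong to a site group."""
    return f"{site_url.rstrip('/')}/_api/web/sitegroups({group_id})/users"


# Hypothetical site URL, for illustration only.
print(site_permission_urls("https://contoso.sharepoint.com/sites/hr")["has_unique"])
```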
Encrypted Credentials
The credentials of a connector are encrypted during save using the latest industry-recommended encryption algorithms and best practices. Once saved and encrypted, the credentials (encrypted or unencrypted) cannot be retrieved by any end user, power user or administrator of the system. For more details see: Credentials
Specifying which content to synchronise
Every SharePoint Online native connector has many configuration options that allow you to fine-tune which content you want to synchronise (or not).
For sites, you can choose to synchronise All sites, or a list of Specific sites only. If you choose the All sites mode, you can also explicitly exclude specific sites.
For any synchronised site, it is also possible to choose which Drives (Document Libraries) or Lists you want to synchronise. Just like sites, you can choose to synchronise All (with a specific set to exclude) or Specific Drives and/or Lists.
It is possible to configure the Drive (Document Library) filters and the List filters at a global level, so that they automatically apply to all sites. Alternatively, you can combine global filters with overrides of those filters for specific sites.
It is not currently possible to filter which files from a Drive (Document Library) or which list items from a List are synchronised. I.e. if a drive or list is configured to be synchronised, then all of the items within it will be synchronised. Support for this additional level of filtering is planned for a future release.
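The interaction between the All/Specific modes, exclusions, global filters and per-site overrides can be sketched as follows. This is an illustrative model only: the `Filter` and `ConnectorConfig` classes and their fields are hypothetical and do not reflect the connector's real configuration schema.

```python
from dataclasses import dataclass, field


@dataclass
class Filter:
    """mode "all": names are exclusions; mode "specific": names are inclusions."""
    mode: str = "all"
    names: set[str] = field(default_factory=set)

    def allows(self, name: str) -> bool:
        if self.mode == "all":
            return name not in self.names
        return name in self.names


@dataclass
class ConnectorConfig:
    sites: Filter
    global_drives: Filter
    # Per-site overrides replace the global drive filter for that site only.
    site_drive_overrides: dict[str, Filter] = field(default_factory=dict)

    def should_sync_drive(self, site: str, drive: str) -> bool:
        if not self.sites.allows(site):
            return False
        drive_filter = self.site_drive_overrides.get(site, self.global_drives)
        return drive_filter.allows(drive)


config = ConnectorConfig(
    sites=Filter("all", {"Archive"}),                 # all sites except Archive
    global_drives=Filter("all", {"Drafts"}),          # all drives except Drafts
    site_drive_overrides={"HR": Filter("specific", {"Policies"})},
)
print(config.should_sync_drive("Sales", "Documents"))  # global filter applies
print(config.should_sync_drive("HR", "Documents"))     # override applies
```

Because every item in a synchronised drive or list is included, the sketch stops at the drive level and has no per-file filter.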
General Search Filters and Taxonomy
For synchronised content, a set of default taxonomy values is automatically recorded. When configured, these values provide additional filters within the general search, and can also be used in the advanced filter in the content widget and content listing block.
For lists and their list items, it is also possible to synchronise data from custom columns into taxonomy values, so that they can also be used in the general search filter and the advanced filter of the content widget and content listing block.
Taxonomy Values
Content Type
By default, each synchronised entity has one of the following content type taxonomy values, depending on its type:
- Site Collection
- Site
- Site Drive
- Site Drive File
- Site Drive Folder
- Site List
- Site List Item
- Site List Item Attachment
- Site Page
The text labels for these types can be customised using the SharePoint Online Connector Config.
Additional Taxonomy
For synchronised content, the following additional taxonomy values are also stored where relevant:
- Site Collection
- Site
- Site Drive
- Site List
- File Extension
- Directory
The text labels for these additional taxonomy values can be customised using the SharePoint Online Connector Config.
Icons
Default SharePoint Online style icons are used for synchronised content:
- Site Collection
- Site
- Site Drive
- Site Drive File (can be customised per extension)
- Site Drive Folder
- Site List
- Site List Item
- Site List Item Attachment (can be customised per extension)
All files are presented with the same icon by default, but it is possible to override this behaviour and define custom icons for different file extensions within the SharePoint Online Connector Config.
Different sync mechanisms for different types
Different entity types are synchronised in different ways (as supported by Microsoft Graph and the SharePoint REST API).
Drive Items are synchronised using a Delta URL which can report any changes since the last synchronisation session. The Delta URL is stored in the synchronised state of the associated Drive.
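A delta round can be sketched as a paging loop. Microsoft Graph delta responses carry a `value` array, an `@odata.nextLink` while more pages remain, and an `@odata.deltaLink` on the final page, which is the URL to store for the next session. The `collect_delta` function and the injected `fetch_page` callable are hypothetical; a real client would perform an authenticated HTTP GET there.

```python
from typing import Callable, Optional


def collect_delta(
    start_url: str,
    fetch_page: Callable[[str], dict],
) -> tuple[list[dict], Optional[str]]:
    """Follow nextLink pages until the deltaLink is returned."""
    items: list[dict] = []
    delta_link: Optional[str] = None
    url: Optional[str] = start_url
    while url:
        page = fetch_page(url)
        items.extend(page.get("value", []))
        delta_link = page.get("@odata.deltaLink", delta_link)
        url = page.get("@odata.nextLink")
    return items, delta_link


# Fake two-page response standing in for the Graph API.
pages = {
    "start": {"value": [{"id": "1"}], "@odata.nextLink": "page-2"},
    "page-2": {"value": [{"id": "2"}], "@odata.deltaLink": "delta-token-url"},
}
changes, next_delta_url = collect_delta("start", pages.__getitem__)
print(len(changes), next_delta_url)
```

The returned delta link is what would be saved in the Drive's synchronised state for the next session.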
Note: It is also possible to synchronise List Items by Delta - but that feature is only in Beta for Microsoft Graph, so this has not yet been implemented for the Workplace Search SharePoint Connector.
Delta considerations
When performing a delta request (either an initial or a follow-up delta), the response can include the same item multiple times. This is documented behaviour in the Graph API, so the synchronisation process has to handle it and treat the last occurrence of an item as the most recent one.
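One way to honour the "last occurrence wins" rule is to key the changes by item id as they arrive. This is a minimal sketch, assuming each change is a dict with an `id` field as in Graph drive item responses; the `latest_per_item` name is hypothetical.

```python
def latest_per_item(changes: list[dict]) -> list[dict]:
    """Collapse repeated items so only the last occurrence of each id survives."""
    latest: dict[str, dict] = {}
    for change in changes:
        # A later occurrence overwrites the earlier one for the same id.
        latest[change["id"]] = change
    return list(latest.values())


changes = [
    {"id": "a", "name": "report-v1.docx"},
    {"id": "b", "name": "notes.txt"},
    {"id": "a", "name": "report-v2.docx"},
]
print(latest_per_item(changes))
```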
Delta synchronisation is generally available only for drive items, and has been in beta for list items for some time. For list items there is the added complexity of handling attachments, so this would need to be investigated and verified before implementing deltas for list items.
Handling invalidated delta urls
The Graph API docs indicate that an existing delta URL can become invalid at any time (returning a 410 Gone response).
If this happens, then we are forced to do a full resync of all of the child drive items.
To do this, a connector can forcibly update all of its children (using materialised path) to have Session deletion mode rather than Explicit. I.e. this bypasses the usual mechanism for updating synchronisation state and allows synchronisation to continue from where it had just failed (410).
Then, after a full resync of the drive items, any drive items that were synchronised automatically have their deletion mode set back to Explicit. Any that were not synchronised remain in Session deletion mode and are correctly deleted at the end of the synchronisation session.
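The recovery flow can be modelled in a few lines. This is a simplified, in-memory sketch: the `SyncedItem` type, the mode constants and `full_resync_after_410` are hypothetical names, and the real connector persists this state rather than holding it in a dict.

```python
from dataclasses import dataclass

EXPLICIT = "explicit"
SESSION = "session"


@dataclass
class SyncedItem:
    path: str                      # materialised path, e.g. "/site/drive/file"
    deletion_mode: str = EXPLICIT


def full_resync_after_410(
    state: dict[str, SyncedItem],
    drive_path: str,
    current_paths: list[str],
) -> dict[str, SyncedItem]:
    prefix = drive_path.rstrip("/") + "/"
    # 1. Flip every child of the drive (matched by path prefix) to Session mode.
    for item in state.values():
        if item.path.startswith(prefix):
            item.deletion_mode = SESSION
    # 2. Full resync: every item seen again goes back to Explicit mode.
    for path in current_paths:
        state[path] = SyncedItem(path, EXPLICIT)
    # 3. End of session: items still in Session mode were not seen, so drop them.
    return {p: i for p, i in state.items() if i.deletion_mode != SESSION}


state = {p: SyncedItem(p) for p in ["/site/d1/a.docx", "/site/d1/b.docx", "/site/d2/c.docx"]}
result = full_resync_after_410(state, "/site/d1", ["/site/d1/a.docx", "/site/d1/new.docx"])
print(sorted(result))  # b.docx is gone; the other drive is untouched
```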
Features
- Handling of 410 error responses
- Handling of throttling, appropriate delays (from header) and retries - for both the Graph and REST clients
- Diagnostic counters keeping track of how many times API methods are called
- Extensive logging
- Highly configurable
- Both basic and advanced (delta) synchronisation
- List item taxonomy
- List item avatar url path builder (configuration)
- Customisation of behaviours for sites and lists
- Default population of avatar asset ids based on file extensions (hard coded currently)
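The throttling behaviour listed above (delay taken from a response header, then retry) could look something like the following. This is a sketch under assumptions: it treats 429 and 503 as throttling responses, reads the delay from a `Retry-After` header expressed in seconds, and uses a hypothetical `send` callable in place of a real HTTP client. The connector's actual retry policy is not documented here.

```python
import time
from typing import Any, Callable


def call_with_retry(
    send: Callable[[], tuple[int, dict, Any]],
    max_attempts: int = 5,
    default_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, Any]:
    """Call send() until it is not throttled, waiting as the server requests."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status not in (429, 503):
            return status, body
        if attempt < max_attempts:
            # Assumes Retry-After is a number of seconds, not an HTTP date.
            sleep(float(headers.get("Retry-After", default_delay)))
    raise RuntimeError(f"still throttled after {max_attempts} attempts")


# Fake client: throttled once, then succeeds.
responses = iter([(429, {"Retry-After": "2"}, None), (200, {}, "ok")])
waits: list[float] = []
print(call_with_retry(lambda: next(responses), sleep=waits.append), waits)
```

Injecting `sleep` keeps the sketch testable without real delays.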
Creating Azure OAuth App and SharePoint
The SharePoint Online native connector uses both the MS Graph API and the SharePoint REST API.
For more information see:
Domain Mapping
If your SharePoint Online is configured with a different domain from the one associated with your intranet and users, you can configure a domain replacement. During user synchronisation, when a SharePoint Online user is being matched with a user in the intranet, the email address is altered (only for the purpose of user lookups) based on this configured domain replacement.
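As an illustration of the domain replacement, the sketch below rewrites only the domain part of an email address before it is used for a lookup. The `lookup_email` function, the shape of the `replacements` mapping and the example domains are hypothetical; the real configuration format is not shown here.

```python
def lookup_email(sharepoint_email: str, replacements: dict[str, str]) -> str:
    """Return the address to use when matching an intranet user."""
    local, separator, domain = sharepoint_email.rpartition("@")
    if not separator:
        return sharepoint_email  # not an email address; leave unchanged
    mapped = replacements.get(domain.lower(), domain)
    return f"{local}@{mapped}"


replacements = {"contoso.onmicrosoft.com": "contoso.com"}
print(lookup_email("jane.doe@contoso.onmicrosoft.com", replacements))
```

The original SharePoint address is left untouched; only the value used for the lookup changes.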