SharePoint Syntex (formerly part of Project Cortex) was released publicly at Microsoft Ignite 2020. Syntex is Microsoft’s first serious foray into infusing artificial intelligence functionality into SharePoint, the document management system many of us have come to use and (sort of) love in the corporate world over the last 15 years. It’s a critical tool for both managing documents and collaborating on content among teams. So with all the changes Microsoft has introduced into SharePoint in the last few years, the AI component was inevitable. And it’s a cool little tool – if you have the right use cases for it.
This blog post is dedicated to giving you the down-and-dirty intro to Syntex, including what it is, where it applies, and why you should even bother with it. There’s already a lot written about how to configure and set it up (not the least of which is provided by – ahem – Microsoft), so let’s focus on what matters from a business (and analyst) perspective.
SharePoint Syntex is an artificial intelligence tool used to automate the extraction of data from structured (or unstructured) documents. SharePoint sites can now use Syntex to conveniently upload documents and extract their data automatically into document libraries. Then you can use the library content – and data – as you normally would with any site information.
There’s a bit more to it than just being “SharePoint” – there is some back-end Azure and PowerApps stuff going on (e.g. the Form Processing Model), but generally speaking the Syntex part – and the general license – is for “unstructured” documents.
- Convenience
- Automation
- Saving Time
Syntex is really good for letting “SharePoint do the work” instead of you or your users having to add documents and update document library columns with metadata by hand. Those columns can be filled in based on data inside the document.
This is essentially a form of automation – reducing the manual work and letting a “smart” system do this bit for you.
Syntex can also be a really good time saver, especially if you’re bulk uploading similar documents and don’t want to spend hours classifying them.
In the long run, this may save you and your organization time and money.
You’ll have to buy a specific license from Microsoft in your M365 tenant – one for each user that intends to build, maintain and use the Syntex AI models. This can be done from the Admin Center in the Purchase Services screen listed under the Billing accordion in the Navigation area (the license is technically under the “Add Ons” category as of this blog post publishing). You’re looking at about $5 USD per month, per license, and it’s only available to existing E3 and E5 users. If you have over 300 Syntex licenses, you automatically receive 1 million AI Builder credits.
Microsoft advertises that you need a license for each user who is basically a “stakeholder” of Syntex. This is to say, if you have five people all building models and using Syntex content, they would each need a Syntex license. In reality, that doesn’t appear to be entirely correct. If you are a global admin and want to just use a document library with Syntex-provided content, it appears to work as normal without a Syntex license. I may not be able to build a model, but I can get to the content (although I am a Global Admin in my tenant). Either way, it’s best to get at least one license so you can actually build the AI models with your documents.
Here’s a tip – if you have a user in your tenant just for technical and configuration things (like a “service account” user), it’s probably more cost-effective to buy one license for this user account. It appears regular users can still enjoy the benefits of Syntex just by browsing/using the document library that Syntex is applied to. That, however, may change in the future.
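To put rough numbers on that tip, here is a quick back-of-the-envelope sketch in Python. It only uses the approximate $5 USD per license per month figure quoted above; the team size of 25 is a made-up example, and whether a single-license setup fits Microsoft’s licensing terms for your tenant is something you’d need to confirm yourself.

```python
# Back-of-the-envelope license math. The $5 USD/month price is the approximate
# figure quoted in this post; the team size below is purely hypothetical.
LICENSE_USD_PER_MONTH = 5


def annual_license_cost(licensed_users: int,
                        price_per_month: float = LICENSE_USD_PER_MONTH) -> float:
    """Return the yearly cost of licensing the given number of users."""
    return licensed_users * price_per_month * 12


team_size = 25  # hypothetical number of people who touch Syntex content

everyone_licensed = annual_license_cost(team_size)
service_account_only = annual_license_cost(1)

print(f"License everyone:     ${everyone_licensed:,.2f} per year")
print(f"One service account:  ${service_account_only:,.2f} per year")
print(f"Difference:           ${everyone_licensed - service_account_only:,.2f} per year")
```

Swap in your own head count to see how quickly the gap grows.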
Syntex will need a Content Center to actually build the processing models you want. But once they’re built, you can assign the models to document libraries on any of the sites you have access to in your tenant (as long as they’re modern team or communication sites).
Once that’s set up, you can visit the document libraries with the applied model and handle the processing at that location. Processing can be kicked off either manually or automatically, and Power Automate can be used to help automate this process.
After purchasing the license, you’ll have to assign it to your users.
Once that is done, you’ll need to perform a series of configuration steps in the Microsoft 365 Admin Center for your tenant.
Rather than duplicate that information here, I’ll point you to Microsoft’s step-by-step instructions at their Setup SharePoint Syntex Page on the Microsoft Documentation website.
There are basically four steps to getting a model created – and they’re all performed on your newly created Content Center SharePoint site.
- Add Example Files
Upload the documents you want Syntex to process. Do this one document type at a time.
- Classify Files and Run Training
Add labels to the example files so the model knows what type of document to classify it as. Provide explanations to refine that accuracy.
- Create and Train Extractors
Train the model to identify what to look for inside the files (“extractors”). Use explanations to refine that accuracy again.
- Apply Model to Libraries
Choose the document libraries across the sites in your tenant to use the new model.
Once the model is set up and applied to the library, users can add documents to the library as per usual. The user can then run the AI component manually (through a button in the GUI) or have Power Automate run it upon upload. I recommend getting familiar with Power Automate workflow creation and the associated triggers and actions, including reviewing your business processes and seeing where this automation might fit in.
How will it improve my current SharePoint Experience?
In testing Syntex, I found many good things that created genuine improvements to the SharePoint experience. After the initial configuration and learning the document processing model creation cycle, I can genuinely see how this might benefit organizations in the future – especially if said organizations are open to business change.
Automation of Repetitive Tasks
Rather than input document library metadata manually, Syntex basically allows users to upload documents and walk away. The model will fill in the library columns on its own. This saves time and effort.
A Little Bit of Auto-Pilot Mode for Information Architecture
The model design works with SharePoint to basically create new content types and site columns. All are available within the Content Type Gallery in the Admin Center. You can even let Syntex work with your Term Store to auto-tag documents based on your taxonomy. You can also auto-apply retention labels to the different content types as they come in.
Disseminate Information Faster
By tying Syntex models to document libraries, content and metadata are automatically indexed across your tenant to be searchable and usable. This means teams and business units can more easily find what they need.
Syntex Shortcomings
In my configuration and trial adventures with Syntex, it wasn’t all peaches ‘n cream. As with my usual experience with Microsoft 365, the vendor tends to over-market their products as simple, intuitive and far-reaching in terms of improving the digital workplace experience. There were a few things that left me wanting, both in terms of the user experience and as a consumer of this new product.
The accuracy of the documents you model is highly nuanced
As discussed, you could provide documents of many types and build highly trained, complex processing models with very accurate classifiers and extractors. But accuracy results will differ based on a number of factors. This is even more true on a customer-to-customer basis. Consumers need to be aware that what works for a colleague or a partner organization may not work as effectively for another. It's all about sandboxing, sandboxing, and more sandboxing.
Syntex has a user learning curve and requires ongoing commitment to maintenance
Microsoft brands Syntex as low-code, easy to learn, and quick to jump in. I beg to differ, even as an analyst with a technical background. It will take practice and a deep understanding of the model learning process, the user interface, and information architecture in general. For modeling to be successful, digital workplace teams will need to dedicate resources internally to learn, build, and curate processing models so that users can generate value out of Syntex and its capabilities.
Read the Fine Print for the Licensing Model of Syntex
The Syntex advertisements and marketing material have warts - or at least the way they're contextualized does. Signing up for a Syntex license only provides the Document Understanding processing model, not the Form Processing Model - although you may be led to believe it's all one and the same. These are two different technologies with different investment costs. Read the fine print, do your research, and make sure you know what you're getting with a Syntex license (and what you're not).
Good Use Cases and Not-So-Good Use Cases
As with any piece of software or functionality, there are limitations with Syntex – and when I say “Syntex”, I mean the Document Understanding Processing model, not the Form Processing Model (for reasons I will explain later). Most people would agree that automation of documents and their classifications is a good thing. But there are certain documents that make sense with Syntex, and some that don’t. Additionally, Form Processing is good for certain documents, but it doesn’t come with a basic Syntex license. Below is a list of documents you might find useful if Syntex is on your radar:
Document Types That Syntex Could Understand Quite Well
- Contracts and Letters, such as Operating Agreements, NDAs, Statements of Work
- Legal Documents
- Incorporation Documents
- Newspaper Articles
- Digital Images
- Accounting Documents *
- Shipping Documents *
Some Things That Might Not Play Nicely with Syntex
- Handwritten documents / Scribble Notes
- Resumes and CVs
- Meeting Minutes (without a Template)
- Accounting Documents *
- Shipping Documents *
50 Shades of Document Understanding
You’ll notice I put asterisks next to the Accounting and Shipping Documents, and they’re included on both lists. That’s not a mistake. These documents could work quite well – or not. The grey area lies in the “semi-structured” nature of these documents. That is, documents that are sort of visual in their layout but definitely have some labels and regional similarities every time. These can include Purchase Orders, Invoices, Journal Entries, Credit Memos, and other types of accounting documents.
It is worth just sandboxing with these documents as you are building your AI models and seeing what results work for you. You’ll need to be absolutely accurate with your labels and their regionality/settings – so don’t be afraid to get your fingers dirty with multiple rounds of testing. It may take a bit of time.
Why Am I Not Talking About the Form Processing Model?
As mentioned above, the Form Processing Model is great for using AI to look at structured documents with layouts, but the difference between wanting it and enabling it is about $500 USD/month. This is no small cost to businesses that are operating on tight budgets and don’t quite know if they need that level of AI power.
Additionally, the Form Processing Model is less of a structural integration into SharePoint than the Document Understanding Model. What I mean by this is that Form Processing is handled by the PowerApps AI Builder. You simply apply the Builder to a document library and boom! – the documents are scanned with more horsepower to handle structured content. But it doesn’t manage the content types, the content type columns, or the Power Automate part. You wouldn’t see your AI Builder in the SharePoint Admin Center in the Content Types Gallery, and you would certainly need to be familiar with PowerApps and the graphical user interface to maintain the model. Additionally, there is no sense in using Form Processing unless your goals include specific use cases around object, language, and key phrase detection or sentiment analysis. In other words, you’d probably want to know what exactly it is you want to do with AI technology before you commit to spending that kind of money – especially if the Document Understanding Processing Model in Syntex might be able to handle most of the basic document automation needs you encounter on the day-to-day.
The AI Builder in PowerApps is a wonderfully powerful tool and will probably be scaled in your Microsoft 365 experience in the future; but for document library metadata automation it’s likely overkill in most cases.
Making the Most of Syntex
So what really works and what doesn’t? I have sandboxed with SharePoint Syntex considerably over the last few weeks since its release. Here’s what I’ve found as the rules of thumb:
Ensure the right word markers, labels, and expected data are in predictable places. Regional consistency of your data in Syntex is the key to its success.
Helping the model understand what you want to scan is really on you. By familiarizing yourself with the Explanations features (Phrases, Patterns, and Proximities), you can make the Classifier and Extractor steps much less daunting, and dramatically improve your accuracy scores with your model.
For best results, mix one part document type with at least five parts of document samples. For extra accuracy, you’ll probably want to feed it at least 20. Document Understanding thrives on learning and finding patterns and consistencies. Microsoft recommends at least five, but don’t be afraid to give it more – especially for content that may vary in its structure.
The type of font and font size on your documents can impact the accuracy rate of the models you build. Make sure to use a clear, readable font that isn’t obscured or overlapped by other page elements. It’s best to ensure the documents you manage are set up for success before you bring them into Syntex altogether.
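If the three explanation types feel abstract, a rough Python analogy may help. To be clear, this is not a Syntex API and not how Syntex works under the hood; the function names, the sample sentence, and the date pattern are all made up for illustration. It only sketches the idea that a phrase explanation looks for known wording, a pattern explanation looks for a recognizable shape (like a date), and a proximity explanation ties the two together by distance.

```python
import re

# Hypothetical date shape for illustration (e.g. 2020-11-15).
DATE_PATTERN = r"\d{4}-\d{2}-\d{2}"


def phrase_hit(text, phrases):
    """Phrase idea: does any known phrase appear in the text?"""
    lowered = text.lower()
    return any(p.lower() in lowered for p in phrases)


def pattern_hits(text, pattern):
    """Pattern idea: find everything that has the expected shape."""
    return re.findall(pattern, text)


def proximity_hit(text, phrase, pattern, max_tokens=5):
    """Proximity idea: return the first pattern match found within
    max_tokens words after the phrase, or None if there isn't one."""
    tokens = text.split()
    phrase_tokens = phrase.lower().split()
    n = len(phrase_tokens)
    for i in range(len(tokens) - n + 1):
        window = [t.lower().strip(":,.") for t in tokens[i:i + n]]
        if window == phrase_tokens:
            for tok in tokens[i + n:i + n + max_tokens]:
                candidate = tok.strip(",.")
                if re.fullmatch(pattern, candidate):
                    return candidate
    return None


sample = "This Agreement has an Effective Date: 2020-11-15 between the parties."

print(phrase_hit(sample, ["Effective Date"]))
print(pattern_hits(sample, DATE_PATTERN))
print(proximity_hit(sample, "Effective Date", DATE_PATTERN))
```

The takeaway is the same as in the model builder: the more predictable the wording and the closer the value sits to its label, the easier the match.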
And if you are going to attempt modeling documents with handwriting, at least require print – don’t even bother with cursive or chicken scratch. You’re not going to enjoy the results.
This will be critical to undertake, as it requires not only a change of behavior (especially uprooting hard-baked SharePoint UX expectations), but also a change of “trust” in automation technology to essentially cover off menial tasks. Communicate, demonstrate, and have a method of taking in feedback or criticism on processing models so that they can be improved over the long run.
Having someone dedicated as the Syntex Model Admin is not just good governance, it’s good business. Your documents and what you need from them will change over time, and you’ll need to ensure you have the resources available to make sure those models are still effective.
Much like a car, you’ll need to maintain your models to get the best mileage – and keep driving in the right direction.
Templates that have been adjusted over time (e.g. new layouts) may cause some level of problems. You may need to build a new model to handle new document layouts.