Paul Maker, CTO at Aiimi, introduces one of the major components of a great discovery solution - summarising what's in a set of documents. It's a bit of a technical challenge to get this right though; Paul explains how four different summarisation techniques provide the answer...

We've been designing and deploying case management and discovery solutions at Aiimi for a long time. But, over the years, a persistent challenge for users has always been the inability to quickly summarise a set of documents, or a case, so they can:

  1. rapidly prioritise it
  2. send it to the right team for processing
  3. answer questions relating to it
  4. look back and perform historical analysis

Over the last six months, the Aiimi Labs team have invested a significant amount of time in researching effective ways to summarise and navigate large sets of unstructured information, so that users can do all of this easily.

Unstructured collections of information often take the form of case files, GDPR DSAR (Data Subject Access Request) disclosures, or large and complex cases within CRM systems. And when we're talking about summarisation, that can even extend to finding relationships in content and data, helping to drive better search engine optimisation and recommendations.

Combining solutions for effective Text Summarisation

To help with the challenge of summarisation, Aiimi Labs (our in-house R&D team) started by researching several machine learning based technologies that would help us to provide users with a comprehensive summary over a large set of documents. These technologies included summarisation, phrase and topic extraction, named entity recognition and sentiment analysis.

First up - summarisation. This allows you to create a human-readable, concise summary of a large body of text. You can scale the size of the summary based on the input text, create summaries for sections of a document or email chain, and apply different algorithms suited to different types of documents.
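
To make the idea of a summary that scales with its input concrete, here is a minimal sketch of one common extractive approach: score each sentence by how frequent its words are across the whole text, then keep the top-scoring sentences in their original order. It is an illustration only, not necessarily the algorithm used in InsightMaker; the function names, the `ratio` parameter and the tiny stop-word list are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration.
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "for", "on",
    "with", "that", "this", "was", "are", "be", "as", "at", "by", "we", "our",
}


def split_sentences(text):
    # Naive splitter: break after ., ! or ? when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    # Lower-case words with stop words removed.
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def summarise(text, ratio=0.2, min_sentences=1):
    """Return roughly `ratio` of the sentences, chosen by average word frequency."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    freq = Counter(tokenize(text))

    # The summary length scales with the size of the input.
    n = min(len(sentences), max(min_sentences, round(len(sentences) * ratio)))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    # Restore document order so the summary still reads naturally.
    return " ".join(sentences[i] for i in sorted(ranked[:n]))
```

Calling `summarise(email_chain, ratio=0.1)` on a long email chain would keep about one sentence in ten, while a short note would fall back to `min_sentences`. Swapping the `score` function is one way to apply different algorithms to different types of documents.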

Then, how do you extract the core phrases and topics from a document? When people write, they tend to repeat the core concepts they are talking about, but identifying these efficiently and weeding out the noise is not as trivial as it sounds. We were able to build a robust algorithm that extracts these core items, removing the noise and leaving the user with a list of ‘phrases and topics’ that allows them to quickly understand exactly what a document is talking about.
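
The repetition idea can be sketched in a few lines. The example below is a simplified stand-in, not our production algorithm: it splits text into runs of content words, counts every short phrase within those runs, keeps only phrases that repeat, and drops any phrase that is fully explained by a longer one (so "water" disappears when "water meter" occurs just as often). The names `content_runs` and `key_phrases`, the thresholds and the stop-word list are assumptions for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration.
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "for", "on",
    "with", "that", "this", "was", "are", "be", "as", "at", "by", "we", "our",
}


def content_runs(text):
    """Yield runs of consecutive non-stop-words, never crossing punctuation."""
    for chunk in re.split(r"[^\w\s]+", text.lower()):
        run = []
        for word in chunk.split():
            if word in STOP_WORDS:
                if run:
                    yield run
                run = []
            else:
                run.append(word)
        if run:
            yield run


def key_phrases(text, top_n=5, min_count=2, max_words=3):
    # Count every phrase of 1..max_words words inside each run.
    counts = Counter()
    for run in content_runs(text):
        for size in range(1, max_words + 1):
            for i in range(len(run) - size + 1):
                counts[" ".join(run[i:i + size])] += 1

    # Repetition is the signal: anything seen fewer than min_count times is noise.
    repeated = {p: c for p, c in counts.items() if c >= min_count}

    # Drop a phrase if a longer repeated phrase contains it at least as often.
    kept = [
        p for p, c in repeated.items()
        if not any(p != q and f" {p} " in f" {q} " and repeated[q] >= c for q in repeated)
    ]
    kept.sort(key=lambda p: (-repeated[p], -len(p.split()), p))
    return kept[:top_n]
```

On a short complaint that mentions a water meter three times and says "wrong" twice, this returns `["water meter", "wrong"]`, which is the kind of at-a-glance list described above.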

Another piece of the summarisation puzzle is Named Entity Recognition - a branch of Natural Language Processing. NER centres on extracting entities such as people’s names, geopolitical regions, locations, organisations, and key words from text. Being able to extract this information, as with phrases and topics, gives another dimension to your summary. It allows you to look at a case or a DSAR discovery, for example, and understand the key people and locations it relates to, and any indicator words that may help with prioritisation.

Finally, the fourth element of our summarisation solution was Sentiment Analysis. This is crucial for prioritising inbound correspondence and making sure you can deal with particularly urgent and perhaps frustrated customers first. Interestingly, sentiment analysis can also be used as a check and measure on your outbound correspondence back to customers too.
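
As a rough illustration of sentiment-driven prioritisation, the sketch below scores each message against a small word list and sorts the inbox so the most negative correspondence comes first. This is a deliberately simple lexicon approach, assumed for the example rather than taken from InsightMaker; a real deployment would more likely use a trained model. The word lists and the function names `sentiment_score` and `prioritise` are hypothetical.

```python
import re

# Hypothetical, deliberately tiny lexicons for illustration only.
POSITIVE = {"thanks", "thank", "great", "happy", "pleased", "excellent", "helpful"}
NEGATIVE = {"angry", "frustrated", "unacceptable", "complaint", "terrible", "disappointed"}
NEGATORS = {"not", "never", "no"}


def sentiment_score(text):
    """Return positive hits minus negative hits, flipping a word that follows a negator."""
    words = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, word in enumerate(words):
        polarity = (word in POSITIVE) - (word in NEGATIVE)  # +1, -1 or 0
        if polarity and i > 0 and words[i - 1] in NEGATORS:
            polarity = -polarity  # "not happy" counts as negative
        score += polarity
    return score


def prioritise(messages):
    # Most negative first; sorted() is stable, so ties keep arrival order.
    return sorted(messages, key=sentiment_score)
```

The same `sentiment_score` function could be run over draft replies before they are sent, as a simple check on outbound tone.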

At Aiimi, we've brought together these technologies and incorporated them into our next-generation enterprise search and discovery platform, InsightMaker. The result of combining these techniques is that our users can now pull information out of a vast array of systems - from email platforms like Exchange and Office365, to Network Drives and SharePoint.

InsightMaker generates multi-dimensional summaries and then presents them to users in a really easy way, so they can quickly understand a case, navigate it, answer queries and support their advanced business intelligence and analytics.

So, that's a little taster of some of the exciting work we have been doing with AI and machine learning in the text summarisation space! I hope you found it interesting. You may be a data or information manager at a legal firm, a government agency, or any other organisation that's trying to better manage its case processing, drive up customer satisfaction, or more easily fulfil DSARs. If that is the case, then perhaps drop us a line and see if we can help you achieve those goals quickly.

Don't forget to check out full details of the new features coming to InsightMaker in our latest release, Oslo, by reading our release bulletin.

Until next time,

Paul