What is enrichment? Creating wealth from information.

by Paul Maker

How do you unlock hidden wealth from your organisation’s information, and how do you do this without masses of human intervention? In fact, how do you do this without any human intervention? Now you’re intrigued, right?! This short series of blogs is going to show you some of the work that we are doing at Aiimi Labs in the information enrichment space with our product InsightMaker.

I will start by saying that this is not a sales pitch for InsightMaker. I will touch on the technology, but the real focus will be on the techniques we use and how they help our customers. So, with that out of the way, what is information enrichment and why might we want to do it?

Information enrichment is the process of taking either unstructured data (such as Microsoft Word documents and PDF files) or structured data (such as data from SAP or a CRM system) and adding context to it. This context typically takes the form of labels, metadata and classifications that we can use to better structure, navigate and use the information.

For example, we might extract all the site and asset details from CAD drawings so that we can automatically attach them to their SAP asset records, creating a unified world of structured and unstructured asset data. Or perhaps we might categorise emails arriving at a customer service centre and then route them automatically to the department best placed to handle them. We may even prioritise them based on sentiment analysis to improve our customer service KPIs.

So, how does this work technically?

The InsightMaker platform has connectors which pull information from source systems. Once we have a piece of information, for example a PDF invoice, we pass it through what we call an enrichment pipeline. The pipeline is configured with a series of enrichment steps, each with its own task, such as extracting key metadata from the invoice, which we can then associate with it.
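To make the idea of a pipeline of steps concrete, here is a minimal Python sketch. It is not InsightMaker code: the `Document` class, the two steps and the `run_pipeline` function are all hypothetical names invented for illustration, and the "invoice" rule is deliberately simplistic.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """A piece of content pulled from a source system, plus its enrichments."""
    source_id: str
    text: str
    metadata: Dict[str, object] = field(default_factory=dict)


# An enrichment step takes a document and returns it with added context.
EnrichmentStep = Callable[[Document], Document]


def add_word_count(doc: Document) -> Document:
    """Record how many whitespace-separated words the text contains."""
    doc.metadata["word_count"] = len(doc.text.split())
    return doc


def tag_document_type(doc: Document) -> Document:
    """Hypothetical rule: anything mentioning 'invoice' is tagged as an invoice."""
    is_invoice = "invoice" in doc.text.lower()
    doc.metadata["document_type"] = "invoice" if is_invoice else "unknown"
    return doc


def run_pipeline(doc: Document, steps: List[EnrichmentStep]) -> Document:
    """Pass the document through each configured step, in order."""
    for step in steps:
        doc = step(doc)
    return doc


pipeline = [add_word_count, tag_document_type]
enriched = run_pipeline(Document("inv-001", "Invoice 4471 for pump servicing"), pipeline)
print(enriched.metadata)
```

Because each step shares the same shape (document in, document out), steps can be added, removed or reordered through configuration without touching the others.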

Building the enrichment pipeline

In terms of enrichment steps, there are lots of different things that we have been researching and building in Aiimi Labs.

We started by focusing on extracting the text content from as many document types as possible. For this, we landed on the open source Apache Tika library. We had some teething troubles at the start around memory usage when running it at scale in the enterprise, so we modified it - now we have much more granular control over how it works.

We then progressed into Named Entity Recognition. Essentially, this is the ability to extract key business entities from information: for example, site codes, site addresses, asset numbers and so on. Interestingly, we have built a lot of IP in this space. In particular, we have focussed on how to run Named Entity Recognition at scale and at high speed - something that really matters if you are processing half a billion files (which, yes, we do for one of our customers - more on that another day).
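As a rough illustration of what extracting business entities can look like, here is a rule-based Python sketch using regular expressions. This is an assumption-laden toy, not the approach we use in InsightMaker: the entity formats ("SITE-0042", "AST004512") and the `extract_entities` function are made up for the example, and real-world Named Entity Recognition usually combines dictionaries, rules and statistical models.

```python
import re
from typing import Dict, List

# Hypothetical entity formats, for illustration only:
# site codes look like "SITE-0042", asset numbers like "AST004512".
ENTITY_PATTERNS: Dict[str, re.Pattern] = {
    "site_code": re.compile(r"\bSITE-\d{4}\b"),
    "asset_number": re.compile(r"\bAST\d{6}\b"),
}


def extract_entities(text: str) -> Dict[str, List[str]]:
    """Return the matches for each entity type, de-duplicated in order of appearance."""
    found: Dict[str, List[str]] = {}
    for label, pattern in ENTITY_PATTERNS.items():
        matches = pattern.findall(text)
        # dict.fromkeys keeps the first occurrence of each match and preserves order.
        found[label] = list(dict.fromkeys(matches))
    return found


sample = "Pump AST004512 at SITE-0042 was replaced by AST004513. SITE-0042 is now live."
entities = extract_entities(sample)
print(entities)
```

The extracted values can then be stored as metadata against the document, which is what makes it possible to link, say, a drawing to its asset record.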

From there, we ventured into classification, clustering, image recognition, extracting content from hidden databases and CAD drawings, identifying PII (used for achieving GDPR compliance) and payment card information, advanced fact extraction and more. These are all things that I will be talking about in more detail in the subsequent blogs in this series.
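To give a flavour of one of these steps, here is a small Python sketch of spotting payment card numbers in text. It pairs a regular expression for candidate digit runs with the Luhn checksum, a standard check that card numbers satisfy. The function names and the candidate pattern are assumptions made for this example rather than a description of how InsightMaker does it, and a real detector would need further checks to keep false positives down.

```python
import re
from typing import List

# Candidate card numbers: 13 to 19 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, then sum all digits."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> List[str]:
    """Return digit strings that look like card numbers and pass the Luhn check."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


# 4111 1111 1111 1111 is a widely used test card number; the order reference is not a valid card.
text = "Paid with 4111 1111 1111 1111, order ref 1234 5678 9012 3456."
cards = find_card_numbers(text)
print(cards)
```

The checksum matters because many long digit runs (order references, phone numbers, IDs) look like card numbers at first glance; filtering on it removes a large share of those before anything is flagged.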

Why bother with enrichment?

We believe information enrichment is a crucial enabler for organisations that want to extract value and wealth from the masses of information flowing through their core processes. Automating it in this way offers organisations the chance to unlock value that was previously impossible to liberate. After all, users would never manually label or classify content, and, even if they did, what about all that historic content that’s been growing for years across your networks and legacy systems? Food for thought!

Cheers, and see you soon for the second instalment of my 12 Days of Information Enrichment!

Paul, CTO at Aiimi

If you missed my blogs in the 12 Days of Information Enrichment series, you can catch up here.
