I think we can all agree we’ve seen dramatic changes in the way we live and work as a result of the COVID-19 pandemic. Notably, there’s been a shift towards the digital – many companies are now embracing remote working technologies at a rate previously unthinkable. There are many positive aspects to this change; it might be a forced situation, but remote working is dispelling the myth that we need to be in the office 5 days a week to be productive. Hopefully, our increased willingness to allow people to work from home will continue long beyond this awful virus and leave a lasting positive impact on our work-life balance and mental health.

But, in a world where I spend the majority of my day sitting in front of a webcam, I can’t help but ask: at what cost?

For many years now there has been a bubbling concern about how organisations collect our data and what they use it for. Artificial Intelligence (AI) techniques are key for large organisations to leverage the vast quantities of personal data they collect. As such, AI has made quite a reputation for itself in this space – and it’s not always a good one (see Cambridge Analytica). At a time when we are putting more information about ourselves online than ever before, whilst simultaneously becoming more conscious of the social justice issues in our society, it feels like it’s never been more relevant for us to re-examine the question: how can we use AI for good?

As data professionals, we have an important duty of care.

Working as a consulting data scientist, I regularly find myself entrusted with (often highly sensitive) client data. As data professionals, we have an important duty of care, not only to our clients as organisations but also to the employees and customers whose personal information these organisations hold. Fortunately, this is where technology can help us; for example, InsightMaker can detect Personally Identifiable Information (PII) in documents and manage the file permissions appropriately. But, whilst anonymising documents by blanking out the sensitive personal information detected this way is great for compliance purposes, life is not quite as simple when looking to build an AI model. Even if PII is masked from me as a data scientist, how can I be sure that the AI model I create is using that information responsibly?
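To make the idea of masking concrete, here is a deliberately simplified sketch of pattern-based PII redaction. The patterns and placeholder labels are my own toy examples – real detection tools (InsightMaker included) go far beyond a handful of regular expressions – but it shows the general shape of the compliance step described above:

```python
import re

# Toy patterns for illustration only -- real PII detection is far more
# sophisticated than a handful of regexes (names, addresses, context, etc.).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b(?:\+44\s?|0)\d{4}\s?\d{6}\b"),  # rough UK format
}

def mask_pii(text: str) -> str:
    """Replace each detected piece of PII with a labelled placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

masked = mask_pii("Contact Jane on jane.doe@example.com or 07700 900123.")
print(masked)  # -> Contact Jane on [EMAIL REDACTED] or [PHONE REDACTED].
```

The catch, as the paragraph above notes, is that redacting these fields from the documents I see does not by itself tell me what signals a downstream model ends up relying on.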

In the end it falls to us as data scientists to carefully consider these implications.

The answers to this question are rarely simple. Complex AI algorithms can be notoriously hard to explain, and it’s even harder to pin down the exact logic for their key decisions. Even removing the potentially sensitive information from the model does not always work, as your model runs the risk of basing its decisions on seemingly innocuous data which is highly correlated with more sensitive fields (ones we should not be making decisions based on, such as gender and ethnicity). In the end it falls to us as data scientists to carefully consider these implications for every model we build, and to put in place safeguards appropriate to each model’s individual use case.
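One simple safeguard is to check whether any remaining feature acts as a proxy for a sensitive field before training on it. The sketch below uses entirely made-up data and an illustrative threshold – the feature names, values, and cut-off are my assumptions, not a prescribed method – but it demonstrates the kind of check I mean:

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sd_x = sum((x - mx) ** 2 for x in xs) ** 0.5
    sd_y = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sd_x * sd_y)

# Hypothetical data: 'gender' has been masked out of the model inputs,
# but 'job_title_code' remains -- and in this sample it tracks gender
# closely, so a model could still discriminate via the proxy.
gender = [0, 0, 0, 1, 1, 1, 0, 1]          # sensitive field (held back)
job_title_code = [1, 1, 2, 5, 5, 4, 1, 5]  # seemingly innocuous feature

r = pearson(job_title_code, gender)
if abs(r) > 0.8:  # illustrative threshold; set per use case
    print(f"Warning: possible proxy for sensitive field, r = {r:.2f}")
```

In practice you would run this kind of audit across every candidate feature (and combinations of them), and correlation is only the crudest test – but even a crude test makes the risk visible rather than invisible.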

This week Milton Keynes Artificial Intelligence (MKAI) are addressing this very issue. In the run-up to this event I was fortunate enough to be invited to attend the Boundless Podcast to discuss this issue with Rudradeb Mitra of Omdena, David Troy of 410Labs and John Kamara of the Machine Learning Institute of Africa. AI keynote speaker and founder of MKAI Richard Foster-Fletcher and I also spent some time later in the week discussing the complex issues surrounding data privacy in more depth.

If you enjoyed this blog or the above podcasts and would like to learn more about Using AI for Good, I highly encourage you to attend MKAI’s next virtual event at 5pm on Thursday the 25th of June. The event will feature a fantastic line-up of guest speakers, including Aiimi’s Luke Rawlence who will be hosting an interactive introduction to Natural Language Processing.