What's NXT in business intelligence and insight production – a personal research bot for everyone
For almost ten years, digital assistants like Siri have been able to give us answers to simple questions. But for a long time, we've lacked tools that can really help analysts and experts get through large masses of text quickly or automate research work. Those days are now gone. At Kairos Future, we have been using the latest AI methods for over ten years to streamline research work and create solutions for our customers. Now, those methods are also available to others. And shortly, we can all have our own personal research bot. In fact, it’s already almost here.
When Siri came to town
For almost ten years, we have had them by our side – Siri, Alexa and Google Assistant. Even though they are far from perfect, they can help us find facts, check the weather forecast and train timetables.
The digital assistants are all examples of one of the greatest technological revolutions of our time, namely what goes by the name NLP – natural language processing, or language technology and computational linguistics, as many prefer to call it. It consists of a variety of techniques intended to structure and understand, but also generate, text. In the latter case we talk about natural language generation, NLG. The great revolution has taken place in very recent years, with language models with cryptic names like BERT and OpenAI's GPT-3 as the crowning achievements.
Insight is not just about needles in haystacks
The digital assistants are in practice more advanced variants of search, where we ask our questions orally instead of with a text string. But the result is the same. An AI-based system guesses at what we’re after and presents the most likely answer. When we search using digital assistants, we usually get one answer, while database or online searches usually give a list as a result, ideally with the most relevant answer at the top. Thereby, both Siri and regular Google search are great when it comes to quickly finding the needle in the haystack.
But a lot of analysis work is not about finding needles. Often it is rather a matter of understanding the structure of the haystack, or the landscape of haystacks. And for this purpose, there haven’t really been any good tools to utilize.
Let’s say that you’re working with a research database, and you’re interested in finding the top five topics in sustainability research listed along with the most relevant articles and the most cited researchers in each field. In that case, there is no search tool that can give you the answers. And you definitely can't ask Siri about it. The same is true for the question about what travel experiences and destinations people seem to be talking about now at the end of the pandemic and how they differ from the discussions two years earlier. The available solutions for monitoring social media discussions are not built for such a task.
Therefore, until now, those who have been interested in anything other than hit lists and sentiment have had to build their own solutions by programming in Python or R. This is extremely time-consuming, and the vast majority of ordinary knowledge workers, experts and analysts have neither the time nor the skills to do it.
The revolution is already here
But this is changing. Already at the end of the 00s, we at Kairos Future started experimenting with new research and analysis methods. It started with us collecting blog posts, chopping them up into words and analyzing them statistically. But soon, we had written chunks of code that facilitated the work, built a tool to generate contextual word clouds, and much more. In addition, the blogs had been joined by news text, reports, scientific articles, patents, job postings, and more. Over a ten-year period, we completed around 300 assignments where text analysis was the central component, or at least one of them.
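The first step described above, chopping texts into words and analyzing them statistically, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code Kairos Future actually used: the sample posts, the short stop-word list and the function names tokenize and word_frequencies are all made up for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real analysis would use a much longer one.
STOP_WORDS = {"the", "a", "is", "and", "of", "to", "in", "are"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(posts):
    """Count how often each non-stop word occurs across all posts."""
    counts = Counter()
    for post in posts:
        counts.update(w for w in tokenize(post) if w not in STOP_WORDS)
    return counts


# Made-up sample data standing in for collected blog posts.
posts = [
    "The future of travel is local.",
    "Local food and local travel are trending.",
]
top_words = word_frequencies(posts).most_common(2)
print(top_words)
```

Counts like these are the raw material for a simple word cloud. The contextual word clouds mentioned above would also need information about which words appear together, which this sketch leaves out.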
By applying increasingly advanced methods of text analysis, we were able to solve problems that were otherwise more or less insoluble and to carry out complex business intelligence analyses much faster than before. Not least, we were able to find completely unique insights: things that neither we nor our customers knew that we didn’t know. The famous "unknown unknowns".
The experiences from all these projects have now been gathered in the tool Dcipher, which has been an independent startup for a few years. At Kairos Future, analysts and consultants now use Dcipher in numerous projects for a variety of purposes. The reason for this is simple: Dcipher contains everything that is needed to carry out ad hoc analysis projects, such as analyzing thousands of free-text survey responses, news articles or social media posts, bunches of customer interactions and the like.
Identified topics in a study of vaccine hesitancy in three countries and three different languages – Urdu, Swahili and English. Each circle represents a semantic topic and each hill a cluster of themes, where the interpretation is made manually. Today, each cluster and topic can get an autogenerated name and summary. Data source: news media. Analytics platform used: Dcipher Analytics
How it works – a few quick examples
Let's look at a few examples of how we at Kairos Future have used Dcipher recently:
• Categorized 700 posts in a SWOT analysis that were written in a chat channel during an online workshop. It took 7 minutes while the participants picked up a cup of coffee. The results were presented when they came back.
• Analyzed responses to several open-ended questions in an employee survey for an international corporation about covid-related experiences. The survey was answered by over 3,000 people in 4 different languages, with all answers automatically translated into English in the platform.
• Analyzed 30,000 scientific articles, lots of social media posts, hundreds of sustainability reports and policy documents in a project on sustainability trends.
• Identified narratives around telephone sales through analysis of reports and comments, news media and social media.
• Created a web-based solution to identify and follow startups in China, based on a wide range of data sources.
• Created a solution that automatically classifies research applications into predetermined categories, on behalf of a research organization.
Many of these examples are of the kind where the alternative is doing nothing at all, or doing it manually. For example, few people use open-ended questions in surveys anymore.
The reason for this is that analyzing the answers is such a cumbersome task. If such questions are included, the answers are at most displayed as a word cloud with a large "AND" in the middle. However, open-ended questions are a fantastic way to capture what is really going on inside people's minds, and an opportunity to capture the previously described unknowns. In addition, they are quite easy to analyze using AI. In Dcipher, it is done by simply dragging a text field to a workbench that automatically identifies topics. Often, you get some topics that have no meaning, but these can easily be removed. The remaining topics can be treated the same way as every other question in the survey, for example by analyzing them by gender, age or other background data. They can also be "enriched" with supplementary information, such as sentiment, i.e. how positive or negative the people who address the topic are on average.
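Once each answer has been assigned a topic, the follow-up analysis described above is ordinary tabulation. The Python sketch below illustrates the idea and is not how Dcipher works internally. It assumes that topic labels and sentiment scores already exist, and the sample rows, field names and function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Made-up survey rows: each answer already carries a topic label and a
# sentiment score between -1 (negative) and 1 (positive).
responses = [
    {"topic": "remote work", "gender": "F", "sentiment": 0.5},
    {"topic": "remote work", "gender": "M", "sentiment": -0.5},
    {"topic": "workload", "gender": "F", "sentiment": -1.0},
    {"topic": "workload", "gender": "F", "sentiment": -0.5},
]


def sentiment_by_topic(rows):
    """Average sentiment of the answers that address each topic."""
    scores = defaultdict(list)
    for row in rows:
        scores[row["topic"]].append(row["sentiment"])
    return {topic: mean(values) for topic, values in scores.items()}


def topic_counts_by(rows, field):
    """Count answers per (topic, background value), e.g. per gender."""
    counts = defaultdict(int)
    for row in rows:
        counts[(row["topic"], row[field])] += 1
    return dict(counts)


avg_sentiment = sentiment_by_topic(responses)
by_gender = topic_counts_by(responses, "gender")
print(avg_sentiment)
print(by_gender)
```

The same two functions work for any other background variable, such as age group, by passing a different field name to topic_counts_by.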
In a very similar way, it is possible to analyze any type of text. With modern AI-based methods, it is thus possible to make sense of the enormous amounts of text that are generated both within and outside the company. We can take these huge unstructured, messy masses of text and transform them into something that is understandable, which can be further transformed into action-oriented insights.
Waiting for the research bot – but most can be done already
There are still a few challenges to be solved before we have the perfect research bot, but we are now very, very close. One is about interpreting the meaning of, for example, an identified topic. Where a trained human directly sees what a text is about, an AI has an extremely difficult time generalizing this from a bunch of fragments that it has collected and summarizing it with a good headline or written summary. Just six months ago, there was no solid solution to that problem, but one has recently been developed and is now being implemented in the Dcipher platform. With such solutions in place, we'll soon be able to submit a bunch of reports straight into a solution like Dcipher and instantly get back an analysis and a summary of the key insights. Maybe even read aloud using speech synthesis, just like Alexa or Siri. Then we're quite close to the jealous hologram avatar Selma from the '90s show Time Trax.
In anticipation of that, we have to make do with what is available, namely extremely powerful platforms that are based on the very latest frontline technology and that save time, blood, sweat and tears while generating better and more surprising results than people are capable of. And that can now even produce human-style summaries of the analysis. And that's not bad either, it’s fantastic.
Want to know more?
Do you have a specific question you'd like an answer to where you think AI methods might be the answer? Do you want to set up a tailor-made automated business intelligence service using the very latest technology, or perhaps analyze customer interactions or reviews? Or are you just curious and want to know more?
Either way, contact:
Mats Lindgren, firstname.lastname@example.org
Read more about the AI tool Dcipher here: www.dcipheranalytics.com
A few facts about Dcipher:
Advantages of Dcipher compared to other types of tools include:
• It is designed for text. There are several analysis platforms originally designed for structured (numerical) data, such as Alteryx and RapidMiner, that can handle text. But when working with text in those platforms, you quickly get very complex data structures that cannot be easily managed with traditional tools.
• It includes all the tools you need, from data cleaning, through text enrichment (e.g. labeling named entities, sentiment, etc.), analysis and on to training AI models and generating summaries of analysis. This means that you do not have to switch between different tools.
• It has a simple visual interface which makes it extremely easy to use, compared to every other conceivable option. You interact with the data through drag-and-drop, you are always in control of what you do and immediately see the results of each operation.
• It requires no programming skills (a so-called no-code tool), and basically no prior knowledge in the NLP area.
• There are a variety of aids in the tool that allow you to automatically clean text (e.g. remove duplicate social media posts, delete symbols etc.), speeding up the process considerably.
• It is cloud-based, which makes it possible, for example, to process very large amounts of text incredibly quickly, even when doing advanced analyses. Things that take hours on a PC might take only a few minutes in Dcipher.
• It is possible to save the so-called pipeline of operations that is built during an analysis, which means that you can easily repeat exactly the same analysis on a new dataset, for example a month later. You can even schedule the repetition so that the platform exports new data sets or reports every day, week or month.
• You can easily train domain-specific language models, e.g. when analyzing or classifying text within a domain with a lot of specific lingo that general language models cannot handle (e.g. technical language or slang).
• It can also be used to automate workflows or build solutions that automatically and continuously classify posts or articles, for example, thus acting as an AI sidekick to an expert or analyst. It can even automatically interpret, name and summarize an identified topic consisting of several mentions and posts, the way a human analyst would.
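The cleaning aids and the saved pipeline mentioned in the list above can be illustrated with a small Python sketch. It is a toy stand-in under stated assumptions, not Dcipher's implementation: the step functions, the PIPELINE list and the sample posts are all hypothetical. The point is only that an ordered list of operations, once defined, can be rerun unchanged on next month's data.

```python
import re


def strip_symbols(posts):
    """Remove everything except word characters (letters, digits, _) and whitespace."""
    return [re.sub(r"[^\w\s]", "", p) for p in posts]


def normalize_whitespace(posts):
    """Collapse runs of whitespace and trim both ends of each post."""
    return [" ".join(p.split()) for p in posts]


def drop_duplicates(posts):
    """Drop repeated posts (case-insensitive), keeping the first occurrence."""
    seen = set()
    unique = []
    for p in posts:
        key = p.lower()
        if key not in seen:
            seen.add(key)
            unique.append(p)
    return unique


# The "saved pipeline": an ordered list of steps that can be rerun as-is.
PIPELINE = [strip_symbols, normalize_whitespace, drop_duplicates]


def run_pipeline(posts, steps=PIPELINE):
    """Apply each step in order to the list of posts."""
    for step in steps:
        posts = step(posts)
    return posts


# Made-up datasets from two different months.
january = ["Great service!!!", "great  service", "Too expensive :("]
february = ["Fast delivery #happy", "Fast delivery  #happy"]
print(run_pipeline(january))
print(run_pipeline(february))
```

Scheduling the repetition would then amount to calling run_pipeline on each new export, for instance from a daily or monthly job.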