  • mhoeschele

Our Conversations Analysis is Superhuman

When we set out to design an AI-powered topic clustering system, we had four main goals:


  1. Unbiased analysis

  2. Treat distinct thoughts as distinct

  3. Cluster based on the underlying meaning

  4. Produce clear descriptions of each topic


Unbiased analysis

To make sure our analysis is unbiased, we started by assuming it didn’t need to know anything about the responses it was analyzing. This means we produce great analysis on day 1, with no training data required!


Treat distinct thoughts as distinct

After looking at thousands of user responses, it was clear that people love to open up and tell our chatbots what’s on their minds. One side effect is that they often give more than one thought in a single response, which is great! To ensure that we neither ignored one of the thoughts nor assumed they were related, we came up with a way to identify the distinct thoughts in a response and separate them before forming topic clusters.
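To make the idea of separating thoughts concrete, here is a minimal Python sketch. It is not our production method: it assumes a naive rule-based splitter, and the connector list and the function name `split_thoughts` are hypothetical, chosen only for illustration.

```python
import re

# Assumption: a thought ends at sentence punctuation or at a contrastive
# connector. Real responses need something smarter (e.g. "3.5 stars").
_SENTENCE_END = re.compile(r"[.!?;]+\s*")
_CONNECTOR = re.compile(r",?\s+(?:but|however|also)\s+", re.IGNORECASE)


def split_thoughts(response):
    """Return the distinct thoughts found in one free-text response."""
    thoughts = []
    for sentence in _SENTENCE_END.split(response):
        for part in _CONNECTOR.split(sentence):
            part = part.strip(" ,")
            if part:  # skip empty pieces left over from trailing punctuation
                thoughts.append(part)
    return thoughts


reply = "The app is fast, but checkout keeps crashing. Support was friendly!"
for thought in split_thoughts(reply):
    print(thought)
```

Each piece can then be clustered on its own, so a comment about speed and a comment about support never get forced into the same topic just because one person wrote both.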


Cluster based on the underlying meaning

When a human analyzes data to identify the topics that are being discussed, they will put responses together if they are describing similar ideas. We have achieved the same result with our Natural Language Processing (NLP) algorithm, which focuses on the underlying meaning of each person’s response, not the specific words they use. To stay on the cutting edge, we have taken pre-trained language models and optimized them for the conversational data we collect.
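As a rough illustration of grouping by meaning rather than by wording, the Python sketch below clusters responses by the cosine similarity of their embedding vectors. Everything specific here is an assumption: the three-dimensional vectors are made up by hand (in practice they would come from a pre-trained language model), and the greedy threshold rule in `cluster` is a simple stand-in, not the algorithm we actually run.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cluster(embeddings, threshold=0.8):
    """Greedy clustering: join the most similar cluster, else start a new one."""
    clusters = []  # each cluster is a list of indices into `embeddings`
    for i, vec in enumerate(embeddings):
        best, best_sim = None, threshold
        for members in clusters:
            sim = cosine(vec, centroid([embeddings[j] for j in members]))
            if sim >= best_sim:
                best, best_sim = members, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


# Hand-made toy vectors standing in for model embeddings (assumption).
toy = {
    "checkout keeps crashing": [0.9, 0.1, 0.0],
    "the payment page freezes": [0.8, 0.2, 0.1],
    "support was friendly": [0.0, 0.1, 0.9],
    "the help desk staff were kind": [0.1, 0.0, 0.95],
}
texts = list(toy)
groups = cluster([toy[t] for t in texts])
for members in groups:
    print([texts[i] for i in members])
```

Note that the two checkout complaints share no words at all, yet they land in the same group because their vectors point in nearly the same direction.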


Produce clear descriptions of each topic

Now, all of this world-class analysis is of no use if you still have to read all the responses to understand the general topic. We have found that the two most critical pieces of information for understanding a topic cluster are a title and a one- to two-sentence summary. To generate these, we give our AI all of the responses for a given topic cluster and ask it to generate a title and summary that best represent the topic described by the group while also being grammatically correct. We also provide the most representative keywords and exemplar verbatim responses.
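The keywords and exemplar responses can be pictured with a small Python sketch. This covers only that supporting information, not the AI-generated title and summary, and it rests on assumptions of our own choosing for this example: keywords are ranked by how many responses mention them, the exemplar is the response that covers the most keywords, and the stopword list and function names are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (assumption, not a complete one).
STOPWORDS = {"the", "a", "an", "is", "was", "it", "to", "and", "of", "at", "on", "in"}


def tokens(text):
    """Lowercased words of a response, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def top_keywords(responses, k=3):
    """Words mentioned by the most responses; ties broken alphabetically."""
    # Count each word once per response so one long answer cannot dominate.
    counts = Counter(w for r in responses for w in set(tokens(r)))
    return sorted(counts, key=lambda w: (-counts[w], w))[:k]


def exemplar(responses, keywords):
    """The verbatim response covering the most keywords (first one on ties)."""
    wanted = set(keywords)
    return max(responses, key=lambda r: len(set(tokens(r)) & wanted))


topic = [
    "Checkout crashed at payment",
    "The checkout page froze on payment",
    "Checkout is broken",
]
keywords = top_keywords(topic, k=2)
print(keywords)
print(exemplar(topic, keywords))
```

Because the exemplar is returned verbatim, a reader sees the topic in a respondent's own words next to the generated title and summary.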


What makes our analysis unique is how we modified existing analysis tools that were optimized for datasets of multi-page documents. We made these technologies work with the conversational data we gather so that they produce instantly understandable results. This proprietary process consistently produces topics that group responses highly similar in meaning, regardless of how many responses there are or how long they are.

