True data science
Deep academic research
Related Content Detection
Discovering and determining how to treat ‘similar content’ is one of the fundamental challenges in Natural Language Processing (NLP). The problem of semantic similarity becomes even more complex when automatic search, retrieval and analysis are applied to multilingual text content from many web sources.
Our engineering and data science team has released two scientific papers detailing how to use and extend the current leading Transformer models and apply them to tasks focused on the semantic similarity of news content.
We are now able to find related content in 106 languages, cross-lingually and irrespective of the input text length: a breakthrough in the field of NLP.
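As an illustration of the underlying idea (not our production pipeline), cross-lingual similarity is typically computed as the cosine similarity between embedding vectors produced by a multilingual Transformer encoder. A minimal sketch with mock embedding values standing in for real model output:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# In practice these vectors would come from a multilingual Transformer
# encoder; the values below are illustrative mock embeddings.
article_en = np.array([0.8, 0.1, 0.3])   # English article on some event
article_de = np.array([0.7, 0.2, 0.4])   # hypothetical German article on the same event
unrelated  = np.array([-0.5, 0.9, -0.1]) # article on an unrelated topic

print(cosine_similarity(article_en, article_de))  # high score: related content
print(cosine_similarity(article_en, unrelated))   # low score: unrelated content
```

Because the comparison happens in a shared embedding space rather than on surface words, the same scoring works regardless of the articles' languages or lengths.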
Abstractive Text Summarization
The majority of text summarization tools on the market work via a so-called ‘Extractive Approach’. In the fields of Natural Language Understanding (NLU) and Natural Language Generation (NLG), the most innovative route currently runs via ‘Abstractive Technologies’.
‘Abstractive Technology’ approaches still suffer from accuracy shortcomings and are available only as single-language tools. Our technology team has created a new method and neural network based on the mT5 Transformer, overcoming these inaccuracies and producing a tool that is multilingual.
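To make the contrast concrete, here is a toy sketch of the extractive approach used by most tools on the market: sentences are scored (here by simple word frequency, a classic heuristic) and the top ones are copied out verbatim. An abstractive system, by contrast, generates new sentences that need not appear in the source at all. This is a simplified illustration, not our mT5-based method:

```python
import re
from collections import Counter

def extractive_summary(text: str, n_sentences: int = 1) -> str:
    """Toy extractive summarizer: score each sentence by the corpus-wide
    frequency of its words and return the top sentences verbatim, in
    their original order. No new text is generated."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    freq = Counter(re.findall(r'\w+', text.lower()))
    score = lambda s: sum(freq[w] for w in re.findall(r'\w+', s.lower()))
    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return ' '.join(s for s in sentences if s in top)

text = ("Cats sleep a lot. Dogs bark loudly. "
        "Cats and dogs and cats again.")
print(extractive_summary(text, 1))  # prints a sentence copied from the input
```

The limitation is visible even in this sketch: an extractive summary can only reuse existing sentences, so it cannot compress, paraphrase or fuse information the way an abstractive model can.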
We aim to release our third scientific paper on this topic in H1 2021.
BERT Semantic Similarities
Research on the specific features of determining the semantic similarity of arbitrary-length text content using multilingual Transformer-based models.
Semantic Similarity of Arbitrary-length Text Content
Research into the possibilities of using the multilingual BERT model for determining semantic similarities of news content.
Multilingual Abstractive Text Summarization
Due in Q1 2021.