To the global AI and NLP community: BREAKTHROUGH ALERT! The HIPSTO tech team has released a second science paper presenting two radical new approaches in the field of 'Related Content Detection' using the Google BERT model. The conclusion is nothing short of mind-blowing: both approaches surpass the 512-token text-sequence limit and extend it to…infinity! HIPSTO's new approaches also support 104 languages. Wait…what?

Indeed, the major restriction on maintaining accuracy with Google's BERT model is now completely lifted. No matter the volume or length of the text, output quality won't diminish, while the scale of comprehension and annotation becomes virtually unlimited. So if you're involved in Data Annotation, Alt Data, Web Data Integration or any type of text-based research, be prepared to see related content detection performance go stellar. The science paper can be downloaded from
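For context on the limit being lifted: standard BERT models can only attend to 512 token positions at once, so longer documents are conventionally split into overlapping windows before processing. The sketch below illustrates that baseline sliding-window workaround only; it is not HIPSTO's published method, and the window/stride parameters are illustrative assumptions.

```python
# Sketch of the conventional sliding-window workaround for BERT's
# 512-token positional limit (illustrative baseline only; this is
# NOT the approach described in HIPSTO's paper).
# Two of the 512 positions are reserved for the [CLS] and [SEP] markers.

def chunk_tokens(tokens, window=512, stride=256, reserved=2):
    """Split a token sequence into overlapping windows that each fit
    within BERT's positional-embedding limit."""
    size = window - reserved  # usable content tokens per window
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break
        start += stride
    return chunks

# Example: a 1200-token document becomes four overlapping windows.
doc = [f"tok{i}" for i in range(1200)]
windows = chunk_tokens(doc)
print(len(windows), [len(w) for w in windows])  # 4 [510, 510, 510, 432]
```

Each window would then be encoded separately and the per-window results aggregated, which is exactly where accuracy tends to degrade on long texts, the problem the announcement claims is solved.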

So much for exercising our bragging rights; back to preparing for our roadshows in the UK and Boston/New York. Perhaps we will see you there :-)!

Read our latest blog: Native vs. Translation. Learn More >>