Deep learning, cloud-based

Sophisticated AI solutions suite

FalconV™

Our FalconV™ AI stack powers a fully-automated, multilingual, end-to-end solutions platform.

It integrates data extraction, curation, analysis and delivery smoothly and seamlessly, with power unmatched by other global platforms.

All proprietary AI microservices have been developed using the very latest Transformer architectures (e.g. mT5) to enable advanced multilingual – 100+ languages – text analytics via Natural Language Understanding (NLU).

EXTRACTION

Web Scraping
  • External & Internal (text) data
  • “Humanized” general spider
  • Automated re-configuration ability
  • Unlimited scalability
  • Historic data retrieval
Blind Vision
  • Proprietary AI (web) text data extraction & labeling technology
  • Technology paradigm shift
  • Superior vs Computer Vision and other technologies
Automated Source Navigator
  • Automated Data Sourcing
  • Smart search & recommendation on bots
  • Coming 2022

CURATION

Automated Text Classification
  • IAB Content Taxonomy 2.2 – 600+ categories
  • 100+ languages
Duplicate Detection
  • Automated detection of (near) duplicates
  • Plagiarism notification
  • Depth at 1000 articles
  • Unlimited depth in theory (server capacity)
Related Content Detection
  • Extension of Google BERT & other Transformers
  • Any text input length
  • Multilingual & Crosslingual
  • 2 research papers
Fake News Detection
  • Conceptual thoughts backed by Data Science
  • New ‘secret sauce’
  • Extraction of hidden features in (news) content
  • Coming 2022
Sensitive Information Discovery
  • Search & Locate sensitive information
  • Internal Data at companies
  • External Data on brands
  • Coming 2022
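
To illustrate the near-duplicate detection listed under Curation, here is a minimal sketch using word shingles and Jaccard overlap. This is a generic textbook technique shown for orientation only, not the FalconV™ implementation. The function names, the 3-word shingle size and the 0.8 threshold are assumptions; only the depth of 1000 articles comes from the list above.

```python
def shingles(text, size=3):
    """Set of overlapping word n-grams ("shingles") for one article."""
    words = text.lower().split()
    if len(words) <= size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Share of shingles the two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def is_near_duplicate(candidate, history, depth=1000, threshold=0.8):
    """Compare a new article against the most recent `depth` articles.

    `threshold` is an assumed cut-off, chosen for illustration only.
    """
    cand = shingles(candidate)
    return any(jaccard(cand, shingles(old)) >= threshold
               for old in history[-depth:])


# Made-up example articles.
history = [
    "Markets rallied on Tuesday after the central bank held rates",
    "Local team wins the cup final in extra time",
]
repeat = is_near_duplicate(
    "Markets rallied on Tuesday after the central bank held rates", history)
fresh = is_near_duplicate("A new museum opens downtown next spring", history)
print(repeat, fresh)
```

Because every new article is compared against the whole window, the cost grows with the depth, which is consistent with the note that depth is bounded by server capacity.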

ANALYSIS

Named Entity Recognition
  • Extension of mT5 transformer
  • Multilingual – 100+ languages
  • Highest accuracy in the market
  • F1 score at 0.95
  • Foundation to most NLP solutions
Advanced Sentiment Analysis
  • Natural Language Understanding (NLU)
  • Multilingual – 100+ languages
  • Human-level sentiment polarity of content
  • Consistency across 100+ languages
  • F1 – 0.9443
Abstractive Text Summarization
  • Natural Language Understanding (NLU)
  • Natural Language Generation (NLG)
  • Multilingual – 100+ languages
  • Coming 2022
Smart Keywords
  • Advancement on ‘Graph Models’
  • Smart semantic analysis
  • Location of keywords in text content
  • Human-level editing quality
  • Coming 2022
Anomaly Detection
  • Detect & Highlight anomalies in text
  • Historical timeline analysis
  • Coming 2022
Cross-Correlation
  • Different heterogeneous data sets
  • e.g. Sentiment vs. (share) Price
  • Automated predictive modeling
  • Coming 2022
Automated Poll Generation
  • Natural Language Understanding (NLU)
  • Natural Language Generation (NLG)
  • Original content construction for publishers
  • (Consumer) Engagement Engine
  • Coming 2022

DELIVERY

Deliverables
  • RESTful API
  • iOS & Android Apps
  • Messenger Platform (e.g. Telegram)
  • Chatbots
  • Webcast
  • HIPSTO Knowledge Graph (KG)
  • Data Marketplace (‘Data Supermarket’)
  • Dynamic Information Search

Data Capturing

Knowledge Discovery

Delivery

Sophisticated Natural Language Understanding (NLU)

Our AI microservices are reinforced by proprietary neural network architectures that extend and improve on standard Transformers.

This empowers sophisticated Natural Language Understanding (NLU), the advanced area of Natural Language Processing (NLP), to extract the true meaning and context of text in an unparalleled, human-like way.

NLP versus NLU diagram
Chinese review analysis diagram

Powerful Multilingual Capabilities

We use a single-model, multi-language approach: one model trained to understand multiple languages at the same time, built on existing Transformers (e.g. mT5) and reinforced by our own proprietary neural network architectures.

This gives our solutions the edge over machine-translation services, because they retain the rich contextual details that are vital for meaning.

Paradigm Shift in Information Extraction

We have built a new Information Extraction technology, Blind Vision, that is superior to Computer Vision.

It represents a true ‘Paradigm Shift’ in the extraction and automated labeling of unstructured text data. Blind Vision is faster, more accurate and cheaper.

Because of the techniques it uses, Blind Vision requires far fewer computational resources, and its carbon footprint is therefore also smaller.

Blind Vision tech advancement diagram

Join the Organisation of the Future

We are building a new type of organisation – a digital ‘Neural Network’ consisting of top talent from across the globe.

Blind Vision

We have dubbed our proprietary web text data extraction and labeling technology Blind Vision. It is now, arguably, the best in class globally and superior to the established and trusted Computer Vision and other technologies.

Test studies* have been conducted versus a US-based web data integration platform that uses Computer Vision, and the results are conclusive. They showed a 35% increase in accuracy and 65% lower running costs using Blind Vision technology.

We like to remain a little secretive about the ‘sauce’, but we can say that Blind Vision combines sophisticated Raw Code Processing algorithms and our own deep learning network architecture.

Advanced Sentiment Analysis

Sentiment analysis is a very powerful tool with many commercial applications. However, it is very difficult to do well. Many claim to have a sentiment solution, but upon analysis, few in the market really do. Current solutions suffer from poor consistency, limited accuracy and lack of advanced deep learning techniques.

Humans can easily judge the polarity of text, unlike machines. We have developed an apex sentiment tool, using our proprietary neural network architecture, which enables real Natural Language Understanding (NLU) and emulates how humans judge the content and context of text.

Our solution provides consistent sentiment analysis of high or low-frequency content, in long or short format, across 100+ languages. And, we can do all of this with an impressive F1 score of 0.9443!
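
For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below shows the arithmetic only. The counts are made-up illustrative values, not our evaluation data, and `f1_score` is a hypothetical helper name.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall, computed from raw counts."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only (assumed, not measured results).
example = f1_score(true_pos=90, false_pos=10, false_neg=10)
print(round(example, 4))
```

A score close to 1.0 requires both few false positives and few false negatives, which is why F1 is a stricter yardstick than accuracy alone on unbalanced data.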

Named Entity Recognition

Valuable (business) information is buried in a largely text-based (79%) data explosion, most of which also resides in unstructured data on the web. The ability to extract, organize, analyze and connect large amounts of unstructured text data has become of paramount importance.

Extracting, classifying, and connecting entities via Named Entity Recognition (NER) technology plays an important role in sorting unstructured data and identifying valuable information. NER is a key foundational block for any information discovery pipeline and the basis for most Natural Language Processing (NLP) solutions.

We have built the new industry standard: Multilingual NER (in 100+ languages) that is unrivaled in accuracy vs. current ‘open source’ solutions and performs with an F1 score of 0.95.

Web Scraping

Web Scraping may sound easy, but it’s not! We have solved the 5 most prevalent issues faced by standard web scraping methods.

One key issue involves constant website layout changes. SEO improvements and UX/UI changes are delivered through HTML layout amendments. As a result, the element locators that web scrapers are configured with to extract data stop matching, and the scraping process breaks, returning incorrect data or none at all. Updating these configurations takes a lot of manual effort, and maintaining thousands of sources becomes nearly impossible.
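
The failure mode can be reproduced in a few lines. The sketch below uses two hypothetical versions of the same page and a locator configured against the older one. It illustrates the problem only, not our automated reconfiguration, and the markup, locator and function name are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of the same article page (well-formed markup
# so the standard-library XML parser can read it).
OLD_LAYOUT = "<html><body><div><h1>Headline</h1></div></body></html>"
NEW_LAYOUT = ("<html><body><main><article><h1>Headline</h1>"
              "</article></main></body></html>")

# A locator configured against the old layout.
HEADLINE_LOCATOR = "./body/div/h1"


def extract_headline(page, locator=HEADLINE_LOCATOR):
    """Return the headline text, or None when the locator no longer matches."""
    node = ET.fromstring(page).find(locator)
    return node.text if node is not None else None


before = extract_headline(OLD_LAYOUT)
after = extract_headline(NEW_LAYOUT)
print(before, after)
```

The content of the page is unchanged, yet the second call returns nothing because the path to the headline moved. Multiply that by thousands of sources and the maintenance burden becomes clear.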

We have fully automated the process of source reconfiguration to present you with a truly scalable, leading-edge web scraping solution, one that operates in 100+ languages and can scrape any text data from any web source in real time.

Automated Text Classification

We have built an industry-leading, multilingual, automated text classification capability that demonstrates superior accuracy (underpinned by Natural Language Understanding), uses no language translation layer (which significantly distorts the meaning of content) and is able to process content of any length, via our single, one-stop-shop platform.

These accuracy levels now mimic human understanding of any text.

BERT Semantic Similarities

Research into the possibilities of using the multilingual BERT model for determining semantic similarities of news content.

Semantic Similarity of Arbitrary-length Text Content

Research on the specific features of determining the semantic similarity of arbitrary-length text content using multilingual Transformer-based models.
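
As background, semantic similarity between two texts is commonly scored as the cosine of the angle between their embedding vectors. The sketch below assumes one simple way of handling arbitrary-length input: embed the text chunk by chunk and average the chunk vectors. That pooling step is an assumption for illustration, not necessarily the method studied in the research, and the tiny vectors stand in for real Transformer embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pool(chunk_vectors):
    """Average per-chunk vectors into one vector for the whole text."""
    count = len(chunk_vectors)
    return [sum(column) / count for column in zip(*chunk_vectors)]


# Toy 2-dimensional "embeddings" (made up): a long text split into two
# chunks, and two short texts with a single chunk each.
long_text = mean_pool([[1.0, 0.0], [0.0, 1.0]])
similar_text = mean_pool([[1.0, 1.0]])
unrelated_text = mean_pool([[1.0, -1.0]])

close = cosine_similarity(long_text, similar_text)
far = cosine_similarity(long_text, unrelated_text)
print(round(close, 4), round(far, 4))
```

Because cosine similarity ignores vector length, a long document and a short one can still score as highly similar when they point in the same direction in embedding space.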

Read our latest blog: Future LLMs will need Blind Vision Technology