BERT is Google’s model for natural language understanding (NLU). ERNIE is Baidu’s. Their effectiveness is measured with a test known as General Language Understanding Evaluation, or GLUE for short.

Language understanding is at the centre of many uses of AI in journalism, so advances in it have a direct bearing on what AI will be able to do in the field.

Ernie (seen left) and Bert from Sesame Street

ERNIE became the first model to score above 90 out of 100 on GLUE. Human scores on the test are in the high 80s, so ERNIE’s performance not only bettered BERT’s but surpassed the human baseline on this benchmark, too.

The GLUE test measures machine comprehension of text in which language is used in complicated ways. One example is a passage that names several people or objects and later requires matching ‘it’ or ‘she’ to the right one among multiple possibilities.

In MIT Technology Review, Karen Hao reports that Baidu achieved its superior results by training ERNIE in Chinese. The model learned to analyze and predict text based on how the Chinese language works, where meaning is often modified by context.

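The MIT Technology Review report uses the example of a Chinese character that means ‘clever’ or ‘soul’ depending on the character next to it, which is why ERNIE learned to predict groups of characters together rather than one at a time. When this grouped-prediction method was applied to English, comprehension improved there as well.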
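The difference between predicting one hidden character and predicting a hidden group can be sketched in a few lines of Python. This is only an illustration, not Baidu’s code: the function names, the `[MASK]` placeholder, and the choice of the character 灵 (read as ‘clever’ in 机灵 and ‘soul’ in 灵魂) as the example are all assumptions made for the sketch.

```python
MASK = "[MASK]"  # placeholder for a hidden token; the name is an assumption


def mask_single(tokens, index):
    """Hide one token, leaving its neighbours visible."""
    return [MASK if i == index else tok for i, tok in enumerate(tokens)]


def mask_span(tokens, start, length):
    """Hide a run of adjacent tokens so they must be predicted as a group."""
    end = start + length
    return [MASK if start <= i < end else tok for i, tok in enumerate(tokens)]


# Illustrative pair: both words contain the character 灵.
clever = list("机灵")  # 'clever'
soul = list("灵魂")    # 'soul'

# Hiding only 灵 leaves a neighbour that gives the answer away.
print(mask_single(clever, 1))
print(mask_single(soul, 0))

# Hiding the whole word forces a prediction from the wider sentence instead.
print(mask_span(soul, 0, 2))
```

With single-character masking, the visible neighbour does most of the work. With span masking, the model has to recover the whole unit of meaning from the surrounding text, which is the idea behind the grouped-prediction method described above.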

  • Baidu’s ERNIE stands for Enhanced Representation through kNowledge IntEgration
  • Google’s BERT stands for Bidirectional Encoder Representations from Transformers
  • Naming NLU models after Sesame Street characters has become an industry convention