BERT is Google’s model for natural language understanding (NLU). ERNIE is Baidu’s. Their effectiveness is measured with a benchmark known as General Language Understanding Evaluation, or GLUE for short. Language understanding is at the centre of many uses of AI in journalism, so advances have a direct bearing on the future capabilities of AI in the field.
ERNIE became the first model to score above 90 out of 100 on GLUE. The human baseline is in the high 80s, so ERNIE’s performance not only bettered BERT’s, but surpassed human performance on the benchmark, too. GLUE measures machine comprehension of text that presents complications in how language is used. Examples are passages that name multiple people or objects and later require matching ‘it’ or ‘she’ to the correct one where multiple possibilities exist.
In MIT TECHNOLOGY REVIEW, Karen Hao reports that Baidu achieved its superior results by first training ERNIE in Chinese. The model learned to analyze and predict text from how the Chinese language works, where the meaning of a character is often modified by the characters around it. Rather than predicting one hidden character at a time, it learned to predict hidden strings of characters together. The MIT report uses an example of a Chinese character that can mean either ‘clever’ or ‘soul’ depending on an adjacent character. When this grouped-prediction method was applied to English, comprehension went up.
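For readers who want to see the idea in miniature, the sketch below contrasts hiding single tokens with hiding whole groups of tokens, which is the gist of the grouped prediction described above. It is an illustration only and not Baidu’s implementation: the function names, the `[MASK]` placeholder, the sample sentence and the hand-picked groups are all assumptions made for the example.

```python
import random

MASK = "[MASK]"  # placeholder for a hidden token (illustrative)


def mask_single(tokens, rate, rng):
    """Hide individual tokens independently of one another."""
    return [MASK if rng.random() < rate else tok for tok in tokens]


def mask_groups(tokens, groups, rate, rng):
    """Hide whole multi-token units together.

    `groups` is a list of (start, end) index pairs, end exclusive,
    marking units such as a multi-word name. Hand-picked here; a real
    system would need some way to find them.
    """
    masked = list(tokens)
    for start, end in groups:
        if rng.random() < rate:
            for i in range(start, end):
                masked[i] = MASK
    return masked


rng = random.Random(0)
tokens = ["Harry", "Potter", "is", "a", "series", "of", "novels"]
groups = [(0, 2), (4, 7)]  # "Harry Potter" and "series of novels"

# Single-token masking can hide "Harry" but leave "Potter" visible,
# which makes the hidden word easy to guess from its neighbour alone.
print(mask_single(tokens, 0.3, rng))

# Group masking hides the whole unit, so it must be predicted from
# the rest of the sentence.
print(mask_groups(tokens, groups, 1.0, rng))
```

With the rate set to 1.0, every listed group is hidden in full, leaving only “is” and “a” visible. The model then has to recover each phrase from the wider context rather than from the other half of the same phrase.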
- Baidu’s ERNIE stands for Enhanced Representation through kNowledge IntEgration.
- Google’s BERT stands for Bidirectional Encoder Representations from Transformers.
- Naming NLU models after Sesame Street characters has become an industry convention.
SEE RELATED STORIES
- China’s Baidu Uses AI Understanding in Chinese to Learn English
LANGUAGE MAGAZINE | December 27, 2019
- Baidu has a new trick for teaching AI the meaning of language
MIT TECHNOLOGY REVIEW | December 26, 2019 | by Karen Hao