journalismAI.com | New synthetic data techniques could change the way AI models are trained

New synthetic data techniques could change the way AI models are trained | SEMAFOR

November 3, 2023


“In many ways, the fate of synthetic data sits at the center of the biggest questions facing generative AI.”
– Reed Albergotti

Semafor reports on work by Microsoft researchers to use data-from-data as a new way to train generative AI models without touching copyrighted materials. Neural networks such as ChatGPT depend on vast datasets to form their text in response to queries.

Some observers argue that synthetic data – data generated by AI models – could lead to a kind of intellectual inbreeding, eventually degrading the generated text to rubbish.

Others see data-from-data as a highly efficient way to produce tailor-made text for a given use. One example is a model that learns by summarizing all the data in a narrow domain and then draws only on that synthesis to generate new responses. Proponents say the resulting output could be more relevant than output from a model that processes countless peripheral words and concepts.

Semafor also points to work on synthetic data by IBM and Google’s DeepMind.


New synthetic data techniques could change the way AI models are trained | SEMAFOR | November 3, 2023 | by Reed Albergotti

journalismAI.com is a public knowledge base supporting AI literacy for journalists, maintained since 2017. The indexed collection is updated regularly and contains 700+ entries, each curated and written by humans.

