by Tom Kent
Feb 24, 2015
Updated October 2019
It’s routine now for news organizations to use artificial intelligence to write news. Robots are regularly transforming data into stories, and stories into multimedia presentations.
As more organizations start deploying AI, we need to keep a focus on the ethics and quality of robot newswriting. What accuracy and transparency problems do robotic systems create? What bad scenarios do we need to worry about?
Most robot newswriting so far has been for fairly formulaic situations: company earnings reports, stock market summaries, earthquake alerts and youth sports stories.
The Associated Press is the U.S. leader in using robot newswriting, generating sports and business stories that appear in thousands of publications. RADAR, a joint venture between data journalism startup Urbs Media and PA Media Group, produces 2,000 automated stories for UK publishers each week. Swedish local news publisher MittMedia produces a robot-written story on every local house sale — catnip to those who follow real estate.
As they move into robotics, news companies should ask some serious ethical questions. Here’s my checklist of what to consider regarding 1) the robot stories themselves, 2) issues of news judgment and transparency, and 3) pitfalls to be alert to.
THE STORIES THEMSELVES
● How accurate is the underlying data? Does the data your robot is using consist of official information from companies, a stock exchange or government? If so, it’s probably safe for automatic crunching, as long as you regularly check that the data is being properly transmitted. However, not all data comes from such authoritative sources. If scores are being sent in by parents from their kids’ soccer games, how will you ensure the data is reliable? Your readers will hold your organization responsible for the data, whatever its original source.
● Do you have the rights to the data? Does the data provider have the legal right to send it to you? Do you have the further right to process and publish it, and on what platforms?
● What data will your system highlight? In a story wrapping up the day in the markets, what stocks and indexes will the robot lead with? Will it compare the latest numbers to the start of the year or to five years ago? Substantial advance thinking needs to go into what data the system will highlight. For some stories, you may need to switch off the automation entirely — say, for the financial results of a company in turmoil where full context is necessary from the very start. You also need staff available to take stories off autopilot in the event of a sudden development — for instance, in sports, if a player is seriously injured or fights break out during a game.
● Who’s watching the machine? Errors with underlying data or automation software can quickly metastasize, potentially creating thousands of erroneous stories. Test the automation thoroughly. When you first use robot-written stories, have a human editor check each one before it goes out. Once the product proves itself, stories can go out automatically with spot-checking by human editors.
● Do the automated reports match your style? Spellings, capitalization and general writing style should match the rest of your content. Readers will be confused by content that doesn’t feel like the rest of your journalism. Have someone not involved in the automation project read the automatically written stories to see if they fit in.
● What about long-term maintenance? Automatic processes can’t run on their own forever. Background material needed by the algorithm (team names, company headquarters locations) may change. A data source may suddenly become less reliable. The if-then choices that worked in the algorithm last year may no longer be appropriate. Responsible news organizations will have humans regularly maintaining their data and reviewing the choices that algorithms make.
● Will your automation create multimedia presentations? Some automated systems create video or photo displays to accompany text stories. If so, can you be certain that the system is accessing only imagery that you have a legal right to use? How will you make sure it doesn’t grab imagery that’s satirical, hateful or tasteless?
● Do you use software to reduce long articles to bullet points? Test it extensively to make sure it’s truly finding what’s important. And find out if the software you’re considering requires the original content to be in a certain format — say, the typical inverted pyramid style used in news stories. Text written in other ways may yield poor results. (At AP, we tried dropping the Book of Genesis into an automatic summary program; the bullet points created by the program left out the Garden of Eden.)
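The “Who’s watching the machine?” advice above can be sketched as a simple review gate. This is only an illustration: the function name, the thresholds and the rule that non-official data always gets a human look are assumptions made for the example, not a description of any newsroom’s actual system.

```python
import random

# Hypothetical thresholds; tune them to your own product and risk tolerance.
PROBATION_STORIES = 500   # every story is reviewed until this many have run
SPOT_CHECK_RATE = 0.05    # afterwards, roughly 1 in 20 goes to an editor


def needs_human_review(stories_published, source_is_official, rng=random):
    """Decide whether an editor should see a story before it goes out."""
    # A new product stays under full review until it has proved itself.
    if stories_published < PROBATION_STORIES:
        return True
    # Assumption for this sketch: data from non-authoritative sources
    # (for example, scores sent in by parents) always gets a human look.
    if not source_is_official:
        return True
    # Otherwise, pull a random sample for spot-checking.
    return rng.random() < SPOT_CHECK_RATE
```

A gate like this makes the move from checking every story to spot-checking an explicit editorial decision with a number attached, rather than something that happens by default.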
NEWS JUDGMENT AND ETHICS
● Is the subject matter appropriate for automation? It’s usually an easy decision to automatically create stories from statistical information. But what about from other facts? Imagine if political campaigns began to offer data feeds of candidate speeches — location of speech, size of crowd, main points, key quote, etc. Your robot could then spin out stories on every rally, but how would a story like that be different from a press release?
● Do you want everything in a feed? You might welcome a feed from your local police department reporting traffic tie-ups and minor arrests. But would you run, without a journalist’s involvement, a police report stating that an officer had killed a young man when he was pulling what appeared to be a gun from his waistband? Before you “plug through” a feed from a data source to your readership, consider everything that might be in that feed and whether you want all of it to flow through automatically.
● Will you disclose what you’re doing? Some organizations believe they must mark each story as having been produced automatically. The Guardian’s Australia edition adds this line at the end of such stories: “This story was generated by ReporterMate, an experimental automated news reporting system.” But as the process becomes more common, other robot journalism producers are including no such disclosure. Most producers also put no byline on automatically produced stories. RADAR in the UK does use bylines: it considers one of its human journalists responsible for each set of automatically produced stories, since that journalist mines and tests the data and writes the template, so the stories carry that journalist’s byline. The Guardian’s stories carry the “ReporterMate” byline.
● Is this about reducing your human staff? Many organizations that use automation say their goal isn’t to replace journalists with robots, but to let their human staff focus on reporting that matters rather than rote tasks. If so, fine. But if you are in fact going to be hiring fewer journalists for some jobs, there’s a lot to be said for being honest about it.
PITFALLS TO BE ALERT TO
● Can you defend how the story was “written”? If people question the facts in a robot-written story or how they were presented, can you give an explanation (or get a quick answer from your data and automation providers)? “The computer did it” isn’t much of an explanation, especially since public faith in computers isn’t what it used to be. Are your automatic writing processes documented well enough that, even as your staff turns over, you will be able to thoroughly explain how every story came to be? Could your system keep on file, for each story, a record of all the rules it followed in putting it together?
● Can your automation be spun? Your automation can be reverse-engineered by people trying to meddle with how it writes stories. Your system could become a target of pranksters, market manipulators or companies trying to make their results look better. What safeguards can you put in place?
● Suppose you’re sued? The more issues your robots cover, the more their stories will be scrutinized. A politician may demand to know why her name doesn’t appear more often in automated stories. You may be sued over something the robot wrote, or over a picture your system attached. It’s not a sure thing whether automated processes enjoy the same legal protections as human journalists. Imagine political activists — or parties to a legal case — demanding the source code behind your automation. Will you reveal it, or try to defend it as a trade secret?
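The idea raised above, keeping on file a record of the rules behind each story, might look something like the following minimal sketch. Everything in it is hypothetical: the rule names, the sentence templates, the company and the figures are invented for illustration, and a real system would store the audit record alongside the published story.

```python
import json
from datetime import datetime, timezone


def build_story(data, rules):
    """Apply each (name, condition, template) rule and log which ones fired."""
    sentences, fired = [], []
    for name, condition, template in rules:
        if condition(data):
            sentences.append(template.format(**data))
            fired.append(name)
    # The audit record captures the input data and the rules that shaped the text.
    audit = {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "input": data,
        "rules_fired": fired,
    }
    return " ".join(sentences), json.dumps(audit, sort_keys=True)


# Hypothetical if-then rules for an earnings story.
RULES = [
    ("lead", lambda d: True,
     "{company} reported earnings of ${eps:.2f} per share."),
    ("beat", lambda d: d["eps"] > d["forecast"],
     "That beat the forecast of ${forecast:.2f}."),
    ("miss", lambda d: d["eps"] < d["forecast"],
     "That fell short of the forecast of ${forecast:.2f}."),
]

text, audit_json = build_story(
    {"company": "Acme Corp.", "eps": 1.25, "forecast": 1.10}, RULES
)
```

With a record like this on file, “the computer did it” can be replaced by a specific answer: which data came in, and which rules turned it into sentences.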
The points above aren’t aimed at discouraging editors from moving into robotic newswriting. But they underline the importance of planning, testing and fidelity to your editorial standards.
Plus, it’s good to recognize that some things will always be best done by humans.
This article originally appeared on Medium and is reprinted here with the author’s permission.