‘85% is quite good in machine learning terms — but journalistically, it’s problematic. We have moved from dealing with facts, to dealing with estimates.’
The ONLINE JOURNALISM BLOG points out the difference between reporting data as facts and reporting data as probabilities — a distinction that often arises when results come from machine learning systems rather than from direct measurement.
The piece offers a checklist for accountability when using algorithmically-generated results:
- How big was the training data set?
- How big was the test set?
- What was the training data?
- How was the percentage of accuracy determined?
- What issues have they identified in the data — for example the sorts of false positives or negatives that it tends towards?
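The last two checklist items can be made concrete with a toy confusion-matrix calculation (all numbers below are hypothetical, chosen to echo the 85% figure in the opening quote):

```python
# Illustrative sketch with made-up numbers: two systems with the same
# headline accuracy can have very different error profiles.
def rates(tp, fp, tn, fn):
    """Return (accuracy, false-positive rate, false-negative rate)
    from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    fpr = fp / (fp + tn)   # share of true negatives wrongly flagged
    fnr = fn / (fn + tp)   # share of true positives missed
    return accuracy, fpr, fnr

# 85% accuracy, errors spread evenly across both classes
print(rates(tp=425, fp=75, tn=425, fn=75))

# Also 85% accuracy, but this system misses over 70% of one class
print(rates(tp=40, fp=50, tn=810, fn=100))
```

Both systems report "85% accuracy", yet the second misses most instances of the minority class — exactly the kind of detail the checklist asks reporters to probe before quoting a single accuracy number.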
OUR TAKE
- The author brings the points to life with examples — for instance, a BBC report on the ratio of male to female speaking time in Game of Thrones, analyzed using a machine learning system.
- Journalistic skills in algorithmic accountability are increasingly important for both internal and external material: newsrooms are using machine learning for their own stories, and they are reporting on findings by third parties whose conclusions rest on algorithms.