‘85% is quite good in machine learning terms — but journalistically, it’s problematic. We have moved from dealing with facts, to dealing with estimates.’
The ONLINE JOURNALISM BLOG points out the difference between reporting data as facts and reporting data as probabilities. The distinction arises when results come from machine learning systems rather than from quantitative findings.
The piece offers a checklist for accountability when using algorithmically generated results:
- How big was the training data set?
- How big was the test set?
- What was the training data?
- How was the percentage of accuracy determined?
- What issues have they identified in the data — for example, the sorts of false positives or negatives it tends towards?
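The accuracy and false positive/negative questions above can be sketched in a few lines of Python. This is only an illustration: the function name `summarize` and the 20-item test set are hypothetical, not taken from the piece, and a real evaluation would use the model's actual held-out test data.

```python
def summarize(predictions, labels):
    """Compare a model's predictions with the true labels of a test set."""
    pairs = list(zip(predictions, labels))
    true_pos = sum(1 for p, y in pairs if p and y)
    true_neg = sum(1 for p, y in pairs if not p and not y)
    false_pos = sum(1 for p, y in pairs if p and not y)  # flagged, but wrong
    false_neg = sum(1 for p, y in pairs if not p and y)  # missed cases
    return {
        "test_size": len(pairs),
        "accuracy": (true_pos + true_neg) / len(pairs),
        "false_positives": false_pos,
        "false_negatives": false_neg,
    }


# Hypothetical test set: 10 items that truly match, 10 that do not.
labels = [True] * 10 + [False] * 10
# Hypothetical model output: 8 hits, 2 misses, 9 correct rejections, 1 false alarm.
predictions = [True] * 8 + [False] * 2 + [False] * 9 + [True] * 1

report = summarize(predictions, labels)
print(report)
```

With these made-up numbers the model is 85% accurate, yet that single figure hides two missed cases and one false alarm, which is the kind of detail the checklist asks reporters to request.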
OUR TAKE
Journalistic skills in algorithmic accountability are increasingly valuable. Newsrooms are using machine learning internally for their own stories, and they must also report on findings from third parties whose conclusions rely on algorithms.