Finds matches between two spreadsheets, optionally using various fuzzy-matching algorithms. Used by organisations including the Guardian, the Times, and the news agency Irin, which used it to identify a company that held a contract with the United Nations while also appearing on the UN's own sanctions list.
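As a rough illustration of the idea, here is a minimal Python sketch of fuzzy matching between two spreadsheets. It is not the tool's own code: it uses `difflib.SequenceMatcher` from the standard library as a stand-in for whichever algorithm is chosen, and the sample data, column names, `fuzzy_match` helper and 0.8 threshold are all hypothetical.

```python
import csv
import io
from difflib import SequenceMatcher

# Hypothetical sample data standing in for two spreadsheets.
CONTRACTS_CSV = """supplier,value
Acme Trading Ltd,120000
Northwind Logistics,45000
"""

SANCTIONS_CSV = """name,listed
ACME Trading Limited,2015
Globex Holdings,2012
"""


def similarity(a, b):
    """Return a similarity ratio between 0 and 1, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def fuzzy_match(left_rows, right_rows, left_field, right_field, threshold=0.8):
    """Compare every left row with every right row; keep pairs scoring at or above the threshold."""
    matches = []
    for left in left_rows:
        for right in right_rows:
            score = similarity(left[left_field], right[right_field])
            if score >= threshold:
                matches.append((left[left_field], right[right_field], round(score, 2)))
    return matches


contracts = list(csv.DictReader(io.StringIO(CONTRACTS_CSV)))
sanctions = list(csv.DictReader(io.StringIO(SANCTIONS_CSV)))

matches = fuzzy_match(contracts, sanctions, "supplier", "name")
for left_name, right_name, score in matches:
    print(left_name, "~", right_name, score)
```

Comparing every pair of rows is fine for small files but grows quickly with size, and the threshold trades missed matches against false positives, so any hit still needs checking by hand.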
Enriches data by doing batch lookups against various online services. For example, quickly convert a list of company names into a list of the directors of those companies.
An introduction to how code is used in the newsroom, with recent story examples, explaining the fundamental concepts and demystifying the jargon. We also guided attendees through the most common programming languages, and offered a roadmap for deciding which to pursue. Slides here.
This talk explained the ways automation is already being used in newsrooms, why the coming wave of automation is not a threat, and how we can embrace this new technology to improve the quality of investigative reporting at a time of shrinking newsroom resources. Slides here.
Graph databases are very useful for finding connections or patterns within our data. This was a hands-on introduction to the graph database Neo4j, showing examples of its use in investigative stories including the Panama and Paradise Papers, and teaching attendees how to build a graph of noteworthy individuals and match them with corporate data to see the networks involved.
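To give a flavour of the underlying idea, here is a small sketch in plain Python rather than Neo4j itself. It links people to companies and then searches for the chain connecting two individuals. The names, roles, and the `build_graph` and `shortest_path` helpers are all made up for illustration; in a graph database the same question would be put to the stored graph as a query.

```python
from collections import defaultdict, deque

# Hypothetical records: (person, role, company).
OFFICERS = [
    ("Alice Example", "director", "Harbour Holdings Ltd"),
    ("Bob Sample", "shareholder", "Harbour Holdings Ltd"),
    ("Bob Sample", "director", "Seaview Nominees SA"),
    ("Carol Placeholder", "director", "Seaview Nominees SA"),
]


def build_graph(records):
    """Build an undirected graph linking each person to each of their companies."""
    graph = defaultdict(set)
    for person, _role, company in records:
        graph[person].add(company)
        graph[company].add(person)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest chain of nodes, or None if unconnected."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Sort neighbours so the traversal order is deterministic.
        for neighbour in sorted(graph[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(OFFICERS)
path = shortest_path(graph, "Alice Example", "Carol Placeholder")
print(" -> ".join(path))
```

Here the two individuals share no company directly, but the search surfaces the intermediary who sits on both, which is the kind of indirect link that is hard to spot in a spreadsheet.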
Fuzzy matching has become an increasingly important part of data-led investigations as a way to identify connections between public figures, key people and companies that are relevant to a story. This class showed attendees how it typically fits into the investigative process, and gave a practical introduction to using the CSV Match tool I developed. Slides here.