Automatically identify potential story leads. Lets you create autonomous bots which poll data sources and run predefined data analysis. The results are then compared with those from the bot's previous run, and any additions or deletions trigger an email alert.
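The compare-and-alert step could be sketched roughly as follows. This is an illustrative Python sketch only, not the tool's actual code: the function name `diff_runs`, the row format, and the sample data are all hypothetical, and sending the email is left out.

```python
def diff_runs(previous, current):
    """Return (added, removed) rows between two runs of the same analysis.

    Hypothetical helper: each row is assumed to be a hashable tuple.
    """
    prev, curr = set(previous), set(current)
    return sorted(curr - prev), sorted(prev - curr)


# Made-up results from two consecutive runs of a bot.
previous_run = [("Acme Ltd", "active"), ("Bolt plc", "active")]
current_run = [("Acme Ltd", "active"), ("Crane LLP", "dissolved")]

added, removed = diff_runs(previous_run, current_run)

# An alert would be triggered only when something has changed.
should_alert = bool(added or removed)
```

Comparing whole rows as set members means a changed value shows up as one deletion plus one addition, which is enough to trigger an alert in this sketch.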
Create simple static charts quickly – a tool for the non-technical. Can be easily customised to your organisation's house style using a simple stylesheet.
Enrich data by doing batch lookups against various online services. For example, quickly convert a list of company names into a list of directors of those companies.
Finds matches between two spreadsheets, optionally using various fuzzy-matching algorithms. Used by organisations including the Guardian, the Times, and the news agency Irin, which used it to identify a company that the United Nations had a contract with even though it was on the UN's own sanctions list.
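To give a flavour of what fuzzy matching between two lists involves, here is a minimal Python sketch using the standard library's `difflib`. It is not the tool's own implementation or algorithm: the function names, the 0.8 threshold, and the company names are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity score between 0 and 1, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def fuzzy_matches(left, right, threshold=0.8):
    """Return (left, right, score) for every pair scoring at or above threshold."""
    matches = []
    for l in left:
        for r in right:
            score = similarity(l, r)
            if score >= threshold:
                matches.append((l, r, round(score, 2)))
    return matches


# Made-up columns standing in for two spreadsheets.
contractors = ["Acme Trading Ltd", "Northwind Logistics"]
sanctions = ["ACME Trading Limited", "Globex Corp"]

matches = fuzzy_matches(contractors, sanctions)
```

Comparing every pair is fine for small lists but grows quadratically, and the threshold trades missed matches against false positives, so any match found this way still needs checking by hand.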
Like our reality, our data is often messy. Finding meaningful connections between messy datasets often means using fuzzy-matching algorithms. This talk was a high-level look at some of the most commonly used algorithms, their pros and cons, and how they are used in practice. Slides here.
Fuzzy matching has become an increasingly important part of data-led investigations as a way to identify connections between public figures, key people and companies that are relevant to a story. This class will show attendees how it typically fits into the investigative process, and give a practical introduction to using the CSV Match tool I developed.
Whether you find yourself collaborating on code, data, or prose, GitHub can work for journalists. This class will cover what GitHub is, the benefits of using it, and how it is typically used both by people doing data analysis and by developers. Attendees will be shown how to create a first repository and make pull requests.