Finds matches between two spreadsheets, optionally using various fuzzy-matching algorithms. Used by organisations including the Guardian, the Times, and the New Humanitarian, which used it to identify a company that held a contract with the United Nations while also appearing on the UN's own sanctions list.
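To give a flavour of what fuzzy matching between two lists involves, here is a minimal sketch using Python's standard-library `difflib`. It is not how the tool itself is implemented; the function names, the 0.8 threshold, and the company names are all made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def fuzzy_matches(left, right, threshold=0.8):
    """Compare every value in `left` with every value in `right`.

    Returns (left_value, right_value, score) for pairs scoring at or
    above `threshold`. The threshold is an illustrative assumption.
    """
    found = []
    for left_value in left:
        for right_value in right:
            score = similarity(left_value, right_value)
            if score >= threshold:
                found.append((left_value, right_value, round(score, 2)))
    return found


# Hypothetical data standing in for one column from each spreadsheet.
contractors = ["Acme Trading Ltd", "Northwind Logistics"]
sanctioned = ["ACME Trading Limited", "Southwind Holdings"]

matches = fuzzy_matches(contractors, sanctioned)
for left_value, right_value, score in matches:
    print(f"{left_value!r} ~ {right_value!r} (score {score})")
```

Comparing every pair is fine for small lists, but it scales poorly; a real tool would likely need to be smarter about which pairs it compares.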
Enriches data, adding new columns based on lookups to online services. For example, taking a spreadsheet of company numbers and turning it into a list of directors of those companies.
Get organised and avoid mistakes with better working practices. This session explains the underlying concepts and covers tips, tricks, and traps to avoid, all aimed at making your work with spreadsheets easier and more effective. Topics include wide versus tall data, separating semantics from presentation, how to name things, and data validation techniques.
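As a rough illustration of the wide versus tall distinction, the sketch below reshapes a wide table (one column per year) into a tall one (one row per observation). The data, column names, and the `wide_to_tall` helper are invented for this example.

```python
# Wide data: one row per country, one column per year.
wide = [
    {"country": "France", "2019": 10, "2020": 12},
    {"country": "Spain", "2019": 7, "2020": 9},
]


def wide_to_tall(rows, id_column, variable_name, value_name):
    """Turn each non-identifier column into its own row."""
    tall_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every tall row
            tall_rows.append(
                {id_column: row[id_column], variable_name: column, value_name: value}
            )
    return tall_rows


# Tall data: one row per country-year pair.
tall = wide_to_tall(wide, "country", "year", "value")
for row in tall:
    print(row)
```

Tall data is usually easier to filter, group, and chart, because adding a new year means adding rows rather than changing the table's structure.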
Guest lecture on the data processing pipelines behind the Financial Times’ 2020 US election coverage, including the poll tracker and live results page.
You may have come across acronyms like HTTP and HTML, but what do they mean, and why do they matter? This class explains the concepts that underpin how the web works – which are simpler than you might think – as well as how you can use this knowledge to extract the information you need, and understand exactly how your stories reach your readers.
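For a sense of what extracting information from HTML can look like, here is a small sketch using Python's standard-library `html.parser`. The page content and the `HeadlineParser` class are invented for illustration; in practice the HTML would arrive as the body of an HTTP response rather than a hard-coded string.

```python
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Collect the text inside every <h2> element."""

    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.headlines = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True
            self.headlines.append("")  # start a new headline

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False

    def handle_data(self, data):
        if self.in_h2:
            self.headlines[-1] += data  # text may arrive in several pieces


# A made-up page standing in for the body of an HTTP response.
page = """<html><body>
<h2>Council approves budget</h2>
<p>Details here.</p>
<h2>Flood warnings <em>issued</em></h2>
</body></html>"""

parser = HeadlineParser()
parser.feed(page)
headlines = [headline.strip() for headline in parser.headlines]
print(headlines)
```

The same idea applies to any element: once you know which tags mark the information you want, the structure of the page does most of the work.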
Fuzzy matching has become an increasingly important part of data-led investigations as a way to identify connections between public figures, key people and companies that are relevant to a story. This class shows attendees how it typically fits into the investigative process, and includes a practical introduction to using the CSV Match tool I developed.