Revealed that Chinese suppliers now dominate the trade in ‘computer numerical control’ (CNC) devices vital to Moscow’s military industries. I matched customs data on imports of CNC machinery against data on companies sanctioned by the US Treasury.
Shortlisted for the Sigma Awards. Revealed that thousands of mosques across China have been architecturally altered or destroyed in a government effort to suppress Islamic culture. I built a dataset of mosques in China from data extracted from Baidu Maps. Along with others, I then manually classified historical satellite imagery to create a database of what had changed.
Finds fuzzy matches between CSV files. Used by news organisations including the Wall Street Journal, which used it to match officials’ shareholding declarations against the names of companies their agencies oversaw, and the New Humanitarian, which used it to identify a company that held a United Nations contract while also appearing on the UN’s own sanctions list.
Processes ship-tracking data to generate a summary of where a vessel has been and to identify any gaps in the record. It can also highlight where data has changed, which can help spot where transponder data has been spoofed.
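To give a flavour of the two checks described above, here is a minimal Python sketch. It is not the tool's actual code: the record layout, the function names `find_gaps` and `find_changes`, the six-hour threshold, and the sample vessel are all assumptions invented for illustration.

```python
from datetime import datetime, timedelta


def find_gaps(records, max_gap=timedelta(hours=6)):
    """Return (start, end) pairs where consecutive reports are further apart than max_gap."""
    ordered = sorted(records, key=lambda r: r["time"])
    gaps = []
    for prev, curr in zip(ordered, ordered[1:]):
        if curr["time"] - prev["time"] > max_gap:
            gaps.append((prev["time"], curr["time"]))
    return gaps


def find_changes(records, fields=("name",)):
    """Return (time, field, old, new) wherever a reported field differs from the previous report."""
    ordered = sorted(records, key=lambda r: r["time"])
    changes = []
    for prev, curr in zip(ordered, ordered[1:]):
        for field in fields:
            if prev[field] != curr[field]:
                changes.append((curr["time"], field, prev[field], curr[field]))
    return changes


# Hypothetical sample track: three position reports for one vessel.
records = [
    {"time": datetime(2024, 1, 1, 0, 0), "lat": 36.1, "lon": -5.4, "name": "EXAMPLE STAR"},
    {"time": datetime(2024, 1, 1, 2, 0), "lat": 36.0, "lon": -5.0, "name": "EXAMPLE STAR"},
    {"time": datetime(2024, 1, 2, 9, 0), "lat": 35.2, "lon": -2.1, "name": "EXAMPLE SUN"},
]

gaps = find_gaps(records)
changes = find_changes(records)
```

Here the 31-hour silence between the second and third reports is flagged as a gap, and the name change that follows it is flagged as a change. Neither proves spoofing on its own; they are places to look more closely.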
As data journalism has become mainstream, more data editor positions have been created. But what makes a good data editor? In this panel we will discuss what it takes to do the job effectively, the different things it can involve, and the different routes to getting there. With Marie-Louise Timcke, Jan Strozyk, Helena Bengtsson, Eva Belmonte, and Dominik Balmer, moderated by me.
Guest lecture covering the origins of investigative data journalism, the nature of data in investigations, where it comes from, plus what code is and how it is used in the newsroom to do this kind of work.
Fuzzy matching is a process for linking up names that are similar but not quite the same. It can be an important part of data-led investigations, identifying connections between key people and companies that are relevant to a story. This class covers how it fits into the investigative process, and includes a practical introduction to using the CSV Match tool I developed.
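As a rough illustration of the idea, here is a minimal Python sketch that scores pairs of names with the standard library's `difflib.SequenceMatcher`. It is not how CSV Match works internally; the function names, the 0.8 threshold, and the sample names are assumptions made up for this example.

```python
import csv
import io
from difflib import SequenceMatcher


def normalise(name):
    # Lowercase and collapse whitespace so trivial differences don't lower the score.
    return " ".join(name.lower().split())


def similarity(a, b):
    # Score from 0.0 (nothing in common) to 1.0 (identical after normalising).
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def fuzzy_match(left_names, right_names, threshold=0.8):
    # Compare every left name with every right name; keep pairs at or above the threshold.
    matches = []
    for left in left_names:
        for right in right_names:
            score = similarity(left, right)
            if score >= threshold:
                matches.append((left, right, round(score, 2)))
    return matches


def read_column(csv_text, column):
    # Pull a single named column out of CSV text.
    return [row[column] for row in csv.DictReader(io.StringIO(csv_text))]


# Hypothetical sample data standing in for two CSV files.
declarations = "name\nAcme Holdings Ltd\nNorthwind Trading\n"
registry = "company\nACME Holdings Limited\nSouthwind Shipping\n"

matches = fuzzy_match(
    read_column(declarations, "name"),
    read_column(registry, "company"),
)
```

With this sample data the only pair that clears the threshold is ‘Acme Holdings Ltd’ and ‘ACME Holdings Limited’. The threshold is a judgement call: set it lower and more false positives appear, set it higher and genuine variants are missed, so matches are leads to verify rather than findings.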
Ever relied upon an online source, only to find it later deleted or changed? This class covers how to get the most out of resources like the Wayback Machine – what they’re good for, and what they’re not. We also cover when and how to build your own private archives of web content.