I’m a data reporter at Bloomberg News in London. Previously I was at the Financial Times.
I also organise Journocoders, a community of journalists and others working in the media who are interested in developing technical skills for use in their reporting.
A look into the collapse of UK lender Market Financial Solutions, which left lenders facing up to $1.7bn of losses. I analysed corporate filings, finding that much of its lending went to companies controlled by five people – many sharing a registered address with MFS. I also found that hundreds of the properties securing these loans had multiple mortgages.
A look at the effects of the Labour government’s tax changes targeting the wealthy, and whether they are likely to achieve their aim of bringing in more tax revenue. I analysed corporate filings to identify when and how business leaders had changed their country of residence over the previous four years.
Scraped data is often the backbone of an investigation, but some websites are more difficult to scrape than others. This session covers best practices for dealing with tricky sites, including coping with captchas, using proxy and other scraping services, plus the tradeoffs and costs of these approaches.
Fuzzy matching is a process for linking up names that are similar but not quite the same. It can be an important part of data-led investigations, identifying connections between key people and companies that are relevant to a story. This class covers how it fits into the investigative process, and includes a practical introduction to using the CSV Match tool I developed.
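As a rough illustration of the idea, here is a minimal sketch using only Python's standard library. It is not how CSV Match or Textmatch work internally; the `normalise` and `fuzzy_matches` helpers, the 0.85 threshold, and the company names are all made up for the example.

```python
from difflib import SequenceMatcher

def normalise(name):
    # Lowercase, drop full stops and commas, and collapse repeated whitespace.
    return ' '.join(name.lower().replace('.', '').replace(',', '').split())

def fuzzy_matches(names_a, names_b, threshold=0.85):
    # Compare every name in one list against every name in the other,
    # keeping pairs whose similarity ratio meets the (arbitrary) threshold.
    matches = []
    for a in names_a:
        for b in names_b:
            score = SequenceMatcher(None, normalise(a), normalise(b)).ratio()
            if score >= threshold:
                matches.append((a, b, round(score, 2)))
    return matches

# Invented names, for illustration only.
declared = ['Acme Holdings Ltd.', 'Borealis Energy PLC']
regulated = ['ACME Holdings Ltd', 'Boreal Shipping', 'Northwind Traders']
print(fuzzy_matches(declared, regulated))
```

Comparing every pair is fine for small lists but slows down quickly on larger ones, and the right threshold depends on the data, so any matches found this way still need checking by hand.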
Finds fuzzy matches between CSV files. Based on Textmatch, a Python library I also maintain. Has been used by news organisations including the Wall Street Journal, which used it to match officials’ shareholding declarations against the names of companies their agencies oversaw, and the New Humanitarian, which used it to identify a company that held a contract with the United Nations while also appearing on the UN’s own sanctions list.
Enriches data, adding new columns based on lookups to online services. For example, taking a spreadsheet of company numbers and turning it into a list of directors of those companies.
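The general pattern can be sketched as below. This is an assumption-laden illustration rather than the tool's actual code: the `enrich` and `lookup_directors` functions are hypothetical, and the online service is replaced by a hard-coded dictionary of made-up company numbers and director names so the example runs offline.

```python
import csv
import io

# Stand-in for an online service: made-up company numbers and director names.
FAKE_REGISTRY = {
    '00000001': ['Example Director A', 'Example Director B'],
    '00000002': ['Example Director C'],
}

def lookup_directors(company_number):
    # A real version would query a web service here instead.
    return FAKE_REGISTRY.get(company_number, [])

def enrich(rows, key, new_column, lookup):
    # Emit one output row per looked-up value, copying the original columns.
    # Rows whose lookup returns nothing are dropped.
    for row in rows:
        for value in lookup(row[key]):
            yield {**row, new_column: value}

source = io.StringIO('company_number\n00000001\n00000002\n00000003\n')
enriched = list(enrich(csv.DictReader(source), 'company_number', 'director', lookup_directors))
for row in enriched:
    print(row)
```

Passing the lookup in as a function keeps the row handling separate from whichever service is being queried.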