Projects

Tools

CSV Match

Finds fuzzy matches between CSV files. Based on Textmatch, a Python library I also maintain. It has been used by news organisations including the Wall Street Journal, which matched officials’ shareholding declarations against the names of companies their agency oversaw, and the New Humanitarian, which identified a company that held a United Nations contract while also appearing on the UN's own sanctions list.
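To give a feel for what fuzzy matching between two CSV files involves, here is a minimal sketch. It is not how CSV Match or Textmatch work internally: it uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure, and the function names, the `name` column, the 0.8 threshold and the sample data are all invented for illustration.

```python
import csv
import io
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def fuzzy_match(rows1, rows2, field, threshold=0.8):
    """Yield (row1, row2, score) for every pair scoring at or above the threshold."""
    for r1 in rows1:
        for r2 in rows2:
            score = similarity(r1[field], r2[field])
            if score >= threshold:
                yield r1, r2, score


# Two tiny in-memory "files" holding made-up data (hypothetical column name)
declarations = list(csv.DictReader(io.StringIO("name\nAcme Holdings Ltd\nGlobex Corp\n")))
overseen = list(csv.DictReader(io.StringIO("name\nACME Holdings Limited\nInitech\n")))

matches = list(fuzzy_match(declarations, overseen, "name", threshold=0.8))
for left, right, score in matches:
    print(left["name"], "~", right["name"], round(score, 2))
```

Comparing every row against every other row is fine for small files but grows quadratically, so a real tool would need something smarter for large datasets.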
Reconcile

Enriches data, adding new columns based on lookups to online services. For example, taking a spreadsheet of company numbers and turning it into a list of directors of those companies.
CSV Fetch

Downloads a list of URLs from a CSV spreadsheet.
Entabulate

Converts JSON files to CSV. Supports JSON Lines (such as the Companies House PSC data) and folders with nested JSON files. Can also output JSON Lines. Data is streamed, so files much bigger than the available memory can be converted. Handles nested JSON objects.
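The streaming idea can be sketched roughly as follows: read one JSON Lines record at a time, flatten any nested objects, and write the row straight out, so memory use stays constant. This is an illustration rather than Entabulate's actual code. The dotted column names, the fixed list of output columns and the sample records are assumptions made for the example.

```python
import csv
import io
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into one level, joining keys with dots."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def jsonl_to_csv(lines, out, fieldnames):
    """Read JSON Lines one record at a time, writing each CSV row immediately."""
    writer = csv.DictWriter(out, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    for line in lines:
        if line.strip():  # skip blank lines
            writer.writerow(flatten(json.loads(line)))


# Made-up records; a real input would be an open file rather than a string
source = io.StringIO(
    '{"company": "123", "data": {"name": "A Person", "address": {"country": "UK"}}}\n'
    '{"company": "456", "data": {"name": "B Person"}}\n'
)
output = io.StringIO()
jsonl_to_csv(source, output, ["company", "data.name", "data.address.country"])
print(output.getvalue())
```

Fixing the column list up front is the simplest way to stream; discovering columns automatically would need either a first pass over the data or some buffering.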
Autocracy

Automates bulk OCR processing.
Ship Overviewer
archived
Processes ship tracking data and generates a summary of where the vessel has been, and identifies any gaps. It can also highlight where data has changed, which can be used to spot where transponder data has been spoofed.
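As a rough illustration of gap detection, the sketch below sorts a vessel's position timestamps and reports any pair of consecutive reports further apart than a chosen interval. It is not taken from Ship Overviewer. The `find_gaps` name, the one-hour threshold and the sample timestamps are assumptions.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, max_interval=timedelta(hours=1)):
    """Return (start, end) pairs where consecutive reports are too far apart."""
    ordered = sorted(timestamps)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > max_interval
    ]


# Made-up position report times for a single vessel
positions = [
    datetime(2020, 1, 1, 0, 0),
    datetime(2020, 1, 1, 0, 30),
    datetime(2020, 1, 1, 4, 0),
    datetime(2020, 1, 1, 4, 20),
]

gaps = find_gaps(positions)
for start, end in gaps:
    print(f"No data between {start} and {end}")
```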
NDJson-to-CSV
archived
Superseded by Entabulate.
Newsagent
archived
Monitors data sources and alerts you when they change. Lets you watch any kind of online dataset, which is periodically fetched, processed, and compared against the previously fetched version. Additions to or deletions from that data trigger an alert.
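The comparison step might look something like the sketch below, which keys each snapshot's rows on an identifier and reports what was added or removed. This is a guess at the general shape rather than Newsagent's implementation. The `diff_snapshots` name, the `id` key and the sample rows are hypothetical.

```python
def diff_snapshots(previous, current, key):
    """Return (added, removed) rows by comparing two snapshots on a key field."""
    prev = {row[key]: row for row in previous}
    curr = {row[key]: row for row in current}
    added = [curr[k] for k in curr if k not in prev]
    removed = [prev[k] for k in prev if k not in curr]
    return added, removed


# Made-up snapshots of the same dataset fetched at two different times
last_fetch = [{"id": "1", "name": "Alpha"}, {"id": "2", "name": "Beta"}]
this_fetch = [{"id": "2", "name": "Beta"}, {"id": "3", "name": "Gamma"}]

added, removed = diff_snapshots(last_fetch, this_fetch, key="id")
if added or removed:
    print(f"{len(added)} added, {len(removed)} removed")  # an alert would be sent here
```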
Pantiler
archived
Converts geographic data into vector map tiles.
Graphik
archived
Creates simple static charts quickly – a tool for the non-technical. Can be easily customised to an organisation's house style using a stylesheet.
CSV Pivot
archived
Produces pivot tables, much like those in Excel, but in the terminal.
Who Follows Who
archived
Finds which Twitter accounts follow each other from a predefined list.

Other projects

London Overground capacity display

An experimental display showing the predicted space in train carriages. Deployed at Shoreditch High Street station for the last quarter of 2017. Built in collaboration between Geovation and OpenCapacity with TfL.
Track the dot

Project-managed an interdisciplinary creative team at Data4Change 2017 in Kampala, Uganda, working with Kenyan NGO Chrips. My team developed an offline-first campaign, based on Chrips's research on urban violence, that targeted community leaders in northern Kenya.
We are Sudan

A mobile-first website built at Data4Change 2016 in Beirut, Lebanon, working with Sudanese NGO Kace. Developed with a team as part of a social media campaign based on Kace's research into quality of life as experienced by ordinary people in the country.
