I often find myself munging datasets from various sources in CSV format. While grep, awk, and sort go a long way, command line tools for processing CSV data are surprisingly lackluster.
Sure, I could import the data into a database and run SQL queries on it, but that gets tedious after a few runs, unless, of course, you automate it. Which I did, and I have decided to share the tool as CSV Query.
CSV Query is a command line tool that allows you to run SQL queries on data stored in CSV files, for example:
$ csvq --select "count(*)" --where "name='Jakob'" sample.csv
count(*)
--------
1
To install CSV Query:
$ gem install csv_query
… assuming you have Ruby set up.
A few days ago, I tweeted:
- Find problem
- Fail to find existing project that solves it
- Write it yourself
- Find 2 other projects that does the same thing
That was roughly my experience with CSV Query. A few other projects aim to do much the same thing, and I didn’t find them until after I had already written my own tool. Sigh.
- csvq.py - Python variant that does almost the same thing, but uses Python for querying
- CSVql - A query language for CSV files
CSV Query is slightly different in its approach, basically passing all querying on to SQLite. Here’s hoping it’ll prove useful for a few people other than myself.
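To give a rough idea of what "passing all querying on to SQLite" can look like, here is a minimal sketch in Python using only the standard library. It is not CSV Query's actual code (the tool is a Ruby gem), and the function name `query_csv`, the table name `csv`, and the sample data are all made up for illustration. The assumed approach: load the CSV into an in-memory SQLite table, then hand the select and where fragments to SQLite unchanged.

```python
import csv
import io
import sqlite3


def query_csv(csv_text, select="*", where=None, table="csv"):
    """Load CSV text into an in-memory SQLite table and run one SELECT on it.

    Hypothetical helper for illustration; assumes the first row is a header
    and every data row has the same number of fields as the header.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]

    # Quote column names so headers containing spaces still work as identifiers.
    columns = ", ".join('"{}"'.format(name.replace('"', '""')) for name in header)
    placeholders = ", ".join("?" for _ in header)

    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f'CREATE TABLE "{table}" ({columns})')
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)

        # The select and where fragments go to SQLite verbatim, so anything
        # SQLite understands (aggregates, functions, operators) is available.
        sql = f'SELECT {select} FROM "{table}"'
        if where:
            sql += f" WHERE {where}"
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# Made-up sample data standing in for sample.csv.
SAMPLE = "name,age\nJakob,30\nAnna,25\n"

result = query_csv(SAMPLE, select="count(*)", where="name='Jakob'")
print(result)  # [(1,)]
```

Because the fragments are pasted straight into the SQL string, a sketch like this is only suitable for queries you write yourself, not for untrusted input. Note also that every value is stored as text here, so numeric comparisons would need an explicit cast in the query.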