Using SQL to analyze data in JavaScript without a database
This is the elaboration of an idea I had in this blog post: JavaScript Arrays - Map, Reduce and Filter.
Take the following dataset listing countries and their respective population/GDP:
Consider you want to find the average gdp of all countries with a population less than 10 million.
Wouldn’t it be nice to be able to do this using an SQL statement?
The idea is the following: The map, reduce, sort and filter methods on JavaScript arrays can easily be mapped to select, where and order by in SQL.
WHERE <—> Filter
The WHERE statement thins out the rows of a tables.
Just like Array.prototype.filter
SELECT <—> Map
The SQL SELECT statement picks a subset of the table columns.
Array.prototype.map can be used to do the same:
ORDER BY <—> Sort
Using ORDER BY sorts the selected rows.
Exactly like Array.prototype.sort:
Aggregation functions (like sum, avg etc.) <—> Reduce
The following SQL statement containing the ‘sum’ aggregation function can be implemented in JavaScript by using Array.prototype.reduce.
Pretty straight-forward, don’t you think?
Well these examples are simplified for the sake of brevity, but they constitute a good starting point to dive into the implementation.
Implementation, Demo and Repository
The implementation is based on the SQLLite parser written by Nick Wronski for Codeschool which parses SQL statements and generates an abstract syntax tree out of which the neccessary data can be extracted.
You can find the repository here.
Please note that this is experimental and to be considered a case study.
If you would however like to use or build on this technique, feel free to do so.
You can find a nice collection of data to analyze in this github repo by Justin Dorfman
Currently supported statements:
Since the where clause is eval’d you can do more advanced stuff like filtering on properties of sub-objects:
or determining whether every items of an array fullfills a certain condition
Note that due to the fact that sqllite columns are case insensitive you have to put camelCase columns in quotation-marks:
Checking if a column is null only works like this: where “column == null” = ‘true’