By Camuel Gilyadov, on February 5th, 2012

BigData OpenSource adventures

Lightweight virtualization: http://ZeroVM.org
OpenDremel: http://code.google.com/p/dremel/
Apache Drill Proposal: http://wiki.apache.org/incubator/DrillProposal
Our GitHub page: http://github.com/Dazo-org

 ——————

What started as the OpenDremel project ended up as a lightweight virtualization project named ZeroVM.

Building a full Google BigQuery clone requires a decent lightweight virtualization framework. The reason is that in the "three Vs" definition of BigData, the last V stands for variety. We therefore cannot limit all queries to SQL; the user must also be able to run arbitrary imperative code within a query. And as soon as we allow querying with an imperative language (in addition to SQL), the need for a secure sandbox becomes immediately evident. Hence the need for a lightweight virtualization framework. Since we could not find such a framework on the web, we developed our own. Look here for more details:

http://ZeroVM.org
http://github.com/Dazo-org/zerovm

After completing the ZeroVM project we plan to get back to implementing a Google BigQuery clone on top of it.

Another issue with implementing a proper Google BigQuery clone is that it requires a decent object storage platform. Google uses Google Storage as the underlying storage infrastructure for its BigQuery system, and we need something similar. Fortunately, given the phenomenal success of OpenStack, and of Swift in particular, the choice was never in doubt. We took Swift and embedded ZeroVM into it, so that, among other benefits, all data access is local.