Big Data Spain

15th ~ 16th OCT 2015 MADRID, SPAIN #BDS15


THANK YOU FOR AN AMAZING CONFERENCE!


THE 4th EDITION OF BIG DATA IN Oct 2015 WAS A RESOUNDING SUCCESS.

ToroDB: SCALING PostgreSQL LIKE MongoDB

Friday 16th

from 18:00 pm to 18:45 pm

Room 25

-

Technical

NoSQL databases have emerged as a response to some perceived problems in the RDBMSs: agile/dynamic schemas; and transparent, horizontal scaling of the database. The former has been promptly targeted with the introduction of unstructured data types, but scaling a relational databases is still a very hard problem.

Read more

As a consequence, all NoSQL databases have been built from scratch: their storage engines, replication techniques, journaling, ACID support (if any). They haven't leveraged the previously existing state-of-the-art of RDBMSs, effectively re-inventing the wheel. Isn't this sub-optimal? Wouldn't it be possible to construct a NoSQL database by layering it on top of a relational database?

Enter ToroDB. ToroDB is an open source project that behaves as a NoSQL database but runs on top of PostgreSQL, one of the most respected and reliable relational databases. ToroDB offers a document (JSON-like) interface, and implements the MongoDB wire protocol, hence being compatible with existing MongoDB drivers and applications. Rather than using PostgreSQL's jsonb data type, ToroDB explored an innovative approach by transforming JSON documents to a fully relational representation, in an automated way. This brings to the table many advantages like lower disk footprint and automatic data-partitioning, leading to significantly faster queries.

As ToroDB speaks the MongoDB protocol, it also implements MongoDB replication and sharding techniques, enabling it to scale and offer HA like Mongo. Being based on PostgreSQL, ToroDB is effectively scaling PostgreSQL much in the same way MongoDB scales.

This presentation describes the architecture, internals and pitfalls of implementing MongoDB replication on ToroDB, and how key PostgreSQL technologies have been leveraged to accomplish this task (such as the use of Logical Decoding to serve idempotent database changes). It also addresses the MongoDB protocol itself, CAP, Jepsen and comments about real performance. It also touches MongoWP, a component of ToroDB built as a separate open source library, that implements the MongoDB protocol and enables development of Mongo-server-like, third-party applications.

Álvaro Hernández foto

Álvaro Hernández

8KdataCEO