

PostgreSQL Reflexions

After 4 years of managing PostgreSQL, I'd like to share my experience.

PostgreSQL is incredible:

Sadly, however, PostgreSQL has some drawbacks:

Complementing PostgreSQL with Hadoop:

For this reason, it was decided to offload some responsibilities, such as batch exports and analytics, to another technology that scales horizontally: Hadoop.

The good news is that synchronizing data from PostgreSQL to Hive daily is easy, thanks to incremental Sqoop imports into HDFS based on indexed timestamp columns in PostgreSQL. This pushes the data from PostgreSQL to Hive, ready to analyze. The two tools complement each other well: Hive does not handle sequences or triggers, and PostgreSQL does not handle large-scale data historization or columnar aggregation.
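To make the daily synchronization more concrete, here is a minimal sketch in Python that assembles a Sqoop incremental import command in `lastmodified` mode. The host, database, table, column, and directory names are all hypothetical; only the Sqoop flags themselves (`--incremental`, `--check-column`, `--last-value`, `--merge-key`) are real options. It assumes the timestamp column is indexed on the PostgreSQL side so that the filter stays cheap, and it only builds the command rather than running it.

```python
import shlex
from datetime import datetime
from typing import List


def build_sqoop_incremental_import(
    jdbc_url: str,
    table: str,
    check_column: str,
    last_value: datetime,
    target_dir: str,
    merge_key: str,
) -> List[str]:
    """Return the argv for a Sqoop incremental import in lastmodified mode."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        # Only rows whose timestamp is newer than last_value are pulled.
        "--incremental", "lastmodified",
        "--check-column", check_column,
        "--last-value", last_value.strftime("%Y-%m-%d %H:%M:%S"),
        # Rows already present in the target directory are merged on this key.
        "--merge-key", merge_key,
    ]


# Hypothetical values, for illustration only.
cmd = build_sqoop_incremental_import(
    jdbc_url="jdbc:postgresql://db.example.com:5432/shop",
    table="orders",
    check_column="updated_at",
    last_value=datetime(2020, 1, 1),
    target_dir="/data/raw/orders",
    merge_key="id",
)
print(shlex.join(cmd))
```

A daily job would typically store the timestamp of the previous run and pass it as `last_value`, so that each run only transfers the rows modified since then.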

While PostgreSQL does not handle aggregations over billions of rows well, its descendant Greenplum is supposed to.

Expect a Greenplum post soon.
