SCHEDULE - TALK DETAIL



Keynote | Technical

Apache Spark vs rest of the world – Problems and Solutions

Thursday 16th | 13:40 - 14:10 | Theatre 18


One-liner summary:

Apache Spark is a great solution for building Big Data applications. It provides fast SQL-like processing, a machine learning library, and a streaming module for near-real-time processing of data streams. Unfortunately, during application development and production deployment we often run into difficulties when mixing various data sources or bulk loading computed data into SQL or NoSQL databases. Many challenges arise where Apache Spark meets the rest of the Big Data world, including HBase, Hive, PostgreSQL, and Kafka. These are the issues we will discuss in our presentation.

Keywords defining the session:

- Apache HBase

- Apache Spark

- PostgreSQL
