Big Data Spain

17th ~ 18th NOV 2016 MADRID, SPAIN #BDS16

Introduction to Apache Apex

Thursday 17th

from 17:15 to 17:55

Theatre 20

Keynote

Apache Apex is an open-source stream processing platform built for large scale, high throughput, low latency, high availability and operability. Its unified architecture supports both real-time and batch processing. Apex is Java-based and runs natively on Apache Hadoop YARN and HDFS.

We will discuss the key features of Apache Apex, its architectural differences from similar platforms, and how those differences affect use cases such as large-scale ingestion, fast real-time analytics, data movement, ETL, fast batch, and workloads with low-latency SLAs and high throughput. Apex APIs and libraries of operators and examples focus on developer productivity. We will present the programming model with examples and show how custom business logic can be integrated through the Apex operator API.

We will also cover integration with connectors to sources and destinations (including Kafka, JMS, SQL, NoSQL and files), scalability with advanced partitioning, fault tolerance and processing guarantees, the computation and scheduling model, state management, windowing and dynamic changes. Attendees will learn how these features affect time to market and total cost of ownership, and why they matter in existing Apex production deployments.

Thomas Weise

The Apache Software Foundation

Committer and PMC member, Apache Apex