Keynote | Technical
Friday 17th | 13:20 - 13:50 | Theatre 25
Keywords defining the session:
- Apache Flink
- Apache Kudu
- Kappa
Takeaway points of the session:
- How to implemente kappa architectures
- How to integrate Apache Flink and Kudu
Kappa Architecture is a software architecture pattern that makes use of an immutable, append only log. All the processing of the event will be performed in the input streams and persisted as real-time views. Apache Flink is very well suited to be the processing engine because it provides support for event-time semantics, stateful exactly-once processing, and achieves high throughput and low latency at the same time. Apache Kudu Kudu is a storage system good at both ingesting streaming data and good at analysing it using ad-hoc queries (e.g. interactive SQL based) and full-scan processes (e.g Spark/Flink). So Kudu is a good fit to store the real-time views in a Kappa Architecture. We have developed and open-sourced a connector to integrate Apache Kudu and Apache Flink. It allows reading/writing data from/to Kudu using the DataSet and DataStream Flink’s APIs. The connector has been submitted to the Apache Bahir project and is already available from maven central repository.
Keynote