Talk | Technical | English

Building Notebook-based AI Pipelines with Elyra and Kubeflow

Nick Pentreath

Principal Engineer, IBM

18 November. 18.15 - 18.55 | Attic

A typical machine learning pipeline begins as a series of preprocessing steps, followed by experimentation, optimization and model-tuning, and, finally, deployment. Jupyter notebooks have become a hugely popular tool for data scientists and other machine learning practitioners to explore and experiment as part of this workflow, due to the flexibility and interactivity they provide. However, with notebooks it is often a challenge to move from the experimentation phase to a robust, modular and production-grade end-to-end AI pipeline. Elyra is a set of open-source, AI-centric extensions to JupyterLab. It provides a visual editor for building notebook-based pipelines that simplifies the conversion of multiple notebooks into batch jobs or workflows. These workflows can be executed both locally (during the experimentation phase) and on Kubernetes via Kubeflow Pipelines for production deployment. In this way, Elyra combines the flexibility and ease of use of notebooks and JupyterLab with the production-grade qualities of Kubeflow (and, potentially, other Kubernetes-based orchestration platforms in the future). In this talk I introduce Elyra and its capabilities, then take a deep dive into Elyra’s pipeline editor and the underlying pipeline execution mechanics, with a demo of using Elyra to construct an end-to-end analytics and machine learning pipeline. I will also explore how to integrate and scale out model-tuning, as well as deployment via Kubeflow Serving.