
Would you trust your model with your life? Research vs. reality in AI

Technical talk | English

Theatre 16: Track 4

Thursday - 11.40 to 12.20 - Technical


There are many considerations before deploying deep learning models into the real world, especially in safety-critical environments like automated driving, smart medical devices, aerospace, and biomedical applications. Consider the consequences when an AI system fails in such environments.

A deep learning researcher can achieve 99% accuracy on a deep learning model, but what about the edge cases? What if those edge cases represent someone’s life? Is AI ready to move from research to reality? Model accuracy is only one part of a production-ready system, which also requires model justification and documentation, rigorous testing, use of specialized hardware (GPUs, FPGAs, cloud resources, etc.), and collaboration among people with different areas of expertise across the project and system. In this session, we will discuss the importance of explainable models, system design, and testing before an AI system is production-ready.
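As a rough illustration of why a headline accuracy figure can hide the edge cases, the following Python sketch scores a deliberately degenerate model on made-up data. The class counts, the 1% edge-case rate, and the helper names `accuracy` and `recall` are assumptions chosen for this example; they are not results or code from the session.

```python
def accuracy(y_true, y_pred):
    """Fraction of all samples that were predicted correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of the truly positive samples that the model flagged."""
    predictions_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not predictions_for_positives:
        return float("nan")
    hits = sum(p == positive for p in predictions_for_positives)
    return hits / len(predictions_for_positives)


# Hypothetical data set: 9,900 routine cases (label 0)
# and 100 safety-critical edge cases (label 1).
y_true = [0] * 9900 + [1] * 100

# Degenerate model (an assumption for illustration) that never flags an edge case.
y_pred = [0] * 10000

overall_accuracy = accuracy(y_true, y_pred)   # 9900 / 10000
edge_case_recall = recall(y_true, y_pred)     # 0 / 100

print(f"accuracy: {overall_accuracy:.2%}, edge-case recall: {edge_case_recall:.2%}")
```

Under these assumptions the model reaches 99% accuracy while missing every edge case, which is why a per-class metric such as recall is worth reporting alongside accuracy.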

As AI model development continues to gain popularity across industries and areas that affect our everyday lives, the “explainability” of AI models has come into question. Failures can carry enormous costs, including legal and ethical implications; facial recognition applications and credit scoring algorithms, for example, have been accused of bias. In many applications the ramifications of failure are quite low (a movie recommender fails to provide an accurate prediction, for example), but there is an urgent need for safety-critical systems not only to get the answer correct but also to explain how they reached a final decision.

Engineers and scientists must be able to thoroughly understand and investigate a model before they are comfortable putting it into production. Many industries also impose regulatory and documentation requirements that must be met before a model is allowed into production. The greater the ramifications of failure, the greater the need to fully explain the model’s behavior; a “black-box” model will not suffice.

This session will discuss several techniques for improving model explainability in these cases. They range from simple, traceable models (such as decision trees and statistical models) to more advanced deep learning models with proven visualizations, and models tested and vetted by deep learning experts. We will emphasize the ability to communicate data properties, model decisions, and results, documenting not only the inner workings of the model but also how the model was trained, which validation techniques were used, how it compares with other models, how the data were collected, and the importance of training, validation, and test sets.
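To make two of these ideas concrete, here is a minimal Python sketch of a reproducible training/validation/test split and a tiny hand-written decision tree that reports the path behind each prediction. Everything in it is hypothetical: the feature names `speed_kmh` and `distance_m`, the thresholds, the 70/15/15 split, and the function names are assumptions for illustration, and the sketch is not the MATLAB workflow the session demonstrates.

```python
import random


def split_dataset(samples, train_pct=70, val_pct=15, seed=0):
    """Shuffle once with a fixed seed, then cut into train/validation/test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains is held out for testing
    return train, val, test


def predict_with_trace(sample):
    """Hand-written two-level decision tree; returns the label and the rules it followed."""
    trace = []
    if sample["speed_kmh"] > 50:
        trace.append("speed_kmh > 50")
        if sample["distance_m"] < 20:
            trace.append("distance_m < 20")
            return "brake", trace
        trace.append("distance_m >= 20")
        return "continue", trace
    trace.append("speed_kmh <= 50")
    return "continue", trace


# Hypothetical samples: 10 speeds x 4 distances = 40 records.
samples = [
    {"speed_kmh": speed, "distance_m": distance}
    for speed in range(10, 110, 10)
    for distance in (5, 15, 25, 35)
]

train, val, test = split_dataset(samples)
label, trace = predict_with_trace({"speed_kmh": 80, "distance_m": 10})
print(len(train), len(val), len(test))
print(label, "because", " and ".join(trace))
```

The trace is what makes this kind of model easy to document: each prediction comes with the exact rules that produced it, and the fixed seed means the same records land in the same sets every time the split is rerun.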

An explainable model is only one part of a complete AI system: a production-ready model must be incorporated into a much larger system. AI research teams need to collaborate with a fully cross-functional team to incorporate the model into the various components of that system. These systems often involve perception, controls, and other system components implemented in many different languages and software architectures. The models may also be required to run on specific hardware such as GPUs, FPGAs, or cloud resources.

A complete AI-driven system must not only be validated for research purposes using traditional techniques but must also be thoroughly tested in the full operational environment, and in safety-critical applications the edge cases must be fully understood. In this session, we will lay the groundwork for understanding the steps needed to deliver a complete AI system, from model testing and explainability to deployment and system design, and discuss testing and collaborating across various skill levels. We will use MATLAB to illustrate designing and deploying machine learning and deep learning models to various environments by automatically converting the code and packaging applications to be called from other languages.