The challenge of AI errors and the limitations of training accurate and verifiably stable data-driven AI
Since the seminal work by Szegedy et al. revealed the sensitivity of deep learning classifiers to small adversarial perturbations of their input data, the robustness of modern data-driven AI systems has been widely debated. Among these adversarial perturbations there even exist universal perturbations, which destabilise a network's output for seemingly any input. The presence of such instabilities in tools so widely deployed in applications raises a fundamental question: are these instabilities typical, and to be expected in modern large-scale AI and deep learning models? Moreover, is it even possible to compute a data-driven AI model which is both accurate and verifiably stable?
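The core phenomenon can be made concrete with a minimal sketch. The following toy example (all weights and inputs are hypothetical, chosen purely for illustration; real attacks such as those of Szegedy et al. target deep networks) shows a gradient-sign-style perturbation flipping the prediction of a linear classifier while changing each input coordinate by only a small epsilon:

```python
# Toy linear "classifier": predicts class 1 if w . x > 0, else class 0.
# Weights and input are hypothetical, picked so the clean score is
# only slightly positive.
w = [0.5, -0.3, 0.8, 0.1]
x = [0.2, -0.1, 0.1, 0.5]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def predict(v):
    return int(dot(w, v) > 0)

def sign(v):
    return 1.0 if v > 0 else (-1.0 if v < 0 else 0.0)

# For a linear model the gradient of the score w.r.t. the input is w itself,
# so stepping each coordinate by epsilon against the sign of the gradient is
# the most damaging perturbation of bounded max-norm (the idea behind the
# fast gradient sign method).
epsilon = 0.2
x_adv = [xi - epsilon * sign(wi) for xi, wi in zip(x, w)]

print(predict(x))      # clean input: class 1
print(predict(x_adv))  # perturbed input: class 0
print(max(abs(a - b) for a, b in zip(x_adv, x)))  # perturbation size: 0.2
```

Here the clean score is 0.26, and a per-coordinate change of at most 0.2 drives it to -0.08, flipping the predicted class; deep networks exhibit the same effect at perturbation magnitudes far below human perceptibility.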
In this talk, we will present and discuss a list of scenarios that enable the formulation of high-level, verifiable criteria for detecting instabilities in a broad class of trained models. However, as we will also show, major limitations remain on the path to computing accurate and verifiably stable AI from data. These limitations are fundamental: they concern the very possibility of building data-driven systems which are indeed accurate and verifiably stable. We will discuss potential approaches to alleviating the problem by accepting the inevitability of errors and finding computationally efficient ways to correct them “on-the-job”, with given performance guarantees and without re-training.