Antonio Orvieto
2024 Early Career Fellow
Affiliation: ELLIS group leader, Max Planck Institute for Intelligent Systems
Hard Problem: Solved the science and technological limitations and hard problems in current AI that are critical to enabling further breakthrough progress in AI, leading to more powerful and useful AI capable of realizing the beneficial and exciting possibilities, including artificial general intelligence (AGI).

Antonio Orvieto is an independent group leader at the Max Planck Institute for Intelligent Systems and a Hector-endowed principal investigator at the ELLIS Institute Tübingen, Germany. He holds a Ph.D. from ETH Zürich and has spent time at Google DeepMind, Meta, MILA, INRIA Paris, and HILTI. His main areas of expertise are optimization for deep learning and the design of neural networks for reasoning over complex sequential data. He has published in NeurIPS, ICML, ICLR, AISTATS, and CVPR; he organized the “Optimization for Data Science and Machine Learning” session at the International Conference on Continuous Optimization (ICCOPT) in 2022 and the Workshop on Next Generation of Sequence Modeling Architectures at ICML 2024.

In his research, Antonio strives to improve deep learning technologies by pioneering new architectures and training techniques grounded in theoretical knowledge. His work encompasses two main areas: designing models capable of handling complex data, and understanding the intricacies of large-scale training dynamics. Central to his studies is the exploration of new techniques for decoding patterns in sequential data, with implications for natural language processing, biology, neuroscience, and music generation. His Linear Recurrent Unit (LRU) architecture is the basis for some of Google’s Gemma language model variants.

AI2050 Project

Large foundation models are already pervasive in many aspects of our lives. Antonio’s project aims to bring the advantages of AI to the scientific domain by developing improved solutions for knowledge discovery in the genome. By developing and refining neural network technology targeted at the biological field, he hopes to provide scientists with new tools for data analysis and thus accelerate data-driven knowledge discovery.