Dan Hendrycks - AI2050

Dan Hendrycks 2023 Early Career Fellow

Affiliation: Executive Director, Center for AI Safety

Hard Problem: Solved challenges of safety and control, human alignment and compatibility with increasingly powerful and capable AI and eventually AGI.

Dan Hendrycks is the executive director of the Center for AI Safety. He received his PhD from UC Berkeley, where he was advised by Jacob Steinhardt and Dawn Song. His research is supported by the NSF GRFP and the Open Philanthropy AI Fellowship. Dan contributed the GELU activation function, the default activation in nearly all state-of-the-art ML models including BERT, Vision Transformers, and GPT-3. Dan also contributed the main baseline for out-of-distribution (OOD) detection and benchmarks for robustness (ImageNet-C) and large language models (MMLU, MATH). For more information, visit his website at https://danhendrycks.com.

AI2050 Project

Modern AI systems, while powerful, are complex and hard to understand. This can pose a security risk, as it’s challenging to oversee these AIs and ensure their safe operation. Right now, we mostly understand these systems by observing their actions, rather than their inner workings. Dan Hendrycks’ AI2050 project aims to delve deeper, breaking down the AI into smaller components to understand its behavior. This understanding will not only make AI more transparent but also more controllable. For instance, we could tweak a deceptive AI to be more truthful. In essence, the project is working to make AIs safer and more predictable for society.