Dan Hendrycks
Dan Hendrycks is the executive director of the Center for AI Safety and an advisor to xAI and Scale AI. He received his PhD in AI from UC Berkeley. His contributions include the GELU activation function (the most widely used activation in state-of-the-art models, including BERT, GPT, and Vision Transformers), benchmarks and methods in robustness, the MMLU benchmark, and the textbook Introduction to AI Safety, Ethics, and Society.
AI2050 Project
Modern AI systems, while powerful, are complex and hard to understand. This poses a safety risk, as it is challenging to oversee these AIs and ensure they operate safely. Today, we mostly understand these systems by observing their actions rather than their inner workings. Dan Hendrycks' AI2050 project aims to delve deeper, breaking AI systems down into smaller components to understand their behavior. This understanding would make AI not only more transparent but also more controllable: for instance, a deceptive AI could be adjusted to be more truthful. In essence, the project works to make AIs safer and more predictable for society.
Executive Director, Center for AI Safety
Hard Problem: Alignment