The Rise of Experiential AI: Learning Beyond Human Data
This blog post was automatically generated (and translated). It is based on the following original, which I selected for publication on this blog:
Welcome to the Era of Experience.
The Shift from Human Data to Experience-Driven AI
Artificial intelligence has recently made significant progress by training on massive datasets of human-generated content. Large Language Models (LLMs) exemplify this trend, demonstrating a wide range of abilities from creative writing to scientific problem-solving. However, relying solely on imitating humans has limitations, especially in fields where the aim is to exceed human performance, such as mathematics, coding, and scientific discovery.
While human data provides a strong foundation, progress driven by supervised learning from this source appears to be slowing. Breakthroughs often lie beyond current human understanding, so existing datasets cannot capture them. A new paradigm is therefore emerging: AI that learns primarily from its own experience.
The Era of Experience: AI in the Real World
The next generation of AI will learn continually from interactions with its environment, generating data that improves as the agent becomes more proficient. This approach allows AI to explore possibilities beyond the constraints of human knowledge, leading to novel solutions and advancements. Examples include systems that reach high performance on mathematical olympiad problems by generating their own proofs, and systems that develop problem-solving strategies autonomously through reinforcement learning.
This era will likely be characterized by:
- Continuous streams of experience: Agents will operate within ongoing streams of interactions rather than isolated instances.
- Environmentally grounded actions and observations: Interactions will be based on direct engagement with the environment, rather than solely on human dialogue.
- Experience-based rewards: Rewards will derive from the agent's experience of the environment, rather than from human prejudgment.
- Planning and reasoning about experience: Agents will plan and reason in terms of their own experience, rather than only in human terms.
Key Elements of Experiential AI
Streams of Experience
Future AI agents will exist within long-term streams of actions and observations, enabling them to adapt over time and pursue long-term goals. For instance, a health agent could monitor a user's health data over months, providing personalized recommendations based on trends and goals. A scientific agent could analyze real-world observations, run simulations, and propose experiments to achieve ambitious goals like discovering new materials or reducing carbon emissions.
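As a rough illustration of what a stream of experience means in code, the sketch below runs an agent-environment loop that never resets between episodes. It is a toy example and not taken from the original: `DriftingEnvironment`, `StreamAgent`, `run_stream`, and all numeric values are hypothetical names and settings chosen for this sketch.

```python
import random


class DriftingEnvironment:
    """Hypothetical toy environment: one slowly drifting signal (e.g. a daily health metric)."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.level = 50.0

    def step(self, action):
        # The action nudges the signal; bounded noise makes it drift over time.
        self.level += action + self.rng.uniform(-1.0, 1.0)
        return self.level


class StreamAgent:
    """Carries its running estimate across the whole stream; there is no episode reset."""

    def __init__(self, target, step_size=0.1):
        self.target = target
        self.step_size = step_size
        self.estimate = None
        self.steps_seen = 0

    def act(self, observation):
        # Maintain an exponential moving average of everything observed so far.
        if self.estimate is None:
            self.estimate = observation
        else:
            self.estimate += self.step_size * (observation - self.estimate)
        self.steps_seen += 1
        # Push gently toward the long-term target, based on the trend estimate.
        return 0.05 * (self.target - self.estimate)


def run_stream(num_steps=1000, target=70.0):
    """Run one uninterrupted stream of actions and observations."""
    env = DriftingEnvironment()
    agent = StreamAgent(target)
    observation = env.level
    for _ in range(num_steps):
        action = agent.act(observation)
        observation = env.step(action)
    return agent, observation
```

Calling `run_stream()` returns the agent and the last observation. The point of the design is that the agent's state (`estimate`, `steps_seen`) persists across every step, so what it learned early in the stream still shapes its actions much later.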
Actions and Observations
AI in the era of experience will act autonomously in the real world rather than relying solely on channels designed for human communication. In the physical world this means motor control and sensors; in the digital world it means calling APIs, executing code, and operating computer interfaces. Scientific agents could monitor environmental sensors or control robotic arms in labs to conduct experiments.
Rewards
Instead of relying on human judgment, experiential agents will learn from external events and signals in their environment. Grounded rewards, based on real-world measurements such as heart rate, exam results, or carbon dioxide levels, will drive learning. This allows agents to discover strategies that human raters might not recognize as valuable. Users can also provide feedback, which adjusts the reward function over time and corrects misalignments.
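One simple way to picture a grounded reward that users can adjust is a weighted sum of measured signals, where feedback shifts the weights. The sketch below assumes that form purely for illustration; the original does not specify one. The class `GroundedReward`, the signal names, and the update rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class GroundedReward:
    """Weighted sum of measured signals; the weights are hypothetical and user-adjustable."""

    weights: dict = field(default_factory=dict)

    def reward(self, measurements):
        # Only signals that have a weight contribute to the scalar reward.
        return sum(self.weights.get(name, 0.0) * value
                   for name, value in measurements.items())

    def apply_feedback(self, signal, satisfaction, step_size=0.1):
        """Shift one weight in the direction of user feedback in [-1, 1]."""
        current = self.weights.get(signal, 0.0)
        self.weights[signal] = current + step_size * satisfaction
```

For example, a health agent could start from `GroundedReward({"resting_heart_rate_drop": 1.0, "sleep_hours": 0.5})`, compute rewards from each day's measurements, and call `apply_feedback("sleep_hours", 1.0)` when the user reports that better sleep matters more to them. The measurements stay grounded in the environment, while the user steers how they are combined.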
Planning and Reasoning
While LLMs have shown promise in reasoning, human language may not be the optimal form of computation. Experiential AI can discover more efficient mechanisms of thought through self-learning. Grounding thinking in the external world, by building a world model, allows agents to predict the consequences of their actions and refine their understanding based on real-world data.
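A world model can be as simple as a table of observed transitions. The sketch below is a minimal tabular version, assuming discrete states and actions; it is an illustration of the idea rather than the method the original has in mind, and `TabularWorldModel` and its method names are hypothetical.

```python
from collections import Counter, defaultdict


class TabularWorldModel:
    """Counts observed transitions and predicts the most frequent outcome."""

    def __init__(self):
        self.transitions = defaultdict(Counter)
        self.reward_sum = defaultdict(float)
        self.visits = Counter()

    def update(self, state, action, reward, next_state):
        # Refine the model from one real interaction with the environment.
        key = (state, action)
        self.transitions[key][next_state] += 1
        self.reward_sum[key] += reward
        self.visits[key] += 1

    def predict(self, state, action):
        """Return (most likely next state, mean reward), or None if never tried."""
        key = (state, action)
        if self.visits[key] == 0:
            return None
        next_state = self.transitions[key].most_common(1)[0][0]
        return next_state, self.reward_sum[key] / self.visits[key]

    def best_action(self, state, actions):
        """One-step lookahead: pick the tried action with the highest predicted reward."""
        scored = []
        for action in actions:
            prediction = self.predict(state, action)
            if prediction is not None:
                scored.append((prediction[1], action))
        return max(scored)[1] if scored else None
```

The agent calls `update` after every real interaction and `best_action` before acting, so its predictions about the consequences of its actions are corrected by data from the environment rather than by human descriptions of it.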
Why Now?
Earlier AI systems based on reinforcement learning mastered simulated environments with clear reward signals. The era of human data shifted the focus to general-purpose agents trained on massive datasets, at the cost of the ability to discover knowledge on their own. The era of experience reconciles these approaches by enabling agents to act and observe autonomously in real-world environments, with rewards connected to grounded signals.
Reinforcement Learning Revisited
The shift toward human-centric LLMs has, to some extent, overshadowed core RL concepts, such as value functions and exploration techniques. However, the era of experience presents an opportunity to revisit and improve these classic concepts.
This includes developing new ways to ground reward functions in observational data, revisiting methods for estimating value functions from long, incomplete streams, and creating principled methods for real-world exploration. By building on the foundations of RL and adapting its principles to the challenges of this new era, the full potential of autonomous learning can be unlocked.
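To make the value-function point concrete, the sketch below uses the classic temporal-difference rule TD(0), V(s) ← V(s) + α [r + γ V(s') − V(s)], which updates after every transition and so never needs the stream to end. The original names value functions but no specific algorithm, so TD(0) is chosen here only as a familiar example; `td0_update` and `evaluate_stream` are hypothetical helper names.

```python
from collections import defaultdict


def td0_update(values, state, reward, next_state, step_size=0.1, discount=0.9):
    """Apply one TD(0) update in place and return the TD error."""
    # Bootstrapped target: observed reward plus the discounted estimate of the next state.
    td_error = reward + discount * values[next_state] - values[state]
    values[state] += step_size * td_error
    return td_error


def evaluate_stream(transitions, step_size=0.1, discount=0.9):
    """Estimate state values from an iterable of (state, reward, next_state) tuples."""
    values = defaultdict(float)
    for state, reward, next_state in transitions:
        td0_update(values, state, reward, next_state, step_size, discount)
    return values
```

Because each update uses only the current transition, `evaluate_stream` can consume a generator that keeps producing experience indefinitely; the estimates are usable at any point along the way.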
Consequences and Considerations
The rise of experiential AI offers immense potential, but also presents challenges that require careful consideration. While personalized assistants and accelerated scientific discovery are promising outcomes, the automation of human capabilities could lead to job displacement. The long-term autonomy of these agents may raise safety concerns and make their actions harder to interpret.
However, experiential learning also provides safety benefits. Agents can adapt to changes in their environment and correct misaligned reward functions over time. The time it takes to execute actions in the real world may also provide a natural brake on AI self-improvement.
Conclusion
The era of experience marks a transformative phase in AI's evolution, moving beyond the limitations of human data. Agents will learn from real-world interactions, use powerful non-human reasoning, and construct plans grounded in the consequences of their actions. This paradigm shift, combined with advances in reinforcement learning, could unlock unprecedented capabilities in many domains. Is this evolution something to fear or to be excited about? The answer likely depends on the ethical choices AI researchers make going forward.