Fellows Forum 2025

LLMs at the Crossroads: Advances, Security Challenges, and Safety

Overview

This panel examined why 2025 feels like the “year of agents.” Dawn Song shared recent research showing that simple agentic systems already solve real bug-bounty tasks and are finding zero-days in major open-source projects, which is evidence of rapid capability gains but also of growing risk. Andrew Mauboussin highlighted agents’ new ability to gather their own context across tools (Slack, Drive, AWS), and the need for human red teaming to catch prompt-injection attacks, which are most dangerous when a system combines the “lethal trifecta” of public input, private data, and outbound actions. Artemis Seaford focused on enterprise adoption, naming trust, verifiable reasoning, and layered guardrails (at both the model and the system level) as keys, alongside standards and provenance. The group agreed that defenses must be “defense-in-depth,” often combining third-party guardrails with in-house policy-as-code, continuous monitoring, and red teaming. Big picture: agents are advancing fast and will boost both attackers and defenders in cybersecurity, so organizations should build with security from day one while pushing toward transparent, controllable, human-aligned systems.
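
As a rough illustration of the "policy-as-code" idea mentioned above, the lethal-trifecta condition can be written as a small check that runs before an agent is deployed. This is a minimal sketch, not something presented by the panelists: the `AgentConfig` fields, the `has_lethal_trifecta` function, and the `review` function are all hypothetical names chosen for this example, and a real system would add monitoring and red teaming on top of a static check like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical description of what one agent deployment can do."""
    name: str
    reads_public_input: bool     # e.g. web pages, inbound email (assumed examples)
    accesses_private_data: bool  # e.g. Slack, Drive, AWS resources
    takes_outbound_actions: bool # e.g. sends messages, makes external requests


def has_lethal_trifecta(cfg: AgentConfig) -> bool:
    """True only when all three risk factors are present at once."""
    return (
        cfg.reads_public_input
        and cfg.accesses_private_data
        and cfg.takes_outbound_actions
    )


def review(cfg: AgentConfig) -> str:
    """Return a policy decision string for a proposed agent configuration."""
    if has_lethal_trifecta(cfg):
        return f"block: {cfg.name} combines public input, private data, and outbound actions"
    return f"allow: {cfg.name}"


if __name__ == "__main__":
    # A support bot that reads public email, sees private docs, and can send replies.
    risky = AgentConfig("support-bot", True, True, True)
    # The same bot with outbound actions removed.
    safer = AgentConfig("support-bot-readonly", True, True, False)
    print(review(risky))
    print(review(safer))
```

A check like this only flags the risky combination; under the defense-in-depth approach the panel described, it would be one layer among several rather than a complete safeguard.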

Speakers

Dawn Song

UC Berkeley / Virtue AI

Andrew Mauboussin

Surge AI

Artemis Seaford

ElevenLabs

Moderator

Bo Li

Virtue AI