Meta's Rogue AI Gave Bad Advice and Triggered a Security Incident

Meta had another run-in with a rogue AI agent last week, and this time it led to a serious security incident. For nearly two hours, employees had unauthorized access to company and user data because an AI agent gave bad technical advice that an employee acted on.

The incident, first reported by The Information, started when a Meta engineer used an internal AI agent—described by Meta spokesperson Tracy Clayton as “similar in nature to OpenClaw within a secure development environment”—to analyze a technical question posted on an internal company forum. The agent analyzed the question and then publicly replied to it without getting approval first. That reply was only meant for the employee who requested it, not for the whole forum.

An employee saw the AI’s advice and acted on it. The problem? The advice was wrong. The agent’s reply “provided inaccurate information” that triggered a “SEV1”-level security incident, the second-highest severity rating Meta uses. The mistake temporarily let employees access sensitive data they weren’t authorized to see. Meta says the issue has been resolved and that “no user data was mishandled” during the incident.

Clayton was quick to point out that the AI agent didn’t take any technical action itself beyond posting inaccurate advice—something a human could have also done. But a human might have tested the advice first or made a more complete judgment call before sharing it. It’s also not clear whether the employee who originally prompted the answer planned to post it publicly.

“The employee interacting with the system was fully aware that they were communicating with an automated bot,” Clayton told The Verge. “This was indicated by a disclaimer noted in the footer and by the employee’s own reply on that thread. The agent took no action aside from providing a response to a question. Had the engineer that acted on that known better, or did other checks, this would have been avoided.”

This isn’t the first time Meta has dealt with a rogue AI agent. Last month, an OpenClaw agent went rogue more directly: asked by an employee to sort through the emails in her inbox, it deleted messages without permission. The whole idea behind agents like OpenClaw is that they can take action on their own, but like any other AI model, they don’t always interpret prompts correctly or give accurate responses. Meta employees have now learned this the hard way twice.

What strikes me about this incident is how easily it could happen at any company using AI agents for internal tasks. The AI didn’t do anything malicious—it just gave bad advice. The real failure was the lack of safeguards around how that advice was shared and acted upon. A disclaimer in the footer isn’t enough when the AI’s output can trigger a SEV1 incident. Meta needs better guardrails, and fast.
