AI Agents Need Observability: Why Security Logs Must Capture Intent to Action

AI agent security logs should not only record whether an API call succeeded. They need to connect user intent, agent planning, tool calls, policy decisions, human approval, and execution results.

When teams adopt AI agents, the first security instinct is usually to add permissions, approvals, and sandboxes.

Those are necessary, but they are not enough. Agents also need observability.

Traditional security logs usually answer three questions:

  • who called the interface?
  • when did it happen?
  • did it succeed or fail?

AI agents are different. They do not simply click a button or execute a fixed API call. They interpret a goal, make a plan, read context, choose tools, generate arguments, wait for approval, and then perform an action.

So the next generation of security logs should not only record “what happened.” They also need to record why the agent believed the action was appropriate.

The short version

Agent security logs should connect “user intent → agent plan → tool call → policy decision → actual action → execution result” into one traceable chain.

If the only log you see is:

DELETE /api/users/123 200 OK

you still do not know whether:

  • the user explicitly asked for deletion;
  • the agent confused “deactivate account” with “delete account”;
  • malicious page content persuaded the agent to act;
  • an automation flow called the tool without enough context;
  • the approval logic allowed the wrong action.

All of these can look identical in a traditional API audit log.

Why Traditional Logs Are Not Enough

In a normal application, the user clicks a button and the backend executes an endpoint. Logging the interface, identity, IP address, and status code is often enough to reconstruct the responsibility chain.

With an agent, there are more layers in the middle:

  flowchart LR
    A[User goal] --> B[Agent interpretation]
    B --> C[Task plan]
    C --> D[Tool selection]
    D --> E[Argument generation]
    E --> F[Permission / policy check]
    F --> G[Human approval]
    G --> H[Execution]
    H --> I[Result feedback]

The dangerous part is often not the final API call.

An agent may call a valid API with valid credentials and sufficient permission, while the real problem happened earlier:

  • the user meant “archive,” but the agent interpreted it as “delete”;
  • retrieved web content contained hidden or misleading instructions;
  • the agent selected the wrong tool;
  • tool arguments came from untrusted context;
  • the policy engine checked permission but not intent;
  • the approval UI showed the action but not the reason.

If you only inspect the API audit log, everything can look normal.

What Agent Logs Should Capture

I would split agent security logs into six layers.

| Layer | What to record | Why it matters |
| --- | --- | --- |
| Intent | Original user goal, summarized task intent, task source | Shows whether the action drifted from the user’s goal |
| Reasoning | Agent plan, key decision points, reason for choosing a tool | Reconstructs why the workflow moved in this direction |
| Context | Files, web pages, retrieval snippets, memory sources | Helps detect influence from untrusted context |
| Tool | Tool name, action type, argument summary, resource scope | Shows what the agent intended to do |
| Control | Policy decision, permission result, human approval, block reason | Proves whether safety controls worked |
| Result | Execution result, external system response, rollback information | Completes the audit trail |

This does not mean saving every prompt, web page, or raw argument.

Security logs should preserve audit evidence, not create a new copy of all sensitive context.

A Practical Log Shape

For a first version, I would start with an event like this:

{
  "event.name": "agent.tool.execution",
  "trace_id": "01HT...",
  "agent.session_id": "sess_123",
  "agent.intent_id": "intent_456",
  "agent.task_id": "task_789",
  "user.id": "u_001",
  "user.goal_summary": "Deactivate a test account without deleting account data",
  "agent.plan_step": "Call the account status update tool",
  "agent.decision": "Selected user_admin.update_status because the goal is deactivation, not deletion",
  "tool.name": "user_admin",
  "tool.action": "update_status",
  "tool.args_hash": "sha256:...",
  "target.resource": "user:123",
  "risk.level": "medium",
  "policy.decision": "allow",
  "approval.required": true,
  "approval.actor": "human",
  "context.refs": [
    "issue:29",
    "file:docs/account-lifecycle.md"
  ],
  "result.status": "success"
}

The important parts:

  • trace_id connects multiple events inside one agent task;
  • agent.intent_id separates the user goal from individual tool calls;
  • tool.args_hash avoids storing sensitive raw arguments;
  • policy.decision records that the policy layer participated and what it decided;
  • approval.* records whether human approval was required and who approved;
  • context.refs keeps references instead of copying full context into logs.
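
As a rough illustration of the tool.args_hash idea, here is a minimal Python sketch. The helper names hash_args and build_event are hypothetical, and the canonical-JSON approach is my assumption about how to get a stable digest, not a fixed standard. Note that a plain hash of short, guessable arguments can be brute-forced, so a keyed HMAC may be a better choice for sensitive values.

```python
import hashlib
import json


def hash_args(args: dict) -> str:
    """Return a stable digest of tool arguments (hypothetical helper)."""
    # Canonical JSON: sorted keys and fixed separators, so equal arguments
    # produce the same digest regardless of key order.
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_event(trace_id: str, intent_id: str, tool: str, action: str, args: dict) -> dict:
    """Build a log event that carries a digest instead of the raw arguments."""
    return {
        "event.name": "agent.tool.execution",
        "trace_id": trace_id,
        "agent.intent_id": intent_id,
        "tool.name": tool,
        "tool.action": action,
        "tool.args_hash": hash_args(args),
    }


event = build_event(
    "trace_1", "intent_456", "user_admin", "update_status",
    {"user_id": "123", "status": "inactive"},
)
print(json.dumps(event, indent=2))
```

The digest lets an investigator confirm that two calls used the same arguments, or match a call against arguments held in short-lived secure storage, without the log itself holding the values.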

When something goes wrong, you can ask:

What was the original intent?
At which step did the agent choose this tool?
Where did the arguments come from?
Why did policy allow it?
Was there human approval?
Did the execution result match the plan?

That is a very different level of evidence from “who called which API.”

Pre-Action Logs Matter Most

Many systems only log after an action finishes.

For agents, that is not enough.

At minimum, agents need two event types.

1. Pre-action events

Record what the agent is about to do:

  • current intent;
  • plan step;
  • reason for choosing the tool;
  • argument summary;
  • risk level;
  • permission decision;
  • whether human approval is required.

If the risk is visible here, the action can still be blocked before execution.

2. Post-action events

Record what actually happened:

  • which external system was called;
  • what result came back;
  • whether the call failed or retried;
  • whether a compensating action was triggered;
  • whether data state changed.

Pre-action logs explain “why this action was attempted.” Post-action logs explain “what happened in the end.”

Without the first part, many agent incidents only show the result, not the intent drift.
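
The two event types can be sketched as a small wrapper around tool execution. This is an illustration under assumptions, not a reference implementation: the event names agent.tool.pre_action and agent.tool.post_action, the result.error_type field, and the in-memory LOG list are all hypothetical stand-ins.

```python
import time
import uuid

LOG = []  # stand-in for a real log sink (assumption for this sketch)


def emit(name: str, base: dict, fields: dict) -> None:
    """Append one structured event; a real system would ship it to a log pipeline."""
    LOG.append({"event.name": name, "timestamp": time.time(), **base, **fields})


def run_tool(trace_id, intent_id, plan_step, action, risk_level, execute):
    """Log a pre-action event, run the tool, then log a post-action event."""
    base = {
        "trace_id": trace_id,
        "agent.intent_id": intent_id,
        "tool.call_id": str(uuid.uuid4()),
        "tool.action": action,
    }
    # Written before execution, so the attempt is on record even if the call fails.
    emit("agent.tool.pre_action", base,
         {"agent.plan_step": plan_step, "risk.level": risk_level})
    try:
        result = execute()
    except Exception as exc:
        emit("agent.tool.post_action", base,
             {"result.status": "failure", "result.error_type": type(exc).__name__})
        raise
    emit("agent.tool.post_action", base, {"result.status": "success"})
    return result


run_tool("trace_1", "intent_456", "Call the account status update tool",
         "update_status", "medium", lambda: "ok")
```

Both events share the same tool.call_id, so the "why" and the "what happened" for a single call can be joined later.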

Do Not Turn Logs Into a New Leak Surface

The more detailed agent logs become, the easier it is to create another security problem: storing sensitive data in logs.

I would explicitly avoid logging:

  • API keys, tokens, cookies, SSH private keys;
  • full prompts and full tool arguments;
  • private fields from user input;
  • sensitive business data from web pages;
  • raw orders, accounts, students, customers, or financial records;
  • long memory excerpts.

Better patterns:

  • keep raw text only in short-lived secure storage with expiration;
  • keep long-term logs to summaries, hashes, resource IDs, and references;
  • label high-risk fields separately and restrict queries;
  • audit access to the logging system itself;
  • exclude raw sensitive logs from training, analytics, and dashboards by default.

Observability is not “save everything.” It is “preserve enough evidence to reconstruct the important path safely.”
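
One way to enforce this is an allowlist applied before anything reaches long-term storage. The sketch below is only an assumption about how that could look: the ALLOWED_KEYS set, the 200-character limit, and the log.dropped_fields field are hypothetical choices, not part of any standard.

```python
import hashlib

# Only these fields may reach long-term logs (assumed list for this sketch).
ALLOWED_KEYS = {
    "event.name", "trace_id", "agent.intent_id", "user.goal_summary",
    "tool.name", "tool.action", "tool.args_hash", "target.resource",
    "policy.decision", "context.refs", "result.status",
}
MAX_SUMMARY_CHARS = 200  # assumed threshold for "this is no longer a summary"


def sanitize(fields: dict) -> dict:
    """Keep allowlisted fields, drop the rest, and replace long text with a digest."""
    clean = {}
    dropped = []
    for key, value in fields.items():
        if key not in ALLOWED_KEYS:
            dropped.append(key)  # record the field name only, never the value
            continue
        if isinstance(value, str) and len(value) > MAX_SUMMARY_CHARS:
            digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
            value = {"sha256": digest, "length": len(value)}
        clean[key] = value
    if dropped:
        clean["log.dropped_fields"] = sorted(dropped)
    return clean


print(sanitize({"tool.name": "user_admin", "token": "secret-value"}))
```

An allowlist fails safe: a new field that nobody reviewed is dropped by default instead of leaking by default.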

How to Implement It

The first version does not need to be complex. I would implement it in this order.

1. Connect the trace first

A user request, an agent run, and a tool call must be linkable.

At minimum:

  • trace_id
  • agent.session_id
  • agent.intent_id
  • tool.call_id

OpenTelemetry already has a log data model with TraceId, SpanId, attributes, and event names. It is a good fit for structured agent events. Do not reduce agent behavior to an unstructured log string.
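
To show the linking idea without depending on a particular SDK, here is a standard-library sketch. It does not use the OpenTelemetry API; the functions start_intent and tool_call_event and the ID prefixes are hypothetical. The only detail borrowed from OpenTelemetry is the trace ID width of 16 bytes (32 hex characters).

```python
import contextvars
import json
import secrets

# Holds the IDs of the current agent run (hypothetical context variable).
_trace_ctx = contextvars.ContextVar("agent_trace_ctx", default=None)


def start_intent(session_id: str) -> dict:
    """Create the IDs shared by every event that belongs to one user intent."""
    ctx = {
        "trace_id": secrets.token_hex(16),  # 32 hex chars, OpenTelemetry trace ID width
        "agent.session_id": session_id,
        "agent.intent_id": "intent_" + secrets.token_hex(8),
    }
    _trace_ctx.set(ctx)
    return ctx


def tool_call_event(name: str, attributes: dict) -> str:
    """Serialize a structured event that inherits the current trace context."""
    event = {
        "event.name": name,
        **(_trace_ctx.get() or {}),
        "tool.call_id": "call_" + secrets.token_hex(8),
        **attributes,
    }
    return json.dumps(event, sort_keys=True)


start_intent("sess_123")
print(tool_call_event("agent.tool.execution", {"tool.name": "user_admin"}))
```

Every event emitted after start_intent carries the same trace_id and agent.intent_id, while each tool call gets its own tool.call_id.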

2. Log at the tool gateway

Do not let every tool invent its own logging format.

A better boundary is a tool gateway:

Agent → Tool Gateway → Policy Check → Human Approval → Tool Execution

At that boundary, you can consistently record:

  • intent, plan, and argument summary before the call;
  • policy decision;
  • approval status;
  • execution result;
  • failure and retry behavior.

Putting safety control and logging at the same boundary makes incidents easier to investigate.
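
A gateway like this can be sketched in a few lines. Everything here is an assumption for illustration: the ToolGateway class, the decision values "deny" and "require_approval", the "blocked" status, and the tool.args_keys and approval.granted fields do not come from any specific framework.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToolGateway:
    """Single boundary that applies policy, approval, execution, and logging."""
    policy: Callable[[str, str], str]    # returns "allow", "deny", or "require_approval"
    approve: Callable[[str, str], bool]  # asks a human; True means approved
    tools: dict                          # tool name -> callable(action, args)
    events: list = field(default_factory=list)

    def call(self, intent_id: str, tool: str, action: str, args: dict) -> dict:
        event = {
            "agent.intent_id": intent_id,
            "tool.name": tool,
            "tool.action": action,
            "tool.args_keys": sorted(args),  # argument summary: names only, no values
        }
        decision = self.policy(tool, action)
        event["policy.decision"] = decision
        event["approval.required"] = decision == "require_approval"
        if decision == "require_approval":
            event["approval.granted"] = self.approve(tool, action)
        # Any unknown decision value is treated as a block (fail closed).
        allowed = decision == "allow" or event.get("approval.granted", False)
        if not allowed:
            event["result.status"] = "blocked"
        else:
            try:
                self.tools[tool](action, args)
                event["result.status"] = "success"
            except Exception as exc:
                event["result.status"] = "failure"
                event["result.error_type"] = type(exc).__name__
        self.events.append(event)  # one event shape for every tool
        return event


gateway = ToolGateway(
    policy=lambda tool, action: "require_approval" if action == "delete" else "allow",
    approve=lambda tool, action: False,  # simulates a reviewer who rejects
    tools={"user_admin": lambda action, args: None},
)
print(gateway.call("intent_456", "user_admin", "delete", {"user_id": "123"}))
```

Because the gateway owns both the decision and the log record, a blocked call leaves the same kind of evidence as a successful one.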

3. Promote high-risk actions

Not every action needs the same log detail.

These actions deserve more log detail and a higher log severity:

  • deleting data;
  • exporting large amounts of data;
  • changing permissions;
  • changing billing or payment state;
  • touching production systems;
  • accessing sensitive systems;
  • committing code or triggering deployments automatically.

For high-risk actions, log intent, policy, approver, pre-action argument summary, and post-action result.
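
A small sketch of how this promotion could work, using Python's standard logging levels. The action names in HIGH_RISK_ACTIONS, the "production" environment label, and the mapping from the required items to specific field names are my assumptions for illustration.

```python
import logging

# Hypothetical action names that map to the high-risk categories above.
HIGH_RISK_ACTIONS = {"delete", "bulk_export", "change_permission", "change_billing", "deploy"}

# Intent, policy, approver, argument summary, and result for high-risk events.
HIGH_RISK_REQUIRED = [
    "user.goal_summary", "policy.decision", "approval.actor",
    "tool.args_hash", "result.status",
]


def severity_for(action: str, environment: str) -> int:
    """Return a stdlib logging level: WARNING for high-risk actions, INFO otherwise."""
    if action in HIGH_RISK_ACTIONS or environment == "production":
        return logging.WARNING
    return logging.INFO


def missing_high_risk_fields(event: dict) -> list:
    """List the required fields a high-risk event does not carry."""
    return [name for name in HIGH_RISK_REQUIRED if name not in event]


print(logging.getLevelName(severity_for("delete", "staging")))
print(missing_high_risk_fields({"policy.decision": "allow"}))
```

A check like missing_high_risk_fields can run at the gateway, so an incomplete high-risk event is rejected or flagged before the action executes.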

4. Start with useful alerts

Do not start with a wall of dashboards.

Start with alerts that expose real agent risk:

  • high-risk tool call without human approval;
  • read-only plan followed by a write action;
  • unusual number of tool retries under one intent;
  • tool arguments derived from untrusted context;
  • action allowed while the policy engine is unavailable;
  • one agent accessing too many resources in a short window;
  • execution result that does not match the plan step.

These are difficult to detect with traditional API logs alone.
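
Several of these rules reduce to simple checks over structured events. The sketch below covers four of them. The fields approval.granted, agent.plan_mode, tool.effect, policy.engine_status, and tool.is_retry, the rule names, and the retry threshold are all hypothetical; a real deployment would map them to whatever the gateway actually records.

```python
from collections import Counter

MAX_RETRIES_PER_INTENT = 3  # assumed threshold


def find_alerts(events: list) -> list:
    """Return (rule, intent_id) pairs for events that match a risk rule."""
    alerts = []
    retries = Counter()
    for e in events:
        intent = e.get("agent.intent_id")
        if e.get("risk.level") == "high" and not e.get("approval.granted"):
            alerts.append(("high_risk_without_approval", intent))
        if e.get("agent.plan_mode") == "read_only" and e.get("tool.effect") == "write":
            alerts.append(("write_after_read_only_plan", intent))
        if e.get("policy.decision") == "allow" and e.get("policy.engine_status") == "unavailable":
            alerts.append(("allowed_while_policy_unavailable", intent))
        if e.get("tool.is_retry"):
            retries[intent] += 1
    # Retry counts are only known after the whole batch has been scanned.
    for intent, count in retries.items():
        if count > MAX_RETRIES_PER_INTENT:
            alerts.append(("excessive_retries", intent))
    return alerts


sample = [
    {"agent.intent_id": "i1", "risk.level": "high", "approval.granted": False},
    {"agent.intent_id": "i2", "agent.plan_mode": "read_only", "tool.effect": "write"},
]
print(find_alerts(sample))
```

None of these rules can be evaluated from an API audit line alone, because the plan mode, approval state, and policy engine status never appear there.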

A Simple Test

When designing agent security logs, ask:

If an agent accidentally deletes data tomorrow, can I answer within 30 minutes: what did the user want, how did the agent interpret it, why was this tool selected, who approved it, and what was the impact?

If not, the logs are not good enough yet.
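
The same test can be run mechanically against the events of one trace. This is a rough sketch: the mapping from each question to a single field from the example event is my simplification, and result.status is only a weak proxy for "impact."

```python
# Each audit question mapped to the field that would answer it (assumed mapping).
QUESTION_FIELDS = {
    "what did the user want": "user.goal_summary",
    "how did the agent interpret it": "agent.plan_step",
    "why was this tool selected": "agent.decision",
    "who approved it": "approval.actor",
    "what was the impact": "result.status",
}


def unanswered_questions(trace_events: list) -> list:
    """Return the audit questions that no event in the trace can answer."""
    present = set()
    for event in trace_events:
        # A field counts only if it holds a non-empty value.
        present.update(k for k, v in event.items() if v not in (None, "", []))
    return [q for q, name in QUESTION_FIELDS.items() if name not in present]


api_only_trace = [{"result.status": "success"}]
print(unanswered_questions(api_only_trace))
```

A trace that only contains the final API result leaves four of the five questions open, which is the gap this article is about.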

Conclusion

Agent risk is not only a question of “does it have permission?”

It is also the case where the agent had valid permission and still did the wrong thing.

That means security logs need to evolve.

The next generation of agent security logs should capture:

  1. user intent;
  2. agent plan;
  3. context sources;
  4. tool selection;
  5. policy decision;
  6. human approval;
  7. actual action;
  8. execution result.

This is not about assigning blame after the fact. It is about making agent systems explainable, auditable, blockable, and recoverable.

A production-ready AI agent should not only complete work. It should also leave enough evidence to explain why every important step happened.
