When teams adopt AI agents, the first security instinct is usually to add permissions, approvals, and sandboxes.
Those are necessary, but they are not enough. Agents also need observability.
Traditional security logs usually answer three questions:
- who called the interface?
- when did it happen?
- did it succeed or fail?
AI agents are different. They do not simply click a button or execute a fixed API call. They interpret a goal, make a plan, read context, choose tools, generate arguments, wait for approval, and then perform an action.
So the next generation of security logs should not only record “what happened.” They also need to record why the agent believed the action was appropriate.
The short version
Agent security logs should connect “user intent → agent plan → tool call → policy decision → actual action → execution result” into one traceable chain.
If the only log you see is:

    DELETE /api/users/123 200 OK

you still do not know whether:
- the user explicitly asked for deletion;
- the agent confused “deactivate account” with “delete account”;
- malicious page content persuaded the agent to act;
- an automation flow called the tool without enough context;
- the approval logic allowed the wrong action.
All of these can look identical in a traditional API audit log.
Why Traditional Logs Are Not Enough
In a normal application, the user clicks a button and the backend executes an endpoint. Logging the interface, identity, IP address, and status code is often enough to reconstruct the responsibility chain.
With an agent, there are more layers in the middle:

    flowchart LR
        A[User goal] --> B[Agent interpretation]
        B --> C[Task plan]
        C --> D[Tool selection]
        D --> E[Argument generation]
        E --> F[Permission / policy check]
        F --> G[Human approval]
        G --> H[Execution]
        H --> I[Result feedback]

The dangerous part is often not the final API call.
An agent may call a valid API with valid credentials and sufficient permission, while the real problem happened earlier:
- the user meant “archive,” but the agent interpreted it as “delete”;
- retrieved web content contained hidden or misleading instructions;
- the agent selected the wrong tool;
- tool arguments came from untrusted context;
- the policy engine checked permission but not intent;
- the approval UI showed the action but not the reason.
If you only inspect the API audit log, everything can look normal.
What Agent Logs Should Capture
I would split agent security logs into six layers.
| Layer | What to record | Why it matters |
|---|---|---|
| Intent | Original user goal, summarized task intent, task source | Shows whether the action drifted from the user’s goal |
| Reasoning | Agent plan, key decision points, reason for choosing a tool | Reconstructs why the workflow moved in this direction |
| Context | Files, web pages, retrieval snippets, memory sources | Helps detect influence from untrusted context |
| Tool | Tool name, action type, argument summary, resource scope | Shows what the agent intended to do |
| Control | Policy decision, permission result, human approval, block reason | Proves whether safety controls worked |
| Result | Execution result, external system response, rollback information | Completes the audit trail |
This does not mean saving every prompt, web page, or raw argument.
Security logs should preserve audit evidence, not create a new copy of all sensitive context.
A Practical Log Shape
For a first version, I would start with an event like this:

    {
      "event.name": "agent.tool.execution",
      "trace_id": "01HT...",
      "agent.session_id": "sess_123",
      "agent.intent_id": "intent_456",
      "agent.task_id": "task_789",
      "user.id": "u_001",
      "user.goal_summary": "Deactivate a test account without deleting account data",
      "agent.plan_step": "Call the account status update tool",
      "agent.decision": "Selected update_user_status because the goal is deactivation",
      "tool.name": "user_admin",
      "tool.action": "update_status",
      "tool.args_hash": "sha256:...",
      "target.resource": "user:123",
      "risk.level": "medium",
      "policy.decision": "allow",
      "approval.required": true,
      "approval.actor": "human",
      "context.refs": [
        "issue:29",
        "file:docs/account-lifecycle.md"
      ],
      "result.status": "success"
    }

The important parts:
- trace_id connects multiple events inside one agent task;
- agent.intent_id separates the user goal from individual tool calls;
- tool.args_hash avoids storing sensitive raw arguments;
- policy.decision records whether the policy layer participated;
- approval.* records whether a human approved the action;
- context.refs keeps references instead of copying full context into logs.
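One way to produce a value for tool.args_hash is to hash a canonical JSON encoding of the arguments, so that the same arguments always yield the same digest regardless of key order. This is a minimal sketch, not a prescribed format: the function name `args_hash` and the choice of canonical encoding are assumptions made for illustration.

```python
import hashlib
import json


def args_hash(args: dict) -> str:
    """Return a stable digest of tool arguments (hypothetical helper).

    Keys are sorted and whitespace is removed so that logically equal
    argument sets produce the same hash.
    """
    canonical = json.dumps(
        args, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "sha256:" + digest


# Key order does not change the digest.
h1 = args_hash({"user_id": "123", "status": "inactive"})
h2 = args_hash({"status": "inactive", "user_id": "123"})
```

A plain hash of short, guessable arguments can be brute-forced by anyone who can read the log, so a keyed HMAC may be a better fit where the arguments themselves are sensitive.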
When something goes wrong, you can ask:
- What was the original intent?
- At which step did the agent choose this tool?
- Where did the arguments come from?
- Why did policy allow it?
- Was there human approval?
- Did the execution result match the plan?

That is a very different level of evidence from “who called which API.”
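Answering those questions depends on being able to pull every event for one trace_id into a single ordered timeline. The sketch below assumes events are plain dictionaries with a numeric `ts` field; the `ts` field and the event names other than agent.tool.execution are hypothetical.

```python
from collections import defaultdict


def group_by_trace(events: list[dict]) -> dict[str, list[dict]]:
    """Group events by trace_id and order each group by timestamp."""
    traces: dict[str, list[dict]] = defaultdict(list)
    for event in events:
        traces[event["trace_id"]].append(event)
    for group in traces.values():
        group.sort(key=lambda e: e["ts"])  # "ts" is an assumed field
    return dict(traces)


# Events arrive out of order and interleaved across traces.
events = [
    {"trace_id": "t1", "ts": 3, "event.name": "agent.tool.execution"},
    {"trace_id": "t1", "ts": 1, "event.name": "agent.intent.received"},
    {"trace_id": "t2", "ts": 1, "event.name": "agent.intent.received"},
    {"trace_id": "t1", "ts": 2, "event.name": "agent.policy.decision"},
]
timeline = [e["event.name"] for e in group_by_trace(events)["t1"]]
```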
Pre-Action Logs Matter Most
Many systems only log after an action finishes.
For agents, that is not enough.
At minimum, agents need two event types.
1. Pre-action events
Record what the agent is about to do:
- current intent;
- plan step;
- reason for choosing the tool;
- argument summary;
- risk level;
- permission decision;
- whether human approval is required.
If the risk is visible here, the action can still be blocked before execution.
2. Post-action events
Record what actually happened:
- which external system was called;
- what result came back;
- whether the call failed or retried;
- whether a compensating action was triggered;
- whether data state changed.
Pre-action logs explain “why this action was attempted.” Post-action logs explain “what happened in the end.”
Without pre-action logs, an incident record shows only the result, not the intent drift that led to it.
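The two event types can be sketched as a pair of builders that share a call ID, so the post-action record can always be joined back to the pre-action record. The event names agent.tool.pre_action and agent.tool.post_action, the `result.*` fields, and the `should_block` helper are assumptions for illustration.

```python
import time
import uuid


def pre_action_event(trace_id: str, tool: str, plan_step: str, reason: str,
                     risk_level: str, policy_decision: str,
                     approval_required: bool) -> dict:
    """Describe what the agent is about to do (emitted before execution)."""
    return {
        "event.name": "agent.tool.pre_action",  # hypothetical event name
        "trace_id": trace_id,
        "tool.call_id": uuid.uuid4().hex,
        "ts": time.time(),
        "tool.name": tool,
        "agent.plan_step": plan_step,
        "agent.decision": reason,
        "risk.level": risk_level,
        "policy.decision": policy_decision,
        "approval.required": approval_required,
    }


def post_action_event(pre: dict, status: str, retries: int = 0,
                      state_changed: bool = False) -> dict:
    """Describe what actually happened, linked to the pre-action event."""
    return {
        "event.name": "agent.tool.post_action",  # hypothetical event name
        "trace_id": pre["trace_id"],
        "tool.call_id": pre["tool.call_id"],
        "ts": time.time(),
        "result.status": status,
        "result.retries": retries,
        "result.state_changed": state_changed,
    }


def should_block(pre: dict) -> bool:
    """The pre-action event is the last point where execution can be stopped."""
    return pre["policy.decision"] != "allow"


pre = pre_action_event("t1", "user_admin", "Call the account status update tool",
                       "Goal is deactivation", "medium", "allow", True)
post = post_action_event(pre, "success", state_changed=True)
```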
Do Not Turn Logs Into a New Leak Surface
The more detailed agent logs become, the easier it is to create another security problem: storing sensitive data in logs.
I would explicitly avoid logging:
- API keys, tokens, cookies, SSH private keys;
- full prompts and full tool arguments;
- private fields from user input;
- sensitive business data from web pages;
- raw order, account, student, customer, or financial records;
- long memory excerpts.
Better patterns:
- keep raw text only in short-lived secure storage with expiration;
- keep long-term logs to summaries, hashes, resource IDs, and references;
- label high-risk fields separately and restrict queries;
- audit access to the logging system itself;
- exclude raw sensitive logs from training, analytics, and dashboards by default.
Observability is not “save everything.” It is “preserve enough evidence to reconstruct the important path safely.”
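A field-level filter is one way to apply these patterns before an event is written. The sketch below keeps allow-listed identifiers, redacts anything that looks like a credential, and replaces every other value with a hash. The allow-list, the name hints, and the function name `to_log_safe` are all hypothetical and would need to match your own schema.

```python
import hashlib
import json

# Hypothetical allow-list: only identifiers and references are kept as-is.
ALLOWED_FIELDS = {"user_id", "resource_id", "status"}
# Hypothetical hints for credential-like field names; these are never hashed.
SECRET_NAME_HINTS = ("key", "token", "cookie", "password", "secret")


def to_log_safe(args: dict) -> dict:
    """Return a copy of tool arguments that is safer to keep in long-term logs."""
    safe = {}
    for name, value in args.items():
        lowered = name.lower()
        if any(hint in lowered for hint in SECRET_NAME_HINTS):
            safe[name] = "[REDACTED]"
        elif name in ALLOWED_FIELDS:
            safe[name] = value
        else:
            raw = json.dumps(value, sort_keys=True, default=str)
            safe[name] = "sha256:" + hashlib.sha256(raw.encode("utf-8")).hexdigest()
    return safe


safe = to_log_safe({
    "user_id": "u_001",
    "api_token": "tok_live_xyz",
    "note": "customer phone on file",
})
```

The allow-list approach fails closed: a field nobody thought about is hashed rather than logged in full.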
How to Implement It
The first version does not need to be complex. I would implement it in this order.
1. Connect the trace first
A user request, an agent run, and a tool call must be linkable.
At minimum:
- trace_id
- agent.session_id
- agent.intent_id
- tool.call_id
OpenTelemetry already has a log data model with TraceId, SpanId, attributes, and event names. It is a good fit for structured agent events. Do not reduce agent behavior to an unstructured log string.
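To make the shape concrete, here is a standard-library sketch of a record that borrows the field names of the OpenTelemetry log data model (Timestamp, TraceId, SpanId, SeverityText, EventName, Attributes). It deliberately does not use the OpenTelemetry SDK; the helper names are hypothetical, and a real deployment would more likely emit records through an OpenTelemetry logging pipeline than build dictionaries by hand.

```python
import json
import secrets
import time


def new_trace_id() -> str:
    """16 random bytes as 32 hex characters, the size OpenTelemetry uses for TraceId."""
    return secrets.token_hex(16)


def new_span_id() -> str:
    """8 random bytes as 16 hex characters, the size OpenTelemetry uses for SpanId."""
    return secrets.token_hex(8)


def agent_log_record(event_name: str, trace_id: str, span_id: str,
                     attributes: dict, severity: str = "INFO") -> dict:
    """Build a structured record instead of a free-form log string."""
    return {
        "Timestamp": time.time_ns(),  # nanoseconds since the Unix epoch
        "TraceId": trace_id,
        "SpanId": span_id,
        "SeverityText": severity,
        "EventName": event_name,
        "Attributes": attributes,
    }


trace_id = new_trace_id()
record = agent_log_record(
    "agent.tool.execution",
    trace_id,
    new_span_id(),
    {
        "agent.session_id": "sess_123",
        "agent.intent_id": "intent_456",
        "tool.call_id": "call_001",  # hypothetical value
    },
)
line = json.dumps(record)
```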
2. Log at the tool gateway
Do not let every tool invent its own logging format.
A better boundary is a tool gateway:

    Agent → Tool Gateway → Policy Check → Human Approval → Tool Execution

At that boundary, you can consistently record:
- intent, plan, and argument summary before the call;
- policy decision;
- approval status;
- execution result;
- failure and retry behavior.
Putting safety control and logging at the same boundary makes incidents easier to investigate.
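A gateway of this kind can be sketched as a small wrapper that every tool call passes through. Everything here is illustrative: the `ToolGateway` class, the callback signatures, and the event names are assumptions, and the in-memory list stands in for a real log sink.

```python
from typing import Any, Callable


class ToolGateway:
    """Single boundary that runs the policy check, approval, execution, and logging."""

    def __init__(self, policy: Callable[[str, str], str],
                 approver: Callable[[str, str], bool], sink: list):
        self.policy = policy      # returns "allow" or "deny"
        self.approver = approver  # returns True if a human approved
        self.sink = sink          # stand-in for a structured log sink

    def call(self, trace_id: str, tool_name: str, args_summary: str,
             tool: Callable[..., Any], args: dict) -> Any:
        decision = self.policy(tool_name, args_summary)
        approved = decision == "allow" and self.approver(tool_name, args_summary)
        # The pre-action event is written whether or not the call proceeds.
        self.sink.append({
            "event.name": "agent.tool.pre_action",
            "trace_id": trace_id,
            "tool.name": tool_name,
            "tool.args_summary": args_summary,
            "policy.decision": decision,
            "approval.granted": approved,
        })
        if not approved:
            return None
        try:
            result, status = tool(**args), "success"
        except Exception as exc:  # record the failure instead of losing it
            result, status = None, f"error:{type(exc).__name__}"
        self.sink.append({
            "event.name": "agent.tool.post_action",
            "trace_id": trace_id,
            "tool.name": tool_name,
            "result.status": status,
        })
        return result


def update_status(user_id: str, status: str) -> str:
    """Toy tool used only for this example."""
    return f"{user_id}:{status}"


sink: list = []
gateway = ToolGateway(lambda t, s: "allow", lambda t, s: True, sink)
out = gateway.call("t1", "user_admin.update_status", "set user:123 inactive",
                   update_status, {"user_id": "123", "status": "inactive"})
```

Because a denied call still produces a pre-action event, blocked attempts remain visible in the audit trail.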
3. Promote high-risk actions
Not every action needs the same log detail.
These actions deserve higher log severity:
- deleting data;
- exporting large amounts of data;
- changing permissions;
- changing billing or payment state;
- touching production systems;
- accessing sensitive systems;
- committing code or triggering deployments automatically.
For high-risk actions, log intent, policy, approver, pre-action argument summary, and post-action result.
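As a rough illustration, the promotion rule can be a small classifier plus a completeness check on the fields listed above. The action names, the severity labels, and the exact field names in `REQUIRED_HIGH_RISK_FIELDS` are hypothetical and would come from your own tool catalog and schema.

```python
# Hypothetical action types that count as high risk.
HIGH_RISK_ACTIONS = {
    "delete", "bulk_export", "change_permissions",
    "change_billing", "deploy", "commit_code",
}

# Fields a high-risk event is expected to carry (assumed names).
REQUIRED_HIGH_RISK_FIELDS = (
    "user.goal_summary", "policy.decision", "approval.actor",
    "tool.args_summary", "result.status",
)


def log_severity(action: str, touches_production: bool = False,
                 sensitive_system: bool = False) -> str:
    """Promote high-risk actions to a higher severity than routine calls."""
    if action in HIGH_RISK_ACTIONS or touches_production or sensitive_system:
        return "WARN"
    return "INFO"


def missing_high_risk_fields(event: dict) -> list[str]:
    """List the required fields a high-risk event does not carry."""
    return [name for name in REQUIRED_HIGH_RISK_FIELDS if name not in event]


severity = log_severity("delete")
gaps = missing_high_risk_fields({
    "user.goal_summary": "Remove a test account",
    "policy.decision": "allow",
})
```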
4. Start with useful alerts
Do not start with a wall of dashboards.
Start with alerts that expose real agent risk:
- high-risk tool call without human approval;
- read-only plan followed by a write action;
- unusual number of tool retries under one intent;
- tool arguments derived from untrusted context;
- action allowed while the policy engine is unavailable;
- one agent accessing too many resources in a short window;
- execution result that does not match the plan step.
These are difficult to detect with traditional API logs alone.
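Several of these alerts reduce to simple predicates over a single structured event. The sketch below covers five of them; the field names `plan.read_only`, `tool.action_type`, `policy.engine_available`, `result.retries`, and `tool.args_source` are assumptions, and the retry threshold is arbitrary.

```python
def alerts_for(event: dict, max_retries: int = 3) -> list[str]:
    """Return the names of alert rules that a single event triggers."""
    found = []
    if event.get("risk.level") == "high" and not event.get("approval.actor"):
        found.append("high_risk_without_approval")
    if event.get("plan.read_only") and event.get("tool.action_type") == "write":
        found.append("read_only_plan_followed_by_write")
    if (event.get("policy.engine_available") is False
            and event.get("policy.decision") == "allow"):
        found.append("allowed_while_policy_unavailable")
    if event.get("result.retries", 0) > max_retries:
        found.append("excessive_retries")
    if event.get("tool.args_source") == "untrusted":
        found.append("args_from_untrusted_context")
    return found


suspicious = {
    "risk.level": "high",
    "plan.read_only": True,
    "tool.action_type": "write",
    "policy.decision": "allow",
    "policy.engine_available": False,
}
triggered = alerts_for(suspicious)
```

Rules that span multiple events, such as one agent touching too many resources in a short window, need aggregation over a trace or time window rather than a per-event check.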
A Simple Test
When designing agent security logs, ask:
If an agent accidentally deletes data tomorrow, can I answer within 30 minutes: what did the user want, how did the agent interpret it, why was this tool selected, who approved it, and what was the impact?
If not, the logs are not good enough yet.
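The same test can be run mechanically against a recorded trace. This sketch maps each question to one field from the example event earlier in the article; the mapping itself is an assumption, and a field being present does not guarantee that its content is useful.

```python
# Assumed mapping from each question to the field that would answer it.
QUESTIONS = {
    "What did the user want?": "user.goal_summary",
    "How did the agent interpret it?": "agent.plan_step",
    "Why was this tool selected?": "agent.decision",
    "Who approved it?": "approval.actor",
    "What was the impact?": "result.status",
}


def unanswered(trace_events: list[dict]) -> list[str]:
    """Return the questions that no event in the trace can answer."""
    present = set()
    for event in trace_events:
        present.update(k for k, v in event.items() if v not in (None, ""))
    return [q for q, field in QUESTIONS.items() if field not in present]


# A trace that contains only a traditional API audit entry.
api_only = [{"http.method": "DELETE", "http.path": "/api/users/123", "http.status": 200}]
open_questions = unanswered(api_only)
```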
Conclusion
Agent risk does not only come from “does it have permission?”
It also comes from “it had valid permission and still did the wrong thing.”
That means security logs need to evolve.
The next generation of agent security logs should capture:
- user intent;
- agent plan;
- context sources;
- tool selection;
- policy decision;
- human approval;
- actual action;
- execution result.
This is not about assigning blame after the fact. It is about making agent systems explainable, auditable, blockable, and recoverable.
A production-ready AI agent should not only complete work. It should also leave enough evidence to explain why every important step happened.
References
- OWASP: https://genai.owasp.org/resource/agentic-ai-threats-and-mitigations/
- OpenTelemetry Events: https://opentelemetry.io/docs/specs/semconv/general/events/
- OpenTelemetry Logs Data Model: https://opentelemetry.io/docs/specs/otel/logs/data-model/
- NIST AI RMF: https://www.nist.gov/itl/ai-risk-management-framework
