Building an AI agent that works is one problem. Building one that is secure enough to run in production, handle real customer data, and survive a basic penetration test is a completely different challenge — and most developers skip it entirely.
The security failures I see in AI deployments are not exotic. They are the same patterns that plagued web applications a decade ago, now showing up in AI systems because developers are moving fast and treating security as a later problem. It never is. An AI agent connected to your CRM, your database, and your customer communication channels is a high-value target. If it is not hardened properly, it is a liability, not an asset.
The 5 most common AI security failures I see in the wild
Exposed API keys in client-side code. This is the most common and the most embarrassing. The OpenAI API key is in the JavaScript bundle. Anyone who opens DevTools can find it, copy it, and run up a bill on your account. Every AI API call must go through your own backend. Never call an LLM API directly from the browser.
No rate limiting on AI endpoints. An AI inference endpoint without rate limiting is an open invitation for abuse. A single script can send thousands of requests, burning through your token budget in minutes and giving an attacker unlimited attempts to extract sensitive information through prompt injection. Rate limiting should be implemented at the API gateway level, not as an afterthought in application code.
Trusting user input without sanitisation. Prompt injection is real. If your AI agent takes user input and passes it directly into a system prompt or a tool call without sanitisation, a sufficiently clever user can manipulate the agent's behaviour — potentially accessing data it should not, bypassing restrictions, or triggering unintended tool executions. Every input that touches an AI reasoning layer needs to be treated as untrusted.
Overprivileged MCP tools and API connections. When you give an AI agent access to an external API, scope that access to the minimum required. A calendar tool that needs to create and read events does not need delete permissions. A CRM tool that reads contact data does not need permission to delete records. The principle of least privilege is not optional: it is the single most important architectural decision in an AI agent deployment.
No audit trail for AI actions. When an AI agent takes an action — sends a message, updates a record, creates a booking — that action should be logged with full context: what input triggered it, what the model decided, what tool was called, what the result was. Not just for debugging, but for accountability. If something goes wrong, you need to be able to reconstruct exactly what happened and why.
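To make the untrusted-input point from the list above concrete: one common approach is to validate every tool call the model proposes against an allowlist before executing it, rather than relying on the prompt to keep the agent in line. The sketch below is illustrative only. The `ToolCall` shape, the `lookup_booking` tool, the booking ID format, and the 2000-character cap are all hypothetical, and validation like this reduces the impact of prompt injection but does not eliminate it.

```typescript
// Hypothetical shape of a tool call proposed by the model.
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

type ValidationResult = { ok: true } | { ok: false; reason: string };

// A rule returns null when the arguments are acceptable, or an error message.
type ArgRule = (args: Record<string, unknown>) => string | null;

// Allowlist: only tools listed here can ever be executed (names are illustrative).
const TOOL_RULES = new Map<string, ArgRule>([
  [
    "lookup_booking",
    (args) =>
      typeof args["bookingId"] === "string" && /^[A-Z0-9]{6,12}$/.test(args["bookingId"])
        ? null
        : "bookingId must be 6-12 uppercase letters or digits",
  ],
]);

const MAX_INPUT_CHARS = 2000; // assumed cap, tune per use case

// Basic checks on raw user text before it reaches the model.
function validateUserInput(input: string): ValidationResult {
  if (input.trim().length === 0) return { ok: false, reason: "input is empty" };
  if (input.length > MAX_INPUT_CHARS) return { ok: false, reason: "input too long" };
  return { ok: true };
}

// Checks a model-proposed tool call against the allowlist and its argument rule.
function validateToolCall(call: ToolCall): ValidationResult {
  const rule = TOOL_RULES.get(call.tool);
  if (!rule) return { ok: false, reason: `tool not allowed: ${call.tool}` };
  const error = rule(call.args);
  return error === null ? { ok: true } : { ok: false, reason: error };
}
```

The design choice here is that the model's output is treated exactly like user input: nothing it proposes runs unless it passes the same server-side checks.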
How I structure security in every production AI deployment
Every AI system I put into production runs through the same security checklist before it goes live. Authentication is JWT-based with role-scoped claims, issued server-side, transmitted in httpOnly cookies (never localStorage). Token refresh is handled silently. Sessions expire. Admin endpoints require explicit admin-role claims validated on every request.
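As a rough illustration of the per-request admin check described above, the sketch below assumes a JWT library has already verified the token's signature and handed back its claims. The `role` claim name and the `checkAdmin` helper are hypothetical; `exp` is the standard JWT expiry claim, in seconds since the epoch.

```typescript
// Claims as they might look after signature verification by a JWT library.
interface VerifiedClaims {
  sub: string;  // user ID
  role: string; // hypothetical custom claim carrying the user's role
  exp: number;  // expiry, seconds since epoch
}

type AccessDecision = { allowed: true } | { allowed: false; reason: string };

// Runs on every admin request: rejects expired sessions and non-admin roles.
function checkAdmin(
  claims: VerifiedClaims,
  nowSeconds: number = Math.floor(Date.now() / 1000)
): AccessDecision {
  if (claims.exp <= nowSeconds) return { allowed: false, reason: "token expired" };
  if (claims.role !== "admin") return { allowed: false, reason: "admin role required" };
  return { allowed: true };
}
```

Passing the current time in as a parameter keeps the check deterministic in tests; a real handler would simply call `checkAdmin(claims)`.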
All AI API calls route through server-side API routes. No client-side LLM calls, no exceptions. Secrets live in Vercel environment variables or a secrets manager, never in .env files committed to Git.
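A minimal sketch of the server-side half of that pattern follows. It builds the upstream request on the server, where the key is available, and strips the response down to what the browser actually needs. The environment variable name `OPENAI_API_KEY`, the model name, and both helper functions are assumptions for illustration; the URL and response shape follow OpenAI's Chat Completions API.

```typescript
// Environment passed in as a plain record so the function is easy to test;
// in a real server route this would typically be process.env.
type Env = Record<string, string | undefined>;

interface UpstreamRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request the server sends to the LLM provider. The key never leaves the server.
function buildLlmRequest(env: Env, userMessage: string): UpstreamRequest {
  const apiKey = env["OPENAI_API_KEY"]; // assumed variable name
  if (!apiKey) throw new Error("OPENAI_API_KEY is not configured on the server");
  return {
    url: "https://api.openai.com/v1/chat/completions",
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "gpt-4o-mini", // placeholder model name
      messages: [{ role: "user", content: userMessage }],
    }),
  };
}

// Shapes what goes back to the browser: the reply text only, no key, no upstream headers.
function toClientResponse(upstream: {
  choices: { message: { content: string } }[];
}): { reply: string } {
  const first = upstream.choices[0];
  return { reply: first ? first.message.content : "" };
}
```

A server route would pass the result of `buildLlmRequest` to `fetch`, parse the JSON, and return `toClientResponse(...)` to the client.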
Rate limiting sits at the edge (Cloudflare or Vercel middleware) before requests reach application logic. AI endpoints get tighter limits than standard endpoints because the cost of abuse is higher.
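To illustrate the idea of tighter limits for AI endpoints, here is a minimal fixed-window limiter. It is a sketch under stated assumptions, not what Cloudflare or Vercel run internally: the class name and the specific limits are hypothetical, and the in-memory map would not be shared across edge instances, so a production setup would rely on the platform's rate limiting or a shared store.

```typescript
interface LimitConfig {
  limit: number;    // max requests allowed per window
  windowMs: number; // window length in milliseconds
}

class FixedWindowLimiter {
  private config: LimitConfig;
  private windows = new Map<string, { start: number; count: number }>();

  constructor(config: LimitConfig) {
    this.config = config;
  }

  // Returns true if the request is allowed, false if the caller is over the limit.
  // `key` is whatever identifies the caller, e.g. a user ID or IP address.
  allow(key: string, now: number = Date.now()): boolean {
    let current = this.windows.get(key);
    // Start a fresh window if none exists or the previous one has elapsed.
    if (!current || now - current.start >= this.config.windowMs) {
      current = { start: now, count: 0 };
      this.windows.set(key, current);
    }
    if (current.count >= this.config.limit) return false;
    current.count += 1;
    return true;
  }
}

// Illustrative numbers only: AI routes get a much smaller budget than standard routes.
const standardLimiter = new FixedWindowLimiter({ limit: 60, windowMs: 60000 });
const aiLimiter = new FixedWindowLimiter({ limit: 10, windowMs: 60000 });
```

A request handler would call `aiLimiter.allow(userId)` before doing any work and return HTTP 429 when it yields `false`.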
Every external API connection through an MCP server or direct integration uses scoped credentials with the minimum permissions required. Credentials rotate on a schedule. Access logs are monitored.
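The scoping rule can also be enforced in the agent layer itself, as a second line of defence behind the credentials. The sketch below mirrors the calendar and CRM examples from earlier in the post; the scope strings, the `GRANTS` table, and `isPermitted` are hypothetical names, not part of MCP or any specific API.

```typescript
// Hypothetical scope names; real APIs define their own.
type Scope =
  | "calendar:read"
  | "calendar:create"
  | "calendar:delete"
  | "crm:read"
  | "crm:delete";

// Each tool is granted only the scopes it needs. Delete scopes are deliberately absent.
const GRANTS = new Map<string, ReadonlySet<Scope>>([
  ["calendar", new Set<Scope>(["calendar:read", "calendar:create"])],
  ["crm", new Set<Scope>(["crm:read"])],
]);

// Deny by default: unknown tools and ungranted scopes both return false.
function isPermitted(tool: string, scope: Scope): boolean {
  const granted = GRANTS.get(tool);
  return granted !== undefined && granted.has(scope);
}
```

With this table, `isPermitted("calendar", "calendar:create")` returns `true`, while `isPermitted("calendar", "calendar:delete")` and `isPermitted("crm", "crm:delete")` both return `false`.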
All AI actions are logged to a dedicated audit collection with timestamps, user IDs, inputs, model decisions, tool calls, and outcomes. Logs are append-only.
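A rough sketch of what one such audit record and an append-only interface might look like is below. The field names follow the list above, but the exact shape, the `AuditLog` class, and the in-memory array are assumptions for illustration. In a real deployment the append-only guarantee has to come from the datastore's permissions, not from application code.

```typescript
interface AuditEntry {
  readonly timestamp: string;       // ISO 8601
  readonly userId: string;
  readonly input: string;           // what triggered the action
  readonly modelDecision: string;   // what the model decided
  readonly toolCall: string | null; // which tool was called, if any
  readonly outcome: string;         // what the result was
}

// In-memory stand-in for a dedicated audit collection.
// It exposes append and read only: there is no update or delete method.
class AuditLog {
  private entries: AuditEntry[] = [];

  append(entry: Omit<AuditEntry, "timestamp">, now: Date = new Date()): AuditEntry {
    // Freeze the record so it cannot be altered after it is written.
    const record: AuditEntry = Object.freeze({ ...entry, timestamp: now.toISOString() });
    this.entries.push(record);
    return record;
  }

  // Returns a copy so callers cannot mutate the underlying log.
  list(): readonly AuditEntry[] {
    return [...this.entries];
  }
}
```

Logging the input, decision, tool call, and outcome together in one record is what makes it possible to reconstruct a single action end to end later.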
The HIPAA case: when security becomes compliance
I built a HIPAA AI risk assessment tool for a healthcare client where security was not just good practice; it was a legal requirement. The tool evaluated AI use cases against the HIPAA Security Rule (45 CFR Part 164, Subpart C) using a deterministic rules engine. Every assessment produced an audit-ready report identifying which security safeguards were in place, which were missing, and what the risk level was.
The experience reinforced something I had already learned from building production systems: security decisions made at the architecture stage cost almost nothing. The same decisions made after launch cost enormous amounts of time, money, and sometimes reputation.
Build it right the first time. The shortcuts are never worth it.



