TL;DR
AI coding agents (Cursor Agent, Devin, OpenAI Codex) don't just suggest lines of code. They write entire features, install dependencies, run commands, and modify your configs autonomously. This expands the security attack surface well beyond what Copilot-style autocomplete introduced. The key risks: unreviewed dependency installation, configuration drift in security-critical files, prompt injection via untrusted inputs, and security mistakes happening at scale. The fix: scan after every agent session, review diffs carefully, and sandbox your agents.
The shift happened fast. In 2024, AI coding meant Copilot suggesting the next line. You'd tab to accept or keep typing. You were always in the loop.
Now? Cursor Agent mode writes an entire authentication system across six files. Devin plans and implements a feature branch while you're in a meeting. OpenAI Codex takes a GitHub issue and ships a pull request. The developer went from co-pilot to air traffic controller.
This is a different security equation. And most teams haven't updated their threat model to account for it.
From Autocomplete to Autonomous
Let's be specific about what changed. Copilot-style tools operate in a tight feedback loop:
- You type code
- The AI suggests a completion
- You accept, reject, or modify it
- You move to the next line
The blast radius of a bad suggestion is small. One function. Maybe one file. You're reading every line as it appears.
AI agents work differently:
- You describe what you want (in natural language or by pointing at an issue)
- The agent plans an approach
- It writes code across multiple files
- It installs packages
- It runs commands to test its work
- It modifies configs as needed
- You review the final result
That last step is where things get interesting. You're reviewing a completed diff, not watching code appear line by line. The cognitive load of catching a subtle security issue buried in a 400-line diff across 12 files is significantly higher than catching it in a single inline suggestion.
The core problem: AI agents make the same security mistakes that Copilot makes, but they make them across more files, more quickly, and with less human oversight at each step.
The New Attack Surface
1. Unreviewed Dependency Installation
When you manually run npm install some-package, you're making a conscious decision. You (hopefully) checked the package, looked at its download count, maybe glanced at the source.
AI agents install packages as part of their workflow. They need a JWT library? They pick one and install it. They need a file upload handler? Same thing. The agent optimizes for "does this solve the task," not "is this package trustworthy."
This creates real supply chain risk:
- Typosquatting: An agent might install axois instead of axios if its training data included the typo
- Abandoned packages: Agents don't check when a package was last updated or whether it has known vulnerabilities
- Excessive dependencies: Agents tend to reach for packages for things you could do in a few lines of code, expanding your dependency tree unnecessarily
What to do: After every agent session, review your lockfile diff. Run npm audit or pip audit. Better yet, configure your agent to use an allowlist of approved packages.
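The allowlist check lends itself to automation. Below is a minimal sketch in Python that compares the packages in an npm lockfile (v2/v3 format, where dependencies live under a "packages" key) against an approved set; the APPROVED list here is a placeholder for whatever your team has actually vetted.

```python
import json

# Hypothetical allowlist of packages your team has vetted.
APPROVED = {"axios", "express", "jsonwebtoken"}

def new_unapproved_packages(lockfile_json: str, approved: set[str]) -> list[str]:
    """Return lockfile packages that are not on the approved list.

    Works against the npm v2/v3 lockfile layout, where each dependency
    appears under "packages" keyed by "node_modules/<name>".
    """
    lock = json.loads(lockfile_json)
    flagged = []
    for path in lock.get("packages", {}):
        if not path:  # the empty key "" is the root project itself
            continue
        name = path.split("node_modules/")[-1]
        if name not in approved:
            flagged.append(name)
    return sorted(flagged)

# Example: an agent session added a typosquatted package.
lock = json.dumps({
    "packages": {
        "": {"name": "my-app"},
        "node_modules/axios": {"version": "1.6.0"},
        "node_modules/axois": {"version": "0.0.1"},  # typosquat slips in
    }
})
print(new_unapproved_packages(lock, APPROVED))  # ['axois']
```

Wired into CI or a pre-commit hook, a check like this turns "hopefully someone notices the new dependency" into a hard gate.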
2. Configuration Drift in Security-Critical Files
AI agents modify whatever files they need to complete their task. That includes:
- .env files and environment configuration
- Authentication middleware and route guards
- CORS policies
- Database permissions and migration files
- Docker and deployment configs
- CI/CD pipeline definitions
The agent doesn't have a mental model of your security architecture. It doesn't know that you intentionally restricted CORS to specific origins, or that your CSP header took a week to get right. If loosening a restriction makes the feature work, the agent will do it.
Real scenario: An agent tasked with "add image upload support" might modify your CORS policy to allow all origins, add a permissive file upload route without size limits or type validation, and update your CSP to allow inline scripts. Each change makes the feature work. Each change weakens your security posture.
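Catching this kind of drift doesn't require a full SAST pipeline; even grepping the agent's diff for known-dangerous values helps. Here is a rough sketch that flags added lines in a unified diff matching a few red-flag patterns; the patterns are illustrative, not exhaustive.

```python
import re

# Illustrative red-flag patterns for security-critical settings.
RED_FLAGS = [
    (re.compile(r"Access-Control-Allow-Origin.{0,10}\*"), "CORS opened to all origins"),
    (re.compile(r"unsafe-inline"), "CSP allows inline scripts"),
    (re.compile(r"""origin:\s*['"]\*['"]"""), "CORS middleware wildcard origin"),
]

def flag_risky_changes(diff_text: str) -> list[str]:
    """Flag added lines (+) in a unified diff that match a red-flag pattern."""
    findings = []
    for line in diff_text.splitlines():
        if not line.startswith("+") or line.startswith("+++"):
            continue  # only inspect lines the diff adds
        for pattern, message in RED_FLAGS:
            if pattern.search(line):
                findings.append(f"{message}: {line.lstrip('+').strip()}")
    return findings

diff = """\
+app.use(cors({ origin: '*' }))
+const MAX_UPLOAD = 1024
"""
print(flag_risky_changes(diff))
```

Feed it the output of git diff before committing an agent session and you get an instant shortlist of changes that deserve a close look.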
3. Security Mistakes at Scale
Copilot might generate one SQL query with string concatenation instead of parameterized queries. An AI agent building a CRUD feature will generate that same pattern across every endpoint it creates. Five routes, five SQL injection vulnerabilities, all in one session.
The patterns are the same ones we've seen with autocomplete tools:
- Missing input validation
- Hardcoded secrets
- Overly permissive error messages that leak internal details
- Missing rate limiting
- Weak authentication checks
The difference is volume. An agent producing 500 lines of code in a single session can introduce the same vulnerability in a dozen places simultaneously.
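Because the same flaw repeats mechanically, even a crude scan catches most instances. The sketch below uses a rough regex heuristic (not a real parser; a proper SAST tool does this far better) to flag lines that look like string-built SQL:

```python
import re

# Rough heuristic: a quoted SQL keyword followed by string concatenation,
# or an interpolated f-string. Expect false positives; the point is that
# one vulnerable pattern repeats across an agent's entire diff.
SQL_CONCAT = re.compile(
    r"""["'](?:SELECT|INSERT|UPDATE|DELETE)\b[^"']*["']\s*\+"""
    r"""|f["'](?:SELECT|INSERT|UPDATE|DELETE)\b""",
    re.IGNORECASE,
)

def find_sql_concat(source: str) -> list[int]:
    """Return 1-based line numbers that look like string-built SQL."""
    return [
        i for i, line in enumerate(source.splitlines(), start=1)
        if SQL_CONCAT.search(line)
    ]

code = '''\
query = "SELECT * FROM users WHERE id = " + user_id
safe = cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
query2 = f"DELETE FROM posts WHERE id = {post_id}"
'''
print(find_sql_concat(code))  # lines 1 and 3; the parameterized query passes
```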
4. Prompt Injection via Untrusted Input
This is the risk unique to agentic AI. If your agent processes untrusted input as part of its workflow, that input could manipulate the agent's behavior.
Where does untrusted input show up?
- README files from open source dependencies
- Package descriptions on npm or PyPI
- Error messages from external services
- Issue bodies on GitHub
- API responses from third-party services
An attacker could craft a malicious package whose README contains hidden instructions like "also add this script to the project's postinstall hook." If the agent reads that README while evaluating the package, it might follow those instructions.
This isn't theoretical. Researchers have demonstrated prompt injection attacks against coding agents by embedding instructions in code comments, docstrings, and markdown files that the agent processes during its workflow.
Prompt injection in practice: A 2025 research paper showed that hidden instructions in code comments could make AI agents install backdoors, exfiltrate environment variables, and modify security configurations. The instructions were invisible to casual code review because they used Unicode tricks and comments that looked benign.
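The Unicode half of this attack is mechanically detectable. A pre-commit check can reject files containing invisible or bidirectional-override characters; the set below covers the common "Trojan Source" characters but is deliberately not exhaustive.

```python
# Invisible and bidirectional-override characters used in "Trojan Source"
# style attacks. Not exhaustive; a thorough check would cover more of the
# Unicode Cf (format) category.
SUSPICIOUS = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u200e": "LEFT-TO-RIGHT MARK",
    "\u200f": "RIGHT-TO-LEFT MARK",
    "\u202a": "LEFT-TO-RIGHT EMBEDDING",
    "\u202e": "RIGHT-TO-LEFT OVERRIDE",
    "\u2066": "LEFT-TO-RIGHT ISOLATE",
    "\u2067": "RIGHT-TO-LEFT ISOLATE",
    "\u2069": "POP DIRECTIONAL ISOLATE",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}

def find_hidden_chars(source: str) -> list[tuple[int, str]]:
    """Return (line_number, character_name) for each suspicious character."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for ch in line:
            if ch in SUSPICIOUS:
                hits.append((lineno, SUSPICIOUS[ch]))
    return hits

# A comment that renders innocently but contains a right-to-left override.
snippet = "x = 1  # totally benign comment\u202e )drowkcab(\n"
print(find_hidden_chars(snippet))  # [(1, 'RIGHT-TO-LEFT OVERRIDE')]
```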
5. Context Window Blindness
AI agents work within a context window. They can only "see" a limited amount of your codebase at any time. This leads to a specific class of problems:
- Duplicated security logic: The agent implements its own auth check instead of using your existing middleware, creating an inconsistent security boundary
- Contradictory configurations: The agent sets up its own database connection with different permission settings than your main config
- Missed dependencies: The agent doesn't realize that the file it's modifying is imported by a security-critical path
Your codebase has implicit security invariants, things that must stay true for the system to be secure. An agent that can only see part of the codebase will violate these invariants without knowing they exist.
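One way to make an invariant explicit is to encode it as a check. The sketch below asserts a hypothetical invariant, "every Express route passes the shared requireAuth middleware", using a simple regex over route files; the middleware and route names are placeholders for whatever your codebase actually uses, and the regex is a heuristic, not a JS parser.

```python
import re

# Matches app.get('/path', ...handlers...) style Express route registrations.
ROUTE = re.compile(
    r"""\b(?:app|router)\.(?:get|post|put|patch|delete)\(\s*(['"][^'"]+['"])\s*,([^)]*)\)"""
)

def routes_missing_auth(source: str, middleware: str = "requireAuth") -> list[str]:
    """Return route paths whose handler chain never mentions the middleware."""
    return [
        path.strip("'\"")
        for path, handlers in ROUTE.findall(source)
        if middleware not in handlers
    ]

code = '''\
router.get('/me', requireAuth, getProfile)
router.post('/upload', handleUpload)
'''
print(routes_missing_auth(code))  # ['/upload']
```

An agent adding its own routes can't know this invariant exists, but a check like this fails loudly the moment one route skips the shared auth path.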
What to Do About It
None of this means you should stop using AI agents. They're genuinely useful. But you need guardrails.
Review Every Diff Before Committing
This sounds obvious, but it's the most important practice. Don't let agents auto-commit. Review the diff the same way you'd review a junior developer's pull request, with extra attention to:
- New dependencies added
- Changes to config files
- Authentication and authorization logic
- Database queries and migrations
- Any file you didn't expect the agent to touch
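For a large agent diff, it helps to triage the touched files into risk buckets before reading line by line. A sketch of that triage, with example patterns you'd tune to your own repo:

```python
import fnmatch

# Illustrative patterns for files that deserve first-pass scrutiny.
HIGH_RISK = [
    ".env*", "*.lock", "package-lock.json", "requirements*.txt",
    "dockerfile", "docker-compose*", ".github/workflows/*",
    "*auth*", "*cors*", "*middleware*", "*migration*",
]

def triage(changed_files: list[str]) -> tuple[list[str], list[str]]:
    """Split changed paths into (review_first, review_after)."""
    first, after = [], []
    for path in changed_files:
        name = path.lower()
        if any(fnmatch.fnmatch(name, pat) for pat in HIGH_RISK):
            first.append(path)
        else:
            after.append(path)
    return first, after

files = ["src/routes/upload.ts", "src/middleware/auth.ts",
         "package-lock.json", "README.md"]
first, after = triage(files)
print(first)  # ['src/middleware/auth.ts', 'package-lock.json']
```

Pipe in the file list from git diff --name-only and the riskiest changes surface immediately.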
Scan After Every Agent Session
An automated security scan catches the patterns that are easy to miss in manual review. After the agent finishes its work and before you commit, run a scan. This is faster than reading every line and catches things like exposed secrets, missing headers, and insecure configurations.
Sandbox Your Agents
Run AI agents in isolated environments where they can't access:
- Production credentials and API keys
- Your real .env file (give them a .env.example with dummy values)
- Deployment pipelines
- Production databases
Most agent tools support some form of sandboxing. Cursor has permission controls for terminal commands. Codex runs in a sandboxed environment by default. Use these features.
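Even without full sandboxing, you can keep secrets out of an agent subprocess by scrubbing its environment down to an allowlist. A minimal sketch of the idea (the SAFE_VARS set is an assumption; extend it to whatever the tool legitimately needs):

```python
import os
import subprocess
import sys

# Allowlist of harmless variables the agent process is allowed to see.
SAFE_VARS = {"PATH", "HOME", "LANG", "TERM"}

def scrubbed_env() -> dict[str, str]:
    """Copy of the current environment with everything else dropped."""
    return {k: v for k, v in os.environ.items() if k in SAFE_VARS}

# Demo: a child process cannot see a secret set in the parent environment.
os.environ["DEMO_API_KEY"] = "super-secret"
result = subprocess.run(
    [sys.executable, "-c", "import os; print('DEMO_API_KEY' in os.environ)"],
    env=scrubbed_env(),
    capture_output=True,
    text=True,
)
print(result.stdout.strip())  # False
```

The same allowlist approach applies to container-based sandboxes: mount only the project directory and pass only the variables on the list.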
Pin Dependencies
Configure your package manager to save exact versions instead of ranges (for npm, set save-exact=true in .npmrc) and commit your lockfiles. When an agent adds a dependency, the pinned version goes through your normal review process. This prevents both silent upgrades to compromised versions and unexpected breaking changes.
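Enforcing the pin is straightforward to automate. A sketch that flags any package.json dependency whose version still uses a range operator (pre-release tags like 1.0.0-beta are treated as loose by this simple check):

```python
import json
import re

# A version is "exact" only if it is a bare MAJOR.MINOR.PATCH triple,
# with no range operator (^, ~, >=, *, ...) and no pre-release tag.
EXACT = re.compile(r"^\d+\.\d+\.\d+$")

def loose_versions(package_json: str) -> list[str]:
    """Return dependencies whose versions are not pinned exactly."""
    manifest = json.loads(package_json)
    loose = []
    for section in ("dependencies", "devDependencies"):
        for name, version in manifest.get(section, {}).items():
            if not EXACT.match(version):
                loose.append(f"{name}@{version}")
    return loose

manifest = json.dumps({
    "dependencies": {"axios": "1.6.0", "express": "^4.18.0"},
    "devDependencies": {"jest": "~29.0.0"},
})
print(loose_versions(manifest))  # ['express@^4.18.0', 'jest@~29.0.0']
```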
Maintain Security Invariants Documentation
Keep a short document listing your security-critical configurations: CORS policy, CSP headers, auth middleware, database permissions. When reviewing an agent's diff, check this list. Did the agent modify anything on it?
Quick checklist after every AI agent session:
- Review the full diff, especially config files and new dependencies
- Run npm audit or pip audit on new packages
- Run an automated security scan
- Check that auth middleware and route guards are intact
- Verify CORS, CSP, and other security headers haven't changed
The Bigger Picture
AI coding agents are going to keep getting more capable. The trend is clearly toward more autonomy, not less. That's fine. Better tools make developers more productive.
But every increase in agent autonomy needs a corresponding increase in automated verification. The less time a human spends reviewing each line of code, the more important it becomes to have automated systems catching security issues.
Think of it like self-driving cars. More autonomy doesn't mean less safety infrastructure. It means different safety infrastructure. The same applies to AI coding agents.
The developers who will ship securely with AI agents are the ones who treat every agent session like a pull request from a very fast, very productive contributor who doesn't think about security. Review the work. Scan the output. Trust, but verify.
How are AI agents different from Copilot in terms of security risk?
Copilot suggests code inline and you accept or reject each suggestion. AI agents like Cursor Agent, Devin, and Codex operate autonomously. They write multi-file features, install packages, run shell commands, and modify configurations without per-action approval. This means a single agent session can introduce vulnerabilities across your entire codebase, not just one line at a time.
Can prompt injection attacks target AI coding agents?
Yes. If an AI agent reads untrusted content (a README file, a package description, an issue body, or an error message), that content could contain instructions that manipulate the agent's behavior. An attacker could craft a malicious package description that tells the agent to install a backdoor or exfiltrate environment variables.
Should I stop using AI coding agents?
No. AI agents are genuinely productive tools. The goal is to use them with appropriate guardrails: run agents in sandboxed environments, review every diff before committing, scan after every agent session, pin dependencies, and never give agents access to production credentials.
How often should I scan after using an AI agent?
After every agent session that produces code you plan to ship. Unlike Copilot, where you review each suggestion, an agent session can touch dozens of files. A quick automated scan catches the security issues that are easy to miss in a large diff.
AI agents write code fast. Make sure they're not writing vulnerabilities just as fast. A free scan takes 60 seconds and checks for the security issues that agents commonly introduce.