Why URLs Need Encoding
URLs can only contain a limited set of ASCII characters. Spaces, non-ASCII characters, and reserved URL characters that are meant as data rather than as delimiters must be percent-encoded. Percent-encoding replaces each such character with a percent sign followed by two hex digits for every byte of its UTF-8 encoding, so a multi-byte character becomes several %XX triplets.
- Space → %20
- Ampersand (&) → %26
- Equal sign (=) → %3D
- é → %C3%A9 (two bytes in UTF-8)
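To make the byte-level mapping concrete, here is a minimal sketch that reproduces the table above. The helper name percentEncodeAll is hypothetical, and it deliberately encodes every byte; real encoders such as encodeURIComponent leave letters, digits, and a few marks untouched.

```javascript
// Hypothetical helper (not a built-in): percent-encode every byte of the
// string's UTF-8 representation, including characters that would be safe.
function percentEncodeAll(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes as a Uint8Array
  return Array.from(
    bytes,
    (b) => '%' + b.toString(16).toUpperCase().padStart(2, '0')
  ).join('');
}

console.log(percentEncodeAll(' '));      // → '%20'
console.log(percentEncodeAll('&'));      // → '%26'
console.log(percentEncodeAll('='));      // → '%3D'
console.log(percentEncodeAll('\u00E9')); // é → '%C3%A9' (two UTF-8 bytes)
```

In practice you would call the built-in functions rather than a helper like this; the sketch only shows where the hex digits come from.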
encodeURI vs encodeURIComponent
encodeURI('https://example.com/path?q=hello world')
// Preserves URL structure chars: : / ? & = # @
// → 'https://example.com/path?q=hello%20world'
encodeURIComponent('hello world&lang=fr')
// Encodes everything except A-Z a-z 0-9 - _ . ! ~ * ' ( ), so it is safe for individual values only
// → 'hello%20world%26lang%3Dfr'
Rule: use encodeURIComponent() for individual query parameter names and values. Use encodeURI() only when encoding a complete URL whose structure (scheme, slashes, query delimiters) you want to preserve as-is.
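One way to apply this rule is to encode each key and value separately and only then join them with the structural characters. The sketch below assumes a hypothetical helper named buildUrl and a flat object of parameters; it is an illustration, not a replacement for a full URL library.

```javascript
// Hypothetical helper: build a URL from a base and a flat params object.
// Each key and value is encoded on its own; '?', '=' and '&' are added
// afterwards so they keep their structural meaning.
function buildUrl(base, params) {
  const query = Object.entries(params)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join('&');
  return query ? `${base}?${query}` : base;
}

const url = buildUrl('https://example.com/path', { q: 'hello world&lang=fr' });
console.log(url);
// → 'https://example.com/path?q=hello%20world%26lang%3Dfr'
```

Because the ampersand inside the value is encoded as %26, the server sees a single q parameter instead of an unintended lang parameter.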
Form Encoding vs URL Encoding
HTML forms use application/x-www-form-urlencoded, where spaces are encoded as + instead of %20. This is why search engines historically used q=hello+world — form encoding, not standard percent-encoding.
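The difference shows up in JavaScript as well: URLSearchParams serializes with form encoding, while decodeURIComponent does not treat + as a space. The following sketch illustrates both; decodeFormValue is a hypothetical helper name, and the parameter names are made up for the example.

```javascript
// URLSearchParams serializes using application/x-www-form-urlencoded rules.
const params = new URLSearchParams({ q: 'hello world', tag: 'a&b' });
const formEncoded = params.toString();
console.log(formEncoded); // → 'q=hello+world&tag=a%26b'

// Parsing with URLSearchParams turns '+' back into a space.
const parsed = new URLSearchParams('q=hello+world').get('q');
console.log(parsed); // → 'hello world'

// decodeURIComponent leaves '+' alone, so form-encoded values need an
// extra step. Hypothetical helper:
function decodeFormValue(value) {
  return decodeURIComponent(value.replace(/\+/g, ' '));
}

console.log(decodeURIComponent('hello+world')); // → 'hello+world'
console.log(decodeFormValue('hello+world'));    // → 'hello world'
```

A literal plus sign in form-encoded data arrives as %2B, which is why the replacement has to happen before decoding rather than after.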
Decoding Safely
decodeURIComponent('hello%20world%26lang%3Dfr')
// → 'hello world&lang=fr'
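// decodeURIComponent throws a URIError on malformed input such as '100%'
function safeDecodeURI(str) {
  try {
    return decodeURIComponent(str);
  } catch {
    return str; // fall back to the raw string
  }
}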
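The wrapper matters because decodeURIComponent throws a URIError on malformed sequences. As a variation, the sketch below uses a hypothetical helper named tryDecode that also reports whether decoding succeeded, which can be useful when the caller needs to tell a clean decode from a fallback.

```javascript
// Hypothetical variant: report success alongside the value.
function tryDecode(str) {
  try {
    return { ok: true, value: decodeURIComponent(str) };
  } catch (err) {
    // Only swallow the malformed-input error; rethrow anything unexpected.
    if (err instanceof URIError) {
      return { ok: false, value: str };
    }
    throw err;
  }
}

console.log(tryDecode('hello%20world')); // ok: true,  value: 'hello world'
console.log(tryDecode('100%'));          // ok: false, value: '100%' (dangling percent sign)
console.log(tryDecode('%E9'));           // ok: false, value: '%E9' (not valid UTF-8 on its own)
```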
Common Mistakes
- Double-encoding: encoding an already-encoded value converts %20 to %2520, so always decode before re-encoding
- Using encodeURIComponent on a full URL: this encodes slashes and colons, breaking the URL structure
- Not encoding user input before inserting it into URLs: characters such as & and # can then change the query structure (parameter injection), and this can lead to open redirect vulnerabilities
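The three mistakes above can be reproduced in a few lines. This is an illustrative sketch only; the URLs, the admin parameter, and the variable names are made up for the example.

```javascript
// 1. Double-encoding: the '%' from the first pass is itself encoded as %25.
const once = encodeURIComponent('hello world');
const twice = encodeURIComponent(once);
console.log(once);  // → 'hello%20world'
console.log(twice); // → 'hello%2520world'

// 2. encodeURIComponent on a full URL encodes ':' '/' '?' and '='.
const broken = encodeURIComponent('https://example.com/path?q=1');
console.log(broken); // → 'https%3A%2F%2Fexample.com%2Fpath%3Fq%3D1'

// 3. Unencoded user input can smuggle in extra parameters.
const userInput = 'cats&admin=true'; // hypothetical hostile input
const unsafeUrl = new URL('https://example.com/search?q=' + userInput);
const safeUrl = new URL('https://example.com/search?q=' + encodeURIComponent(userInput));
console.log(unsafeUrl.searchParams.get('admin')); // → 'true' (injected parameter)
console.log(safeUrl.searchParams.get('admin'));   // → null
console.log(safeUrl.searchParams.get('q'));       // → 'cats&admin=true'
```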