What is a Hash Function?
A hash function takes any input and produces a fixed-size output called a digest. The same input always produces the same output, but you cannot reverse the process. This one-way property, combined with the avalanche effect (a tiny input change produces a completely different hash), makes hash functions the backbone of passwords, checksums, and digital signatures.
The Three Core Properties
- Deterministic: SHA-256("hello") always equals the same 64-character hex string
- Avalanche effect: SHA-256("hello") and SHA-256("Hello") share zero recognizable bits
- Collision resistant: Computationally infeasible to find two different inputs producing the same hash
MD5 — Fast But Broken for Security
MD5 produces a 128-bit (32 hex character) hash. Researchers demonstrated practical collision attacks — two different files with the same MD5 hash. You can forge a file with the same MD5 as a trusted one.
Acceptable: Non-security checksums (verifying a download was not corrupted), cache keys, database partition keys.
Never use for: Passwords, digital signatures, or any security-critical verification.
SHA-256 — The Current Standard
SHA-256 produces a 256-bit (64 hex character) hash. No practical collision attacks exist. Used in TLS certificates, Bitcoin mining, Git commit hashes, and HMAC API signatures. SHA-256 is fast — good for checksums, bad for passwords (a GPU can compute billions per second).
For Passwords: Use Slow Hash Functions
Password hashing needs a deliberately slow algorithm that GPUs cannot parallelize efficiently:
- bcrypt — most widely supported; tunable cost factor
- scrypt — memory-hard; harder to attack with specialized hardware
- Argon2 — winner of the Password Hashing Competition; the modern best choice
Never use MD5, SHA-1, or SHA-256 directly for passwords, even with a salt.
Practical Uses
- File integrity: SHA-256 checksums confirm a file was not corrupted or tampered with
- API signatures: HMAC-SHA256 signs API requests to prevent tampering in transit
- Content addressing: Git uses SHA-1 (migrating to SHA-256) to address every file and commit
- Deduplication: Storage systems hash files to find duplicates without comparing byte-by-byte