URL Scanner
Paste any URL and get a heuristic risk score. The scanner detects typosquatting against 40+ major brands (for example, PayPaI vs PayPal), Cyrillic/Greek homoglyphs, high-abuse TLDs, IP addresses used as hosts, URL shorteners, credentials embedded in the URL, excessive subdomains, and brand impersonation in subdomains or paths. It runs 100% in your browser: the URL is never visited and never sent to any server.
How to Use the URL Scanner
- Paste a suspicious URL into the input. A scheme is prepended automatically if missing (e.g., paypal-secure.tk → http://paypal-secure.tk).
- Click Analyze. The scanner parses the URL locally and runs a battery of heuristics. No HTTP request is made and the URL is never visited.
- Read the parsed breakdown: protocol, host, registrable domain, TLD, port, path, query, label count.
- Each detected signal is grouped by severity (critical/high/medium/low) with a bilingual explanation of WHY the pattern is suspicious.
- Use the score (0–100) and signal count as a triage indicator. A 0 score means no heuristics triggered — it does NOT mean the URL is safe.
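The parsing step in the list above can be sketched with the Python standard library. This is only an illustration of a local, network-free breakdown, not the scanner's own code: the function name `breakdown` and the dictionary keys are hypothetical, and the registrable domain is approximated as the last two labels, which is wrong for suffixes such as co.uk (a real implementation would consult the Public Suffix List).

```python
from urllib.parse import urlsplit


def breakdown(raw: str) -> dict:
    """Split a URL string into its parts. Nothing here touches the network."""
    text = raw.strip()
    # Prepend a scheme when none is present, as the first step describes.
    if "://" not in text:
        text = "http://" + text
    parts = urlsplit(text)
    host = (parts.hostname or "").rstrip(".")
    labels = host.split(".") if host else []
    # ASSUMPTION: naive registrable domain = last two labels.
    # This is a placeholder for a Public Suffix List lookup and does not
    # special-case IP addresses.
    registrable = ".".join(labels[-2:]) if len(labels) >= 2 else host
    return {
        "protocol": parts.scheme,
        "host": host,
        "registrable_domain": registrable,
        "tld": labels[-1] if labels else "",
        "port": parts.port,  # None when the URL has no explicit port
        "path": parts.path,
        "query": parts.query,
        "label_count": len(labels),
    }


if __name__ == "__main__":
    for key, value in breakdown("paypal-secure.tk").items():
        print(f"{key}: {value}")
```

Because `urlsplit` only parses the string, running this on a suspicious URL is as safe as reading it.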
What the Scanner Detects
Detecting a phishing URL by inspection alone is a constrained problem: you have one string and a set of patterns that attackers have historically used. This scanner implements the patterns that still appear in current threat reports and that don't require fetching the URL (which would itself be risky and would tip off the attacker).

**Typosquatting** detection uses Levenshtein distance to compare the registrable domain's core (e.g., 'paypa1' from 'paypa1.com') against a curated list of 40+ frequently impersonated brands: PayPal, Amazon, Microsoft, Apple, Google, Meta, Netflix, Coinbase, Binance, Bank of America, Chase, BBVA, Santander, CaixaBank, Correos, DHL, FedEx, IRS, AEAT, MetaMask, and more. An edit distance of 1–2 characters from a real brand triggers a critical or high signal.

**Homoglyph attacks** are detected by scanning each label for mixed-script characters. Cyrillic 'а' (U+0430) is visually identical to Latin 'a' (U+0061). The scanner flags any label that mixes ASCII letters with non-Latin Unicode, which is a textbook IDN homograph attack. Punycode prefixes (xn--) are also surfaced separately.

**Abused TLDs** are matched against a list maintained from Spamhaus, Cloudflare Radar, and other public abuse feeds: .tk, .ml, .ga, .cf, .gq (legacy Freenom), .top, .xyz, .click, .icu, .buzz, .live, .country, .win, .loan, and others with disproportionately high phishing rates.

**Structural red flags** include:
- IP addresses used as hostnames (legitimate brands use names)
- the user@host credentials trick (the part before @ reads like the destination, but the browser navigates to the host after it)
- URL shorteners that hide the destination (bit.ly, tinyurl, t.co, etc.)
- excessive subdomains (login.paypal.security.attacker.com pushes the real domain off-screen on mobile)
- heavy percent-encoding (more than 5 %xx sequences, a known evasion technique)
- non-standard ports
- HTTP without HTTPS

**Brand impersonation in subdomains or paths** is detected by searching for brand keywords ('paypal', 'amazon', 'microsoft', etc.) in the hostname or path when the registrable domain is NOT the brand's official one. 'paypal.com.attacker.tk' and 'attacker.com/paypal-login' both trigger this.

The risk score is a weighted sum of all triggered signals, capped at 100. Critical signals add ~35 each, high ~20, medium ~12, low ~6. The level mapping is: 0–9 safe, 10–24 low, 25–44 medium, 45–69 high, 70+ critical.
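The typosquatting and homoglyph checks described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the scanner's actual code: `BRANDS` is a small subset of the brand list, the function names are hypothetical, and "non-Latin" is approximated by looking at each character's Unicode name.

```python
import unicodedata

# ASSUMPTION: illustrative subset of the 40+ brand list.
BRANDS = ["paypal", "amazon", "microsoft", "apple", "google", "netflix"]


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def typosquat_target(core: str, max_distance: int = 2):
    """Return (brand, distance) for the closest brand within max_distance.

    An exact match (distance 0) is the brand itself, not a typo, so it
    is skipped. Returns None when no brand is close enough.
    """
    core = core.lower()
    best = None
    for brand in BRANDS:
        d = levenshtein(core, brand)
        if 1 <= d <= max_distance and (best is None or d < best[1]):
            best = (brand, d)
    return best


def is_mixed_script(label: str) -> bool:
    """True when a label mixes ASCII letters with non-Latin letters."""
    has_ascii = any(ch.isascii() and ch.isalpha() for ch in label)
    has_non_latin = any(
        ch.isalpha()
        and not ch.isascii()
        and not unicodedata.name(ch, "").startswith("LATIN")
        for ch in label
    )
    return has_ascii and has_non_latin


if __name__ == "__main__":
    print(typosquat_target("paypa1"))
    print(is_mixed_script("p\u0430ypal"))  # second letter is Cyrillic U+0430
```

Accented Latin letters such as 'é' are deliberately not treated as foreign script here, so a label like 'café' is not flagged, and a label written entirely in one non-Latin script is not "mixed" either.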
Frequently Asked Questions
Does the scanner visit the URL or send it anywhere?
No. The URL is parsed and analyzed locally in your browser. There is no HTTP request, no DNS lookup, no API call. You can verify this in the Network tab: there are zero outbound requests when you click Analyze. This is intentional, because visiting a phishing URL can leak your IP, trigger downloads, or get you on a 'verified visitor' list.
Why are TLDs like .xyz flagged when legitimate sites use them?
The TLD signal is medium-severity precisely because many legitimate sites use these domains. A .xyz site for a side project is fine; a .xyz site impersonating PayPal is not. Judge the TLD signal together with the other detected signals: a single 'abused TLD' hit on an otherwise clean domain is just informational.
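To make that concrete, here is a small sketch of the weighted, capped score and the level bands described earlier. The weights are the approximate figures quoted on this page (the page says "~35", "~20", and so on), so treat the exact numbers and the names `risk_score` and `risk_level` as assumptions.

```python
# ASSUMPTION: the page gives these weights as approximate values.
WEIGHTS = {"critical": 35, "high": 20, "medium": 12, "low": 6}

# Lower bound of each band, checked from the highest down.
LEVELS = [(70, "critical"), (45, "high"), (25, "medium"), (10, "low"), (0, "safe")]


def risk_score(severities) -> int:
    """Sum the weight of every triggered signal and cap the total at 100."""
    return min(100, sum(WEIGHTS[s] for s in severities))


def risk_level(score: int) -> str:
    """Map a 0-100 score to its band."""
    for floor, name in LEVELS:
        if score >= floor:
            return name
    return "safe"


if __name__ == "__main__":
    lone_tld = risk_score(["medium"])
    print(lone_tld, risk_level(lone_tld))  # 12 low

    tld_plus_typo = risk_score(["medium", "critical"])
    print(tld_plus_typo, risk_level(tld_plus_typo))  # 47 high
```

With these numbers, an abused TLD alone lands in the 'low' band, while the same TLD combined with one critical signal such as a typosquat jumps to 'high'.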
Why doesn't the scanner check Safe Browsing?
Safe Browsing requires a server-side API call (and a key). This tool is intentionally 100% client-side so it can't leak the URL you're analyzing. Phishing campaigns sometimes monitor lookups against their domains to learn that they've been spotted. For a reputation check, paste the URL into VirusTotal or urlscan.io yourself once you decide it's worth that risk.
Can the scanner catch every phishing URL?
No tool can. Heuristics catch known patterns; a new campaign can use a clean, freshly registered domain with no abuse markers. Treat a low score as 'no obvious red flags', not 'safe'. The strongest defense is context: did you expect this URL? Does the sender match? Were you led to it by a message that pressured you? When in doubt, type the brand's URL by hand.