Normally, domain names can't contain Unicode; to use Unicode characters in a domain name, they need to be encoded into Punycode (see Punycode - Wikipedia).
So far so good, but where lies the problem?
Punycode domain names can be used for phishing:
taking a well-known address like google.com and replacing some letters with similar or identical-looking ones from a different alphabet makes for a nearly perfect phishing link.
Example:
Links from https://github.com/blazeinfosec/advisories/blob/master/signal-advisory.txt:
Legitimate URL: http://blazeinfosec.com
Malicious URL: http://blаzeinfosec.com - with the ‘a’ as a Cyrillic character, not Latin
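To make the difference visible, here is a minimal Python sketch that converts both hostnames to their ASCII form. It assumes Python's built-in "idna" codec (which implements the older IDNA 2003 rules) is good enough for illustration; the helper name to_punycode is my own and not from any library.

```python
def to_punycode(hostname: str) -> str:
    """Return the ASCII (Punycode) form of a hostname.

    Uses Python's built-in "idna" codec; pure-ASCII names come back unchanged.
    """
    return hostname.encode("idna").decode("ascii")


legit = "blazeinfosec.com"
# \u0430 is CYRILLIC SMALL LETTER A, which looks like the Latin "a"
spoofed = "bl\u0430zeinfosec.com"

print(to_punycode(legit))    # blazeinfosec.com
print(to_punycode(spoofed))  # xn--blzeinfosec-zij.com
```

The two names look the same on screen, but only the spoofed one turns into an xn-- label.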
Some people classified this as a security risk in Signal, although I don't expect my messenger to warn me; personally, I believe this is normally a user error and at most the responsibility of the browser.
(Rule of thumb: do not click or trust links from people you don't trust!)
What can and should we do about it?
As stated above, I don't think this is something that we as a messenger are responsible for, but we can still be nice and warn users about such links if we want.
A few methods come to my mind:
- showing the Punycode-encoded version of the link (xn--blzeinfosec-zij.com)
- showing a confirmation dialog when clicking on Punycode links (possibly with a whitelist that the user can fill: "don't ask again for this domain")
- (desktop) showing the full link in a tooltip when hovering over the link
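The confirmation-dialog idea could be sketched roughly like this in Python. This is only an illustration under my own assumptions: the function name needs_confirmation and the allowlist parameter are hypothetical, the allowlist is assumed to store hostnames in their ASCII form, and the built-in "idna" codec stands in for whatever the client would really use.

```python
from urllib.parse import urlsplit


def needs_confirmation(url, allowlist=frozenset()):
    """Return True if the link's hostname has a Punycode label and is not allowlisted.

    `allowlist` is a hypothetical set of ASCII hostnames the user has
    marked as "don't ask again for this domain".
    """
    host = urlsplit(url).hostname or ""
    try:
        ascii_host = host.encode("idna").decode("ascii")
    except UnicodeError:
        # Hostname cannot be encoded at all: better to ask than to open silently.
        return True
    if ascii_host in allowlist:
        return False
    # Punycode labels are marked with the "xn--" prefix.
    return any(label.startswith("xn--") for label in ascii_host.split("."))


print(needs_confirmation("http://blazeinfosec.com"))        # False
print(needs_confirmation("http://bl\u0430zeinfosec.com"))   # True
```

Storing the ASCII form in the allowlist means a look-alike domain never matches an entry the user approved earlier.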
Those methods could be refined with more algorithms to let legitimate Punycode URLs through, like münchen.de:
the method that Firefox uses: IDN Display Algorithm - MozillaWiki
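As a rough idea of what such a refinement could look like, here is a much simplified mixed-script check in Python. It is not Firefox's actual algorithm, only a heuristic of my own: it guesses a character's script from the first word of its Unicode name, and the helper names scripts_in_label and display_form are hypothetical.

```python
import unicodedata


def scripts_in_label(label):
    """Guess the scripts used in one DNS label from Unicode character names.

    Heuristic assumption: the first word of the character name
    ("LATIN", "CYRILLIC", ...) identifies the script.
    Digits and hyphens are ignored.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts


def display_form(hostname):
    """Show Unicode if every label uses a single script, otherwise Punycode."""
    mixed = any(len(scripts_in_label(label)) > 1 for label in hostname.split("."))
    if mixed:
        return hostname.encode("idna").decode("ascii")
    return hostname


print(display_form("m\u00fcnchen.de"))         # münchen.de
print(display_form("bl\u0430zeinfosec.com"))   # xn--blzeinfosec-zij.com
```

This lets münchen.de through because ü is still a Latin letter, while the Cyrillic "а" inside an otherwise Latin label triggers the Punycode display. A domain written entirely in look-alike Cyrillic letters would still pass, and legitimate script combinations (for example Japanese names mixing several scripts) would be flagged, so a real implementation needs more rules than this.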
Related Links: