user behavior - Are there any letters/numbers that should be avoided in an ID?

Monday, September 2, 2019

For example, an image upload site that gives you a 5-character ID for your file (domain.com/aCd5y).

Should any letters/digits not be used in the final ID, to make it easier for people to read and share links (without copy and paste)?

Should I avoid i, I, l or 1? How about O or 0? In the URL bar in Chrome, 0 doesn't have a line through it, and I (uppercase i) and l (lowercase L) look different, but I'm not sure about other browsers, devices, screens, etc. For example, in this post, uppercase I and lowercase L look the same.

Answer

What you are referring to are called ambiguous characters: characters that look similar to other letters or numerals.

You can get the list of those characters from this C code file on Pwgen.

If you are not comfortable reading C code, the characters and the numerals (and letters) they are confused with are:

B = 8
G = 6
I = 1 = l (lowercase L)
O = 0
Q = D
S = 5
Z = 2
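As a rough illustration of how the list above could be applied, here is a minimal Python sketch that builds an ID alphabet with those characters removed and draws random IDs from it. The names `AMBIGUOUS`, `SAFE_ALPHABET` and `make_id` are made up for this example, and starting from all ASCII letters and digits is an assumption, not something Pwgen prescribes.

```python
import secrets
import string

# Characters from the list above that are easy to confuse with one another.
AMBIGUOUS = set("B8G6I1lO0QDS5Z2")

# Assumed starting point: all ASCII letters and digits, minus the ambiguous ones.
SAFE_ALPHABET = "".join(
    ch for ch in string.ascii_letters + string.digits if ch not in AMBIGUOUS
)


def make_id(length: int = 5) -> str:
    """Return a random ID drawn only from SAFE_ALPHABET."""
    return "".join(secrets.choice(SAFE_ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(SAFE_ALPHABET)
    print(make_id())
```

This leaves 47 characters, so a 5-character ID still has 47^5 (about 229 million) possible values.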

All of this said, I strongly recommend choosing a font that enhances the legibility of your text, as that will help reduce the confusion. Factors to consider when choosing the font include:

Fixed width: For picking out random numbers/letters, fixed width helps tremendously, since the spacing between characters stays constant as you move across the text.

Use a font in which 0 and O look clearly different; those definitely mess people up. Look for other letter/number combinations that are similar. Potentially, leave 0/O out of the mix just for this reason.

Choose a font with subtle serifs and weight changes.

Here is an article worth checking out about font legibility.

I also recommend reading this interesting article on the UX of coupon codes, which has a couple of suggestions on how to remove ambiguity. To quote the article:

Solution 1: Deal with ambiguity

If you are worried about the distinction between O0, 1Il, 8B, or any other combinations, treat them as the same character!

This is what Base32 does. It will standardize on one of the characters above (say the digits 018), and omit the ones that are too similar (in this case OIlB).

When you receive input from the user, map the omitted characters to the canonical ones (e.g. replace the letter O with the digit zero). This way, even if the user can't figure it out, it doesn't matter anyways.
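(Not part of the quoted article.) A minimal Python sketch of that mapping step, assuming the IDs are case-sensitive and never contain O, I, l or B themselves. The table `CANONICAL` and the function `normalize` are hypothetical names chosen for this example.

```python
# Assumed mapping: each omitted look-alike is folded into the character
# that the ID alphabet actually uses (the digits 0, 1 and 8).
CANONICAL = str.maketrans({"O": "0", "I": "1", "l": "1", "B": "8"})


def normalize(code: str) -> str:
    """Strip surrounding whitespace and replace look-alikes with canonical digits."""
    return code.strip().translate(CANONICAL)


if __name__ == "__main__":
    # A user who typed letter O and lowercase L instead of 0 and 1.
    print(normalize(" aOl9x "))  # a019x
```

The lookup against stored IDs would then be done on the normalized string, so it does not matter which of the look-alikes the user typed.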

Solution 2: Remove all ambiguity

Base32 still leaves characters which seem like they may be ambiguous, even if underneath they can't be. For users with a little experience with this, they will still stop to question what they are doing.

Ergo, you can take it a step further and completely remove all characters that could be perceived as ambiguous (e.g. all of 0O1Il8B).

After all, you don't actually need your alphabet size to be a power of two. It is easy enough to convert into arbitrary bases, and you don't need it to be particularly fast either (since this is often coinciding with user input).
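(Not part of the quoted article.) To show that converting into an arbitrary base really is simple, here is a minimal Python sketch that encodes a numeric ID into a 30-character alphabet and back. The alphabet is an assumption made for this example (digits and uppercase letters with 0, O, 1, I, 8 and B removed), and `encode` and `decode` are hypothetical helper names.

```python
# Assumed alphabet: digits and uppercase letters without 0, O, 1, I, 8, B.
# It has 30 characters, so the base is not a power of two.
ALPHABET = "2345679ACDEFGHJKLMNPQRSTUVWXYZ"


def encode(number: int, alphabet: str = ALPHABET) -> str:
    """Convert a non-negative integer to a string in the given alphabet."""
    if number < 0:
        raise ValueError("number must be non-negative")
    base = len(alphabet)
    if number == 0:
        return alphabet[0]
    digits = []
    while number:
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    # Digits were collected least significant first, so reverse them.
    return "".join(reversed(digits))


def decode(text: str, alphabet: str = ALPHABET) -> int:
    """Convert a string in the given alphabet back to an integer."""
    base = len(alphabet)
    value = 0
    for ch in text:
        # str.index raises ValueError for characters outside the alphabet.
        value = value * base + alphabet.index(ch)
    return value


if __name__ == "__main__":
    short_id = encode(123456)
    print(short_id, decode(short_id))
```

Because lowercase letters are left out entirely here, user input could also be uppercased before decoding, which removes the lowercase L problem as well.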
