Why are emoji characters like 👩‍👩‍👧‍👦 treated so strangely in Swift strings?

Oct 6, 2020

This has to do with how the String type works in Swift, and how the contains(_:) method works.

👩‍👩‍👧‍👦 is what’s known as an emoji sequence: several emoji joined by zero-width joiners (U+200D) and rendered as one visible character in a string. A string can be viewed as a collection of Character values and, at the same time, as a collection of UnicodeScalar values.

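If you check the character count of the string in Swift 3, you’ll see that it is made up of four characters, while the Unicode scalar count gives a different result. (Swift 4 and later group the whole sequence into a single Character, and the characters view used here has since been deprecated, so the output shown applies to Swift 3.)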

print("👩‍👩‍👧‍👦".characters.count)     // 4
print("👩‍👩‍👧‍👦".unicodeScalars.count) // 7
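For comparison, here is a minimal sketch of the same check on a newer toolchain. It assumes Swift 5 or later, where the string is queried directly instead of through the characters view, and it writes the emoji as explicit scalar escapes so the joiners are visible. The constant names are illustrative only.

```swift
// Assumes Swift 5 or later. The family emoji is spelled out scalar by scalar
// so the three zero-width joiners (U+200D) are visible in the source.
let family = "\u{1F469}\u{200D}\u{1F469}\u{200D}\u{1F467}\u{200D}\u{1F466}"

let characterCount = family.count               // grapheme clusters (Character values)
let scalarCount = family.unicodeScalars.count   // Unicode scalars
let utf16Count = family.utf16.count             // UTF-16 code units
let utf8Count = family.utf8.count               // UTF-8 bytes

print(characterCount) // 1
print(scalarCount)    // 7
print(utf16Count)     // 11
print(utf8Count)      // 25
```

The scalar count is the same 7 in every Swift version; only the grouping into Character values changed.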

Now, if you iterate over the characters and print them, you’ll see what look like normal characters, but in fact the first three characters each contain both an emoji and a zero-width joiner (U+200D) in their UnicodeScalarView:

for char in "👩‍👩‍👧‍👦".characters {
    print(char)
    let scalars = String(char).unicodeScalars.map({ String($0.value, radix: 16) })
    print(scalars)
}

// 👩‍
// ["1f469", "200d"]
// 👩‍
// ["1f469", "200d"]
// 👧‍
// ["1f467", "200d"]
// 👦
// ["1f466"]
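The same inspection can be sketched for a newer toolchain. This assumes Swift 5 or later, where iterating the string yields a single Character for the whole sequence; `hexScalars(of:)` is a hypothetical helper introduced here for illustration, not part of the standard library.

```swift
// Hypothetical helper: lists the scalars of one Character as lowercase hex strings.
func hexScalars(of character: Character) -> [String] {
    return String(character).unicodeScalars.map { String($0.value, radix: 16) }
}

// Assumes Swift 5 or later; the emoji is written as explicit scalar escapes.
let family = "\u{1F469}\u{200D}\u{1F469}\u{200D}\u{1F467}\u{200D}\u{1F466}"

// Iterating a String visits Character values (grapheme clusters).
for character in family {
    print(character, hexScalars(of: character))
}
```

Under that assumption the loop body runs once, and the single Character carries all seven scalars, joiners included.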

As you can see, only the last character does not contain a zero-width joiner, and that explains what contains(_:) does. A plain 👩 or 👧 is not equal to a character made of that emoji plus a zero-width joiner, so searching for the plain emoji finds no match among the first three characters. The only match is the last character, 👦, which carries no joiner.
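If the goal is to ask whether a given emoji appears anywhere inside the sequence, one option is to search the Unicode scalar view rather than the characters. The sketch below assumes Swift 5 or later; the constant names are illustrative.

```swift
// Assumes Swift 5 or later; the emoji is written as explicit scalar escapes.
let family = "\u{1F469}\u{200D}\u{1F469}\u{200D}\u{1F467}\u{200D}\u{1F466}"

let womanScalar: Unicode.Scalar = "\u{1F469}"
let boyScalar: Unicode.Scalar = "\u{1F466}"
let joinerScalar: Unicode.Scalar = "\u{200D}"
let boyCharacter: Character = "\u{1F466}"

// Scalar-level search ignores how scalars are grouped into characters.
let hasWomanScalar = family.unicodeScalars.contains(womanScalar)
let hasBoyScalar = family.unicodeScalars.contains(boyScalar)
let hasJoinerScalar = family.unicodeScalars.contains(joinerScalar)

// Character-level search compares whole grapheme clusters.
let hasBoyCharacter = family.contains(boyCharacter)

print(hasWomanScalar)   // true
print(hasBoyScalar)     // true
print(hasJoinerScalar)  // true
print(hasBoyCharacter)  // false
```

The last result differs from the Swift 3 behavior described above: once the whole family is one Character, even the boy no longer matches at the character level, while the scalar-level search still finds every member.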

Written by Mr.Javed Multani
