Strings and Characters in Swift: Behind the Scenes

Strings are a basic data type that is useful in programming. Yes!

It is clear that Strings and Characters are related in Swift. How? This article is dedicated to the details in Swift!

Photo by Derek Story on Unsplash

Difficulty: Beginner | Easy | Normal | Challenging

Prerequisites:

  • Some knowledge of Binary for programmers (something like THIS); Binary, Denary and Hex are detailed HERE
  • Understand the difference between reference and value types (guide HERE)

Terminology

Technical terminology

Scalar: A single value, as differentiated from a vector or matrix

Scalar Values: A term derived from linear algebra for single values, as distinct from a data structure that holds several values

Unicode: A standard for encoding, representation and handling of text

UnicodeScalar: A type representing a single Unicode Scalar value

UInt8: An 8-bit unsigned Integer type

UInt32: A 32-bit unsigned Integer type

General Strings and Character terminology

Character: A character, usually associated with a letter of the alphabet

Strings: A collection of Characters, commonly thought of as a word or sentence

Front of House: Strings

In Swift, a String is a value type (a struct). Copy-on-write is used for Strings, which is great for performance when longer Strings are used, since the String’s storage is only copied when a change is made.

Awesome!
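As a minimal sketch of what value semantics look like in practice (the variable names are purely illustrative), changing a copy of a String leaves the original untouched:

```swift
// Illustrative names only. String is a struct, so assignment gives an independent value.
let original = "Hello"
var copy = original          // behaves like a copy from this point on
copy.append(", playground")  // mutating the copy does not affect the original

print(original) // Hello
print(copy)     // Hello, playground
```

The sharing of storage between the two values is an implementation detail and cannot be observed from this code; what can be observed is that the two values behave independently.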

Behind the scenes

Swift’s String and Character types are Unicode compliant. But what does Unicode mean, and why should we be interested?

Unicode is a standard for the encoding, representation and handling of text. A single Unicode value is known as a Scalar, that is, a single value as differentiated from a vector or matrix.

Unicode represents characters from languages all around the world, and is used by browsers to interpret characters transmitted through web pages on the Internet. Since its encodings use between 8 and 32 bits per character, there is much more space for these different characters than in an 8-bit character set (there are more than 3000 commonly used characters in Chinese alone, and double that once both simplified and traditional forms are counted).

The extra space in Unicode allows the storage of characters like emoji.

However, Unicode uses the same codes as ASCII for the first 128 characters (the values 0 to 127).
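To make the different encoding widths and the ASCII overlap concrete, here is a small sketch using the views that String exposes (the constant names are illustrative):

```swift
// Illustrative names only. "\u{1F600}" is the grinning face emoji.
let emoji = "\u{1F600}"
print(emoji.count)                 // 1 (one Character)
print(emoji.unicodeScalars.count)  // 1 (one Unicode scalar)
print(emoji.utf16.count)           // 2 (two 16-bit code units)
print(emoji.utf8.count)            // 4 (four 8-bit code units)

// An ASCII letter keeps its ASCII code in Unicode.
let letter = "A"
print(Array(letter.utf8))          // [65]
print(Unicode.Scalar("A").value)   // 65
print(Unicode.Scalar("A").isASCII) // true
print(emoji.unicodeScalars.allSatisfy { $0.isASCII }) // false
```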

Swift’s String type is said to be built from Unicode Scalar Values. A Unicode Scalar Value is a 21-bit number. The maximum number that can be stored in 21 bits is 111111111111111111111 in binary, the equivalent of 2097151 in denary.

Written out in binary, the largest number that can be stored in 21 bits is a run of 21 ones.

This gives Swift a LOT of values to potentially use for characters.

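However, not all of the 21-bit values are used: valid Unicode Scalar Values only run from 0 to 0x10FFFF (1114111 in denary), and the surrogate range 0xD800 to 0xDFFF is excluded.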
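A short sketch can check the 21-bit arithmetic, and show that the failable Unicode.Scalar initialiser rejects numbers that are not scalar values (the constant names are illustrative):

```swift
// Illustrative names only.
let largest21BitNumber = (1 << 21) - 1
print(largest21BitNumber)                    // 2097151
print(String(largest21BitNumber, radix: 2))  // a run of 21 ones

// Unicode.Scalar(UInt32) is failable: it returns nil for invalid values.
let highestScalar = Unicode.Scalar(UInt32(0x10FFFF))       // the highest valid scalar value
let surrogate = Unicode.Scalar(UInt32(0xD800))             // reserved for UTF-16 surrogates
let tooLarge = Unicode.Scalar(UInt32(largest21BitNumber))  // fits in 21 bits, but is not a scalar

print(highestScalar != nil) // true
print(surrogate == nil)     // true
print(tooLarge == nil)      // true
```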

Now there are two ways of thinking about a Unicode Scalar Value: the character it represents, and the number behind it. Let us start with the character itself:

let firstCharacter: UnicodeScalar = "A"

Now something really interesting happens when we run this in Playgrounds.

In the results area on the right-hand side of the Playground there is a number. That number is 65. What’s it doing there?

What has happened is that a UnicodeScalar is created, and that instance represents that unique value.

The 65? The numerical representation of the same.

This means that you can create the UnicodeScalar from the defined numeric value.

For example, the following code will print the letter A to the console:

let letterA: UnicodeScalar = UnicodeScalar(65)
print(letterA) // A

Now it should be clear that we can print that letter to the console, and it is represented by a number.
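As a hedged sketch of the same idea with the integer types spelled out (the constant names are illustrative): the UInt8 initialiser cannot fail, while the UInt32 initialiser returns an optional because not every 32-bit number is a scalar value.

```swift
// Illustrative names only.
// UInt8 initialiser: every value from 0 to 255 is a valid scalar, so there is nothing to unwrap.
let scalarA = Unicode.Scalar(UInt8(65))
print(scalarA)             // A
print(Character(scalarA))  // A

// UInt32 initialiser: failable, so the result is unwrapped with optional binding.
if let grinningFace = Unicode.Scalar(UInt32(0x1F600)) {
    print(Character(grinningFace)) // 😀
}
```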

Swift has an initialiser (for either Unicode.Scalar or UnicodeScalar, they’re the same thing) that accepts a String, and the resulting scalar’s value property returns a UInt32. This only works if the String represents exactly one Unicode scalar value.

let uIntValue: UInt32 = Unicode.Scalar("a").value // 97

This is perfectly fine, but what if we want to convert a String with multiple characters?

That’s the next section. Before that, Swift has us covered for converting a UInt32 to binary. Let us see:

let uIntValue: UInt32 = Unicode.Scalar("a").value
let binaryString = String(uIntValue, radix: 2)

binaryString is, perhaps predictably, 1100001 in binary, which is 97 in denary, and we can see the result of that (for completion’s sake):

if let number = Int(binaryString, radix: 2) {
    print(number)
}

Which reassuringly will print out 97 to the console.
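The same conversion can be applied to every scalar in a longer String through its unicodeScalars view. The sketch below assumes a hypothetical helper, binaryString(for:width:), which is not part of Swift and is only here to pad the digits to a fixed width:

```swift
// Hypothetical helper: the binary digits of a scalar's value, left-padded with zeros.
func binaryString(for scalar: Unicode.Scalar, width: Int = 8) -> String {
    let digits = String(scalar.value, radix: 2)
    let padding = String(repeating: "0", count: max(0, width - digits.count))
    return padding + digits
}

let word = "Hi"
let binaryValues = word.unicodeScalars.map { binaryString(for: $0) }
print(binaryValues) // ["01001000", "01101001"]

// Converting back to numbers, as in the round trip above.
let numbers = binaryValues.compactMap { UInt32($0, radix: 2) }
print(numbers) // [72, 105]
```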

Converting Strings

It is usually quite easy to traverse a String to produce Characters:

var str = "Hello, playground"
for ch in str {
    print(ch)
    let stringOfCharacter = String(ch)
    print(stringOfCharacter)
}

Each ch within the for loop is of type Character, and can be converted to a String using the initialiser as shown above within the loop.

One disadvantage of this is that converting a Character to a String can be rather costly.

Swift has us covered! Swift has an instance property asciiValue to help us out. It works on a Character but, and it is a but, this returns an optional UInt8 (nil when the Character is not an ASCII character).

for ch in str {
    let asciiValue: UInt8? = ch.asciiValue
}
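A quick sketch of why the result is optional (the constant names are illustrative): an ASCII Character has a value, while a Character outside ASCII gives nil.

```swift
// Illustrative names only. "\u{E9}" is the letter e with an acute accent.
let plainA: Character = "A"
let accentedE: Character = "\u{E9}"

print(plainA.asciiValue == 65)      // true
print(accentedE.asciiValue == nil)  // true, since the accented letter is outside ASCII
```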

One (fairly obvious, and computationally performant) solution is to use optional binding and create an array of UInt8:

var str = "Hello"
var array: [UInt8] = []
for ch in str {
    if let asciiValue: UInt8 = ch.asciiValue {
        array.append(asciiValue)
    }
}
print(array)

This gives us the expected answer of [72, 101, 108, 108, 111] (if you wish to print the binary equivalent, the instructions are shown above).
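As an alternative sketch, not the approach used in the loop above, compactMap can do the optional unwrapping and the array building in one step (the constant names are illustrative):

```swift
// Illustrative names only.
let greeting = "Hello"
let asciiValues = greeting.compactMap { $0.asciiValue } // nil results are dropped
print(asciiValues) // [72, 101, 108, 108, 111]

// Characters outside ASCII are silently skipped, just as with optional binding.
let accented = "H\u{E9}llo"
print(accented.compactMap { $0.asciiValue }) // [72, 108, 108, 111]
```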

However, what if we wished to replace each character with the next alphabetically, and return this back to a String (which will result in Hello -> Ifmmp)?

Adding one to each value is simple: we do array.append(asciiValue + 1), which then gives us the next problem. How do we convert this back to a String?

Taking the array as stated above:

var opStr = ""
for val in array {
    opStr += String(Character(UnicodeScalar(val)))
}
print(opStr)

is a winner. It gives us “Ifmmp” as output to the console.

But what if, rather than using a loop, we could use a map operation?

It turns out that we can:

print(String(array.map { Character(UnicodeScalar($0)) }))

Which of course prints out our expected “Ifmmp” to the console!
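Putting the pieces together, here is one possible sketch of the whole transformation as a single function. The name nextLetters is hypothetical, and the wrap-around from z to a (and Z to A) is an assumption added here, since simply adding one would turn z into the { character:

```swift
// Hypothetical function: shifts each ASCII letter to the next one alphabetically.
// Assumption: z wraps to a and Z wraps to A; every other Character is left unchanged.
func nextLetters(_ input: String) -> String {
    let shifted = input.map { ch -> Character in
        guard let ascii = ch.asciiValue, ch.isLetter else { return ch }
        switch ascii {
        case UInt8(ascii: "z"): return "a"
        case UInt8(ascii: "Z"): return "A"
        default: return Character(UnicodeScalar(ascii + 1))
        }
    }
    return String(shifted)
}

print(nextLetters("Hello"))       // Ifmmp
print(nextLetters("Hello, Zoe!")) // Ifmmp, Apf!
```

Unlike the array of UInt8 built earlier, this version keeps punctuation and spaces in place rather than dropping or shifting them.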

Conclusion:

Strings and Characters are useful, and can be a favourite of some of the quizzes and puzzles that are online for budding coders.

Hopefully this article has shown that there is a little more going on under the hood of Strings and Characters.

Extend your knowledge

  • Apple have a rather skinny guide on Strings and Characters (Guide HERE)

The Twitter contact:

Any questions? You can get in touch with me HERE
