Data Types

Data types in Python, as in most other languages, can be a rather involved topic. Here we will cover the most important data types in Python, and we will cover them in some depth. I will try not to overload you with too much information up front, but there is a lot to cover here. So, let's get started.

Booleans

Booleans are probably the simplest data type in programming. A boolean is simply a value of True or False. Another way to say this is that a boolean records whether a condition is true or false. Let's look at this through the eyes of a Python programmer. Well, the eyes of every programmer. Let's assign a value to the variable x and another value to the variable y.

x=5
y=6

Now, we all know that 6 is greater than 5. That should mean that y is greater than x, correct? Let's ask the computer if we're right. We need to pass the comparison y>x into the print() function like this:

print(y>x)

If we run this, we get an output on the console of True. What if we changed the sign? Perhaps we decided to put y<x instead:

print(y<x)

If you run that now, you'll find that the new output is False. Booleans come from comparing values, as we've seen here, but what about acting on a comparison? Let's say that we have some massive calculation we're trying to do and we want to find out whether one result is greater than another. This gets a little ahead of ourselves, into material from the next part of the series, but that won't hurt anything. With our existing x and y values, let's see how they compare to each other. Remember, pretend this comes after much more extensive calculations and we don't know the values exactly. So, let's say that we want y to be greater than x. The best way to approach this is to say: if y is greater than x, do something. In this case we are going to ask the computer to print True. The code looks like this:

if y > x:
    print(True)
else:
    print(False)

The last part of that statement asks the computer to tell us the comparison is False if that is the case. Also, take notice of the indentation. This is key to the Python interpreter. If you're using an IDE that understands Python and its inherent requirements, the indentation is automatic. If you're still in the console, you will have to provide the indentation manually with TAB or four spaces. We will cover this and other rules like it later, but go ahead and run the code.

With the original values in place, the code should print True. Go ahead and change the values to make x greater and see what happens. False should be the new value printed. There is one more trick with Booleans. It is a carry-over from earlier versions of Python and has little relevance to what we're doing here, but I'm going to mention it anyway.

The values of True and False are associated with numbers. Those who have programmed in other languages will automatically know which ones. True is equal to 1 while False is equal to 0. This also means that True and False can be treated mathematically. Here are a couple of examples:

>>>True + True

2

>>>True * True

1

>>>True - False

1

>>>False/True

0.0

Obviously, there are more examples, but you get the point. I would say this makes a good way to start talking about numbers. Before we move on, I would like to introduce you to one more function. The type() function can tell us what type of value we are using. This will come in handy when you're dealing with data frames and large databases. Go ahead and run the code below with the x from the previous examples.

print(type(x))

The console should print <class 'int'>.
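As a small sketch of type() at work on the values from this section, the snippet below checks an integer and the result of a comparison. The variable names comparison and total are only illustrative. It also shows why the arithmetic on True and False works: bool is a subclass of int.

```python
x = 5
comparison = 6 > x

print(type(x))           # <class 'int'>
print(type(comparison))  # <class 'bool'>

# bool is a subclass of int, which is why True + True gives 2.
print(isinstance(True, int))  # True
total = True + True
print(total)  # 2
```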

Numbers

Numbers are used in everyday life. It's just about the one thing that we can all agree upon. Well, their existence at least. A computer looks at numbers a particular way. Different languages look at numbers similarly, but they have minor differences among them.

The simplest number is known as an integer. An integer is simply any whole number. Well, sort of. If we were to take a stroll through the other common languages in programming, we would find that integers have their limits. In particular, in the C family of languages (C/C++ and Java) a standard integer is typically limited to a range of -(2³¹) to 2³¹-1. Python 2 worked in a similar way: any whole number outside the range of a plain integer became a separate long integer type. That separate type was removed in Python 3. Now, as long as it's a whole number (without decimals), it's an integer, no matter how large. The denotation for this is 'int'.
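A quick way to see this for yourself is to step past the 32-bit limit and check the type. This is only a sketch, and the names small, big, and huge are made up for the example.

```python
small = 2**31 - 1   # the largest value of a 32-bit signed integer
big = 2**31         # one past that limit
huge = 2**100       # far beyond any 64-bit integer

# All three are plain int values in Python 3.
print(type(small), type(big), type(huge))
print(huge)  # 1267650600228229401496703205376
```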

The next type is known as a float value. A float is simply a number with a decimal point. Even if the number is 1.0, the computer will see a float value. Floats and integers are stored differently. A Python float is normally a 64-bit double-precision value, while a Python integer takes as much memory as its size requires. Floating-point arithmetic can also behave differently from integer arithmetic, both in speed and in precision. For what we will be doing here, this shouldn't be an issue, but keep this in mind as you start to venture out on your own.

There are also limitations associated with Python's floating-point numbers. Most decimal fractions can't be stored exactly in binary, so the computer keeps the closest value it can. For instance, the value stored for 0.1 is actually a tiny bit larger than 0.1 (roughly 0.1000000000000000055), yet Python only shows you 0.1 in the console. That doesn't mean that the rest of the number is lost. It is still there and still takes part in calculations, which is why a result such as 0.1 + 0.2 can come out slightly off. Python's official tutorial covers this in more depth in its chapter on floating-point arithmetic.
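If you are curious about the digits the console does not show, one way to peek at them is sketched below, using format() and the standard library's decimal module. The variable names are only for illustration.

```python
from decimal import Decimal

x = 0.1
print(x)                  # 0.1
print(format(x, '.20f'))  # 0.10000000000000000555
print(Decimal(x))         # the exact stored value, digit by digit

# The hidden digits still take part in arithmetic.
total = 0.1 + 0.2
print(total)              # 0.30000000000000004
print(total == 0.3)       # False
```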

There is one more type of number in Python. That is the complex number. This may be a refresher for some and new to others, but a complex number is a real number added to an imaginary number, where an imaginary number is a real number multiplied by the square root of negative one. In mathematics this is written like 14 + 3i, where i is the square root of negative one. Python uses the letter j instead of i, so the same number would be represented as 14+3j, where 14 is the real part and 3j is three times the imaginary unit.

IMPORTANT: If you are trying to convert from a string to a complex number, you need to remove any white space around the plus (+) or minus (-) symbol. For example, complex('14+3j') works, but complex('14 + 3j') raises an error.
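Here is a short sketch of complex numbers in practice, including the white-space rule for strings. The names z, parsed, and spaced_ok are made up for this example, and the try/except block is only there so the failed conversion doesn't stop the script.

```python
z = 14 + 3j          # Python writes the imaginary unit as j

print(type(z))       # <class 'complex'>
print(z.real)        # 14.0
print(z.imag)        # 3.0

parsed = complex('14+3j')   # no spaces around the plus sign
print(parsed == z)          # True

try:
    complex('14 + 3j')      # spaces around the plus sign
    spaced_ok = True
except ValueError:
    spaced_ok = False
print(spaced_ok)            # False
```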

These are the different types of numbers in Python. Let's see what we can do with them now. In this portion we will cover the different operators involved with numbers. We already know how to add and subtract with + and -. Multiplication and division are covered by * and / respectively. To denote an exponent, we use the double asterisk (**). This would look like x**y for the equivalent of x raised to the power of y. The other way to do this is to use the pow() function by passing in the base followed by the exponent. This would look like pow(x,y). Again, the same results.
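To see the basic operators side by side, here is a minimal sketch with two sample values chosen just for this example. Notice that / hands back a float even when both inputs are integers.

```python
x = 2
y = 5

print(x + y)      # 7
print(x - y)      # -3
print(x * y)      # 10
print(x / y)      # 0.4
print(x ** y)     # 32
print(pow(x, y))  # 32
```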

These next three are important to keep in your toolbox. You also need to understand how data types are affected by each of them.

The first is called the modulus operator. It is noted by the % sign. It divides two numbers and hands you the remainder. In other words, if we divide 10 by 3, we get 3 as the whole number and 1 as the remainder, so 10 % 3 is 1. This is typically used to determine whether one number is evenly divisible by another.

The next operator is the floored quotient operator. It looks like // and returns the result of the division rounded down to a whole number. That same calculation written as 10 // 3 returns only 3. This is handy for simplified equations or when paired with the modulus operator to display an answer that would look like 3 r1, or 3 remainder 1. One danger is that dividing a smaller number by a larger one returns zero (1 // 2 is 0). The bigger danger is the common offense of mistyping: / and // are one keystroke apart and give different results.

The last one of this nature is a combination of the previous two. It is the divmod() function. It performs both of those calculations at the same time and returns the whole number and the remainder as a pair. Using our example, divmod(10, 3) returns (3, 1).
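The three tools can be tried out as in the sketch below. It also shows how the data type of the inputs affects the result, and how floor division rounds down with negative numbers. The names is_even, quotient, and remainder are illustrative only.

```python
print(10 % 3)         # 1
print(10 // 3)        # 3
print(divmod(10, 3))  # (3, 1)

# A typical divisibility check: the remainder is zero.
is_even = 10 % 2 == 0
print(is_even)        # True

# Types matter: a float on either side gives float results.
print(10.0 // 3)      # 3.0
print(10.0 % 3)       # 1.0

# Floor division rounds down, which shows with negative numbers.
print(-10 // 3)       # -4

# The pair from divmod() can be split into two variables.
quotient, remainder = divmod(10, 3)
print(quotient, remainder)  # 3 1
```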

Some of the other important operations include converting a string to a number. This comes up a lot in large data sets that need to be cleaned up. There are times a table may try to pass off a number as a string. If the number is a whole number, we can convert it with the int() function. Simply pass the string value into the parentheses and an integer value will come out the other side. Something like int(x) will create the desired result. This can also convert floats to integers, dropping everything after the decimal point. The same is true with float(). Pass x through the parentheses, and out comes a float value. If an integer is passed in, it will just have a .0 tacked onto the end.
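A small sketch of these conversions follows. The string '42' stands in for a value that arrived as text, and the variable names are invented for the example.

```python
text = '42'   # a number stored as a string

as_int = int(text)
print(as_int, type(as_int))      # 42 <class 'int'>

as_float = float(text)
print(as_float, type(as_float))  # 42.0 <class 'float'>

print(int(3.9))     # 3 (the decimal part is dropped, not rounded)
print(float(7))     # 7.0

# int() does not accept a string with a decimal point directly,
# so one option is to go through float() first.
from_decimal_text = int(float('3.9'))
print(from_decimal_text)  # 3
```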

Strings

The string type is actually referred to as the "Text Sequence" type. If you're new to programming, then think of a string in Python as a series of characters (numbers or letters) encapsulated in single or double quotes. The best way to look at this is through an example. We can assign strings to variables just like we did with numbers. So, that's what we'll do here.

x = 'Hello Python!'

print(x)

x = "Hello Python!"

print(x)

There's no real difference there, but each style allows the other kind of quote to be used inside it. In other words, I would use a string wrapped in single quotes in order to display double quotes within the string. The opposite direction works the same way. The other way to denote strings is the triple quote. You can use either single or double quotes, just three of them on each side. The special part is that all white space is maintained. White space is simply any non-printing character, including spaces, carriage returns (the enter key), and tabs. So, if you have a paragraph or a specific spacing in mind, you can use the triple-quote encapsulation and maintain the text in whatever way you deem necessary. Let's view some examples.

x = "This is a 'quote' using single quotes inside."

y = 'This is a "quote" using double quotes inside.'

print(x)

print(y)

The output should read:

This is a 'quote' using single quotes inside.

This is a "quote" using double quotes inside.

Now we can try out the triple quotes with the following:

x = '''This is a triple quote that I

can write whatever I want

and use as many lines as I want.'''

print(x)

Output:

This is a triple quote that I

can write whatever I want

and use as many lines as I want.

This is a good time to bring up mutable versus immutable. Strings are immutable, which means they can't be changed in place. A list, which we will meet shortly, can be changed. Those who are familiar with the C family of languages will think of a string as an array of characters that can be edited one at a time. This isn't the case in Python. You can read a single character through its index, but if you try to modify one that way, you will receive an error message. This doesn't stop you from building a new string with the new character and assigning it to a variable.

The best way I can think to explain this is to compare it to a printing press. In the C family of languages, each character is part of an array of characters, like an array of stamps laid out on an old printing press. If you need to change a letter in a word, you simply take that letter out and replace it with a new one. In Python, you can't do that. You have to create a new string that copies the first one and replaces just the one letter you want to change.
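Here is a minimal sketch of what that looks like in practice. The try/except block is only there so the error doesn't stop the script, and the replacement letter is chosen just for the example.

```python
x = 'Hello Python!'

print(x[0])  # H (reading a character by index is fine)

try:
    x[0] = 'J'   # assigning to an index is not allowed
except TypeError as err:
    print('TypeError:', err)

# Build a new string instead and give it its own name.
y = 'J' + x[1:]
print(y)  # Jello Python!
print(x)  # Hello Python!
```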

Lists

Python lists are somewhat unique in the programming world. Those who have been programming for a while might recognize them as arrays, and they more or less are. What makes lists unique is that a single list can hold any mix of the data types we've covered here. Here's an example:

x = [1, 2, 3, 'Jon', 'Snow', True]

If you were to print that out on your screen, you should see something like [1, 2, 3, 'Jon', 'Snow', True] which isn't any different than what you typed in to begin with.

Let's take a quick break for a second. If you're not a programmer, think for a second about counting to ten. Chances are, you started with the number one. Computers don't do this. They start with zero. This becomes very important when trying to work with indexes in lists or anything else that stores information in a series within a computer. Each of the values that we listed has an index value. This means that we can call a specific value from the list using the variable followed by square brackets and the number associated with that element. In the example above, there are six different elements. These elements are numbered from 0 to 5.

[Image: the elements of the list x with their index numbers, 0 through 5]

So, if we type print(x[0]), we should get 1. If we type print(x[4]), we should get Snow. Because every element has a position, the list can also be iterated through. We will cover that in a later tutorial. The way to call a particular element in a list is variable[element number]. The variable in our case was x and we could choose any number from 0 through 5.

There is also what is called slicing. Slicing is reading a section of the list using the index of the first element you want and the index one past the last element you want. What does that mean? Let's say we want to pull out the strings Jon and Snow. Counting from zero, 'Jon' is index number 3, and 'Snow' is index number 4. So, we need to write a print call that pulls 'Jon' and 'Snow' from our variable x. This will look like print(x[3:5]). Notice that this looks like it would include the boolean at the end of the list, but the element at the last number is not included in our view. The first number is inclusive while the second number is exclusive.
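The sketch below runs that slice along with a few variations on the same list. Leaving out the start or the end of a slice, and using a negative index to count from the end, go slightly beyond what was described above, so treat them as a preview.

```python
x = [1, 2, 3, 'Jon', 'Snow', True]

names = x[3:5]  # index 3 is included, index 5 is not
print(names)    # ['Jon', 'Snow']

print(x[:3])    # [1, 2, 3] (a missing start means "from the beginning")
print(x[3:])    # ['Jon', 'Snow', True] (a missing end means "to the end")
print(x[-1])    # True (negative indexes count from the end)
```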

Lists are also mutable. They can be changed. We can add to the existing list with the append() method or remove from the list with the pop() method. Using our existing example, we can add the numbers 4, 5, and 6 by calling x.append() three different times with each number in the parentheses.

x.append(4)

x.append(5)

x.append(6)

If we print out our results again, we will see 4, 5, and 6 have been added to the end of our list. There are ways to delete individual elements from lists as well. The first is the pop() method. It removes the last element in the list (last in, first out). If you type x.pop(), the number 6 will show up in the console and your new list will stop at 5. The other is the del statement. It's time to say goodbye to Mr. Snow and remove him from the list. We can do this one element at a time or by slicing.

WARNING: If you remove one element at a time, you will reset the numbering in the list. Everything to the right of that deleted element will shift one to the left.

Let's go ahead and slice this time. We will be seeing a lot more slicing in the future, and this might be good practice. Type out del x[3:5] and then print(x). Your new list should be [1, 2, 3, True, 4, 5]. Not too bad. Let's go ahead and get rid of that boolean value too. This has an index of 3. So, the code should look like del x[3]. Now your list should look like the numbers from 1 through 5.
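For reference, the whole sequence of list changes from this section can be run in one go, as in this short sketch. The variable last is only there to hold the value that pop() hands back.

```python
x = [1, 2, 3, 'Jon', 'Snow', True]

x.append(4)
x.append(5)
x.append(6)
print(x)      # [1, 2, 3, 'Jon', 'Snow', True, 4, 5, 6]

last = x.pop()   # removes and returns the last element
print(last)   # 6

del x[3:5]       # removes 'Jon' and 'Snow' in one slice
print(x)      # [1, 2, 3, True, 4, 5]

del x[3]         # True has shifted left to index 3
print(x)      # [1, 2, 3, 4, 5]
```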

There is far more to lists than this, but we only needed some of the foundational pieces first. The next section covers tuples, another data type similar to lists, in a bit more depth.

Tuples

Probably the best way to think about tuples is as a list, but different. They can contain the same things as any list, but they are immutable. Let's cover the syntax first. Instead of square brackets, a tuple is written with either parentheses or nothing at all around its values. As always, let's look at some examples.

x = 1, 2, 3, 4, 5

This becomes a tuple. If we type print(type(x)) the output shown is <class 'tuple'>. Also, if you print the variable out, you will notice that the output includes parentheses for you. Parentheses are the preferred way of writing tuples. So, that's what we'll do from here on. All values are separated with a comma. A tuple can also contain lists. Let's see another example.

x = [1, 2, 3]

y = [4, 5, 6]

t = (x, y)

print(t)

print(type(t))

Output:

([1, 2, 3], [4, 5, 6])

<class 'tuple'>

So, we seem to have a tuple of lists. Each element in the tuple is indexed starting with zero. Just like in lists, zero is always the first number in the sequence. If we ask for the first element in t by passing its index inside square brackets, we receive the first list.

print(t[0])

Output:

[1, 2, 3]

We can even take this a step further and find the value at index 1 by passing it into a second set of brackets like this:

print(t[0][1])

Output:

2

It acts a little like a coordinate system, and shows a lot of similarity to a two-dimensional array. You can also pass the different values of the tuple to variables. Keep in mind how many items are in your tuple.

a, b = t

print(a)

print(b)

Output:

[1, 2, 3]

[4, 5, 6]

Each variable on the left now holds one of the individual components of the tuple. There's one catch to this though. The value of a is the same as the value of x, and the value of b is the same as the value of y. Does that mean you have two different sets of data? Actually, the answer is no. Let's see what happens when we change a value in the list a to something else.

a[0] = 'sorry'

print(a)

print(x)

Output:

['sorry', 2, 3]

['sorry', 2, 3]

Uh oh... We seem to have changed the value of x without meaning to do it. That's because a simply points to the same list we assigned to x in the first place. Remember, we built the tuple from x and y. Just because we passed those values to a and b respectively doesn't mean we've relieved x and y of their purpose. So, how do we avoid that? Well, we will cover that in a later tutorial that involves lists and tuples much more heavily.
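As a small preview, and not the full treatment promised for later, this sketch uses the is operator to confirm that a and x are one and the same list. It then shows one common option for getting an independent list, which is to copy it with list(). The name c and the value 'changed' are made up for the example.

```python
x = [1, 2, 3]
y = [4, 5, 6]
t = (x, y)
a, b = t

print(a is x)   # True: a and x are the same list object

a[0] = 'sorry'
print(x)        # ['sorry', 2, 3]

# One possible way to get an independent list: copy it first.
c = list(y)
c[0] = 'changed'
print(y)        # [4, 5, 6]
print(c)        # ['changed', 5, 6]
```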

There is one more pitfall of tuples that we will cover here. Much like in the rest of the programming world, typos happen. You will probably notice a few scattered throughout here. As has already been stated, a tuple can be created simply by separating values with commas. If you initialize a variable with a desired value and accidentally place a comma at the end of it, you will have a tuple.

c = 'Hello!',

print(type(c))

Output:

<class 'tuple'>

This could cause some problems in larger programs. If something isn't quite looking right, check for stray or even missing punctuation. Python has a really good error handling system, but you can't always see the error it's trying to point out. Even worse, sometimes it doesn't notice the actual error and sends you looking at other things that are just fine.
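To make the stray-comma problem easy to spot, the sketch below puts the three spellings side by side and also confirms that a tuple can't be changed in place. The variables d and e are added just for comparison.

```python
c = 'Hello!',    # trailing comma: a one-element tuple
d = 'Hello!'     # no comma: a plain string
e = ('Hello!')   # parentheses alone do not make a tuple

print(type(c))   # <class 'tuple'>
print(type(d))   # <class 'str'>
print(type(e))   # <class 'str'>
print(len(c))    # 1

try:
    c[0] = 'Goodbye!'   # tuples are immutable
except TypeError as err:
    print('TypeError:', err)
```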

Sets

Sets are an unordered collection. They are usually used to eliminate duplicates. Let's say you had an inventory of cars on a lot and you simply wanted to get a list of the manufacturers. There could be twenty cars from the same manufacturer mixed in with dozens of cars from other manufacturers. A set acts like DISTINCT in SQL and keeps only the values that are unique. Example time!

cars = {'Ford', 'Chrysler', 'Chevrolet', 'Ford', 'Dodge', 'Dodge', 'Chevrolet', 'Ford', 'Mazda'}

print(cars)

Output:

{'Chevrolet', 'Chrysler', 'Ford', 'Dodge', 'Mazda'}

Notice, all those Fords just disappeared. We can also search for a particular item in a set by simply asking.

print('Dodge' in cars)

Output:

True

or

print('Lincoln' in cars)

Output:

False

Keep in mind that these sets are case sensitive. If we were to pass 'ford' into the query, we would receive False. We can also compare two sets to each other. Let's set up another variable with car manufacturers. We'll skip the inevitable dropping of duplicates on this one.

fcars = {'Mitsubishi', 'Honda', 'Mazda', 'Toyota'}

To get the manufacturers in cars that aren't in fcars, we subtract fcars from cars.

print(cars)

print(fcars)

print(cars - fcars)

Output:

{'Chevrolet', 'Chrysler', 'Ford', 'Dodge', 'Mazda'}

{'Mitsubishi', 'Honda', 'Mazda', 'Toyota'}

{'Chrysler', 'Ford', 'Dodge', 'Chevrolet'}

Notice how the 'Mazda' value disappeared from the output. There is a way to put both sets together by using the | operator, which reads as 'or'.

print(cars | fcars)

Output:

{'Chrysler', 'Ford', 'Mitsubishi', 'Dodge', 'Honda', 'Mazda', 'Toyota', 'Chevrolet'}

You can see both sets have been combined and the extra 'Mazda' was dropped. Also, you can see how the order of the sets keeps changing. The values can come out in a different order each time you run the program. If you are following along with this series, you may see a different order from what's on this page. That's fine. This is an unordered collection, as I stated before. Now let's look at only the manufacturers that are in both sets. This is done with the & operator, which reads as 'and'.

print(cars & fcars)

Output:

{'Mazda'}

The last comparison to make is the set of manufacturers that are in either set but do not appear in both. This is done with the ^ operator.

print(cars ^ fcars)

Output:

{'Chevrolet', 'Chrysler', 'Mitsubishi', 'Ford', 'Dodge', 'Toyota', 'Honda'}
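Each of the four operators also has a named method that does the same job, which some people find easier to read. The sketch below uses a shorter, made-up inventory list to show duplicates being dropped by set(), and wraps the results in sorted() only so the printed order is predictable.

```python
# Hypothetical inventory list with repeated manufacturers.
inventory = ['Ford', 'Dodge', 'Ford', 'Mazda', 'Dodge', 'Ford']
cars = set(inventory)   # duplicates are dropped
fcars = {'Honda', 'Mazda', 'Toyota'}

print(sorted(cars))                              # ['Dodge', 'Ford', 'Mazda']
print(sorted(cars.difference(fcars)))            # same as cars - fcars
print(sorted(cars.union(fcars)))                 # same as cars | fcars
print(sorted(cars.intersection(fcars)))          # same as cars & fcars
print(sorted(cars.symmetric_difference(fcars)))  # same as cars ^ fcars
```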

There is a little more with sets, but we need to cover loops before we dive into that. We will come back to that and the other subjects after a little more groundwork has been completed.

Dictionaries

We should all know what a non-programming dictionary is. If you look up a word, you'll find its definition. The word and definition are tied together. Python doesn't act too differently. Dictionaries are a mutable collection of keys matched with values. The dictionary itself can be changed, but the keys must be immutable: they can be numbers, strings, or tuples. Tuples are a special case. A tuple can only be used as a key if it contains only immutable objects. Using tuples as keys isn't highly recommended, but it happens more often than you might think, for example when cleaning data from a database for analysis. The regular syntax for a dictionary is similar to a set with a new addition: each key is followed by a colon and its value.

a = {'Bob' : 77566, 'Fred' : 98311, 'Jackie' : 10523, 'Jane' : 54865}

That sets up our dictionary. Now, let's see what it can do. We'll start by calling a key and getting the value associated with it.

print(a['Bob'])

Output:

77566

Perhaps we should add another name.

a['Kyle'] = 65892

print(a)

Output:

{'Bob': 77566, 'Fred': 98311, 'Jackie': 10523, 'Jane': 54865, 'Kyle': 65892}

What if you only wanted the keys? We simply use the keys() method. Here, we should receive a list of names with no numbers. Another method in line with this is the values() method. It returns the values, in this case numbers, without the keys.

print(list(a.keys()))

print(list(a.values()))

Output:

['Bob', 'Fred', 'Jackie', 'Jane', 'Kyle']

[77566, 98311, 10523, 54865, 65892]

One last thing for dictionaries before we wrap this up. There is a constructor for dictionaries that allows you to create a dictionary from a list of key-value pairs. Here's an example using the same data we have been using.

b = dict([('Bob', 77566), ('Fred', 98311), ('Kyle', 65892)])

print(b)

Output:

{'Bob': 77566, 'Fred': 98311, 'Kyle': 65892}
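Two points from this section can be checked with a short sketch: a dictionary can be changed after it is created, and its keys must be immutable. The offices dictionary and its contents are hypothetical, and the try/except block is only there so the failed assignment doesn't stop the script.

```python
a = {'Bob': 77566, 'Fred': 98311}

a['Bob'] = 11111     # change an existing value
del a['Fred']        # remove a key and its value
print(a)             # {'Bob': 11111}
print('Fred' in a)   # False

# A tuple of immutable values works as a key (hypothetical data).
offices = {('Bob', 'Smith'): 101}
print(offices[('Bob', 'Smith')])   # 101

try:
    offices[['Jane', 'Doe']] = 102   # a list is mutable, so this fails
except TypeError as err:
    print('TypeError:', err)
```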

Summary

There was quite a bit covered here. There is still more to learn about each of these data types, but we only need the foundational pieces to start. Right now, you should have enough to start practicing with the different data types. See if you can find any other methods that work with each of them.