Lists are the primary 'sequence object' in Python; a list lets you store a practically unlimited number of values under a single identifier. This module introduces the basics of using lists in Python, along with a number of problems that can be solved easily using list-specific functions and methods. Lastly, the module introduces two additional data types: tuples and sets.
The programs that we have written so far have been designed to operate using textual data (strings), logical data (booleans) and numeric data (integers and floating-point numbers). These data types can represent and store one piece of data at a time. For example:
x = 5 # integer
y = 5.0 # floating-point number
z = 'hello' # string
q = True # boolean
However, there are times when we need to write a program that keeps track of many values at the same time. For example, if we wanted to write a program to keep track of final exam scores for a group of 50 students in a class we would need to create 50 different variables, like this:
test_01 = 95.45
test_02 = 89.35
test_03 = 76.43
...
test_50 = 97.11
One way to solve this problem is to use a "sequence" data type, which has the ability to hold multiple values within the same identifier. In many programming languages we call these "arrays," but in Python we refer to this data type as a list.
An easy way to think about a list in Python is to think about how a book works in the real world. A book is a single object (e.g. "Harry Potter and the Chamber of Secrets") that can contain any number of sub-items (i.e. pages).
You can create a list in Python by using square brackets, which look a little like the covers of a book. For example:
my_book = ["Page 1", "Page 2", "Page 3"]
The above code will create a new list in Python that holds three strings ("Page 1", "Page 2" and "Page 3") in that order.
Lists can contain any data type that we have covered so far - for example:
my_list = [100, 200, 300]
Lists can also mix data types.
my_list = ['Craig', 5.0, True, 67]
You can print the value of a list using the print function. This will print all of the values stored in the list in the order in which they are represented:
my_list = ["a", "b", "c"]
print(my_list)
Just like with a string, you can use the repetition operation (*) to ask Python to repeat a list. For example:
my_list = [1, 2] * 3
print(my_list) # >> [1, 2, 1, 2, 1, 2]
You can use the concatenation operation (+) to ask Python to combine lists, much like how you would combine two strings. For example:
my_list = [1, 2] + [99, 100]
print(my_list) # >> [1, 2, 99, 100]
In a book, you can reference a page by its page number, and in a list you can reference an element using its index. Indexes are integer values that represent the position of an item within a list. Indexes always start at zero (the same way a string index begins at zero). For example:
my_list = ["Apple", "Pear", "Peach"]
print(my_list[0]) # >> Apple
You will raise an exception (an IndexError) if you attempt to access an element outside the range of a list. For example:
my_list = ["Apple", "Pear", "Peach"]
print(my_list[4]) # IndexError: this index doesn't exist!
Lists are "mutable" data types, which means that they can be changed once they have been created (unlike strings). If you know the index of an item you wish to change, you can simply use the assignment operator to update the value of the item at that position in the list. For example:
my_list = [1, 2, 3]
print(my_list) # >> [1, 2, 3]
my_list[0] = 99
print(my_list) # >> [99, 2, 3]
Sample program: This program demonstrates list creation, repetition, concatenation and accessing individual elements within a list.
When working with lists you will often need to access many or all of the elements within a list to solve a certain problem. For example, imagine that you had the following list of price values:
prices = [1.10, 0.99, 5.75]
If you wanted to compute 7% sales tax on each price you would need to access each item in the list and multiply it by 0.07. For a list with three elements this is pretty easy:
tax_0 = prices[0] * 0.07
tax_1 = prices[1] * 0.07
tax_2 = prices[2] * 0.07
However, as your lists become larger and larger this technique will become unmanageable (imagine you had 1,000 prices in the list!). The solution to this problem is to use a repetition structure to iterate over the contents of the list.
The simplest way to iterate over a list is to use a for loop. When you do this, the target variable of your loop takes on the value of each element of the list in order.
Sample Program: The program below demonstrates how to quickly iterate over a list using a for loop.
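A minimal sketch of such a loop, reusing the prices list and the 7% tax rate from the discussion above. The names price, tax and total_tax are illustrative choices, not part of the original sample.

```python
prices = [1.10, 0.99, 5.75]
total_tax = 0.0

# The target variable "price" takes on each element of the list in order.
for price in prices:
    tax = price * 0.07
    total_tax += tax
    print("Tax on", price, "is", round(tax, 2))

print("Total tax:", round(total_tax, 2))
```

The same loop works unchanged whether the list holds three prices or a thousand.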
Programming Challenge: Given the list below, write a program that counts the number of A's (scores between 90 and 100). Extension: Count the number of B's, C's, D's and F's.
As you can see, a for loop is a convenient way to sequentially iterate through a list. The target variable in a for loop assumes the value of the current item in the list as you iterate.
However, the target variable isn't very helpful if you want to change the value of an item in a list: assigning a new value to the target variable only changes the variable, not the item stored in the list. For example:
Sample Program: The list below remains unchanged because we are not modifying the values stored in the list.
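As a small sketch of this behavior (the scores list and the variable name score are made up for illustration):

```python
scores = [80, 90, 70]

for score in scores:
    # This only gives the loop variable a new value;
    # the item stored in the list is left alone.
    score = score + 5

print(scores)  # >> [80, 90, 70]
```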
In order to change a list item you need to know the index of the item you wish to change. You can then use that index value to change an item at a given position in the list. For example:
Sample Program: The list below does change because we are using index notation to change a value at a particular position in the list.
There are two main techniques for iterating over the index values of a list:
- Setting up a counter variable that starts at the first position of the list and continually updating the variable as you move to the next position in the list
- Using the range function to create a custom range that represents the size of your list
If you set up an accumulator variable outside of your loop, you can use it to keep track of where you are in a list. For example:
Sample Program: Using an accumulator variable to keep track of our current position in a list.
To improve upon this example, we can use the len function to determine the size of our list (rather than just hard coding our loop to end after 3 iterations). The len function takes a list as an argument and returns the number of elements in the list as an integer. Example:
Sample Program: Using the len function to count the number of elements in a list and then using that result to control our loop.
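One way this could look, sketched with a while loop; the scores list and the counter name position are assumptions made for the example:

```python
scores = [80, 90, 70]
position = 0  # counter that tracks the current index

# len(scores) is 3, so the loop visits indexes 0, 1 and 2.
while position < len(scores):
    scores[position] = scores[position] + 5
    position += 1

print(scores)  # >> [85, 95, 75]
```

Because the loop condition uses len, the same code keeps working if items are added to or removed from the list before the loop starts.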
You can also use the range function to construct a custom range that represents all of the indexes in a list. This technique can be a bit cleaner to implement since you don't need to worry about setting up and maintaining a counter variable:
Sample Program: Use the range function to create a custom range that represents all valid positions in a list.
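A short sketch of the range-based approach; the daily_sales list and the doubling step are invented for illustration:

```python
daily_sales = [100, 200, 300]

# range(len(daily_sales)) produces the indexes 0, 1 and 2.
for i in range(len(daily_sales)):
    daily_sales[i] = daily_sales[i] * 2

print(daily_sales)  # >> [200, 400, 600]
```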
Programming Challenge: Given the following list of student test scores, apply a class "curve" to each score. The class curve is as follows:
You can create an empty list with no elements using the following syntax:
mylist = []
With an empty list, you cannot place content into the list using index notation since there are no "slots" available to be used in the list. You can, however, append values to the list using the concatenation operator, like this:
mylist = []
mylist += ["hello"]
mylist += ["world"]
print(mylist) # >> ['hello', 'world']
Since you cannot access an element outside of the range of a list, it is sometimes necessary to set up a correctly sized list before you begin working with it. For example:
# create a list of 7 zeros
daily_sales = [0] * 7
Programming Challenge: Write a program that asks the user for daily sales figures for a full week (Sunday – Saturday). Store these values in a list and print them out at the end of your program. Here's a sample running of your program:
Enter sales for Day #1: 100
Enter sales for Day #2: 200
Enter sales for Day #3: 300
Enter sales for Day #4: 400
Enter sales for Day #5: 500
Enter sales for Day #6: 600
Enter sales for Day #7: 700
Sales for the week: [100,200,300,400,500,600,700]
Sometimes you need to extract multiple items from a list. Python's slice syntax makes it easy for you to "slice" out a portion of a list. The syntax for list slicing is identical to the syntax for string slicing. To slice a list you use a series of "slice indexes" to tell Python which elements you want to extract. Example:
new_list = old_list[start:end]
Python will copy the elements from the list on the right side of the assignment operator based on the start and end indexes provided. It will then return the result as a new list. Note that slice indexes work just like the range function: you will grab items up until the end index, but you will not grab the item at the end index itself. Here's an example:
list_1 = ['zero', 'one', 'two', 'three', 'four', 'five']
list_2 = list_1[1:3]
print(list_1) # >> ['zero', 'one', 'two', 'three', 'four', 'five']
print(list_2) # >> ['one', 'two']
If you omit the start index in a slice operation, Python will start at the first element of the list. If you omit the end index, Python will go through the last element of the list. If you supply a third index, Python will treat it as a step value. This works the same as the step value you would pass to the range function.
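The omitted-index and step rules can be sketched as follows; the numbers list and the variable names are made up for this example:

```python
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

first_three = numbers[:3]     # start omitted: begins at the first element
from_seven = numbers[7:]      # end omitted: runs through the last element
evens = numbers[::2]          # step of 2 over the whole list
middle_odds = numbers[1:8:2]  # start, end and step together

print(first_three)  # >> [0, 1, 2]
print(from_seven)   # >> [7, 8, 9]
print(evens)        # >> [0, 2, 4, 6, 8]
print(middle_odds)  # >> [1, 3, 5, 7]
```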
Programming Challenge: Given the following list, write a program that does the following:
You can easily test to see if a particular item is in a list by using the in operator. Here's an example:
my_list = ['pie', 'cake', 'pizza']
if 'cake' in my_list:
    print("I found cake!")
else:
    print("No cake found.")
The in operator lets you search for any item in a list. It will return a Boolean value that indicates whether the item exists somewhere in the list.
Programming Challenge: Given the following lists, write a program that lets the user type in the name of a product. If the product name exists in our inventory, you should print out that it is in our inventory. Otherwise you should print out that the product is not found. Ensure that your program is case insensitive (e.g. searches for "Apple" or "apple" or "APPLE" should all succeed).
Programming Challenge: Given these two lists, write a program that finds all elements that exist in both lists (e.g. the integer 2 exists in both lists). Store your results in a list and print it out to the user. The expected answer is:
[1, 2, 3]
You have already seen a few ways in which you can add items to lists:
- Repeating the list using the * operator
- Concatenating lists using the + operator
Another way to add items to a list is to use the append method. The append method is a function that is built into the list datatype. It allows you to add items to the end of a list. Example:
mylist = ['Christine', 'Jasmine', 'Renee']
mylist.append('Kate')
print(mylist) # >> ['Christine', 'Jasmine', 'Renee', 'Kate']
Programming Challenge: Write a program that continually prompts a user to enter a series of first names. The user can elect to stop entering names by supplying the string "end". Store these first names in a list and print them out at the end of your program. Extension: Prevent the user from entering duplicate names (hint: use the in operator).
You can remove an item from a list by using the remove method. Here's an example:
prices = [3.99, 2.99, 1.99]
prices.remove(2.99)
print(prices) # >> [3.99, 1.99]
Note that you will raise an exception (a ValueError) if you try to remove an item that is not in the list. To avoid this, test whether the item is in the list first using the in operator, or use a try / except block to catch the error.
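Both safeguards can be sketched like this; the value 5.00 is an arbitrary price chosen because it is not in the list:

```python
prices = [3.99, 2.99, 1.99]

# Option 1: test for membership before removing.
if 5.00 in prices:
    prices.remove(5.00)
else:
    print("5.00 is not in the list")

# Option 2: attempt the removal and catch the ValueError.
try:
    prices.remove(5.00)
except ValueError:
    print("Could not remove 5.00")

print(prices)  # >> [3.99, 2.99, 1.99]
```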
Programming Challenge: Continually ask the user for a product name. Next, see if that product name is included in the inventory list below. If it is, remove the product from the list and then print the current list of products to the user. If the product is not on the list you should alert the user that we do not currently carry the product in question. You can end the program when the list of products is exhausted or when the user types the string "end".
Sometimes you want to delete an item from a particular position in a list. You can do this using the del keyword. For example, say you wanted to delete the item in position #0 in the following list:
Sample Program: Using the del keyword to remove an item from a particular position in a list.
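A minimal sketch, using a made-up list of letters:

```python
letters = ['a', 'b', 'c']

del letters[0]   # delete the item at position 0

print(letters)   # >> ['b', 'c']
```

After the deletion the remaining items shift down, so 'b' is now at index 0.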
You can have Python sort items in a list using the sort method. Here's an example:
Sample Program: This program sorts a list in ascending alphabetical order.
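As an illustration (the names are borrowed from the append example earlier in this module):

```python
names = ['Renee', 'Christine', 'Kate', 'Jasmine']

names.sort()   # sorts the list in place; it does not return a new list

print(names)   # >> ['Christine', 'Jasmine', 'Kate', 'Renee']
```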
Python can also reverse the items in a list using the reverse method. This method will swap the elements in the list from beginning to end (e.g. [1,2,3] becomes [3,2,1]). Note that this method does not sort the list at all; it simply reverses the order of the values in the list. Here's an example:
Sample Program: This program reverses a list.
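A short sketch with an unsorted list, chosen to show that no sorting takes place:

```python
numbers = [3, 1, 2]

numbers.reverse()   # reverses the list in place

print(numbers)      # >> [2, 1, 3]
```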
You can use the index method to ask Python to tell you the index of where an item can be found in a list. The index method takes one argument, an element to search for, and returns an integer value of where that element can first be found. Caution: The index method will raise an exception (a ValueError) if it cannot find the item in the list. Here's an example:
Sample Program: Demonstration of how to find the location of an item in a list using the index method.
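One possible sketch, guarding the call with the in operator so that no exception is raised; the products list and the variable name location are assumptions for this example:

```python
products = ['peanut butter', 'jelly', 'bread']

if 'jelly' in products:
    location = products.index('jelly')
    print('jelly is at index', location)   # >> jelly is at index 1
else:
    print('jelly was not found')
```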
Programming Challenge: The lists below are organized in such a way that the item at position 0 in the first list matches with the item at position 0 in the second list. With this in mind, write a product price lookup program that works as follows:
Enter a product: peanut butter
This product costs 3.99
Python has two built-in functions that let you get the lowest and highest values in a list. They are called min and max. Here's an example:
Sample Program: Demonstration of how to find the largest and smallest items in a list.
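A minimal sketch using the prices list from the remove example; the names lowest and highest are illustrative:

```python
prices = [3.99, 2.99, 1.99]

lowest = min(prices)
highest = max(prices)

print(lowest)   # >> 1.99
print(highest)  # >> 3.99
```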
To add a type annotation for lists, you can bring in the List type by importing it from the typing module (in Python 3.9 and later, the built-in list can also be used directly in annotations):
from typing import List
After you've imported the type, List, you can use it like you would with any other type annotation:
planets: List = ['mercury', 'venus', 'earth', 'mars']
It's common to store the same types of values within a list. For example, you may want to store a bunch of floats to represent the values coming from an ambient light sensor on your phone, or you may want a collection of strings representing all of the verbs in a text. In fact, types analogous to lists in some other languages restrict those types so that they can only contain one kind of value.
If your intent is to store only one type of value in your list, you may also want to specify that in your type annotation. This can be done by placing square brackets ([ and ], like indexing) after List, and inserting the type name between the brackets. Here's a list that stores only integers:
nums: List[int] = [1, 2, 3, 4]
PyCharm's type checker will warn you if you have an item that doesn't conform to your list type annotation:
nums: List[int] = ['a', 'b']
# this will show as a type error
Mixed types should also trigger an error above, but it may depend on which type checker you're using. At the time of this writing, PyCharm's type checker does not correctly identify this as an issue.
⚠️ If you're using a list as a parameter, do not give it a default value!
Here's an example... imagine that we have a function that accepts a list as an argument, but we default the argument to an empty list in case the function is called without a value passed in. What do you think will be printed out?
Sample Program: We have a function that uses a list, a mutable type, as a default value for a parameter. What do you think the output will be?
What 🤯? Why does this happen? Python doesn't actually reset the default value on each call of the function. It's set just once, and when it's changed, it accumulates those changes. So, DO NOT USE A LIST AS A DEFAULT VALUE FOR A FUNCTION PARAMETER.
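The behavior described above can be reproduced with a small sketch. The function names add_item and add_item_safely are hypothetical, and defaulting to None is one common workaround rather than the only option:

```python
def add_item(item, items=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits "items" shares the same list.
    items.append(item)
    return items

first = add_item('a')
second = add_item('b')
print(second)  # >> ['a', 'b']  (not ['b'])

def add_item_safely(item, items=None):
    # Default to None and build a fresh list inside the function instead.
    if items is None:
        items = []
    items.append(item)
    return items

print(add_item_safely('a'))  # >> ['a']
print(add_item_safely('b'))  # >> ['b']
```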
A tuple is a sequence type (just like lists and strings). That is, it's a compound data type (a type that is made up of other values), and it's composed of an ordered sequence of values. Like lists, they're a collection of any kind of values (the values that make up a tuple can be any type). With that said, though, there are a few major differences between tuples and other sequence types:
To create a tuple, you simply need comma separated values. This is a tuple:
t = 1, 2, 3
print(type(t))
# prints out type, tuple!
The commas used in a function definition's parameters do not create a tuple (they are there only to specify separate parameters). If there's ambiguity in the intent of using commas, then a tuple should be surrounded with parentheses. For example, if a tuple were used as an argument in a function call, then it has to be within parentheses so that the tuple is taken as a single argument rather than each element being used as a separate argument:
# this prints out 3 integers, 1, 2 and 3.
print(1, 2, 3)
# but... what if we wanted to print out the tuple: 1, 2, 3
# surround it with parentheses!
print((1, 2, 3))
A tuple can also be created from other sequence types... that is, you can convert a list or a string into a tuple. Just like type conversion we've seen before, use a function named after the type we would like to convert to. In this case: tuple.
number_list = [1, 2, 3]
t = tuple(number_list) # voila, new tuple!
print(t) # prints out (1, 2, 3)
print(type(t)) # prints out <class 'tuple'>
Because tuples are sequences, operators and functions that you've seen used on strings and lists can also be used with tuples. The operations and functions include:
Sample Program: A tuple is a sequence type, so common sequence operators and functions will work on tuples too! Note that the online editor will not surround tuples with parentheses when printed out.
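A sketch of a few sequence operations applied to a tuple. The selection below (indexing, slicing, len, in, concatenation, repetition and a for loop) is an assumption about which operations are meant:

```python
t = (1, 2, 3)

print(t[0])        # indexing: 1
print(t[1:])       # slicing: (2, 3)
print(len(t))      # length: 3
print(2 in t)      # membership: True
print(t + (4, 5))  # concatenation: (1, 2, 3, 4, 5)
print(t * 2)       # repetition: (1, 2, 3, 1, 2, 3)

total = 0
for value in t:    # iteration
    total += value
print(total)       # 6
```

Each operation that produces a tuple returns a new one; t itself is never modified.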
Finally, if you'd like to specify that a variable is a tuple, you'll have to bring in the typing module so that you can use Tuple as a type annotation. Just like using List, you can specify the types within your tuple. There are a couple of details that make type annotations for tuples a little tricky:
from typing import Tuple
t: Tuple[int, int, int] = (1, 2, 3)
From this, we can see that tuples are clearly syntactically different from other sequence types -- lists and strings.
As mentioned above, tuples are immutable. This makes them similar to strings: they cannot be changed. Another way of thinking about a tuple is that it's like an immutable list.
This means that you can't:
⚠️ You will get a runtime error if you try to change a tuple. All of the following will cause an exception:
numbers = 1, 2, 3
# AttributeError (tuples don't have an append method)
numbers.append(4)
# TypeError (items in tuples can't be deleted)
del numbers[0]
# an item in a tuple can't be reassigned
numbers[0] = 'can i change this?'
There's one catch, though. If an element in a tuple is mutable, then that element can be changed... but the tuple itself cannot. For example, if one of the elements in a tuple is a list, that list can be modified... but it cannot be replaced entirely:
t = ([1, 2, 3], 4)
t[0][0] = 'omg, changed!' # ok!
# but, we still cannot reassign the first element entirely
# t[0] = 'will not work'
When deciding between using a tuple and a list, intention should also be taken into account:
These are just semantics - the language has no way of enforcing this usage (though other languages do - for example, in Java, the data structures that are analogous to Python lists must contain elements all of the same type).
Comparisons
Based on the material above, we can see that tuples, while similar to sequence types through shared operations and functions, are actually quite different from lists and strings:
Programming Challenge: Now that you know about some basic tuple syntax and behavior, write a program that uses tuples:
t
t
repeated twice, call the variable repeated_t
repeated_t
added to the tuple, ('penultimate', 'ultimate')
; name the variable added_t
added_t
by retrieving a "sub" tuple (hint: slice!)added_t
added_t
added_t
Use tuple operators or functions to do this! See the expected output below (again, note that the online editor may have slightly different output, such as not wrapping tuples with parentheses and using type instead of class when printing out the type of a value).
# if your tuple contained the values 1, 2, and 3...
(2, 3, 'penultimate', 'ultimate')
1
8
<class 'tuple'>
Tuples only support two methods:
count(val) - returns the number of times val occurs in the tuple
index(val) - returns the index of the first occurrence of val within the tuple (ValueError occurs if val is not in the tuple)
Sequence types can be unpacked. That is, if you have multiple comma-separated variable names on the left-hand side and a sequence type on the right-hand side, then each variable name will be bound to the element that matches positionally in the sequence.
city, state = ['brooklyn', 'ny']
print(city, state)
You can unpack any sequence, but due to semantics (tuples are meant to pack different values together that you'll later use individually), you'll often see unpacking used with tuples:
t = (7, 20, 1969)
m, d, y = t
We've actually seen tuple unpacking before! Remember multiple assignment 🤔?
x, y = 0, 0
Yup! That's a tuple on the right-hand side. Also, remember returning multiple values from a function? Also tuple unpacking:
def f():
    return 0, 0
x, y = f()
One other place that we'll see unpacking is within a for loop. If a list of tuples is iterated over, the loop variable can be unpacked into individual loop variables:
names = [('ang', 'alice'), ('benson', 'bob')]
for last, first in names:
    n = first + ' ' + last
    print(n)
Just like regular tuple unpacking, in order for this to work, the number of elements in each tuple within the object being iterated over must be the same... and must match the number of loop variables.
This comes in handy when you need to iterate over a list and get each element as well as its index. You'll have to use the built-in function enumerate to get a list-like object composed of tuples to do this. Each tuple contains the index and value of one element in the original list.
result = enumerate(['alice', 'bob', 'carol'])
# result is _like_ [(0, 'alice'), (1, 'bob'), (2, 'carol')]
When you loop over an enumerated list, you can unpack each tuple into separate loop variables: the index and the value.
Sample Program: This program will print out the index and value of every element in the list. It uses the built-in function enumerate to transform a list into an iterable object composed of tuples... which are unpacked in the for loop.
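A minimal sketch of that loop, reusing the names from the enumerate example above; the loop variable names index and name are illustrative:

```python
names = ['alice', 'bob', 'carol']

# Each item produced by enumerate is an (index, value) tuple,
# which is unpacked into two loop variables.
for index, name in enumerate(names):
    print(index, name)
```

Running this prints one line per element: 0 alice, then 1 bob, then 2 carol.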
Programming Challenge: Given the list provided, numbers, double every number in the list. The enumerate function and tuple unpacking must be used in the solution. The original list must be modified. Here's what the resulting output should be.
[2, 200, 2000]
Finally ... remember *args for a variable number of arguments? What does that actually do? Well, it takes all of the positional arguments passed in, and it puts them into a single parameter called args. The type of args is actually a tuple!
Sample Program: Let's look at what *args does in more detail...
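As a sketch, the hypothetical function total below collects its arguments with *args and then treats args like any other tuple:

```python
def total(*args):
    print(type(args))   # <class 'tuple'>
    print(args)         # all positional arguments, packed into one tuple
    result = 0
    for value in args:  # a tuple can be iterated like any other sequence
        result += value
    return result

print(total(1, 2, 3))   # args is (1, 2, 3); the sum is 6
print(total())          # args is the empty tuple (); the sum is 0
```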
A set is "an unordered collection of distinct hashable objects". Let's dissect this definition:
One last thing not apparent in the definition... sets are mutable; that is, they can be modified (elements can be added or removed).
So, in Python, sets are:
To create a set:
s1 = {1, 2, 3}
s2 = {1}
print(s1, s2) # {1, 2, 3} {1}
print(type(s1), type(s2)) # <class 'set'> <class 'set'>
Use the set function; pass in an iterable object, such as a list, string, or tuple, as the argument ... and each element will become a distinct element in the set
s1 = set([1, 2, 3])
s2 = set('abc')
print(s1, s2) # {1, 2, 3} {'a', 'b', 'c'}
To create an empty set, call the set function without any arguments... ⚠️ do not use two curly braces without elements as an empty set - {} actually represents a different empty data type (a dictionary)!
empty = set()
Regardless of how you create a set, your set will be composed of distinct elements - that is, there will be no duplicates in your set:
s1 = {1, 1, 1, 1, 1}
print(s1) # {1}
s2 = set("settttttttt!!!!!")
print(s2) # {'s', 'e', 't', '!'}
When working with sets (creating, adding elements, etc.) the elements that make up the set must be immutable. If you try to use a mutable type with a set, you'll get a runtime error ☹️:
uh_oh = {1, 2, []} # TypeError!
Python sets support the following methods / operators:
union, | - returns a new set composed of all elements in one set combined with the elements from another set (or sets)
intersection, & - returns a new set composed of all elements that are common between one set and another set (or sets)
difference, - - returns a new set composed of elements that are in one set but not in another
issubset / issuperset, <= / >= - determines if one set is either a subset or superset of another set
Note that these operations can be used by working with operators or by calling methods. Let's take a look at operators first. This demonstrates five basic set operators: | (union), & (intersection), - (difference), <= (subset), and >= (superset)...
s1 = {1, 2, 3, 4}
s2 = {3, 4, 5, 6}
print(s1 | s2) # union: {1, 2, 3, 4, 5, 6}
print(s1 & s2) # intersection: {3, 4}
print(s1 - s2) # difference: {1, 2}
print({1, 3} <= s1) # is subset: True
print({1, 2, 3, 100} >= s1) # is superset: False
Both the <= (subset) and >= (superset) operators also return True if both operands are equal. If you want a "proper" subset or superset (that is, one that excludes the case where the two sets are equal), use < and >.
The union, intersection and difference operators can be chained (note that union and intersection are commutative; the order of the operands does not matter):
print({1} | {2, 3} | {4} | {5} | {6}) # {1, 2, 3, 4, 5, 6}
print({2} & {1, 2, 3} & {2, 3, 4, 5} & {1, 2}) # {2}
print({1, 2, 3, 4} - {3, 4} - {2}) # {1}
All of these operators must have a set for each operand... otherwise, a runtime error will occur:
{1, 2, 3} | [4, 5] # TypeError!
Python also supports these operations as methods. Each operation has a corresponding method named after the operation, like .union or .intersection. When using these methods, a new set is returned. Additionally, these methods allow for iterable types other than sets to be used as the "other operand".
The following code shows that set operations can also be used by calling methods named after the operation they perform: .union, .intersection, .difference, .issubset, and .issuperset. Each method either creates a new set or, in the case of .issubset and .issuperset, returns a boolean. Neither the original set nor the argument is modified.
s1 = {1, 2, 3, 4}
s2 = {3, 4, 5, 6}
print(s1.union(s2)) # {1, 2, 3, 4, 5, 6}
print(s1.intersection(s2)) # {3, 4}
print(s1.difference(s2)) # {1, 2}
print(s1.issubset({1, 2, 3, 4, 5})) # True
# note that iterables can be passed in as well!
print(s1.union('abc')) # {1, 2, 3, 4, 'a', 'b', 'c'}
print(s1.issuperset([1, 2, 2, 2])) # True
As we saw above, sets have operators and methods that are different from those of the sequence types. However, because a set is a collection of elements, there are still some similarities: you can still retrieve the number of elements in a set (len), test for membership in a set (in, not in), and, lastly, iterate over a set (with for), just like lists, strings and tuples.
Sample Program: The program below demonstrates in, len and for loops used with sets.
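A minimal sketch with a made-up set of colors:

```python
colors = {'red', 'green', 'blue'}

print(len(colors))           # >> 3
print('red' in colors)       # >> True
print('pink' not in colors)  # >> True

# Sets are unordered, so the order of iteration is not guaranteed.
for color in colors:
    print(color)
```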
Even though the elements in a set are immutable, a set itself is mutable; it can be changed! Sets can be changed by using "augmented assignment operators" (much like incrementing with +=), and there are also several methods that can be called to change a set.
Augmented assignment operators
|=, &=, and -= can all be used to modify a set through union, intersection and difference. They all modify the set on the left-hand side of the augmented assignment operator used:
s = {1, 2, 3}
s |= {4, 5}
print(s) # {1, 2, 3, 4, 5}
s &= {3, 4, 5, 6, 7}
print(s) # {3, 4, 5}
s -= {3, 4}
print(s) # {5}
Methods
There are several methods that can be called on sets that add, remove or change elements.
.add(ele) - adds a new element, ele, to the set; does not return a value
.remove(ele) - removes element, ele, from the set; does not return a value... error (KeyError) if ele isn't in the set
.discard(ele) - removes element, ele, from the set; does not return a value... no error if ele isn't found
.pop() - removes an arbitrary element from the set and returns it... error (KeyError) if there are no elements in the set
.update(update_set) - updates the elements in the set using the elements in update_set (similar to union); does not return a value
Sample Program: The following code uses add and remove to modify a set.
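A short sketch of these methods in use; the pets set is invented for illustration, and discard is included alongside add and remove to show the difference in error behavior:

```python
pets = {'cat', 'dog'}

pets.add('fish')         # adds a new element
pets.add('cat')          # already present, so the set is unchanged
pets.remove('dog')       # would raise KeyError if 'dog' were missing
pets.discard('unicorn')  # missing element, but no error is raised

print(pets)              # the set now holds 'cat' and 'fish'
```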
Programming Challenge: Write a program that asks the user for 4 words. Then the program should continually ask the user for words that the user remembers; prompting the user to continue after taking in a remembered word. At the end, the program will list out all of the correctly remembered words without duplicates. See the example interaction below (text after a colon, :, is entered by the user).
Give me 4 words
Word 1, plz: foo
Word 2, plz: foo
Word 3, plz: bar
Word 4, plz: baz
How many words can you remember?
Give me a word you remember:baz
Do you remember any more words (Y to give more words): Y
Give me a word you remember:WAT????
Do you remember any more words (Y to give more words): Y
Give me a word you remember:foo
Do you remember any more words (Y to give more words): N
These are the words you remembered correctly:
* foo
* baz
If you want an immutable set, you can use the frozenset type. Immutable sets are created by calling the built-in function frozenset (you can pass in a set, or any other iterable, to create the new frozenset). Most regular set operators will work with a frozenset, but methods that mutate the set (like add, remove, etc.) are not available.
Again, frozensets support regular set operations:
s1 = frozenset({1, 2})
s2 = frozenset({2, 3})
print(s1 | s2) # frozenset({1, 2, 3})
But, you can't change them, so they don't have methods like add:
s = frozenset([1, 2])
s.add(3) # error!
We now know quite a few compound data types: strings, lists, tuples, sets, and even range objects. Why would you use a set over these other types? Use a set when you want to take advantage of its characteristics and behavior:
Type annotations can be added for a set. Just like lists and tuples, you'll have to bring in a type from the typing module... and you can specify the types contained within a set:
from typing import Set
aliens: Set[str] = {'zim', 'alf', 'jet', 'et'}
The definition of a set included the term hashable objects. What does that mean?
A hash is a function that transforms or maps a value into some other value:
potato
" and maps it to the value 123The fact that hashing produces the same output if both inputs are the same could be useful when comparing values... especially when comparing values that would be cumbersome to compare otherwise (think of comparing two tuples, each with 100 values vs comparing two numbers).
There's a built-in function in Python called hash... and it will return the hashed value, an integer, of the object passed into it (note that the resulting integer from hashing the same value may differ between program runs):
# the following print statements should produce
# two different integers...
print(hash("potato"))
print(hash(24))
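If two objects are equal, then their hashes are the same! (The reverse is not guaranteed: two different objects can occasionally end up with the same hash, so an equality check is still needed to be sure.)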
meal_1 = "breakfast"
meal_2 = "breakfast"
print(hash(meal_1) == hash(meal_2)) # True
print(meal_1 == meal_2) # True
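Objects in Python that do not change (immutable objects) can be hashed. Mutable built-in objects, such as lists, sets, and dictionaries, cannot be hashed (if an object could change after it was hashed, the hash would no longer reliably match its contents, so using the hash to compare it with something else wouldn't work!).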
# a list is mutable, so it can't be hashed
hash(['toast', 'jam']) # error! (TypeError: unhashable type: 'list')
Aaaaannnd... back to our definition of a set: an unordered collection of hashable objects. That just means that every element in a set must be immutable (otherwise, the element would not be hashable). Why is it important that sets have hashable elements? It allows for quicker look-up of elements in a set:
This is a very high-level explanation of how sets work, but it's adequate to show that it's faster than going through every element in the set, and checking to see if it's equal to the element you're trying to find.
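As a rough illustration of the idea only, here is a toy sketch of hash-based look-up. It is not how Python's set is actually implemented, and the names NUM_BUCKETS, buckets, add_item and contains are made up for this example. The hash picks one of a few "buckets", so a look-up only has to check one small bucket instead of every element:

```python
# Toy model of hash-based look-up. NUM_BUCKETS, add_item and contains are
# hypothetical names for illustration; Python's real set is more sophisticated.
NUM_BUCKETS = 8
buckets = [[] for _ in range(NUM_BUCKETS)]

def add_item(item):
    # the hash decides which bucket the item belongs in
    index = hash(item) % NUM_BUCKETS
    if item not in buckets[index]:
        buckets[index].append(item)

def contains(item):
    # only one bucket has to be searched, not the whole collection
    index = hash(item) % NUM_BUCKETS
    return item in buckets[index]

for word in ['toast', 'jam', 'eggs']:
    add_item(word)

print(contains('jam'))     # True
print(contains('waffle'))  # False
```

This also hints at why elements must be hashable: without a hash, there would be no way to decide which bucket to look in.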
Two operations that are commonly used to create new lists are map and filter.
Sample Program: Map values to a new list. In this example, we adjust a list of test scores so that they're all incremented by 5. This is implemented by using an accumulator, a normal for loop, and incrementing.
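A minimal sketch of that pattern might look like the following. The score values and the variable names scores and adjusted_scores are assumptions chosen for illustration:

```python
# original test scores (made-up values for illustration)
scores = [70, 85, 92, 64]

# accumulator: starts empty and collects each adjusted score
adjusted_scores = []
for score in scores:
    adjusted_scores.append(score + 5)

print(adjusted_scores)  # [75, 90, 97, 69]
```

Note that the original list, scores, is left unchanged; the loop builds a brand new list.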
Sample Program: Another common task is creating a new list out of the elements in a list that meet a certain condition. This is called filtering. Here's an example of that, again using an accumulator, regular iteration with a for loop... and an if statement to conditionally include the original element.
We start off with a list of musicians:
artists = ['billie eilish', 'the cure', 'lil uzi vert', 'roy orbison', 'lil wayne', 'the knife']
The program below picks out a list of all of the artists that have a name that starts with the string "lil"... and puts them into a new list. We can use the same pattern as map, but this time, just add an if statement:
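One way this filter could be written is sketched below. The accumulator name lil_artists is an assumption, and startswith is used to test the beginning of each name:

```python
artists = ['billie eilish', 'the cure', 'lil uzi vert', 'roy orbison', 'lil wayne', 'the knife']

# accumulator: only artists that pass the if test are added
lil_artists = []
for artist in artists:
    if artist.startswith('lil'):
        lil_artists.append(artist)

print(lil_artists)  # ['lil uzi vert', 'lil wayne']
```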
Note that we could also combine both patterns to filter and map at the same time.
Map and filter are so common that Python has a concise syntax to create lists by transforming elements and / or filtering elements without having to write 3 or 4 lines of code. Python allows us to create new lists by manipulating elements in an iterable object... all in one line. This language feature is called a list comprehension: an expression that gives back a new list by mapping and / or filtering values in an existing iterable object (such as a list, range, tuple, set, etc.).
Going back to our original map example pattern with a for loop and accumulator, we can make a generic version:
iterable_object = ...
accumulator = []
for element in iterable_object:
    accumulator.append(expression that uses element to create new_element)
Where:
iterable_object is the value that is looped over
accumulator is the variable where we put the new list
element is the loop variable
expression transforms the loop variable, element, with the result being added to the accumulator
With a list comprehension, we can do the same, but in a single line! Here's the generic version first:
accumulator = [expression for element in iterable_object]
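Taken step-by-step:
Start with square brackets: []
Add the for loop inside the brackets: [for element in iterable_object]
Put the expression before the for: [expression for element in iterable_object]
For example, if the expression is element * 2, the comprehension becomes: [element * 2 for element in iterable_object]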
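To make the generic version concrete, here is a small sketch that runs both forms side by side. The list numbers and the names doubled_loop and doubled_comprehension are made up for illustration:

```python
numbers = [1, 2, 3, 4]

# for loop with an accumulator
doubled_loop = []
for element in numbers:
    doubled_loop.append(element * 2)

# the same result as a list comprehension
doubled_comprehension = [element * 2 for element in numbers]

print(doubled_loop)           # [2, 4, 6, 8]
print(doubled_comprehension)  # [2, 4, 6, 8]
```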
You can also filter elements:
accumulator = [element for element in iterable_object if some_condition]
...where some_condition controls whether or not an element is included in the accumulator. If the condition is True, then add the element. Typically, you'll use the loop variable, element, in some_condition:
accumulator = [element for element in iterable_object if len(element) > 3]
Lastly, you can filter and map in the same list comprehension:
accumulator = [element * 2 for element in iterable_object if len(element) > 3]
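As a concrete sketch of these two forms, assume the iterable is a list of strings (the list words and the result names are hypothetical). Keep in mind that multiplying a string by 2 repeats it:

```python
words = ['hi', 'hello', 'hey', 'howdy']

# filter only: keep the strings longer than 3 characters
long_words = [element for element in words if len(element) > 3]
print(long_words)  # ['hello', 'howdy']

# filter and map: keep the long strings, then repeat each one twice
long_doubled = [element * 2 for element in words if len(element) > 3]
print(long_doubled)  # ['hellohello', 'howdyhowdy']
```

The if at the end decides which elements are kept, and the expression at the front decides what is stored for each kept element.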
Now let's see this in action using our original sample programs:
Sample Program: Convert the two examples from earlier into list comprehensions:
At the end, there's an example of filtering and mapping at the same time... taking all of the musicians whose names start with "lil" and converting their names to uppercase.
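A sketch of what those conversions might look like is below. The score values are made up, and the variable names (scores, adjusted_scores, lil_artists, loud_lil_artists) are assumptions for illustration:

```python
# map: add 5 to every score (made-up values)
scores = [70, 85, 92, 64]
adjusted_scores = [score + 5 for score in scores]
print(adjusted_scores)  # [75, 90, 97, 69]

# filter: keep only the artists whose name starts with "lil"
artists = ['billie eilish', 'the cure', 'lil uzi vert', 'roy orbison', 'lil wayne', 'the knife']
lil_artists = [artist for artist in artists if artist.startswith('lil')]
print(lil_artists)  # ['lil uzi vert', 'lil wayne']

# filter and map together: same filter, but store the uppercase name
loud_lil_artists = [artist.upper() for artist in artists if artist.startswith('lil')]
print(loud_lil_artists)  # ['LIL UZI VERT', 'LIL WAYNE']
```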
Programming Challenge: Write code to...
Given greetings = ['hi', 'hello', 'hola'], create a new list based off of greetings where all occurrences of 'h' are replaced with 'j', and add a number of exclamation points equal to the number of characters in the string for every element in the original list
Note: islower doesn't work in the online editor, so use an IDE or compare unicode code points (97 through 122 are lowercase letters)
Print out each list created. See the expected output below.
[1.0, 1.4142135623730951, 1.7320508075688772, 2.0, … 10.0]
['ji!!', 'jello!!!!!', 'jola!!!!']
[97, 100, 101, 105, 110]
A similar syntax can be used to create sets and dictionaries. (There are no tuple comprehensions, though!)
For sets, wrap your set comprehension in curly braces instead of square brackets. The code below takes a sentence (from a Quora question about sentences with repeated words) and gives a set of the words that exist in the sentence:
lower normalizes casing on the words
sentence = "You cannot end a sentence with because because because is a conjunction"
words = {w.lower() for w in sentence.split(' ') if w not in ('a', 'an', 'the')}
print(words)
For a dictionary, also use curly braces. However, the expression at the beginning of the dictionary comprehension must specify both the key and the value for each entry, separated by a colon.
The following code makes a new dictionary with keys that are each letter in the string "abc"... and the values all initialized to 0:
d = {k: 0 for k in 'abc'}
print(d) # {'a': 0, 'b': 0, 'c': 0}
Now that you've completed this module, please visit our NYU Classes site and take the corresponding quiz for this module. These quizzes are worth 5% of your total grade and are a great way to test your Python skills! You may also use the following scratch space to test out any code you want.