File I/O

Storing Data

What's the difference between volatile and non-volatile memory, and what are some examples of each?

Storing Data in Main Memory

What are the consequences of your data being stored in your computer's main memory?

So, if you want to persist data beyond the lifetime of your running program or through on-off cycles…

Store data as a file on non-volatile memory.

open

In Python, what built-in function do we use to interact with files? How many parameters does it have, and what does it return?

A File Object …

#  my_output_file is a file object
my_output_file = open("myfile.txt", "w")
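
As a rough sketch of what open hands back, the snippet below opens a file and looks at a couple of attributes of the returned file object. The temporary folder is an assumption made here so the example doesn't touch your own files.

```python
import os
import tempfile

# Assumption: work in a throwaway temporary folder instead of your project folder
folder = tempfile.mkdtemp()
path = os.path.join(folder, "myfile.txt")

# open takes a file name and, optionally, a mode; it returns a file object
my_output_file = open(path, "w")

print(my_output_file.mode)    # the mode the file was opened with
print(my_output_file.closed)  # False while the file is still open

my_output_file.close()
print(my_output_file.closed)  # True once close has been called
```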

Putting Data Into a File

What are the steps for opening a file and putting data into it? What file object method is used for putting data into a file?

Writing to a File

  1. Get a file object using open with write mode: open('filename', 'w')
    • filename is the file to be opened
    • 'w' means that the file will be opened for writing
    • if the file doesn't exist, 'w' will create it
    • if the file exists, 'w' will overwrite it!
  2. Use the write method on the file object to write data to the file
    • takes a string as an argument (non-string will give a TypeError)
    • does not automatically add new lines
  3. Use the close method on the file object when you're done
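
The two notes about write in step 2 can be sketched as follows. The file name numbers.txt and the temporary folder are hypothetical choices for illustration only.

```python
import os
import tempfile

# Assumption: a hypothetical file in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "numbers.txt")

my_output_file = open(path, "w")

# write does not add new lines, so these two end up on the same line
my_output_file.write("one")
my_output_file.write("two\n")

# write only accepts strings; passing a number raises a TypeError
try:
    my_output_file.write(3)
except TypeError:
    # convert the number to a string first
    my_output_file.write(str(3) + "\n")

my_output_file.close()

# read the file back to see what was actually written
my_input_file = open(path, "r")
contents = my_input_file.read()
my_input_file.close()
print(repr(contents))
```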

Writing to a File Code

#  open using mode 'w'
my_output_file = open("myfile.txt", "w")

#  use the write method
my_output_file.write("Monday\n")
my_output_file.write("Tuesday\n")
my_output_file.write("Wednesday\n")

#  close when you're done
my_output_file.close()
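
It's easy to forget the close step. One common alternative, sketched here rather than taken from these slides, is Python's with statement, which closes the file for you when the indented block ends. The temporary folder is again an assumption.

```python
import os
import tempfile

# Assumption: write into a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "myfile.txt")

# the file is closed automatically when the indented block ends
with open(path, "w") as my_output_file:
    my_output_file.write("Monday\n")
    my_output_file.write("Tuesday\n")
    my_output_file.write("Wednesday\n")

print(my_output_file.closed)  # the file object still exists, but it is closed

# read the file back to check what was written
with open(path, "r") as my_input_file:
    written = my_input_file.read()
```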

Retrieving Data From a File

What are the steps for opening a file and retrieving data from it? What file object methods can be used for reading data from a file?

Reading a File

  1. Get a file object using open with read mode: open('filename', 'r')
    • filename is the file to be opened
    • 'r' means that the file will be opened for reading
  2. To read data, use one of the following:
    • iterate over the file object itself with a for loop (reads one line at a time)
    • readline()
    • read()
    • readlines()
  3. Use the close method on the file object when you're done

Methods for Reading a File

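None of the following methods requires an argument.

  • readline() returns the next line as a string (an empty string at the end of the file)
  • readlines() returns all of the lines as a list of strings
  • read() returns the entire contents as a single string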

Examples

The following examples assume the presence of a file called ingredients.txt in the same folder/directory as your program.

The contents of the file are:

3:tomatoes
1:garlic cloves
2:green peppers

(Download or recreate to follow along)

Reading a File With Iteration

A file object is actually iterable!

my_input_file = open('ingredients.txt', 'r')
for line in my_input_file:
    print(line)
my_input_file.close()

Output of Reading a File With Iteration

Notice the extra new lines…

3:tomatoes

1:garlic cloves

2:green peppers

You can use the string method, strip(), to get rid of them.

print(line.strip())
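
Put together, the loop with strip might look like the sketch below. To keep it self-contained, it first recreates ingredients.txt in a temporary folder (an assumption; in your own program you'd just open the file next to your script).

```python
import os
import tempfile

# Assumption: recreate ingredients.txt in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "ingredients.txt")
setup_file = open(path, "w")
setup_file.write("3:tomatoes\n1:garlic cloves\n2:green peppers\n")
setup_file.close()

clean_lines = []
my_input_file = open(path, "r")
for line in my_input_file:
    # strip removes the trailing "\n" (and any other surrounding whitespace)
    clean_line = line.strip()
    clean_lines.append(clean_line)
    print(clean_line)
my_input_file.close()
```

Each line now prints without the blank line after it, because print only adds its own single new line.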

Reading a File With readline()

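The readline() method also reads one line at a time. It returns an empty string once the end of the file is reached, which is what the loop below checks for.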

my_input_file = open('ingredients.txt', 'r')
while True:
    line = my_input_file.readline()
    if len(line) == 0:
        break
    print(line)
my_input_file.close()

Output of Reading a File With readline()

As with iteration, there are extra new lines:

3:tomatoes

1:garlic cloves

2:green peppers

Again, you can use the string method, strip(), to get rid of them.

print(line.strip())

Reading a File With readlines()

You can also call readlines() (with an s) to read the entire contents of a file as a list of lines:

my_input_file = open('ingredients.txt', 'r')
lines = my_input_file.readlines()
print(lines)
my_input_file.close()

Output of Reading a File With readlines()

The list is printed out. Notice the newlines (as usual!).

['3:tomatoes\n', '1:garlic cloves\n', '2:green peppers\n']

Of course… you can then iterate over every item in the list:

for line in lines:
  print(line)
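
Because readlines() gives you an ordinary list, you can keep using it after the file is closed: count the lines, index into them, and so on. A small sketch, again recreating ingredients.txt in a temporary folder (an assumption made for illustration):

```python
import os
import tempfile

# Assumption: recreate ingredients.txt in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "ingredients.txt")
setup_file = open(path, "w")
setup_file.write("3:tomatoes\n1:garlic cloves\n2:green peppers\n")
setup_file.close()

my_input_file = open(path, "r")
lines = my_input_file.readlines()
my_input_file.close()

# the list is still available after the file is closed
print(len(lines))          # how many lines the file has
print(lines[0].strip())    # the first line, without its new line
print(lines[-1].strip())   # the last line, without its new line
```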

Finally, Reading a File with read()

Use the read() method on your file object to read the file in all at once.

read() returns the entire contents of a file (including newlines) as a string.

my_input_file = open("ingredients.txt", "r")
contents = my_input_file.read()
print(contents)
my_input_file.close()

Output of Reading a File With read()

The variable contents holds a string representing all of the data in the file.

3:tomatoes
1:garlic cloves
2:green peppers
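
If you read everything in with read() but still want to work line by line, one option is the string method splitlines(), which breaks the string at the new lines and drops them. This is a sketch, with ingredients.txt recreated in a temporary folder as an assumption.

```python
import os
import tempfile

# Assumption: recreate ingredients.txt in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "ingredients.txt")
setup_file = open(path, "w")
setup_file.write("3:tomatoes\n1:garlic cloves\n2:green peppers\n")
setup_file.close()

my_input_file = open(path, "r")
contents = my_input_file.read()
my_input_file.close()

# contents is one long string; splitlines turns it into a list of lines
ingredient_lines = contents.splitlines()
print(ingredient_lines)
```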

Some Exercises

Double the Ingredients

Let's Do This Step By Step

Reading the Ingredients

Let's try printing out every line in the ingredients file first:

my_input_file = open('ingredients.txt', 'r')
for line in my_input_file:
    print(line)
my_input_file.close()

Extracting Meaningful Information

Let's add code to get the number out of each line, double it, and print it out along with the ingredient:

#  there's a problem with this solution...
my_input_file = open('ingredients.txt', 'r')
for line in my_input_file:
	number = int(line[0])
	# notice that we're using strip to get rid of the excess new line
	print(str(number * 2) + line[1:].strip())
my_input_file.close()

This solution works for the data that's currently in the file, but…

Extracting Meaningful Information Part 2

What if the number at the beginning of the line had 2 digits? …like 10:cloves of garlic.

#  use split on the string...
my_input_file = open('ingredients.txt', 'r')
for line in my_input_file:
	clean_line = line.strip()
	parts = clean_line.split(":")
	number, ingredient = int(parts[0]), parts[1]
	print("%s:%s" % (number * 2, ingredient))
my_input_file.close()

Extracting Meaningful Information Part 3

Now… let's write out the ingredients rather than printing out to the screen.

my_input_file = open('ingredients.txt', 'r')
#  add a file to write to
#  (use a different file name: opening ingredients.txt with 'w' would
#  erase it before we get a chance to read it!)
my_output_file = open('ingredients_doubled.txt', 'w')
for line in my_input_file:
    clean_line = line.strip()
    parts = clean_line.split(":")
    number, ingredient = int(parts[0]), parts[1]
    # write to file instead of print
    my_output_file.write("%s:%s\n" % (number * 2, ingredient))
my_input_file.close()
my_output_file.close()
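
One way to check the result is to write the doubled ingredients to a separate output file and then read that file back. The sketch below does this end to end; the temporary folder and the output name ingredients_doubled.txt are assumptions for illustration.

```python
import os
import tempfile

# Assumption: both files live in a throwaway temporary folder
folder = tempfile.mkdtemp()
in_path = os.path.join(folder, "ingredients.txt")
out_path = os.path.join(folder, "ingredients_doubled.txt")  # hypothetical name

# recreate the input file
setup_file = open(in_path, "w")
setup_file.write("3:tomatoes\n1:garlic cloves\n2:green peppers\n")
setup_file.close()

# read from one file, write the doubled amounts to the other
my_input_file = open(in_path, "r")
my_output_file = open(out_path, "w")
for line in my_input_file:
    parts = line.strip().split(":")
    number, ingredient = int(parts[0]), parts[1]
    my_output_file.write("%s:%s\n" % (number * 2, ingredient))
my_input_file.close()
my_output_file.close()

# read the output file back to check it
check_file = open(out_path, "r")
doubled = check_file.read()
check_file.close()
print(doubled)
```

Keeping the input and output files separate means the original ingredients are still there if something goes wrong.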

Jane Austen

You can download a text version of Pride and Prejudice from Project Gutenberg.

Using that file with our to_pig_latin and translate_passage functions… can you write out a pig latin version of Pride and Prejudice?

Also… these are sort of from Jane Austen….

Downloading the File

Save the text version of Pride and Prejudice in the same folder that your program is in.

Pig Latin

def to_pig_latin(w):
    """translates word to pig latin"""

    w = w.lower()

    if not w.isalpha():
        return w

    if w == '' or len(w) == 1:
        return w

    if w[0] in 'aeiou':
        return w + 'way'

    first_two = w[0:2]
    if first_two == 'qu' or first_two == 'ch' or first_two == 'sh' or first_two == 'th':
        return w[2:] + first_two + 'ay'

    return w[1:] + w[0] + 'ay'
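
To see how each branch of the function behaves, here is a small self-contained sketch that repeats to_pig_latin and tries it on a few sample words. The sample words are chosen here for illustration.

```python
def to_pig_latin(w):
    """translates word to pig latin (same function as above)"""
    w = w.lower()
    if not w.isalpha():
        return w
    if w == '' or len(w) == 1:
        return w
    if w[0] in 'aeiou':
        return w + 'way'
    first_two = w[0:2]
    if first_two == 'qu' or first_two == 'ch' or first_two == 'sh' or first_two == 'th':
        return w[2:] + first_two + 'ay'
    return w[1:] + w[0] + 'ay'

print(to_pig_latin("hello"))   # ellohay  (first letter moves to the end)
print(to_pig_latin("apple"))   # appleway (starts with a vowel)
print(to_pig_latin("quiet"))   # ietquay  (qu moves together)
print(to_pig_latin("Pride"))   # ridepay  (everything is lowercased)
print(to_pig_latin("don't"))   # don't    (not all letters, so unchanged)
```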

Translate Passage

def translate_passage(passage):
    """translates text into pig latin"""
    translation = ""
    word = ""
    for c in passage:
        if not c.isalpha():
            translation += to_pig_latin(word)
            translation += c
            word = ""
        else:
            word += c
    # translate the final word if the passage ends with a letter
    translation += to_pig_latin(word)
    return translation
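
Before running this on a whole novel, it may help to try it on a short string. The sketch below repeats both functions so it can run on its own, with one assumption made explicit: after the loop, it translates any word still pending, so a passage that ends with a letter keeps its last word. The sample sentences are made up for illustration.

```python
def to_pig_latin(w):
    """translates word to pig latin"""
    w = w.lower()
    if not w.isalpha():
        return w
    if w == '' or len(w) == 1:
        return w
    if w[0] in 'aeiou':
        return w + 'way'
    first_two = w[0:2]
    if first_two == 'qu' or first_two == 'ch' or first_two == 'sh' or first_two == 'th':
        return w[2:] + first_two + 'ay'
    return w[1:] + w[0] + 'ay'

def translate_passage(passage):
    """translates text into pig latin"""
    translation = ""
    word = ""
    for c in passage:
        if not c.isalpha():
            # a non-letter ends the current word: translate it, keep the character
            translation += to_pig_latin(word)
            translation += c
            word = ""
        else:
            word += c
    # assumption: translate a word left over at the very end of the passage
    translation += to_pig_latin(word)
    return translation

print(translate_passage("It is a truth."))  # itway isway a ruthtay.
print(translate_passage("hello world"))     # ellohay orldway
```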

Putting it All Together

#  open file for reading
fh_in = open('pg1342.txt', 'r')
s = fh_in.read()
fh_in.close()

#  translate and write
fh_out = open('pg1342_translated.txt', 'w')
fh_out.write(translate_passage(s))
fh_out.close()