File I/O (CSCI-UA.0002 - Summer 2017)

Where's My Stuff?

When our programs are run, we usually have a bunch of variables that store numbers, lists, strings and all sorts of other data. But where do you think this data is actually stored? →

Our program and the data that it has been using are stored in your computer's main memory (really! I mean, where else would you put values that need to be remembered?)

RAM!?

Your computer's main memory or RAM (random access memory) is an example of volatile memory.

volatile memory - memory that requires an electrical current to maintain state
non-volatile memory - memory that can maintain state without power

What are some examples of non-volatile memory? →

Hard drives, flash drives, CDs and DVDs

Storing Data in Main Memory

What are the consequences of your data being stored in your computer's main memory? →

data may go away at the end of the program, or when the computer gets turned off
if you're working with large amounts of data, you may run out of RAM (which is typically less than the amount of non-volatile memory that you have)!

I Want Data to Last Longer Than That

What if we want to persist our data beyond the lifetime of the running program… or through on-off cycles? →

let's store data on non-volatile media!
… maybe as a file on your hard drive or SSD…

open

Python has a built-in function called open.

open opens a file!
it can be used for reading or writing
it takes two arguments: a filename and a mode
- filename
  - the absolute path…
  - or relative (to the script that you're writing) path
- mode … for now, we only care about:
  - 'w' - write
  - 'r' - read
  - 'a' - append
it returns a file object
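
As a rough sketch of the three modes listed above: the snippet below writes, appends to, and then reads back one file. The temporary folder and the filename modes_demo.txt are assumptions made for illustration so that no real files are touched; read() is covered later in these slides.

```python
import os
import tempfile

# Assumption: work in a throwaway folder with a made-up filename.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "modes_demo.txt")

f = open(path, "w")   # 'w' creates the file (or empties an existing one)
f.write("first line\n")
f.close()

f = open(path, "a")   # 'a' keeps what is already there and adds to the end
f.write("second line\n")
f.close()

f = open(path, "r")   # 'r' opens the file for reading
contents = f.read()
f.close()

print(contents)
```

Both lines should come back, because 'a' added to the file rather than replacing it.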

A File Object …

is an object that allows your program to manipulate/read/write to an actual file on disk
to create a file object and open a file, use the built-in function, open()

Writing to a File

f = open("test.txt", "w")
f.write("I'm in yr filez!\n")
f.write("Writin' some bits!\n")
f.write("...\n")
f.close()

call open …
- with the name of the file to open, 'test.txt'
- and the mode, 'w' to specify that we're writing to it
use the write() method to write strings to a file
you must always call close() on the file when you're done with it
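
One common way to make sure close() is never forgotten is Python's with statement, which closes the file when the indented block ends. This is an alternative to the explicit close() call above, not a replacement for understanding it. The temporary folder is an assumption used to keep the sketch self-contained.

```python
import os
import tempfile

# Assumption: write into a throwaway folder instead of the current one.
path = os.path.join(tempfile.mkdtemp(), "test.txt")

with open(path, "w") as f:
    f.write("I'm in yr filez!\n")
    f.write("Writin' some bits!\n")

# The file was closed automatically when the with block ended.
print(f.closed)
```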

Using open to Write to a File

Let's look at open and write in more detail:

open(filename, mode)

filename is the file to be opened
a mode of 'w' means that the file will be opened for writing
if the file doesn't exist, 'w' will create it
if the file exists, 'w' will overwrite it!
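
A small sketch of the create-then-overwrite behavior of 'w'. The filename overwrite_demo.txt and the temporary folder are hypothetical, chosen only so the example does not clobber a real file.

```python
import os
import tempfile

# Assumption: a made-up filename in a throwaway folder.
path = os.path.join(tempfile.mkdtemp(), "overwrite_demo.txt")

existed_before = os.path.exists(path)   # the file is not there yet

f = open(path, "w")   # 'w' creates the file
f.write("old contents\n")
f.close()

f = open(path, "w")   # opening with 'w' again empties the existing file
f.write("new contents\n")
f.close()

f = open(path, "r")
contents = f.read()
f.close()

print(existed_before)
print(contents)
```

Only the second write survives. Use 'a' instead when the existing contents should be kept.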

write(s)

does not automatically add new lines
takes a string as an argument (non-string arguments result in an error)
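
To illustrate both points about write(), here is a minimal sketch. It assumes a text-mode file in a temporary folder (the filename numbers.txt is made up) and catches the error so the script keeps running.

```python
import os
import tempfile

# Assumption: a made-up filename in a throwaway folder.
path = os.path.join(tempfile.mkdtemp(), "numbers.txt")

f = open(path, "w")
try:
    f.write(42)               # not a string, so Python raises TypeError
    error_raised = False
except TypeError:
    error_raised = True

f.write(str(42))              # convert the number to a string first
f.write("\n")                 # newlines have to be written explicitly
f.write("no newline here")
f.write(" so this continues the same line\n")
f.close()

f = open(path, "r")
contents = f.read()
f.close()

print(error_raised)
print(contents)
```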

Lottery Ticket

Write a program that creates a lottery ticket. The lottery ticket should:

have the words "Lucky Numbers" on the first line
contain 5 unique numbers between 1 and 59 (inclusive)
each number printed from lowest to highest
each number printed on its own line
be saved to a file named lotto.txt

Lucky Numbers
3
28
33
50
51

Pseudocode #1

"""
create a list of numbers
mix up the numbers
open a file
write out 'lucky numbers' to file
get the last 5 numbers from list
sort them
for each number in list
	write the number to the file
"""

Pseudocode #2

"""
create an empty list to store numbers
while length of list < 5
	generate a random number
	if the number isn't in the list of stored numbers
		add it
sort numbers
open a file
write out 'lucky numbers' to file
for every number in the list
	write it to the file
"""

Potential Solution

import random

#  generate list of sorted unique random numbers
random_number_list = []
while len(random_number_list) < 5:
    n = random.randint(1, 59) 
    if n not in random_number_list:
        random_number_list.append(n)
random_number_list.sort()

#  write out the list of numbers to a file
file_handle = open('lotto.txt', 'w')
file_handle.write('Lucky Numbers\n')
for n in random_number_list:
    file_handle.write('%s\n' % n)
file_handle.close()

Another Version

import random
def unique_random_list(sample_size, a, b):
    """ 
    sample_size is the number of random numbers to return
    a is the start of the pool of numbers to choose from
    b is the end of the pool of numbers to choose from
    """
    numbers = []
    while len(numbers) < sample_size:
        n = random.randint(a, b)
        if n not in numbers:
            numbers.append(n)
    return numbers 

observed = unique_random_list(3, 1, 5)
assert 3 == len(observed)
for n in observed:
    assert 1 == observed.count(n)

Another Version Continued

random_number_list = unique_random_list(5, 1, 59)
random_number_list.sort()
file_handle = open('lotto.txt', 'w')
file_handle.write('Lucky Numbers\n')
for n in random_number_list:
    file_handle.write('%s\n' % n)
file_handle.close()

BTDubz (re random)

By the way… (of course) there's already a function in the random module that does this:

random.sample(population, k)
- population - a sequence (think list or range) containing the elements to sample from; sets are no longer accepted as of Python 3.11
- k - the number of elements to retrieve
example output:

>>> print(random.sample([1, 2, 3, 4, 5, 6, 7], 4))
[7, 3, 1, 6]
>>> print(random.sample([1, 2, 3, 4, 5, 6, 7], 4))
[7, 5, 4, 6]
>>> print(random.sample([1, 2, 3, 4, 5, 6, 7], 4))
[5, 6, 3, 4]
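
Putting random.sample to work on the lottery ticket might look like the sketch below. It assumes the pool is range(1, 60), which covers 1 through 59, and it writes lotto.txt into a temporary folder (an assumption, so that running it does not overwrite an existing ticket).

```python
import os
import random
import tempfile

# Pick 5 distinct numbers from 1 through 59, then sort them lowest to highest.
numbers = sorted(random.sample(range(1, 60), 5))

# Assumption: save into a throwaway folder rather than next to the script.
path = os.path.join(tempfile.mkdtemp(), "lotto.txt")
file_handle = open(path, "w")
file_handle.write("Lucky Numbers\n")
for n in numbers:
    file_handle.write("%s\n" % n)
file_handle.close()

print(numbers)
```

Because random.sample never picks the same element twice, the "is it already in the list?" check from the earlier versions is no longer needed.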

Reading a File

To open a file in read mode, use "r" as the second argument:

f = open("test.txt", "r")

Reading a File

Once you have a file object (sometimes called a file handle), you can read the contents of a file by:

iterating over the file object itself (use a for loop with the file object)
using one of the following methods on the file object
- readline() - read one line at a time
- readlines() - reads entire contents of a file into memory as a list (each element is a line)
- read() - reads the entire contents of a file into memory as a string

The Easiest Way to Read a File

Once you have a file object, you can actually iterate over the file object itself. That is, you can use a for loop to loop over every line in the file object:

f = open("test.txt", "r")
for line in f:
    print(line)
f.close()

Using readline

readline() takes no arguments, and it returns a string.

it always returns a string, even if it's just a new line character ("\n")
if it returns an empty string, then we've reached the end of the file
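
A quick way to see the difference between a blank line and the end of the file is to call readline() more times than there are lines. The file below is made up for this sketch and lives in a temporary folder.

```python
import os
import tempfile

# Assumption: a made-up three-line file whose second line is blank.
path = os.path.join(tempfile.mkdtemp(), "blank_line.txt")
f = open(path, "w")
f.write("first\n\nlast\n")
f.close()

f = open(path, "r")
results = []
for i in range(5):            # call readline() 5 times on a 3-line file
    results.append(f.readline())
f.close()

print(results)
```

The blank line comes back as "\n" (length 1), while every call after the last line returns "" (length 0).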

Using readline Continued

To use readline to read the contents of a file, loop forever (or at least until we know that we're at the end of the file!) …

f = open("test.txt", "r")
while True:
	line = f.readline()
	if len(line) == 0:
		break
	print(line)
f.close()

Using readline() Continued More!

Using the test.txt file we've used in previous examples:

I'm in yr filez!
Writin' some bits!
...

What is the first line that will be printed? What is the actual string representation? How many times will the loop run?

f = open("test.txt", "r")
while True:
	line = f.readline()
	if len(line) == 0:
		break
	print(line)
f.close()

I'm in yr filez!
"I'm in yr filez!\n"
3 lines are printed (the loop body starts a 4th time, reads an empty string, and breaks)
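
To check these answers yourself, the sketch below recreates test.txt in a temporary folder (an assumption, to stay self-contained), counts how often the loop body starts, and prints each line with repr() so the trailing newline is visible. Passing end="" to print avoids the extra blank line that print(line) produces.

```python
import os
import tempfile

# Assumption: recreate test.txt in a throwaway folder.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
f = open(path, "w")
f.write("I'm in yr filez!\n")
f.write("Writin' some bits!\n")
f.write("...\n")
f.close()

iterations = 0
seen = []
f = open(path, "r")
while True:
    iterations += 1
    line = f.readline()
    if len(line) == 0:
        break
    seen.append(line)
    print(repr(line))         # shows the \n that is part of the string
    print(line, end="")       # prints the line without adding a second newline
f.close()

print(iterations, len(seen))
```

Three lines are collected, and the loop body begins a fourth time only to see the empty string and break.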

Reading a File in All At Once

Use the read() method on your file handle object to read the file in all at once. read() returns the entire contents of a file (including newlines) as a string.

f = open("test.txt", "r")
contents = f.read()
f.close()
print(contents)

Reading a File in All At Once

Use the readlines() method on your file handle object to read the file in all at once as a list, with each line being a single element in the list.

f = open("test.txt", "r")
lines = f.readlines()
f.close()
for line in lines:
    print(line)

Memory Efficiency

Which function uses more main memory, readline or read/readlines? Why? →

read/readlines consumes more memory because it reads the entire file at once!
similarly, in our previous exercises… we had solutions that either used up a lot of memory… or were expensive computationally
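
As a rough illustration of the trade-off, the sketch below counts the lines of a file two ways. The 1000-line file is generated just for this example in a temporary folder; how much Python buffers behind the scenes is an implementation detail this sketch does not measure.

```python
import os
import tempfile

# Assumption: generate a made-up 1000-line file in a throwaway folder.
path = os.path.join(tempfile.mkdtemp(), "big.txt")
f = open(path, "w")
for i in range(1000):
    f.write("line %s\n" % i)
f.close()

# readlines(): all 1000 strings sit in the list at the same time
f = open(path, "r")
all_lines = f.readlines()
f.close()
count_all_at_once = len(all_lines)

# for loop: only the current line is kept in the variable line
f = open(path, "r")
count_one_at_a_time = 0
for line in f:
    count_one_at_a_time += 1
f.close()

print(count_all_at_once, count_one_at_a_time)
```

Both approaches give the same count, but the second never needs the whole file in a list, which matters once a file is larger than the available RAM.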

A File Object and For Loops

Again, a file object is itself an iterable (you can loop over it using a for loop)… and it reads in chunks of the file as you go along →

f = open('my_file.txt', 'r')
#  read chunks at a time
for line in f:
    print(line)
f.close()

Creating Text Files with IDLE

Some of these exercises require you to work with existing text files. So, how do you create these files? →

IDLE can be used to work on files that aren't Python programs. To save a plain text file →

go to new file as usual
…and save as … but add .txt as your extension

(You can also open files that aren't .py or .txt in IDLE, as long as they're just plain text)

Multiple File Objects

You can have more than one file object open at a time. The following example: →

reads every line from readme.txt
writes each line with an exclamation point at the end to a file called writeme.txt

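input_file = open('readme.txt', 'r')
output_file = open('writeme.txt', 'w')
for line in input_file:
    #  remove the line's own newline so the ! lands at the end of the line
    output_file.write("{}!\n".format(line.rstrip("\n")))
input_file.close()
output_file.close()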

Reading and Writing

read the contents of a file called names.txt
the file will have first names in it
sort the names by alphabetical order
write the newly sorted names to another file
the original file should remain unchanged

The contents of names.txt (download and save to where your program is) will be:

Erin
Charles 
Bob
David
Alice

A Potential Solution

file_in = open("names.txt", "r")
names = []
for line in file_in:
    #  strip the trailing newline (and stray spaces) from each name
    names.append(line.strip())
#  or alternatively...
#  contents = file_in.read()
#  names = contents.splitlines()
file_in.close()
names.sort()
file_out = open("names_sorted.txt", "w")
for name in names:
    file_out.write(name + "\n")
file_out.close()
