Data Wrangling Course

James Howison's Data Wrangling course from the Information School at the University of Texas at Austin.

Python if/else/while (flow of control)

Flow of control is about how we move through a program. We cover if/else statements and while loops. There is a screencast that covers this material: Flow of Control Screencast.

The code in the screencast is below. Copy it and save it as “friday_yet.py” in your account on Holden; then you can follow along with the screencast or experiment yourself.

"""A short script to demonstrate branching."""
# No branching

destination = "home"

print("Done with work, I'm off {}".format(destination))

But often we want to execute only some lines of the code, depending on the content of a variable.

##############
# One branch
#############

day = "Friday"

# Is it Friday yet?
if (day == "Friday"):
    destination = "bar"
else:
    destination = "gym"

print("Done with work, I'm off {}".format(destination))

We can combine this, taking a branch inside another branch, leading to three possible outcomes.

##########
# Two branches
##########

day = "Wednesday"

# Is it Friday yet?
if (day == "Friday"):
    destination = "bar"
else:
    if (day == "Wednesday"):
        destination = "Park"
    else:
        destination = "gym"

print("Done with work, I'm off {}".format(destination))

We could split off even further. Examine the code below and draw a picture similar to the others on this page.

day = "Wednesday"

# Is it Friday yet?
if (day == "Friday"):
    destination = "bar"
else:
    if (day == "Wednesday"):
        destination = "Park"
    else:
        if (day == "Tuesday"):
            destination = "Sleep"
        else:
            destination = "gym"

print("Done with work, I'm off {}".format(destination))

When we’re mapping options in this way we can simplify the syntax using the elif construct.

##########
# Two branches, a little less indentation, using "elif"
##########

day = "Wednesday"

# Is it Friday yet?
if (day == "Friday"):
    destination = "bar"
elif (day == "Wednesday"):
    destination = "Park"
else:
    destination = "gym"

print("Done with work, I'm off {}".format(destination))
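The same idea extends to more options. As a minimal sketch, the three-level nested version shown earlier (with the extra "Tuesday" test) could be flattened with a second elif. Setting day to "Tuesday" here is just an assumption for the example, chosen so that the new branch runs.

```python
day = "Tuesday"

# Each test is tried in order; the first one that is True wins
# and the remaining branches are skipped.
if (day == "Friday"):
    destination = "bar"
elif (day == "Wednesday"):
    destination = "Park"
elif (day == "Tuesday"):
    destination = "Sleep"
else:
    # Reached only when none of the tests above were True.
    destination = "gym"

print("Done with work, I'm off {}".format(destination))
```

Every branch sits at the same indent level, so adding another day means adding another elif rather than another level of nesting.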

Assignment or Comparison?

Note the difference between a single = and a double ==. The single = is the “assignment operator” and puts things into variables. The double == is the “comparison operator” and tests whether two things are the same; it returns either True or False. You don’t want a single = in the parens of an if statement. The linter will help here, showing a SyntaxError (E0001 in pylint) if you accidentally use a single = there.
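A short sketch may make the difference concrete. The variable names is_friday and is_monday are made up for this example; they simply hold the result of each comparison so it can be printed.

```python
day = "Friday"                  # single = : assignment, stores "Friday" in day

is_friday = (day == "Friday")   # double == : comparison, produces True or False
is_monday = (day == "Monday")

print(is_friday)
print(is_monday)

# The comparison did not change the variable; day still holds "Friday".
print(day)
```

The comparison inside the parens of an if statement works the same way: the branch is chosen according to whether the test comes out True or False.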

The colon starts a “block” of code

Can you see the colon at the end of the if and the else lines? The colon starts a section of code, called a “block”. It’s easy to forget the colon. If you do, the linter or python3 will report a SyntaxError; depending on the tool, the message may point at the line that is missing the colon or at the indented line after it.

if (day == "Friday"):
    destination = "bar"
else:
    destination = "gym"

After the colon comes an indent. We will use 4 spaces (following the Python style guide, PEP 8), which show as four light grey dots in Atom. The block ends when we return to the previous indent level. So now you know what it means for code to be “in” the if block or “in” the else block.
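One way to see where a block starts and ends is to print from inside and outside it. This is only an illustrative sketch: the extra print lines and the choice of "Monday" are assumptions added for the example.

```python
day = "Monday"

if (day == "Friday"):
    # Indented 4 spaces: these lines are "in" the if block.
    print("in the if block")
    destination = "bar"
else:
    # Indented 4 spaces: these lines are "in" the else block.
    print("in the else block")
    destination = "gym"

# Back at the previous indent level: the blocks have ended,
# so this line runs no matter which branch was taken.
print("after both blocks, destination is {}".format(destination))
```

Try changing day to "Friday" and notice that the last line still runs, because it is not inside either block.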

In Atom it is helpful to ensure that your “Editor” settings are set up. I prefer to set the options shown in the screenshot below (accessed via Atom -> Preferences/Settings -> Editor). We will use four spaces rather than a tab character; spaces are known as “soft tabs”. The Indent Guide shows vertical lines in the editor to help you line up code blocks.

Loops over code - while and for loops

In addition to if/else we can also execute parts of the code repeatedly, “looping” back over those lines using a while loop or a for loop (which we’ll discuss later).

The while loop is explained in this Screencast on While Loops. The screencast uses this code:

"""first_while_loop.py Demonstrates a while loop.

In addition to branching with if, we can repeat parts of the code
multiple times, called a "loop".  Later this will be important for
processing lines of csv input files (where we want to do the same thing
over and over, once for each line).

For now, though, we'll just do things a certain number of times.

This code prints "hip" once per pass through the loop and then "hurray";
you can customize it for greater anticipation (e.g., "hip, hip, hip, hip,
hurray") by changing todo.  The test (todo > done) is repeated after each
pass through the indented loop body.
"""
todo = 3
done = 0

while (todo > done):
    print("hip")
    done = done + 1

print("hurray")

# Q: with todo = 3 and done = 0, how many times is "hip" printed, and why?
# Q: why does hurray only print once, regardless of what number
# you set todo to?
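Printing the variables on each pass is one way to check your answers to the questions above. The sketch below is the same loop with tracing added; the counter named passes and the trace messages are assumptions added for illustration and are not part of the screencast code.

```python
todo = 3
done = 0
passes = 0  # hypothetical counter: how many times the loop body ran

while (todo > done):
    # We only get here when the test (todo > done) was True.
    print("test was True:  todo={} done={}".format(todo, done))
    print("hip")
    done = done + 1
    passes = passes + 1

# We only get here once the test was False, so this part runs a single time.
print("test was False: todo={} done={}".format(todo, done))
print("hurray")
print("The loop body ran {} times".format(passes))
```

Change todo and run it again to watch how the values of done and passes follow along.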

The figures used in the screencast, showing the state of the variables, are below: