Lecture Zero: Introduction

Housekeeping

Grade breakdown:

  • 50% homework (5 assignments at 10% each, released on GitHub and submitted on Gradescope)
  • 40% final project
  • 10% participation

Class is curved.

Post questions on Ed (not for debugging, but for conceptual questions on the homeworks or lecture clarifications) or come to OH.

What is DevOps?

DevOps is about breaking down the wall between developers (people writing code) and operations (people releasing and deploying code into production and making sure it is reliable). Traditionally these have been two very separate teams, which means that the incentives of developers and operations engineers don’t always align. Developers aren’t motivated to make life easier for operations, and operations isn’t motivated to make life easier for developers. When a crash happens in production, the people handling the crash aren’t the ones familiar with the code.

The key concept behind DevOps is that if these two teams share responsibilities, they can build empathy and align their incentives, which ultimately gives the end user a better experience because new features are more stable and reliable.

There are a few main DevOps solutions we will be focusing on in CIS 1912:

  • Automated testing and deployment (we can easily ship new features with testing)
  • Easy deploy rollback (if something breaks we can revert quickly)
  • Observability (so we can know when something is wrong)

The main takeaway is to get developers involved in the operations process so that developers can use their skills to build tools to automate away the tedious parts of operations jobs. DevOps is not a role, but a way of doing things.

Python

We’ll be using Python for most of the development side of the DevOps solutions we cover in this course. It’s common and well supported in the infrastructure space because it’s easy to learn and there is wide library support.

Python example

# Comments start with a `#`

import time # import a module from the standard library

for i in range(1, 16): # For loops iterate only over iterables like lists and `range()`
    print(i)
    if i % 3 == 0 and i % 5 == 0:  # Conditional expressions
        print(time.time())
        print("fizzbuzz")  # Strings can be double or single quoted
    elif i % 3 == 0:
        print("fizz")
    elif i % 5 == 0:
        print("buzz")

Running the above code produces the following output:

$ python3 test.py
1
2
3
fizz
4
5
buzz
6
fizz
7
8
9
fizz
10
buzz
11
12
fizz
13
14
15
1607229328.9530184
fizzbuzz

Code in Python can run at the top level of a file, but it’s good practice to pull logic into functions:

def get_buzz(i): # (def)ine a function
    if i % 3 == 0 and i % 5 == 0:
        return "fizzbuzz"
    elif i % 3 == 0:
        return "fizz"
    elif i % 5 == 0:
        return "buzz"
    return ""

for i in range(1, 16):
    print(str(i) + " " + get_buzz(i)) # Call a function
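
Pulling logic into a function also makes it easy to test, which ties back to the automated testing goal from earlier. Here is a minimal sketch, assuming the get_buzz function above (repeated so the block runs on its own). The test_get_buzz name is made up for illustration, and a real project would more likely use a test runner such as pytest.

```python
def get_buzz(i):
    if i % 3 == 0 and i % 5 == 0:
        return "fizzbuzz"
    elif i % 3 == 0:
        return "fizz"
    elif i % 5 == 0:
        return "buzz"
    return ""


def test_get_buzz():
    # Each assert raises AssertionError if the expectation does not hold
    assert get_buzz(3) == "fizz"
    assert get_buzz(5) == "buzz"
    assert get_buzz(15) == "fizzbuzz"
    assert get_buzz(7) == ""


if __name__ == "__main__":
    test_get_buzz()
    print("all tests passed")
```

Because get_buzz returns a value instead of printing it, the test can compare results directly without capturing output.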

You can check out CIS 192 for more learning materials, and come to office hours with any questions about Python!

Packaging

Writing code is useful, but reusing code is even more useful! Making sure that you have access to the right packages, and that you are using the correct version of each package, is no easy task. Python’s default package manager, pip, has historically taken the most recent version of a package (and its dependencies) and pulled it down without taking into account compatibility with other dependencies or the conflicts a mismatch might cause. Other package managers built on top of pip, like Poetry, help solve this problem. Poetry is like NPM for JavaScript or Maven for Java. Take a look at the Poetry demo below for a more in-depth explanation!

Most importantly, Poetry helps us create reproducible build environments wherever we run our code: on our local machines, on our friends’ machines, or even on a production server somewhere in “the cloud.”

How it works

Poetry creates and manages two files. pyproject.toml is Poetry’s dependency file: a human-readable (and writeable) file which declares “acceptable versions” of packages. This is generally a range, such as “1.1 - 1.12”, which keeps out version 2.0 if that release contains a breaking change. poetry.lock is the lock file: an autogenerated file that records the specific version chosen for every package, including dependencies of dependencies. Poetry uses the lock file to persist the result of resolving the constraints listed in pyproject.toml, so every later install gets the same versions.
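
To make the difference between the two files concrete, here is a toy sketch of turning a declared range into a single pinned version. This is not Poetry's actual algorithm (real resolution also handles dependencies of dependencies and conflicts between them). The package name libfoo, the version data, and the resolve function are all made up for illustration.

```python
# Hypothetical data: which versions of a package exist on the package index
AVAILABLE = {
    "libfoo": [(1, 0), (1, 4), (1, 12), (2, 0)],
}

# What a dependency file might declare: an inclusive range per package,
# written as (lowest acceptable version, highest acceptable version)
DECLARED = {
    "libfoo": ((1, 1), (1, 12)),
}


def resolve(declared, available):
    """Pick the newest available version inside each declared range."""
    locked = {}
    for name, (low, high) in declared.items():
        # Tuples compare element by element, so (1, 12) is newer than (1, 4)
        candidates = [v for v in available[name] if low <= v <= high]
        if not candidates:
            raise ValueError(f"no version of {name} satisfies the range")
        locked[name] = max(candidates)
    return locked


lock = resolve(DECLARED, AVAILABLE)
print(lock)  # {'libfoo': (1, 12)}
```

The DECLARED dictionary plays the role of the dependency file (a range a human wrote), and lock plays the role of the lock file (one exact version per package).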

The Poetry demo is replicated below. For a picture of the spaghetti mess that Poetry helps us avoid, see the relevant xkcd comic.

Demos

Poetry

One recurring design pattern we see in DevOps is the package manager. This is a tool that helps manage your program’s dependencies. In other words, the package manager is in charge of keeping track of what packages your project needs to run correctly, and then downloading those packages in a way that makes it easy for your program to use this auxiliary code.

We’ll look at a few different package managers over the course of the semester. Node has one called NPM (Node Package Manager), Java has a package manager called Maven, and Python has a few offerings. Note that these package managers are all a little different because they work with different languages that all have different nuances. This is why we can’t reuse package managers across languages.

The Python package manager we’ll be using is called Poetry. Essentially, Poetry lets you download the Python libraries you need, then creates a virtual Python environment on your machine to run your code with those libraries. So, why the virtual environment? The answer is that Python varies a lot from version to version (especially Python 2 compared to Python 3). The virtual environment ensures that you, your team of developers, and your production environment are all on the same version of Python. This way we can avoid the bugs that arise when code written for one version of Python is actually run with a different version.
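
If you are ever unsure which Python you are running, the standard library can tell you. This is a small sketch using only the sys module. The in_virtualenv helper is a made-up name, and the check assumes the environment was created by venv or a recent virtualenv (very old virtualenv releases signalled this differently).

```python
import sys


def in_virtualenv():
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix points at the Python installation it was
    # created from. Outside of one, the two are equal.
    return sys.prefix != sys.base_prefix


print("Python version:", sys.version_info.major, sys.version_info.minor)
print("Inside a virtual environment:", in_virtualenv())
```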

Now, let’s get into how to actually use Poetry. First, make sure that you have Poetry installed on your machine. Installation instructions can be found in the Poetry documentation.

Once you have Poetry installed, let’s create a new project:

# Create a new folder called poetry_demo
$ mkdir poetry_demo
# Enter the new folder
$ cd poetry_demo
$ poetry init

Poetry will now give you lots of options for how to initialize your project. Just hit enter for all of them (Poetry will use the default setup, which is fine for our purposes). Once you’ve finished, you’ll see a new file, pyproject.toml, in the directory. This is the file that stores the information we just initialized.

Next, let’s add a dependency:

$ poetry add numpy
Creating virtualenv poetry-demo-KkU142w6-py3.9 in /Users/airbenderang/Library/Caches/pypoetry/virtualenvs
Using version ^1.19.5 for numpy

Updating dependencies
Resolving dependencies... (39.8s)

Writing lock file

Package operations: 1 install, 0 updates, 0 removals

  • Installing numpy (1.19.5)

Poetry actually does two things here. It downloads NumPy, but before that it creates a virtual environment, which we are going to use to run our Python code. If we wanted to use an older version of Python (like the now-unsupported Python 2), we could configure Poetry to set up the virtual environment with that release instead. Again, you will see a new file in your directory: poetry.lock. It isn’t meant to be read or edited by hand, but it tracks exactly which packages your program depends on and the version of each.
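
The ^1.19.5 in the output above is a caret constraint. As described in Poetry's documentation, it allows updates that do not change the leftmost non-zero version component, so ^1.19.5 means at least 1.19.5 and below 2.0.0. Here is a simplified sketch of that rule. It only handles plain three-part versions, and the function names are made up for illustration (they are not part of Poetry's API).

```python
def parse(version):
    # "1.19.5" -> (1, 19, 5); only plain three-part versions are handled here
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def caret_range(constraint):
    """Return (inclusive lower bound, exclusive upper bound) for '^X.Y.Z'."""
    low = parse(constraint.lstrip("^"))
    major, minor, patch = low
    if major > 0:
        high = (major + 1, 0, 0)
    elif minor > 0:
        high = (0, minor + 1, 0)
    else:
        high = (0, 0, patch + 1)
    return low, high


def satisfies(version, constraint):
    low, high = caret_range(constraint)
    return low <= parse(version) < high


print(caret_range("^1.19.5"))          # ((1, 19, 5), (2, 0, 0))
print(satisfies("1.20.0", "^1.19.5"))  # True
print(satisfies("2.0.0", "^1.19.5"))   # False
```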

Finally, let’s run some code in Poetry’s virtual environment. There are two ways to run Python programs with Poetry. The first is to type poetry run python script.py, which runs a Python script in the Poetry environment. Instead, we will open a new shell that has the Poetry virtual environment as its default Python environment:

$ poetry shell
Spawning shell within /Users/airbenderang/Library/Caches/pypoetry/virtualenvs/poetry-demo-KkU142w6-py3.9
$ . /Users/airbenderang/Library/Caches/pypoetry/virtualenvs/poetry-demo-KkU142w6-py3.9/bin/activate
# Open a new Python interactive terminal
$ python
Python 3.9.1 (default, Dec 24 2020, 16:53:18) 
[Clang 12.0.0 (clang-1200.0.32.28)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> x = np.array([[1,2],[3,4]])
>>> x
array([[1, 2],
       [3, 4]])
>>> y = np.linalg.inv(x)
>>> y
array([[-2. ,  1. ],
       [ 1.5, -0.5]])
>>> exit() # To leave the Python terminal
# Then exit again to leave the Poetry shell
$ exit

It looks like NumPy works! This means that Poetry has managed our dependencies properly, so they are accessible when we run our Python code with Poetry. Now, let’s make a simple Python file and have Poetry run it. Create a new file called average.py in the same directory as your pyproject.toml and poetry.lock and paste this code into it:

import sys
import numpy as np

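if len(sys.argv) < 2:
    print("Not enough command line arguments")
    sys.exit(1)  # Exit with a non-zero status to signal an error

xs = []
try:
    for arg in sys.argv[1:]:  # Skip sys.argv[0], which is the script name
        xs.append(int(arg))
except ValueError:  # int() raises ValueError for non-integer input
    print("Command line arguments are not integers")
    sys.exit(1)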

print(np.average(np.asarray(xs)))

Now, we can run this in the virtual environment created by Poetry:

$ poetry run python average.py 1 2 3 4
2.5

Awesome, it looks like this is working, too. Try changing around the command line arguments!