DH Code Analysis

Due Date: Monday, September 24, 2018

Submission Instructions: Submit digitally (on Box) and bring printed copy to class

Assignment Description:

This assignment asks you to pay close attention to a block of code and analyze how it works. I have provided three examples drawn from work we have read or will read this semester. Each block has comments to help you understand what it does. You will choose one of the examples and write a two-page close reading of its function and approach.

One goal of this assignment is to begin demystifying digital humanities text analysis. Much of the code used for published work is less complicated than you might think. Much of it (if not most of it) pays little or no attention to best practices or notions of "good code." A lot of it takes longer to run than it might, but efficiency is rarely a factor when you have only a few texts to analyze. (By a few, I mean thousands and not millions.)

A second goal of the assignment is to see code-based decisions as acts of interpretation. If I want to isolate all words that suggest happiness, how do I choose the words I'm looking for? If I want to download all novels from Project Gutenberg, which catalogue subjects do I select? Coders base many of these decisions on expediency, but they have real consequences.
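To make the happiness question concrete, here is a minimal Python sketch. Both word lists and the sample sentence are invented for illustration and do not come from any of the three assigned code blocks. It counts "happy" words in the same sentence twice, once with a narrow list and once with a broader one.

```python
import re

# Hypothetical word lists: two different guesses at what "happiness" looks like.
NARROW_HAPPY_WORDS = {"happy", "happiness", "joy"}
BROAD_HAPPY_WORDS = NARROW_HAPPY_WORDS | {
    "glad", "smile", "smiled", "laugh", "laughed", "delight",
}


def count_matches(text, word_list):
    """Count how many tokens in text appear in word_list (case-insensitive)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(1 for token in tokens if token in word_list)


# Invented sample sentence, not taken from any assigned text.
sample = (
    "She smiled and laughed, glad to be home, "
    "though she would not have called it joy."
)

narrow_count = count_matches(sample, NARROW_HAPPY_WORDS)
broad_count = count_matches(sample, BROAD_HAPPY_WORDS)
print("narrow list:", narrow_count)
print("broad list:", broad_count)
```

The two lists give different counts for the same sentence, and neither notices that the one word the narrow list catches ("joy") is being negated. Choices like these are the interpretive decisions to look for in the example you analyze.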

As you write your response, think about (and consider addressing) the following questions:

Once I have set up a Github repository, I will post more detailed instructions on:


The GitHub repository is now online at https://github.com/dh-fall-2018/DH-Code-Analysis-Assignment. It contains three .ipynb files; the other files in the repository are dependencies that the code needs in order to run properly. The code blocks are as follows:

Alger-Best-Lavin-edits.ipynb: Code adapted from Tatlock, Lynne et al. "Crossing Over: Gendered Reading Formations at the Muncie Public Library, 1891-1902"

FiveThirtyEight-GOP-Debate.ipynb: Code adapted from Beckman, Milo. "These Are The Phrases Each GOP Candidate Repeats Most"

Gutenfetch.ipynb: Code adapted from Parrish, Allison. "The Average Novel"

Accessing and Downloading the Files

To access and download the repo files, first fork the repo to your GitHub account, then use the GitHub Desktop app to clone your fork to your computer. This is a two-step process. (I will cover both steps in class on 9/6.)

Setting up Python and Running the Code (we'll cover this in class 9/6)

As stated above, all three of the code blocks are in .ipynb files, which means that you can launch Jupyter Notebook from Anaconda Navigator and then open the files as live notebooks. This is essentially the same process that we will use in our in-class workshops.
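If it helps to see what an .ipynb file is before opening one, the sketch below treats a notebook as what it is on disk: a JSON file containing a list of cells. The function name list_code_cells and the tiny example notebook are made up for illustration; only the "cells", "cell_type", and "source" keys come from the notebook file format itself.

```python
import json


def list_code_cells(notebook):
    """Return the source text of each code cell in a parsed notebook dict."""
    code_cells = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", "")
            # A cell's source may be stored as one string or as a list of lines.
            if isinstance(source, list):
                source = "".join(source)
            code_cells.append(source)
    return code_cells


# A tiny invented notebook, standing in for a real file.
example_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Notes on the code\n"]},
        {"cell_type": "code", "source": ["import os\n", "print(os.getcwd())"]},
    ]
}

for number, source in enumerate(list_code_cells(example_notebook), start=1):
    print(f"--- code cell {number} ---")
    print(source)

# To try this on a cloned file instead (assumes the file is in the current folder):
# with open("Gutenfetch.ipynb", encoding="utf-8") as f:
#     notebook = json.load(f)
# print(len(list_code_cells(notebook)), "code cells")
```

This is only a way of peeking inside the file. For the assignment itself you will still want to run the notebooks live so that you can see their output.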

Making Changes and Observing the Output (we'll begin to cover this in class on 9/6)