Syllabus and Lesson Schedule
A listing of lessons and concepts planned for the Fall 2016 session of Public Affairs Data Journalism I
Lessons with clickable dates have fully detailed notes. For more information about the assignments, go to the Assignments page.
-
Florida's missing speeding cops
How the Sun-Sentinel busted speeding cops using databases and grade-school arithmetic
-
FOIAing for FBI Files
Resources and articles about how to request files from the Federal Bureau of Investigation via FOIA.
-
Random slides
These will be incorporated into more readable pieces. If you have questions about where I got the data, just ask and I can point you to the source.
- Due at 1:30 PM:
Homework: First FBI FOIA Letter
- Due at 1:30 PM:
Homework: Cataloging Google's Self-Driving Car Accidents
-
ALS Icebreaker Challenge
By now, revenues from the wildly successful ALS Icebucket Challenge are reflected in the latest 990s. What can we find out about how the money was used?
-
Add it Like Apgar
How do we make data relevant to what seems like a non-binary, non-enumerable story? We learn from the famous doctor who saved countless newborn babies by thinking of them as numbers. Even something as valuable as a baby’s life can be reduced to a number from 0 to 10.
- Due at 12:00 AM:
Homework: Basic Spreadsheet and Pivot Table Exercise with Evictions Data
- Due at 12:00 AM:
Homework: Comparing 990 Datapoints across several organizations
-
The Washington Post and the Trump Foundation
A reporter is curious about how Trump the presidential candidate compares to Trump the private billionaire when it comes to charity. By systematically double-checking Trump’s claims, he found an even better story.
-
Small to Big by Comparing State-by-State
How do we turn a narrow topic on a local story into a year-long project? Take that narrow topic on a state-by-state tour and compare how laws and data and history and stories differ in 50 comparable ways. This page is a set of real-world examples of how to find focus in a larger issue, and make a big project by looking at the U.S. state-by-state picture.
-
Using LexisNexis Academic to Search for Old News
Not all news and journalism made it to the web. LexisNexis Academic gives us access to well-organized news archives.
-
Homework: Document the Sources of an Investigative Story
Read an investigative story, then document all of its sources with a spreadsheet. You should get an idea of how many people and organizations were contacted, as well as the document trail that was followed.
-
Homework: LexisNexis That Story
For an investigative story that you've read, use LexisNexis to run 5 queries and find 5 more stories of at least 500 words, from 5 different years and publications, that offer insights or context not found in the investigative story you read.
- Due at 1:30 PM:
Homework: LexisNexis That Story
- Due at 1:30 PM:
Homework: Document the Sources of an Investigative Story
- Due at 1:30 PM:
Quiz: Quiz 01
-
Getting Started with Carto Builder
Logging in, creating a dataset, and making a map with Carto Builder and USGS Earthquake data.
-
California S.B. 272
California’s state and city agencies publish a lot of data. But a law requiring them to publish catalogs of everything they have makes it much easier to see what’s not yet published.
-
Quiz: Quiz 01
Quiz covering the first 2 chapters of "Homicide" and "Art of Access". And some spreadsheets.
-
Instructor absence
Homework/topics pushed back a day.
-
Installing DB Browser for SQLite
The lab computers have this installed, but you should install this SQLite client on your own computer: http://sqlitebrowser.org/
-
Homework: Research Public Records on MuckRock
Browse MuckRock. Document 5 successful and 5 rejected public records requests.
- Due at 12:00 AM:
Homework: Research Public Records on MuckRock
-
Getting to Know SQLite with a Client and a Database
A brief description of the SQLite database software, specifically, how to navigate a SQLite database file using a SQLite client.
-
SQLite Data Starter Packs
This is a collection of public datasets conveniently packaged as SQLite databases to practice on. You don’t have to worry about the data cleaning or import process; just download the SQLite database files and query them from your favorite SQLite client.
-
No homework for Thursday. Probably a bit of study work for Tuesday
Ignore the current homepage
-
A primer for the U.S. Census American Community Survey data
The U.S. Census records such an incredible, intimidating volume of data points about American life that it is hard to know where to begin. This guide aims to explain how Census data is organized, how to find important and interesting data points, and how the data is distilled and displayed by journalism outlets.
-
SQL Part 1: Select, Sort, and Transform Data
In this first unit of SQL lessons, we cover the syntax and concepts to fetch, sort, and – as we’ve done using spreadsheet functions – transform data values.
-
Reading Your Browser's History with SQLite
How the SQLite database is used in billions of real-world applications today is of little relevance to us in this class. But the web browser is an easy-to-understand example of how a database gets created and filled.
- Due at 1:50 PM:
Quiz: SQL Basics
-
SQL JOIN Statements
A lot of powerful journalism is simply looking at one list and seeing which of its names are on another list. The JOIN clause is the clearest way to express that concept, and to execute it in a blink of an eye. It is the main reason why we learn SQL instead of trying to hack around the usually versatile spreadsheet.
-
SQLite Simple Folks: Overview
A set of SQL programming lessons using a tiny dataset of “simple folks” so that we can focus on just the SQL syntax and concepts.
-
Quiz: SQL Basics
A review of basic syntax for SELECT statements, from FROM to some aggregations with GROUP BY.
-
Homework: LexisNexis Refresher for Ro Khanna
Log in to LexisNexis again and find 10 stories from the past: 5 focusing specifically on Rep. Mike Honda and 5 on his challenger for CA-17, Ro Khanna. For five of the 10 stories, come up with a question that you think is worth asking.
- Due at 1:30 PM:
Homework: LexisNexis Refresher for Ro Khanna
-
Campaign Finance Data
Election Day is near, so what better SQL practice is there than the FEC datasets? They are not only comprehensive and fairly well documented, but large enough that knowing SQL becomes a huge advantage over spreadsheets.
-
Homework: Aggregation SQL warmups with earthquakes
Just some exercises covering SQL up to GROUP BY, and to familiarize you with earthquake data.
-
Quiz: Homicide/Art of Access Readings Part 2 and More
Quiz covering chapters 3, 4, 5 of "Homicide" and "Art of Access". And some SQL.
- Due at 1:30 PM:
Quiz: Homicide/Art of Access Readings Part 2 and More
- Due at 1:30 PM:
Homework: Practice on baby names (optional)
- Due at 1:30 PM:
Homework: Aggregation SQL warmups with earthquakes
-
Homicide/Art of Access Readings Part 2 and More
Quiz covering chapters 3, 4, 5 of “Homicide” and “Art of Access”. And some SQL.
-
SQL Joins for estimating gender and Hollywood power
The Hollywood Reporter (THR) released a list of the Most Powerful People in Entertainment. How can we analyze that as data?
-
Homework: SQL Joins, Salaries, and Baby Names
This is a walkthrough-style exercise: trying out some SQL JOINs, with a little practice on data wrangling outside of the database.
-
Homework: Why does Ryan Shapiro have so many documents?
Ryan Shapiro is a Ph.D. candidate at MIT and a research affiliate at the Berkman Center for Internet & Society at Harvard University. He is an historian of national security specializing in governmental transparency and the policing of dissent. Politico has referred to Shapiro as “a FOIA guru at the Massachusetts Institute of Technology”, while the FBI has declared Shapiro’s FOIA research methodologies themselves to be a threat to national security. Shapiro will speak to us about his research and journey from animal rights activist to FOIA-powered-scholar and transparency activist.
- Due at 9:00 AM:
Homework: Why does Ryan Shapiro have so many documents?
- Due at 1:30 PM:
Homework: SQL Joins, Salaries, and Baby Names
-
Guest Speaker Ryan Shapiro
Shapiro is a PhD candidate at MIT and a prolific user of FOIA laws. He’ll talk about how FOIA became relevant to his research and activism, how he got better at making requests, and how to argue with the FBI.
-
Midterm Overview and Example Questions
Oh boy…
- Due at 5:00 PM:
Homework: Election Predictions with SQL
-
Election Predictions with SQL
Today is Election Day. Learn a new SQL statement, compete with your classmates, and win the prediction pool by copying Nate Silver.
-
The Midterm Datasets
The data for the midterm in database form.
- Midterm
-
Homework: MuckRock, Baltimore, and more David Simon
Some readings, a quick signup, stuff to prep for the last 2 weeks of data journalism work. No points, but please do the work or face a very difficult quiz after Thanksgiving...
-
Quiz: Homicide/Art of Access Readings, All Together
Final quiz covering the rest of "Homicide" and "Art of Access".
-
Project: Ten FOIAs to finish
Get ready for a New Year and a new President with 10 public records requests (FOIA, state, and local)
-
Phillip Reese of the Sacramento Bee
Phillip Reese, investigative data journalist, Pulitzer Finalist, and my former colleague, will talk about what it’s like to find stories and make an impact with data.
-
Midterm on SQL and Structured Data
(Reviewing) An in-class midterm evaluating your understanding of, and proficiency in, Structured Query Language and relational database concepts. If you’ve already taken the midterm, here are the answers.
- Due at 1:30 PM:
Homework: MuckRock, Baltimore, and more David Simon
-
More Real World Dirty Data
Covering more challenges of real-world data. Data without structure is just noise, after all.
-
Crime Data
If we haven’t already, we’ll look at the very nebulous and political nature of crime stats. Or, watch Season 3 of The Wire.
-
Data Journalism Disasters
What are ways that data journalism can go very wrong?
-
Having singular data and focus
Anecdote, spreadsheet, database, FOIA – it doesn’t matter what you start with, only that you start somewhere. So here are some random bits of advice and examples.
-
Published/Partnered Data Sidebar
Alone or with a partner, find a significant dataset (or several) that you want to dig into more deeply. Use spreadsheets, SQL, Carto – whatever gets the job done.
-
Ten FOIAs to finish
Get ready for a New Year and a new President with 10 public records requests (FOIA, state, and local)
- Due at 1:30 PM:
Quiz: Homicide/Art of Access Readings, All Together
-
Spreadsheet-Made Data Visualizations That Are Relatively Simple and Absolutely Effective
A non-comprehensive list of examples of powerful data storytelling in the wild that (likely) originated with a spreadsheet.
-
Carto, Political Boundaries, SQL, and Guns and Coffee
Using Carto and SQL Joins to make a choropleth of Starbucks and gun shops by zip code.