The Midterm Datasets

The data for the midterm in database form.

Datasets to download

While you will be provided with printouts of what a data table looks like, it's best to get acclimated before the midterm starts.

The midterm-datasets.sqlite file contains all of the data we'll use. Go ahead and download it.

The midterm may refer to the simplefolks.sqlite database, which you can download here.
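One way to get acclimated is to list the tables and their columns before you write any queries. The sketch below uses Python's built-in sqlite3 module. The helper names list_tables and column_names are made up for illustration, and the in-memory demo_table is only a stand-in so the code runs without the download.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the connected SQLite database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name;"
    ).fetchall()
    return [name for (name,) in rows]


def column_names(conn, table):
    """Return the column names of one table, using PRAGMA table_info."""
    # Table names cannot be bound as query parameters, so only pass
    # names that came back from list_tables().
    rows = conn.execute(f'PRAGMA table_info("{table}");').fetchall()
    return [row[1] for row in rows]  # row[1] holds the column name


# Stand-in database (hypothetical table) so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_table (id INTEGER, label TEXT);")

for table in list_tables(conn):
    print(table, column_names(conn, table))
```

To inspect the real data, replace the stand-in connection with sqlite3.connect("midterm-datasets.sqlite"), assuming the file has been downloaded into your working directory.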

Exercises

I've been trying to put together spreadsheets of questions for each dataset, but it's been pretty repetitive (here are some relating to Twitter).

I don't intend to give you anything harder than what's found on the spreadsheet of queries that accompanies the Election Predictions with SQL

The data in Midterm Datasets

A brief description of where the data in each table came from.

electoral_votes U.S. Presidency Electoral Votes

The popular vote per major presidential party, from 1972 to 2012, such that every state has a vote value for d and r (Democrats and Republicans), making it easy to compare how much a state has changed politically over the years.

Via electoral-vote.com, from their page: Data Galore, the Presidency
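Because each row carries both a d and an r value, a self-join can put two election years side by side for every state. The following is a rough sketch, not the actual schema: only d and r are named in the description above, so the year and state columns are assumptions, and the rows are placeholder numbers rather than real vote totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: the year and state column names are guesses.
conn.execute(
    "CREATE TABLE electoral_votes (year INTEGER, state TEXT, d INTEGER, r INTEGER);"
)
# Placeholder rows for two fictional states, NOT real election results.
conn.executemany(
    "INSERT INTO electoral_votes VALUES (?, ?, ?, ?);",
    [
        (1972, "AA", 400, 600),
        (2012, "AA", 550, 450),
        (1972, "BB", 500, 500),
        (2012, "BB", 450, 550),
    ],
)

# Democratic margin (in percentage points of the two-party vote)
# for the same state in two different years.
QUERY = """
SELECT earlier.state,
       ROUND(100.0 * (earlier.d - earlier.r) / (earlier.d + earlier.r), 1) AS margin_then,
       ROUND(100.0 * (later.d - later.r) / (later.d + later.r), 1) AS margin_now
FROM electoral_votes AS earlier
JOIN electoral_votes AS later ON earlier.state = later.state
WHERE earlier.year = ? AND later.year = ?
ORDER BY earlier.state;
"""

shifts = conn.execute(QUERY, (1972, 2012)).fetchall()
for state, margin_then, margin_now in shifts:
    print(state, margin_then, margin_now)
```

A positive margin means the d value was larger than the r value that year; comparing the two margin columns shows which direction a state moved.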

U.S. Congress Legislators

Data from the unitedstates.io project, curated by Hello World Data.

congress_legislators Legislator biographies

Biographical information about each legislator, including latest term in office.

congress_terms Legislative terms

Joins to the congress_legislators table via bioguide_id. Contains the start, end, state, and position of each term. Terms are stored separately because a person can switch parties and chambers over a career.
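The join on bioguide_id might look like the sketch below, which counts each legislator's terms and the number of distinct parties across them. Only bioguide_id is confirmed by the description; every other column name (last_name, start_date, end_date, party, and so on) is an assumption, and the two legislators are fictional.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schemas: only bioguide_id is named in the table descriptions.
conn.executescript(
    """
    CREATE TABLE congress_legislators (bioguide_id TEXT, last_name TEXT);
    CREATE TABLE congress_terms (
        bioguide_id TEXT, start_date TEXT, end_date TEXT,
        state TEXT, party TEXT, position TEXT
    );
    """
)
# Fictional legislators and terms, for illustration only.
conn.executemany(
    "INSERT INTO congress_legislators VALUES (?, ?);",
    [("X000001", "Example"), ("X000002", "Sample")],
)
conn.executemany(
    "INSERT INTO congress_terms VALUES (?, ?, ?, ?, ?, ?);",
    [
        ("X000001", "2001-01-03", "2003-01-03", "AA", "P1", "rep"),
        ("X000001", "2003-01-03", "2005-01-03", "AA", "P1", "rep"),
        ("X000001", "2005-01-03", "2011-01-03", "AA", "P2", "sen"),
        ("X000002", "2009-01-03", "2011-01-03", "BB", "P1", "rep"),
    ],
)

# One output row per legislator: how many terms, and how many parties.
QUERY = """
SELECT l.bioguide_id,
       l.last_name,
       COUNT(*) AS term_count,
       COUNT(DISTINCT t.party) AS party_count
FROM congress_legislators AS l
JOIN congress_terms AS t ON t.bioguide_id = l.bioguide_id
GROUP BY l.bioguide_id, l.last_name
ORDER BY l.bioguide_id;
"""

summary = conn.execute(QUERY).fetchall()
for row in summary:
    print(row)
```

A party_count above 1 flags someone who switched parties, which is the reason the terms live in their own table.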

Los Angeles County Sheriff

Three tables from the L.A. Sheriff, all related to deputy-involved shooting incidents. Thanks to the Public Safety Open Data Portal for curating this data from Socrata.

lacs_shootings Shooting Incidents involving Los Angeles Sheriff's Deputies

Incident-level information for each case in which a deputy shot at someone.

lacs_deputies Deputy Details

Details specific to the deputy, including race/gender/age, years of service, and number of previous shootings.

lacs_people People Details

Details specific to the person shot by an LA deputy, including demographics and past background.

tweets Tweets from Election 2016

Tweets and account information from selected presidential candidates, staff, and political reporters. Data gathered courtesy of sferik's handy t tool.

tw_users Twitter users

Profile information for each user, including how they describe themselves, when they first started tweeting, and how many friends they have.

tweets Tweets

The most recently available tweets as of early Monday, Nov. 7, right before the 2016 election. Contains the (lowercased) author, the timestamp of the tweet, and the text. Every tweet has a unique ID.

tw_mentions Mentions

If you want to mention another user in a tweet and have Twitter notify them, you have to refer to that user by their screen name with a prepended @ sign, e.g. "Hello @dancow".

This table records every mention of a user in a tweet within the given dataset.
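To make the @ convention concrete, here is a minimal sketch of pulling mentions out of tweet text with a regular expression. It is not how the table was actually built, and the pattern is a simplification of Twitter's real screen name rules. The function name extract_mentions is made up, and lowercasing the result is an assumption chosen to match the lowercased author names in the tweets table.

```python
import re

# Simplified pattern: an "@" that does not follow a letter, digit, or
# underscore (so email addresses are skipped), then the screen name.
MENTION_RE = re.compile(r"(?<![A-Za-z0-9_])@([A-Za-z0-9_]+)")


def extract_mentions(text):
    """Return the lowercased screen names mentioned in a tweet's text."""
    return [name.lower() for name in MENTION_RE.findall(text)]


print(extract_mentions("Hello @dancow"))
```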

tw_hashtags Hashtags

Prepending a pound sign (#) to a word is how you signal you want to make a term happen. It's similar to mentions, but with just words.
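A typical question for a table like this is which hashtags show up most often. The sketch below counts them with GROUP BY. The column names tweet_id and hashtag are assumptions about the schema, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: one row per hashtag used in a tweet.
conn.execute("CREATE TABLE tw_hashtags (tweet_id INTEGER, hashtag TEXT);")
# Invented rows, for illustration only.
conn.executemany(
    "INSERT INTO tw_hashtags VALUES (?, ?);",
    [(1, "Example"), (2, "example"), (3, "other")],
)

# LOWER() folds "Example" and "example" into a single hashtag.
QUERY = """
SELECT LOWER(hashtag) AS tag, COUNT(*) AS uses
FROM tw_hashtags
GROUP BY LOWER(hashtag)
ORDER BY uses DESC, tag ASC;
"""

top_tags = conn.execute(QUERY).fetchall()
for tag, uses in top_tags:
    print(tag, uses)
```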