The Midterm Datasets
The data for the midterm in database form.
Datasets to download
While you will be provided with printouts of what a data table looks like, it's best to get acclimated before the midterm starts.
The midterm-datasets.sqlite file has all of the data we'll use. Go ahead and download it.
The midterm may refer to the simplefolks.sqlite database, which you can download here.
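One quick way to get acclimated is to list every table and its columns. The following is a minimal sketch using Python's built-in sqlite3 module; the `list_tables` helper and the `demo_people` table are hypothetical stand-ins so the snippet runs without the download. To inspect the real file, connect to midterm-datasets.sqlite instead of the in-memory database.

```python
import sqlite3

def list_tables(conn):
    """Return (table_name, [column names]) pairs for every table in the database."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    return [(name, [col[1] for col in conn.execute(f'PRAGMA table_info("{name}")')])
            for name in names]

# Stand-in database so the sketch runs anywhere. For the real data, use
# sqlite3.connect("midterm-datasets.sqlite") after downloading the file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_people (name TEXT, age INTEGER)")  # hypothetical table

for table, columns in list_tables(conn):
    print(table, columns)
```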
Exercises
I've been trying to put together spreadsheets of questions for each dataset, but it's been pretty repetitive (here are some relating to Twitter).
I don't intend to give you anything harder than what's found on the spreadsheet of queries that accompanies the Election Predictions with SQL.
The data in Midterm Datasets
A brief description of where the data in each table came from.
electoral_votes
U.S. Presidency Electoral Votes
The popular vote per major presidential party, from 1972 to 2012, such that every state has a vote value for d and r (Democrats and Republicans), making it easy to compare how much a state has changed politically over the years.
Via electoral-vote.com, from their page: Data Galore, the Presidency
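To illustrate the kind of comparison the d and r columns allow, here is a rough sketch that computes the Democratic margin per state and year. Only d and r are named in the description; the `year` and `state` columns are assumptions about the schema, and the rows are made-up numbers, not real results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only the d and r columns come from the description.
conn.execute(
    "CREATE TABLE electoral_votes (year INTEGER, state TEXT, d INTEGER, r INTEGER)")
conn.executemany(
    "INSERT INTO electoral_votes VALUES (?, ?, ?, ?)",
    [(1972, "XX", 400, 600), (2012, "XX", 550, 450),   # invented vote counts
     (1972, "YY", 300, 700), (2012, "YY", 350, 650)])

# Democratic margin as a percentage of the two-party vote:
# positive means more d votes, negative means more r votes.
query = """
SELECT state, year,
       ROUND(100.0 * (d - r) / (d + r), 1) AS d_margin_pct
FROM electoral_votes
ORDER BY state, year
"""
margins = conn.execute(query).fetchall()
for row in margins:
    print(row)
```

Reading the margin for one state across years shows how far it has shifted; in this toy data, "XX" moves from -20.0 to 10.0.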
U.S. Congress Legislators
Data from the unitedstates.io project, curated by Hello World Data
congress_legislators
Legislator biographies
Biographical information about each legislator, including latest term in office.
congress_terms
Legislative terms
Joined to the bio information table via bioguide_id, contains the start, end, state, and position of each term, because each person can switch parties and chambers.
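The join between the two tables might look something like the sketch below. Only the table names and bioguide_id come from the descriptions above; every other column name (`last_name`, `term_start`, `term_end`, `chamber`, `party`) and the sample legislator are invented for illustration, so check the real columns before reusing the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical columns; only bioguide_id is named in the table description.
conn.executescript("""
CREATE TABLE congress_legislators (bioguide_id TEXT, last_name TEXT);
CREATE TABLE congress_terms (bioguide_id TEXT, term_start TEXT, term_end TEXT,
                             state TEXT, chamber TEXT, party TEXT);
INSERT INTO congress_legislators VALUES ('X000001', 'Doe');
INSERT INTO congress_terms VALUES
  ('X000001', '2009-01-06', '2011-01-03', 'ZZ', 'house', 'Independent'),
  ('X000001', '2011-01-05', '2017-01-03', 'ZZ', 'senate', 'Independent');
""")

# One row per term, with the biographical name attached to each.
query = """
SELECT l.last_name, t.chamber, t.term_start
FROM congress_legislators AS l
INNER JOIN congress_terms AS t
  ON t.bioguide_id = l.bioguide_id
ORDER BY t.term_start
"""
terms = conn.execute(query).fetchall()
for row in terms:
    print(row)
```

Because one legislator can have many terms, the joined result has more rows than the biography table; here the single made-up legislator appears twice, once per chamber.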
Los Angeles County Sheriff
Three tables from the L.A. Sheriff, all related to deputy-involved shooting incidents. Thanks to the Public Safety Open Data Portal for curating this data from Socrata.
lacs_shootings
Shooting Incidents involving Los Angeles Sheriff's Deputies
A spreadsheet of incident-level information in which a deputy shot at someone.
lacs_deputies
Deputy Details
Details specific to the deputy, including race/gender/age, years of service, and number of previous shootings.
lacs_people
People Details
Details specific to the person shot by an LA deputy, including demographics and past background.
tweets
Tweets from Election 2016
Tweets and account information from selected presidential candidates, staff, and political reporters. Data gathered courtesy sferik's handy t tool.
tw_users
Twitter users
Profile information for each user, including how they describe themselves, when they first started tweeting, and how many friends they have.
tweets
Tweets
The most recently available tweets as of early Monday, Nov. 7, right before the 2016 election. Contains the (lowercased) author, the timestamp of the tweet, and the text. Every tweet has a unique ID.
tw_hashtags
Mentions
If you want to mention another user in a tweet and have Twitter notify them, you have to refer to that user by their screen name with a prepended @ sign, e.g. "Hello @dancow".
This table records every time a user has been mentioned in a tweet in the given dataset.
tw_hashtags
Hashtags
Prepending a pound sign (#) to a word is how you signal you want to make a term happen. It's similar to mentions, but with just words.
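To make the @ and # conventions concrete, here is a small sketch of how mentions and hashtags could be pulled out of a tweet's text with regular expressions. This is not necessarily how the tables in this dataset were built; the `extract_entities` helper is hypothetical, the patterns are simplified (they would also match an @ inside an email address), and lowercasing the results is an assumption.

```python
import re

# Simplified patterns: a marker character followed by letters, digits, or underscores.
MENTION = re.compile(r"@(\w+)")
HASHTAG = re.compile(r"#(\w+)")

def extract_entities(text):
    """Return (mentions, hashtags) found in the text, lowercased, without the marker."""
    mentions = [m.lower() for m in MENTION.findall(text)]
    hashtags = [h.lower() for h in HASHTAG.findall(text)]
    return mentions, hashtags

print(extract_entities("Hello @dancow, happy #ElectionDay"))
```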