SQLite Data Starter Packs
This is a collection of public datasets packaged as SQLite databases for SQL practice. You don’t have to worry about cleaning or importing the data: just download the SQLite database files and query them from your favorite SQLite client.
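If you don't have a dedicated SQLite client at hand, Python's built-in `sqlite3` module works as one. The following is a minimal sketch, not a prescribed workflow: the helper names `list_tables` and `preview` are made up for illustration, and you would pass in a connection opened on whichever file you downloaded.

```python
import sqlite3


def list_tables(conn):
    """Return the table names in an open SQLite connection, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def preview(conn, table, limit=5):
    """Return the first few rows of a table as a list of tuples."""
    # Table names cannot be bound as query parameters, so only accept
    # names that actually exist in the database before building the SQL.
    if table not in list_tables(conn):
        raise ValueError(f"no such table: {table}")
    return conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (limit,)).fetchall()
```

A typical session would open the downloaded file with `sqlite3.connect(path_to_file)`, call `list_tables(conn)` to see what the database contains, and then call `preview(conn, name)` on any table of interest before writing real queries.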
Quick links to the datasets
|Dataset||Size||Tables|
|SimpleFolks for Simple SQL||0.01 MB||3|
|American Community Survey 1-Year Data for 2015||0.25 MB||3|
|M3.0+ Earthquakes in the Contiguous U.S., 1995 through 2015||52.3 MB||1|
|S.F. Food Inspections (LIVES)||16.4 MB||1|
|Census 2000 Surnames||23.3 MB||1|
|Dallas Police Officer-Involved Shootings||0.4 MB||3|
|Florida Death Row Roster||0.1 MB||1|
|Salaries of City Officials from the California Peninsula||65.9 MB||1|
|SFPD Incidents, 2012 through 2015||98.3 MB||1|
|San Francisco Restaurant Health Inspections||9.8 MB||3|
|Social Security Administration Baby Names, 1980 through 2015||81.0 MB||1|
|Social Security Administration Baby Names 2015 for All States||11.4 MB||1|
|California School SAT Performance and Poverty Data||14.8 MB||3|
|Gendered Baby Names 2015||19.6 MB||1|
|Gender assessment of Hollywood Reporter's 2016 Power 100 Rankings||1.6 MB||2|
About the datasets
SimpleFolks for Simple SQL
To simplify learning new SQL syntax, this is a very simple, very small database of people who go by their first names only and live in a world in which they own pets and homes.
American Community Survey 1-Year Data for 2015
Selected demographic data, including population by ethnicity and wealth, for U.S. states, places, and congressional districts. Note that the places table doesn’t have complete data.
M3.0+ Earthquakes in the Contiguous U.S., 1995 through 2015
Earthquakes within the contiguous United States, from 1995 through 2015, that have a magnitude of at least 3.0 as measured by the U.S. Geological Survey.
S.F. Food Inspections (LIVES)
A single-table, flattened version of the health department’s food inspection data. This is what Yelp uses to tie health scores to business listings.
Census 2000 Surnames
The most popular last names and their racial breakdowns as catalogued by the U.S. Census in 2000.
Dallas Police Officer-Involved Shootings
Officer-involved shootings as disclosed by the Dallas Police Department. Includes separate tables for officer and subject/suspect information.
Florida Death Row Roster
Inmates currently on Florida’s death row, with basic biographical information.
Salaries of City Officials from the California Peninsula
Anonymized salary and benefits information for city officials in San Mateo, Santa Clara, and San Francisco counties, as released by the California state controller.
SFPD Incidents, 2012 through 2015
Incidents reported to the San Francisco Police Department from 2012 through 2015.
San Francisco Restaurant Health Inspections
The San Francisco Dept. of Public Health’s database of eateries, inspections of those eateries, and violations found during the inspections.
California School SAT Performance and Poverty Data
A database containing geospatial information, as well as SAT average scores and Free-or-Reduced-Price Meal eligibility data, for California schools.
Gendered Baby Names 2015
This dataset is a transformation of the data in the 2015 Social Security baby names dataset. Instead of having separate M and F entries for a name such as Leslie, this dataset has one entry for every name, with two additional fields that specify what that name’s majority gender is (and by how much).
This dataset is useful for joining against other tables that contain names to estimate gender. It includes name data for each state and nationwide.
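As a rough illustration of that kind of join, here is a sketch that runs against a throwaway in-memory database. Everything in it is an assumption made for the example: the table names `gendered_names` and `people`, the column names `name`, `gender`, `ratio`, and `first_name`, and all of the rows, whose values are invented. Check the real database's schema before adapting it.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical tables and invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Stand-in for the gendered names table (column names are assumed).
        CREATE TABLE gendered_names (name TEXT, gender TEXT, ratio INTEGER);
        INSERT INTO gendered_names VALUES ('Leslie', 'F', 80);
        INSERT INTO gendered_names VALUES ('Jordan', 'M', 85);

        -- Stand-in for a table of your own that has a first-name column.
        CREATE TABLE people (first_name TEXT);
        INSERT INTO people VALUES ('Leslie');
        INSERT INTO people VALUES ('Jordan');
        INSERT INTO people VALUES ('Xyzzy');
        """
    )
    return conn


def estimate_genders(conn):
    """Attach the majority gender to each person; unmatched names get NULLs."""
    return conn.execute(
        """
        SELECT p.first_name, g.gender, g.ratio
        FROM people AS p
        LEFT JOIN gendered_names AS g ON g.name = p.first_name
        ORDER BY p.first_name
        """
    ).fetchall()
```

The `LEFT JOIN` keeps people whose names have no match, which makes the gaps visible instead of silently dropping rows. Because the dataset covers each state as well as the nation, a query against the real data would probably also need to restrict the names table to a single geography so that each person matches at most one row.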
Gender assessment of Hollywood Reporter's 2016 Power 100 Rankings
This database contains a copy of the Gendered Baby Names 2015 dataset (just nationwide, not each state), as well as a hand-copied spreadsheet I made of the Hollywood Reporter’s 100 Most Powerful People in Entertainment feature. Useful for an exercise in learning real-world messy JOINs.
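One reason such a JOIN gets messy is that a hand-copied list stores full names, while the names table is keyed on first names. The sketch below shows one possible way to bridge that gap inside SQLite. The tables `rankings` and `gendered_names`, their columns, and every row are hypothetical placeholders rather than the contents of the actual database.

```python
import sqlite3

# Pull the first word out of a full name, ignoring surrounding whitespace.
# Appending a space guarantees instr() finds one even for single-word names.
FIRST_NAME_SQL = (
    "substr(trim(r.full_name), 1, instr(trim(r.full_name) || ' ', ' ') - 1)"
)


def build_demo_db():
    """Create an in-memory database with hypothetical tables and invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE gendered_names (name TEXT, gender TEXT);
        INSERT INTO gendered_names VALUES ('Leslie', 'F');
        INSERT INTO gendered_names VALUES ('Jordan', 'M');

        -- Deliberately untidy entries: stray spaces, odd casing, a single name.
        CREATE TABLE rankings (rank INTEGER, full_name TEXT);
        INSERT INTO rankings VALUES (1, '  leslie Placeholder');
        INSERT INTO rankings VALUES (2, 'Jordan Q. Sample');
        INSERT INTO rankings VALUES (3, 'Mononym');
        """
    )
    return conn


def genders_by_rank(conn):
    """Return (rank, gender) pairs, matching on a case-insensitive first name."""
    query = f"""
        SELECT r.rank, g.gender
        FROM rankings AS r
        LEFT JOIN gendered_names AS g
          ON lower(g.name) = lower({FIRST_NAME_SQL})
        ORDER BY r.rank
    """
    return conn.execute(query).fetchall()
```

This handles only the simplest kinds of mess. Entries that list several people, use initials, or put a title before the name would need extra cleanup, and rows that still come back with a NULL gender are the ones worth inspecting by hand.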