SQL Counting and Filtering Lesson

The SQLite data

Here are the quick links to the downloadable SQLite databases that we’re using today. The .sqlite file extension is arbitrary, it’s just there so that when we look at the filename (from among many that we download), we have a reminder of what the file may be. The DB Browser can open any file you point it to.

However, SQLite databases are indeed a binary file format. Open it in Excel and it will throw an error. Open it in a plain text editor and you’ll get gibberish.

Census Most Frequent Surnames for 2000 and 2010

As practiced in this assignment – except with all of the data in one table, as opposed to separate spreadsheets. : CanViz: Popular surnames as recorded by the U.S. Census [2017-10-11]

Download: census_surnames.sqlite

Palo Alto Public Salaries, from 2011 to 2016

As practiced in this assignment – except with all of the data, 2011 through 2016, in one table, not just 2014 and 2016: : Palo Alto Pay Aggregated, 2014 vs 2016 [2017-10-16]

Download: palo-alto-salaries.sqlite

Reference

Previous lessons for Reading

I’ve been wanting to streamline and prettify these lessons, but they pretty much cover the concepts exactly as I would in any year. The lessons refer to different downloaded datasets – this is a minor detail that you can figure out, or if you can’t, you can download the past versions of data. It doesn’t matter for our purposes, as the SQL programming is the same.

Counting, sorting, and filtering with SQL, plus Answers

Assuming you’ve dowloaded this SQLite data:

Attempt to answer these questions: SQL Counting/Sorting/Filtering (Questions)

The answers for ssa-babynames

  • The first records (just name, sex, year) in the babynames database

    SELECT
        name, sex, year
    FROM
        babynames
    LIMIT 5;
    
  • Top 5 popular baby names of all time

    SELECT
        name, sex, count, year
    FROM
        babynames
    ORDER BY
        count DESC
    LIMIT 5;
    
  • Top 5 female babynames in 1991

    (no need to inclue sex, since it’s all female)

    SELECT
        name, count, year
    FROM
        babynames
    WHERE
        sex = 'F'
        AND year = 1991
    ORDER BY
        count DESC
    LIMIT 5;
    
  • How the female name “Jenny” compares to “Jennifer” in 2001

    SELECT
        name, count
    FROM
        babynames
    WHERE
        sex = 'F'
        AND year = '2001'
        AND
            (name = 'Jennifer' OR name = 'Jenny');
    
  • How the male names ‘Noah’, ‘Jordan’, and David compare, in 1980 and 2010

    Include the year column and sort it by both year and order to get a listing in which it’s easy to compare the years.

    SELECT
        name, year, count
    FROM
        babynames
    WHERE
        sex = 'M'
        AND
            year IN(1980, 2010)
        AND
            name IN('David', 'Noah', 'Jordan')
    ORDER BY
        year ASC,
        count DESC;
    
  • The 10 most popular names in 2009 that began with a vowel

    SELECT
        name, sex, count
    FROM
        babynames
    WHERE
        SUBSTR(name, 1, 1)
            IN ('A', 'E', 'I', 'O', 'U')
    
        AND year = 2009
    ORDER BY
        count DESC
    LIMIT 10;
    
  • The 10 most popular names in the year 2015 that begin with ‘Z’ and have at least 5 characters

    SELECT
        name, sex, year,
        LENGTH(name) AS name_length,
        count
    FROM
        babynames
    WHERE
        name_length >= 5
        AND SUBSTR(name, 1, 1) = 'Z'
        AND year = 2015
    ORDER BY
        count DESC
    LIMIT 10;