Agenda


In this module, we will learn to:

  • select
    • single column
    • multiple columns
    • distinct values in a column
  • limit the number of records returned
  • handle NULL values
  • and filter columns using the following operators
    • WHERE
    • AND, or & NOT
    • BETWEEN
    • IN
    • LIKE
  • aggregate data
  • order data
  • group data

What is SQL


  • stands for Structured Query Language
  • lets you access and manipulate databases

What can SQL do?


  • SQL can execute queries against a database
  • SQL can retrieve data from a database
  • SQL can insert records in a database
  • SQL can update records in a database
  • SQL can delete records from a database
  • SQL can create new databases
  • SQL can create new tables in a database

Data




## # A tibble: 1,000 x 6
##    device referrer purchase duration n_visit country       
##    <chr>  <chr>    <chr>       <dbl>   <int> <chr>         
##  1 laptop google   false         693      10 Czech Republic
##  2 tablet yahoo    false         459       9 Yemen         
##  3 laptop direct   false         996       0 Brazil        
##  4 tablet bing     true          468       3 China         
##  5 mobile yahoo    false         955       9 Poland        
##  6 laptop yahoo    false         135       5 South Africa  
##  7 mobile yahoo    false          75      10 Bangladesh    
##  8 mobile direct   false         908      10 Indonesia     
##  9 mobile bing     false         209       3 Netherlands   
## 10 mobile google   false         208       6 Czech Republic
## # ... with 990 more rows

Data Dictionary


  • id: row id
  • referrer: referrer website/search engine
  • device: device used to visit the website
  • duration: time spent on the website (in seconds)
  • country: country of origin
  • purchase: whether visitor purchased
  • n_visit: number of visits

Select Single Column


SELECT device FROM trade
Displaying records 1 - 6
device
laptop
tablet
laptop
tablet
mobile
laptop

Select Multiple Columns


Select the following fields from the trade table:

  • referrer
  • device
  • purchase

Select Multiple Columns


SELECT referrer, device, purchase  FROM trade
Displaying records 1 - 6
referrer device purchase
google laptop false
yahoo tablet false
direct laptop false
bing tablet true
yahoo mobile false
yahoo laptop false

Select All Columns


SELECT * FROM trade
Displaying records 1 - 6
device referrer purchase duration n_visit country
laptop google false 693 10 Czech Republic
tablet yahoo false 459 9 Yemen
laptop direct false 996 0 Brazil
tablet bing true 468 3 China
mobile yahoo false 955 9 Poland
laptop yahoo false 135 5 South Africa

Limit


SELECT * FROM trade limit 10
Displaying records 1 - 6
device referrer purchase duration n_visit country
laptop google false 693 10 Czech Republic
tablet yahoo false 459 9 Yemen
laptop direct false 996 0 Brazil
tablet bing true 468 3 China
mobile yahoo false 955 9 Poland
laptop yahoo false 135 5 South Africa

Distinct


SELECT distinct referrer FROM trade
5 records
referrer
google
yahoo
direct
bing
social

Filter


Now that we know how to select columns, let us focus on filtering data. In SQL, the WHERE keyword is used to extract only those records that fulfill a specified condition. Data filter based on both text and numeric values in a table. Below are a few comparison operators we can use:

  • =
  • <>
  • <
  • >
  • <=
  • >=

Filter


SELECT * FROM trade WHERE duration > 300
Displaying records 1 - 6
device referrer purchase duration n_visit country
laptop google false 693 10 Czech Republic
tablet yahoo false 459 9 Yemen
laptop direct false 996 0 Brazil
tablet bing true 468 3 China
mobile yahoo false 955 9 Poland
mobile direct false 908 10 Indonesia

Filter


SELECT * FROM trade WHERE device == 'mobile'
Displaying records 1 - 6
device referrer purchase duration n_visit country
mobile yahoo false 955 9 Poland
mobile yahoo false 75 10 Bangladesh
mobile direct false 908 10 Indonesia
mobile bing false 209 3 Netherlands
mobile google false 208 6 Czech Republic
mobile direct true 406 9 Ireland

And, Or & Not


The WHERE clause can be combined with other operators to filter data based on more than one condition or to create more complex conditions. Below are the operators:

  • AND - displays a record if all the conditions separated by AND is TRUE
  • OR - displays a record if any of the conditions separated by OR is TRUE
  • NOT - displays a record if the condition(s) is NOT TRUE

And, Or & Not


SELECT * FROM trade WHERE n_visit > 3 AND duration > 100
Displaying records 1 - 6
device referrer purchase duration n_visit country
laptop google false 693 10 Czech Republic
tablet yahoo false 459 9 Yemen
mobile yahoo false 955 9 Poland
laptop yahoo false 135 5 South Africa
mobile direct false 908 10 Indonesia
mobile google false 208 6 Czech Republic

And, Or & Not


In the next example, we will use both AND & OR. Our goal is to filter all rows from the trade table that follow the below conditions:

  • n_visit (visit count) is either equal to 3 or 5
  • device used to visit the website is either mobile or tablet

And, Or & Not


SELECT * FROM trade WHERE (n_visit == 5 OR n_visit == 3)  
  AND (device = 'mobile' OR device = 'tablet')
Displaying records 1 - 6
device referrer purchase duration n_visit country
tablet bing true 468 3 China
mobile bing false 209 3 Netherlands
tablet yahoo false 80 5 Philippines
tablet bing true 368 5 Peru
mobile google false 147 5 Brazil
mobile yahoo false 700 3 Canada

BETWEEN


The BETWEEN operator selects values within a given range and is inclusive: begin and end values are included. The values can be numbers, text, or dates. In the following example, we filter rows from the trade table where the visit count is between 1 and 3, and the device used to visit the website is mobile.

BETWEEN


SELECT * FROM trade WHERE n_visit BETWEEN 1 AND 3 AND device = 'mobile'
Displaying records 1 - 6
device referrer purchase duration n_visit country
mobile bing false 209 3 Netherlands
mobile direct false 501 2 El Salvador
mobile bing false 25 1 Ireland
mobile yahoo false 700 3 Canada
mobile direct false 234 1 Indonesia
mobile social false 44 1 France

IN


The IN operator allows us to specify multiple values in a WHERE clause. It is a shorthand for multiple OR conditions. In the below example, we filter rows from the trade table where the visit count is either 2 or 4 or 6 or 8 or 10. Instead of using multiple OR conditions, we use the IN operator.

IN


SELECT * FROM trade WHERE n_visit