Agenda
In this module, we will learn to:
- select
- single column
- multiple columns
- distinct values in a column
- limit the number of records returned
- handle NULL values
- and filter columns using the following operators
- WHERE
- AND, or & NOT
- BETWEEN
- IN
- LIKE
- aggregate data
- order data
- group data
What is SQL
- stands for Structured Query Language
- lets you access and manipulate databases
What can SQL do?
- SQL can execute queries against a database
- SQL can retrieve data from a database
- SQL can insert records in a database
- SQL can update records in a database
- SQL can delete records from a database
- SQL can create new databases
- SQL can create new tables in a database
Data
## # A tibble: 1,000 x 6
## device referrer purchase duration n_visit country
## <chr> <chr> <chr> <dbl> <int> <chr>
## 1 laptop google false 693 10 Czech Republic
## 2 tablet yahoo false 459 9 Yemen
## 3 laptop direct false 996 0 Brazil
## 4 tablet bing true 468 3 China
## 5 mobile yahoo false 955 9 Poland
## 6 laptop yahoo false 135 5 South Africa
## 7 mobile yahoo false 75 10 Bangladesh
## 8 mobile direct false 908 10 Indonesia
## 9 mobile bing false 209 3 Netherlands
## 10 mobile google false 208 6 Czech Republic
## # ... with 990 more rows
Data Dictionary
- id: row id
- referrer: referrer website/search engine
- device: device used to visit the website
- duration: time spent on the website (in seconds)
- country: country of origin
- purchase: whether visitor purchased
- n_visit: number of visits
Select Single Column
Displaying records 1 - 6
laptop |
tablet |
laptop |
tablet |
mobile |
laptop |
Select Multiple Columns
Select the following fields from the trade
table:
Select Multiple Columns
SELECT referrer, device, purchase FROM trade
Displaying records 1 - 6
google |
laptop |
false |
yahoo |
tablet |
false |
direct |
laptop |
false |
bing |
tablet |
true |
yahoo |
mobile |
false |
yahoo |
laptop |
false |
Select All Columns
Displaying records 1 - 6
laptop |
google |
false |
693 |
10 |
Czech Republic |
tablet |
yahoo |
false |
459 |
9 |
Yemen |
laptop |
direct |
false |
996 |
0 |
Brazil |
tablet |
bing |
true |
468 |
3 |
China |
mobile |
yahoo |
false |
955 |
9 |
Poland |
laptop |
yahoo |
false |
135 |
5 |
South Africa |
Limit
SELECT * FROM trade limit 10
Displaying records 1 - 6
laptop |
google |
false |
693 |
10 |
Czech Republic |
tablet |
yahoo |
false |
459 |
9 |
Yemen |
laptop |
direct |
false |
996 |
0 |
Brazil |
tablet |
bing |
true |
468 |
3 |
China |
mobile |
yahoo |
false |
955 |
9 |
Poland |
laptop |
yahoo |
false |
135 |
5 |
South Africa |
Distinct
SELECT distinct referrer FROM trade
5 records
google |
yahoo |
direct |
bing |
social |
Filter
Now that we know how to select columns, let us focus on filtering data. In SQL, the WHERE keyword is used to extract only those records that fulfill a specified condition. Data filter based on both text and numeric values in a table. Below are a few comparison operators we can use:
Filter
SELECT * FROM trade WHERE duration > 300
Displaying records 1 - 6
laptop |
google |
false |
693 |
10 |
Czech Republic |
tablet |
yahoo |
false |
459 |
9 |
Yemen |
laptop |
direct |
false |
996 |
0 |
Brazil |
tablet |
bing |
true |
468 |
3 |
China |
mobile |
yahoo |
false |
955 |
9 |
Poland |
mobile |
direct |
false |
908 |
10 |
Indonesia |
Filter
SELECT * FROM trade WHERE device == 'mobile'
Displaying records 1 - 6
mobile |
yahoo |
false |
955 |
9 |
Poland |
mobile |
yahoo |
false |
75 |
10 |
Bangladesh |
mobile |
direct |
false |
908 |
10 |
Indonesia |
mobile |
bing |
false |
209 |
3 |
Netherlands |
mobile |
google |
false |
208 |
6 |
Czech Republic |
mobile |
direct |
true |
406 |
9 |
Ireland |
And, Or & Not
The WHERE clause can be combined with other operators to filter data based on more than one condition or to create more complex conditions. Below are the operators:
- AND - displays a record if all the conditions separated by AND is TRUE
- OR - displays a record if any of the conditions separated by OR is TRUE
- NOT - displays a record if the condition(s) is NOT TRUE
And, Or & Not
SELECT * FROM trade WHERE n_visit > 3 AND duration > 100
Displaying records 1 - 6
laptop |
google |
false |
693 |
10 |
Czech Republic |
tablet |
yahoo |
false |
459 |
9 |
Yemen |
mobile |
yahoo |
false |
955 |
9 |
Poland |
laptop |
yahoo |
false |
135 |
5 |
South Africa |
mobile |
direct |
false |
908 |
10 |
Indonesia |
mobile |
google |
false |
208 |
6 |
Czech Republic |
And, Or & Not
In the next example, we will use both AND & OR. Our goal is to filter all rows from the trade
table that follow the below conditions:
n_visit
(visit count) is either equal to 3 or 5
device
used to visit the website is either mobile
or tablet
And, Or & Not
SELECT * FROM trade WHERE (n_visit == 5 OR n_visit == 3)
AND (device = 'mobile' OR device = 'tablet')
Displaying records 1 - 6
tablet |
bing |
true |
468 |
3 |
China |
mobile |
bing |
false |
209 |
3 |
Netherlands |
tablet |
yahoo |
false |
80 |
5 |
Philippines |
tablet |
bing |
true |
368 |
5 |
Peru |
mobile |
google |
false |
147 |
5 |
Brazil |
mobile |
yahoo |
false |
700 |
3 |
Canada |
BETWEEN
The BETWEEN operator selects values within a given range and is inclusive: begin and end values are included. The values can be numbers, text, or dates. In the following example, we filter rows from the trade
table where the visit count is between 1 and 3, and the device
used to visit the website is mobile
.
BETWEEN
SELECT * FROM trade WHERE n_visit BETWEEN 1 AND 3 AND device = 'mobile'
Displaying records 1 - 6
mobile |
bing |
false |
209 |
3 |
Netherlands |
mobile |
direct |
false |
501 |
2 |
El Salvador |
mobile |
bing |
false |
25 |
1 |
Ireland |
mobile |
yahoo |
false |
700 |
3 |
Canada |
mobile |
direct |
false |
234 |
1 |
Indonesia |
mobile |
social |
false |
44 |
1 |
France |
IN
The IN operator allows us to specify multiple values in a WHERE clause. It is a shorthand for multiple OR conditions. In the below example, we filter rows from the trade
table where the visit count is either 2 or 4 or 6 or 8 or 10. Instead of using multiple OR conditions, we use the IN operator.
IN
SELECT * FROM trade WHERE n_visit