Agenda


  • extract unique rows
  • rename columns
  • sample data
  • extract columns
  • slice rows
  • arrange rows
  • compare tables
  • extract/mutate data using predicate functions
  • count observations for different levels of a variable

Libraries
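
The code for this slide did not survive extraction. A minimal sketch, assuming the verbs listed in the agenda come from dplyr (the tibble printouts later in the deck are consistent with that, but the original library list is not shown):

```r
# Assumption: dplyr supplies the verbs used in this deck
# (distinct, rename, sample_n, pull, slice, ...).
library(dplyr)
```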


Data



## # A tibble: 1,000 x 7
##    referrer device bouncers n_visit n_pages duration purchase
##    <fct>    <fct>  <lgl>      <dbl>   <dbl>    <dbl> <lgl>   
##  1 google   laptop TRUE          10       1      693 FALSE   
##  2 yahoo    tablet TRUE           9       1      459 FALSE   
##  3 direct   laptop TRUE           0       1      996 FALSE   
##  4 bing     tablet FALSE          3      18      468 TRUE    
##  5 yahoo    mobile TRUE           9       1      955 FALSE   
##  6 yahoo    laptop FALSE          5       5      135 FALSE   
##  7 yahoo    mobile TRUE          10       1       75 FALSE   
##  8 direct   mobile TRUE          10       1      908 FALSE   
##  9 bing     mobile FALSE          3      19      209 FALSE   
## 10 google   mobile TRUE           6       1      208 FALSE   
## # ... with 990 more rows
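
How the data was loaded is not shown here. To experiment without the full 1,000-row file, the ten printed rows can be typed in by hand. This is only a sketch: the object name `ecom` is an assumption for illustration, and the factor levels are left in R's default alphabetical order.

```r
library(dplyr)

# First ten rows of the printout above, entered by hand.
ecom <- tibble(
  referrer = factor(c("google", "yahoo", "direct", "bing", "yahoo",
                      "yahoo", "yahoo", "direct", "bing", "google")),
  device   = factor(c("laptop", "tablet", "laptop", "tablet", "mobile",
                      "laptop", "mobile", "mobile", "mobile", "mobile")),
  bouncers = c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE),
  n_visit  = c(10, 9, 0, 3, 9, 5, 10, 10, 3, 6),
  n_pages  = c(1, 1, 1, 18, 1, 5, 1, 1, 19, 1),
  duration = c(693, 459, 996, 468, 955, 135, 75, 908, 209, 208),
  purchase = c(FALSE, FALSE, FALSE, TRUE, FALSE,
               FALSE, FALSE, FALSE, FALSE, FALSE)
)

ecom
```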

Data Dictionary


  • referrer: website or search engine that referred the visitor
  • device: device used to visit the website
  • bouncers: whether the visit bounced (the visitor left from the landing page)
  • n_visit: number of visits
  • n_pages: number of pages browsed during the visit
  • duration: time spent on the website (in seconds)
  • purchase: whether the visitor made a purchase

Distinct




Traffic Sources


## # A tibble: 5 x 1
##   referrer
##   <fct>   
## 1 google  
## 2 yahoo   
## 3 direct  
## 4 bing    
## 5 social

Device Types


## # A tibble: 3 x 1
##   device
##   <fct> 
## 1 laptop
## 2 tablet
## 3 mobile
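
The two printouts above are what `distinct()` returns when given a single column. A minimal sketch, assuming a hand-typed, cut-down copy of the first ten rows named `ecom` (the name and the reduced column set are illustrative only):

```r
library(dplyr)

# Hypothetical cut-down copy of the first ten rows of the data.
ecom <- tibble(
  referrer = factor(c("google", "yahoo", "direct", "bing", "yahoo",
                      "yahoo", "yahoo", "direct", "bing", "google")),
  device   = factor(c("laptop", "tablet", "laptop", "tablet", "mobile",
                      "laptop", "mobile", "mobile", "mobile", "mobile"))
)

# One row per unique referrer, in order of first appearance.
sources <- distinct(ecom, referrer)

# One row per unique device.
devices <- distinct(ecom, device)

sources
devices
```

With only ten rows, `sources` holds four referrers rather than five, because `social` does not appear among the first ten visits.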

Rename




Rename Columns


## # A tibble: 1,000 x 7
##    referrer device bouncers n_visit n_pages time_on_site purchase
##    <fct>    <fct>  <lgl>      <dbl>   <dbl>        <dbl> <lgl>   
##  1 google   laptop TRUE          10       1          693 FALSE   
##  2 yahoo    tablet TRUE           9       1          459 FALSE   
##  3 direct   laptop TRUE           0       1          996 FALSE   
##  4 bing     tablet FALSE          3      18          468 TRUE    
##  5 yahoo    mobile TRUE           9       1          955 FALSE   
##  6 yahoo    laptop FALSE          5       5          135 FALSE   
##  7 yahoo    mobile TRUE          10       1           75 FALSE   
##  8 direct   mobile TRUE          10       1          908 FALSE   
##  9 bing     mobile FALSE          3      19          209 FALSE   
## 10 google   mobile TRUE           6       1          208 FALSE   
## # ... with 990 more rows
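
In the printout above, the `duration` column appears as `time_on_site`. A minimal sketch of that change with `rename()`, which takes `new_name = old_name` pairs; the three-row `ecom` tibble is a hypothetical stand-in for the full data:

```r
library(dplyr)

# Hypothetical three-row stand-in for the full data.
ecom <- tibble(
  referrer = factor(c("google", "yahoo", "direct")),
  duration = c(693, 459, 996)
)

# rename() takes new_name = old_name; all other columns are left untouched.
renamed <- rename(ecom, time_on_site = duration)

names(renamed)
```

`rename()` returns a new tibble; `ecom` itself keeps the `duration` column unless the result is assigned back to it.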

Sampling




Sampling Data


## # A tibble: 700 x 7
##    referrer device bouncers n_visit n_pages duration purchase
##    <fct>    <fct>  <lgl>      <dbl>   <dbl>    <dbl> <lgl>   
##  1 bing     tablet FALSE          2       5      150 FALSE   
##  2 social   tablet TRUE           9       1      157 FALSE   
##  3 yahoo    tablet TRUE           6       1       67 FALSE   
##  4 direct   laptop FALSE          1      14      364 TRUE    
##  5 direct   mobile FALSE          2       9      243 FALSE   
##  6 direct   tablet FALSE         10       3       57 FALSE   
##  7 yahoo    tablet TRUE          10       1      668 FALSE   
##  8 yahoo    tablet FALSE          2      20      320 FALSE   
##  9 bing     tablet TRUE           0       1      845 FALSE   
## 10 yahoo    mobile FALSE          8       9      225 FALSE   
## # ... with 690 more rows
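
A 700-row result like the one above can be produced by asking for a fixed number of rows. The sketch below assumes `sample_n()` was used (the original call is not shown) and scales the idea down to a hypothetical ten-row `ecom`:

```r
library(dplyr)

# Hypothetical ten-row stand-in for the full data.
ecom <- tibble(
  n_visit  = c(10, 9, 0, 3, 9, 5, 10, 10, 3, 6),
  duration = c(693, 459, 996, 468, 955, 135, 75, 908, 209, 208)
)

set.seed(123)  # fix the seed so the sample is reproducible

# Draw exactly 7 rows without replacement (700 in the full data).
by_count <- sample_n(ecom, size = 7)

by_count
```

In dplyr 1.0.0 and later, `slice_sample(ecom, n = 7)` is the recommended equivalent.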

Sampling Data


## # A tibble: 700 x 7
##    referrer device bouncers n_visit n_pages duration purchase
##    <fct>    <fct>  <lgl>      <dbl>   <dbl>    <dbl> <lgl>   
##  1 bing     tablet TRUE           6       1      567 FALSE   
##  2 bing     tablet FALSE          6       9      198 FALSE   
##  3 bing     laptop TRUE           3       1      271 FALSE   
##  4 bing     mobile FALSE         10       1       26 FALSE   
##  5 bing     mobile TRUE           5       1      751 FALSE   
##  6 bing     tablet FALSE          1       8      144 FALSE   
##  7 yahoo    mobile TRUE          10       1      761 FALSE   
##  8 bing     laptop FALSE          8      10      260 TRUE    
##  9 direct   tablet FALSE          1       3       69 FALSE   
## 10 google   laptop TRUE           9       1      174 FALSE   
## # ... with 690 more rows
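
This second sample also has 700 rows, which is what a 70% sample of 1,000 rows would give. A minimal sketch, assuming `sample_frac()` was used here (the original call is not shown) and a hypothetical ten-row `ecom`:

```r
library(dplyr)

# Hypothetical ten-row stand-in for the full data.
ecom <- tibble(
  n_visit  = c(10, 9, 0, 3, 9, 5, 10, 10, 3, 6),
  duration = c(693, 459, 996, 468, 955, 135, 75, 908, 209, 208)
)

set.seed(123)  # fix the seed so the sample is reproducible

# Draw 70% of the rows without replacement.
by_fraction <- sample_frac(ecom, size = 0.7)

by_fraction
```

In dplyr 1.0.0 and later, `slice_sample(ecom, prop = 0.7)` is the recommended equivalent.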

Extract Columns




Sample Data

Extract Device Column


##  [1] mobile mobile mobile laptop mobile mobile laptop laptop tablet tablet
## Levels: laptop tablet mobile

Extract First Column


##  [1] yahoo  google bing   social google yahoo  social yahoo  google yahoo 
## Levels: bing direct social yahoo google

Extract Last Column


##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
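
The three outputs above are plain vectors rather than tibbles, which is what `pull()` returns. A minimal sketch of extracting a column by name and by position, assuming a hypothetical four-row `ecom`:

```r
library(dplyr)

# Hypothetical four-row stand-in for the sampled data.
ecom <- tibble(
  referrer = factor(c("google", "yahoo", "direct", "bing")),
  device   = factor(c("laptop", "tablet", "laptop", "tablet")),
  purchase = c(FALSE, FALSE, FALSE, TRUE)
)

by_name   <- pull(ecom, device)  # column selected by name
first_col <- pull(ecom, 1)       # positive index counts from the left
last_col  <- pull(ecom, -1)      # negative index counts from the right

by_name
first_col
last_col
```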

Extract Rows



Extract 10 rows starting from row 5


## # A tibble: 10 x 7
##    referrer device bouncers n_visit n_pages duration purchase
##    <fct>    <fct>  <lgl>      <dbl>   <dbl>    <dbl> <lgl>   
##  1 yahoo    mobile TRUE           9       1      955 FALSE   
##  2 yahoo    laptop FALSE          5       5      135 FALSE   
##  3 yahoo    mobile TRUE          10       1       75 FALSE   
##  4 direct   mobile TRUE          10       1      908 FALSE   
##  5 bing     mobile FALSE          3      19      209 FALSE   
##  6 google   mobile TRUE           6       1      208 FALSE   
##  7 direct   laptop TRUE           9       1      738 FALSE   
##  8 direct   tablet FALSE          6      12      132 FALSE   
##  9 direct   mobile FALSE          9      14      406 TRUE    
## 10 yahoo    tablet FALSE          5       8       80 FALSE
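
The printed rows begin with row 5 of the Data printout (yahoo, mobile, 955 seconds). A minimal sketch of selecting a contiguous block of rows with `slice()`, assuming a hypothetical ten-row `ecom`; on the full data the same pattern would be `slice(ecom, 5:14)`:

```r
library(dplyr)

# Hypothetical ten-row stand-in for the full data.
ecom <- tibble(
  n_visit  = c(10, 9, 0, 3, 9, 5, 10, 10, 3, 6),
  duration = c(693, 459, 996, 468, 955, 135, 75, 908, 209, 208)
)

# slice() selects rows by position; 5:8 keeps rows 5, 6, 7 and 8.
block <- slice(ecom, 5:8)

block
```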

Extract Last Row


## # A tibble: 1 x 7
##   referrer device bouncers n_visit n_pages duration purchase
##   <fct>    <fct>  <lgl>      <dbl>   <dbl>    <dbl> <lgl>   
## 1 google   mobile TRUE           9       1      269 FALSE
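
Inside `slice()`, `n()` evaluates to the number of rows, so it can be used to pick the last row without knowing the table size. A minimal sketch, assuming a hypothetical ten-row `ecom`:

```r
library(dplyr)

# Hypothetical ten-row stand-in for the full data.
ecom <- tibble(
  n_visit  = c(10, 9, 0, 3, 9, 5, 10, 10, 3, 6),
  duration = c(693, 459, 996, 468, 955, 135, 75, 908, 209, 208)
)

# n() is the row count, so this keeps only the final row.
last_row <- slice(ecom, n())

last_row
```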

Tally