Agenda


  • create
    • simple bar plot
    • stacked bar plot
    • grouped bar plot
  • modify bar
    • direction
    • color
    • line color
    • width
    • labels
  • modify axis range
  • remove axes from the plot
  • specify the line type of the X axis
  • offset the Y axis
  • modify legend

Introduction


  • a bar plot represents data in rectangular bars
  • the length of each bar is proportional to the value it represents
  • bar plots can be either horizontal or vertical
  • in a vertical bar plot, the X axis represents the levels or categories of the variable
  • and the Y axis represents the frequency/count of each level (the axes swap roles in a horizontal bar plot)
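
As a minimal sketch of the points above, the same counts can be drawn vertically and horizontally. The `visits` vector and its values are hypothetical and are not taken from the data set used later.

```r
# Hypothetical counts for three categories (illustrative values only)
visits <- c(A = 10, B = 20, C = 5)

# Vertical bars: categories on the X axis, counts on the Y axis
barplot(visits)

# Horizontal bars: the two axes swap roles
barplot(visits, horiz = TRUE)
```

In both plots the longest bar belongs to category B, since bar length follows the value.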

Libraries


library(readr)
library(dplyr)
library(magrittr)

Data


ecom <- read_csv('https://raw.githubusercontent.com/rsquaredacademy/datasets/master/ecom.csv',
  col_types = list(col_factor(levels = c('Desktop', 'Mobile', 'Tablet')), 
  col_factor(levels = c(TRUE, FALSE)), col_factor(levels = c(TRUE, FALSE)), 
  col_factor(levels = c('Affiliates', 'Direct', 'Display', 'Organic', 'Paid', 'Referral', 'Social'))))
ecom
## # A tibble: 5,000 x 4
##     device bouncers purchase   referrer
##     <fctr>   <fctr>   <fctr>     <fctr>
##  1 Desktop    FALSE    FALSE Affiliates
##  2  Mobile    FALSE    FALSE Affiliates
##  3 Desktop     TRUE    FALSE    Organic
##  4 Desktop    FALSE    FALSE    Organic
##  5  Mobile     TRUE    FALSE     Direct
##  6 Desktop     TRUE    FALSE     Direct
##  7 Desktop    FALSE    FALSE   Referral
##  8  Tablet     TRUE    FALSE    Organic
##  9  Mobile     TRUE    FALSE     Social
## 10 Desktop     TRUE    FALSE    Organic
## # ... with 4,990 more rows

Data Dictionary


Below is the description of the data set:

  • device: device used to visit the website
  • bouncers: whether the visitor bounced (left the website from the landing page)
  • purchase: whether visitor purchased
  • referrer: referrer website/search engine

Using plot function


plot(ecom$device)
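
A bar plot appears here because `plot()` is a generic function: when given a factor, it tabulates the levels and draws the counts as bars. The sketch below illustrates this with a small hypothetical factor (`device`) standing in for `ecom$device`.

```r
# Hypothetical factor standing in for ecom$device (illustrative values only)
device <- factor(c('Desktop', 'Mobile', 'Desktop', 'Tablet', 'Desktop'),
                 levels = c('Desktop', 'Mobile', 'Tablet'))

# plot() on a factor counts each level and draws the counts as bars
plot(device)

# The equivalent explicit steps: tabulate first, then draw
counts <- table(device)
barplot(counts)
```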

Using barplot function


barplot(table(ecom$device))

Data


device_freq <- table(ecom$device)
device_freq
## 
## Desktop  Mobile  Tablet 
##    3335    1484     181

Horizontal Bar Plot


barplot(device_freq, horiz = TRUE)

Labels


barplot(device_freq, names.arg = c('Desktop', 'Mobile', 'Tablet'))

Color


barplot(device_freq, col = 'blue')

Data


device_referrer <- table(ecom$device, ecom$referrer)
device_referrer
##          
##           Affiliates Direct Display Organic Paid Referral Social
##   Desktop        100    463      89    1575   74      647    387
##   Mobile           8    241     182     832   60       17    144
##   Tablet           4     23      18     113    9        0     14

Stacked Bar Plot


barplot(device_referrer)
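
When `barplot()` is given a matrix or a two-way table, each column becomes one bar and the rows are stacked within it. The following sketch uses a small hypothetical matrix (`counts`, with made-up values) to show this, and how transposing changes what is stacked.

```r
# Hypothetical 2 x 3 matrix of counts: rows are devices, columns are channels
counts <- matrix(c(10, 5, 20, 8, 4, 2), nrow = 2,
                 dimnames = list(c('Desktop', 'Mobile'),
                                 c('Direct', 'Organic', 'Paid')))

# One bar per column (channel); the rows (devices) are stacked within each bar
barplot(counts, legend.text = TRUE)

# The total height of each stacked bar equals the column sum
bar_heights <- colSums(counts)
bar_heights

# Transposing gives one bar per device, stacked by channel
barplot(t(counts), legend.text = TRUE)
```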

Color


barplot(device_referrer, col = c('blue', 'red', 'green'))

Legend


barplot(device_referrer, col = c('blue', 'red', 'green'),
        main = 'Device Distribution by Referrer Type', legend.text = TRUE,
        xlab = 'Acquisition Channel', ylab = 'Visitors')
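
The legend can be adjusted further: `legend.text` also accepts a character vector of labels, and `args.legend` passes a list of extra arguments on to `legend()`. The sketch below is one possible setup using a hypothetical matrix (`counts`); the labels, position and title are illustrative choices.

```r
# Hypothetical 2 x 3 matrix of counts (illustrative values only)
counts <- matrix(c(10, 5, 20, 8, 4, 2), nrow = 2,
                 dimnames = list(c('Desktop', 'Mobile'),
                                 c('Direct', 'Organic', 'Paid')))

# Custom legend labels, placed top-left, without a box, with a title
barplot(counts, col = c('blue', 'red'),
        legend.text = c('Desktop visitors', 'Mobile visitors'),
        args.legend = list(x = 'topleft', bty = 'n', title = 'Device'))
```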

Grouped Bar Plot


barplot(device_referrer, col = c('blue', 'red', 'green'), beside = TRUE, 
        legend.text = TRUE, main = 'Device Distribution by Referrer Type',
        xlab = 'Acquisition Channel', ylab = 'Visitors')

Bar Width (Equal Width)


barplot(device_freq, width = 2)
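
A single `width` value has no visible effect unless `xlim` is also set, because the default axis range scales along with the bars. The sketch below, using hypothetical counts, shows this through the bar midpoints that `barplot()` returns invisibly.

```r
# Hypothetical counts (illustrative values only)
counts <- c(Desktop = 30, Mobile = 15, Tablet = 5)

# barplot() invisibly returns the midpoints of the bars
mid_default <- barplot(counts)             # default width of 1
mid_wide    <- barplot(counts, width = 2)  # every bar twice as wide

# The midpoints double, and so does the default axis range,
# so the two plots look the same
mid_default
mid_wide

# Fixing xlim makes the change in width visible
barplot(counts, width = 2, xlim = c(0, 12))
```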

Space Between Bars


barplot(device_freq, space = c(1, 1, 2))

Border Color


barplot(device_freq, border = 'blue')

Remove Axes


barplot(device_freq, axes = FALSE)

Axis Line Type


barplot(device_freq, axis.lty = 3)

Offset Y Axis


barplot(device_freq, offset = 10)

Axis Range


barplot(device_freq, ylim = c(0, 4000))
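
Rather than hard-coding the upper limit, the range can be derived from the data. This is a minimal sketch with hypothetical counts; the 20% headroom is an arbitrary choice.

```r
# Hypothetical counts (illustrative values only)
counts <- c(Desktop = 30, Mobile = 15, Tablet = 5)

# Leave 20% headroom above the tallest bar (arbitrary choice)
upper <- max(counts) * 1.2

barplot(counts, ylim = c(0, upper))
```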

Putting it all together…


barplot(device_freq, col = c('blue', 'red', 'green'),
        horiz = TRUE, width = c(1, 1, 2),
        names.arg = c('Desktop', 'Mobile', 'Tablet'),
        axis.lty = 2, offset = 10)

Title & Axis Labels


barplot(device_freq, col = c('blue', 'red', 'green'), axis.lty = 2,
        width = c(2, 1, 0.5), names.arg = c('Desktop', 'Mobile', 'Tablet'), offset = 2)
title(main = 'Distribution of Devices',
      ylab = 'Visitors', xlab = 'Device')