Agenda


  • filter rows
  • select variables/columns
  • sort/arrange data
  • generate new variables
  • create grouped summaries

Introduction


According to a survey by CrowdFlower, data scientists spend most of their time cleaning and manipulating data rather than mining or modeling it for insights. Tools that make data manipulation faster and easier are therefore important. In today’s post, we introduce dplyr, an R package that provides a grammar of data manipulation.