Data Cleaning and Transformation

A Scientist's Guide to R: Step 2.5 - dates & times

Contents: 1 TL;DR | 2 Introduction | 2.1 load packages | 3 date/time basics | 4 which day is it? | 5 reading dates | 6 time zones | 7 month names | 8 extracting datetime components | 9 days in a month | 10 custom date formats | 11 date calculations | 11.1 correcting excel date-to-numeric conversions | 12 planning a behavioural neuroscience experiment | 12.1 turning the planning script into a function | 13 Navigation | 14 Notes

TL;DR: Dates/times are the last type of data you’ll probably work with on a fairly regular basis.

A Scientist's Guide to R: Step 2.4 - forcats for factors

Contents: 1 TL;DR | 2 Introduction | 2.1 load packages | 2.2 import data | 3 Factor basics | 4 factors and data visualization | 5 factors and modelling | 6 Navigation | 7 Notes

TL;DR: Factors are one of the two remaining types of data you’ll encounter on a fairly regular basis. This post will show you how to use the forcats tidyverse package in R so you’ll know how to handle factors when you encounter them.

A Scientist's Guide to R: Step 2.3 - string manipulation and regex

Contents: 1 TL;DR | 2 Introduction | 3 Regular expressions | 4 Detecting pattern matches with str_detect(), str_which(), str_count(), and str_locate() | 5 Subsetting strings & data frames with str_subset(), str_sub(), str_match(), & str_extract() | 6 Combining and splitting strings using str_c(), str_flatten(), str_split(), & str_glue() | 7 Manage the lengths of strings using str_length(), str_pad(), str_trunc(), & str_trim() | 8 Mutating strings with str_sub(), str_replace(), str_replace_all(), str_remove(), & str_remove_all() | 9 You can modify the case of a string using str_to_lower(), str_to_upper(), str_to_title(), & str_to_sentence() | 10 Example application: Using str_detect() or str_which() to subset with data frames | 11 Navigation | 12 Notes

TL;DR: Being able to work with character strings is an essential skill in data analysis and science.

A Scientist's Guide to R: Step 2.2 - Joining Data with dplyr

Contents: 1 TL;DR | 2 Introduction | 2.1 setup | 3 left_join() | 4 right_join() | 5 full_join() | 6 inner_join() | 7 semi_join() | 8 anti_join() | 9 building data frames using bind_rows() or bind_cols() | 9.1 add_row() | 10 joining 3 or more data frames | 11 merge() | 12 Navigation | 13 Notes

TL;DR: Out in the real world you may often find yourself working with data from multiple sources. It will probably be stored in separate files and you’ll need to combine them before you can attempt to answer any of your research questions.
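
As a rough illustration of the kind of combining this post is about, here is a minimal sketch using two of the dplyr functions listed in its contents. The data frames `subjects` and `scores`, their columns, and their values are made up for this example and do not come from the post itself.

```r
library(dplyr)

# Hypothetical data: subject info and measurements kept in separate tables
subjects <- tibble(id = c(1, 2, 3), group = c("control", "treated", "treated"))
scores   <- tibble(id = c(1, 2, 4), score = c(10.5, 12.1, 9.8))

# left_join() keeps every row of `subjects`; ids with no match get NA for score
combined <- left_join(subjects, scores, by = "id")

# anti_join() returns the rows of `subjects` that have no match in `scores`
unmatched <- anti_join(subjects, scores, by = "id")
```

Here `combined` has one row per subject, and `unmatched` flags the subject whose score is missing, which can be a handy check before any analysis.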

A Scientist's Guide to R: Step 2.1 Data Transformation - part 2

Contents: 1 TL;DR | 2 Introduction | 2.1 “long” data, “wide” data, and “tidy” data | 3 pivot_longer() | 4 pivot_wider() | 5 unite() | 6 separate() | 7 Navigation | 8 Notes

TL;DR: In the 5th post of the Scientist’s Guide to R series we explore using the tidyr package to reshape data. You’ll learn all about splitting and combining columns and how to do wide to long or long to wide transformations.
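
To give a sense of what a wide-to-long and long-to-wide round trip looks like, here is a small sketch with tidyr. The `wide` data frame and its `week1`/`week2` columns are invented for illustration and are not taken from the post.

```r
library(tidyr)

# Hypothetical wide data: one row per subject, one column per week
wide <- data.frame(id = c(1, 2), week1 = c(5.1, 6.3), week2 = c(5.8, 6.0))

# Wide to long: stack the week columns into name/value pairs
long <- pivot_longer(wide, cols = c(week1, week2),
                     names_to = "week", values_to = "value")

# Long to wide: spread the week names back out into columns
wide_again <- pivot_wider(long, names_from = week, values_from = value)
```

The long form has one row per subject per week, which is usually the shape that plotting and modelling functions expect.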

A Scientist's Guide to R: Step 2.1. Data Transformation - Part 1

Contents: 1 TL;DR | 2 Introduction | 3 select() | 3.1 Renaming Columns with select() or rename() | 4 filter() | 4.1 Subset Rows using Indices with slice() | 5 mutate() | 5.1 Recoding or Creating Indicator Variables using if_else(), case_when(), or recode() | 6 summarise() | 7 group_by() | 8 Chaining Functions with the pipe operator (%>%) | 9 Navigation | 10 Notes

TL;DR: The 4th post in the Scientist’s Guide to R series introduces data transformation techniques useful for wrangling/tidying/cleaning data.

A Scientist's Guide to R: Step 2.0. Basic Operations & Data Structures

Contents: 1 TL;DR | 2 Introduction | 3 Basic Calculations | 4 Logical Operators | 5 Object Assignment | 6 Basic Summary Statistics | 7 Data Structures and Object Assignment | 7.1 Numeric and Character Vectors | 7.2 Logical Vectors | 7.3 Factors | 7.4 Matrices | 7.5 Dataframes | 7.6 Tibbles | 8 Random Numbers and Sampling | 9 Functions for Describing the Structural Information of Data Objects | 10 The Global Environment | 11 The Working Directory | 12 Projects | 13 Useful Keyboard Shortcuts (for RStudio users) | 13.