R cheat sheet

Install R & RStudio locally.

Install package

To install a package: install.packages("DBI") To install a specif version of a package: install_version("lubridate", version = "1.6.0")

Export to CSV

write.csv(, file=".csv")

SELECT DISTINCT

unique(<data table>$<column>)

Find duplicates

temp <- as.data.frame(table(<data table>$<column>))
temp[temp$Freq >  1,]

Data frame

Number of rows: nrow(<dataframe>)
Top n rows: head(<dataframe>, n)
Filter: <dataframe>[which(column1=="value1" & column2 > value2), ]
<dataframe>[which(column1=="value1" & column2 > value2), c("col1", "col2")]

Unpivot

melt

SQL with sqldf

Easy syntax, but overhead to load the table in the database engine. library(sqldf) defaults to SQLite.

To use variables, add the prefix “fn$” (requires package gsubfn)
fn$sqldf("select * from iris where \"Sepal.Length\" > $minSL and species = '$species' limit $limit")
References:

Benchmarking

SET STATISTICS TIME ON system.time(<statement>)

Save & Load

Provide variable and file name (this example saves in the working directory) save(<data table>, file ="<file name>.RData")
load("<data table>.RData") #load into the eponymous variable