data.table is a package for the R statistical computing environment. It extends the data frames of base R, improving in particular on their performance and syntax. A number of related tasks, including rolling and non-equi joins, are handled in a consistent, concise syntax like
DT[where, select|update|do, by].
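As a rough illustration of how the three parts of this syntax fit together, here is a minimal sketch. It assumes the package is installed. The table `DT` and its columns `region` and `units` are hypothetical and exist only for this example.

```r
library(data.table)

# Hypothetical example data: one row per order
DT <- data.table(
  region = c("north", "north", "south", "south", "south"),
  units  = c(10L, 20L, 5L, 15L, 30L)
)

# where:     keep only rows with units > 5
# select|do: compute the total of units
# by:        return one result row per region
totals <- DT[units > 5, .(total = sum(units)), by = region]

print(totals)
# region "north" has total 30, region "south" has total 45
```

Groups appear in the result in order of first appearance, so no separate sorting step is needed here.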
The package also includes a number of complementary functions, such as fread and fwrite for fast file input and output.
|Version||Notes||Release Date on CRAN|
|1.10.0||"With hindsight, the last release v1.9.8 should have been named v1.10.0"||2016-12-03|
Install the stable release from CRAN:
install.packages("data.table")
Or the development version from GitHub:
install.packages("data.table", type = "source", repos = "http://Rdatatable.github.io/data.table")
To revert from devel to CRAN, the current version must first be removed:
remove.packages("data.table")
install.packages("data.table")
Visit the website for full installation instructions and the latest version numbers.
Usually you will want to load the package and all of its functions with a line like
library(data.table)
If you only need one or two functions, you can refer to them like
data.table::fread
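The namespace-qualified form can be sketched as follows. The input string is made-up data for illustration. The sketch relies on `fread` treating a string that contains a newline as the data itself rather than as a file name.

```r
# Call one function by its full name, without attaching the package
DT <- data.table::fread("a,b\n1,2\n3,4\n")

# The result is a data.table with integer columns a and b
print(DT)
```

This form is handy in scripts or packages that use only a function or two and should not attach the whole namespace.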
The package's official wiki has some essential materials:
As a new user, you will want to check out the vignettes, FAQ and cheat sheet.
Before asking a question -- here on StackOverflow or anywhere else -- please read the support page.
For help on individual functions, the syntax is ?fread. If the package has not been loaded, use the full name like ?data.table::fread.
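The DT[where, select|update|do, by] syntax is used to work with columns of a data.table. The "where" part is the i argument and the "select|update|do" part is the j argument.
These two arguments are usually passed by position instead of by name.
A sequence of steps can be chained like DT[...][...].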
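The sketch below shows positional i and j, an update by reference, and a chain of steps. The table and its columns (`id`, `grp`, `value`) are hypothetical, and the thresholds are arbitrary.

```r
library(data.table)

DT <- data.table(
  id    = 1:6,
  grp   = c("a", "a", "b", "b", "c", "c"),
  value = c(2, 4, 6, 8, 10, 12)
)

# i and j are passed by position; by is passed by name
means <- DT[value > 2, .(mean_value = mean(value)), by = grp]

# update: := adds a column by reference, without copying DT
DT[, doubled := value * 2]

# chaining: aggregate, then filter and order the aggregated result
top <- DT[, .(total = sum(doubled)), by = grp][total > 20][order(-total)]

print(top)
# grp "c" comes first with total 44, then grp "b" with total 28
```

Each pair of brackets in the chain operates on the result of the previous one, so the filter `total > 20` refers to the aggregated column, not to a column of the original table.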
|Function or symbol||Notes|
|.()||in several arguments, replaces list()|
|.SD||in j, the current subset of the data, selected by the .SDcols argument|
|V1, V2, ...||default names for unnamed columns created in j|
|DT1[DT2, on, j]||join two tables|
|i.*||special prefix on DT2's columns after the join|
|by=.EACHI||special option available only with a join|
|DT1[!DT2, on, j]||anti-join two tables|
|DT1[DT2, on, roll, j]||join two tables, rolling on the last column in on=|
|melt||transform to long format; for multiple columns, use measure.vars = patterns(...)|
|dcast||transform to wide format|
|rbind||stack enumerated data.tables|
|rbindlist||stack a list of data.tables|
|split||split a data.table into a list|
|merge||another way of joining two tables|
|set||another way of adding or modifying columns|
|fintersect, fsetdiff, funion, fsetequal||set-theory operations with rows as elements|
|CJ||the Cartesian product of vectors|
|uniqueN||the number of distinct rows|
|rowidv(DT, cols)||row ID (1 to .N) within each group determined by cols|
|rleidv(DT, cols)||group ID (1 to .GRP), with groups determined by runs of cols|
|shift(DT, ...)||apply a shift operator to every column|
|setorder, setattr, setcolorder, setnames, setkey, setindex||modify attributes and order by reference|
|IDate and ITime||integer dates and times|
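Several of the operations described in the table can be sketched in one short session. All table and column names below (`sales`, `labels`, `prices`, `trades`, `wide`, and so on) are hypothetical. The sketch assumes a reasonably recent release, since `rowid` and `split` with `by=` are not available in very old versions.

```r
library(data.table)

sales  <- data.table(id = c(1L, 2L, 2L, 4L), amount = c(10, 20, 30, 40))
labels <- data.table(id = c(1L, 2L, 3L), name = c("x", "y", "z"))

# Join: for each row of labels, look up the matching rows of sales
joined <- sales[labels, on = "id"]          # 4 rows; id 3 has amount NA

# Join with by = .EACHI: evaluate j once per row of labels
per_label <- sales[labels, on = "id", .(total = sum(amount)), by = .EACHI]

# Anti-join: rows of sales whose id does not appear in labels
unmatched <- sales[!labels, on = "id"]      # only the row with id 4

# Rolling join: each trade takes the most recent earlier price
prices <- data.table(time = c(1L, 5L, 10L), price = c(100, 105, 110))
trades <- data.table(time = c(3L, 7L, 12L))
rolled <- prices[trades, on = "time", roll = TRUE]

# Reshape to long format and back to wide format
wide <- data.table(id = 1:2, a = c(1, 2), b = c(3, 4))
long <- melt(wide, id.vars = "id", variable.name = "measure", value.name = "val")
back <- dcast(long, id ~ measure, value.var = "val")

# Stack a list of tables, then split the result by a column
stacked <- rbindlist(list(wide, wide))
parts   <- split(stacked, by = "id")

# Row ID within each group, and ID of each run of equal values
runs <- data.table(g = c("a", "a", "b", "a"))
runs[, row_in_group := rowid(g)]            # 1, 2, 1, 3
runs[, run_id := rleid(g)]                  # 1, 1, 2, 3
```

In the join examples the table inside the brackets drives the result: `sales[labels, on = "id"]` returns one or more rows for every row of `labels`, including those with no match in `sales`.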