There are cases in which we want to create a data table containing every possible combination of values drawn from several sets (also called the Cartesian product).

Example

We want to list shirts of different colors, sizes and fabrics. Each shirt can be:

  • one of 5 colors: black, white, red, green, blue

  • one of 4 sizes: small, medium, large, extra large

  • one of 3 fabrics: cotton, linen, wool

To represent shirts with all combinations of these properties, we need to cross-join them.

The CJ function of the data.table package cross-joins given vectors into a data table. In this particular case, we can use the following three vectors:

color <- c("black","white","red","green","blue")
size <- c("small","medium","large","extra large")
fabric <- c("cotton","linen","wool")

The resulting table should have the following structure:

  • Number of columns: number of vectors (in this case 3)

  • Number of rows: product of the vector lengths (in this case 5 · 4 · 3 = 60)

# Load the data.table package
library(data.table)

# Create all possible combinations of color, size, and fabric
CJ(color,size,fabric)
##     color        size fabric
##  1: black extra large cotton
##  2: black extra large  linen
##  3: black extra large   wool
##  4: black       large cotton
##  5: black       large  linen
##  6: black       large   wool
##  7: black      medium cotton
##  8: black      medium  linen
##  9: black      medium   wool
## 10: black       small cotton
## 11: black       small  linen
## 12: black       small   wool
## 13:  blue extra large cotton
## 14:  blue extra large  linen
## 15:  blue extra large   wool
## 16:  blue       large cotton
## 17:  blue       large  linen
## 18:  blue       large   wool
## 19:  blue      medium cotton
## 20:  blue      medium  linen
## 21:  blue      medium   wool
## 22:  blue       small cotton
## 23:  blue       small  linen
## 24:  blue       small   wool
## 25: green extra large cotton
## 26: green extra large  linen
## 27: green extra large   wool
## 28: green       large cotton
## 29: green       large  linen
## 30: green       large   wool
## 31: green      medium cotton
## 32: green      medium  linen
## 33: green      medium   wool
## 34: green       small cotton
## 35: green       small  linen
## 36: green       small   wool
## 37:   red extra large cotton
## 38:   red extra large  linen
## 39:   red extra large   wool
## 40:   red       large cotton
## 41:   red       large  linen
## 42:   red       large   wool
## 43:   red      medium cotton
## 44:   red      medium  linen
## 45:   red      medium   wool
## 46:   red       small cotton
## 47:   red       small  linen
## 48:   red       small   wool
## 49: white extra large cotton
## 50: white extra large  linen
## 51: white extra large   wool
## 52: white       large cotton
## 53: white       large  linen
## 54: white       large   wool
## 55: white      medium cotton
## 56: white      medium  linen
## 57: white      medium   wool
## 58: white       small cotton
## 59: white       small  linen
## 60: white       small   wool
##     color        size fabric
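
Note that the rows above come out in alphabetical order rather than in the order in which the vectors were written. This is because CJ sorts each input by default (its sorted argument is TRUE). As a minimal sketch, assuming the input order should be kept instead, sorted = FALSE can be passed; the variable name unsorted is only illustrative:

```r
library(data.table)

color <- c("black", "white", "red", "green", "blue")
size <- c("small", "medium", "large", "extra large")
fabric <- c("cotton", "linen", "wool")

# Keep the vectors in the order they were written instead of sorting them
unsorted <- CJ(color, size, fabric, sorted = FALSE)

# The last vector still varies fastest, so the first rows are
# black/small/cotton, black/small/linen, black/small/wool
head(unsorted, 3)

# Dimensions match the expected structure: 60 rows, 3 columns
dim(unsorted)
```

The set of combinations is the same either way; only the row order differs.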

There is no limit to the number and size of vectors we can use, except the computer’s memory: the number of rows is the product of the vector lengths, so the table grows quickly.
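
To get a feel for how fast the result grows, the row count can be computed before calling CJ. The following is a rough sketch using only base R; the helper name cj_row_count and the example list big are hypothetical and not part of data.table:

```r
# Hypothetical helper: number of rows CJ would produce for a list of vectors
cj_row_count <- function(vectors) {
  # prod() returns a double, so large products do not overflow integer range
  prod(lengths(vectors))
}

shirts <- list(
  color = c("black", "white", "red", "green", "blue"),
  size = c("small", "medium", "large", "extra large"),
  fabric = c("cotton", "linen", "wool")
)
cj_row_count(shirts)  # 60

# Three vectors of 1,000 values each would already need a billion rows
big <- list(a = seq_len(1000), b = seq_len(1000), c = seq_len(1000))
cj_row_count(big)  # 1e+09
```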

Sometimes the number of vectors varies, or it is simply too cumbersome to write them out one by one. If they are stored in a list, we can pass that list to CJ using do.call:

# Create a list containing the vectors
properties <- list(color=c("black","white","red","green","blue"),
                   size=c("small","medium","large","extra large"),
                   fabric=c("cotton","linen","wool"))

# Give the list as arguments to CJ, using do.call
do.call(what=CJ,args=properties)
##     color        size fabric
##  1: black extra large cotton
##  2: black extra large  linen
##  3: black extra large   wool
##  4: black       large cotton
##  5: black       large  linen
##  6: black       large   wool
##  7: black      medium cotton
##  8: black      medium  linen
##  9: black      medium   wool
## 10: black       small cotton
## 11: black       small  linen
## 12: black       small   wool
## 13:  blue extra large cotton
## 14:  blue extra large  linen
## 15:  blue extra large   wool
## 16:  blue       large cotton
## 17:  blue       large  linen
## 18:  blue       large   wool
## 19:  blue      medium cotton
## 20:  blue      medium  linen
## 21:  blue      medium   wool
## 22:  blue       small cotton
## 23:  blue       small  linen
## 24:  blue       small   wool
## 25: green extra large cotton
## 26: green extra large  linen
## 27: green extra large   wool
## 28: green       large cotton
## 29: green       large  linen
## 30: green       large   wool
## 31: green      medium cotton
## 32: green      medium  linen
## 33: green      medium   wool
## 34: green       small cotton
## 35: green       small  linen
## 36: green       small   wool
## 37:   red extra large cotton
## 38:   red extra large  linen
## 39:   red extra large   wool
## 40:   red       large cotton
## 41:   red       large  linen
## 42:   red       large   wool
## 43:   red      medium cotton
## 44:   red      medium  linen
## 45:   red      medium   wool
## 46:   red       small cotton
## 47:   red       small  linen
## 48:   red       small   wool
## 49: white extra large cotton
## 50: white extra large  linen
## 51: white extra large   wool
## 52: white       large cotton
## 53: white       large  linen
## 54: white       large   wool
## 55: white      medium cotton
## 56: white      medium  linen
## 57: white      medium   wool
## 58: white       small cotton
## 59: white       small  linen
## 60: white       small   wool
##     color        size fabric
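
Since the arguments are just a list, that list can be subset or extended before the call. The sketch below rebuilds the properties list so that it runs on its own. It builds a table from two of the three properties, then passes sorted = FALSE alongside the vectors. The names color_size and in_order are illustrative only:

```r
library(data.table)

properties <- list(color = c("black", "white", "red", "green", "blue"),
                   size = c("small", "medium", "large", "extra large"),
                   fabric = c("cotton", "linen", "wool"))

# Use only some of the properties: 5 colors x 4 sizes = 20 rows
color_size <- do.call(CJ, properties[c("color", "size")])
nrow(color_size)

# Extra arguments to CJ can be appended to the list of vectors
in_order <- do.call(CJ, c(properties, list(sorted = FALSE)))
head(in_order, 3)
```

The names of the list elements become the column names of the result, which is why a named list is convenient here.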

See data.table’s help page for CJ (?CJ) for more details.