Colors in R
17 Oct 2011One of my favorite R packages that I use all the time is the RColorBrewer package. The package has been around for a while now and is written/maintained by Erich Neuwirth. The guts of the package are based on Cynthia Brewer’s very cool work on the use of color in cartography (check out the colorbrewer web site).
As a side note, I think the ability to manipulate colors in plots/graphs/maps is one of R’s many great strengths. My personal experience is that getting the right color scheme can make a difference in how data are perceived in a plot.
RColorBrewer basically provides one function, brewer.pal, that generates different types of color palettes. There are three types of palettes: sequential, diverging, and qualitative. Roughly speaking, sequential palettes are for continuous data where low is less important and high is more important, diverging palettes are for continuous data where both low and high are important (i.e. deviation from some reference point), and qualitative palettes are for categorical data where there is no logical order (i.e. male/female).
To use the brewer.pal function, it’s often useful to combine it with another R function, colorRampPalette. This function is built into R and is part of the grDevices package. It takes a palette of colors and interpolates between the colors to give you an entire spectrum. Think of a painter’s palette with 4 or 5 color blotches on it, and then think of the painter taking a brush and blending the colors together. That’s what colorRampPalette does. So brewer.pal gives you the colors and colorRampPalette mashes them together. It’s a happy combination.
So, how do we use these functions? My basic approach is to first set the palette depending on the type of data. Suppose we have continuous sequential data and we want the “Blue-Purple” palette
colors <- brewer.pal(4, "BuPu")
Here, I’ve taken 4 colors from the “BuPu” palette, so there are now 4 blotches on my palette. To interpolate these colors, I can call colorRampPalette, which actually returns a function.
pal <- colorRampPalette(colors)
Now, pal is a function that takes a positive integer argument and returns that number of colors from the palette. So for example
> pal(5) [1] "#EDF8FB" "#C1D7E9" "#9FB1D4" "#8B80BB" "#88419D"
I got 5 different colors from the palette, with their red, green, and blue values coded in hexadecimal. If I wanted 20 colors I could have called pal(20).
The pal function is useful in other functions like image or wireframe (in the lattice package). In both of those functions, the ‘col’ argument can be given a set of colors generated by the pal function. For example, you could call
data(volcano) image(volcano, col = pal(30))
and you would plot the ‘volcano’ data using 30 colors from the “BuPu” palette.
If you’re wondering what all the different palettes are and what colors are in them, here’s a handy reference:
Or you can just call
display.brewer.all()
There’s been a lot of interesting work done on colors in R and this is just scratching the surface. I’ll probably return to this subject in a future post.