02 Nov 2011
In this post Alex Tabarrok argues that not enough people are obtaining “degrees that pay” and that college has been oversold. It struck me that the number of students studying Visual and Performing Arts has more than doubled since 1985. Yet for Math and Statistics there has been no increase at all! We need to do a better job of marketing. The great majority (if not all) of the people I know with Statistics degrees have found a job related to Statistics. With a Master’s, salary can be as high as $110K. So to those interested in Visual and Performing Arts who are good with numbers, I suggest you hedge your bets: do a double major and consider Statistics. My brother, a successful musician, majored in Math. He uses his math skills to supplement his income by playing poker with other musicians.
01 Nov 2011
My article on computing on the language was unexpectedly popular and so I wanted to quickly follow up on my own solution. Many of you got the answer, and in fact many got solutions that were quite a bit shorter than mine. Here’s how I did it:
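makeList <- function(...) {
    ## Capture the call unevaluated, e.g. list(x, y, z)
    args <- substitute(list(...))
    ## Drop the leading 'list' symbol and turn each argument into a string
    nms <- sapply(args[-1], deparse)
    ## Evaluate the arguments and attach the names
    vals <- list(...)
    names(vals) <- nms
    vals
}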
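As a quick check of the version above, here is a small sketch that runs it on the variables from the original challenge and on an expression argument. The values and the `y + 1` argument are assumptions chosen for illustration.

```r
# makeList as defined in the post
makeList <- function(...) {
    args <- substitute(list(...))
    nms <- sapply(args[-1], deparse)
    vals <- list(...)
    names(vals) <- nms
    vals
}

x <- 1
y <- 2
z <- "hello"

res <- makeList(x, y, z)
names(res)     # "x" "y" "z"

# Arguments need not be plain names: the deparsed expression becomes the name
res2 <- makeList(x, y + 1)
names(res2)    # "x" "y + 1"
```

Because the names come from deparsing whatever was typed, this is mostly convenient at the prompt, where the arguments are usually simple variable names.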
Baptiste pointed out that Frank Harrell has already implemented this function in his Hmisc package as the ‘llist’ function (thanks for the pointer!). I’ll just note that ‘llist’ actually does a bit more, because each element of the returned list is an object of class “labelled”.
The shortest solution was probably Tony Breyal’s version:
makeList <- function(...) {
    structure(list(...), names = names(data.frame(...)))
}
However, it’s worth noting that this function modifies an object’s name if the name is non-standard (i.e. if you’re using backticks, as in `object name`). That’s because the ‘data.frame’ function, by default (check.names = TRUE), passes column names through ‘make.names’, which rewrites any name that is not syntactically valid.
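To make the difference concrete, here is a small sketch comparing the two approaches on a backtick-quoted name. The function names `makeListDF` and `makeListDeparse` and the variables `my var` and `w` are made up for this illustration.

```r
# The data.frame-based version (renamed here for illustration)
makeListDF <- function(...) {
    structure(list(...), names = names(data.frame(...)))
}

# The substitute/deparse version (renamed here for illustration)
makeListDeparse <- function(...) {
    args <- substitute(list(...))
    nms <- sapply(args[-1], deparse)
    vals <- list(...)
    names(vals) <- nms
    vals
}

`my var` <- 1
w <- 2

# make.names() is what rewrites the non-standard name
make.names("my var")                     # "my.var"

viaDataFrame <- makeListDF(`my var`, w)
names(viaDataFrame)                      # "my.var" "w"

viaDeparse <- makeListDeparse(`my var`, w)
names(viaDeparse)                        # "my var" "w"
```

Note also that the data.frame-based version has to build a data frame from its arguments, so it is best suited to objects that can sit together as columns.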
Thanks to everyone for the responses! I’ll try to come up with another one soon.
01 Nov 2011
This fall I have been asked to write seven promotion letters. Writing these takes me at least 2 hours. If it’s someone I don’t know it takes me longer because I have to read some of their papers. Earlier this year, I wrote one for a Biology department that took me at least 6 hours. So how many are too many? Should I set a limit? Advice and opinions in the comments would be greatly appreciated.
29 Oct 2011
It seems like everywhere we look, data is being generated - from politics, to biology, to publishing, to social networks. There are also diverse new computational tools, like GPGPU and cloud computing, that expand the statistical toolbox. Statistical theory is more advanced than it’s ever been, with exciting work in a range of areas.
With all the excitement going on around statistics, there is also increasing diversity. It is increasingly hard to define “statistician” since the definition ranges from very mathematical to very applied. An obvious question is: what are the most critical skills needed by statisticians?
So just for fun, I made up my list of the top 5 most critical skills for a statistician by my own definition. They are by necessity very general (I only gave myself 5).
- The ability to manipulate/organize/work with data on computers - whether it is with Excel, R, SAS, or Stata, to be a statistician you have to be able to work with data.
- A knowledge of exploratory data analysis - how to make plots, how to discover patterns with visualizations, how to explore assumptions
- Scientific/contextual knowledge - at least enough to be able to abstract and formulate problems. This is what separates statisticians from mathematicians.
- Skills to distinguish true from false patterns - whether with p-values, posterior probabilities, meaningful summary statistics, cross-validation or any other means.
- The ability to communicate results to people without math skills - a key component of being a statistician is knowing how to explain math/plots/analyses.
What are your top 5? What order would you rank them in? Even though these are so general, I almost threw regression in there because of how often it pops up in various forms.
Related Posts: Rafa on graduate education and What is a Statistician? Roger on “Do we really need applied statistics journals?”
27 Oct 2011
And now for something a bit more esoteric…
I recently wrote a function to deal with a strange problem. Writing the function ended up being a fun challenge related to computing on the R language itself.
Here’s the problem: Write a function that takes any number of R objects as arguments and returns a list whose names are derived from the names of the R objects.
Perhaps an example provides a better description. Suppose the function is called ‘makeList’. Then
x <- 1
y <- 2
z <- "hello"
makeList(x, y, z)
returns
list(x = 1, y = 2, z = "hello")
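For contrast, here is a small sketch of why the problem is not trivial: a plain call to `list()` keeps the values but not the variable names, so getting the desired result by hand means typing each name twice. The variable names `plain` and `target` are made up for this illustration.

```r
x <- 1
y <- 2
z <- "hello"

# A plain list() call carries no names
plain <- list(x, y, z)
names(plain)     # NULL

# The desired result, written out by hand
target <- list(x = x, y = y, z = z)
names(target)    # "x" "y" "z"
```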
It originally seemed straightforward to me, but it turned out to be very much not straightforward.
Note that a function like this is probably most useful during interactive sessions, as opposed to programming.
I challenge you to take a whirl at writing the function, you know, in all that spare time you have. I’ll provide my solution in a future post.