Tag Archives: Andrew Gelman

Andrew Gelman’s Statistical Modeling weblog

Well-known statistician Andrew Gelman and his colleagues write an informative weblog, Statistical Modeling, Causal Inference, and Social Science, on the quantitative side of the social sciences, from which ecologists and sustainability scientists can learn a lot. Below are some of the recent posts that I found interesting:

1) Suggestions on how to best learn R

2) A data visualization manifesto

At a statistical level, though, I think the details are very important, because they connect the data being graphed with the underlying questions being studied. For example, if you want to compare unemployment rates for different industries, you want them on the same scale. If you’re not interested in an alphabetical ordering, you don't want to put it on a graph. If you want to convey something beyond simply that big cars get worse gas mileage, you’ll want to invert the axes on your parallel coordinate plot. And so forth. When I make a graph, I typically need to go back and forth between the form of the plot, its details, and the questions I’m studying.

3) Testing effectiveness of different approaches to data visualization

Jeff Heer and Mike Bostock provided Mechanical Turk workers with a problem they had to answer using different types of charts. The lower error the workers got, the better the visualization. Here are some results from their paper Crowdsourcing Graphical Perception: Using Mechanical Turk to Assess Visualization Design

4) Novels as perturbations of our models of reality

I used to think that fiction is about making up stories, but in recent years I’ve decided that fiction is really more of a method of telling true stories. One thing fiction allows you to do is explore what-if scenarios. I recently read two books that made me think about this: The Counterlife by Philip Roth and Things We Didn’t See Coming by Steven Amsterdam. Both books are explicitly about contingencies and possibilities: Roth’s tells a sequence of related but contradictory stories involving his Philip Roth-like (of course) protagonist, and Amsterdam’s is based on an alternative present/future. (I picture Amsterdam’s book as being set in Australia, but maybe I’m just imagining this based on my knowledge that the book was written and published in that country.) I found both books fascinating, partly because of the characters’ voices but especially because they both seemed to exemplify George Box’s dictum that to understand a system you have to perturb it.

Andrew Gelman’s statistical lexicon

On his group’s weblog, influential Bayesian statistician Andrew Gelman proposes a statistical lexicon to make important methods and concepts related to statistics better known:

The Secret Weapon: Fitting a statistical model repeatedly on several different datasets and then displaying all these estimates together.

The Superplot: Line plot of estimates in an interaction, with circles showing group sizes and a line showing the regression of the aggregate averages.

The Folk Theorem: When you have computational problems, often there’s a problem with your model. …

Alabama First: Howard Wainer’s term for the common error of plotting in alphabetical order rather than based on some more informative variable.

The Taxonomy of Confusion: What to do when you’re stuck.

The Blessing of Dimensionality: It’s good to have more data, even if you label this additional information as “dimensions” rather than “data points.”

Scaffolding: Understanding your model by comparing it to related models.

Multiple Comparisons: Generally not an issue if you’re doing things right but can be a big problem if you sloppily model hierarchical structures non-hierarchically.

Taking a model too seriously: Really just another way of not taking it seriously at all.
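
To make the first entry in the lexicon concrete, here is a minimal sketch of one possible reading of the Secret Weapon: fit the same simple model to several datasets and show all the estimates side by side. Everything in it is an assumption for illustration, not Gelman's own code: the data are simulated, the "model" is just a least-squares slope, and the names ols_slope, secret_weapon and make_fake_year are hypothetical. A real application would plot the estimates (ideally with their uncertainties) rather than print text bars.

```python
import random
from statistics import mean


def ols_slope(xs, ys):
    """Least-squares slope of y on x for a single dataset."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


def secret_weapon(datasets):
    """Fit the same model to every dataset; return (label, slope) pairs in input order."""
    return [(label, ols_slope(xs, ys)) for label, (xs, ys) in datasets.items()]


def make_fake_year(true_slope, n=200, seed=0):
    """Simulate one dataset (hypothetical data, used only for this illustration)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [1.0 + true_slope * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


# Five simulated "years" whose true slope drifts upward from 0.5 to 0.9.
datasets = {
    year: make_fake_year(0.5 + 0.1 * i, seed=i)
    for i, year in enumerate(range(2000, 2005))
}

estimates = secret_weapon(datasets)

# Display all the estimates together so that a trend across datasets is visible.
for year, slope in estimates:
    bar = "#" * round(slope * 20)
    print(f"{year}  slope={slope:5.2f}  {bar}")
```

The point of displaying the fits together is that a pattern across datasets, such as the upward drift built into this simulation, is easy to miss when each fit is inspected on its own.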