Just how reliable is crowdsourced knowledge? To be honest, I’m not sure what to think. On one hand, I want to believe that systems founded on democracy are inherently good. On the other hand, part of me knows that people can be dumb.
Then I wondered: how do RateBeer and BeerAdvocate compare? It turns out this question has been asked before. Almost as soon as I started looking, I found Top Fermented’s 2009 post on the subject. Mr. Myers took the trouble of finding and analyzing 50-odd beers from both sites, with some interesting results.
But I think I can do better. How about 800 beers? It took me about four hours of fiddling to write a small program to gather the data, but I’m glad I did. The data is pretty interesting, and it has obviously changed since Erik’s post. I just wish I could have gotten more.
While I did set up my graph in a similar way to his, there are some differences to note. First, mine plots beer rating (on a scale of 0–100) against relative rank (1st–840th). He appears to use only rank for his graph, which limits what can be seen a little. You can think of each spot on the X axis as one beer, with two dots showing how it rates on each site. Really, the graph is just the same data twice: one panel is sorted by RateBeer rank (1st, 2nd, …) while the other is sorted by BeerAdvocate rank.
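To make that arrangement concrete, here’s a minimal sketch of how the two panels are built; the beer names and scores are made up, not the scraped data.

```python
# Toy sample: (beer, RateBeer score, BeerAdvocate score), both rescaled to 0-100.
beers = [
    ("Beer A", 98, 95),
    ("Beer B", 92, 96),
    ("Beer C", 88, 90),
]

# One panel per ordering: sort the same data once by each site's score,
# then plot two dots (one per site) at every rank position.
by_rb = sorted(beers, key=lambda b: b[1], reverse=True)
by_ba = sorted(beers, key=lambda b: b[2], reverse=True)

for rank, (name, rb, ba) in enumerate(by_rb, start=1):
    print(rank, name, rb, ba)
```

The same list ends up in a different order in each panel whenever the two sites disagree, which is exactly what the chart is meant to expose.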
I think that using ratings as well as ranks has a few advantages. For instance, it’s easy to see that RateBeer’s ratings are lower, even when both are put on the same scale. I suspect this is due to the formula they use, which apparently factors the number of reviews into the final rating. The BA score is also computed by a formula, but I’m not sure which one, making it hard to see what’s going on.
I also took the liberty to make another graph:
This one shows the difference in rating between the two sites as the average rating (also across the two sites) increases. I was really hoping to see some correlation here: you’d think that as a beer got better or worse, people would tend to agree more, but clearly this is not the case.
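For anyone who wants to check the “agreement vs. quality” hunch on their own numbers, here’s a rough sketch of the computation; the score pairs are invented, and `pearson` is just a hand-rolled correlation coefficient.

```python
from statistics import mean

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

# Per-beer (RateBeer, BeerAdvocate) scores -- toy numbers, not the real data.
pairs = [(98, 95), (92, 96), (88, 90), (80, 86), (75, 70)]
avgs = [(a + b) / 2 for a, b in pairs]   # average rating per beer
diffs = [abs(a - b) for a, b in pairs]   # disagreement per beer
r = pearson(avgs, diffs)                 # a value near zero means no relationship
```

An `r` near zero on the real data would back up the observation that agreement doesn’t improve as beers get better.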
One thing these graphs do show is the lower precision of the BA ratings. This is probably just how they display them; I’m sure they compute a more precise figure for their “Top 250 Beers” list.
Really, this data was pretty patchwork. I started with about 1,700 beers scraped off the various style “Top 25” lists on RateBeer’s site. From there I automatically ran each beer in my list through BeerAdvocate’s search function to get the corresponding ratings. This is where I lost a lot of beers: I ended up with 840 that had ratings on both sites.
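The joining step boils down to something like this; `merge_ratings`, `lookup_ba`, and the toy data are stand-ins for my actual script, not either site’s API.

```python
def merge_ratings(ratebeer_scores, lookup_ba):
    """Keep only beers that a BeerAdvocate lookup also returns a score for."""
    merged = {}
    for name, rb_score in ratebeer_scores.items():
        ba_score = lookup_ba(name)  # returns None when the search misses
        if ba_score is not None:
            merged[name] = (rb_score, ba_score)
    return merged

# Toy data standing in for the scraped lists.
rb = {"Pliny the Elder": 100, "Obscure Saison": 91}
ba_db = {"Pliny the Elder": 100}
result = merge_ratings(rb, ba_db.get)
```

Any beer the search couldn’t find simply drops out, which is how 1,700 beers became 840.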
If I had to do it over (and I might someday), I would use the “top rated beers” page on RateBeer instead. I could probably get their whole database that way, which would be thousands of beers.
I would also use BeerAdvocate’s search function a little better. As it was, I just took the top search result that matched the name I sent, inevitably catching some false matches. A search for Rochefort Trappistes 8, for instance, returns the #10 and #6 variants before the #8. Here’s what I mean.
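A simple improvement would be to prefer an exact (normalized) name match among the search results before falling back to the top hit. This is only a sketch; `pick_match` is hypothetical and not part of either site.

```python
def pick_match(query, results):
    """Return an exact name match (ignoring case and extra whitespace)
    if one exists; otherwise fall back to the first search result."""
    norm = lambda s: " ".join(s.lower().split())
    for r in results:
        if norm(r) == norm(query):
            return r
    return results[0] if results else None

# The Rochefort case from above: the exact variant wins over earlier hits.
hits = ["Trappistes Rochefort 10", "Trappistes Rochefort 6", "Trappistes Rochefort 8"]
best = pick_match("Trappistes Rochefort 8", hits)
```

Fuzzy matching would catch spelling variants too, but even this exact-match pass would have avoided the Rochefort mix-up.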
As far as what all this means, I’m not really sure; I’m not much good at statistical analysis. If anyone else would like to take a crack at the data I gathered, let me know. I would post a link to it, but I’m not fond of being sued.