Numbers Don’t Lie: What Website Analytics Tell About Human Behavior (sparklewise.com)
13 points by toumhi on Jan 26, 2012 | 10 comments


While numbers don't lie, the conclusions you draw from them can still be incorrect.

Look at the second graph (website visits peaking on Dec 24th for giftcertificatefactory.com). The author concludes the following from this peak: "People favor doing things at the very last possible moment."

But this is nonsense. Such a conclusion would require a model of how people's preferences affect web page statistics. If you don't have a model, your intuition is going to fool you. Let me illustrate this with a simple example:

Assume that in our model world you have two kinds of people: Early-Buyers and Late-Buyers. Early-Buyers buy presents on a uniformly random day from Dec 1 to Dec 20. Late-Buyers, on the other hand, buy presents on a uniformly random day from Dec 21 to Dec 24. Assume that 80% of people are Early-Buyers and 20% of people are Late-Buyers.

If you looked at the number of presents bought per day, you would see that the daily rate is 25% higher from Dec 21-24: 20% of buyers spread over 4 days is 5% per day, while 80% spread over 20 days is only 4% per day. Your intuition will tell you: "People favor buying presents late". But that is not true, because in our model world 80% of the people are actually Early-Buyers!
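The arithmetic of this toy model can be checked with a short sketch. The 80/20 split and the date ranges are the assumptions stated above, not real data, and the names used here (`EARLY_SHARE`, `expected_daily_share`, and so on) are made up purely for illustration.

```python
from fractions import Fraction

# Assumptions of the toy model (hypothetical, not measured data):
EARLY_SHARE = Fraction(80, 100)   # share of people who are Early-Buyers
LATE_SHARE = Fraction(20, 100)    # share of people who are Late-Buyers
EARLY_DAYS = range(1, 21)         # Dec 1 .. Dec 20
LATE_DAYS = range(21, 25)         # Dec 21 .. Dec 24


def expected_daily_share():
    """Map each day of December to its expected fraction of all purchases."""
    shares = {}
    for day in EARLY_DAYS:
        # Early-Buyers are spread evenly over their 20 days.
        shares[day] = EARLY_SHARE / len(EARLY_DAYS)
    for day in LATE_DAYS:
        # Late-Buyers are spread evenly over their 4 days.
        shares[day] = LATE_SHARE / len(LATE_DAYS)
    return shares


shares = expected_daily_share()
early_rate = shares[1]                 # per-day share on an early day
late_rate = shares[24]                 # per-day share on a late day
uplift = late_rate / early_rate - 1    # relative increase on late days

print(f"per-day share, Dec 1-20:  {float(early_rate):.0%}")
print(f"per-day share, Dec 21-24: {float(late_rate):.0%}")
print(f"late days are {float(uplift):.0%} busier, "
      f"yet {float(EARLY_SHARE):.0%} of buyers are Early-Buyers")
```

Exact fractions are used instead of floats so the per-day shares compare cleanly. A random simulation would show the same pattern, just with noise on top.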

Now, to explain the web page statistics shown in the article, we would need a more elaborate model. But constructing such a model and working with it is difficult. That's why people avoid thinking about models, just post raw numbers, write whatever their intuition tells them, and then claim that it must be true because "numbers don't lie".


Easy, tiger. Maybe it's a problem of vocabulary (non-native English speaker here) with "People favor doing things at the very last possible moment"?

What I meant was that, of all days, people buy most on the very last one, which indeed matches our intuition. I didn't mean to imply that (number of people buying in the last 5 days) > (number of people buying during the rest of the year), as you seem to argue against. I was only observing that absolute numbers increase day by day starting 5 days before Christmas.

Your Early-Buyer/Late-Buyer model is interesting but, as you said, it would require more thorough research to set up, and that was not in the scope of the article.

I thought the data was interesting and wanted to share it with the community, and I had no ambition to draw a complete buying model from it.


The observations presented in the article are interesting, but as you suggest, the problem lies with the attribution of these observations to the broad and sweeping set of "[all] people."

We can consider the fitness example in the same light as your analysis of the gift-card example. My corporate gym -- and pretty much any gym to which I've ever belonged -- gets deluged by New Year's resolution newbies every January. By mid-February, most of them are gone.

At first blush, we may be tempted to suggest that "most people" sign up for gym memberships in January, then gradually lose interest. But in fact, we are simply observing one subset of people at one touchpoint. The set happens to have a dramatic impact, so our minds assign it unduly high weight through a cognitive bias known as the availability heuristic (http://en.wikipedia.org/wiki/Availability_heuristic). But this subset, in fact, may not even be significant in the overall set of gym users and non-gym users. Perhaps "most people" use the gym on a regular basis. Even more likely, perhaps "most people" don't set foot in the gym at all, New Year's or otherwise. All that we've learned, by observing the New Year's subset in isolation, is how the New Year's subset behaves.


Hey Tommy! I'm Tjerk, your former colleague.

Anyway, nice article. However, your conclusions are not well founded. For example, what if the majority of the traffic to your Gift Certificate Factory came from a website that went offline at that point in the graph? Then there is a different correlation.

Also, I have never seen a huge drop like that. Are you sure there are no other reasons for the drop?

The peak for your MBI site might also be because someone linked to it from a popular website. This is often the cause of spikes like that.

Just saying that, while the conclusions you make might feel right, you can never be sure with only the graphs. So it's a bit of speculation.


Hey Tjerk, welcome to HN ;-)

You're right: sometimes peaks come from coverage on other websites, and drops can reveal a problem with a website. However, that's not the case here. It's pure seasonality, which is why I felt compelled to write about it :-)


No discredit to the author (because this is a decent piece), but numbers sometimes do lie. Outliers? Correlation != causation?

You know what really shows human behavior? Speaking to someone face to face.


>You know what really shows human behavior? Speaking to someone face to face.

Well, not really. People lie all the time.


Also, lupus.


I remember being absolutely amazed after seeing the same number of unique visitors hit my site two days in a row.


I was expecting something more along the lines of this:

Human dynamics revealed through Web analytics

http://arxiv.org/abs/0803.4018



