Data visualization is, itself, data

I was reminded tonight how important it is to communicate data in a form that supports its function. Koop and I were poking around some WordPress.com stats over coffee when I pulled up our data for pageviews from iPads, which kicked off the following.

A note about UTC, which we use for internal stats: peak hours in the US, which our data tends to track closely, are roughly 13:00-02:00 UTC.
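If the UTC hours are hard to picture, here is a minimal sketch of the conversion in Python. It assumes fixed standard-time offsets (PST is UTC-8, EST is UTC-5) and ignores daylight saving time; the helper name `utc_hour_to_local` and the date are mine and purely illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed standard-time offsets, no daylight saving time.
PST = timezone(timedelta(hours=-8), "PST")
EST = timezone(timedelta(hours=-5), "EST")


def utc_hour_to_local(hour, tz):
    """Return the local 'HH:MM' label for a whole UTC hour."""
    # The date is arbitrary; only the hour matters with fixed offsets.
    moment = datetime(2011, 1, 15, hour, tzinfo=timezone.utc)
    return moment.astimezone(tz).strftime("%H:%M")


for utc_hour in (13, 21, 2):
    pst = utc_hour_to_local(utc_hour, PST)
    est = utc_hour_to_local(utc_hour, EST)
    print(f"{utc_hour:02d}:00 UTC -> {pst} PST, {est} EST")
```

Run as-is, it prints the US local equivalents of 13:00, 21:00 and 02:00 UTC.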

First I looked at the daily chart of pageviews from iPads.

I noticed that it didn’t look like most pageview charts, which typically show long peaks on weekdays and short valleys on weekends, like our aggregate pageview data for WordPress.com.

If you look at the main pageview stat by hour, you see that there are spikes basically when people are at work — during the day in the US, Monday through Friday. This isn’t really news, and it’s very, very common across most websites.

iPad pageviews, on the other hand, look totally different on an hourly basis.

There are two important things to notice here:

  1. Weekends spike, not weekdays
  2. Intra-week differences disappear almost entirely after 21:00 (1pm PST/4pm EST)

The explanation for this is actually quite simple — iPads are primarily used outside of work, which is where people tend to be on the weekends and at night. If you were to translate the last chart into a story, it would basically be this:

On weekends, people wake up and use their iPads throughout the day, well into the night, but on weekdays the iPads are stuck at home alone while their owners are at work, and thus dormant[1].

I don’t think this is a particularly important revelation (maybe we should promote iPad stuff on the weekends?) but I do think it’s a cool example of how showing the same data in a different form (line chart vs hourly grid) tells a different and much more useful story. Also interesting is that the use of the hourly grid here is probably not what most people assume it’s good for, which is seeing data on a really granular level. It’s actually the near-exact opposite: it’s the best way to view this data on an aggregate level.
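To make the "same data, different form" point concrete, here is a rough sketch of the two views in Python. This is not how our internal stats are built; the function names and the sample numbers are made up for illustration, and timestamps are assumed to already be in UTC.

```python
from collections import defaultdict
from datetime import datetime

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def daily_totals(records):
    """Line-chart view: one total per calendar day."""
    totals = defaultdict(int)
    for timestamp, views in records:
        totals[timestamp.date()] += views
    return dict(totals)


def hourly_grid(records):
    """Grid view: sum views into 7 rows (weekday) by 24 columns (hour)."""
    grid = [[0] * 24 for _ in DAYS]
    for timestamp, views in records:
        grid[timestamp.weekday()][timestamp.hour] += views
    return grid


# Hypothetical sample: two Saturdays and two Mondays, all at 15:00 UTC.
records = [
    (datetime(2011, 10, 1, 15), 120),   # Saturday
    (datetime(2011, 10, 3, 15), 40),    # Monday
    (datetime(2011, 10, 8, 15), 130),   # Saturday
    (datetime(2011, 10, 10, 15), 35),   # Monday
]

grid = hourly_grid(records)
for day, row in zip(DAYS, grid):
    print(day, row[15])
```

The daily totals keep every date separate, while the grid folds all Saturdays into one row and all Mondays into another. That folding is what makes it an aggregate view rather than a granular one.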

[1] Though the modifier is dangling, I meant the iPads were dormant, not the people — but I suppose from our overall pageview stats, that may not be completely true.

Automattic, travel, and a year of flux

About 8 months ago I came across an interesting job on Automattic’s website, Growth Engineer. Today I’m writing this from an apartment in Prague, working while on a side trip after our company meetup in Budapest last week, on my way to Los Angeles to speak at BlogWorld LA. I’ll have crossed over 65,000 miles traveled this year by the time I touch down in San Francisco. How far we’ve come.

Of all the changes in 2011, the people are overwhelmingly the high order bit. There are the impressive coworkers, the generous WordPress community, and the strangers who, through travel and shared interests, have been maybe the most consistent part of my life this year.

Security concerns seem fundamentally unreasonable

As has become the norm, a new web security meltdown erupted recently, starring KissMetrics, among others. And as has become the norm, I think it’s much ado about nothing.

I was actually thinking about this earlier today, and Nik Cubrilovic’s post on yet another way to secretively store data just reminded me of it. There are a few reasons why I think the concern over this is mostly misguided.

First, it’s not new. Store owners can write down your shirt color, your height, your race, the time you came in, what you bought, etc. They can track plenty of “personal” information, which I put in quotes because I think the whole concept of some special class of information leads to more confusion than awareness. The key difference, obviously, is that computers make tracking this information cheaper and faster. I tend to think that we shouldn’t legislate against actions based only on their relative efficiency, so I don’t see why this should make a difference. Worse yet, it’s far more likely that real world businesses can connect your actions to your real “personal” information, like your name, family and address.

Second, it’s a never-ending game of cat and mouse. There are limitless ways to store data and acting like anyone can stop it is foolish. The whole thing seems eerily similar to the war on drugs, a failure I think we’ll have a hard time contending with any time soon.

Third, for the overwhelming majority of people there is nothing to worry about. Awareness of risks is great, but confused fear based on misinformed media reports is awful, and that’s mostly what we’ve created.

I’d liken responsible web security education to something along the lines of wearing your seatbelt and not driving drunk. What we have today is much closer to fear-mongering along the lines of urban legends about exploding engines or murderers at drive-in movies. Those things can certainly happen, but I don’t think it’s reasonable for anyone to treat them as likely consequences of driving a car.

To be clear, I think it’s great if websites and web services don’t do these things, but some of them will and you should probably assume all of them do, particularly if it’s something that concerns you.

No one teaches you how to choose a job

Read job postings. Write a résumé. Practice interview skills. Network.

To the extent that any educational facilities teach you how to find a place in the labor force, that’s it. Pretty accurately, you could call this “How to get a job.” That is, how to get offered a job. With the national unemployment rate at whatever it’s at, that’s certainly a valuable skill these days, but lots of people have the chance to aim much higher — at least high enough to get two job offers. Then what?

You’re pretty much flying blind if you rely on the same folks who taught you about one-page résumés and how to talk to recruiters at job fairs.

I think every college student would be well-served to have some guidance on how to choose a job or a career. And the problem starts way before college; high school students pick colleges, which are long-term expensive decisions, basically on a whim. It all seems kind of insane looking back.

Company cultures, hierarchies, job roles, products, businesses and dozens of other factors that will impact your daily life, probably making the difference between being happy and miserable at work, are never formally taught or even discussed. I think it’s one of the reasons that startups seem so magical to so many, because they appear to break rules when they just do things differently and the rules were never there in the first place.

I’m not convinced that a vocational education is the best fit for most, but I do think that some sense of practical decision-making would be immensely valuable and hopefully prevent a lot of people from finding new ways to kill time for eight hours, five days per week.

Test-driven (user) testing

I’ve recently been learning Ruby on Rails and after hearing the term for a long time have actually begun to understand what “test-driven development” is. The idea is that you define success for a piece of an application by writing a test that “uses” the application in predefined ways and tells you whether or not it produces the expected results. Before I go on I should admit that in an effort to learn how to make things in Rails faster, I’ve actually not written any tests. Bad, I know.
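For anyone who hasn’t seen it, here is a tiny sketch of the idea. It’s in Python rather than Rails to keep it self-contained, and the `slugify` function and its test cases are hypothetical examples of mine, not anything from a real project. In the test-driven order you would write the two test functions first, watch them fail, and only then write `slugify`.

```python
def slugify(title):
    """Turn a post title into a lowercase, hyphen-separated slug."""
    # split() with no arguments also drops leading, trailing and
    # repeated whitespace, which is what the second test asks for.
    return "-".join(title.lower().split())


# The tests define what "done" means for slugify.
def test_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"


def test_collapses_extra_whitespace():
    assert slugify("  Hello   World ") == "hello-world"


# Minimal runner; a real project would use a test framework instead.
for test in (test_lowercases_and_hyphenates, test_collapses_extra_whitespace):
    test()
    print(f"passed: {test.__name__}")
```

The tests are the spec: once both pass, the function is done, and anything they don’t ask for doesn’t need to be written.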

I think the idea of test-driven development is pretty cool. It forces you to set specific goals, makes it clear when you achieve them, and hopefully makes achieving them even easier and faster since the scope is so clearly defined. Thinking about it more, there are probably other parts of life and certainly work that would make sense to approach this way. One that came to mind and made for a catchy title was testing itself, in the product and user sense.

Surveys, A/B tests, multivariate tests and many other methods of learning from customers have become wildly popular, and rightly so, thanks to the work of Steve Blank, Eric Ries, Sean Ellis (who first taught me about it) and others in what’s generally been termed “The Lean Startup” movement. Like any good fad, I think that some followers have fallen victim to function following form. In other words, I’ve seen a bunch of people use these tactics for the sake of using them. One instance of this that sticks out is running tests where virtually no result is going to teach you anything.

The most egregious offender has to be surveys, I think because they’re the easiest to create and the most vulnerable to lazily copying from others. Every question on a survey, like every line of code in an application, has an implicit, real cost. Usually it’s a lower response rate, which means you get less information and it takes longer to get it. So a great deal of effort should be put into eliminating unneeded questions, just like a great deal of effort is often put into eliminating unneeded code. Or (here comes the theme) in a test-driven development world hopefully the unneeded code is never written in the first place, because you’re writing code based on test requirements and writing test requirements based on what’s needed. It’s a beautiful system.

Back to the surveys. There are several “standard” survey questions that I see almost everywhere and have a way of creeping into new surveys as you write them: gender, age, income, location, referral source, etc etc. Certainly all of these can be valid survey questions, but they’re probably not all important all of the time. Unfortunately it’s really, really easy to add them to your product survey and some services like SurveyMonkey will even automate it for you. Pretty soon this turns into survey-by-committee and you’re asking 30 questions with boondoggles from every part of the company added. Biz dev wants to know about the users’ favorite websites, marketing wants to know about age and gender, sales wants to know about purchase intent for mid-size cars (wtf?) and the list goes on. Meanwhile, if the scope of the survey had been properly defined, it would be clear that none of these were actually required; or, if these were actually required, it would have been made clear sooner and maybe someone would have fought against it, thanks to a test.
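One way to picture a "test" for a survey, sketched in Python: agree on the goals first, then require every question to name the goal it serves. The goals, questions and the `split_by_goal` helper below are all hypothetical, just to show the shape of the check.

```python
# Hypothetical goals, agreed on before any question is written.
GOALS = {
    "onboarding": "Find out where new users get stuck during signup",
    "pricing": "Learn whether price is why users do not upgrade",
}

# Hypothetical draft: each question names the goal it serves, or None.
DRAFT_QUESTIONS = [
    ("Where did you get stuck while signing up?", "onboarding"),
    ("What stopped you from upgrading?", "pricing"),
    ("What is your age?", None),
    ("Are you planning to buy a mid-size car?", None),
]


def split_by_goal(questions, goals):
    """Separate questions tied to a declared goal from the rest."""
    kept, cut = [], []
    for text, goal in questions:
        (kept if goal in goals else cut).append(text)
    return kept, cut


kept, cut = split_by_goal(DRAFT_QUESTIONS, GOALS)
print("keep:", kept)
print("cut: ", cut)
```

If sales really does need the mid-size car question, the fix is to add a goal for it, which forces that conversation to happen before the survey ships.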

This is a trivial example only to make a point, but I think it’s an important point that comes down to two ideas for me:

  1. Keep in mind that just because something is easy to do doesn’t mean it’s low risk
  2. Upfront investment in making sure everyone agrees on the goals can pay huge dividends later

Testing to accelerate learning is great and more companies should do more of it, but adding unneeded complexity (for you and for your users) through more survey questions or button colors doesn’t always improve your tests.

People worry too much for other people

From time to time stories come out that are generally fashionable to be outraged about. Facebook’s numerous privacy skirmishes and Groupon’s Super Bowl commercial are a couple of examples. So is Kenneth Cole’s off-color Tweet about Spring fashion in Cairo.

People get up in arms about these things. They post on Facebook and Twitter, often promising to never buy the offender’s products again. But it’s nonsense, more often about showing off how upset you are than actually being upset in the first place.

I’d like to ignore whether or not any of these things were good ideas (for whatever it’s worth, Facebook’s and Groupon’s almost certainly were, Kenneth Cole’s almost certainly was not) and instead focus on whether or not it makes any sense at all to be upset about what they said or did.

I think the most reasonable litmus test for these kinds of things is whether you’d be offended (even a little bit) if someone said it to you at a party or if you learned about it in isolation.

In the case of Facebook’s privacy mess last year, I knew many, many people who were up in arms about it. Most of them thought it was unethical, immoral and downright mean-spirited. But for all the wrong reasons, I’d argue. They were all (every single one, among the people I personally knew) fine with their own use of Facebook; some changed their settings, most didn’t, and all went on with their lives. But, boy, were they ever worried about everyone else who wasn’t as smart as them and able to avoid making asses of themselves on the internet. I’m sure it’s all well-intentioned, but it tweaks me in the same way that evangelical religious people attempting to spread their morality to others do: it just seems like a lot of overreaching without a lot of logical thought.

In the case of Groupon and Kenneth Cole, I’m hard-pressed to believe that anyone wouldn’t laugh if similar things were said at a party. “How about those Egyptian riots? Like a Black Friday sale, right!” “I’m sick of all these daily deals, how about something new like 2 Free Tibets for the price of one!?” This is pretty harmless stuff by any reasonable measure. Except when it affords you a soap box to preach from.

And that’s really my biggest problem with all this. That the complaining isn’t really complaining, it’s preaching and it’s often baseless. If you’re legitimately offended by any of this stuff, I really do think that’s fine. But if you just want to protect the less-refined masses, then it doesn’t make any more sense to me than censorship.