What is Zynga making per paying user? Nobody, not even Zynga, will ever know.

by Darius Kazemi on February 6, 2012

in metrics, philosophy, work

Last week a news site ran an article by Louis Bedigian quoting an analyst (Arvind Bhatia) who claimed that “Zynga loses $150 on every new paying customer.” I read it, thought to myself, “That’s absurd linkbait,” and assumed that nobody would take the bait. I was wrong: it got picked up everywhere. Sigh.

This morning, Andrew VandenBossche alerted me to an article by Dylan Collins, quoting industry CEO Torsten Reil, that responds, no, you idiots! Your methodology is wrong! “Zynga is probably MAKING $30 on every paying user!” So… here’s what I think: nobody knows what the fuck is going on. (For those of you wondering why I’m writing about this, before I did HTML5 stuff full time, I spent 6 years as a data analyst for game studios, both MMO and Facebook games.)

Surface analysis

Collins/Reil are absolutely right to call the original analysis oversimplified. It was based on a model that completely failed to account for attrition — they’re correct when they state that Zynga certainly gained far more than 400k paying users for their marketing money.

Unfortunately, Collins/Reil pick a number out of thin air (20% attrition rate) which results in a rough estimate where Zynga spends $120 per paying user, and makes $150 per paying user, resulting in a net profit of $30 per paying user. I say unfortunately because if the number is 10%, then by Reil’s metric they’re losing $21 on every paying user. If it’s 30% they’re earning $57 per acquired paying user. It all hinges on their attrition rate, which we don’t know! Some games see 10%. Some games see 90%. 20% seems like a roughly correct ballpark for a mix of successful and unsuccessful games, but honestly we have no idea what it is because we’re on the outside looking in. But the truly weird thing to consider is: Zynga doesn’t know what their attrition number is either.

Models and black boxes

All numbers like this are built on models that analysts put together, and models are built on assumptions. Simple example: when we talk about attrition, what phenomenon do we refer to? Typically we mean “the moment when someone is no longer a player of the game.” Yet in the context of a social game, how do you define that? Facebook users don’t typically uninstall an app — they usually just stop using it. So you have to pick an arbitrary cutoff point. Does someone fall into an “attrition” bucket after 1 week of inactivity? 2 weeks? A month? Remember, this number is arbitrary, so you can adjust that number all you like (within reason, you’re not going to pick 100 years) until you come up with an attrition percentage that meets your criteria. Whether those criteria are “seems more realistic” or “would appease our shareholders” is another question!
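
To make the cutoff problem concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the six players, their last-login dates, and the `attrition_rate` helper are assumptions, not anything Zynga or any studio actually uses. It only shows that the same data gives a different attrition percentage for each cutoff you pick.

```python
from datetime import date

# Hypothetical data: each player's most recent login date (illustrative only).
last_login = {
    "p1": date(2012, 2, 5),
    "p2": date(2012, 2, 1),
    "p3": date(2012, 1, 25),
    "p4": date(2012, 1, 20),
    "p5": date(2012, 1, 2),
    "p6": date(2011, 12, 1),
}

def attrition_rate(last_login, today, cutoff_days):
    """Fraction of players whose last login is more than cutoff_days before today."""
    lapsed = sum(1 for d in last_login.values() if (today - d).days > cutoff_days)
    return lapsed / len(last_login)

today = date(2012, 2, 6)

# Same players, same logins; only the arbitrary cutoff changes.
rates = {cutoff: attrition_rate(last_login, today, cutoff) for cutoff in (7, 14, 30)}
for cutoff, rate in rates.items():
    print(f"{cutoff:>2}-day cutoff: {rate:.0%} attrition")
```

With this invented data, the same six players come out at 67%, 50%, or 33% attrition depending on nothing but the cutoff.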

But regardless, this attrition percentage then affects all of your other calculations. Now, ideally you want to remain internally consistent once you pick this number, but a dirty secret is that even if you maintain perfect internal consistency in always using “2 weeks of inactivity” as your cutoff for attrition, there will always be dozens of other fiddly and less directly consequential definitions that you can tweak. And the thing is, on some level you have to tweak these numbers! Otherwise you might find yourself stuck with a model that doesn’t reflect what looks like the reality of your game.

To put it another way: the internal game studio analyst’s job is to assemble a black box known as the concept of “attrition” — to the CEOs and CFOs and shareholders and external analysts and pundits at home, this concept seems pretty straightforward: it’s the people who leave your game. End of story. The black box behaves and does its job, reporting a number between 0% and 100%, and presumably you panic if the number is closer to 100%.

But the internal studio analyst needs to assemble this concept from a variety of sources. They might ask the game designers what they see as a “normal” or “natural” amount of time away from the game — if a game is designed to be played during the work week, then you shouldn’t sweat it when someone isn’t playing over the weekends or on Christmas. They might look at historical data for the game and notice that 80% of players who are inactive for 12 days never come back. And 90% of players inactive for 15 days never come back. So maybe we pick 90% and say 15 days is our cutoff. But of course we’re looking at historical data for the current game, which is different today than it was back then, so it’s not a perfect analogy! So maybe we want to rely on data from the last 30 days, when the game was most similar — but now our definition of “never come back” really means “people who were inactive for 15 of the last 30 days and haven’t been back.” But of course, those people “haven’t been back” for a maximum of 15 days since we’re looking at a 30 day window. So now that our historical data is more representative of the current state of the game, our very definition of “never” comes into question!
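
The shrinking meaning of “never” can also be sketched in a few lines of Python. The numbers below are hypothetical, and `never_came_back_rate` is a name invented for this example. The assumption is that for each player who hit the 15-day inactivity cutoff, we know how many days later they returned, if they returned at all. A 30-day window with a 15-day cutoff leaves at most 15 days in which to observe a return.

```python
# Hypothetical records: for each player who hit 15 days of inactivity, the
# number of days after that point when they returned (None = not seen again).
days_until_return = [3, 10, 20, 45, None, None, None, None, None, None]

def never_came_back_rate(days_until_return, follow_up_days):
    """Share of lapsed players not seen again within follow_up_days of lapsing."""
    gone = sum(1 for d in days_until_return if d is None or d > follow_up_days)
    return gone / len(days_until_return)

window_days, cutoff_days = 30, 15

# Inside a 30-day window, "never" can mean at most 30 - 15 = 15 days.
short_view = never_came_back_rate(days_until_return, window_days - cutoff_days)

# With a year of history, late returners are no longer counted as gone.
long_view = never_came_back_rate(days_until_return, 365)

print(f"15 days of follow-up:  {short_view:.0%} never came back")
print(f"365 days of follow-up: {long_view:.0%} never came back")
```

In this toy data the short window reports 80% as gone for good while the long view reports 60%: the two players who came back after 20 and 45 days are invisible to the 30-day analysis.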

An infinite regress of assumptions

In summary: games are very complex systems, and the numbers that get thrown around in the media are built on black-box-style assumptions. These black boxes can always be broken down into components, and those components into subcomponents, forever and ever into an infinite regress. If this seems mind-bogglingly weird, well: it is. On some level you need to stop digging into the infinite and come up with assumptions about the way the game works that become the foundation for your models. There’s nothing wrong about that in principle: science does this all the time, and manages to come up with some great models to describe the world. But there’s a huge difference between science and analyzing the metrics for social games. Scientists do not work in a vacuum within their universities or corporations. Scientists do not work with “proprietary data” and they do not run the risk of getting fired for sharing their results and even their methodologies with other scientists. An internal game studio data analyst does in fact work in a vacuum, and will get fired for sharing with outside analysts. This means that the chances that our assumptions are off-base are pretty good. And it means that the numbers that different companies throw around can’t even be compared. “Average revenue per user,” which sounds straightforward, can be based on entirely different foundational assumptions at different companies and on different games.

This whole mess is one of the main reasons I stopped being a data analyst for games. I did not feel comfortable coming up with assumptions that weren’t, on some level, complete bullshit. Now, the level on which these assumptions operated was often very low-level, fiddly stuff. But it was an art, not a science. Which, again, nothing wrong with that — except that the black boxes that I generated were being treated as science rather than as art.

In the end, for the purposes of arguments about how much money a company is making, the only numbers that matter are: how much money is coming into the company each month? How much money is leaving the company each month? Everything else should be viewed with utmost suspicion.

{ 5 comments }

Ben Abraham February 6, 2012 at 7:21 pm

That was fantastic, Darius. So many good and sensible ideas in there I don’t know where to begin, and the Latourian influence is strong with this one.

Chris February 6, 2012 at 7:43 pm

“But it was an art, not a science. Which, again, nothing wrong with that — except that the black boxes that I generated were being treated as science rather than as art.”

Ha ha! My audience models are exactly the same in this regard – although because they’re *inspired* by science, they tend to (as I say) “smell like science”, so they get treated as if they are worthy of some kind of twisted empirical respect.

The thing is, even science is an art not a science. The only sciences that function as we expect “Big S” ‘Science’ to function are those that have ceased to be live research areas and become prescriptions for technology e.g. optics. All other science is art masquerading as science (or perhaps even magic masquerading as science). But we live in an age that thinks science is a hugely successful endeavour, rather than a label we selectively attach to the successful elements of our technological research programmes. Go figure.

It’s a funny old world… but I kind of like the nutty ol’ thing. ;)

*waves*

Darius Kazemi February 6, 2012 at 7:59 pm

Indeed! As Ben notes above, my article was highly influenced by Bruno Latour. A great, relevant quote from his Compositionist Manifesto:

In the Fall of 2009, critiques and proponents of anthropogenic climate change realized, by sifting through the thousands of emails of the climate scientists stolen by activists of dubious pedigrees, that the scientific facts of the matter had to be constructed, and by whom? by humans! Squabbling humans assembling data, refining instruments to make the climate speak (instruments! can you believe that!), and spotty data sets (data sets! imagine that…), and those scientists had money problems (grants!) and they had to massage, write, correct and rewrite humble texts and articles (what? texts to be written? is science really made of texts, how shocking!)… What I found so ironic in the hysterical reactions of scientists and the press was the almost complete agreement of opponents and proponents of the anthropogenic origin of climate change. They all seem to share the same idealistic view of Science (capital S): “If it slowly composed, it cannot be true” said the skeptics; “If we reveal how it is composed, said the proponents, it will be discussed, thus disputable, thus it cannot be true either!”. After about thirty years of work in science studies, it is more than embarrassing to see that scientists had no better epistemology to rebut their adversaries. They kept using the old opposition between what is constructed and what is not constructed, instead of the slight but crucial difference between what is well and what is badly constructed (or composed). And this pseudo “revelation” was made at the very moment when the disputability of the most important tenets of what it means for billions of humans represented by their heads of states to live collectively on the Planet was fully visible, in the vast pandemonium of the biggest diplomatic jamboree ever assembled… While it was the ideal moment to connect the disputability of politics with the disputability of science (small s)—instead of trying to maintain, despite the evidence, the usual gap between, on the one hand, what is politics and can be discussed, and, on the other hand, a Science of what is “beyond dispute”.

Nick Brown February 6, 2012 at 11:58 pm

Hahaha, loving your perspective on this. Somehow I missed that particular bit of link bait (and I’m glad of it). Besides, it was much more entertaining to read your summary and subsequent ripping apart of this bit of “news.”

Chris February 7, 2012 at 9:41 am

Nice extract! I’m interested in Latour’s work, but my reading list has been dictating itself for quite a while now… Currently tied up in research for ‘Chaos Ethics’, book number three of my philosophical trilogy concerning the role of imagination (in games = ‘Imaginary Games’, in science = ‘The Mythology of Evolution’, in ethics = ‘Chaos Ethics’).

Hope to cross paths soon, but when and where remains shrouded in mystery. :)
