About

I'm a partner in the advanced analytics group at Bain & Company, the global management consulting firm. My primary focus is on marketing analytics (bio). I've been writing here (views my own) about marketing, technology, e-business, and analytics since 2003 (blog name explained).


92 posts categorized "Analytics"

August 08, 2012

A "Common Requirements Framework" for Campaign Management Systems and Marketing Automation

In our "marketing analytics agency" model, as distinguished from a more traditional consulting one, we measure success not just by the quality of the insights and opportunities we can help clients find, but also by their ability to act on the ideas and get value for their investments.  Sometimes this means we simultaneously work both ends to an acceptable middle: even as we torture data and research for bright ideas, we help to define and influence the evolution of a marketing platform to be more capable. 

This raises the question, "What's a marketing platform, and a good roadmap for making it more capable?"  Lots of vendors, including big ones like IBM, are now investing in answering these questions, especially as they try to reach beyond IT to sell directly to the CMO. These vendors provide myriad marketing materials to describe both the landscape and their products, which variously are described as "campaign management systems" or even more gloriously as "marketing automation solutions".  The proliferation of solutions is so mind-blowing that analyst firms build whole practices making sense of the category.  Here's a recent chart from Terence Kawaja at LUMA Partners (via Scott Brinker's blog) that illustrates the point beautifully:

 

 

Yet even with this guidance, organizations struggle to get relevant stakeholders on the same page about what's needed and how to proceed. My own experience has been that this is because they're missing a simple "Common Requirements Framework" that everyone can share as a point of departure for the conversation.  Here's one I've found useful.

Basically marketing is about targeting the right customers and getting them the right content (product information, pricing, and all the before-during-and-after trimmings) through the right channels at the right time.  So, a marketing automation solution, well, automates this.  More specifically, since there are lots of homegrown hacks and point solutions for different pieces of this, what's really getting automated is the manual conversion and shuffling of files from one system to the next, aka the integration of it all.  Some of these solutions also let you run analysis and tests out of the same platform (or partnered components).

Each of these functions has increasing levels of sophistication I've characterized, as of this writing, into "basic", "threshold", and "advanced".  For simple roadmapping / prioritization purposes, you might also call these "now", "next", and "later".

Targeting

The simplest form of targeting uses a single data source, past experience at the cash register, to decide whom to go back to, on the idea that you build a business inside out from your best, most loyal customers.  Cataloguers have a fancy term for this, "RFM" -- "Recency, Frequency, and Monetary Value" -- which grades customers, typically into deciles, according to... how recently, how frequently, and how much they've bought from you.  Folks who score high get solicited more intensively (for example, more catalog drops).  By looking back at a customer's past RFM-defined marginal value to you (e.g., gross margin you earned from stuff you sold her), you can make a decision about how much to spend marketing to her.  
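As a concrete sketch, here's roughly how an RFM grading might look in code.  This is a minimal pure-Python illustration, not any vendor's implementation; the decile logic and tie-handling are simplified, and all the names are mine:

```python
from datetime import date

def rfm_scores(transactions, today, n_tiles=10):
    """Grade customers on Recency, Frequency, and Monetary value.

    transactions: list of (customer_id, order_date, amount) tuples.
    Returns {customer_id: (r, f, m)} scores, each from 1 (worst)
    to n_tiles (best) -- deciles when n_tiles=10.
    """
    # Roll transactions up to one (recency, frequency, monetary) row per customer.
    stats = {}
    for cust, when, amount in transactions:
        days_ago = (today - when).days
        r, f, m = stats.get(cust, (days_ago, 0, 0.0))
        stats[cust] = (min(r, days_ago), f + 1, m + amount)

    def tile(values, reverse=False):
        # Rank customers by a metric and map ranks onto 1..n_tiles.
        order = sorted(values, key=values.get, reverse=reverse)
        return {c: 1 + (i * n_tiles) // len(order) for i, c in enumerate(order)}

    # Fewer days since last purchase is better, so sort recency descending.
    r_tile = tile({c: s[0] for c, s in stats.items()}, reverse=True)
    f_tile = tile({c: s[1] for c, s in stats.items()})
    m_tile = tile({c: s[2] for c, s in stats.items()})
    return {c: (r_tile[c], f_tile[c], m_tile[c]) for c in stats}
```

With real transaction files you'd typically do this in SQL or a stats package, and you'd break ties and handle one-time buyers more carefully than this sketch does.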

One step up, you add demographic and behavioral information about customers and prospects to refine and expand your lists of folks to target.  Demographically, for example, you might say, "Hey, my best customers all seem to come from Greenwich, CT.  Maybe I should target other folks who live there."  You might add a few other dimensions to that, like age and gender. Or you might buy synthetic, "psychographic" definitions from data vendors who roll a variety of demographic markers into inferred attitudes.  Behaviorally, you might say "Let's retarget folks who walk into our store, or who put stuff into our online shopping cart but don't check out."  These are conceptually straightforward things to do, but are logistically harder, because now you have to integrate external and internal data sources, comply with privacy policies, etc.

In the third level, you begin to formalize the models implicit in these prior two steps, and build lists of folks to target based on their predicted propensity to buy (lots) from you.  So for example, you might say, "Folks who bought this much of this product this frequently, this recently, who live in Greenwich, and who visited our web site last week have this probability of buying this much from me, so therefore I can afford to target them with a marketing program that costs $x per person."  That's "predictive modeling".
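A toy version of that targeting rule, assuming a logistic model whose coefficients have already been estimated elsewhere (the feature names and numbers here are made up for illustration):

```python
import math

def target_decision(features, weights, bias, expected_margin, cost_per_contact):
    """Score one customer's propensity to buy and decide whether a
    marketing program that costs cost_per_contact pays for itself.

    features / weights: dicts of predictor name -> value / coefficient,
    e.g. recency, frequency, lives_in_greenwich, visited_site_last_week.
    Returns (propensity, should_target).
    """
    # Logistic score: squash the weighted feature sum into a 0..1 probability.
    z = bias + sum(weights[k] * v for k, v in features.items())
    propensity = 1.0 / (1.0 + math.exp(-z))
    # Target when expected margin per contact covers the program's cost.
    return propensity, propensity * expected_margin >= cost_per_contact
```

The point of the sketch is the last line: predictive modeling turns a score into a dollars-and-cents go/no-go decision per customer.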

Some folks evaluate the sophistication of a targeting capability by how fine-grained the target segments get, or by how close to 1-1 personalization you can get.  In my experience, there are often diminishing returns to this, because the firm can't always practically execute differentiated experiences even if the marginal value of a personalized experience warrants it.  This isn't universally the case of course: promotional offers and similar experience variables (e.g., credit limits) are easier to vary than, say, a hotel lobby.  

Content

Again, a simple progression here, for me defined by the complexity of the content you can provide ("plain", "rich", "interactive") and by the flexibility and precision ("none", "pre-defined options", "custom options") with which you can target it through any given channel or combination of channels.

Another dimension to consider here is the complexity of the organizations and processes necessary to produce this content.  For example, in highly regulated environments like health care or financial services, you may need multiple approvals before you can publish something.  And the more folks involved, the more sophisticated and valuable the coordination tools, ranging from central repositories for templates to version control systems, alerts, and even joint editing.  Beware, though, of simply paving cowpaths -- be sure you need all that content variety and process complexity before enabling it technologically, or it will simply expand to fit what the technology permits (the same way computer operating systems bloat as processors get more powerful).

Channels

The big dimension here is the number of channels you can string together for an integrated experience.  So for example, in a simple case you've got one channel, say email, to work with.  In a more sophisticated system, you can say, "When people who look like this come to our website, retarget them with ads in the display ad network we use." (Google just integrated Google Analytics with Google Display Network to do just this, for example, an ingenious move that further illustrates why they lead the pack in the display ad world.)  Pushing it even further, you could also say, "In addition to re-targeting web site visitors who do X, out in our display network, let's also send them an email / postcard combination, with connections to a landing page or phone center."

Analysis and Testing

In addition to execution of campaigns and programs, a marketing solution might also support exploration of what campaigns and programs, or components thereof, might work best.  This happens in a couple of ways.  You can examine past behavior of customers and prospects to look for trends and build models that explain how changes and saliencies along one or more dimensions might have been associated with buying.  Also, you can define and execute A/B and multi-variate tests (with control groups) for targeting, content, and channel choices.  
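For the A/B piece, the core evaluation is just a differential-response comparison between test and control cells.  A minimal sketch, using a standard two-proportion z-test (my choice of method for illustration; the framework doesn't prescribe one):

```python
import math

def ab_test_lift(conv_a, n_a, conv_b, n_b):
    """Compare conversion in a test cell (B) against control (A) with a
    two-proportion z-test.  Returns (absolute_lift, z).  As a rule of
    thumb, |z| above ~1.96 means the lift is unlikely to be noise, at
    roughly 95% confidence."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pool the two cells to estimate the standard error under "no difference".
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return p_b - p_a, (p_b - p_a) / se
```

The integration question in the next paragraph is about how quickly a winning cell here can be flipped into the execution side of the platform.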

Again, the question here is not just about how much data flexibility and algorithmic power you have to work with within the system, but how many integration hoops you have to go through to move from exploration to execution.  Obviously you won't want to run exploration and execution off the same physical data store, or even the same logical model, but it shouldn't take a major IT initiative to flip the right operational switches when you have an insight you'd like to try, or scale.

Concretely, the requirement you're evaluating here is best summarized by a couple of questions.  First, "Show me how I can track and evaluate differential response in the marketing campaigns and programs I execute through your proposed solution," and then, "Show me how I can define and test targeting, content, and channel variants of the base campaigns or programs, and then work the winners into a dominant share of our mix."

A Summary Picture

Here's a simple table that tries to bundle all of this up.  Notice that it focuses more on functions than features, and on capabilities instead of components.  

  Marketing Automation Common Requirements Framework

 

What's Right For You?

The important thing to remember is that these functions and capabilities are means, not ends.  To figure out what you need, you should reflect first on how any particular combination of capabilities would fit into your marketing organization's "vector and momentum".  How is your marketing performance trending?  How does it compare with competitors'?  In what parts -- targets, content, channels -- is it better or worse? What have you deployed recently and learned through its operation? What kind of track record have you established in terms of successful deployment and leverage from your efforts?  

If your answers are more like "I don't know" and "Um, not a great one," then you might be better off signing onto a mostly-integrated, cloud-based (so you don't compound business value uncertainty with IT risk), good-enough-across-most-things solution for a few years until you sort out -- affordably (read: rent, don't buy) -- what works for you, and what capability you need to go deep on. If, on the other hand, you're confident you have a good grip on where your opportunities are and you've got momentum with and confidence in your team, you might add best-of-breed capabilities at the margins of the more general "logical model" this proposed framework provides.  What's generally risky is to start with an under-performing operation built on spaghetti and plan for a smooth multi-year transition to a fully-integrated on-premise option.  That just puts too many moving parts into play, with too high an up-front, bet-on-the-come investment.

Again, remember that the point of a "Common Requirements Framework" isn't to serve as an exhaustive checklist for evaluating vendors.  It's best used as a simple model you can carry around in your head and share with others, so that when you do dive deep into requirements, you don't lose the forest for the trees, in a category that's become quite a jungle.  Got a better model, or suggestions for this one?  Let me know!

August 06, 2012

Zen and the Art of IT Planning #cio

It's been on my reading list forever, but this year I finally got around to Robert Pirsig's Zen and the Art of Motorcycle Maintenance.  It was heavy going in spots, but it didn't disappoint. So many wonderful ideas to think about and do something with. Among a thousand other things, I was taken with Pirsig's exposition of "gumption".  He describes it as a variable property developed in someone when he or she "connects with Quality" (the principal object of his inquiry).  He associates it with "enthusiasm", and writes:

A person filled with gumption doesn't sit around dissipating and stewing about things.  He's at the front of the train of his own awareness, watching to see what's up the track and meeting it when it comes.  That's gumption. (emphasis mine; Pirsig, Zen, p. 310, First Harper Perennial Modern Classics edition 2005)

In recent years I've tested my gumption limits in trivial and meaningful ways: built a treehouse, fixed an old snowblower, serviced sailboat winches, messed around in SQL and Python, started a business. For me, gumption was the "Well, here goes..." evanescent sense of that moment when preparation ends and experimentation begins, an amplified mix of anxiety and anticipation at the edge of the sort-of-known and the TBD.  Or, like the joy of catching a wave,  it's feeling for a short time what it's like to have your brain light up an order of magnitude more brightly than it manages on average, and watching your productivity soar.

So what's this got to do with IT planning?

For a while now I've been working with both big and small companies, and seen two types of IT planning happen in both settings. In one case there's endless talk of 3-year end-state architectures that seem to recede and disappear like mirages as you Gantt-crawl toward them.  In the other, there's endless hacks that "scratch itches" and make you feel like you're among the tribe of Real Men Who Ship, but  which toast you six months later with security holes or scaling limits.

Getting access to data and having enough operational flexibility to act on the insights we help produce with this data are crucial to the success we try to help our clients achieve, and hold ourselves accountable for. So, (sticking with the motorcycle metaphor) a big part of my job is to be able to read what "gear" an IT organization is in, and to help it shift into the right one if needed -- in other words, to find a proper balance of planning and execution, or "the right amount of gumption".  One crude measure I've learned to apply is what I'm calling the "slide-to-screen" ratio (aka the ".ppt-to-.php" score for nerdier friends).

It's a simple calculation.  Take the number of components yet to be delivered in an IT architecture chart or slide, and divide them by the number of components or applications delivered over the same time period looking backward.  For example, if the chart says 24 components will be delivered over the next three years, and the same number of comparable items have been delivered over the prior three years, you're running at "1".
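In code, the metric is as trivial as it sounds -- the point is the conversation it forces, not the arithmetic:

```python
def slide_to_screen(components_promised, components_delivered):
    """The 'slide-to-screen' (aka '.ppt-to-.php') ratio: components on
    the forward-looking architecture chart, divided by comparable
    components actually shipped over a trailing period of equal length.
    Roughly 1 suggests balance; far above 1 is vision without shipping;
    far below 1 is shipping without direction."""
    if components_delivered == 0:
        return float("inf")  # all slides, no screens
    return components_promised / components_delivered
```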

Admittedly, the standard's arbitrary, and hard to compare across situations. It's the question that's valuable.  In one situation, there's lots of coding, but little clear sense of where it needs to go, tantamount to trying to drive fast in first gear.  In the other, there's lots of ambition, but not much seems to happen -- like trying to leave the driveway in fifth gear.  When I'm listening to an IT plan, I'm not only looking at the slides and the demos, I'm also feeling for the "gumption" of the authors, and where they are with respect to the "wave".  The best plans always seem to say something like, "Well, here's what we learned -- very specifically -- from the last 24 months' deployments, and here's what we think we need to do (and not) in the next 24 months as a result." They're simultaneously thoughtful and action-oriented.  Conversely, when I don't see this specifics-laden reflection, and instead get a generic look forward, and a squishy, over-hedged, non-committal roadmap for getting there, warning bells go off.

Pushing for the implications of the answer -- to downshift, or upshift, and how -- is incredibly valuable.  Above "1", pushing might sound like, "OK, so what pieces of this vision will you ship in each of the next 4 quarters, and what critical assumptions and dependencies are embedded in your answers?"  Below "1", the question might be, "So, what complementary capabilities, and security / usability / scalability enhancements do you anticipate needing to make these innovations commercially viable?"  The answers you get in that moment -- a "Blink"-style gumption test -- are more useful than any six-figure IT process or organizational audit will yield.

 

July 26, 2012

Wanted: Marketing Analytics Director, Global Financial Services Firm (Mid-Atlantic) #Analytics

I've been working with a global financial services firm to develop its marketing analytics / intelligence capability, and we're now building a highly capable team to further extend and sustain the results and lessons so far.  This includes a Marketing Analytics Director to lead a strong team doing advanced data mining and predictive modeling to support high-impact opportunities in various areas of the firm.  Here's the job description on LinkedIn.  If you are currently working at a large marketer, major analytics consulting firm, or advertising agency, and have significant experience analyzing, communicating, and implementing sophisticated multi-channel marketing programs, and are up for the challenge of leading a new team in this area for a world-class firm in a great city, please get in touch!

July 11, 2012

Wonderfully #Pragmalytic Multi-Channel Attribution Advice From @avinash via @visualiq

Via my friends at VisualIQ, this wonderful post from Avinash Kaushik on doing multi-channel attribution and mix optimization in the real world.  Plus a really rich set of conversations in the comments. My summary of his advice (reassuringly consistent with my own experiences with "pragmalytic" approaches):

  • Start by solving for specific attribution / optimization use cases you face in the real world, not the more general form of the challenge.  He names three dominant ones he sees: "O2S -- Online to Store", "AMS -- Across Multiple Screens", and "ADC -- Across Digital Channels"
  • Use multiple analytic techniques to compensate for imperfect data that any one technique might rely on.  For example, if there are holes or quality problems with your data, supplement it with controlled tests
  • Don't cop out, but accept that there are no perfect answers, just better ones, and that you should bias toward acting on acceptably imperfect information and learning and improving based on actual experience

Absolutely terrific stuff here, gets even better on the third and subsequent reads.

March 20, 2012

Organic Data Modeling in the Age of the Extrabase #analytics

Sorry for the buzzwordy title of this post, but hopefully you'll agree that sometimes buzzwords can be useful for communicating an important Zeitgeist.

I'm working with one of our clients right now to develop a new, advanced business intelligence capability that uses state-of-the-art in-memory data visualization tools like Tableau and Spotfire that will ultimately connect multiple data sets to answer a range of important questions.  I've also been involved recently in a major analysis of advertising effectiveness that included a number of data sources that were either external to the organization, or non-traditional, or both.  In both cases, these efforts are likely to evolve toward predictive models of behavior to help prioritize efforts and allocate scarce resources.

Simultaneously, today's NYT carried an article about Clear Story, a Silicon Valley startup that aggregates APIs to public data sources about folks, and provides a highly simplified interface to those APIs for analysts and business execs.  I haven't yet tried their service, but I'll save that for a separate post.  The point here is that the emergence of services like this represents an important step in the evolution of Web 2.0 -- call it Web 2.2 -- that's very relevant for marketing analytics in enterprise contexts.

So, what's significant about these experiences?

Readers of Ralph Kimball's classic Data Warehouse Toolkit will appreciate both the wisdom of his advice and how, today, the context for it has changed.  Kimball is absolutely an advocate for starting with a clear idea of the questions you'd like to answer and for making pragmatic choices about how to organize information to answer them.  However, the major editions of the book were written in a time when three things were true:

  • You needed to organize information more thoughtfully up front, because computing resources to compensate for poor initial organization were less capable and more expensive
  • The number of data sources you could integrate was far more limited, allowing you to be more definitive up front about the data structures you defined to answer your target questions
  • The questions themselves, or the range of possible answers to them, were more limited and less dynamic, because the market context was so as well

Together, these things made for business intelligence / data warehouse / data management efforts that were longer, and a bit more "waterfall" and episodic in execution.  However, over the past decade, many have critiqued such efforts for high failure rates, mostly because they collapse under their own weight: too much investment, too much complexity, too few results.  Call this Planned Data Modeling.

Now back to the first experience I described above.  We're using the tools I mentioned to simultaneously hunt for valuable insights that will help pay the freight of the effort, define useful interfaces for users to keep using, and through these efforts, also determine the optimal data structures we need underneath to scale from the few million rows in one big flat file we've started with to something that will no doubt be larger, more multi-faceted, and thus more complex.  In particular, we're using the ability of these tools to calculate synthetic variables on the fly out of the raw data to point the way toward summaries and indexes we'll eventually have to develop in our data repository.  This will improve the likelihood that the way we architect that repository will directly support real reporting and analysis requirements, prioritized based on actual usage in initial pilots, rather than speculative requirements obtained through more conventional means.  Call this Organic Data Modeling.

Further, the work we've done anticipates that we will be weaving together a number of new sources of data, many of them externally provided, and that we'll likely swap sources in and out as we find that some are more useful than others.  It occurred to me that this large, heterogeneous, and dynamic collection of data sources would have characteristics sufficiently different in terms of their analytic and administrative implications that a different name altogether might be in order for the sum of the pieces.  Hence, the Extrabase.

These terms are not meant to cover up a cop-out.  In other words, some might say that mashing up a bunch of files in an in-memory visualization tool could reflect and further contribute to a lack of intellectual discipline and wherewithal to get it right.  In our case, we're hedging that risk by having the data modelers responsible for figuring out the optimal data repository structure work extremely closely with the "front-end" analysts, so that as potential data structure implications flow out of the rubber-meets-the-road analysis, we're able to sift them and decide which should stick and which we can ignore. 

But, as they say sometimes in software, "that's a feature, not a bug."  Meaning, mashing up files in these tools and seeing what's useful is a way of paying for and disciplining the back end data management process more rigorously, so that what gets built is based on what folks actually need, and gets delivered faster to boot.

March 15, 2012

Question for Search Marketing / Digital Content Experts #SEO #SEM #CPC @seomoz

As I understand Google SEM CPC pricing, Google considers not only how much the marketer is willing to pay, but also the quality of results, in determining what price puts you in what slot.

So, to a certain degree, your SEO efforts influence your SEM cost.

SEO these days is about:

  • relevant content
  • relevant links to that content from authoritative sites
  • social relevance of the content
  • content freshness
  • other stuff, like site speed

These things have readily measurable attributes:

  • organic search results position
  • PageRank
  • followers / likes / retweets etc.
  • feed update frequency
  • response time

So logically, one question you might ask yourself as a search marketer is whether you're globally optimized across your SEO and SEM investments.  Should your next dollar go to pay for a click, or to develop and promote and present content that will rise higher directly, and indirectly lower your SEM CPC?  This is not an academic question, as collectively SEM and SEO are today big, front-and-center elements of many firms' marketing efforts.

One way to answer this question is to build a model that, for a range of keywords, predicts CPC (as a dependent variable) from different values of the measurable attributes of SEO mentioned above, and perhaps others.
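A stripped-down sketch of that idea, reduced to a single illustrative predictor and an ordinary least-squares fit (a real model would be multivariate, and the example variable is my choice, not established practice):

```python
def fit_cpc_model(xs, ys):
    """Least-squares fit predicting CPC (ys) from one measurable SEO
    attribute (xs), e.g. organic position or PageRank.  Returns
    (slope, intercept).  A real model would regress on several of the
    attributes at once; one predictor keeps the sketch readable."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Classic closed-form simple regression: slope = cov(x, y) / var(x).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

def predict_cpc(model, x):
    slope, intercept = model
    return slope * x + intercept
```

Fit across a basket of keywords, the coefficients would tell you how much a unit of SEO improvement is worth in avoided CPC, which is the global-optimization comparison the question above calls for.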

I'm curious about whether anyone's seen or done analysis like this?  My quick search on "Optimizing SEO vs. SEM" yielded this interesting SEOMoz result: http://www.seomoz.org/blog/the-disconnect-in-ppc-vs-seo-spending .  I think, however, that this great article misses that most of the leverage in SEO comes not from the narrow definition of the term but from developing and promoting great content.

March 12, 2012

#SXSW Trip Report Part 2: Being There

(See here for Part 1)

Here's one summary of the experience that's making the rounds:

 

[Image: "Missing sxsw"]

 

I wasn't able to be there all that long, but my impression was different.  Men of all colors (especially if you count tattoos), and lots more women (many tattooed also, and extensively).   I had a chance to talk with Doc Searls (I'm a huge Cluetrain fan) briefly at the Digital Harvard reception at The Parish; he suggested (my words) the increased ratio of women is a good barometer for the evolution of the festival from narcissistic nerdiness toward more sensible substance.  Nonetheless, on the surface, it does remain a sweaty mosh pit of digital love and frenzied networking.  Picture Dumbo on spring break on 6th and San Jacinto.  With light sabers:

 

[Image: SXSW light sabers]

 

Sight that will haunt my dreams for a while: VC-looking guy, blazer and dress shirt, in a pedicab piloted by a skinny, grungy student (?).  Dude, learn Linux, and your next tip from The Man at SXSW might just be a term sheet.

So whom did I meet, and what did I learn?

I had a great time listening to PRX.org's John Barth.  The Public Radio Exchange aggregates independent content suitable for radio (think The Moth), adds valuable services like consistent content metadata and rights management, and then acts as a distribution hub for stations that want to use it.  We talked about how they're planning to analyze listenership patterns with that metadata and other stuff (maybe gleaning audience demographics via Quantcast) for shaping content and targeting listeners.  He related for example that stations seem to prefer either 1-hour programs they can use to fill standard-sized holes, or two- to seven-minute segments they can weave into pre-existing programs.  Documentary-style shows that weave music and informed commentary together are especially popular.  We explored whether production templates ("structured collaboration": think "Mad Libs" for digital media) might make sense.  Maybe later.

Paul Payack explained his Global Language Monitor service to me, and we explored its potential application as a complement if not a replacement for episodic brand trackers.  Think of it as a more sophisticated and source-ecumenical version of Google Insights for Search.

Kara Oehler's presentation on her Mapping Main Street project was great, and it made me want to try her Zeega.org service (a Harvard metaLAB project) as soon as it's available, to see how close I can get to replicating The Yellow Submarine for my son, with other family members spliced in for The Beatles.  Add it to my list of other cool projects I like, such as mrpicassohead.

Peter Boyce and Zach Hamed from Hack Harvard, nice to meet you. Here's a book that grew out of the class at MIT I mentioned -- maybe you guys could cobble together an O'Reilly deal out of your work!

Finally,  congrats to Perry Hewitt (here with Anne Cushing) and all her Harvard colleagues on a great evening!

 

[Photo: Perry Hewitt and Anne Cushing]

 

 

January 26, 2012

Controlling for Impression Volatility in Digital Ad Spend Tests @DataXu

I've recently been involved in evaluating the results of a matched market test that looked at the impact of changes in digital advertising spend by comparing test vs. control markets, and by comparing differential lift in these markets over prior periods (e.g., year on year).  One of the challenges involved in such tests is significant "impression volatility" across time periods -- basically, each dollar can buy you very different volumes of impressions from year to year.  

You can unpack this volatility into at least three components:  

  • changes in overall macro-economic conditions that drive target audiences' attention,
  • changes in the buying approach you took / networks you bought through, due to network-specific structural (like what publishers are included) and supply-demand drivers (like the relative effectiveness of the network's targeting approach)
  • changes in "buy-specific" parameters (like audiences and placements sought).  

Let's assume that you handle the first with your test / control market structure.  Let's also assume that the third is to be held constant as much as possible, for the purposes of the test (that is, buying the same properties / audiences, and using the same ad positions / placements for the tests).   So my question was, how much volatility does the second factor contribute, and what can be done to control for that in a test?

Surfing around I came on DataXu's March 2011 Market Pulse study.  DataXu is a service that allows you to buy across networks more efficiently in real time, sort of like what Kayak would be to travel if it were a fully automated agent and you flew every day.  The firm noted a year-on-year drop in average daily CPM volatility from 102% to 42% from May 2010 to February 2011 (meaning I think the average day to day change in price across all networks in each of the two months compared).  They attributed this to "dramatically increased volume of impressions bought and sold as well as maturation of trading systems".  Notwithstanding, the study still pointed to a 342% difference in average indexed CPMs across networks during February 2011.  

A number this big naturally piqued my interest, and so I read into the report to understand it better.  The top of page 2 of the report summary presents a nice graph that shows average monthly indexed CPMs across 11 networks, and indeed shows the difference between the highest-priced and the lowest-priced network to be 342%.  Applying "Olympic scoring" (tossing out highest- and lowest-priced exchanges) cuts that difference to about 180%, or roughly by half -- still a significant discrepancy of course.  Looking further, one standard deviation in the whole sample (including the top and bottom values) is about 44%.  Again, though perhaps a bit less dramatic for marketers' tastes, still lots.
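The arithmetic behind those three summary numbers -- raw spread, Olympic-trimmed spread, and standard deviation -- can be sketched as follows (illustrative figures only; these are not DataXu's data):

```python
import statistics

def cpm_spread(indexed_cpms):
    """Summarize cross-network CPM dispersion three ways: raw spread
    between the priciest and cheapest network, 'Olympic' spread after
    tossing out the highest- and lowest-priced networks, and the
    standard deviation of the whole sample.  Spreads are expressed
    relative to the cheapest network in each comparison."""
    ranked = sorted(indexed_cpms)
    raw = (ranked[-1] - ranked[0]) / ranked[0]
    trimmed = ranked[1:-1]  # Olympic scoring: drop top and bottom
    olympic = (trimmed[-1] - trimmed[0]) / trimmed[0]
    return raw, olympic, statistics.pstdev(indexed_cpms)
```

The point of computing all three: a headline max-vs-min spread can be driven by one or two outlier networks, so the trimmed spread and the standard deviation give a fairer read on how much "impression volatility" a typical buy would actually face.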

(It's hard to know how "equivalent" the buys compared were, in terms of volumes, contextual consistency, and audience consistency, since the summary doesn't address these.  But let's assume they were, roughly.)

So what? If your (display) ad buys are not so property-specific / audience-targeted that run-of-network buys in contextual or audience categories are OK, future tests might channel buys through services like DataXu and declare the buys "fully-price-optimized" across the periods and markets compared, allowing you to ignore +/- ~50% "impression volatility" swings, assuming the Feb 2011 spreads hold.

However, if what you're buying is very specific -- and only available through direct purchase, or one or two specialized networks at most -- then you ignore factor 2, trust the laws of supply and demand, and assume that you've bought essentially the same "attention" regardless of the difference in impressions.

I've asked some knowledgeable friends to suggest some perspectives on this, and will pass along their ideas.  Other feedback welcome, especially from digital advertising / testing pros!  Oh and if you're really interested, check out the DataXu TC50 2009 pitch video.

December 12, 2011

Caveman BI

Many client project workplans I encounter have the line item "Determine KPIs".  Without further context to frame this work, folks typically end up with something that:

  • is (too) narrow, functionally
  • is impermanent -- KPI reporting and analysis lasts only as long as the effort itself is sustained
  • is disconnected from / uncoordinated with other, interdependent parts of the business

Now, consider the view from the analytic teams supporting efforts like these.  When multiple such requests come in, it's left to these teams to prioritize them, and reconcile them semantically and logistically.  This is hard.

To deal with these challenges, many BI (Business Intelligence) tool vendors provide pre-built "business models" that suggest a set of KPIs based on a more fundamental concept of the operations they reflect.  These can be useful, but they're not free, and it's a lot of work to retro-fit them to your business.  This is expensive.

Whether you use a vendor model as a point of departure, or roll your own, I've found it useful to have a basic "Caveman BI Model" that serves three purposes:

  • it provides a common language to the folks involved in the KPI development effort
  • it provides a broader context on how KPIs fit into the business, so folks don't lose the forest for the leaves
  • it starts simple, so your team can work its way up to the complexity you need, rather than trimming back from the complexity it's given by a pre-existing model

The Caveman BI Model starts with a plain-English, noun-verb description of what the business does that needs to be understood better:

"We use resources to create products that we sell to customers with content through channels."

Generic and esoteric maybe, but in one sentence we've described the entities of the business that we need to understand better:

"We use resources to create products that we sell to customers with content through channels."

Put another way, this one sentence describes "things I need to know about to run the business".

Next, we use metrics to describe how (well) we're managing the entities.  Metrics have the following progression of development:

  • a raw count: "we have n customers"
  • a synthetic measure, commonly a quotient, produced from more than one count: "$x in sales per y customers"
  • an index that allows us to compare a raw count or a quotient longitudinally (against past performance), cross-sectionally (say, against competitors or industry averages), or as a variance vs. a budget / plan / forecast
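The progression above can be sketched in a few lines (all figures invented for illustration):

```python
# Hypothetical figures illustrating the count -> quotient -> index progression.
customers = 1200            # a raw count: "we have n customers"
sales = 540_000.0           # another raw count, in dollars

# A synthetic measure (quotient) produced from more than one count:
revenue_per_customer = sales / customers    # $450 per customer

# Indexes compare that quotient against reference points:
last_year = 420.0           # longitudinal: our own past performance
plan = 475.0                # a budget / plan / forecast

index_vs_last_year = revenue_per_customer / last_year * 100  # > 100 means growth
variance_vs_plan = revenue_per_customer - plan               # negative means shortfall
```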

Now, each of these entities has dimensions by which we can "slice" metrics to dig into the business a bit.  For example:

  • resources would include capital, people, raw materials, and fixed assets, each of which might have its own characteristics (equity vs. debt; hourly vs. salaried; source; etc.)
  • products could be characterized according to needs served (say, a disease condition for a pharmaceutical), or an underlying technology or legal framework that makes it possible (say, "Forty-Act" in the mutual fund business)
  • customers can have a number of properties that allow us to segment them usefully; these might include wealth, how big their relationship with us is, geography, gender, age, etc.
  • content would include "campaigns" that use different marketing assets, themes, or promotions
  • channels covers distinctions like "direct sales" vs. "distributors", or "TV" vs. "digital"

(Oversimplifying: Ralph Kimball's BI bible, The Data Warehouse Toolkit, calls metrics "facts" and suggests constructs called "fact tables" and "dimension tables" that relate facts and dimensions to each other.  Put another way, what are the different dimensions that we have to slice each fact by in order to understand the business?  "Sales to customers by product and customer geography" would be an example of an intersection we might want to know about.)
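Kimball's fact/dimension idea can be sketched in a few lines -- a toy fact table with invented rows, and a function that slices one metric by one dimension:

```python
from collections import defaultdict

# A toy "fact table": each row records one metric (sales) with its dimensions.
# Products, geographies, and amounts are invented for illustration.
fact_sales = [
    {"product": "Fund A", "geography": "US", "channel": "direct",      "sales": 100.0},
    {"product": "Fund A", "geography": "UK", "channel": "distributor", "sales": 60.0},
    {"product": "Fund B", "geography": "US", "channel": "direct",      "sales": 40.0},
]

def slice_by(facts, dimension, metric="sales"):
    """Sum a metric grouped by one dimension: 'sales by product', etc."""
    totals = defaultdict(float)
    for row in facts:
        totals[row[dimension]] += row[metric]
    return dict(totals)

# "Sales by geography" is one intersection we might want to know about:
# slice_by(fact_sales, "geography") -> {"US": 140.0, "UK": 60.0}
```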

These dimensions in turn have hierarchies.  Geography, for example, has "city, state, country".  Some hierarchies are helpfully scalar: second-minute-hour-day, for example.  But hierarchies are often a big problem because they don't reconcile neatly.  Weeks don't divide evenly into months, for example, and some industry data may be sourced from places that use different hierarchies, making apples-to-apples comparisons hard.
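Rolling a metric up a hierarchy is just repeated aggregation.  A sketch, with an invented city-state-country mapping and invented sales figures:

```python
# Invented geography hierarchy: city -> state -> country.
city_to_state = {"Boston": "MA", "Worcester": "MA", "Austin": "TX"}
state_to_country = {"MA": "US", "TX": "US"}

city_sales = {"Boston": 10.0, "Worcester": 5.0, "Austin": 7.0}

def roll_up(values, mapping):
    """Aggregate child-level values up one level of a hierarchy."""
    parent = {}
    for child, amount in values.items():
        key = mapping[child]
        parent[key] = parent.get(key, 0.0) + amount
    return parent

state_sales = roll_up(city_sales, city_to_state)        # {"MA": 15.0, "TX": 7.0}
country_sales = roll_up(state_sales, state_to_country)  # {"US": 22.0}
```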

That's pretty much it.  Summing up:

  • what do we want to know about? entities
  • how do we want to measure it? metrics
  • how do we need to slice this measurement? dimensions
  • how do we roll up / roll down these slices? hierarchies

So how do we define and distinguish KPIs from the myriad metrics we could choose?  One rule of thumb I've applied is to choose metrics that start and end a process.  For example, "qualified leads" are an output of a marketing process that become an input to a sales process.  A transaction is then the end of the sales process.  Each of these could be a KPI; a "synthetic" KPI might compare leads to sales as a "conversion rate".
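In code terms, the rule of thumb looks like this (the pipeline counts are invented for illustration):

```python
# Hypothetical pipeline counts.  KPIs sit where one process ends and the next begins:
qualified_leads = 800   # output of the marketing process, input to the sales process
transactions = 120      # end of the sales process

# A "synthetic" KPI comparing leads to sales:
conversion_rate = transactions / qualified_leads   # 0.15, i.e. a 15% conversion rate
```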

If you're charged with coming up with KPIs and then provisioning them for decision-makers (who may include you too!), I'd suggest you start by asking someone to explain what the "Caveman BI Model" is for your business, using this framework as your interview guide.  If you find that such a model has not been expressed per se, or doesn't exist, take a pass through creating one, maybe by trying to tie some of the key numbers you see used a lot back into the framework.

Again, the purpose of this post isn't to be the last word in Business Modeling, but rather to share a basic approach you can carry around in your head -- and in common across your heads -- that's been useful to me in the past.  Please let me know what you think!

 

April 21, 2011

Analyzing #Analytics

Take a look at this chart:

Here's what I see:

  • interest bumping along to '06
  • growing interest through '08 (Competing on Analytics appears March 2007)
  • hype-y growth in interest through '10
  • wildly uneven interest so far this year
  • growing interest in the early part of each year
  • summer "blahs" (Who Googles "analytics" on the beach? Don't answer that...)
  • bump in interest in the fall, when business plans featuring "analytics" are developed and presented
  • quiet Decembers

What do you see?  Are we watching "analytics" jump the shark?  More likely, we've chowed down on it as much as we can, and are now digesting.  Further interest will likely be less at the general conceptual level, and more at the "how do I get there" level.  Hence "Pragmalytics" (Part II here).

Stay tuned for more.