Analytics: Frequency Distribution & Bell Curves

8 11 2010

A statistical method we often overlook is the distribution curve.  I think it is dismissed most of the time because people who are uncomfortable with math get nervous about using statistics.  While there are some advanced concepts behind frequency curves, the curve can also be used visually, as a simple tool to explain results.

A simple stats lesson….

Normal Bell Curve – roughly 68% of the population falls within 1 standard deviation (a measure of variation) of the average, and 95% falls within two standard deviations. Below is an example using IQ scores.  The average score is 100, and 68% of the data is between 85 and 115.
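The 68/95 rule above is easy to verify yourself. Here is a minimal sketch that simulates IQ-like scores (mean 100, standard deviation 15 — the conventional scaling) and counts how much of the sample lands within one and two standard deviations:

```python
import random
import statistics

# Simulate 100,000 IQ-like scores drawn from a normal distribution
# with mean 100 and standard deviation 15.
random.seed(42)
scores = [random.gauss(100, 15) for _ in range(100_000)]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)

# Fraction of the sample within 1 and 2 standard deviations of the mean.
within_1sd = sum(1 for s in scores if abs(s - mean) <= sd) / len(scores)
within_2sd = sum(1 for s in scores if abs(s - mean) <= 2 * sd) / len(scores)

print(f"within 1 sd: {within_1sd:.1%}")  # roughly 68%
print(f"within 2 sd: {within_2sd:.1%}")  # roughly 95%
```

The exact percentages wobble slightly from run to run, but they settle very close to 68% and 95% — which is the whole point of the rule.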

While this visualization doesn’t do a tremendous amount for us, this is what we assume when we think of populations, like customers and employees.  And because of our limited statistical training we make a large number of assumptions based on averages.  We love to look at average revenue: average revenue per employee, average revenue per customer, etc.  This thinking also gets us looking into the outliers (that <5% that sits way out to the left or right of the chart).  How much time do you spend on less than 5% of the business?

OK, so back to thinking of this in terms of running a business….

Let’s map out our revenue per customer.  I would be willing to bet it looks something like the following:

If this is the customer revenue distribution and we use the average in our analyses, we can quickly generate a number of wrong assumptions.  First and foremost, our typical customer looks larger than it really is.  It might lead us to think we are serving mid-sized businesses when our customers are more likely smaller-market companies.  I am also willing to bet our profitability per customer follows a similar curve.  In that case, we are likely spending money on the wrong customers and aligning our better services to lower-profit customers (or, more likely, profit-destroying customers).
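A quick way to see why the average misleads here is to simulate a skewed revenue distribution. This sketch assumes a lognormal shape (many small customers, a long tail of large ones); the parameters are made up for illustration:

```python
import random
import statistics

# Hypothetical customer revenue: lognormal, i.e. heavily right-skewed.
# The distribution parameters here are illustrative, not real data.
random.seed(7)
revenue = [random.lognormvariate(9, 1.0) for _ in range(10_000)]  # dollars

mean_rev = statistics.mean(revenue)
median_rev = statistics.median(revenue)

print(f"mean revenue:   ${mean_rev:,.0f}")
print(f"median revenue: ${median_rev:,.0f}")
```

On a skewed curve like this, the mean lands well above the median: the "average customer" is noticeably larger than the customer sitting in the middle of the distribution, which is exactly the wrong assumption described above.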

Do we need to use it for everything? Of course not, but it might help every once in a while to challenge our overuse of the mathematical average and reassess our perspective on the business.  A great place to start is to map out the customer base in terms of revenue (profitability is better, but takes a lot longer to do).  It might just lead you to understand your customers (think customer segmentation) better.
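Mapping the customer base can start as simply as bucketing customers into revenue bands and counting. A minimal sketch, with entirely hypothetical customer names and revenue figures:

```python
from collections import Counter

# Hypothetical annual revenue per customer (illustrative figures only).
customers = {
    "Acme": 4_200, "Blue Co": 1_100, "Cyan LLC": 900, "Delta": 52_000,
    "Echo": 2_300, "Foxtrot": 800, "Globex": 310_000, "Hooli": 1_900,
}

def band(rev):
    """Assign a customer to a revenue band."""
    if rev < 1_000:
        return "<$1K"
    if rev < 10_000:
        return "$1K-$10K"
    if rev < 100_000:
        return "$10K-$100K"
    return "$100K+"

segments = Counter(band(r) for r in customers.values())
for label in ["<$1K", "$1K-$10K", "$10K-$100K", "$100K+"]:
    print(f"{label:>12}: {segments[label]}")
```

Even this crude banding shows the shape of the base — a handful of small accounts, a cluster in the middle, and one or two outliers — which a single average would hide entirely.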

Real-life example: I was once part of a research project to understand discounting on one side of the outliers (<1% of the business).  The outcome was a recommendation to focus on reducing discounting to that <1% of the business.  What I argued was to focus on the larger part of the business, where the same effort would have resulted in millions more in profit.  It was a clear lesson in where to apply process improvement.

Advanced Analytics

22 03 2010

A major item organizations grapple with is the concept of advanced analytics.  They want it, but have little idea how to use the various tools to make it happen.  Unfortunately, too much information often blurs the lines.

For example, I watched a sales presentation on Predictive Analytics where the key outcome showed how to build databases with the tool, yet it almost completely missed the fact that the real benefit should have been something like: “We were able to identify two segments to target a marketing program for more effectiveness.  Instead of spending $500K on a generic campaign, we were able to identify the key attributes that drove increased customer interaction and focus a $200K campaign on those segments.”

Why is this? The primary reason is we do not truly understand the tools and how best to use them.  A Swiss army knife is not good for home repair, but it is the perfect tool to throw in a hockey bag or car trunk for occasional use as a widget to get you out of a jam – a screw needs to be tightened, a shoelace needs to be cut, or an apple peeled.  We need to understand which tool to use in each situation instead of thinking of the various tools as universal.

Business Intelligence, Planning, What-If Scenario Tools, Optimization, Dashboarding, Scorecarding, Cubes, Cluster Analysis, and Predictive Analytics are all different tools for vastly different purposes, yet they appear to have similar uses.

Advanced Analytical Tools

Here are the core elements of Advanced Analytical tools:

  • Business Intelligence – great for creating an enterprise-wide data visualization platform.  If you do this right, you should create a single version of the truth for various terms within an organization.  It should enable better reporting consistency standards for the organization.  In the end, it reports what the data says.
    • Scorecard & Dashboards – These are primarily BI tools that have a more organized or structured methodology for presenting ideally the Key Performance Indicators.  These are great tools, but to be most effective, they need a specific purpose that is highly integrated into a management process.
  • Enterprise Scenario Planning – Most enterprise planning exercises are giant what-if scenarios that try to plan out financial outcomes based on a series of drivers (employees, widgets, sales reps, etc.).  We build out plans based on a number of assumptions, like the average sales rep drives $2mil in business, or benefit costs for the year are going to be #of employees * average salary * 2.  We do this primarily to lay out a game plan for the year and we do it as part of an annual or rolling cycle.
  • Tactical or Ad-Hoc What-if Scenario Analysis – Besides the full-scale project we do to plan out the company’s cash outlays, we also do a significant amount of smaller, typically tactical “what-if” scenario tests.  This is traditionally done in Microsoft Excel.  We dump a bit of data into Excel, make a number of assumptions, and try to build out likely scenarios.  For example, “if we were to create a customer loyalty program, what would be the cost and the likely reward?”  We are doing this to test ideas, so yes, it might be ideal to bolt these into the enterprise planning tool, but that typically takes too much overhead.  It is easier to just get something done quickly, then make a go/no-go decision.
    • Data Visualization can also be a great help with this – to bolt on a couple of reports to see the data and how different scenarios impact the various facts and dimensions.  This can help us with our conclusions and recommendations.
  • Predictive Analytics – This tool is best used when we have historical data, or a representative data set, and we want to draw a conclusion based on mathematics.  The key is math.  This is not guessing; it is improving the chances of being right with math, a structured approach to removing risk from decision making.  With a planning tool, we primarily use assumptions to create plans.  We cannot use predictive analytics for all decisions, but for a few specific types of decisions:
    • What transaction details and customer insight can we use to determine credit card fraud?
    • What customer attributes create our buying segments?
    • Which customers are most likely to abandon our offering?
    • What products are most often purchased together?
    • Which taxpayers most likely need to be audited?
  • Optimization Analytics – This is perhaps the most specialized advanced analytics tool, aimed at one business question: “Given the parameters of these trade-offs, which mix of resources creates the most effective (or efficient) use of those resources?” This helps make decisions around production locations and product investment.  Like predictive analytics, it is mathematically based (though you may need to make a couple of assumptions as well) in how it determines the answer.
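The driver-based planning described above is easy to sketch in code. This example uses the kinds of assumptions mentioned earlier (roughly $2M of revenue per sales rep, benefit costs of headcount × average salary × 2); all figures are illustrative, not real plan data:

```python
# A minimal driver-based "what-if" scenario: revenue and benefit costs
# are derived from a handful of driver assumptions.

def plan_scenario(sales_reps, revenue_per_rep, employees, avg_salary):
    """Compute a simple plan from driver assumptions."""
    revenue = sales_reps * revenue_per_rep
    benefit_costs = employees * avg_salary * 2  # salary-multiple assumption
    return {
        "revenue": revenue,
        "benefit_costs": benefit_costs,
        "margin_before_other_costs": revenue - benefit_costs,
    }

# Base plan vs. an upside scenario with two more reps and ten more hires.
base = plan_scenario(sales_reps=10, revenue_per_rep=2_000_000,
                     employees=120, avg_salary=60_000)
upside = plan_scenario(sales_reps=12, revenue_per_rep=2_000_000,
                       employees=130, avg_salary=60_000)

print("base:  ", base)
print("upside:", upside)
```

Changing one driver and re-running is the whole workflow — which is why these tactical what-ifs so often live in a spreadsheet rather than an enterprise tool.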

Advanced Analysts

Another reason we lack understanding is our analysts.  Our analysts commonly come from the IT team, trained in data structures, or from the finance team, trained in accounting.  Neither is wrong; they just have a default mindset that falls back on the tool they know best.  What is missing is the business/statistically trained person who can both lay out the hypothesis and, more importantly, explain the results.

We do not want correlation explained in R-squared values, “63% of the variation of the data is explained by our independent variables.”  While this may make sense to other statisticians and mathematicians, it is lost on the business.   One key value of using a math-based concept is that the explanation should sound more like, “We have found a way to decrease fraud by 3.2%, which should result in a $576K return to the business every quarter” or “We have tested our marketing campaigns and have found three segments that are 25% more likely to purchase based on the campaign, which should result in a payback period of 3 months.”

The right tool with the right skill set is imperative to successfully using advanced analytics.  We also need the discipline to have the right people using the right tools for the right information to drive action.  If you have an algorithm that predicts customer defection, you need to use it and test the results.  It is never going to be perfect, but in most cases, you can bet it will be better than not using it at all.

