Cloud computing being so 2011, big data is going to be a key IT buzzword for 2012. Yet, as far as we understand our retail clients, one data source holds more than 90% of the total information value in their possession: market basket data (tagged with loyalty card information when available).

For any mid-to-large retail network, the informational value of market basket data simply dwarfs nearly all alternative data sources, be it:

  • In-store video data, which remain difficult to process and are primarily focused on security.
  • Social media data, which are very noisy and reflect bot activity as much as human behavior.
  • Market analysts’ reports, which require the scarcest resource of all: management attention.

Yet, besides basic sales projections (i.e. sales per product, per store, per region, per week …), we observe that, as of January 2012, most retailers are doing very little with their market basket data. Even forecasting for inventory optimization is typically nothing more than a moving average variant at the store level. More elaborate methods are used for warehouses, but there retailers are no longer leveraging basket data; they rely on past warehouse shipments instead.
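To make the "moving average variant" concrete, here is a minimal sketch of what such a store-level forecast amounts to. The function name, the 4-week window and the sales figures are illustrative assumptions, not a description of any particular retailer's system.

```python
def moving_average_forecast(weekly_sales, window=4):
    """Forecast next week's sales as the mean of the last `window` weeks.

    Hypothetical helper for illustration; real systems use many variants
    (weighted, seasonal, etc.).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    recent = weekly_sales[-window:]
    if not recent:
        raise ValueError("weekly_sales must not be empty")
    return sum(recent) / len(recent)


# Made-up weekly unit sales for one product in one store.
history = [12, 15, 11, 14, 13, 16]
forecast = moving_average_forecast(history, window=4)
print(forecast)
```

Note that nothing here looks at baskets at all: the input is a single aggregated sales series per product and store, which is precisely why so much of the basket-level information goes unused.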

Big Data vendors promise to bring an unprecedented level of data processing power to their clients to let them harness all the potential of their big data. Yet, is this going to bring profitable changes to retailers? Not necessarily so.

The storage capacity sitting on the shelves of an average hypermarket, with 20+ external drives on display (assuming 500GB per drive), typically exceeds the raw storage needed to persist a whole 3 years of history for a 1000-store network (i.e. 10TB of market basket data). Hence, raw data storage is not a problem, or, at least, not an expensive one. Data I/O (input/output) is a more challenging matter, but again, with an adequate data representation (the details go beyond the scope of this post), it is hardly a challenge as of 2012.
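The back-of-envelope comparison above can be checked in a few lines. All figures come from the paragraph itself; the per-store breakdown is simply derived from them, and decimal units (1TB = 1000GB) are assumed.

```python
# Figures taken from the text above.
DRIVES_ON_SHELF = 20        # external drives on display in one hypermarket
DRIVE_CAPACITY_GB = 500     # assumed capacity per drive
STORES = 1000               # size of the retail network
YEARS = 3                   # length of the history to persist
BASKET_DATA_TB = 10         # market basket data for the whole network

# Assumption: decimal units, 1 TB = 1000 GB.
shelf_capacity_tb = DRIVES_ON_SHELF * DRIVE_CAPACITY_GB / 1000
per_store_per_year_gb = BASKET_DATA_TB * 1000 / (STORES * YEARS)

print(f"Shelf capacity: {shelf_capacity_tb:.0f} TB")
print(f"Basket data per store per year: {per_store_per_year_gb:.1f} GB")
```

With exactly 20 drives the shelf matches the 10TB needed; any drive beyond that puts the shelf ahead. Per store and per year, the history comes to only a few gigabytes.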

We observe that the biggest challenge posed by Big Data is simply the manpower required to do anything operational with it. Indeed, the data is primarily big in the sense that the company resources available to run the Big Data software, and to implement whatever suggestions come out of it, are thin.

Producing a wall of metrics out of market basket data is easy, but it is much harder to build a set of metrics worth the time spent reading them, considering the hourly cost of employees.

As far as we understand our retail clients, the manpower constraint alone explains why so little is being done with market basket data on an ongoing basis: while CPU has never been so cheap, staffing has never been so expensive.

Thus, we believe that Big Data successes in retail will be achieved by lean solutions that treat people, not processing power, as the scarcest resource of all.


Reader Comments (1)

Joannes, I’m impressed with the work you are doing. If you haven’t done so yet, please check what QlikView has to offer in “business intelligence” or “business discovery”. The platform is really fast and advanced at turning data into knowledge. I am sure you can learn something from them. But yes, the numbers can’t speak for themselves even in Big Data, so the manpower constraint can’t be totally erased. Salut

5 years ago | Ali