Last week, we introduced the concept of events for the second version of our Forecasting API. This week, we introduce the second major feature of Forecasting API v2: tags.

Another limitation of API v1 is its handling of very short time-series. Retailers need to forecast their sales starting from Day 1 - but at Day 1, you have exactly zero historical sales data to leverage for this product. Thus, you can’t produce a statistical forecast. Seems obvious …

… but wrong, because at Day 1 of a new product launch, there is already a lot of historical data: the historical sales of all the other products.

Intuitively, let’s assume that we have a candy retailer who introduces a lollipop with a new orange flavor. Even if the retailer doesn’t know exactly how many orange-flavored lollipops he is going to sell, he can reasonably approximate his sales forecast by assuming that sales will be roughly on the same level as those of the other lollipop flavors (assuming no cannibalization, for the sake of simplicity).
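The lollipop reasoning above can be sketched in a few lines of Python. This is only an illustration of the intuition, not how the Forecasting API computes anything: the product names, the sales figures and the `day_one_forecast` helper are all made up for the example.

```python
from statistics import mean

# Hypothetical daily unit sales of the existing lollipop flavors
# (made-up numbers, for illustration only).
history = {
    "lollipop-cherry": [12, 15, 11, 14],
    "lollipop-lemon": [9, 10, 8, 13],
    "lollipop-cola": [14, 16, 15, 15],
}


def day_one_forecast(sibling_sales):
    """Naive Day 1 forecast: the average daily sales level of sibling products.

    Assumes no cannibalization, as in the text.
    """
    per_product = [mean(series) for series in sibling_sales.values() if series]
    if not per_product:
        raise ValueError("no sibling history available")
    return mean(per_product)


# Expected daily sales for the new orange flavor, which has no history yet.
orange_forecast = day_one_forecast(history)
print(f"orange lollipop, expected units per day: {orange_forecast:.1f}")
```

The new product gets a forecast even though its own series is empty, because the history of the other flavors stands in for it.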

Through this small example, it’s clear that a lack of historical data on a single series does not prevent forecasts from being produced. Yet most of the forecasting literature is devoted to the study of stationary processes, which are usually a significant mismatch with business, where time-series do have both a start and an end, frequently associated with the lifetime of a product or a service.

This is why our Forecasting API v2 introduces the notion of time-series tags. For those who are not familiar with tags, they are a user-friendly yet expressive way to decorate almost any object - a web page, a file, or a time-series in the case of Lokad.

There are many businesses where long-lasting products are rather rare:

  • fashion with two collections per year, the products of the new collection never exactly matching the products of the previous collection.
  • electronics with fast-paced technological changes where almost no product stays on the market more than 18 months.
  • media where the market is driven by novelties - new clips, new movies, new songs.

Tags should express product properties. Lokad will use them to perform a loose matching against the products sold in the past that share at least some of the same tags.
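To make the idea of loose matching more concrete, here is a minimal Python sketch. It is not Lokad's actual algorithm, which this post does not describe. It assumes that the overlap between two tag sets is scored with Jaccard similarity and that the forecast is a similarity-weighted average of past sales. The tags, the sales figures, the `min_similarity` threshold and the function names are all hypothetical.

```python
def jaccard(tags_a, tags_b):
    """Share of tags two products have in common (0.0 = none, 1.0 = identical)."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def forecast_from_tags(new_tags, catalog, min_similarity=0.5):
    """Similarity-weighted average of the mean daily sales of past products.

    Products whose tag overlap falls below `min_similarity` are ignored.
    Returns None when no past product matches closely enough.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for tags, sales in catalog:
        weight = jaccard(new_tags, tags)
        if weight >= min_similarity and sales:
            weighted_sum += weight * (sum(sales) / len(sales))
            total_weight += weight
    return weighted_sum / total_weight if total_weight else None


# Hypothetical catalog: (tags, past daily unit sales), made-up numbers.
catalog = [
    ({"candy", "lollipop", "cherry"}, [12, 15, 11, 14]),
    ({"candy", "lollipop", "lemon"}, [9, 10, 8, 13]),
    ({"candy", "chocolate", "bar"}, [40, 42, 38, 40]),
]

new_product_tags = {"candy", "lollipop", "orange"}
tag_forecast = forecast_from_tags(new_product_tags, catalog)
print(tag_forecast)
```

In this sketch the two lollipops each share two of four distinct tags with the new product and drive the forecast, while the chocolate bar, which only shares the "candy" tag, falls below the threshold and is left out.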

Thus, tagging time-series is the answer provided by Lokad in order to produce statistical forecasts with little or no data.