00:00:08 Introduction to demand forecasting and data aggregation.
00:00:41 Different types of granularities in demand forecasting.
00:02:00 Challenges of various data aggregation levels in forecasting.
00:05:28 Disaggregated level: SKU per day, and reconstructing other aggregation levels.
00:08:31 Edge cases and challenges with perishable products in demand forecasting.
00:09:42 Importance of granular information in data aggregation.
00:11:01 Reconstructing the desired level of aggregation from the most granular data.
00:13:01 Limitations of time series techniques in very disaggregated data.
00:15:01 Time series techniques and the assumption of the future being more of the same.
00:17:00 The seductive and misleading nature of time series models.
00:19:03 Discussing the disadvantages of aggregation in forecasting.
00:20:00 Exploring the importance of granularity in decision-making.
00:21:38 Examining relevant horizons and their impact on supply chain decisions.
00:23:48 Arguing against arbitrary aggregation and its potential impact on supply chain efficiency.
00:26:35 Suggesting a focus on decision-driven granularity and avoiding premature optimization.

Summary

Joannes Vermorel, founder of Lokad, discusses the importance of selecting the right level of data aggregation for demand forecasting in an interview with Nicole Zint. Two dimensions are considered: temporal (the time intervals used to aggregate data) and structural (the organization of the supply chain). Vermorel notes that daily and SKU-level aggregation suit most supply chain networks, but edge cases may require more granular data. He warns against the limitations of time series models in supply chain forecasting, encouraging a broader perspective that accounts for factors such as perishability, cannibalization, substitution, and variable lead times. He emphasizes decision-driven granularity and extending forecasting horizons beyond lead times.

Extended Summary

In this interview, host Nicole Zint discusses demand forecasting and the right level of data aggregation with Joannes Vermorel, the founder of Lokad, a software company specializing in supply chain optimization. They explore the different types of granularities in demand forecasting and the impact of these granularities on forecasting methods.

Vermorel explains that there are two main dimensions to consider when choosing the granularity for demand forecasting: temporal and structural. The temporal dimension refers to the time intervals used to aggregate the data, such as hourly, daily, weekly, monthly, or yearly. The structural dimension relates to the organization of the supply chain, including product categories and locations. This could involve aggregating data by SKU (Stock Keeping Unit), product reference, product family, product super family, or category, and then aggregating by site, region, or country.

When discussing the types of forecasts associated with these granularities, Vermorel mentions that the concept of business intelligence, or hypercubes, popularized in the 1990s, is relevant. The historical data can be represented as vectors, with each level of granularity creating a unique vector. When a time dimension is added, these vectors can be interpreted as time series data, which can then be used for forecasting.
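
As a rough illustration of this pivot, the sketch below (in Python, with entirely hypothetical data and field names, not Lokad's implementation) buckets raw transaction rows into one equispaced weekly vector per SKU, which can then be read as a time series:

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw transactional history: (sku, sale_date, quantity) rows,
# as they might sit in an ERP sales table.
transactions = [
    ("SKU-A", date(2023, 1, 2), 3),
    ("SKU-A", date(2023, 1, 5), 1),
    ("SKU-A", date(2023, 1, 12), 4),
    ("SKU-B", date(2023, 1, 9), 2),
]

def weekly_vectors(rows, start, weeks):
    """Pivot raw transactions into one equispaced weekly vector per SKU."""
    vectors = defaultdict(lambda: [0] * weeks)
    for sku, day, qty in rows:
        index = (day - start).days // 7          # which weekly bucket this sale falls into
        if 0 <= index < weeks:
            vectors[sku][index] += qty           # aggregate the quantity into that bucket
    return dict(vectors)

# One "slice" of the hypercube: a time series per SKU at the weekly level.
print(weekly_vectors(transactions, start=date(2023, 1, 2), weeks=3))
# {'SKU-A': [4, 4, 0], 'SKU-B': [0, 2, 0]}
```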

This approach primarily deals with time series forecasts, which are the mainstream practice in the industry. Vermorel notes that there may be multiple forecasts generated from the same data due to the variety of potential aggregation levels.

The interview also touches on the technical term “equispaced” in relation to time series. Equispaced time series have regular, uniform intervals between data points. Vermorel acknowledges that most people in the supply chain industry may not have considered working with non-equispaced time series, as equispaced time series are more common. However, he points out that some intervals, like months, are not precisely regular in the physical sense, as months have varying lengths.

This interview segment focuses on the importance of selecting the appropriate granularity for demand forecasting. There are two main dimensions to consider: temporal and structural. Various types of forecasts can be generated from the same data based on the chosen granularity, with time series forecasts being the most common in the industry. Additionally, the concept of equispaced time series is discussed, highlighting the potential complexities in dealing with varying time intervals.

Vermorel speaks about forecasting time horizons, decision-driven granularity, and the importance of not limiting one’s thinking when it comes to supply chain management.

They discuss the challenges of data aggregation levels in supply chain optimization. Vermorel explains that the choice of aggregation level depends on what makes sense for the industry, as some industries require more disaggregation than others. He also highlights that the daily level and the SKU level are the most sensible levels of disaggregation for most supply chain networks. However, he notes that edge cases, such as perishable products, may require more granular data points. Vermorel emphasizes that every arbitrary decision about data aggregation comes with pros and cons, and it is crucial to understand where those decisions come from. When asked whether any desired level of aggregation could be reconstructed from the most granular data, Vermorel explains that whenever one aggregates data, information is lost; in theory, the more granular the data, the more accurate the forecast can be. However, the most granular data is not aggregated data at all but the raw transactional data. He explains that the reason people stop at per SKU per day is that it is the last level at which they can still operate with time series. Going further than that means giving up the time series perspective, because the data is no longer structured as a time series.
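
A toy example (hypothetical numbers) makes the information-loss argument concrete: two very different daily demand patterns collapse into the same weekly total, so the weekly figures alone can never recover the daily detail.

```python
# Two daily demand patterns for the same SKU over one week (hypothetical units).
daily_steady = [2, 2, 2, 2, 2, 0, 0]   # sells a little every weekday
daily_spiky  = [0, 0, 0, 0, 0, 10, 0]  # sells everything on Saturday

# Rolling up (daily -> weekly) is always possible...
weekly_steady = sum(daily_steady)
weekly_spiky = sum(daily_spiky)
assert weekly_steady == weekly_spiky == 10

# ...but rolling down is not: both patterns look identical at the weekly level,
# which is why keeping the most granular data available preserves information.
```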

They also discuss the limitations of time series models in supply chain forecasting. Vermorel notes that although the supply chain industry typically operates with a time series mental model, time series techniques tend to perform poorly on sparse, erratic, and intermittent data. He argues that there is a fundamental asymmetry between the past and the future, and that assuming the future is exactly the same as the past is misguided. Vermorel also challenges the practice of aggregating data, which he believes results in lost information and misaligned metrics, and suggests that the only relevant horizon is the one that matters for the specific decision at hand.

Vermorel begins by explaining that time horizons for forecasting should extend beyond lead times, which are themselves variable and do not fit neatly into the traditional time-series perspective. He argues that the decision-making horizon should consider not just the period between now and the arrival of a product, but also the time it takes to sell the received goods. The applicable horizon depends on factors such as the expected speed of liquidating the stock and the variability of demand. While there is no clear limit to how far into the future one should look, Vermorel acknowledges that forecasts become fuzzier the further out they go. Ultimately, the trade-off lies in balancing the cost of computing resources against the potential improvements to the supply chain.

When discussing granularity, Vermorel emphasizes that it should be driven by the decisions a company wants to make. He advises against confusing the need for visualization with other predictive and optimization requirements, as granularity can be arbitrary and can lead to a loss of data. Instead, he recommends focusing on decisions that have tangible, financial impacts on the supply chain, such as reordering or adjusting prices.

Vermorel warns against becoming too fixated on aggregation levels, which he views as a highly technical aspect of the problem. Modern computer systems have more than enough capability to handle various levels of granularity, and there’s no need to impose arbitrary constraints on one’s thinking. In the past, aggregating data for visualization was a challenge, but modern systems can easily handle it, even down to millisecond-level granularity.

The interviewee also cautions against relying solely on traditional data cube approaches for supply chain optimization. He asserts that doing so can impose unnecessary restrictions and limit the scope of potential solutions. Factors such as perishability, cannibalization, substitution, and variable lead times should be considered for a more comprehensive view of the supply chain. Vermorel encourages a broader perspective and avoiding arbitrary constraints that can hinder problem-solving in supply chain management.

In summary, Joannes Vermorel advocates for considering a wider range of factors when optimizing supply chains, extending forecasting horizons beyond lead times, and adopting decision-driven granularity. He emphasizes the importance of not limiting one’s thinking and leveraging modern computer systems to tackle complex supply chain problems effectively.

Full Transcript

Nicole Zint: When it comes to demand forecasting, there’s an incredible diversity of different methods and different levels of data aggregation that are chosen both between companies and within. Some forecast on a daily basis, others on a weekly, monthly, or yearly basis. Some forecast at the SKU level, others at a category level. This begs the question: what is the right level of data aggregation? This is the topic of today’s episode. Before we delve into the answer to this question, Joannes, what are the different types of granularities to choose from in demand forecasting?

Joannes Vermorel: In demand forecasting, you have essentially two main dimensions to the problem. The first one is temporal, which is about whether you want your transactional data aggregated at an hourly, daily, weekly, monthly, or yearly level. The other dimension is typically the product/supply chain topology, so you might choose to aggregate per SKU, per product reference, per product family, super family, category, etc. You also have your locations where you might want to aggregate per site, per region, or per country. The two main dimensions are time and the structure of your catalog/supply chain network, which creates a matrix of possibilities when it comes to choosing the granularity.

Nicole Zint: When we talk about these granularities, what kind of forecasts are we talking about? Is there a specific type of forecast?

Joannes Vermorel: What you have is, conceptually, a model that was popularized in the 90s, essentially under the name of business intelligence or hypercubes. What you have is a way to represent your historical data as vectors. You choose a level of granularity, let’s say per SKU per week, and then for every single SKU at the weekly level, you have a vector of values which, because there is a time dimension, can be interpreted as a time series. Then you can forecast this time series into the future. Due to the many potential aggregation levels, there might be plenty of forecasts that can be made on top of the same data. So, we’re talking about time series forecasts when we discuss this problem; at least this is the mainstream practice in the industry.

Nicole Zint: What about the timeline in the time series, are they all equispaced or is there a different approach?

Joannes Vermorel: Equispaced is a super technical term, one that most people in the supply chain industry might have never considered working with. Equispacing is a technicality where you say that your time series is divided into buckets that are completely regular. However, keep in mind that this is a bit of an abstraction because, for example, months are not exactly regular in the physical sense. Physicists would say that some months are longer than others, so it is only regular according to our calendar.

Nicole Zint: Another question when it comes to a month: We have different numbers of, say, Fridays or weekends in a month, and if we see spikes in sales on Fridays, won’t that get disturbed?

Joannes Vermorel: Here we are getting to the core question: which aggregation level do I pick? You have plenty of concerns that can emerge. Obviously, some aggregation levels have certain effects. If you’re looking at the hourly level, it might be, for most industries, incredibly disaggregated, and it might not even be sensible during the night, because there are plenty of areas, let’s say retail, where nothing happens during the night.

Then, indeed, if you pick the monthly aggregation, it’s always a tricky one because some months have five occurrences of a given day of the week and others only four. So that is a tricky aspect that will introduce biases in the way you look at this record and potentially in the way you construct your forecast. But it is also true for other dimensions: are you looking per SKU, per product, or per category? All of those introduce concerns of their own.
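
As a quick illustration of this calendar effect, the snippet below uses Python's standard calendar module to count the Fridays in each month; the year 2023 is chosen arbitrarily.

```python
import calendar

def fridays_per_month(year):
    """Count how many Fridays fall in each month of the given year."""
    counts = {}
    for month in range(1, 13):
        weeks = calendar.monthcalendar(year, month)          # weeks as lists of day numbers
        counts[month] = sum(1 for week in weeks if week[calendar.FRIDAY] != 0)
    return counts

print(fridays_per_month(2023))
# A mix of four- and five-Friday months, so monthly buckets carry a built-in
# bias for products whose sales spike on a particular weekday.
```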

Nicole Zint: So when it comes to these different levels of data aggregation, can’t we technically just choose, say, SKU per day, which is the most disaggregated level, and then reconstruct essentially any other level of aggregation from that?

Joannes Vermorel: First, yes, there is this temptation to go for super disaggregated levels. In supply chain, the most sensible level of disaggregation, time-wise, is per day. However, it is a fairly arbitrary decision. We could have decided it was by the minute, and for example, if you’re running a call center and you want to look at your arrival rate of incoming calls, you are going to have a much more granular level of observation for the incoming calls. So, this is really about what sort of things make sense for the mainstream situation in supply chain.

Now, if we go back slightly in time, we have to understand a bit where we come from. Let’s have a look at a typical store with 10,000 SKUs in a typical retail network with 100 stores. So it’s not even a very large retail network. We are talking about 10,000 times 100, which equals 1 million SKUs, and then daily data. So if we want to have three years’ worth of history, we are talking about a thousand days. So, we are talking about one billion data points. To represent the daily aggregated data at the SKU level in a modest retail network, we are already talking about something that is a billion data points.

In a computer, that would already be four gigabytes of memory. When you go back a little bit in time, you would see that this sort of memory capacity was not even accessible before the 90s. By the way, the term “business intelligence” as a class of enterprise software tools emerged in the 90s, precisely when computers with gigabytes of memory arrived on the market. So, the two things went hand in hand: you needed computers that were able to represent such large amounts of data.
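
To make the arithmetic explicit, here is a back-of-the-envelope sketch; the 4 bytes per data point is an assumption (e.g. one 32-bit integer per cell) that matches the four-gigabyte figure quoted above.

```python
# Back-of-the-envelope check of the figures above.
skus_per_store = 10_000
stores = 100
days = 1_000                        # roughly three years of history

data_points = skus_per_store * stores * days
memory_bytes = data_points * 4      # assumption: 4 bytes per cell

print(f"{data_points:,} data points")            # 1,000,000,000
print(f"{memory_bytes / 1e9:.0f} GB of memory")  # ~4 GB
```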

Nicole Zint: So, those big cubes were actually software designed for in-memory computing, which was just a grand way of saying: let’s take advantage of this newfound random access memory. And based on that, it became the default, although we should not forget that it was fairly arbitrary. When you say it is the smallest level that makes sense for supply chains, are daily aggregation and the SKU level accurate?

Joannes Vermorel: Yes, but you have a lot of edge cases. For example, if you have a product that is perishable, the question is whether aggregating per day per SKU is sufficient to give you an accurate picture of your stock level. If you’re looking at a perishable product, the answer is no. You may have 10 units in stock, but if 9 out of the 10 units are going to expire tomorrow, what you truly have in stock is mostly one unit, plus nine that are on the verge of disappearing. So, in this case, the stock level is not granular enough and the SKU level is not granular enough. What you would like to have would be a stock level with at least one week’s worth of shelf life, and maybe a stock level with at least one month’s worth of shelf life. So, you would introduce another dimension to give you a better intuition.
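
A minimal sketch of that extra dimension (with a hypothetical SKU and dates) is shown below: stock on hand is broken down by remaining shelf life rather than reported as a single number, so the 10 units on paper shrink to 1 usable unit once a week of shelf life is required.

```python
from datetime import date

# Stock on hand broken down by expiry batch, rather than a single SKU-level number.
stock = [
    {"sku": "YOGURT-01", "units": 9, "expires": date(2024, 3, 2)},
    {"sku": "YOGURT-01", "units": 1, "expires": date(2024, 3, 20)},
]

def usable_units(batches, today, min_shelf_life_days):
    """Units that still have at least `min_shelf_life_days` of shelf life left."""
    return sum(
        b["units"]
        for b in batches
        if (b["expires"] - today).days >= min_shelf_life_days
    )

today = date(2024, 3, 1)
print(usable_units(stock, today, 1))   # 10 units "in stock" on paper
print(usable_units(stock, today, 7))   # only 1 unit with a week of shelf life left
```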

Nicole Zint: And what about the time? Is the daily level fine or should we consider a more granular level?

Joannes Vermorel: The daily level might be fine, except that there might be stores that are only open, let’s say, during the weekend or only in the morning. If you don’t know that you have a store that is only open half a day, you’re lacking information. So maybe having a more granular level, like the morning and the evening, would give you something that is more valuable. Every single arbitrary decision about your aggregation level comes with pros and cons. My message here is that it’s fairly arbitrary, and there is no grand truth in that, but it’s very interesting to understand where those decisions come from.

Nicole Zint: Say we find the most granular level that we can reasonably afford. If we have access to the most granular level but still want to look at a forecast on a weekly basis, for example, can we just reconstruct the level we want from the most granular level?

Joannes Vermorel: Absolutely. If we go back to the raw transactional history, whenever you aggregate, you lose information. No matter which sort of aggregation you’re performing, you can always reconstruct a higher level from the most granular data.

Nicole Zint: You are actually, this is a lossy process, so you’re losing information. So you have less information, so surely that would make sense that the accuracy would drop, right? The higher the aggregation, the less accurate it becomes?

Joannes Vermorel: Yes, but this was very much the reason why this sort of aggregation was set up in the first place. I would say it is cube-driven, because we have the sort of software that operates relatively swiftly. The idea is that when you have a hypercube, slice-and-dice operations can be done very efficiently. This is a very technical reason. Thus, if you want to go from daily to weekly, it is a very efficient operation that you can do on the cube.

Indeed, in terms of pure information theory, whenever we go into a more aggregated level, we are losing information. So in theory, if we want to have something that would be a more accurate statement about the future, we would want to operate on the most disaggregated data. However, people would think the most disaggregated data would be data per SKU per day, and I would say, hold on, the most disaggregated data is not even aggregated data at all. That would be the raw transactional data.

The reason why people stop at per SKU per day is essentially because it’s the last level at which you’re still operating with time series. If you want to go any further than that and deal with the raw transaction history, then essentially, you have to give up on the time series perspective. Why? Because the data is not structured as a time series. It’s literally relational data, so you have tables in your database. It is not structured as a time series anymore, certainly not like an equispaced time series.

Time series only emerge when you essentially construct vectors where you say, per period (the period can be a day, week, or month), you have a quantity, and then you have a vector of quantities. You then want to extend this vector with a time series model. If you operate with just a table with, let’s say, 100 columns, this is not a time series; this is just a relational table in a database. That is very common, but it’s not a time series. The chosen forecasting method itself thus becomes another limiting factor.

The question is, why is it so appealing? The answer is that most of the supply chain industry operates with a time series mental model. So obviously, if you’ve decided that everything has to fit the time series model, then the hypercube is very appealing, because as long as one of the dimensions is time, everything you’re looking at is a time series one way or another, at various levels of aggregation.

But here comes the crux of it. In theory, information theory tells us that the more we disaggregate, the more information we have, and thus the more we can possibly know about the future. In reality, time series techniques, most of them, not all of them, tend to perform very poorly on very sparse, erratic, intermittent data. The problem is that when you go into very disaggregated data, time series techniques are less effective.

Nicole Zint: So from the perspective of your time series technique, not from the real perspective, because in reality you have more data, you have a vector that is sparser and sparser, with more and more zeros. And time series are about more of the same, right? We make the assumption that the future is symmetrical to the past. Is that where it comes from?

Joannes Vermorel: Yes, but this is true for all data-driven methods. So all data-driven methods basically rely, one way or another, on the idea that the future will be more of the same. You see, it doesn’t really depend, you can say it’s machine learning, AI, time series, whatever, it is always the same idea. All our statistical methods are rooted in the idea that the future will be more of the same compared to the past.

Nicole Zint: But surely, if you go more granular, you lose out on maybe seasonalities and things like that, right?

Joannes Vermorel: No, the characteristic of the time series is a very technical one. It is the fact that the time series model gives you a highly symmetrical model in the sense that the future, in terms of data structure, looks exactly the same as the past. This is something that is very specific to the time series. When you say more of the same, yes, but I make a statement about the future. This statement doesn’t have to have the exact shape, form, and format compared to my historical records. So it may, but it may not.

With time series, it’s incredibly seductive, but I believe it is also misleading people a lot. It’s incredibly seductive because essentially the future and the past are exactly symmetrical. And when I say exactly symmetrical, just imagine your hypercube or your cube. You have a dimension for the SKUs, a dimension for the day, a dimension for something else, and essentially the future is just taking your dimension for the day and extending this dimension by, let’s say, 100 more cells.

And then, here it is, you have the future, and then you would say the forecast is just filling it in, filling in the gaps. So literally, you would say it’s the exact same data; there is data that I’ve observed and then data where I’m going to fill in the blanks with my time series forecasting model. However, there is a very radical and fundamental asymmetry between the past and the future.

If you go for this classical time series average forecast perspective, you are doing something which is pretending that the future is exactly the same as the past, in nature, not only the fact that it hasn’t happened yet. It’s literally in terms of data format, in terms of how to think about it, you’re just saying that it’s completely the same. And my proposition, which is more like a philosophical statement rather than a scientific one, is that no, it isn’t, it is very different.
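
To make this “filling in the blanks” picture concrete, here is a minimal sketch (hypothetical numbers, with a deliberately naive moving-average forecast) in which the future is represented as extra cells appended to the same vector as the past:

```python
# Observed weekly quantities (hypothetical), followed by empty "future" cells.
history = [12, 15, 11, 14, 13, 16, 12, 15]
horizon = 4

series = history + [None] * horizon           # past and future share one format
for t in range(len(history), len(series)):
    window = series[t - 4:t]                  # last four known or already-filled values
    series[t] = sum(window) / len(window)     # naive average "fills in the blank"

print(series)
# The appended cells look exactly like the observed ones, which is precisely the
# symmetry between past and future that Vermorel argues is misleading.
```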

Nicole Zint: I still see many RFPs, and they ask vendors, can you give us all of these levels at once? Different levels of aggregation, why?

Joannes Vermorel: Again, it is a standard question. People insist on this because it’s what they’re used to, but it’s important to recognize that different levels of aggregation can lead to very different results and insights.

Nicole Zint: The fallacy here is that you start with this time series model, and this time series model has its counterpart in the software industry with business intelligence, where everything is basically a cube, or a sliced-and-diced version of a cube. Now, people realize that they lose information when they aggregate, but somehow they’re not really sure why. The metric tells them that their very disaggregated forecast is complete crap. The reality might be that it is indeed very poor simply because they are not using the right method.

Joannes Vermorel: So they say, “Okay, our forecast is super poor. We need the possibility to bubble up to some higher level of aggregation. It’s going to be maybe per week, or maybe per product instead of per SKU.” But they don’t know which one they want to pick. So when they ask a vendor, they want to keep their options open, and they end up with semi-ridiculous RFPs where they have a hundred-plus questions and want to have all the aggregation levels.

It’s just because, from their perspective, they are keeping their options open about the level at which they want to apply the time series forecasting model. But here, I really challenge the very premise: why should you even aggregate your data in the first place, and why should your forecasting technique start by discarding data before it even begins to operate? You’re losing data, so this is a problem, and aggregating more just makes you lose more data.

And then if you say, “But wait, we can’t operate at a super disaggregated level because our metric, which is the percentage of error, tells us it is very bad.” We say, “Yes, but you’re not optimizing the percentage of error; you want to optimize dollars of error. But you’re looking at the metric of percentages, so it’s kind of misaligned with the dollars.”

Nicole Zint: Yes, exactly. Because if you go with this fallacy, you go from daily to weekly and get better accuracy; then weekly to monthly, better accuracy; then monthly to yearly. And then people say, “Oh, hold on, a yearly forecast, what am I going to do with a yearly forecast?” If you make decisions on a weekly basis, how is a monthly forecast going to help you?

Joannes Vermorel: That’s the problem. The reality is that the only relevant horizon is the one that is relevant for your decision. Let’s have a look at a very simple decision, such as inventory replenishment: what is the relevant horizon? The answer is very tricky. First, you will have the lead times, but the lead time is not guaranteed. Let’s say you have an overseas supplier; your lead time is not a constant, it’s something that varies. So your lead time might be something like 10 weeks-ish, but with the potential for huge variations.

Some of those variations, by the way, are seasonal, like Chinese New Year: factories in China close, so you get four extra weeks of lead time. So your horizon, if we’re just looking at the lead times, is something that varies a lot and would need a forecast of its own. By the way, one of the problems with those time series models is that we are always looking at something like the sales. All the other stuff that you need to forecast, like your lead times, is treated as constant. It’s even worse than that: in the cube, it doesn’t even exist.

Nicole Zint: So, the cube doesn’t even represent that sort of information; it’s kind of arbitrarily chosen. Your horizon would be your lead times, but your lead times would deserve a forecast of their own, which doesn’t really fit this time series perspective and cube software. But should the horizon used to assess the validity of your decision stop at the lead time?

Joannes Vermorel: No, because obviously, if you decide that you want to place a reorder now, you want to fulfill the demand that is going to happen between now and the arrival date of your product. But then, you will have to sell what you’ve just received. In order to assess the relevance of the purchase order, you have to look at what happens afterwards. And how far into the future should you look? Well, it depends. If there is a surge of demand after you place the order, you might actually receive the goods and have everything sold in two days. But what if it’s the opposite, and there is a drop in demand? You may keep the stock for a whole year, obviously not if it’s perishable, but I’m simplifying.

So, the horizon that is applicable is something that is incredibly dependent on the way you look at the future, and it is a forecast of its own because it’s a forecast where you have to predict the lead times. And then, the horizon that we have to consider, even if we are looking at just the demand, depends on how fast you expect to actually liquidate your stock. Thus, ultimately, there is no clear limit in terms of the applicable horizon for your forecast. The only concern is that the further we look into the future, the fuzzier the forecast becomes.

However, this is a technicality, and at some point, there is a trade-off in terms of the cost of CPU versus potential marginal improvement for your supply chain. But, you see, from a conceptual perspective, there is no limit in how far you want to look into the future.
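
A rough Monte Carlo sketch of this point (all distributions and numbers are hypothetical assumptions, not Lokad's model) estimates the applicable horizon as the lead time plus the time needed to liquidate the received stock:

```python
import random

random.seed(42)

def simulate_horizon_days(order_qty, trials=10_000):
    """Sample lead time and sell-through rate, and return median and 90th-percentile horizons."""
    horizons = []
    for _ in range(trials):
        lead_time = max(30, random.gauss(70, 14))          # ~10 weeks, quite variable (assumption)
        daily_demand = max(0.1, random.gauss(2.0, 1.0))    # uncertain sell-through rate (assumption)
        liquidation = order_qty / daily_demand             # days to sell the received goods
        horizons.append(lead_time + liquidation)
    horizons.sort()
    return horizons[len(horizons) // 2], horizons[int(len(horizons) * 0.9)]

median, p90 = simulate_horizon_days(order_qty=100)
print(f"median horizon ~{median:.0f} days, 90th percentile ~{p90:.0f} days")
# The horizon is itself uncertain and stretches well beyond the lead time,
# with no hard cutoff, only increasingly fuzzy estimates further out.
```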

Nicole Zint: So, to conclude, the level of granularity should always be at the level of the decisions that you want to take?

Joannes Vermorel: Yes, I would say your granularity is going to be very much decision-driven. But be mindful that this notion of having to aggregate makes a hypothesis about the sort of method that you want to use. My suggestion would be to keep your eyes on the very decisions that you take. Decisions are the things that have a tangible impact on your supply chain, such as reordering, moving your price up or down, and other actions that have a real, tangible financial impact on the supply chain. But then, I would say, beware of the very notion of granularity. It is very made up, very arbitrary, and don’t confuse your need for visualization, which is fine, you want to be able to visualize, with the granularity that is needed for decision-making.

Nicole Zint: Time series are an incredibly powerful tool to visualize data. However, don’t confuse this need for visualization with other predictive and optimization requirements that don’t have to operate with any kind of made-up granularity. When I say made-up granularity, I mean anything that is not just the reflection of the data as it exists in your enterprise systems. Any kind of aggregation that you add on top is going to lose data.

Joannes Vermorel: Maybe it’s going to be a good trade-off in the sense that maybe by aggregating, you will save CPU or memory, but maybe not. This is a super technical discussion, and my suggestion would be not to do some kind of premature optimization. Try not to immediately think about those aggregation levels as if they were hard problems; they are mostly easy problems when it comes to visualization. With modern computer systems, it is very easy to have an excess of capabilities with regard to your actual needs.

In the 90s, it was a challenge to aggregate data per day, but nowadays, it is not. If there is a vendor that tells you that they have a limit at five years’ worth of history, this is just super weird. There is no such limitation. There are plenty of ways we can deal with any kind of granularity, even down to the millisecond. However, it’s not necessarily something that is super sensible, and you don’t want to do that with an actual cube where you’re using one byte of memory for every single cell in your cube. This is a very technical aspect.

Modern systems will give you any kind of aggregation that you need and more. This is not a constraint. Don’t reason by implication, trying to think of all the techniques that you want to use based on this cube, as if it was the only way to look at the problem. It is not. There are plenty of things that get lost, such as perishability, cannibalization, substitution, and varying lead times. The fact that you frame everything in a cube puts enormous restrictions on what you can even think about your supply chain, and this is something bad. My suggestion is not to put your mind in a cage. Just try to have a broad perspective because there are so many more arbitrary constraints that do not help the problems getting solved for your supply chain.

Nicole Zint: Thank you very much, Joannes, for sharing your thoughts on this topic. Thank you for watching, and we’ll see you next week.