00:00:00 Introduction of the debate participants and the debate format
00:02:56 Joannes’s opening remarks
00:09:53 Jeff’s opening remarks
00:16:56 Joannes’s rebuttal
00:21:47 Jeff’s rebuttal
00:26:53 Joannes’s concluding remarks
00:28:56 Jeff’s concluding remarks
00:31:05 Follow-up questions
00:48:36 Audience questions
01:18:57 Revisiting initial perspectives and conclusions
Summary
In a debate moderated by Conor Doherty, the topic “Is Forecast Value Added (FVA) a best practice or a time-waster?” was explored by Jeff Baker and Joannes Vermorel. Jeff Baker, with extensive experience in supply chain management, argued in favor of FVA, emphasizing its value when applied in a guided, structured manner and highlighting the importance of addressing biases and leveraging expert inputs. Conversely, Joannes Vermorel, CEO of Lokad, contended that FVA is inefficient, advocating instead for the financial optimization of supply chain decisions with probabilistic forecasts. The debate underscored the contrasting views on FVA’s role in supply chain management, providing insights into improving decision-making processes.
Extended Summary
In a recent debate moderated by Conor Doherty, Head of Communication at Lokad and host of the LokadTV YouTube channel, the topic of “Is Forecast Value Added (FVA) a best practice or a time-waster?” was thoroughly examined. The debate featured two prominent figures in the supply chain industry: Jeff Baker and Joannes Vermorel.
Jeff Baker, who supports the use of FVA, is the Founder and Managing Director at Libra SCM, Course Lead at the MITx MicroMasters in SCM, and Associate Editor of Foresight: The International Journal of Applied Forecasting. With over 25 years of experience in supply chain management, Jeff frequently speaks and writes about Sales and Operations Planning (S&OP) and FVA.
On the opposing side, Joannes Vermorel, CEO and Founder of Lokad, a Paris-based software company dedicated to the financial optimization of supply chain decisions, argued against the use of FVA. Joannes is well-known for his extensive publications on supply chain topics and his regular public lectures and debates with industry leaders.
Conor Doherty began the debate by introducing the topic and giving each participant a chance to introduce themselves. Following the introductions, each participant presented their views on FVA.
Jeff Baker argued that FVA is a valuable practice when applied correctly. He emphasized that adjustments to forecasts should be guided and structured. According to Jeff, adjustments should be based on the direction of the adjustment (whether it is up or down), the inherent forecastability of the time series, and the size of the override. He stressed the importance of looking for substantial increases rather than making minor tweaks.
Jeff also highlighted the need for structured inputs, where assumptions are clearly stated and based on new data that may not have been modeled yet. He advocated for a proactive approach to identifying and addressing biases, understanding the motivations behind them, and learning from past mistakes. Jeff believes that good judgment is based on experience, which in turn is based on bad judgment. By tying back to assumptions and validating them, organizations can address issues such as over-forecasting due to a lack of trust in the supply chain.
Furthermore, Jeff argued that experts in sales and marketing are better equipped to contextualize information than supply chain scientists. He suggested that valuable inputs from these experts should be automated over time, but acknowledged that we do not live in a world with infinite pure data. Therefore, it is essential to collect the right data from various sources, including manufacturing, sales, and marketing.
Joannes Vermorel, on the other hand, argued against the use of FVA. He contended that FVA can be a time-waster and may not always lead to better decision-making. Joannes emphasized the importance of focusing on financial optimization and leveraging probabilistic forecasts to make more informed supply chain decisions automatically. He argued that relying too heavily on FVA could lead to inefficiencies and distract from more critical aspects of supply chain management.
The debate included follow-up questions and a free exchange between the participants, allowing them to delve deeper into their arguments and address each other’s points. The discussion concluded with a Q&A session, where the audience had the opportunity to submit questions live in the chat.
In summary, the debate highlighted the differing perspectives on the value of FVA in supply chain management. Jeff Baker advocated for a guided and structured approach to FVA, emphasizing the importance of learning from experience and addressing biases. Joannes Vermorel, however, argued that FVA could be a time-waster and stressed the need for financial optimization of supply chain decisions instead of forecasting overrides. The debate provided valuable insights into the complexities of supply chain forecasting and the various approaches to improving decision-making in this field.
Full Transcript
Conor Doherty: Welcome to the third edition of Lokad Supply Chain Debates. Today, I have the pleasure of hosting a much-anticipated debate between Jeff Baker and Joannes Vermorel. Jeff teaches Supply Chain Dynamics at the MIT Center for Transportation and Logistics and is the founder and managing director at Libra SCM. Meanwhile, to my left, Joannes is the founder and CEO of Lokad. He’s an engineer with the École Normale Supérieure in France and taught software engineering there for six years.
Now, the topic of today’s debate is “Forecast Value Added (FVA): A Best Practice or a Time-Waster?” Jeff will argue that FVA is, in fact, a best practice, while Joannes will argue it’s a waste of time. Now, as quickly as possible, I will try to get through the parameters of the debate, the housekeeping, so that we can get to the good stuff.
First, there will be opening remarks, maximum 7 minutes each. As agreed ahead of time, Joannes will speak first. Then, each speaker will have a 5-minute rebuttal. This will be followed by a two-minute conclusion from each speaker, at which point I will pose some follow-up questions. These questions can be submitted by viewers at any time during the event in the live chat.
Now, in preparation for the debate, both speakers agreed to the following definition: Forecast Value Added (FVA) is a simple tool for evaluating the performance of each step and contributor in a forecasting process. Its goal is to eliminate waste by removing processes and activities of any kind that fail to increase forecast accuracy or reduce bias.
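For readers who want to see those mechanics concretely, here is a minimal sketch of an FVA-style comparison. All numbers are invented for illustration, and the choice of MAPE as the error metric is our assumption; FVA itself is agnostic to the accuracy metric used.

```python
import numpy as np

# Illustrative demand history and the forecasts produced at each process step.
actuals       = np.array([102,  95, 110,  98, 105, 120])
naive         = np.array([100, 102,  95, 110,  98, 105])  # last actual carried forward
stat_forecast = np.array([101,  97, 106, 101, 104, 112])  # statistical baseline
override      = np.array([110, 105, 115, 108, 112, 125])  # after manual adjustments

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return np.mean(np.abs(actual - forecast) / actual) * 100

# FVA of a step = error of the previous step minus error of this step.
# Positive means the step added accuracy; negative means it destroyed it.
fva_stat     = mape(actuals, naive) - mape(actuals, stat_forecast)
fva_override = mape(actuals, stat_forecast) - mape(actuals, override)

print(f"FVA of statistical model vs naive: {fva_stat:+.1f} pts")
print(f"FVA of manual override vs stat:    {fva_override:+.1f} pts")
```

Each step in the process is credited, or debited, with the accuracy it contributes relative to the step before it, with a naive forecast anchoring the chain.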
That definition, as well as full bios for both speakers, are in an open Google document that you can find in the comments section or the comment chat for this video. During the debate section, I will strictly time both speakers and will politely remind them when they’re running out of time with a modest throat clear at best. But I do recommend that each person times themselves so you know when time is running out.
Almost done. Speakers are to remain completely silent during each person’s turn, so please don’t interrupt each other, at least not during the timed section. And lastly, some shameless self-promotion: while you’re here, if you enjoy these debates and you like what we do, I encourage you to subscribe to Lokad’s YouTube channel and to follow us on LinkedIn. And with that, is Forecast Value Added a best practice or a waste of time? Joannes, please, your opening remarks.
Joannes Vermorel: First, I would like to thank Jeff for being a good sport and agreeing to this debate. In simple terms, FVA is a tool for tracking accuracy increases and decreases. Now, is FVA a best practice? For anything to be considered a best practice, it must, by definition, reflect the community’s understanding of what is the most effective way to achieve a particular goal. However, forecasting is not a goal in and of itself, nor does it happen in a vacuum.
Forecasting is a tool that we use to accomplish a specific goal. Some may argue that the goal of forecasting is greater accuracy. This is a heavily disputed position. In my opinion, forecasting is just one more tool that helps us to make better business decisions, that is, business decisions that make more money. So the question is, does FVA, by measuring accuracy increases or decreases, move us closer to accomplishing the goal of making more money? I am not convinced, and for now, I will present three criticisms to support my case.
First, even though FVA was not originally designed to facilitate collaborative forecasting, FVA does, by design, provide a framework for measuring the accuracy impact of collaborative forecasting. This is important. FVA doesn’t say you should use the best forecasting practices. FVA does say here is the accuracy impact of whatever you are doing. Why is this important? Well, what is the best practice in the forecasting community?
Since the 1980s, Spyros Makridakis has organized a series of public forecasting competitions (the M-competitions) to find those best practices. Since the M4 in 2018, those competitions have consistently demonstrated the superiority of algorithmic methods. In fact, probably the greatest expert alive when it comes to human forecasting capabilities, Philip Tetlock, wrote that whenever a forecasting algorithm is available, this algorithm must be used. The reason? The algorithm invariably delivers superior accuracy compared to human judgment. This algorithm and obviously the expert using it is the best practice.
So if collaborative forecasting and manual overrides are not the best practice, and demonstrably they are not, then companies measuring them with FVA is also not best practice and, I would argue, a waste of time. Some might say, but Joannes, FVA does not explicitly advocate collaborative forecasting or manual overrides. Okay, but that is how it is popularly used, even in Jeff’s writings.
However, my second criticism is something FVA is built on, and that is the time series perspective. FVA requires classic time series forecasts, also known as point forecasts. At minimum, FVA advocates using a naive forecast, just a copy of the last actual as a baseline for comparing the forecasting overrides. This no-change forecast is a time series. But is time series a best practice? Again, no. Point forecasts aren’t just incomplete forecasting tools; they can be downright misleading, for example, in scenarios of high variance. This is because point forecasts don’t factor in uncertainty.
In fact, the M5 forecasting competition included a separate challenge, the uncertainty challenge, that focused on quantile predictions, which Lokad entered and won at the SKU level, by the way. In reality, a much better class of forecasts already exists, and that is probabilistic forecasts. Unlike time series, probabilistic forecasts don’t single out a single possible future, for example, demand next week. Instead, we look at all possible futures and their respective probabilities. Why is it important? It’s important because identifying all the possible future scenarios is essential for selecting the best possible choice. This is important whenever financial risks are involved, which happens to be all the time as far as supply chains are concerned.
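As context for the uncertainty challenge mentioned here: quantile forecasts are typically scored with the pinball loss (the M5 used a weighted, scaled variant; the sketch below shows only the unscaled core, with hypothetical numbers).

```python
def pinball_loss(actual: float, quantile_forecast: float, tau: float) -> float:
    """Pinball (quantile) loss for one observation; tau is the quantile level."""
    if actual >= quantile_forecast:
        return tau * (actual - quantile_forecast)
    return (1.0 - tau) * (quantile_forecast - actual)

# A probabilistic view scores several quantiles, not a single point.
actual_demand = 12
for tau, q in [(0.50, 10), (0.90, 18), (0.99, 30)]:
    print(f"tau={tau:.2f}, forecast={q:>2}: loss={pinball_loss(actual_demand, q, tau):.2f}")
```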
However, FVA is not compatible with probabilistic forecasting. Why? Because probabilistic forecasting means looking at probability distributions and not time series. And let’s be real, people in sales and marketing are not going to manually edit probability distributions with or without FVA. That’s an absolute nonstarter. If time series are not best practice, and they certainly are not when it comes to risk management, then using FVA to compare accuracy overrides is also not best practice. I would argue it’s a waste of time.
My third criticism is that Forecast Value Added does not measure value; it measures accuracy. And does accuracy add value? Not necessarily. A more accurate forecast does not in and of itself add value. In many real-world situations, a 90% accurate forecast and a 60% accurate forecast lead to the very same inventory decisions if MOQs or other constraints are present. If the decision’s financial outcome doesn’t change, then measuring the accuracy gain does not add business value. As such, from a business perspective, it is absolutely incorrect to say that accuracy in and of itself adds value. And since that is the case, how can focusing on accuracy with FVA be the best practice? It isn’t.
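A worked toy example of this argument: under a minimum order quantity (MOQ), two forecasts of very different accuracy can trigger the exact same replenishment decision. All figures below are hypothetical.

```python
import math

MOQ = 500  # minimum order quantity imposed by the supplier (hypothetical)

def order_quantity(forecast_demand: float, on_hand: float) -> int:
    """Naive replenishment: order the shortfall, rounded up to the MOQ."""
    shortfall = max(0.0, forecast_demand - on_hand)
    return 0 if shortfall == 0 else max(MOQ, math.ceil(shortfall))

on_hand = 100
print(order_quantity(320, on_hand))  # the more accurate forecast -> orders 500
print(order_quantity(250, on_hand))  # the less accurate forecast -> orders 500
# Identical decisions: the accuracy gain purchased no business value here.
```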
Even if you don’t personally use FVA to support collaborative forecasting, others do. FVA is still based on time series forecasts, which ignore uncertainty, and on the idea that increased accuracy equals increased value, which does not hold in real business settings. All of these are bad practices, and therefore I argue that FVA itself cannot be best practice. It is, in my opinion, a waste of time. Thank you.
Conor Doherty: You still have 15 seconds, Joannes.
Joannes Vermorel: Okay, thank you.
Conor Doherty: Well, thank you very much, Joannes, for your opening remarks. Jeff, at this time, I invite you to please make your opening remarks.
Jeff Baker: Yeah, great. Thank you, Conor. Thank you, Joannes. I really appreciate the opportunity to engage in this conversation. So, obviously, I’m for the use of FVA as a best practice. This comes from the fact that all supply chains need to plan, right? We need to make decisions far in advance, maybe two to three months out to set a manufacturing schedule, or to collaborate with suppliers, maybe six months ahead. We need a good forecast to plan for, you know, small capacity changes, maybe staffing, maybe a co-manufacturer we want to bring on. So we have to make these decisions on a long time frame, and we do end up revisiting those.
I’ve got a couple of favorite quotes in supply chain. One is, “The plan is nothing, but planning is everything.” So we need to have this plan to get us in the right position. Now, my second favorite quote in supply chain is from Mike Tyson: “Everyone has a plan until they get punched in the face.” So it points to the fact that we know we’re going to take some knocks in supply chain. The goal is to make the best decision possible with the greatest chance of success in the executional space. The best way to do that is to engage our functional experts. Sales and marketing have local insights that can be used to improve the forecast and therefore should be included in the consensus process. Engaging those experts in a structured fashion gives us the better data we need to make better decisions. And FVA is an effective tool to measure the effectiveness of those inputs.
Now, the caveat being, yes, you could potentially open the door for bias, but from my perspective, what we need to do is work on correcting that bias, not eliminating them entirely. We have to be cross-functional. We’ve been preaching that cross-functionality is the way to go for many, many years now. I don’t want to go back to a functional silo where everyone’s only responsible for their own decision, not for their impacts on others. To do this best, I think you have to have guided adjustments, again, where FVA shines. It gives me those products that are highly valuable but have a high error. That’s the fertile ground to go looking for the better inputs there.
If I’ve got a large impact that’s expected—some materially adverse effect that’s coming in, or it could be a positive effect as well—we need to be able to plan for those. So, the first thing is those adjustments should be guided. The thesis I did for my Master’s Degree said we need to look at the direction of the adjustment: Is it up or is it down? What’s the inherent forecastability of the time series? What is the size of the override? And let’s make sure that if we’re going to add value, that we’re looking for a substantial increase—we’re not just tweaking things.
Next is structure. I’m not going to let anyone from sales or marketing just give me an arbitrary number. I’m going to ask: What are their input assumptions? Is this based on new data that we haven’t even modeled yet? Then I’ll dig deeper, asking: Is this a best-case scenario? A worst-case scenario? What’s the most likely case? What would have to happen? What would have to be true for this scenario in the future to take hold? In that way, what we’re trying to do is proactively root out causes of bias and understand the motivations behind them.
Once we do that, then yes, next month we look at those adjustments to say, “Did it add value or did it not add value?” What I like to say is, good judgment is based on experience. Experience is based on bad judgment. So, we learn from our mistakes. We tie back to the assumptions, we validate, we might find something that—hey, you know what—maybe the salespeople are constantly over-forecasting because they don’t have trust in supply chain. That’s not just bias, that’s a trust issue, and we can start to address that.
Furthermore, they are the ones that contextualize that information better than we can. I don’t think we’re going to have a supply chain scientist who’s the expert in all the nuances of marketing or all the nuances of forecasting. I’m going to rely on those experts, use them to give me that data. Now, if over time I find those inputs to be valuable, I’m going to try to automate them. I’ve got no argument against automation about collecting that right data. But the problem is, we don’t live in a world with infinite pure data. If there is manufacturing data, if there is sales data, if there is marketing data, oftentimes we have to proactively go out and seek that out.
If we do that well, there may be a lot of these decisions we can automate. Adjustments aren’t required just because we have an FVA process in a forecasting system. In fact, the number one forecasting mantra ought to be “Do no harm.” Kraft Heinz, one of the largest food and beverage companies in North America, has the metric “low-touch forecasting percent.” How do I make sure that I’m not touching it every single time? I think Deming said it best: “Don’t just do something, stand there.” Because he realized the natural tendency of someone is to think, “They’re looking for me to make an input, I better make an input to show that I’m busy.” No, that’s totally the wrong way to look at it.
As we look at proof that error reduction is valuable, we can point to the Institute of Business Forecasting and Planning. They did a survey of eight CPG companies and found that for a 1% reduction in forecast error, companies gain $1.7 million in benefits per billion dollars of revenue by avoiding the costs of over-forecasting. That includes avoiding discounts, transshipments, obsolete products, excess inventory, and tying up working capital. Companies also reduce the cost of under-forecasting by about $1 million per billion in revenue, avoiding lost sales, case fill fines, and increased production or expedited shipment costs.
IBF has seen this in their research. Gartner has found similar benefits: a 2 to 7% reduction in inventory value, 4 to 9% decrease in obsolete inventory, 3 to 9% reduction in transportation cost. The magnitude of these numbers makes it attractive to pursue avenues towards improving forecast accuracy, especially in areas where an item is high value and high error, or if we know ahead of time that external events will impact the supply chain one way or another. Thank you.
Conor Doherty: Well, thank you, Jeff. You also still have 15 seconds if you’d like.
Jeff Baker: I have 13, but I’m good.
Conor Doherty: Right. Well, Jeff, thank you very much for your opening statement. At this point, we shall proceed to the rebuttal. Joannes, 5 minutes when you are ready.
Joannes Vermorel: Thank you, Jeff, for your opening remarks. I think you argue your position very well, though I think there are a few points that I would like to clarify. I’m not challenging the intent that people are trying to do something good for the company. I am trying to challenge the reality of the real outcome.
First, if we think that accuracy is best, something worth chasing, then the reality, as I pointed out in my initial statement, was that FVA does not reflect forecasting best practices. If a company is truly chasing accuracy, then by not doing those collaborative forecasts, they will actually get more accurate results. That is unfortunately what has been empirically demonstrated.
Second, the overrides themselves are a very bureaucratic way to approach the problem. As soon as you put a mechanism in place—yes, people may say “do no harm”—but if you put a bureaucratic mechanism in place, it will be used. FVA implies setting up a mini-bureaucracy or mini-technocracy with software elements involved. There will be people checking whether those in sales and marketing are indeed making the corrections and everything. And so for me, it paves the way for something that will generate a lot of bureaucratic busy work.
Because the reality is that when you start looking at what those forecasts and adjustments look like, we are talking about tens of thousands of time series, each time series having 50 points or more, you know, a weekly forecast one year ahead. But that leads me to another criticism: by focusing on accuracy in isolation, I believe that FVA misallocates money and blinds companies to much more effective ways that people could contribute to the forecasting process. And to be perfectly clear, I’ve never argued that members of sales, marketing, and finance could not contribute meaningfully to the forecasting process. I am perfectly aware that any relevant member of staff can hold valuable information in their head, information that could prove financially rewarding to the company.
However, what I disagree with is the idea that people should wait until after a point forecast is produced to then manipulate it with manual overrides in order to increase accuracy. This, as I said, is not best forecasting practices, and yet it is what people usually do with FVA. However, there is a constructive way to involve people in the forecasting process. It is by contributing to the forecasting algorithms that generate the forecast and later algorithms that generate decisions. In reality, this means assisting the forecasting expert. At Lokad, this would be a supply chain scientist in writing and refining these algorithms by providing domain expertise and insights. This doesn’t mean that everyone in sales and marketing has to read probability distributions and write scripts in Python. Instead, they help by providing the forecasting expert with the actionable insights that might be useful, and then the expert translates those insights into the lines of code.
And finally, the automated algorithms run and generate the forecast. At the end of the day, those insights are just pieces of information, and they are distributed, I agree, everywhere in the company. However, the forecasting expert is the person who knows how to translate all of this into a meaningful forecast and a sensible set of supply chain decisions. Tons of excellent data can be available, but it is the forecasting expert, and this expert only, who should decide how this data should be used to produce or revise a forecast. This is the best practice supported by decades of experimental forecasting results. And unfortunately, FVA has no place in this arrangement. By measuring accuracy instead of directly contributing to the improvement of the forecast algorithm, the kindest thing that you can say is that FVA is a distraction. I, however, would call it a waste of time.
Conor Doherty: Joannes, you still have 20 more seconds.
Joannes Vermorel: I’m good, thank you.
Conor Doherty: Okay, thank you. Jeff, at this moment, I see you smiling. Feel free to respond with your five-minute rebuttal.
Jeff Baker: I think you were muted, by the way.
Yes, sorry. No, interesting perspective. I want to reflect back on a couple of things. One, you mentioned the M5 competition and Makridakis. One thing I would like to point out is that 92% of those people lost to a very simple exponential smoothing algorithm benchmark. So, there is an argument for simplicity. There is a difference, I think, between best practice and bleeding edge. I want to make sure we have a distinction there because we have a lot of times where simpler actually is better and more acceptable to the people that use it. From an explainability standpoint, if we are in an S&OP meeting in a demand review, it’s a lot easier to explain how that’s coming and get more buy-in to that.
The other one you mentioned about time series focusing on just a point. It is a best practice to not only give the time series but also what the prediction intervals are going to be, right? And that does tie into the accuracy. So yes, a best practice can be time series plus relaying to the people what the prediction intervals around that are. We’re in agreement that a point forecast is a piece of information. A point forecast plus the prediction intervals is more valuable.
You mentioned that probabilistic forecasts aren’t amenable to FVA. I think if you look at one of the recent issues of Foresight, you’ll see an article by Stefan De Kok about probabilistic forecasting and a variant of this, stochastic value added, which I think points to the value of this framework. I’m somewhat agnostic to my forecasting method. However I do it, I want to look at an improvement as I add different inputs into my forecast. How am I improving it? Then also making sure we have an efficient and effective use of our resources. That whole tradeoff between the cost of inaccuracy and the cost of generating the forecast is something that’s been known since 1971; there’s a Harvard Business Review article on how to balance the cost of spending too much time generating a forecast versus the accuracy I’m getting. Colloquially, is the juice worth the squeeze? Based on those numbers that I create, for a reasonably sized company, there’s a lot of benefit there, and I can afford to have some people look at that.
I don’t think collaborative forecasting is bureaucratic. I think you need to involve those people with the process so they can create value add through the system. These inputs are great. There are always going to be events happening. Supply chains are not getting any less complex; they’re getting more complex, more dynamic, more of the butterfly effect. Because of that, we need people to be able to contextualize that information and make the best decision at the time period that we need to make that decision. So, from that perspective, it has to be collaborative. If I’m collaborative, I’m always working with sales and marketing. It becomes, you know, I’m not boiling the ocean; I’m looking for what’s changed. If I continue to do that, I’ve got that relationship. I am able to then get those inputs and have a better relationship with them.
The opposite is that I have an ad hoc process where I involve sales and marketing at my discretion when I want to. I pretty much guarantee the quality of your inputs from sales and marketing are going to be significantly decreased if they’re not part of a regularly scheduled process. What I got from salespeople all the time is, “I’m too busy selling, leave me alone.” So, if you want to get that input, you have to involve them in the process.
You mentioned the disconnect with actual business value. You can’t make good decisions with bad data. The argument is, I need that better forecast. There is no argument for a decrease in forecast accuracy. I need to make sure I’m using the best data I can for decision-making. Will that be tied directly to an ROI? Can I calculate the ROI of one decision in isolation for the entire supply chain? It’s not going to happen. I would love to say it is, but my decision from a forecasting standpoint is completely separate from the functional decisions that manufacturing makes, purchasing makes, warehousing, transportation. Any of those things can lead to a poor ROI. My role is to be as accurate as possible. Thank you.
Conor Doherty: Thank you very much, Jeff. Sorry for speaking right over the end there, but I did have to just remind you.
Jeff Baker: No worries.
Conor Doherty: At this point, thank you, Jeff. I will turn to Joannes. Please, your concluding remarks, two minutes.
Joannes Vermorel: Ladies and gentlemen, the topic of the debate was, “Is FVA best practice or a time-waster?” Throughout this debate, you’ve heard a lot of information, but please keep a few things in mind when deciding what you think about FVA. Number one, if you use FVA as part of an ongoing collaborative forecasting process understood as micromanaging the forecasting points, that is not best practice. Manual override of the forecast is without question not best practice in the forecasting community, and as such, measuring them with FVA is not best practice either.
Number two, FVA is built on time series forecasts. I’m sure somebody somewhere is trying to apply it to probabilistic forecasts, but let’s be real: FVA only works at scale, if it works at all, in combination with classic time series forecasts, which completely disregard uncertainty. These are not best practice, and thus measuring their accuracy with FVA is not best practice either.
Number three, by design, FVA assumes that increased accuracy is something worth having. This is not the case. Contrary to what was said, there are cases where increasing accuracy can actually hurt your business. We have a very simple example. For sparse series forecasting, zero is very frequently the most accurate forecast, even if forecasting zero demand makes no sense. Thus, even if you disagree with me on all those elements of forecasting and supply chain, this much is abundantly clear: FVA, for all these reasons and more, cannot be considered a best practice. If it is, it is a very sad indictment of the forecasting community.
Conor Doherty: Thank you very much, Joannes. Jeff, I turn to you. Your concluding remarks, please, two minutes.
Jeff Baker: Okay, great, thanks very much. Again, as I started off, better planning leads to better decisions. We have to learn how to plan, we have to learn how to replan, and we need to make sure that we have an accurate forecast going in there. I don’t think there is a good business case that poor data will ever lead to better decisions. If you have poor data and make good decisions, I’m calling that blind luck. That doesn’t happen very often. Where FVA shines, again, I’m not advocating micromanaging, I’m not advocating overriding everything. There are examples where we have items that are high value and have high error. We know that there are external events that could happen that can’t be reduced to a neat algorithmic input. We need to understand those, we need to plan for those on the time horizon that’s necessary to make those decisions. Any critique of FVA is largely based on a misunderstanding of what FVA is designed to do.
I know Lokad, you sell software packages. In prior work, I have been at a software company, and I’ve also implemented software packages. I know that the tool works. Oftentimes, it’s a problem with implementation. If the customer is not getting what they want, it’s an implementation issue nine times out of ten, implementation and a data issue. Any critique of FVA stems from not knowing how to implement it correctly, not understanding how it operates and how it adds value. I’ll give you another simple analogy. If I’m building a deck on the back of my house and I heard that I ought to screw the boards into the joists, and I use a hammer to start hammering in wood screws, it’s not going to work, and I’m not going to be happy. That doesn’t mean there’s a problem with the hammer; it means I just don’t know how to use the hammer properly. I’m using the wrong tool for the job.
Conor Doherty: Thank you. Sorry to cut you off, but I do have to be strict with the times to maintain impartiality. Thank you very much, gentlemen. Thank you both for your prepared remarks, your argumentation, and your passion. At this point, I’d like to transition to a few follow-up questions. There are some coming through. I took some notes based on things that were said. Before we transition to the audience questions, just specifically because I’m sitting here, I’m listening, and I want to clarify a couple of points that were raised.
I shall start with, I think... actually, I’ll push Joannes, just to show impartiality. So, Jeff, in your rebuttal, you mentioned the aphorism “simpler is better.” I think you mentioned the M5 results, and you made the point that just because something is sophisticated or bleeding edge, that doesn’t necessarily make it better. So, Joannes, your response to the concept there; if I can just unpack that a little bit, the claim is basically that probabilistic forecasting, pure algorithmic modeling, is just too fancy, and you should keep it simple.
Joannes Vermorel: The reality is that we won at the SKU level the M5 competition with a model that is parametric and has like five parameters. That’s it. So, yes, again, algorithmic doesn’t mean better. In fact, Tetlock in his book “Superforecasting” shows that a moving average will beat 99% of humans at forecasting anything. Humans see patterns everywhere; they have an immense cognitive problem. It’s extremely difficult to deal with noise, so you see patterns everywhere, and this is just bad for forecasting. By the way, that’s something where artificial intelligence, like a moving average, actually beats the human mind the vast majority of the time. So, that was just a point. In particular, the probabilistic forecasting algorithms of Lokad are not naturally very fancy. They just have a strange shape, but they’re not super fancy in the sense of using deep learning and whatnot.
Conor Doherty: Jeff, how does that sound to you? Is there anything you wish to push back on?
Jeff Baker: The only challenge I would have, and I’m all for bringing in the latest data and technologies—again, I was at a software company, we were selling technology—my challenge is that sometimes we don’t need to overcomplicate things. Maybe it is better, but we need to make sure that we’re getting that incremental value for anything that we do. Also, the explainability is critical.
For us here in the room, people interested in watching this, we’re all into probabilistic forecasting and data. When we get to the implementation, if I’m trying to sell whatever algorithmic forecast, if it suffers from explainability issues, we can start to get some pushback because humans naturally have some algorithm aversion. They’re going to push back on things they don’t understand. A lot of these forecasting techniques, the simpler ones, are relatively easy to explain. That’s where I’m saying sometimes the simpler ones, from a performance standpoint, work just as well. Simple ensemble forecasts beat 92% of the people fighting for fame and fortune in the M5 competition. So, there’s a certain value in that.
I would also say you don’t want to overtax any individual organization. Some organizations have a maturity level we need to bring them up to. For many of them, if I can get them to do exponential smoothing forecasts, great. Ensembles, great. Talk about what the prediction intervals are, fantastic. No problem with the fancier, more sophisticated algorithms. We just need to make sure we bring them up based on their ability to digest and accept that technology. Otherwise, we run the risk—and I’ve had that problem before—of using mixed-integer linear programs for companies, and if they didn’t understand it, they wouldn’t accept it. That’s my watch-out as we try to push more sophisticated algorithms.
Conor Doherty: Joannes, anything to add there?
Joannes Vermorel: Again, I think there is slightly a disconnect because what I’m saying is that first, when we do probabilistic forecasting, the methods themselves are quite simple. Any sophistication there is only to precisely take into account the factors. Our alternative to FVA is to say that when people have information and raise a hand, that’s going to be factors to be included. The forecast itself is just going to use this extra information as an input, not as an override to the output.
It has plenty of positive consequences, such as if you refresh the forecast, people don’t have to refresh the override. The problem is that people think of it as an override, as something static. But if you override the forecast, what if next week there is new information and your baseline has just moved? What do you do? Do you reapply the same override, the same delta compared to what you had before? There are tons of complications that just result from the fact that marketing provides information in the form of a forecast override. It is much easier to say, “Marketing, tell us we are going to promote this product with an ad spending of this much,” and then you factor that as an input to your forecast. You can even backtest if it adds accuracy or not. This backtest is something you can get on the spot; you don’t need to wait for three months to see if your manual correction yielded something positive or not.
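A minimal sketch of the “input, not override” pattern described here: the marketing signal (ad spend) enters the model as a feature, and its contribution is backtested on the spot by comparing the model with and without it. The synthetic data, the linear model, and the train/test split are illustrative assumptions, not a description of Lokad’s actual pipeline.

```python
import numpy as np

# Hypothetical weekly history: baseline demand plus an ad-spend effect.
weeks = 52
rng = np.random.default_rng(0)
ad_spend = rng.uniform(0, 10, weeks)                  # marketing input, in k$
demand = 100 + 8 * ad_spend + rng.normal(0, 5, weeks)

train, test = slice(0, 40), slice(40, weeks)

def fit_predict(X_train, y_train, X_test):
    """Ordinary least squares via numpy's least-squares solver."""
    beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    return X_test @ beta

ones = np.ones(weeks)
X_without = np.column_stack([ones])            # model ignoring the input
X_with    = np.column_stack([ones, ad_spend])  # model using the input

for name, X in [("without ad spend", X_without), ("with ad spend", X_with)]:
    pred = fit_predict(X[train], demand[train], X[test])
    print(f"Backtest MAE {name}: {np.mean(np.abs(demand[test] - pred)):.1f}")
```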
Conor Doherty: If I can just jump in for a moment there, because you touched on something that I did want to put to Jeff as well. With regards to “simpler is better,” building on what you just said: would it not theoretically be simpler for an expert to interview other experts, elicit insights, and then translate them into an algorithmic decision, versus everyone, including non-experts, touching a forecast? In terms of simplicity, what are your thoughts there? Are they equally simple, or is one more complex than the other?
Jeff Baker: Typically, where I have used FVA has been in the sales and operations planning process, demand review, looking at the time horizon that’s three months out. We’ve got the current month, current month plus one, two. Typically, a lot of CPG companies, a lot of companies, that’s kind of the frozen schedule. That’s where you start to transition between the planning and the execution, so it’s S&OP versus S&OE.
When we are in that S&OP domain, we are looking at some of these events, some of the drivers. What we do there is, if we’re doing it at an aggregate level, we’re gathering inputs. If it’s input at a product family level, you can use those high-level drivers to propagate down. You can disaggregate some of those decisions, which is a common practice—take a high-level number and disaggregate it down to the details.
On the executional side, if there was a significant event that happened, then yes, you could also do that at a family level or a company level, depending on the size of that impact. I think that’s easier. I don’t necessarily advocate adjusting each and every SKU because, as you mentioned earlier, that can be burdensome. But if there are huge impacts, we need to make those adjustments.
In that planning, in that replanning, going back to the plan is nothing, planning is everything. That notion of replanning, revisiting, if we do that in the S&OP process once a month, we’ve got all the decision-makers in the room. We do it once a month, and we also start to focus on what’s already in the model, what’s new information, and how do we incorporate that new information. In fact, one of the times I almost stood up and cheered during a demand review meeting was when the sales leader, the VP, was in the room, and he goes, “Okay, what new information do we know about this?” No new information. “Okay, the stat forecast, the algorithm forecast stands,” and that was it. There wasn’t a lot of bureaucracy in that.
I think that’s the ideal way to do it unless there are some really big events that we know about. But then, if you are the supply chain scientist, you have to proactively go out and seek that input. Whereas, from the way I’m suggesting it, sales and marketing come to the meetings knowing that they have to tell me what’s net new, what’s changed, what wasn’t in last month’s assumptions. We try to make that as quick as possible. For a large CPG firm, you can go through that pretty quickly at a family level.
Conor Doherty: Thank you, Jeff. Joannes, anything to add?
Joannes Vermorel: Not too much, but to keep it concise: the reason why we really advocate having the expert take in the information, instead of having the forecast tweaked, is that the information almost invariably mismatches the granularity at which the company forecasts its business. They say, “Oh, we have this competitor who is going bankrupt.” There is no clear overlap between this competitor and exactly what he’s doing; it’s not one-to-one. There is an extensive problem of working out, category by category, which products are impacted.
The idea that the information sitting in the head of a person from sales or marketing or whatever can find a matching time series to override is almost never borne out. Just finding what exactly is relevant to touch is difficult. And what about all the other sources of uncertainty? When Lokad operates a company’s supply chain with predictive models, we easily have half a dozen predictive models: one for the demand, yes, but also the lead times, production yields, future prices of suppliers, anticipated volatility of competitors’ prices, etc.
The point with FVA is that it also has this problem of impedance mismatch between the granularity of the information that you have and the time series. It just puts demand on a pedestal while you can have tons of other uncertainties that need to be forecasted. There is also information such as, “Okay, this supplier is getting completely overwhelmed, lead times are just going to explode.” That should be reflected in the lead time forecasting algorithm.
Jeff Baker: Yeah, I’ve got no argument with you there in terms of, yeah, there are things that we need to look at in supply chain. In no way, shape, or form am I recommending that we focus on FVA and forget about looking at supplier lead times or anything like that. That’s not correct. We need to focus on what the demand is, from an efficient standpoint, what is my best demand, right?
And I also need to know all those other things about my supply side too. We need to do both of those, right? And yes, totally agree. We need lead time, lead time variation, we need to understand manufacturing, when they could be down, supplier pricing, we need to know that as well. So I’ve got no argument there. The only argument I have is I am not saying focus on FVA to the exclusion of those other things.
Conor Doherty: Well, we do have other questions from the audience I will get to. There’s just one last point that was raised, and I will pose it. I first tried to paraphrase fairly, Jeff, so just confirm if I paraphrase fairly. But then I want to press you, Joannes, on it. Jeff, in your concluding remarks, you made the claim that there exist items that cannot be reduced to an algorithm. Is that a fair summation of what you said? There exist certain items, I think it was low forecastability, that you can’t simply depend on an algorithmic solution to provide.
Jeff Baker: Yeah, and there are events, you know, it’s more related to events being non-repeatable.
Conor Doherty: And the implication there being that they do require manual override, manual input.
Jeff Baker: Some contextualization of the fact that there is an event. I don’t have anything I can model it with, but I’m going to have to make a decision. I’m going to use an expert to help me out on that.
Conor Doherty: Joannes, your thoughts on that? Because I was very curious listening to that to get your take.
Joannes Vermorel: So that’s where I was mentioning the expert Philip Tetlock in my argumentation. He actually wrote a book called “Superforecasting” and he assessed human forecasting capabilities with a project that has been running for a decade called the Good Judgment Project. It was funded by IARPA, the intelligence equivalent of the US DARPA.
What they found was that the people who were good forecasters for this kind of intuitive way to forecast things, for things where you don’t have an algorithmic recipe, their immediate conclusion was when there is an algorithmic recipe available, it’s better. When there is none, okay, back to humans and high-level judgments. But what they concluded, and that’s one of the conclusions of the book, is that the superforecasters, so people who achieve consistently superior forecasting accuracy, are in fact building micro-algorithms tailored to the case. That’s literally it. And when people are able to do that, they have a massive improvement in accuracy. The order of magnitude is we’re talking about a 30% accuracy increase, even on things that are super difficult to assess, like for example, would the former president of Syria get back in power in the next five years? Just a question that is very difficult to answer.
So the bottom line is, if we go back to those conclusions, it again supports the idea that it’s not the person who has the information who needs to be the one translating this information into a quantitative statement about the forecast of the company. That’s what I’m saying. And that’s why I think FVA and the practice of those manual overrides get it wrong. The way we approach that at Lokad is that somebody gives us the information, the raw information, and then, if we have an event that comes out of nowhere, we need to invent a kind of mini numerical recipe that converts that to a number.
And the interesting thing is that you not only have to invent it, but you have to document it. You have to explain what was the logic, even if it’s like three sentences that just say, “Okay, I do this, multiply this by this, apply a ratio and a discount,” something very simple like a cooking recipe. Again, if we go back to “Superforecasting,” this book, it is exactly how the superforecaster people who achieve without algorithms superior forecasting results do it. They have the explicit numerical recipe which makes their process improvable. So it’s not just information, you need to have a process that is repeatable and improvable on how you convert those insights into numbers. That should not be like magic in the mind of the people.
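A deliberately tiny illustration of such a documented mini-recipe, every number being a hypothetical placeholder:

```python
# Documented recipe: "competitor X is going bankrupt" -> month-1 demand uplift.
competitor_category_volume = 1_000  # units/month, analyst estimate (assumed)
capture_ratio = 0.25                # share of orphaned demand we expect to win
ramp_discount = 0.5                 # first-month haircut while buyers switch

uplift = competitor_category_volume * capture_ratio * ramp_discount
print(f"Month-1 demand uplift fed to the model: {uplift:.0f} units")
```

The value is less in the arithmetic than in the fact that the recipe is written down, and therefore repeatable and improvable.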
Jeff Baker: No, I totally agree. We have people document what your assumptions are. An extension of that, yes, if you could have some large language model AI for the sales team or for the marketing, fantastic. Because that is one of the biases, you ask people, you try to get their inputs, and sometimes they remember stuff, sometimes they don’t remember stuff. A lot of times we’re going back through the data going, “Okay, when did we do that price increase? Oh, we’re coming up on a year, maybe the first bump we hit from the pricing increase has died off by now.” I’m all in favor of automating that if you can. You need to have that conversation with the people and start getting that. It has to be a way of life for them because there are so many of those cases right now in a lot of companies. So I think, yeah, you start down that road.
Conor Doherty: Right, well at this point I will push forward to some of the questions that have come in from the audience. So I think I asked you first last time, Jeff, this is for both of you but I’ll go to you first. This is from Nicholas. How can one manage a situation where too much information is coming in, forcing statistical models to change frequently, even with an S&OP in place? How can the pressure from marketing and finance teams be balanced effectively?
Jeff Baker: So the question is if there is a lot of different information coming from sales and marketing?
Conor Doherty: Yeah, basically if there’s a wave of information coming in, how do you handle that, especially if it’s forcing the statistical models to change quite frequently with, let’s say, lots of overrides, for example. Though I added that parenthesis myself.
Jeff Baker: Okay, so the statistical model itself wouldn’t be changed. So we’re talking a stat forecast model based on time series and then overrides from sales and marketing is what I take it to be. We’re not talking about…
Conor Doherty: I don’t have Nicholas with me right now, sorry.
Jeff Baker: Okay, so in that case, you need to make a decision at the time horizon, right? So if I need to make it three months out in order to set my manufacturing schedule, my run sequencing for my finite capacity scheduling, then yeah, we need to have the practice of, you know, don’t come in at the last minute with new information. One of the other things, from a manufacturing side, that I’m a huge advocate for is tracking frozen time fence violations to train the sales team. Hey, surprises in the current month and current month plus one are not welcome. And that’s a cultural thing, right? And that’s how I would address that. I mean, in one instance, a salesperson came in and said, “Hey, we got this big huge sale,” and they waited till the last minute to tell manufacturing. That’s not a good deal, right? You’ve just cost us quite a bit of money.
So that idea of frozen, like you need to make the call on that time frame, that has to be your best call and realize we’re going to plan based on that and don’t surprise us. That’s how I would deal with a wave like that. In fact, one of the S&OP metrics that I love is that frozen time fence violation. It’s like how often are we whip chaining our manufacturing folks just because you waited to the last minute to tell us there is a new sale.
Conor Doherty: Thank you, Jeff. Joannes, any thoughts there? Also, feel free to fold into your answer again, how would an expert under your rubric handle a lot of sudden insights from a lot of different people?
Joannes Vermorel: First, the approach of Lokad is that we automate everything. So for us, you know, that’s the sort of situation where, first, you need to have bandwidth for that. And you see, that’s what’s interesting about automating everything. By definition, the supply chain scientist has, once the thing is automated, a lot of bandwidth to actually take in an exceptional situation. That is typically not the case in a situation where people have already all their time eaten up dealing with the routine. So that’s the first thing.
The second thing is that the instability of forecasting is a characteristic of classic time series forecasts, you know, point forecasts. So you add a little bit of information, and the thing will just jump up and down because, according to your accuracy metric, this is what you should do to be super reactive, to be most accurate. Very frequently, you have this tradeoff: if you want to be quite accurate, you need to capture change very rapidly, and that makes the forecast very jumpy. Here, if you go for probabilistic forecasting, it tends to eliminate the problems of jumpiness because you already have a probability distribution that is kind of diffused. So, the fact that you observed an outlier and whatnot, you still have this diffused probability distribution. There is no major jump in the spread of the mass of the probability distribution.
Also, regarding the jumpiness of the forecast, even with point forecasts, which can be radically altered: why do people dislike jumps in the forecast? The answer is that forecasts are then manually processed again, with manual overrides, manual revisits, and whatnot. This is not how Lokad does it. The forecast is automated, the decisions are automated. So when the forecast moves, decisions automatically and immediately reflect whatever the new state is, taking into account the fact that you might be financially committed to a certain course of action. So, yes, the demand has changed, but you’ve already produced inventory. So now, even if the demand is not what you expected it to be, you still have to liquidate, to sell this inventory somehow. So, you see, automation streamlines and largely removes the problem of putting a tempo on when the information is added. The information can be added whenever it becomes available, and as soon as it becomes available, it is taken into account.
Jeff was speaking about culture. The interesting thing is that it immediately rewards people for bringing the information because, literally, the day they add their information, it is validated. The next day, production schedules are all steered. Production schedules, inventory allocation, dispatch, purchase orders, everything immediately reflects this information that has been provided just yesterday. So, for people, you see a way to develop a culture of bringing the information forward. They have to see that when they bring information, within almost hours, it is reflected at every scale in every single decision. That’s how you can make it very tangible, not by telling them, “Come back next month, and then we will consider starting to look at your stuff.”
Conor Doherty: Thank you, Joannes. Jeff, I definitely want to hear your thoughts on that last part because I could see it in your face. It was resonating at least a little bit.
Jeff Baker: Yeah, yeah. I mean, to me, that sounds like a recipe for a bullwhip effect, right? You’re saying that any little bit of information I’m going to throw in there, I mean, I appreciate the responsiveness and the technical capability to immediately reflect what the best decision is. The challenge lies in the fact that we’ve already made a lot of those decisions. If I’ve done my scheduling and, let’s say, I’m making oatmeal: I’ve got regular oatmeal, then I’ve got oatmeal with cinnamon, I’ve got cinnamon and apple, and then I make cinnamon, apple, and walnut. Well, now I’ve got an allergen. There’s a huge changeover cost between those. Now, if you’re jumping in and I have to disrupt that schedule immediately, there’s a huge financial cost, a potential financial cost, to that. If all of a sudden I need to order more of a raw material more quickly, I’m going to have a bullwhip on my supplier.
So, there are some advantages to stability. In fact, there’s a lot of interesting conversation going on about whether there is a valuable and stable forecast, not the most accurate, but accurate plus stable, because stability does have some benefits in supply chain. So, I mean, that’s an area we’re just starting to do research in, but it speaks to the fact that many of these decisions have to be made. We’ve kind of planted a stake in the ground, and everyone will laugh when I say, you know, a frozen forecast. Like, man, it’s not really frozen. Okay, we all know it’s not frozen, but there is a financial consequence to making changes in decisions.
So, while technically I think it’s great that we can reflect, “Hey, now this is the best decision,” I think we need to temper that with the fact that if we change based on everything that comes in, there are going to be some costs associated with it. That might be perfectly fine for some supply chains. If I’ve got a responsive supply chain, maybe we’re fine. Maybe that’s the world we live in. If we have an efficient supply chain where changes are expensive and difficult to make, that’s where I see an issue.
Joannes Vermorel: I understand. I mean, at Lokad, obviously, modeling the cost of change is something we do. Every resource allocation, if it deviates from what was previously planned, we model that in the cost. It’s super basic. So, the thing is not going to jump if the cost of change exceeds the anticipated benefits. For me, people usually approach this by looking at extremely nonsensical numerical recipes and saying, “Oh, look, this is a problem.”
For example, we have a decision that is super naively slaved to a forecast, with no consideration whatsoever of current commitments, etc. It is incredibly naive. Obviously, you need part of your numerical recipe to have the decision implement whatever cost of change and all sorts of costs. There are plenty of them. And that’s where probabilistic forecasting also shines; it gives you even more. You factor in the fact that if you take this decision now, will you have to revise it in the future? Because, again, if you have this point forecast, by definition, you assume that you already know the future. So, your model prevents you from automatically considering that your forecast might be wrong. But with probabilistic forecasts, it is a given. The forecast already tells you that demand can be anywhere within this range, and you have the probabilities. So, when you optimize the decision, you will compute not only the cost of change if there is any change, but also the fact that change might be needed in the future.
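A compact sketch of the logic described here: optimize an order quantity against a scenario-based (probabilistic) demand forecast, but only deviate from the committed plan when the expected gain pays for the changeover. All economics are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
demand_scenarios = rng.poisson(lam=80, size=10_000)  # probabilistic forecast

UNIT_MARGIN, HOLDING_COST, CHANGEOVER_COST = 5.0, 2.0, 300.0

def expected_profit(qty: int) -> float:
    """Average profit of producing qty units across all demand scenarios."""
    sold = np.minimum(qty, demand_scenarios)
    leftover = qty - sold
    return float(np.mean(UNIT_MARGIN * sold - HOLDING_COST * leftover))

committed_plan = 70
best_qty = max(range(0, 201), key=expected_profit)
gain = expected_profit(best_qty) - expected_profit(committed_plan)

# Only disturb the schedule if the expected gain pays for the changeover.
new_plan = best_qty if gain > CHANGEOVER_COST else committed_plan
print(f"best={best_qty}, expected gain={gain:.0f}, decision: produce {new_plan}")
```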
Conor Doherty: Well, gentlemen, again, I am mindful of time and there are still at least four more questions to get through. So, in the spirit of parity, I will pose this next one to you, Jeff... sorry, forgive me, I should actually pose this to Joannes. Would FVA be a good approach to help ease the pressure to adjust statistical models to meet budget expectations? And the second part: how can a supply chain data scientist navigate the politics and hierarchy when faced with such challenges? And Jeff, I will come to you for a comment.
Joannes Vermorel: Again, that's the problem with point forecasts. Point forecasts assume that you know the future. If you know the future, everything, the plan included, is a matter of orchestration, and your forecast tells you the budget you need for everything. And that's wrong because, first, the forecast has inaccuracies, and second, you're completely dismissing the uncertainty.
A point forecast is rigidly attached to a given budget. That's absolutely not best practice. But if we go to the world of probabilistic forecasts, then suddenly all those problems go away. What you have are possible futures, and then all the budget spending can be weighed against them. If you tell me you have this amount of resources, then you look at how to allocate those resources to make the most of them according to the probabilities over those futures.
And by the way, we have an example of that. If people want to have a spreadsheet, they can look at prioritized inventory replenishment on our website. It’s an Excel spreadsheet that demonstrates that with a probabilistic forecast, you can pick whatever budget you have, and it will give you the best you can get for your budget. Again, that’s a problem of point forecasts being defective as a paradigm. Classic time series are defective as a paradigm, and you end up with a lot of problems that you would not have, even conceptually, if you were not attached to a defective paradigm.
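For readers who prefer code to a spreadsheet, here is a toy sketch of the underlying idea, again not the actual Lokad spreadsheet: under a probabilistic demand view, each additional unit of stock has a marginal expected reward, and a greedy pass spends whatever budget is available on the best units first. The SKUs, margins, and probabilities below are invented for illustration.

```python
# Hypothetical SKUs: unit cost plus a discrete demand distribution each.
skus = {
    "A": {"cost": 5.0,  "demand_probs": {0: 0.2, 1: 0.3, 2: 0.3, 3: 0.2}},
    "B": {"cost": 12.0, "demand_probs": {0: 0.5, 1: 0.4, 2: 0.1}},
}
MARGIN = {"A": 3.0, "B": 10.0}   # profit per unit sold (assumption)

def marginal_reward(sku: str, stock_level: int) -> float:
    """Expected extra profit from holding one more unit: the margin times
    the probability that demand exceeds the current stock level."""
    p_sell = sum(p for d, p in skus[sku]["demand_probs"].items() if d > stock_level)
    return MARGIN[sku] * p_sell

budget = 40.0
stock = {s: 0 for s in skus}
while True:
    # Rank every candidate "next unit" by expected profit per dollar spent.
    candidates = [
        (marginal_reward(s, stock[s]) / skus[s]["cost"], s)
        for s in skus if skus[s]["cost"] <= budget
    ]
    if not candidates:
        break
    score, best = max(candidates)
    if score <= 0:
        break  # no remaining unit is worth its cost
    stock[best] += 1
    budget -= skus[best]["cost"]

print(stock, budget)
```

Because units are bought in decreasing order of marginal return, truncating the budget at any point still yields the best allocation for that budget, which is the property the spreadsheet demonstrates.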
Jeff Baker: Just one off-the-cuff comment about a defective paradigm that has been working really well for a lot of companies for dozens and dozens of years; I wouldn't classify it as defective. Now, with respect to the budget: the forecast and the budget are a huge problem, because budgets are aspirational, right? The best thing we can do, and again this is from a sales and operations planning standpoint, is what I always have my clients do: forecast out 18 months. So, guess what? By mid-2025, we've already started to look at 2026. We can understand what the most likely view of 2026 is going to be.
Then from there, you can layer your aspirational goals on top of it. No problem with that, but then you force the conversation: what would have to happen for us to increase our sales by this much, to decrease our costs by that much? That's part of the conversation. So, the best way to do it is to base your budget on a solid statistical forecast, or whatever consensus forecast you've got for the future. Base your budget on that. Then, if you've got deviations from it, also have sales, marketing, and manufacturing plans to shore up those gaps. The worst practice, and I hope everyone agrees with that, is the ivory tower approach: this is what our budget is, finance plugs it in, and it reminds me of the old calculus problems where the derivation is left to the reader, and we're like, "Oh, how the heck are we going to do that?" So, that's a worst practice, just putting those plugs in there.
So that, I think, is a way to make sure your original budget is grounded in reality, with plans to get there. The second thing I'm saying is that FVA is perfect for this. This is what our statistical forecast is; there is no FVA against the budget, right? That's aspirational. But you point out where the gaps are and force the conversation on how to address them. And yes, I think all of us would agree that another absolute worst practice is forecast equals budget. That would drive me crazy.
Conor Doherty: Well, thank you, Jeff. I will push on to the next comment, next question, excuse me, and this one is directly to Joannes. This is from Timur, I believe. I find FVA useful but sometimes limited in scope. Would you agree with Jeff’s metaphor of FVA being like a hammer, or do you see it differently?
Joannes Vermorel: I mean, yes... I'm not too sure. My criticism of FVA is really not that it is a hammer. It's that I believe it operates within the wrong paradigms. Again: time series, the classic point forecast, the fact that it's rooted in accuracy measured in percentages of error rather than in dollars of error. You see, there are plenty of paradigmatic problems, and that's what I mean when I say these things are defective. I stand by my point on this front. The sort of friction that companies endure in practice is the manifestation of all those problems.
If you want a piece of anecdotal evidence that those paradigms are defective: since the 1970s, supply chain theory has been promising complete automation of these decision-making processes. That was, by the way, Oracle's pitch in the '70s: you will get completely algorithmically driven inventory management. It did not happen; it failed again and again. The point I'm making, and I have a lot of supporting arguments, is that this reflects the fact that the paradigms, the mathematical tools, the instruments, are just wrong. So you end up with all sorts of weird issues. To cycle back to the hammer: yes, it sometimes feels like trying to use a hammer to tighten screws. It's not that the hammer is bad in itself; it's that you're trying to do something for which the hammer is not the right tool.
Conor Doherty: Jeff, I will give you space there for a comment if there’s anything you wish to add.
Jeff Baker: No, other than the fact that the analogy I made about the hammer was that you have to use the tool correctly. So, FVA is a tool. If you’re not using the tool correctly, you’re not going to get value out of it. That was my analogy.
Conor Doherty: Thank you. I’ll push on. This is from Marina. There’s no clear designation of who this is for, so I will go to Jeff first. With AI rapidly advancing and the possibility of having all data available in the near future, do you think FVA will become more effective or even more essential?
Jeff Baker: More effective or more essential? That's an interesting question. What I think is that, as AI becomes more and more prevalent, with more and more data, we have to learn how to contextualize it and make decisions with it. You could almost envision a case, far down the road, where we're able to contextualize all this information, large language models put it together, and, to your point, Joannes, we actually start to systematize that stuff. That could lead to FVA being, "Well, okay, we're making all these decisions, and they're great decisions."
Maybe then you're left with the rare edge cases of significant events, like a competitor going out of business, or running a promotion at the same time as a downturn in the economy and an increase in your seasonality. You might be able to start to capture those. So, I think there's probably going to be some low-hanging fruit where AI is going to be fantastic at taking all that data, understanding relationships, and also understanding the noise that exists, what's valuable and what's not. So, I could see FVA potentially becoming even a little bit less valuable in the future as we do start to automate.
Conor Doherty: Well, thank you, Jeff. Joannes, AI and FVA, the future: yes, no, good, bad?
Joannes Vermorel: I think, again, we have to step back on artificial intelligence. Let's consider it in terms of mass of information: transactional systems hold gigabytes' worth of information. People, in comparison, hold kilobytes; what people have in their heads is not a huge amount of information. People are not mentats, to borrow from the Dune series for those who like it. So, in terms of mass of information, 99% of the problem of getting correct decisions is about taking the mundane transactional information you have and generating the decisions out of it. That's 99% of the bulk of the information here.
For this part of the problem, which is processing numerical data found in tabular form, I do not see large language models being very relevant. Yes, they can be very effective coding tools, so you can use them to write the code for you. That's one thing. But can they do more than write the code? That becomes very unclear.
Now, for the kilobytes of information that people have in their heads: can LLMs take that information and bridge the gap towards something quantitative? I would say yes, but the challenge is still engineering the thing end to end, the pipeline for this automated predictive optimization. That is a real challenge, and here we are stressing the limits of human intelligence to do it right. So I do not see, in the near future, the AI we have being able to do that, any more than an AI company could replace Microsoft by having an AI rewrite a version of Microsoft Word. The AI can help you write the code, but it still requires a lot of intelligent human supervision, at least with the current LLM paradigm. They are not superintelligent yet.
Conor Doherty: Yet. Well, there are still two questions to go. So, Jeff, if you want, we can push straight on to the next one.
Jeff Baker: Perfect.
Conor Doherty: Thank you. This one is for you, Jeff; you'll go first. This is from Mark. How can forecast confidence intervals be effectively translated into a single discrete number, such as a purchase or work order? Would post-forecast analysis be the best approach to determine that number?
Jeff Baker: Yeah, so, you know, I have no problem with probabilistic forecasting or the intervals, but at the end of the day, you need to put a number into your ERP system, into your scheduling system. You need to make a choice on the number. Now, where the conversation gets interesting is: what is the variation around that number? Is my system robust? What happens if demand is up 20%, 30%, whatever? Those become scenarios you can start to investigate, right? So, yeah, that would be my response.
Conor Doherty: Thank you. Joannes, anything you wish to add there?
Joannes Vermorel: Yes. Again, if you approach the problem from a paradigm where you have to make a decision on inventory quantities, and thus the forecast must be a single number, then uncertainty cannot exist in your system. And that's why people, going back to this incorrect perspective, end up asking, "What about those confidence intervals? What do I do with them? Oh, I need to pick a number." That's a paradigmatic trap: you're trapped in concepts that are defective.
So, even if the final inventory decision has to be a single number, the sound way to get there is to preserve, in the decision itself, the fact that it reflects all the possible scenarios. You don't pick a number for the inventory by picking a specific demand. Your inventory decision should reflect all the possible scenarios, with their priorities, and express the various risks in monetary terms. That's why it's a different way of thinking about it. And back to the question: if you stay in the time series paradigm, you don't know what to do with your confidence intervals. They don't fit in the system.
Jeff Baker: I would argue that you do know exactly what to do with those confidence intervals, because the forecast error at my lead time goes into my safety stock calculation. Whether you agree with it or not, there are very well-defined safety stock calculations that take in demand variability and lead time variability. I don't want to blow this up and go sideways into inventory management theory, but there are statistical safety stock calculations that are perfectly happy taking a point forecast, along with the standard error of your forecast at the lead time, and giving you a safety stock number. We can argue all day about what the distribution looks like and whether a normal distribution is the proper one, but that is how forecast error is addressed in the majority of the companies I have worked for and the majority of the companies I've heard present at conferences.
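For concreteness, this is the textbook calculation Jeff is referring to, sketched in Python; the service level and all inputs are illustrative, and the formula rests on the normality assumption he flags as debatable.

```python
from statistics import NormalDist

service_level = 0.95
z = NormalDist().inv_cdf(service_level)   # ~1.645 for a 95% cycle service level

avg_demand = 100.0        # units per period (the point forecast)
sigma_demand = 20.0       # std dev of forecast error per period
lead_time = 4.0           # periods
sigma_lead_time = 0.5     # std dev of the lead time, in periods

# Classic formula combining demand variability and lead time variability:
# SS = z * sqrt(L * sigma_d^2 + d_bar^2 * sigma_L^2)
safety_stock = z * (lead_time * sigma_demand**2
                    + avg_demand**2 * sigma_lead_time**2) ** 0.5
reorder_point = avg_demand * lead_time + safety_stock
print(round(safety_stock), round(reorder_point))   # ~105 and ~505 here
```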
Conor Doherty: I will push through to the last question. How should machine learning models handle adjustments for known events such as a large new customer that are not included in the causal factors of the statistical forecast? Joannes, we’ll go to you first.
Joannes Vermorel: So again, here we are touching on the problem of dealing with information that has no clear algorithmic structure, and you can't just invoke machine learning as a buzzword and say, "Oh, the tech is going to do something for me here." We are entering the territory of informal forecasting. For the audience, I really recommend Philip Tetlock's book "Superforecasting." If you do not have a clear baseline, what do you do?
Machine learning provides no answer to this question. At least the classic machine learning paradigm, supervised learning with inputs and outputs, provides no answer to it at all. I do believe that if you read what the Good Judgment Project has done, the techniques they have developed are techniques of higher intelligence. What I mean by higher intelligence is that applying them requires something exhibiting at least the same sort of fuzzy intelligence as an LLM, or above.
They identified these techniques by talking with superforecasters, people who have pioneered and demonstrated superior forecasting skills in this sort of situation, and looking at what techniques those people had in common. Surprisingly, they all converged on roughly the same set of techniques. Long story short, there are techniques, but they require a lot of judgment. Based on those empirical results, I do not think you can get away with just a machine learning algorithm in such a situation.
You need to build a case, a little like a business case, where you engineer your own assumptions, decompose the problem, assess the various factors, and try to come up with something that is reasonable. But what does "reasonable" mean, formally? It's very difficult to say, and yet people can look at a rationale and agree on it.
So, my take would be: do not expect classic machine learning to be the answer. An LLM may be a supporting tool to help you build this sort of reasoning, certainly to brainstorm on how to decompose and quantify the various factors of the problem. But at the end of the day, it is a forecasting expert who really looks at it and makes the judgment call on the ad hoc numerical model. That would be the best practice, at least based on the empirical studies from the Good Judgment Project.
Conor Doherty: Jeff, your thoughts?
Jeff Baker: Yeah, I mean, I think we need to be careful with too much of this, whether it's machine learning or AI, throwing too much stuff into it, right? Because then we might start to confuse correlation with causation. One of the classic learning experiences I had in one of my classes was a multiple regression model. We kept throwing in factors, and the fit got better and better. Then we threw in price, and all of a sudden, guess what? If I increase the price, I'm going to increase sales, right? Totally counterintuitive. The causality was obviously off, but the correlation was better.
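That classroom experience is easy to reproduce with synthetic data; the sketch below is illustrative only. The in-sample fit keeps improving as pure-noise factors are piled on, even though none of them carry signal, and with real, collinear data the price coefficient can even flip sign, as in Jeff's anecdote.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 60

# Synthetic ground truth: sales fall as price rises, plus noise.
price = rng.uniform(8, 12, n)
sales = 200 - 10 * price + rng.normal(0, 15, n)

def fit_r2(X: np.ndarray):
    """Ordinary least squares with intercept; returns coefficients and R^2."""
    X1 = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(X1, sales, rcond=None)
    resid = sales - X1 @ beta
    return beta, 1 - resid.var() / sales.var()

# Price alone recovers the true negative relationship.
beta_small, r2_small = fit_r2(price.reshape(-1, 1))

# Add 40 pure-noise "factors": in-sample R^2 rises anyway, and the
# individual coefficients become unstable. The fit is better; the model is worse.
noise = rng.normal(size=(n, 40))
beta_big, r2_big = fit_r2(np.column_stack([price, noise]))

print(f"price-only R^2: {r2_small:.2f}  with 40 noise factors: {r2_big:.2f}")
print(f"price coefficient: {beta_small[1]:.1f} -> {beta_big[1]:.1f}")
```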
So, we need to be really, really careful with that, because at some point we begin to model noise. Whether it's machine learning or AI, we begin to ask it questions, and the problem is that the AI will answer with all the overconfidence of a five-year-old who believes in the Tooth Fairy. That is exactly what happens. So we need to be careful, and that's where I would agree with Joannes: you've got to contextualize it, find an expert who can do that. Don't try to build a perfect model, because at some point your results are not going to be what you anticipate.
Conor Doherty: Well, at this point, gentlemen, there are no further questions from the audience. But one last question can serve as a closing, a curtain call, and I'll come to Jeff first. The topic of the debate was FVA: is it a best practice or a time-waster? You've both heard each other for almost 80 minutes now. Jeff, how do you feel about that proposition? Listening to your points and your rebuttals, you seem quite charitable towards, and accepting of, several of the points Joannes is making. I'm curious: how do you reconcile the possibility that Joannes is correct on some, many, or all of his points with the position that FVA is still a best practice?
Jeff Baker: Yes, yeah. I appreciate this open exchange of ideas, and I think, yes, I can still argue that currently, Forecast Value Added is a best practice. We need that in the present.
Marina asked a great question earlier: in the future, as technology advances, FVA may become less and less of a critical technique. Maybe we can quantify things, parameterize them, put them in a model, and make those decisions automatically. However, I think we are always going to need a process where we establish collaboration and understand sales, marketing, and external influences.
Do I see its role potentially diminished? Yes. I talked about that high-value, high-variability quadrant, and I can definitely see that becoming less relevant in the future. But for now, I still see FVA as a best practice and believe it will continue to be a best practice well past my retirement.
So, in the near future—yes, FVA remains important. In the long-term future, I think Joannes has a very nice vision of what could be, and I don’t see many issues with that. I’d say we’re 50% aligned on a lot of these ideas.
Conor Doherty: Well, thank you, Jeff. And Joannes, are you unmoved after all that you've heard? In simple terms, is this essentially an immovable object meeting an unstoppable force?
Joannes Vermorel: I mean, I would say, you see, if we go back to what I called the mainstream supply chain theory, and FVA is part of that, it is fairly consistent. I grant that.
So indeed, if you accept all these ideas, all those paradigms, and everything, then yes, from this perspective, it doesn’t look bad. I would still be a little bit careful with the amount of bureaucratic overhead that you can generate.
Again, involving a lot of people is a recipe for consuming the time of a lot of people. As soon as you create some kind of transverse entity, and FVA would be one, since it is going to challenge everybody, you can still create a lot of busywork.
I have examples in my network of people who are doing an immense amount of busy work on these sorts of things, especially everything that is peripheral or supportive of S&OP.
Now, Lokad has been operating for more than a decade on different paradigms. There is a whole series of lectures, by the way, Jeff, almost 100 hours' worth on YouTube, backing this alternative vision.
But the interesting thing is that when you go for different paradigms, different tools, different instruments, the vast majority of those problems just go away. You get new problems, completely different ones, but operationally you end up with something that looks very strange: supply chains where almost all of the decisions are taken automatically. And by the way, we had a very strange experience in 2020-2021, when dozens of our clients sent their entire white-collar workforce home.
We had a client with over a billion euros' worth of inventory who sent their workforce home for 14 months, without internet access, because they wanted to qualify for government subsidies. Their supply chain kept running at about 80% of its nominal capacity, with Lokad taking all the decisions without even any supervision.
Normally, we generate the decisions, but a lot of people validate that what we generate is okay. My take is that if you can run a multi-billion-euro, hyper-complex supply chain for 14 months without all the people doing this micromanagement, it really raises the question of the added value of all those people and of what we should expect from automation.
I think people are talking about AI and all sorts of things, but the way I approach it is that you don't necessarily need a super fancy trillion-parameter model to achieve automation.
My conclusion is that I believe FVA belongs to a world where it’s really about people directly piloting the supply chain. I approach it from a perspective where the machine pilots the supply chain, and people pilot the machine, not the supply chain.
Conor Doherty: Well, thank you. As a custom here, we like to give the final word to the guests. So, Jeff, thank you very much for joining us. If you have any closing comments you wish to make, feel free.
Jeff Baker: No, just thank you very much for the opportunity. I appreciate the conversation, very well moderated, Conor. Thank you very much.
Conor Doherty: I aim to please.
Jeff Baker: I appreciate the audience participation in the questions there. I think it's always interesting when two opposing viewpoints come together, because I think both come away a little better for the exchange. Quite honestly, I appreciate the opportunity. Pleasure speaking with you.
Joannes Vermorel: Thank you, Jeff.
Conor Doherty: Joannes, thank you very much for your time. Jeff, thank you very much for yours. Thank you all for watching. We’ll see you next time.