00:00:00 Introduction to the interview

00:02:15 Probabilistic forecasting and supply chain optimization

00:04:31 Stochastic optimization and decision making

00:06:45 Ingredients for stochastic optimization: variables, constraints, loss function

00:09:00 Perspectives on modeling and optimization

00:11:15 Constraints and worst-case scenarios in optimization

00:13:30 Uncertainty, constraints and poor solutions

00:15:45 Deterministic optimization and varying scenarios

00:18:00 MRO space operations and inventory optimization

00:20:15 Uncertainty in lead times and repairable parts

00:22:30 Consequences of missing parts and classic approach limitations

00:24:45 Stochastic elements and human-based stochasticity in supply chain

00:27:00 Repairing an aircraft engine and sourcing parts

00:29:15 Inventory optimization and probabilistic bill of material

00:31:30 Policies for supply chain optimization and reactivity

00:33:45 Scalability issues and convex functions in supply chain

00:36:00 Problem relaxation and constraints in supply chain problems

00:38:15 Local search tools and feasible solution in supply chain

00:40:30 Metaheuristic genetic algorithms and scalability challenges

00:42:45 Mathematical optimization as a scalability problem

00:45:00 Lokad’s development of stochastic optimization technology

00:47:15 Interdependencies in supply chain and solving problems with money

00:49:30 Shelf limit constraints and yogurt inventory example

00:51:45 Summarizing stochastic optimization and uncertainty

00:54:00 Role of solver in supply chain optimization

00:56:15 Clarifying the term ‘solver’ and computation of final decision

00:58:30 Challenging the solver’s solution and potential shortcomings

01:00:45 Key takeaways: importance of stochastic optimization

01:03:00 Ignoring uncertainty in supply chain and benefits of a good solver

01:05:15 Dependencies and interdependencies in non-trivial supply chains

01:07:30 End of interview

### Summary

In a discussion between Lokad’s CEO, Joannes Vermorel, and Head of Communication, Conor Doherty, the importance of stochastic optimization and probabilistic forecasting in supply chain management is emphasized. Vermorel explains the concept of stochasticity, where the loss function is uncertain, a common occurrence in supply chain scenarios. He outlines the three ingredients of mathematical optimization: variables, constraints, and the loss function, and explains that in stochastic optimization, the loss function is not deterministic but randomized. Vermorel also discusses the scalability issues of mathematical optimization techniques for supply chain, which have been a roadblock for four decades. He concludes by emphasizing that stochastic optimization is a crucial aspect often overlooked in supply chain textbooks.

In a conversation between Conor Doherty, Head of Communication at Lokad, and Joannes Vermorel, CEO and founder of Lokad, the duo delves into the intricacies of stochastic optimization and probabilistic forecasting in supply chain management. Vermorel emphasizes the importance of having a clear, quantified anticipation of the future, which is crucial for optimizing a supply chain. He introduces the concept of stochasticity, referring to situations where the loss function is uncertain or noisy, a common occurrence in supply chain scenarios.

Vermorel explains that the loss function is expressed through economic drivers and is fine-tuned to reflect the dollars at stake in the business. He argues that even with effective probabilistic forecasting, optimization is still necessary due to the inherent uncertainties and nonlinearities in supply chain management. He outlines the three ingredients of mathematical optimization: variables, constraints, and the loss function, and explains that in stochastic optimization, the loss function is not deterministic but randomized.

Vermorel further elaborates on the concept of constraints in mathematical optimization, which are a way to express unacceptable solutions. He emphasizes that these constraints should align with the business strategy, just like the loss function. He also notes that constraints are not mathematically true or false, they just exist. For example, a maximum capacity of 100 units is not mathematically valid, it’s just a given. He explains that in a stochastic world, constraints become more subtle mathematically and may not always be enforced due to variability in factors like delivery times.

In the context of a Maintenance, Repair, and Overhaul (MRO) company, Vermorel explains that inventory optimization is crucial. The bill of materials is probabilistic, and if one part is missing, the component can’t be repaired. Lokad uses probabilistic forecasts to anticipate the arrival of components and the parts needed for repair. Decisions about part purchasing need to take into account parts that are coming back and potential scrap rates. The goal is to solve part purchasing problems.

Vermorel emphasizes the need for a stochastic optimization approach that considers parts not in isolation but together. The economic value of acquiring certain combinations of units can be vastly different compared to analyzing parts one by one. He confirms that the stochasticity of human ability to perform repairs can be accounted for in this model.

Vermorel also discusses the scalability issues of mathematical optimization techniques for supply chain, which have been a roadblock for four decades. He explains that the problems faced in supply chain are not nicely behaved, and the scalability of these techniques depends on how much leeway remains once all constraints are applied. He notes that solvers that approach solution space elimination techniques perform poorly beyond a thousand variables.

Vermorel explains that Lokad had to develop a new class of technology for stochastic optimization to address these problems at a scale that makes sense for supply chain. He agrees with Doherty’s summary that stochastic optimization is a more flexible and reactive way of optimizing decisions compared to traditional mathematical optimization. He also mentions the need for a software component, a solver, to address these problems.

Vermorel confirms that the solver generates the final decisions proposed by Lokad to its clients. He explains that there are different ways to approach optimization, including heuristics, but the solver is the tool that generates the solution given the forecast. He clarifies that ‘numerical recipe’ refers to the chain of processing from data preparation to result generation, while ‘solver’ refers to the computation of the final decision, which takes the forecast as input.

Vermorel concludes by emphasizing that stochastic optimization is a crucial aspect often overlooked in supply chain textbooks. He criticizes established players for selling solvers for deterministic optimization problems, ignoring the uncertainty inherent in supply chain problems. He highlights the benefits of a stochastic solver, which allows for a holistic view of the supply chain and its interdependencies.

### Full Transcript

**Conor Doherty**: Stochastic optimization is the tool through which economically viable decisions can be quantified and ultimately optimized. Here to discuss its importance and importantly how it works is Lokad founder, Joannes Vermorel. So Joannes, I think most people have heard the terms optimization many times from us in different contexts and it’s usually used in the same breath as probabilistic forecasting because ultimately those are the two key tools that we use. So before we get into all the mathematics of stochastic optimization, what is the executive level summary of both of these tools and why they’re so important?

**Joannes Vermorel**: Probabilistic forecast is really about having a clear, quantified anticipation of the future. So if you want to optimize your supply chain, you need to have some information about the future. So you need some quantified information. Probabilistic forecast is about knowing the future but also knowing what you don’t know, quantifying the uncertainty. That’s a part about understanding what lies ahead so that you can make more informed decisions.

The second part is getting to the better decision. What does that mean? Better according to what? And here there is optimization in a loose sense which just means making things better. But there is also optimization in the mathematical sense. In the mathematical sense, it would mean to find a solution to your problem that according to a numerical criterion gives you a loss that is smaller. For the sake of this discussion, we’ll probably stick to the loss perspective where you want to just minimize the loss.

So the optimization in this sense is a purely mathematical operation. It’s about, for a given problem, finding the solution that minimizes a loss function that you’ve given to yourself.

**Conor Doherty**: So for example, try not to lose too much money with the outcome of this decision.

**Joannes Vermorel**: So in the case of a supply chain, the basic decision would be: how many units do I order? And then for any quantity that I pick, there is an outcome with carrying costs and stockout penalties. Obviously, there are also the gains that I realize by selling stuff at a profit, which come in as a kind of negative loss, so that you can minimize the loss further by actually selling your products.

**Conor Doherty**: Okay, so I think most people will follow that when you’re talking about constraints. But where’s the stochasticity? I mean, you can optimize but where’s the stochasticity in what you just said?

**Joannes Vermorel**: So the stochasticity refers to classes of problems where we have an uncertain loss function, where the loss function is noisy. So if we go to this very classical inventory replenishment problem, I pick the quantity I want to reorder today but whatever score or loss I get from this decision, I will only know later. And for now, I am just stuck with uncertainty in what will be the final loss. And thus, when you have a situation where your loss function is fundamentally not known reliably in advance, you end up with a stochastic optimization problem as opposed to classic deterministic optimization problems where everything is perfectly known.

If we want to do component placement, where you just want to fit different components inside, let’s say, a box, taking into account all the physical dimensions of the various components, all of that is completely, perfectly known, so this is a problem that has no uncertainty. It can be very difficult to find the complete combination, but unlike supply chain situations, there is no uncertainty. Supply chain situations are, I would say, ultra-dominantly stochastic situations where there is some uncertainty.

The probabilistic forecast actually embeds this uncertainty, but it doesn’t solve anything. It just tells you this is the future and this is the uncertainty. It’s not a decision-making process or any kind of decision generation process either. That part is the optimization, and most specifically, a stochastic optimization.
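Vermorel’s distinction between deterministic and stochastic loss functions can be sketched in a few lines of Python. All unit economics and the demand distribution below are illustrative assumptions, not Lokad’s actual model:

```python
import random

# Illustrative reorder loss: carrying cost on leftovers, stockout
# penalty on unmet demand, and profit on sold units as negative loss.
# All numbers are made up for the sketch.
def realized_loss(order_qty, demand, carry=1.0, stockout=3.0, profit=5.0):
    sold = min(order_qty, demand)
    return carry * (order_qty - sold) + stockout * (demand - sold) - profit * sold

# Deterministic world: demand is known, so a decision maps to one number.
deterministic = realized_loss(8, 10)

# Stochastic world: demand is only known as a distribution (here an
# assumed discrete one), so the same decision maps to a distribution
# of losses; we can only summarize it, e.g. by its expectation.
rng = random.Random(0)
samples = [realized_loss(8, rng.choice([5, 8, 10, 12, 15])) for _ in range(10_000)]
expected = sum(samples) / len(samples)
```

The decision of ordering 8 units is no longer scored by a single loss but by the whole distribution of `samples`; that shift is what turns the problem into stochastic optimization.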

**Conor Doherty**: And how does one sort of fine-tune that loss function then, in light of all that stochasticity?

**Joannes Vermorel**: The loss function is very straightforward. At Lokad, we express that through economic drivers, so it is just about minimizing the dollars of error. There is a lot of know-how that goes into fine-tuning this loss function to make it really adequate to the business, so that it really reflects the dollars at stake. We have two orthogonal concerns. One is finding a loss function that is completely faithful to the business strategy. But that doesn’t require any specific numerical optimization skills. It’s just about, do I have something that reflects my business? It only requires basic arithmetic operations. It’s not something fancy. The other one is whatever we need to do stochastic optimization for any given loss function.

**Conor Doherty**: Well, it occurs to me that when you do very good or very effective probabilistic forecasting, couldn’t you skip the optimization part entirely? Because if you knew better what you were going to do or what demand was going to be, why do you need to balance all of these other things? Couldn’t you just order the amount that you’ve identified or the most probable return?

**Joannes Vermorel**: If you had perfect knowledge about the future, then indeed the decision-making part is not that complicated. Although even in this sort of situation, you would have to deal with MOQs, flat ordering costs, flat transportation costs. Even if you were to perfectly know the future, you would still be facing quite a few nonlinearities that prevent you from having a trivial immediate solution.

But in fact, the situation is much worse than that because you only have very imperfect knowledge about the future. It’s a completely unreasonable assessment to say we are going to ever get a forecast that eliminates the uncertainty. The uncertainty can be reduced, but it is to a large extent irreducible. Thus, you’re stuck with this uncertainty and there is no obvious solution. There is no evident solution to be obtained.

**Conor Doherty**: Okay, well then circling back a little bit then to stochastic optimization. You talked about the ingredients. You listed the loss function. The loss function can’t be the only ingredient that goes into stochastic optimization. So what is the breadth of ingredients required for this?

**Joannes Vermorel**: When we approach mathematical optimization, there are three types of ingredients. The first are the variables. The variables are essentially what you can pick. This is what defines your solution. Your solution is a specific combination of variables.

If you want to think of a discrete problem such as finding the right combination for a padlock, you have let’s say four variables. Each variable has 10 positions and every combination defines a potential solution. You want to find the one solution that clicks and opens.

The first ingredients are the variables. In supply chain, we are frequently referring to discrete problems because the variables are integers. You can reorder zero units, one unit, two units, three units, etc., but you usually can’t reorder 0.5 units. The discrete aspect makes it more difficult because you can’t just easily go from one solution to another. You have a lot of severely nonlinear patterns that happen.

For example, going from zero to one unit is very different from moving by just a tiny fraction. But also, you can have things like MOQs, minimum order quantities, where you have to jump, let’s say, 100 units ahead to even get to a solution.

**Conor Doherty**: It’s a constraint.

**Joannes Vermorel**: And that brings me to the second thing, which are the constraints. So typically, you list the variables and you list the constraints. The constraints are a set of mathematical expressions over the variables that tell you whether this is an acceptable, a feasible solution.

So that would be, in a replenishment case, we can reorder how many units we want, but there is a finite capacity for the shelf. So the shelf can only have so many units. There might be a max capacity for the day for how many goods can be received, processed by the store, or ingested by the warehouse, or by any other location in your supply chain.

And you have tons of constraints like that. You can have a constraint that says I need to have at least this, this, and that so that I can present a nice looking shelf. That would be a merchandising constraint in a store, etc.

So we have the variables, we have the constraints, and the third one is the loss function. The loss function gives you, for any solution that satisfies all the constraints, here is your loss and you want to just minimize that. That’s just a convention.

And those three elements together define the general mathematical optimization framework. The reason why mathematicians during the last 100 years have used this framework is that it’s actually very general.

And here we are even looking at the twist of stochasticity, where we are adding a fairly unusual twist to the problem, which is to say the loss function is not deterministic. It is not a classic mathematical function where you give an input and then you have a guaranteed output. We say you give an input and then you have a randomized output.
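As a toy illustration of the three ingredients plus the stochastic twist, the sketch below declares an integer variable, a shelf-capacity constraint, and a loss evaluated by sampling. All numbers are illustrative, and brute force over the feasible quantities is only workable at this tiny scale:

```python
import random

# Ingredient 2, a constraint: the shelf holds at most 12 units.
SHELF_CAPACITY = 12

# Assumed demand distribution (illustrative).
def sample_demand(rng):
    return rng.choice([5, 8, 10, 12, 15])

# Ingredient 3, the loss function: deterministic given a demand...
def loss(qty, demand, carry=1.0, stockout=3.0, profit=5.0):
    sold = min(qty, demand)
    return carry * (qty - sold) + stockout * (demand - sold) - profit * sold

# ...but stochastic overall, since demand must be sampled. A fixed seed
# gives common random numbers across candidate decisions.
def expected_loss(qty, n=20_000, seed=7):
    rng = random.Random(seed)
    return sum(loss(qty, sample_demand(rng)) for _ in range(n)) / n

# Ingredient 1, the variable: an integer reorder quantity, here searched
# by brute force over the feasible set.
feasible = range(SHELF_CAPACITY + 1)
best_qty = min(feasible, key=expected_loss)
```

With these illustrative numbers the unconstrained optimum would sit above the shelf capacity, so the constraint binds and the search lands on the best feasible quantity instead.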

**Conor Doherty**: Just to come back to the idea of constraints, and correct me where I’m wrong, you could subdivide those constraints. So for example, you listed capacity, you know the capacity of your shelves, of your warehouse, how much can be processed in a day. Maybe that can be ramped up a little bit in terms of processing, but again, shelf capacity is finite. That’s not changing anytime soon. But MOQs could change. I mean, you could renegotiate. So there’s a bit of fluidity here to some of these constraints. Is that the kind of stochasticity that you’re talking about and that is factored into decisions?

**Joannes Vermorel**: Not really. The constraints are literally just a mathematical way to express the idea that some solutions are just not acceptable. Again, just like there is a real deep question, but that’s not a mathematical question, of the adequacy of your loss function with regard to your business strategy, the same goes for your constraints. So you would say, well, is this constraint really a constraint? Can I actually invest to lift this constraint or can I just think differently about the business?

Again, the idea is that you will have really two different perspectives. One is the modeling approach where you say you really want to have variables, loss functions, constraints that are faithful with regards to your business. And that’s fundamentally a non-mathematical, non-algorithmic undertaking. It’s really about understanding whether those things are true to the business.

But not true in the mathematical sense. A constraint is neither true nor false in the mathematical sense. It just is. It’s literally something where you say max capacity is 100 units. There is no mathematical validity in this statement. This is just a given. A mathematician can say, yeah, you picked 100. Mathematically speaking, I cannot tell you whether 100 is a good number. I can only tell you, for example, that if you tell me that there is a constraint that says it should be less than 100 units and then another constraint that says it should be more than 100 units strictly, then I can tell you that there is no solution.

Mathematics do not pass judgment on the sort of inputs that are given to them. It’s just about having internal consistency. But then, the interesting thing about the stochastic aspect in the stochastic world is that suddenly constraints become a lot more subtle in a mathematical sense.

So let’s see what it means to have a stochastic problem. In the old non-stochastic optimization perspective, we had a clear separation between whether a solution works or it doesn’t, due to constraints, not even taking the loss into account. But let’s see what that means in a world where our loss function is stochastic. Let’s say, for example, we have a warehouse, we can pass replenishment orders, and we have an inbound daily capacity for the warehouse.

So every single day, we have a limit on how many units we can process from the suppliers. And so what is usually done in this warehouse to take that into account is that we spread out the replenishment orders, taking into account the lead times of the suppliers, so that we don’t have all the suppliers delivering everything on the same day and going beyond the capacity, the daily capacity for the warehouse to receive the goods.

Looking at it from a stochastic optimization perspective, I choose my quantities and then I have variability in the time for the deliveries. This means that if I am extremely unlucky, I might end up with a decision that is perfectly fine. Everything has been spread out in terms of replenishment orders, but my very early orders arrive late and my later orders arrive ahead of schedule. By a random fluke, all of that ends up collapsing on the same day, and on this day, I overload my warehouse. I am beyond the nominal reception capacity.

There are plenty of situations where it is no longer possible to have a solution that is perfectly feasible. That means that you will live with situations where there is a probability that your constraints will not be enforced. This is just the way it is; this is a mathematical statement that I’m making, due to the nature of the loss functions and the fact that your decisions may have consequences that are themselves non-deterministic: you decide on the quantity, but then you don’t have complete control over the day of delivery.

Even if you’ve made the very best decisions, there is a possibility to have inventory quantities that collide. The only way to make absolutely sure that it would never happen would be to make sure that in your entire pipeline of pending orders, you never exceed what you could receive on any given day, taking into account the worst-case scenario where all your pending orders would be delivered on the same day. This is obviously extreme.

Businesses have to deal with this sort of situation where yes, there is one chance out of 10,000 that I will exceed my capacity. But in reality, the idea of a constraint being absolute is more like a mathematical idea. In practice, if you exceed your capacity, there will be a cost.

When we go into the stochastic optimization perspective, we see that fundamentally, constraints have to a large extent become part of the loss function, or have to be approached in a way that says: I am okay with a certain degree of tolerance, accepting that the constraints will be violated with a low probability. For most of the interesting situations in stochastic optimization problems facing supply chain, there will be a residual small probability that your constraints will be violated.
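The warehouse example lends itself to a minimal simulation. Under an assumed order schedule and an assumed lead-time distribution (all numbers below are illustrative), spreading orders out still leaves a residual probability that deliveries collide on the same day and exceed the inbound capacity:

```python
import random

# Illustrative setup: three orders of 80 units each, staggered three
# days apart so that, at nominal lead times, no two arrive together.
DAILY_CAPACITY = 100                    # units receivable per day
ORDERS = [(0, 80), (3, 80), (6, 80)]    # (order day, quantity)

def arrivals_by_day(rng):
    # Noisy lead time: nominally 4 days, but anywhere from 2 to 6.
    days = {}
    for order_day, qty in ORDERS:
        arrival = order_day + rng.choice([2, 3, 4, 5, 6])
        days[arrival] = days.get(arrival, 0) + qty
    return days

def violation_probability(n=50_000, seed=1):
    # Monte Carlo estimate of P(some day exceeds the inbound capacity).
    rng = random.Random(seed)
    hits = sum(
        any(q > DAILY_CAPACITY for q in arrivals_by_day(rng).values())
        for _ in range(n)
    )
    return hits / n

p = violation_probability()
```

With these numbers the violation probability comes out around 16%, even though the schedule is perfectly feasible at nominal lead times; a chance-constrained formulation either tolerates such a residual probability or prices the overflow directly into the loss function.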

**Conor Doherty**: When you talk about that tolerance, that’s the feasibility that you’re talking about, right? And by extension, that’s measured in?

**Joannes Vermorel**: That’s measured in whatever capacity you’ve set for your problems. If you say I want to replenish a store, you take your decision so that it will fit the capacity of the store. But what if the store, for some fluke, sells nothing on a given day? Let’s say you’re doing fresh food, you decide today to replenish milk, and you take this decision without having the sales of the day. But you still assume that some units will be sold. And if by a random fluke, on this store that normally on every single day sells 80% of its stock of fresh milk, sells nothing today, just pure fluke, then you might end up replenishing and going over your constraints.

The key insight is that as soon as you’re in a situation where there is uncertainty, not only does your loss function vary, but also the satisfiability of the constraints varies too. For most of the interesting situations, you end up with constraints that will not be perfectly satisfied. It would be a mistake to say I only want to go for situations where I’m guaranteed that my constraints will be satisfied. Why? Because mathematically you will get solutions, but they will be very poor. That will be solutions like don’t order anything, don’t do anything, just let it be. And that will give you satisfaction in the sense of no constraint violation, but it will certainly not give you profit.

**Conor Doherty**: I do want to return to the example that you gave about receiving orders, staggering orders, and trying to do it in the most economically viable way. That would be, I guess, the stochastic approach. How did previous models that were based on mathematical optimization, so lacking the stochastic dimension, approach the problem that you just described?

**Joannes Vermorel**: They just completely ignore the problem. It does not even exist in classical deterministic mathematical optimization. The varying consequences of your decisions are not even taken into account, they just don’t exist.

There are ways to mitigate the case. One simple way would be to say, “Well, I macro expand the definition of my problem by saying, instead of looking at one optimization, I say I’m going to jointly optimize, let’s say, 100 distinct scenarios. And I say that my decision has to be common for all possible futures, and I have to make sure that for all those varying scenarios, all my constraints are still satisfied.”

So how do you go back to a deterministic case? Well, you can just say, “I can copy my situation 100 times, representing 100 variants of the situation, that is, 100 trajectories, and then optimize the macro-expanded problem that has 100 instances at once.”
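The macro-expansion described here can be sketched as follows (illustrative numbers): sample the trajectories once up front, then treat the problem as deterministic, with one common decision scored jointly over every scenario. Brute force stands in for the classical solver; note that the problem data is now 100 times larger, which previews the scalability issue:

```python
import random

# Sample 100 demand trajectories once; after this step the problem is
# deterministic. Demand distribution is an illustrative assumption.
rng = random.Random(3)
SCENARIOS = [rng.choice([5, 8, 10, 12, 15]) for _ in range(100)]
SHELF_CAPACITY = 12

def loss(qty, demand, carry=1.0, stockout=3.0, profit=5.0):
    sold = min(qty, demand)
    return carry * (qty - sold) + stockout * (demand - sold) - profit * sold

def macro_loss(qty):
    # One common decision, jointly scored over all pre-sampled scenarios.
    return sum(loss(qty, d) for d in SCENARIOS) / len(SCENARIOS)

best_qty = min(range(SHELF_CAPACITY + 1), key=macro_loss)
```

The decision variable stays a single integer here, but every constraint and loss term is replicated per scenario, so a real solver would face a problem 100 times its original size.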

And I can do that with a classical solver, but then it only worsens a problem that already prevents today’s supply chain practitioners from actually using mathematical optimization tools in the first place. And this problem is scalability.

**Conor Doherty**: Okay, well I think it would be a good point to actually start applying that to a specific vertical so people can start to get a sort of three-dimensional understanding of how the theory interacts with real complexity, real constraints, real variables. So if you take, let’s say, an MRO company of typical size servicing a normal-sized fleet, let’s say 10 planes, each of which has, I don’t know, a quarter of a million parts, how would these three ingredients fit into a stochastic optimization for an MRO versus an old-school mathematical one that, according to you, doesn’t work?

**Joannes Vermorel**: Let’s see what sort of problems we have for MRO. We want to optimize inventory so that you will be able to perform your repairs. You have a component that comes in, it is unserviceable, you start your repair, you discover the parts that you need. So you have a bill of material, but the bill of material is probabilistic, so it’s uncertain. There is a stochasticity that happens here. You have the uncertainty of whether you will get components to repair; that’s fluctuating demand. But then, once you get the component, you will discover what you actually need to repair it.

The problem is that if one part is missing, you can’t repair the component. So you see, it’s not being able to have like 90% of the parts doesn’t solve the problem. You’re stuck. You need all the parts for the repairs or you won’t be able to repair the component at all.

At Lokad, we started years ago to do probabilistic forecasts for those situations. Probabilistic forecast is to anticipate with the proper probabilities the arrival of components to be repaired, anticipate the probability distributions of the parts that you will need. So that’s going to be this probabilistic bill of material. And now we have to decide what do we reorder, what are the parts that we want to have in stock and in which quantities. And then for those parts, there is also the uncertainty of lead times. And some of those parts are repairable.

For some of them, not only is there their own turnaround time, because you take a part out of your component, but this part itself is repairable. So you can have it repaired and put back, or, most likely, you will take the part out, have it repaired, but put another one into the component, because you don’t want to wait until the repaired part comes back.

But that means when you want to decide if you need more parts, you have to take into account the parts that will be coming back, which are already in the process. So it’s not just the parts that you have, it’s also the parts that are coming back. And then there are other factors such as scrap rates where you try to repair, but the repair may not work. So you thought that you had like 10 parts that were coming back, but you only get 8 because two were scrapped because repair was not possible.

That’s a part of the forecast, all the uncertainties. Now, the decisions you want to make ultimately are about solving your part purchasing problems. The question is, should I purchase more units of parts, taking into account all the parts that come back and everything else?

A part will have an economic value if it contributes to a repair. But just like the padlock that I mentioned earlier, you have this click effect where if you have all the parts, you can repair and all those parts have value. But if you’re missing parts, all of that is just dead weight. The parts that you have only serve if you have the complete combination. If you have the combination minus one, then you have delays for your clients.

In any case, the inventory only serves its purpose if you have everything. And if you don’t have everything, then the question becomes: you will discover at the last minute that something is missing, so how much time will it take for this something to become available if you pass an order very late for it?

If we are in the simple settings where all my SKUs are strictly independent, different clients, different everything, then for every stock position, I can compute an economic score and say according to all the probabilities, I can compute the expected dollars of return of having this unit in stock.

But for the MRO, I can’t have this approach because there are dependencies across part numbers. If I decide to purchase one unit, on its own it may have no value. But if I purchase another unit, then I can complete a repair and then both parts have a lot of value.

Until you have all the parts that you need for your probabilistic bill of material, the parts that you have are basically useless. Their economic value essentially exists when they are together; apart, they have no value. So, whatever you use to solve this stochastic optimization problem, you need to be able to investigate decisions where you’re not purchasing parts one by one, or considering the parts that you have in isolation, but together. The combinations of certain units to be acquired can have a widely different economic value compared to a standalone analysis where you just look at the parts one by one.
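The non-additive, padlock-like valuation can be made concrete with a tiny sketch (hypothetical part names and illustrative prices): each missing part looks worthless when valued on its own, yet buying them together completes a repair and is clearly profitable:

```python
# Hypothetical repair: it needs three parts, one is already on hand,
# and it only pays off when every required part is available.
REQUIRED = {"blade", "seal", "bearing"}   # hypothetical part names
ON_HAND = {"blade"}                       # already in stock
PART_COST = {"seal": 40, "bearing": 60}   # illustrative prices
REPAIR_REVENUE = 500

def bundle_value(purchase):
    # Revenue "clicks" in only if the purchase completes the required set.
    parts = ON_HAND | set(purchase)
    revenue = REPAIR_REVENUE if REQUIRED <= parts else 0
    return revenue - sum(PART_COST[p] for p in purchase)

# Valued one by one, neither missing part looks worth buying...
standalone = {p: bundle_value([p]) for p in ["seal", "bearing"]}
# ...but bought together, they complete the repair.
joint = bundle_value(["seal", "bearing"])
```

A per-part analysis rejects both purchases while a joint analysis approves the bundle; the economic value only appears when the combination is complete.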

**Conor Doherty**: I’m going to try and follow this thought and bear with me, but when you describe all of these stochastic elements, you’re saying the part, let’s say the turnaround time to get that part could be one day, could be half a day, could be three days, could be four days. There is another area of stochasticity which presumably is also people’s ability to actually do the repair, like how long it takes a person once they’ve received the part, which varies, to actually do the repair. Can you account for that level of stochasticity, like the human-based stochasticity as well?

**Joannes Vermorel**: Yes, humans are just one source of delay among others, and they can have varying abilities. For example, some operators are more talented than others, and so they might even need fewer parts. Someone might manage to do a repair consuming less stuff than a less talented employee who just throws things away when he or she does not succeed in doing the repair.

In aviation MRO, components are highly modular, so components are made of components, which are made of components. So when you don’t know how to repair something, there is always the option to just throw away the entire sub-module and put in a brand-new replacement, as opposed to identifying the one thing that is failing and changing only that.

If you’re very good in diagnosing what needs to be changed, you will change what needs to be changed. If you’re less good you might end up changing a lot more.

But back to the case, the trick here is that when I define the solution, we have to look at it from the perspective of a policy. So that means that your solution is not necessarily just the decision that you take right now but the general principle that guides your decisions. A policy might govern what parts you have in stock but you will take into account how you react when you discover your probabilistic bill of material.

Why does it matter? Let’s say, for example, you want to repair an aircraft engine. Some parts are at the exterior of the engine, and they will be the first to be diagnosed. As you receive the engine to do the repair, you will discover what you need for the exterior first, simply because as you dismount the engine, those are the first parts that you touch; an engine is just like a Matryoshka with plenty of layers that go all the way to the core.

If you discover a part that is on the exterior of the engine, then you will most likely have a lot of time to source this part, because you will first spend potentially many days dismounting the engine down to the core and then gradually remounting it from the core outward, and you will need the part that fits at the exterior of the engine at the very end of the process.

So this part, I don’t even need it in stock, because by the time I need it, I’m covered: I can reorder the part on day one, and on day 60, when I actually need it, it is readily available, because my supplier lead time was, let’s say, just 20 days.

When you want to look at your inventory optimization for parts, you need to take into account the policy: what parts do I need to have readily available, and what will my typical reaction be when I face the discovery of this probabilistic bill of materials?

If I assume a different policy, such as the person doing the dismounting of the aircraft engine having no information at all about the stock availability or unavailability of parts, then it’s a completely different story, because then I dismount the aircraft engine, remount the aircraft engine, and then, 60 days afterward, I still discover that I’m out of stock for this one part that is missing.

So you see, the policy expresses these sorts of sequential decision-making situations: what sort of decisions will take place, and how they will shape the final economic outcome of the situation as it unfolds.

We have two policies here. One is smart: as soon as I have the information, I react and place the purchase order. The other is: I wait until I have to mount the part, and only then do I realize that the part is needed and place the order. If you’re in the situation where the second policy is in place, then it puts a lot more economic value on having the parts in stock, because it’s the only way to make sure we don’t delay the aircraft engine further at the very end because this one part is missing.

If the first policy, the smart one, is in place, then there is no economic value in having those parts at the exterior of the engine in stock. Thanks to the policy, I will not be missing them, because the purchase orders will be placed early.

**Conor Doherty**: What kind of technological overheads will be associated with the kind of reactivity you’re describing? So, if I work my way into the engine and discover I need a part, that completely reconfigures the entire projected timeline of this repair.

**Joannes Vermorel**: That’s a very interesting question. Scalability has been a major concern. When I say scalability, I mean scalability of mathematical optimization techniques for supply chain, which has been a roadblock for essentially four decades.

Mathematical optimization is supposedly a super established field of research, and there are super established software players that sell what are known as solvers. Solvers are software designed to address mathematical optimization problems, and they typically come with their own programming language: mathematical programming languages that let you express your variables, your loss function, and your constraints.

The interesting thing is that although these solvers were introduced to the market four decades ago, and there are even open-source solvers nowadays, these solvers are nowhere to be seen in supply chain. I believe that scalability is a big part of why.

If we decompose the sorts of techniques that are available on the market, we have first the nicely behaved loss functions, the convex functions. Convex means that your function has a gentle curve, so when you pick a solution, you can gently roll to the bottom: you just follow the gradient and you will get there. These nicely behaved cases are things like linear functions and quadratic functions. Functions of this sort don’t have any scalability problems; we can have literally billions of variables. But the problems we face in supply chain are not nicely behaved like that.
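To make the “gently roll to the bottom” idea concrete, here is a minimal sketch (my illustration, not from the interview) of gradient descent on a convex quadratic loss. Because there is a single valley, simply following the gradient is guaranteed to reach the minimum:

```python
# Minimal illustration (not from the interview): gradient descent on the
# convex quadratic loss f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
# Convexity means there is a single valley, so stepping against the
# gradient "rolls" the solution down to the bottom.

def gradient_descent(grad, x0, step=0.1, iters=200):
    """Repeatedly step against the gradient until we settle at the minimum."""
    x = x0
    for _ in range(iters):
        x -= step * grad(x)
    return x

x_min = gradient_descent(lambda x: 2 * (x - 3), x0=10.0)
print(x_min)  # converges to (very nearly) 3.0
```

Each step is cheap and the geometry can never trap the search in a bad spot, which is why this class of problems scales to billions of variables.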

Then we have a second class of solvers, branch and bound, branch and cut, which essentially assume that the constraints predominate, that you have very few valid feasible solutions. You have so many constraints that you can eliminate an entire half-space of your solution space. Essentially, you can slice your solution set in half and say: this half can be thrown away, because I know these solutions will never satisfy the constraints that I have. You can literally throw away half of the solutions and repeat this halving process a large number of times. At the end, you end up with a very small space, and then you can investigate this small space quite extensively.

There are also a lot of techniques called relaxations of the problem: you look at the problem without the constraints, find the ideal solution without them, and then reapply the constraints. Again, if you don’t have super tight constraints, these techniques scale very poorly. Their scalability very much depends on how much leeway you still have once all the constraints are applied. And that’s the problem: in supply chain, the sorts of problems we consider have a lot of constraints, but they are not super tight.

Think of a padlock: you have 10,000 combinations, and only one clicks; all the others are just wrong. Well, in supply chain you have constraints, but those constraints are not super tight like that. For example, within the constraint of the shelf capacity, you still have a huge number of solutions: you can decide to put more of this product, or more of that product. When you look at it, it’s a very weak constraint. It’s not the sort of constraint that reduces your solution space to a few solutions; you still have an absolutely enormous number of solutions.

All those solvers built on this solution-space-elimination sort of technique, branch and cut, branch and bound, etc., typically perform extremely poorly beyond a thousand variables. Maybe, if you go crazy, 10,000 variables, but that’s already pushing it to the extreme limits. We are talking about very big machines with dozens of gigabytes of RAM, dozens of CPUs, and potentially hours to get a solution. So it’s going to be super slow, and for 10,000 variables you might say, oh, that’s already a lot. Not quite; it’s tiny.

Just consider that a mini market is going to carry 5,000 products. But it’s not 5,000 variables, because the question is really: do I bring zero units, one unit, two units, three units? So let’s say you stop at 10; 10 is going to be enough, but I’m already at 50,000 variables, and that’s per location, per mini market, and obviously you have many mini markets. So you see, even a toy problem like a mini market is already at 50,000 variables, and that’s way beyond what those solvers can do.

And then we have a third class of tools: local search. Local search is a class of techniques that say: let’s assume you can find a feasible solution. In the case of supply chain, that is a very reasonable assumption; finding a solution that does not violate any constraint is typically fairly easy. If your constraint is that you should not overflow the shelf, just order less: it’s not difficult, just decrement your units until you satisfy the constraint. If you have a minimal order quantity, then to satisfy the constraint you just add one unit of a product at a time until you reach the quantity.

So it’s not difficult to find a solution that satisfies the constraints. It’s not like a cryptographic puzzle where you need to get dozens of variables exactly right so that it fits. In supply chain, when I say the problems are easy, I mean that usually, by adjusting just one variable, you can get the solution to click: decrease a quantity until it fits on the shelf, decrease or increase a quantity until you reach the minimum order quantity, and so on. You have semi-trivial ways to get a solution that satisfies your constraints. But I’m not saying anything about the quality of that solution; I’m just saying that you will find one.

And essentially, local search says that once you have a solution that fits, you can randomly mutate this solution, and if the mutated solution violates one of the constraints, you get rid of it. If it still satisfies the constraints and your loss function tells you that this solution is better, then you jump to this better solution and keep iterating.

So local search means that you already have a solution that is legal, in a sense; you randomly modify it, and when, by luck, you get a solution that is better according to your loss function and that still satisfies your constraints, you jump to this new solution and repeat.
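The loop just described, start feasible, mutate randomly, keep only feasible improvements, can be sketched in a few lines. The toy setting here (five products, a shelf holding at most 15 units, a per-product target of 3) is my own assumption for illustration, not a problem from the interview:

```python
import random

# Hypothetical toy instance for the local-search loop described above:
# order quantities for 5 products, a shelf that holds at most 15 units,
# and a loss that penalizes deviating from a target of 3 units each.
random.seed(42)

def feasible(sol, capacity=15):
    """Constraint check: non-negative quantities that fit on the shelf."""
    return sum(sol) <= capacity and all(q >= 0 for q in sol)

def loss(sol, target=3):
    """Penalty for deviating from the ideal stock level per product."""
    return sum((q - target) ** 2 for q in sol)

def local_search(start, steps=5000):
    best = list(start)
    for _ in range(steps):
        candidate = list(best)
        i = random.randrange(len(candidate))
        candidate[i] += random.choice([-1, 1])   # random mutation
        # Discard mutants that violate a constraint; jump only on improvement.
        if feasible(candidate) and loss(candidate) < loss(best):
            best = candidate
    return best

start = [0, 0, 0, 0, 0]       # semi-trivially feasible starting point
best = local_search(start)
print(best)                   # climbs to [3, 3, 3, 3, 3], the zero-loss optimum
```

Meta-heuristics such as genetic algorithms or tabu search, mentioned below, are elaborations of this same accept-if-better loop.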

There are variants of this, typically called meta-heuristics: genetic algorithms, tabu search, and whatnot. All of them rest on the premise that you start with a solution and iterate on it with random mutations, which gives you more scalability. With these sorts of techniques, you can go to maybe a million variables. But it is still very sluggish.

And at Lokad, we have tried, and it still does not pass the test of scalability for supply chain. Of the classical approaches, it is the best, but it is still too weak to scale to the sort of problems where we very rapidly have millions of variables and want fast convergence.

And we also want to consider the stochastic aspect of the problem. Because, you see, when I was mentioning this mini-market problem where we had 50,000 variables, if we macro-expand with 100 trajectories, just as I described, to take into account the possible futures, then we are at 5 million variables. So it rapidly inflates, and again, it’s just not enough.

**Conor Doherty**: I want to append the original question a bit. If I could summarize up to this point: the problem with the older mathematical optimization was that it was deterministic. Things are known, there’s a right and a wrong; basically, you can make a right decision or a wrong decision.

Then I asked you about the complexity of MRO, and you gave a very clear insight into just how complex it is. So, what are the technological overheads for the stochastic optimization of this small glimpse of complexity that you gave? It’s obviously insane, but what I’m actually asking is not what the perfect way to implement this is, but what a better way to use stochastic optimization would be. It might not be perfect, but what’s a functional or feasible way to implement it that doesn’t violate everything you just said?

**Joannes Vermorel**: The problem with mathematical optimization is really about scalability. You can always get back to a deterministic problem by reexpressing a stochastic problem as a deterministic one. But we started with mathematical optimization techniques that already suffered from severe scalability problems.

Now we are going to inflate the problem to reexpress the stochastic problem as a deterministic one, which makes the scalability problem even worse. There is a trivial way to deal with stochastic optimization: just execute my varying loss function a million times and average out the outcomes. That would work, except that the computational overhead is just gigantic.
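This “trivial way” can be sketched as follows: average the varying loss over many sampled futures to obtain a deterministic surrogate, then optimize that. The newsvendor-style costs and the uniform demand below are my own toy assumptions, not figures from the interview:

```python
import random

random.seed(0)

def stochastic_loss(q, demand, holding=1.0, stockout=4.0):
    """Dollar cost of stocking q units once a particular demand materializes."""
    return holding * max(q - demand, 0) + stockout * max(demand - q, 0)

def expected_loss(q, scenarios):
    """Deterministic surrogate: average the loss over many sampled futures."""
    return sum(stochastic_loss(q, d) for d in scenarios) / len(scenarios)

# Sample the possible futures (toy assumption: uniform demand in 0..20).
scenarios = [random.randint(0, 20) for _ in range(10_000)]

# Brute-force the stock level against the averaged loss.
best_q = min(range(21), key=lambda q: expected_loss(q, scenarios))
print(best_q)  # lands near the 80% demand quantile, i.e. around 16
```

It works, but every candidate decision must be scored against thousands of scenarios, which multiplies the computational cost and is exactly the scalability wall being described.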

So, the point that I’m making is that these mathematical optimization tools have been available for decades, but they don’t scale and they don’t even handle stochasticity. Even before considering stochasticity, which results from probabilistic forecasts, they were already not scalable enough. If you pile up stochasticity, then we are orders of magnitude off. That’s why Lokad had to essentially redevelop a class of technology for stochastic optimization so that we can address these problems at a scale that makes sense for supply chain.

And if we go back to why we really want that, the answer is that when Lokad introduced probabilistic forecasts back in 2012, we very quickly realized that we had a big problem. Optimizing under uncertainty is very, very difficult.

For years, we crafted smart heuristics to kind of duct-tape the situation. You can get away with smart heuristics; a heuristic is just a clever trick that will somehow work in one very specific setting. So it’s cheating: you find a cheat that kind of works in a narrow situation. The issue is that those heuristics tend to be fragile.

And then, when you introduce cross-product constraints or cross-SKU constraints, or anything where you have interdependent things in your supply chain, it can be anything, those heuristics tend to fall apart. That’s why you need stochastic optimization.

If you don’t have it, what does that mean for businesses? It means you rely on typically very conservative human judgment. It kind of works, but the problem is that you tend to play super safe to satisfy your constraints. And in supply chain, all the problems can be solved by just throwing more money at them.

If I go back to my aviation MRO example, there is the obvious solution: just say the sky is the limit. I can have as many parts as I want, so I’m going to have a ton of inventory, and then I will have a good service level. If you just throw money at the problem, yes, you will kind of solve it, but this is not a tight resolution; it’s a very loose one.

Same thing for the shelf limit. You can decide to partition your store with lots of very narrow constraints: these two or three products should have no more than this much space, and this one no more than that much space. That limits the sort of variation in internal composition that your store may undergo.

If I say, all yogurts combined, there should be no more than 200 units, fine. But what if this constraint is wrong? What if there is a local surge in demand and this constraint on the total amount of yogurt is way too low? You end up at the end of every Saturday with not a single yogurt left in your mini market.

What happens when you do not have a stochastic optimizer, a stochastic solver, available is that businesses typically tend to add a lot of constraints to reduce the solution space. That way, the people, or potentially the software, who pick the solutions and make the decisions operate in a much narrower solution space. That alleviates all those cross-SKU dependencies, cross-SKU concerns.

But that’s cheating in a sense that there might be much better solutions that are out there and you’ve just eliminated those solutions by just piling up lots of fake constraints.

**Conor Doherty**: We’ve covered a lot there. So, in terms of trying to summarize or collapse a lot of this to make it more appreciable to people who necessarily might not have a mathematical training, stochastic optimization is a much more flexible and reactive way of optimizing decisions compared to the traditional mathematical optimization, right?

**Joannes Vermorel**: Yes, it is a more expressive way. Whatever you can express as a deterministic problem, you can also express as a stochastic problem. But stochastic is much more general, because deterministic just means that your loss function doesn’t vary at all. In a framework that allows a varying function, you can always pick a function that does not vary; it will still work. So the framework first defines what class of problems you can approach.

If your problems involve uncertainty, you need stochastic optimization; that is literally the class of problems your situation belongs to. And now, ideally, you want a software component to address that. Probabilistic forecasting is the tool that lets you generate the forecasts that assess the uncertainty.

When it comes to the decision-making itself, we need a component as well. The typical perspective in mathematical optimization is to have a solver: a generic piece of software that can take any problem statement, a deterministic loss function, variables, and constraints, and give you the solution, the combination of variables that minimizes the loss function. You can have the exact same thing, a solver, but a stochastic one, and it will give you as output the combination of variables that you seek.

And why do you want a solver? Well, the loss function that represents your dollars of gains and losses is subject to change. Maybe you want to re-tweak the function to adapt to your strategic vision. You don’t want to reimplement from scratch the full numerical recipe that gives you a solution.

You just want to say: here is a new loss function, just reapply the resolution to this updated loss function. That’s what a solver can do for you. It’s the packaged software component that takes the definition of a loss function, the definition of constraints, and the definition of variables, and gives you the solution.
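One hedged way to picture that packaged component: a single generic routine that accepts variables, constraints, and a loss function, so that re-tweaking the strategy means handing it a new loss function and re-running. The random-search engine and the toy problem below are my illustrations, not Lokad’s actual technology:

```python
import random

def solve(ranges, constraints, loss, trials=20_000, seed=1):
    """Generic 'solver' interface: variables, constraints, loss in; solution out."""
    rng = random.Random(seed)
    best, best_loss = None, float("inf")
    for _ in range(trials):
        candidate = [rng.randint(lo, hi) for lo, hi in ranges]
        if not all(check(candidate) for check in constraints):
            continue  # discard infeasible combinations
        value = loss(candidate)
        if value < best_loss:
            best, best_loss = candidate, value
    return best

ranges = [(0, 10), (0, 10)]                        # two order quantities
constraints = [lambda x: x[0] + x[1] <= 12]        # shelf-capacity-style limit
loss_a = lambda x: (x[0] - 4) ** 2 + (x[1] - 7) ** 2

print(solve(ranges, constraints, loss_a))          # finds [4, 7]

# Updating the strategic vision = handing the same solver a new loss function:
loss_b = lambda x: abs(x[0] - 9) + abs(x[1] - 9)
print(solve(ranges, constraints, loss_b))          # new solution, no reimplementation
```

The point of the interface is the last two lines: the problem definition changes, the resolution machinery does not.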

**Conor Doherty**: So the software tool you’re talking about, the solver that automatically regenerates the optimization or the solution, that then reduces the amount of work a traditional supply chain practitioner would have to do, right?

**Joannes Vermorel**: So, would this in practice entirely automate the decision-making process? Yes. The solver is what generates the final decisions proposed by Lokad to its clients. There are different ways to approach the optimization. You can use heuristics, some of which are very good; they work nicely in certain situations, so you don’t necessarily need the solver, you can just have a heuristic playing the role of the solver. But the bottom line is that the solver is the thing that, given the forecast, generates the solution.

**Conor Doherty**: For clarification, when you use ‘solver’, do you use that interchangeably with ’numerical recipe’ that generates the recommended decisions, for example, in an inventory replenishment?

**Joannes Vermorel**: When I use the term ’numerical recipe’, I’m usually referring to the entire chain of processing. It’s more like everything from the data preparation till the generation of the results. This numerical recipe is typically decomposed into a series of stages: preparation of the data, generation of the forecast, optimization, and then presentation of the results. Today, we are only discussing the computation of the final decision, which takes the forecast as input.

**Conor Doherty**: If the supply chain practitioner disagrees with the supply chain decision generated by the solver, what’s the recourse there? If it has taken into consideration millions of variables, operating at a scale that is way beyond human capacity, then how can you, as a supply chain practitioner, evaluate the rightness or wrongness of it?

**Joannes Vermorel**: A solver is very straightforward to challenge. You can just say, “Here is my better solution. Let’s challenge this solution against the loss function.” All it takes to prove that your solver is not good is to show a solution that is better than the one that has been found by the solver.

So, the solver gives you a combination, and once you’ve been given a solution, the loss function is there and the constraints are there. So here is the tentative solution; let’s first check that the constraints are satisfied, okay, check, they are. Now let’s apply the loss function, which gives me the loss in dollars. Okay, this is the loss, and this is the solution presented to me by the solver.

If I manually tweak this solution by cherry-picking my variables and end up with something that is better according to the loss function, then I have proven that I am able, as a human, to generate a solution superior to the solver’s. In that case, the solver is not very good. That can very much happen.

The problem with the off-the-shelf solvers you find on the market is not that they don’t find solutions; they do. They just find very poor solutions, solutions where supply chain practitioners could manually re-tweak the quantities to be purchased, still satisfy all the constraints, and, according to the loss function, do better. So challenging a solver is much easier than challenging a probabilistic forecasting model: all you have to do is exhibit a solution that happens to be better according to the loss function.
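The challenge procedure is mechanical enough to sketch: score both the solver’s proposal and the human re-tweak against the same openly available constraints and loss function. All the numbers below are illustrative assumptions of mine:

```python
def shelf_ok(sol, capacity=12):
    """The constraint: total units must fit on the shelf."""
    return sum(sol) <= capacity

def loss(sol, targets=(5, 4, 6)):
    """Dollar-denominated penalty for deviating from ideal stock levels."""
    return sum(abs(q - t) for q, t in zip(sol, targets))

solver_solution = [2, 2, 2]   # what a (poor) solver proposed
human_solution = [5, 3, 4]    # a practitioner's manual re-tweak

for name, sol in [("solver", solver_solution), ("human", human_solution)]:
    assert shelf_ok(sol), f"{name} solution violates the shelf constraint"
    print(name, loss(sol))    # prints: solver 9, then: human 3

# A lower loss under the same constraints proves the solver inadequate here.
```

No special access to the solver’s internals is needed; the loss function and constraints alone arbitrate the challenge.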

If you have special insights into the problem, you can maybe craft something manually that outperforms your solver. For most of the software, I would say all the commercially available solvers, in supply chain settings it is fairly straightforward to manually outperform them. They are really not that good when it comes to stochastic problems, and their scalability is terrible. So you will have to run a solver for, let’s say, 30 minutes and be done with it; and if the solution after 30 minutes is complete crap, then it is frequently not that difficult for a human to do better.

**Conor Doherty**: What would you say is the key takeaway then for people having listened to this?

**Joannes Vermorel**: The key takeaway is that stochastic optimization is a very important angle that is mostly absent from supply chain textbooks. Most authors don’t even acknowledge that the problem exists in the first place. The very large, very established players who are selling solvers are selling solvers for deterministic optimization problems. Those are good solvers, don’t get me wrong, but they are not solving the class of problems that we have in supply chain due to this uncertainty. They just ignore the uncertainty.

The upside of having such a solver is that it lets you improve your supply chain in ways that truly account for all the interdependencies that exist. Instead of looking at things in isolation, it looks at your supply chain as a system, where everything contributes to the system, and you need to look at those dependencies.

Those dependencies can take many forms. In aviation, it’s the list of parts that you need to do a repair. In fashion, it’s the fact that a store needs garments in all colors to be appealing, which is something that can’t be expressed at the product level. In a hypermarket, you need to think of the actual shopping list that people want: they don’t come to buy one item, they want a whole list of items. Maybe they want to cook recipes, so you need to have all the ingredients. For pretty much all non-trivial supply chains, you have interdependencies all over the place, and unless you have a stochastic solver or a stochastic resolution technique, you cannot even approach the problem in a satisfying way.

**Conor Doherty**: Well, Joannes, thank you very much for your time. I don’t have any further questions. And thank you all for watching. We’ll see you next time.