
The stock reward function quantifies the expected returns, both positive and negative, of holding a certain number of units in stock. Fundamentally, the stock reward function answers the question

However, while these probabilities give a detailed picture of the future, they don't tell us anything about decisions to be made as far as inventory is concerned. Inventory decisions cannot be based on demand probabilities alone; the financial risks should be factored in too.

For example, let’s consider two products having the same probabilities of demand. If the first product is long-lived while the second has a short shelf-life, then, from an inventory perspective, it makes sense to keep more units in stock for the long-lived product.

The stock reward is a mathematical function that computes the profitability of adding one more unit of stock for a given item, by taking into account a probabilistic forecast of the future demand and a few economic variables reflecting the expected profit when servicing the item, as well as the expected costs when the unit remains in stock due to lack of demand.

Lokad considers the stock reward function to be a key ingredient in inventory optimization. The solutions brought by the stock reward function are typically superior to those obtained through the naïve approaches that consist of targeting a specific service level or a fill rate. In reality, these latter approaches ignore all downside scenarios, that is, the costs associated with not selling the items in stock.

The economic angle should not be restricted to a naive profit-maximization analysis. In particular, the costs incurred from clients experiencing stock-outs should constitute an integral part of the analysis. However, the economic approach only provides the framework for balancing inventory costs against stock-out costs; finding the right balance itself tends to be completely business-specific.

Let's define three variables associated with a single SKU when considering a duration that is equal to the lead-time:

- $M$ is the profit reward for selling 1 unit
- $S$ is the stock-out penalty (negative) for not serving 1 unit
- $C$ is the carrying cost penalty (negative) for not selling 1 unit in stock

These variables are

- the method fails to reflect the inventory risks associated with a future demand that may not happen. Hence, while the method may deliver good service levels, it's creating dead stock.
- the method fails to reflect the costs incurred on the client-side due to stock-outs, and also fails to demonstrate the opportunity loss of not serving the clients.
- the method fails to reflect the importance of serving the given unit and generating a profit that actually sustains the inventory itself.

Based on these considerations, let's review two simple scenarios depending on whether the demand exceeds the stock or not. Let $k$ be the number of units in stock, and let $y$ be the number of units requested by clients.

If the stock covers the demand, that is, $k \geq y$, then the immediate reward associated with the stock is $yM+(k-y)C$. Indeed, $yM$ accounts for the $y$ units that are served with their associated rewards, while $(k-y)C$ accounts for the carrying costs of the $(k-y)$ units not sold at the end of the given period.

If the demand exceeds the stock, that is, $k < y$, then the immediate reward is instead written as $kM+(y-k)S$. In this case, the first $k$ units get properly serviced and account for $kM$ in rewards, but then $y-k$ units are missing and incur the stock-out penalty $(y-k)S$.
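The two cases above can be folded into a single function. The following is a minimal Python sketch (Python is used purely for illustration, it is not Envision); the function name `immediate_reward` and the numeric values are hypothetical, while `M`, `S` and `C` follow the definitions given earlier.

```python
def immediate_reward(k: int, y: int, M: float, S: float, C: float) -> float:
    """Reward for holding k units against a demand of y units over one period.

    M is the per-unit margin (positive); S and C are the per-unit stock-out
    and carrying penalties (both negative, as in the definitions above).
    """
    if k >= y:
        # Every requested unit is served; the k - y leftover units incur carrying costs.
        return y * M + (k - y) * C
    # Only k units are served; the y - k missing units incur the stock-out penalty.
    return k * M + (y - k) * S


# Hypothetical values: margin 4, stock-out penalty -2, carrying cost -1.
print(immediate_reward(k=10, y=7, M=4, S=-2, C=-1))  # leftover case
print(immediate_reward(k=5, y=7, M=4, S=-2, C=-1))   # stock-out case
```

Note that both branches agree when $k = y$, since neither leftover units nor missing units remain.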

We define the stock reward function as:

$$R(t,k)= \begin{cases} kM+(y_t-k)S, & \text{if $y_t \geq k$ (stock-out)} \\ y_tM+(k-y_t)C + \alpha R^*(t+1, k-y_t), & \text{if $y_t < k$ (leftover)} \end{cases}$$

where:

- $k$ is the number of units held in stock
- $y_t$ is the demand for the period $t$
- $M$, $S$ and $C$ are the economic variables introduced previously
- $\alpha$ is a discount factor which will be discussed below
- $R^*$ is identical to $R$ but with $S=0$, and will also be discussed below

At first glance, this formula may look a bit overwhelming, but it's actually a straightforward model of a single SKU with $k$ units in stock confronting a demand of $y_t$ units. In fact, except for the $\alpha R^*(t+1, k-y_t)$ component, this expression is just like the immediate reward that we have detailed in the previous section.

Then, in order to take all the subsequent time periods into account, there are two twists. First, we have a recursive call to the reward function itself, signifying that the reward also includes the rewards (or losses) for the next time period plus all the rewards (or losses) for all the time periods that follow. At first, it might look puzzling to have a function that "walks" indefinitely into the future, but it merely reflects the fact that unsold inventory is carried over from one time period to the next.

Second, we introduce $\alpha$ as a discount factor for future rewards. This approach is inspired by the discounted cash flow concept, which reflects the fact that a profit generated in a distant future has less value than a profit generated in the very near future. The same logic applies to costs as well: an immediate cost has more impact than a cost that is incurred in a distant future.

Finally, the recursion is performed using $R^*$, which ignores stock-out costs, instead of $R$. This reflects the fact that it is not the "responsibility" of current stock to prevent stock-outs for any other lead time period but the current one. By definition, the lead time represents the time duration to be covered by the current stock. For the next time period, there will be, by definition, another opportunity to buy more stock (we will see how the no-reorder case can be accommodated in the following section). Therefore, the responsibility of not hitting a stock-out for a time period that follows the next one falls on a later inventory decision.
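The recursion can be traced by hand on a small example. The sketch below is a Python illustration (not Envision, and not Lokad's implementation) that evaluates $R(t,k)$ for a demand sequence assumed to be known in advance. The function name `stock_reward`, the finite horizon after which the reward is taken to be zero, and the numeric values are all assumptions made for the example.

```python
from typing import Sequence


def stock_reward(k: int, demand: Sequence[int], M: float, S: float, C: float,
                 alpha: float, t: int = 0) -> float:
    """R(t, k) for a known demand sequence, following the piecewise definition."""
    if t >= len(demand):
        # Assumption of this sketch: nothing happens beyond the known horizon.
        return 0.0
    y = demand[t]
    if y >= k:
        # Stock-out (or exact depletion): no stock is carried over, recursion stops.
        return k * M + (y - k) * S
    # Leftover: k - y units move to the next period, whose reward is discounted
    # by alpha and evaluated with R*, i.e. the same function with S = 0.
    future = stock_reward(k - y, demand, M, 0.0, C, alpha, t + 1)
    return y * M + (k - y) * C + alpha * future


# Hypothetical scenario: 10 units in stock, demand of 3, then 4, then 10 units.
print(stock_reward(10, [3, 4, 10], M=4.0, S=-2.0, C=-1.0, alpha=0.9))
```

In this scenario the third period runs short by 7 units, yet no stock-out penalty is charged there, because the recursive calls are made with `S = 0`: that shortage is the responsibility of a later inventory decision.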

As we will see in the following section, $\hat{R}$ can be computed for practical purposes. As a matter of fact, Lokad provides a built-in function named `stockrwd` that implements this precise formula.

In practice, the only measurement available is $\hat{R}$, because $R$ cannot be computed as long as the future demand is not yet known. Thus, when using the stock reward function, we actually refer to its estimate $\hat{R}$ rather than to the "real" $R$ function. It should also be noted that the accuracy of the $\hat{R}$ estimate naturally depends on the accuracy of the underlying probabilistic forecasts. However, this discussion goes beyond the scope of the present document.
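To make the notion of an estimate more concrete, the Python sketch below assumes that $\hat{R}(k)$ is the expectation of $R$ under the probabilistic demand forecast (an assumption of this example, not a statement about how `stockrwd` is implemented) and restricts itself to a single period, that is, $\alpha = 0$. The function names, the demand probabilities and the economic values are hypothetical.

```python
from typing import List, Sequence


def expected_reward(k: int, pmf: Sequence[float], M: float, S: float, C: float) -> float:
    """Expected single-period reward of holding k units; pmf[y] = P(demand == y)."""
    total = 0.0
    for y, p in enumerate(pmf):
        if y >= k:
            total += p * (k * M + (y - k) * S)  # stock-out branch
        else:
            total += p * (y * M + (k - y) * C)  # leftover branch, with alpha = 0
    return total


def marginal_rewards(pmf: Sequence[float], M: float, S: float, C: float) -> List[float]:
    """Increments expected_reward(k) - expected_reward(k - 1) for k = 1..len(pmf) - 1."""
    return [expected_reward(k, pmf, M, S, C) - expected_reward(k - 1, pmf, M, S, C)
            for k in range(1, len(pmf))]


# Hypothetical forecast over demands 0..4 and hypothetical economic variables.
pmf = [0.1, 0.2, 0.4, 0.2, 0.1]
increments = marginal_rewards(pmf, M=4.0, S=-2.0, C=-1.0)
# The increments decrease with k here, so counting the positive ones gives the
# last unit that is still worth stocking under this single-period assumption.
best_k = sum(1 for inc in increments if inc > 0)
print(increments, best_k)
```

Under these assumptions, each increment equals $P(y \geq k)(M - S) + P(y < k)\,C$: the extra unit earns the margin and avoids a stock-out penalty when it is needed, and incurs the carrying cost otherwise.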

`stockrwd` function in Envision

`stockrwd` is a function of Lokad's Envision feature that implements the stock reward function (or rather its probabilistic estimation), given that a probabilistic forecast is readily available. In case we are interested in the reward increment for the k

    R = stockrwd(D, M, S, C, A)

The first argument `D` is expected to be a distribution. This distribution represents the probabilistic demand and is typically produced by the forecasting engine. As such, `D` is not only expected to be a distribution, but it is also expected to be

The last four arguments `M`, `S`, `C` and `A` reflect the economic variables defined at the beginning of this document. In practice, `S` and `C` are expected to be negative. The `A` value is also expected to lie in the interval $[0, 1)$.

The function returns `R`, a distribution which reflects $k \to R(k) - R(k-1)$. Beware, this distribution is not a random variable. Actually, the formal definition implies that it is not even a

`R` is truncated to match the support of the distribution `D`.

Let's review a typical definition of the economic variables:

    M = SellPrice - BuyPrice
    S = -0.5 * (SellPrice - BuyPrice) // 0.5 arbitrary
    C = -0.3 * BuyPrice * mean(LeadTime) / 365 // 0.3 arbitrary
    A = 1 - 0.2 * mean(LeadTime) / 365 // 0.2 arbitrary

We have:

- `M` is defined as the gross margin per unit.
- `S` is arbitrarily defined as 0.5 times the gross margin. Naturally, the impact may vary from one industry vertical to the next depending on client tolerance for stock-outs.
- `C` is expressed as annual carrying costs accounting for 30% of the initial purchase price per year. The value of `C` is scaled to reflect periods of `mean(LeadTime)` days instead of years.
- `A` is expressed as a 20% annual discount on future rewards. Likewise, the value is scaled to fit the lead time through `mean(LeadTime) / 365`.
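As a worked illustration of the orders of magnitude involved, the Python sketch below mirrors the four definitions above for a single hypothetical item. The function name, the prices and the 73-day mean lead time are assumptions chosen to keep the arithmetic simple; the coefficients 0.5, 0.3 and 0.2 are the arbitrary ones used in the text.

```python
def economic_variables(sell_price: float, buy_price: float, mean_lead_time_days: float):
    """Mirror the illustrative definitions above; 0.5, 0.3 and 0.2 are arbitrary."""
    year_fraction = mean_lead_time_days / 365
    M = sell_price - buy_price            # gross margin per unit
    S = -0.5 * (sell_price - buy_price)   # stock-out penalty
    C = -0.3 * buy_price * year_fraction  # carrying cost over one lead time
    A = 1 - 0.2 * year_fraction           # discount factor over one lead time
    return M, S, C, A


# Hypothetical item: sold at 10, bought at 6, mean lead time of 73 days (a fifth of a year).
M, S, C, A = economic_variables(10.0, 6.0, 73.0)
print(M, S, C, A)
```

For this item, the margin is 4, the stock-out penalty is -2, the carrying cost is about -0.36 per unit per lead time, and the discount factor is about 0.96. The carrying cost is small compared with the margin because the lead time is short relative to a year.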

In practice, the probabilistic lead times are also expected to be forecast by the forecasting engine. Consequently, in the illustrative example, we assume that `LeadTime` is a distribution.

`BackorderQty` represents the quantities backordered for each item, possibly zero if there are no backorders. Then, the stock reward calculation can be adjusted as follows:

    R = stockrwd(D +* dirac(BackorderQty), M, S, C, A)

The `+*` operator is the additive convolution and `dirac(BackorderQty)` is the Dirac distribution at `BackorderQty`. The convolution shifts the demand distribution to the right by `BackorderQty` units, which represents a

The `stockrwd()` function also provides a grid-flavored syntax with:

    Grid.Reward = stockrwd(Id, Grid.Probability, Grid.Min, Grid.Max, M, S, C, A)

where the first four arguments `Id`, `Grid.Probability`, `Grid.Min` and `Grid.Max` merely represent the probabilistic forecasts produced by Lokad's forecasting engine. The other arguments remain as described in the previous section. However, whenever possible, it is suggested to use the distribution-flavored `stockrwd()` syntax as described in the previous section.

`stockrwd`, the Envision function provided by Lokad.

In this context, it's reasonable to have:

- M = 0: unless the parts are serviced for a price, there is no differentiated upside in servicing the part.
- S = *constant*: since all NO-GO parts are equally capable of grounding an aircraft, the stock-out penalty is uniform.
- C = *constant* (annualized): since most parts are long-lived, it's acceptable, as a first approach, to approximate the annual carrying costs as a constant.

In this context, we would have:

- M, the gross margin
- S, a fraction of the gross margin
- C = 0, as nothing is carried on from one period to the next
- A = 0, idem, no reward can be gained from future periods