This document contains reference documentation and guidance for the
Forecasting API v3.1 of Lokad. You can programmatically send time-series data to Lokad and get forecasts back.
Compatibility with previous versions
The Forecasting API v3.1 is fully backward compatible with v3.0. Version v3.1 introduces the notion of
quantile forecasts, which did not exist in v3.0.
REST
The Forecasting API v3.1 comes with a REST endpoint at
http://api.lokad.com/rest/forecasting3/. Basically, it's a very
web-oriented API that leverages most of the common building blocks typically available through the ubiquitous HTTP protocol.
Authentication
Forecasting API v3 relies on API keys that can be obtained through any
Lokad account. Just log in to your account, go to
Users and generate a new key.
API concepts
The API v3 introduces the notion of
datasets. A single Lokad account can hold many datasets, each one of them identified by a unique name (within the account).
Each dataset contains several
rich time-series, each one of them uniquely identified by its name (two series may have the same name if they belong to different datasets). Then, each time-series can contain multiple time-values, tags and events.

The dataset also defines the
forecasting settings which include:
- the Period defining the level of aggregation of the forecasts, such as day, week, month, ...
- the Horizon defining the number of periods being forecasted ahead.
The Period is only relevant to the classic forecasts, not the historical data. If you want 4-months ahead monthly forecasts, you can upload either daily or monthly aggregated data. You do not need to tell Lokad that your historical data is aggregated daily; the forecasting engine will figure that out on its own.
Dataset settings are
immutable: once defined, they cannot be altered. It is possible, however, to delete the dataset and recreate it with different settings.
Datasets represent
data containers but with a forecasting angle. We suggest keeping all related data within the same container, as Lokad will be able to exploit the statistical correlations that exist in your data more intensely if the correlated data are gathered in the same dataset in the first place.
For example, an eRetailer who wants to forecast both sales (to optimize inventory) and call volumes (to optimize staffing) should create a first dataset for the sales (with
weekly forecasts) and a second dataset for the call volumes (with
quarter-hourly forecasts).
Datasets are
persistent. In particular, they can be
incrementally updated, sending only the most recent data to be merged with the older data (which has been already uploaded). Such incremental upload saves you the cost of re-uploading the entire history each time the forecasts need to be refreshed.
In the case of quantile forecasts, the Period and Horizon values of the dataset are ignored.
Call flow overview
The most typical usage of Forecasting API v3 is the
forecasting pipe:
- insert a dataset (if no dataset is already available)
- insert or update time-series in this dataset.
- trigger the forecast flow (or the quantile flow).
- check the forecast status until the forecasts are ready.
- download forecasts associated to the dataset (or the quantile forecasts).
Once the
forecast flow (resp.
quantile flow) has been triggered, the dataset cannot be altered any more until the forecasts are computed. Lokad guarantees that the results are delivered in
less than 2h no matter the size of the dataset. In practice though, if your dataset is small, computation is much faster (only a few minutes).
The API also offers the possibility to:
- list the datasets (or delete them).
- list the time-series of a dataset (or delete them).
The
listing features are provided for debugging purposes rather than for production. We suggest not relying on those features in your production logic and, instead, keeping a client-side record of the
state of your datasets.
Continuation tokens
The two methods
list datasets and
list time-series let you enumerate all the data stored in your Lokad account. Since the amount of data can be arbitrarily large, a single call might not return all the data contained in your account.
Instead, the method returns a
continuation token. This token should be treated as a black box (not to be parsed or tampered with) and returned to Lokad in the next method call to retrieve the next batch of items. As long as a continuation token is returned, there are items that remain to be enumerated.
Scalability notes: iterating on datasets or time-series is a strictly sequential process (hence not
exactly scalable). Yet, as outlined above, retrieving historical data from Lokad is provided to facilitate debugging; those operations are not expected to be performed for production purposes.
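The enumeration loop described above can be sketched as follows. This is only an illustration: `iterate_all`, the `FetchPage` signature and `fake_fetch` are hypothetical names, and the real HTTP call (plus the parsing of the XML response) is left to the function you pass in.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical signature: receives the previous continuation token (None on the
# first call) and returns (items, next_token); next_token is None when done.
FetchPage = Callable[[Optional[str]], Tuple[List[str], Optional[str]]]


def iterate_all(fetch_page: FetchPage) -> Iterator[str]:
    """Yield every item, following continuation tokens until none is returned."""
    token: Optional[str] = None
    while True:
        items, token = fetch_page(token)
        yield from items
        if not token:  # no token returned: the enumeration is complete
            break


# Stand-in for the real HTTP call, used only to demonstrate the loop.
_PAGES = {None: (["a", "b"], "t1"), "t1": (["c"], None)}


def fake_fetch(token: Optional[str]) -> Tuple[List[str], Optional[str]]:
    return _PAGES[token]


print(list(iterate_all(fake_fetch)))
```

The token is passed back untouched, which matches the black-box treatment recommended above.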
Retry policies
When interacting with a remote API such as the one offered by Lokad, we advise setting up
retry policies on the client side. In short, network connections are rather reliable but
not 100% reliable: a single failed request is not supposed to cause the client app to implode.
In our own experience, observing a random network error every 100,000 web requests is nothing out of the ordinary. Without retry policies, failures within processes handling large datasets (requiring thousands of web requests) are going to become painful.
Most of the time, when a web request fails, it's just a transient error. Just wait a few hundred milliseconds and try again.
Obviously, retry policies come with a potential issue: when a request fails, we can't be sure whether the call has been processed or not, as the failure might have occurred either while sending the request or while retrieving the response.
Thus,
all methods exposed by the Forecasting API v3 are idempotent. Calling the same method twice (or more) with the same arguments has the same effect as the single initial call (because we have designed the API that way). Hence, the
API is retry-policy friendly.
Setting up your own home-made retry policies is rather straightforward. We strongly
advise setting up retry policies in your client logic, as it will save you a lot of trouble later on.
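As an illustration, a home-made retry policy could look like the sketch below. The helper name `with_retries`, the number of attempts, the delays and the choice of retrying on `OSError` are all assumptions made for this example, not values prescribed by the API.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(call: Callable[[], T], attempts: int = 5,
                 base_delay: float = 0.3,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`, retrying on OSError with a growing delay between attempts."""
    for attempt in range(attempts):
        try:
            return call()
        except OSError:  # network-level failures (urllib's URLError is an OSError)
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            sleep(base_delay * (2 ** attempt))  # 0.3s, 0.6s, 1.2s, ...
    raise ValueError("attempts must be at least 1")
```

Because the API methods are idempotent, wrapping a call this way is safe even when the failed request had actually been processed. In a real client you would probably avoid retrying on errors that are not transient, such as an authentication failure.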
Dataset async deletion
Deleting a dataset also deletes all the series contained in the dataset. The deletion of datasets presents one subtlety: it follows an
asynchronous pattern. Indeed, it would have been hard for Lokad to support atomic deletion of arbitrarily large amounts of data. Hence, the deletion of a dataset is not immediate after the call to
delete dataset. Instead, the dataset enters a state of
deletion pending, and the deletion typically occurs within a few minutes after the initial call.
Consequently, when designing your logic against the Forecasting API v3, you should keep in mind that you won't be able to immediately recreate a dataset with the same name as the dataset you've just deleted.
Dates and value aggregation in time-series
The API enforces no specific constraints on the
level of aggregation of the input time-series. For example, you can aggregate sales data at a daily level (one time-value per day) within a dataset associated with a
weekly period. If multiple time-values end up within a single period, then those values are combined with a
sum.
Best practice: For weekly and monthly datasets, we suggest aggregating the data at the
daily level. Indeed, providing the API with daily data enables Lokad to leverage subtle patterns that would otherwise be hidden by a higher level of aggregation (e.g. the Mother's Day effect).
That said, the simplest (and still reasonable) approach consists of aggregating the input data at the same level as the desired forecasts. For example: weekly aggregated time-series in order to obtain weekly forecasts. In this situation, the date of the time-value holding the total for the period can be positioned anywhere within the period (but we suggest always sticking to the same convention).
Last period gotcha: When aggregating data at the daily level for weekly or monthly datasets, it's important to
prune the time-series so that it does not include a partial period at the end of the series. Indeed, this
partial period would be interpreted as holding the entire total for the period, which would mislead the forecasting logic. Instead, time-values that belong to the incomplete last period should not be pushed to the API until the period is complete.
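For a weekly dataset fed with daily data, the pruning step could be sketched as follows. The function name is hypothetical, and the sketch assumes that weeks start on Mondays and that `today` falls in the week still in progress.

```python
from datetime import date, timedelta
from typing import List, Tuple


def prune_partial_last_week(values: List[Tuple[date, float]],
                            today: date) -> List[Tuple[date, float]]:
    """Keep only daily time-values from weeks fully elapsed before `today`."""
    # Monday of the week containing `today`; that week is still incomplete.
    current_week_start = today - timedelta(days=today.weekday())
    return [(d, v) for d, v in values if d < current_week_start]


daily = [(date(2010, 5, 7), 3.0), (date(2010, 5, 9), 1.0),
         (date(2010, 5, 10), 4.0), (date(2010, 5, 11), 2.0)]
# On Wednesday 2010-05-12, the values of May 10 and May 11 are held back.
print(prune_partial_last_week(daily, today=date(2010, 5, 12)))
```

The values held back are simply pushed later, once the week is complete.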
For quantile forecasts, it is strongly advised to aggregate data at the daily level.
Numbers and dates format
Numbers use the dot (.) as decimal separator. The API handles
double-precision numbers both for inputs and outputs. In particular, the forecasts are not rounded. Hence, you might get numbers returned with quite a few decimals, most of them being non-significant. We believe it's not the responsibility of the forecasting engine to round the values. Instead, we suggest that the
client app apply some rounding before displaying the forecasts to the user.
The date format consumed or returned by the API is
2010-12-28T06:00:00.0000000Z. The UTC time zone ('Z') is present for clarity and for compatibility with most default date-time formatting. Times where no time zone is specified are assumed to be UTC. Local time zones (e.g. '+02:00' instead of 'Z') are not supported.
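A small helper producing this format from Python could look like the sketch below. The function name is hypothetical. Since local time zones are not supported by the API, the helper converts time-zone-aware values to UTC before formatting and treats naive values as already being UTC.

```python
from datetime import datetime, timedelta, timezone


def to_api_timestamp(dt: datetime) -> str:
    """Format a datetime as e.g. 2010-12-28T06:00:00.0000000Z (7 fractional digits)."""
    if dt.tzinfo is not None:
        # Convert aware datetimes to UTC; naive ones are assumed to be UTC already.
        dt = dt.astimezone(timezone.utc)
    # Python keeps microseconds (6 digits); pad with one zero to reach 7 digits.
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond:06d}0Z"


# Example: a local +02:00 time is converted to UTC before formatting.
local = datetime(2010, 12, 28, 8, 0, tzinfo=timezone(timedelta(hours=2)))
print(to_api_timestamp(local))  # 2010-12-28T06:00:00.0000000Z
```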
Incremental time-serie updates
The method
upsert time-series (i.e. insert or update) provides an optional
merge behavior. This behavior is provided to support incremental updates of data already uploaded to Lokad. Let us assume that we have an
old time-series already stored in the Lokad account, and that we are now pushing a new time-series:

If the update is done with the option
merge=true, then, instead of simply overwriting the existing time-series with new value (plain overwrite is the default behavior), the resulting time-series is obtained as a merge of the old time-series and the new time-series.
In short,
all time-values located after (inclusive) the first time-value of the fragment are overwritten. Hence, if a single time-value located before the start of the time-series is merged it would overwrite the time-series entirely, effectively removing several time-values to replace them by a single earlier time-value.

It should be noted that all sections that overlap between the old and new time-series get overwritten by the data of the new time-series (most recent data take precedence).
Here the merge behavior is illustrated with time-values, but the Forecasting API v3 behaves identically for events, which can be incrementally updated too.
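To make the rule concrete, here is a small client-side simulation of the merge semantics as described above. It is a sketch for reasoning about the outcome of an upload, not Lokad's implementation; `merge_series` and the `TimeValue` alias are hypothetical names.

```python
from datetime import date
from typing import List, Tuple

TimeValue = Tuple[date, float]


def merge_series(old: List[TimeValue], new: List[TimeValue]) -> List[TimeValue]:
    """Local simulation of merge=true as described above (not Lokad's code)."""
    if not new:
        return sorted(old)
    first_new = min(d for d, _ in new)
    # Old time-values at or after the first new time-value are discarded.
    kept = [(d, v) for d, v in old if d < first_new]
    return sorted(kept + new)


old = [(date(2010, 5, 1), 42.0), (date(2010, 5, 8), 17.0), (date(2010, 5, 15), 5.0)]
new = [(date(2010, 5, 8), 20.0), (date(2010, 5, 22), 9.0)]
print(merge_series(old, new))
```

In this example the old value of 2010-05-15 disappears, since it is located after the first time-value of the new fragment. Merging a single value dated before 2010-05-01 would replace the whole series.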
Zero values and missing values in time-series
By default, the API treats a missing time-series value as a
zero. This heuristic significantly reduces the data transfer when uploading sparse time-series (i.e. time-series with a lot of zeroes).
Nevertheless there are two exceptions to this heuristic:
- The starting date of the time-series should not be omitted. If needed, a zero time-value can be inserted to represent the actual starting date of the time-series. Indeed, if we observe a product that has been sold only once last month, the forecasts for next month won't be the same if the product has already been for sale over the last 12 months, or if the product was just newly put on the shelves.
- The ending date of the time-series should not be omitted. Indeed, the API has data-centric semantics: forecasts start where the data ends. As a result, if the last period is associated with a zero value, the time-value must still be inserted to make sure the forecasts start from the proper date.
Then, for
truly missing values (e.g. data loss caused by database corruption), we suggest flagging the loss with a proper
event (as defined by this Forecasting API).
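The zero-handling rules above can be applied before an upload with a helper along these lines. The function name is hypothetical, and the sketch assumes the input list already spans the true start and end of the series.

```python
from datetime import date
from typing import List, Tuple


def drop_redundant_zeros(values: List[Tuple[date, float]]) -> List[Tuple[date, float]]:
    """Remove zero time-values except the first and last entries of the series."""
    ordered = sorted(values)
    last = len(ordered) - 1
    # Zeros in the middle are implied by the API; the two boundaries are kept.
    return [tv for i, tv in enumerate(ordered)
            if tv[1] != 0 or i == 0 or i == last]


series = [(date(2010, 5, 1), 0.0), (date(2010, 5, 2), 0.0), (date(2010, 5, 3), 5.0),
          (date(2010, 5, 4), 0.0), (date(2010, 5, 5), 0.0)]
print(drop_redundant_zeros(series))
```

Only three of the five time-values remain: the starting zero, the sale of 2010-05-03 and the ending zero.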
Object naming rules
Whenever a
string token is used in the API (dataset names, series names, tags, ...), the following restrictions apply:
- Length should be between 1 and 32 characters.
- Characters should be strictly alphanumerical [a-zA-Z0-9].
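The two restrictions above can be checked on the client side before sending a request, for instance with the sketch below (the function name is hypothetical). It relies on `fullmatch` so that a trailing newline is rejected too.

```python
import re

# Pattern built from the two rules above: 1 to 32 alphanumerical characters.
NAME_PATTERN = re.compile(r"[a-zA-Z0-9]{1,32}")


def is_valid_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rules."""
    return NAME_PATTERN.fullmatch(name) is not None


print(is_valid_name("candysales"), is_valid_name("candy-sales"))
```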
The equivalent
REGEX pattern is:
^[a-zA-Z0-9]{1,32}$
Capacity limitations
There is no limit on the number of datasets in a Lokad account. However, unless you're acting on behalf of plenty of clients, there are usually not many compelling reasons to have more than a handful of datasets in your account. There is no limit on the number of time-series in each dataset.
Then, all arrays processed by the API (within the XML messages being sent to the API) are limited to
100 items (for ex: 100 tags, 100 events, ...); except for the time-series that may hold up to 65,536 time-values. Any web request beyond
4MB will also be rejected.
All compound API operations (within the XML messages being sent to the API), i.e. update-many or read-many operations, are limited to
100 items as well (for example
update 100 series or
get 100 forecasts).
Those rules imply (among other examples):
- only 100 time-series can be updated at a time.
- only 100 forecasts can be downloaded at a time.
- a single time-series may not hold more than 100 tags or more than 100 events.
- a single event may not hold more than 100 tags.
- ...
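In practice, these limits mean that a client holding more than 100 series has to split its work into batches. A minimal sketch (the names `batches` and `MAX_ITEMS` are made up for the example):

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")
MAX_ITEMS = 100  # limit on compound operations stated above


def batches(items: Sequence[T], size: int = MAX_ITEMS) -> Iterator[List[T]]:
    """Split `items` into consecutive chunks of at most `size` elements."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


names = [f"s{i}" for i in range(250)]
print([len(chunk) for chunk in batches(names)])  # [100, 100, 50]
```

Note that the 4MB limit on web requests applies independently, so batches of very long series may need to be smaller.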
Forecasting periods
The API supports the following periods (periods are part of the dataset settings):
- quarterhour
- halfhour
- hour
- day
- week
- month
For best forecasting accuracy, for
day,
week and
month periods, we suggest uploading
daily pre-aggregated data. For
quarterhour,
halfhour,
hour, we suggest uploading pre-aggregated data that matches the forecasting period.
Quantile forecasts do not handle intra-day data. In the context of quantiles, there is no notion of periods per se. Instead, each time-serie is associated with a horizon (lambda) expressed in days.
Data centric forecasts
The API adopts a
data centric approach to forecasting: the forecasts start where the input time-series data end. For example, for a monthly forecast, if the last time-value is located at
2010-12-01, then the first forecast will be at
2011-01-01.
The API adopts the following conventions:
- Months always start the 1st.
- Weeks always start on Mondays.
In practice, you can aggregate the data from Tuesday to Tuesday, and represent the resulting time-series as a list of time-values always starting on Mondays (or any other day of the week for that matter). We have found that it's simpler to stick to this behavior rather than making the Forecasting API more complex to internally support many data aggregation scenarios.
The API does not enforce all time-series to end at the same date, yet, we believe that it's best to keep a dataset consistent with
all time-series ending at the same date. Adopting this pattern will improve the time-series correlation effort of Lokad, and consequently the accuracy of the forecasts.
In the case of quantile forecasts, the forecast starts the day after the final data point of the time-serie, following a behavior identical to a classic daily forecast.
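The data-centric convention can be illustrated with a small helper that computes where the first forecast lands. The function name is hypothetical and the sketch only covers the two cases spelled out in this section, `month` and `day`.

```python
from datetime import date, timedelta


def first_forecast_date(last_value: date, period: str) -> date:
    """Date of the first forecast, given the date of the last time-value."""
    if period == "day":
        return last_value + timedelta(days=1)
    if period == "month":
        # Months always start on the 1st: move to the 1st of the next month.
        if last_value.month == 12:
            return date(last_value.year + 1, 1, 1)
        return date(last_value.year, last_value.month + 1, 1)
    raise ValueError(f"period not covered by this sketch: {period}")


print(first_forecast_date(date(2010, 12, 1), "month"))  # 2011-01-01
```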
Error codes
Every single method of the API may return an error code (always enclosed in a
<ErrorCode/> element) or respond with HTTP error status codes. The existing error codes are:
- AuthenticationFailed or HTTP 401: The authentication failed. This can happen either because the API key is incorrect, or because the account has been locked for lack of payment.
- OutOfRangeInput or HTTP 400: The input is malformed XML, or it is not compliant with the naming rules (e.g. invalid characters in dataset names) or the capacity rules (e.g. too many series updated at once).
- DatasetNotFound or HTTP 404: No such dataset is available in your account. You need to insert the dataset first, and/or correct your logic.
- InvalidDatasetState or HTTP 412: The dataset is not in an appropriate state to be the target of the specified operation. For example, no data can be uploaded while the forecast computation is in progress, and the forecasts cannot be retrieved until they are ready. Also, if a dataset is being deleted, you should wait until completion.
- ServiceFailure or HTTP 500: Transient failure on the Lokad side. We suggest delaying your call by 15min (for example), and then trying again.
Compression
All API calls can optionally be compressed. To make it feasible to upload large data sets, the request body can be compressed as well. Only the GZip compression algorithm is supported (not deflate).
- If the request is compressed, the HTTP request header Content-Encoding: gzip must be set.
- If the response is compressed, the HTTP response header Content-Encoding: gzip will be set.
- The API may send a compressed response only if the request was compressed as well, or if the HTTP request header Accept-Encoding: gzip is set.
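On the client side, preparing a compressed request could look like the following sketch. The function name is hypothetical, and encoding the XML body as UTF-8 is an assumption of this example.

```python
import gzip
from typing import Dict, Tuple


def compress_request(xml_body: str) -> Tuple[bytes, Dict[str, str]]:
    """Return the gzip-compressed body and the matching HTTP headers."""
    body = gzip.compress(xml_body.encode("utf-8"))  # UTF-8 is an assumption
    headers = {
        "Content-Encoding": "gzip",  # the request body is compressed
        "Accept-Encoding": "gzip",   # a compressed response is acceptable
    }
    return body, headers


body, headers = compress_request("<Dataset><Name>candysales</Name></Dataset>")
print(len(body), headers)
```

When the response carries Content-Encoding: gzip, its body can be decoded with `gzip.decompress`.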
Service fees
The Forecasting API is part of the
Partner Plan. Please
email us for more information.
REST API methods
The REST Forecasting API 3 follows canonical REST patterns. The only supported output format at this point is XML, and we rely on Basic Authentication. This section lists the methods available for the REST API.
Basic Authentication
All REST methods rely on the same
Basic Authentication pattern.
- The user name must be auth-with-key@lokad.com
- The password must be an API key, as retrieved from your Lokad account.
As far as we know, nearly all decent development environments support Basic Authentication.
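If your environment does not handle Basic Authentication for you, the header can be built by hand as in this sketch. The function name is hypothetical and "secret" is a placeholder, not a real API key.

```python
import base64


def basic_auth_header(api_key: str) -> str:
    """Build the value of the HTTP Authorization header for a Lokad API key."""
    credentials = f"auth-with-key@lokad.com:{api_key}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# The returned value goes into the "Authorization" request header.
print(basic_auth_header("secret"))
```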
Debugging with your web browser
One of the advantages of REST is that it allows you to start experimenting directly with your web browser, at least for all GET requests. For example, just
try and see for yourself (you will need to log in). For other request verbs (PUT and DELETE), you can use simple extensions such as the excellent
Poster add-on for Firefox.
PUT /datasets
Insert a new dataset in your account.
URL: http://api.lokad.com/rest/forecasting3/datasets
HTTP method: PUT
Request:
<Dataset>
<Name>candysales</Name>
<Period>week</Period>
<Horizon>7</Horizon>
<Threshold>2012-05-01</Threshold>
</Dataset>
Response:
<ErrorCode/>
Remarks:
If a dataset with the same name already exists in your account, then the call is simply ignored.
Each period comes with a maximal horizon value:
- quarterhour: max horizon is 10000.
- halfhour: max horizon is 10000.
- hour: max horizon is 10000.
- day: max horizon is 400.
- week: max horizon is 100.
- month: max horizon is 100.
In the case of quantile forecasts, the values chosen for
Period and
Horizon have no impact on the actual results. However, for full compatibility with v3.0, the v3.1 of the API maintains
Period and
Horizon as required tags (they cannot be omitted). In practice, if you are not using classic forecasts at all, we suggest to use
Period=day and
Horizon=1, as placeholder values.
The remaining optional
Threshold parameter can be used in classic forecasts to truncate all series to strictly before that date. If set,
Threshold represents the first date of the forecast. A typical usage of
Threshold is to ignore an incomplete last period for classic forecasts yet keep all data for quantile forecasts in dual forecast scenarios.
Threshold is ignored in quantile forecasts.
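The request body and the horizon limits can be handled together on the client side, as in the sketch below. The function name is hypothetical, the optional Threshold element is left out, the minimum horizon of 1 is an assumption, and the element order simply follows the request sample above.

```python
import xml.etree.ElementTree as ET

# Maximal horizon per period, as listed in the remarks above.
MAX_HORIZON = {"quarterhour": 10000, "halfhour": 10000, "hour": 10000,
               "day": 400, "week": 100, "month": 100}


def dataset_xml(name: str, period: str, horizon: int) -> str:
    """Build the XML body of PUT /datasets after checking the horizon limit."""
    if period not in MAX_HORIZON:
        raise ValueError(f"unknown period: {period}")
    if not 1 <= horizon <= MAX_HORIZON[period]:  # lower bound is an assumption
        raise ValueError(f"horizon out of range for {period}: {horizon}")
    root = ET.Element("Dataset")
    ET.SubElement(root, "Name").text = name
    ET.SubElement(root, "Period").text = period
    ET.SubElement(root, "Horizon").text = str(horizon)
    return ET.tostring(root, encoding="unicode")


print(dataset_xml("candysales", "week", 7))
```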
GET /datasets
List the datasets available in your account.
URL: http://api.lokad.com/rest/forecasting3/datasets/token
HTTP method: GET
{token} optional argument: continuation token (must have been retrieved from a previous call).
Request:
Request body should be left empty.
Response:
<DatasetCollection>
<ContinuationToken>02k29xk2c82ks</ContinuationToken>
<Datasets>
<Dataset>
<Horizon>6</Horizon>
<Name>candysales</Name>
<Period>month</Period>
<Threshold>2012-05-01</Threshold>
</Dataset>
</Datasets>
<ErrorCode/>
</DatasetCollection>
DELETE /datasets
Delete one dataset from your account. Deleting a dataset also deletes all the series contained in the dataset.
URL: http://api.lokad.com/rest/forecasting3/datasets/dname
HTTP method: DELETE
{dname} argument: name of the dataset.
Request:
Request body should be left empty.
Response:
<ErrorCode/>
Remarks:
If there is no such dataset, or if the dataset is already marked for deletion, then the call has no effect.
PUT /series
Add decorated time-series to a dataset.
URL: http://api.lokad.com/rest/forecasting3/series/dname?merge=B
HTTP method: PUT
{dname} argument: name of the dataset.
{B} optional argument: boolean value set as
false when the argument is omitted. You can enable time-series merge with
true.
Request:
<TimeSeries>
<TimeSerie>
<Name>lollipop</Name>
<Tau>0.95</Tau>
<Lambda>21</Lambda>
<Tags>
<string>sugar</string>
<string>candy</string>
</Tags>
<Events>
<EventValue>
<Time>2010-06-01</Time>
<KnownSince>2001-01-01</KnownSince>
<Tags>
<string>promo</string>
</Tags>
</EventValue>
</Events>
<Values>
<TimeValue>
<Time>2010-05-01</Time>
<Value>42</Value>
</TimeValue>
<TimeValue>
<Time>2010-05-08</Time>
<Value>17</Value>
</TimeValue>
</Values>
</TimeSerie>
</TimeSeries>
Response:
<ErrorCode/>
When
B=false (or omitted), it represents an
overwrite semantic: the existing serie (if any) is overwritten by the new serie passed with the request. We suggest to use this behavior unless series are extremely long (which typically happens with hourly, half-hourly or quarter-hourly series).
When
B=true, the new serie is merged with the existing one (if any). Most recent overlapping time-values and events from the existing serie are overwritten with the values passed with the request; older values stay untouched.
Check also the
Incremental time-serie updates paragraph here above for a more visual illustration of the process.
Remarks:
The two tags
Tau and
Lambda are only required for quantile forecasts. Those tags can be omitted in case of classic forecasts. Those values (if any) will be ignored in the case of a classic forecast.
- Tau: This floating point number represents the target percentage of the quantile forecast. The range of acceptable values for Tau is 0.001 to 0.999 (inclusive).
- Lambda: This integer represents the horizon, expressed in days, of the quantile forecast. The range of acceptable values for Lambda is all the integers from 1 to 366.
The
KnownSince value should typically be set to
2001-01-01 for an event known in advance, when the exact date at which the event became known is not precisely known or simply not relevant. If the event was unplanned, then
KnownSince should be equal to
Time.
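Building the request body programmatically could look like the sketch below, which also checks the Tau and Lambda ranges given above. The function names are hypothetical, and the sketch leaves out the Tags and Events elements to stay short.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple


def time_serie_element(name: str, values: List[Tuple[str, float]],
                       tau: Optional[float] = None,
                       lam: Optional[int] = None) -> ET.Element:
    """Build one <TimeSerie> element; Tau and Lambda are only set when given."""
    serie = ET.Element("TimeSerie")
    ET.SubElement(serie, "Name").text = name
    if tau is not None:
        if not 0.001 <= tau <= 0.999:
            raise ValueError("Tau must be within 0.001 and 0.999")
        ET.SubElement(serie, "Tau").text = str(tau)
    if lam is not None:
        if not 1 <= lam <= 366:
            raise ValueError("Lambda must be within 1 and 366")
        ET.SubElement(serie, "Lambda").text = str(lam)
    values_el = ET.SubElement(serie, "Values")
    for time, value in values:
        tv = ET.SubElement(values_el, "TimeValue")
        ET.SubElement(tv, "Time").text = time
        ET.SubElement(tv, "Value").text = str(value)
    return serie


def time_series_xml(series: List[ET.Element]) -> str:
    """Wrap the <TimeSerie> elements into the <TimeSeries> request body."""
    root = ET.Element("TimeSeries")
    root.extend(series)
    return ET.tostring(root, encoding="unicode")


lollipop = time_serie_element("lollipop", [("2010-05-01", 42)], tau=0.95, lam=21)
print(time_series_xml([lollipop]))
```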
GET /series
Retrieve time-series from your account.
URL: http://api.lokad.com/rest/forecasting3/series/dname/token
HTTP method: GET
{dname} argument: name of the dataset.
{token} optional argument: continuation token (must have been retrieved from a previous call).
Request:
Request body should be left empty.
Response:
<TimeSerieCollection>
<ContinuationToken>jsoxln5028s02ksix</ContinuationToken>
<ErrorCode/>
<TimeSeries>
<TimeSerie>
<Name>lollipop</Name>
<Tau>0.95</Tau>
<Lambda>21</Lambda>
<Tags>
<string>sugar</string>
<string>candy</string>
</Tags>
<Events>
<Event>
<Time>2010-06-01</Time>
<Tags>
<string>promo</string>
</Tags>
</Event>
</Events>
<Values>
<TimeValue>
<Time>2010-05-01</Time>
<Value>42</Value>
</TimeValue>
<TimeValue>
<Time>2010-05-08</Time>
<Value>17</Value>
</TimeValue>
</Values>
</TimeSerie>
</TimeSeries>
</TimeSerieCollection>
Remarks:
The number of series to be retrieved for each call is decided on the server side. In practice, you should expect about 100 time-series or 4MB of data (whichever limit is reached first) to be retrieved at each request.
The tags
Tau and
Lambda will only be returned if those tags have been uploaded in the first place. For a classic forecast where
Tau and
Lambda are not provided, those tags are not present in the response.
It is not currently possible to retrieve partial information about the time-series (such as only the time-series names), and it is not possible either to randomly access the information held by a single time-series. The Forecasting API is not intended as data storage. The method
GET /series is primarily provided for debugging purposes and sanity checks rather than for production usage.
DELETE /series
Delete one or several series from your account.
URL: http://api.lokad.com/rest/forecasting3/series/dname?n=snames
HTTP method: DELETE
{dname} argument: name of the dataset.
{snames} argument: name(s) of the series. In order to delete multiple series at a time, series names must be concatenated using the semicolon as a separator:
serie1;serie2;serie3.
Request:
Request body should be left empty.
Response:
<ErrorCode/>
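The URL for a multi-series call can be assembled as in this sketch. The helper name is hypothetical. Since names are strictly alphanumerical, no URL escaping is applied, and the 100-name check reflects the limit on compound operations.

```python
from typing import Sequence

BASE_URL = "http://api.lokad.com/rest/forecasting3"


def series_query_url(resource: str, dataset: str, names: Sequence[str]) -> str:
    """Build a URL such as .../series/{dname}?n=serie1;serie2;serie3."""
    if not 1 <= len(names) <= 100:
        raise ValueError("between 1 and 100 series names are expected")
    return f"{BASE_URL}/{resource}/{dataset}?n={';'.join(names)}"


print(series_query_url("series", "candysales", ["serie1", "serie2", "serie3"]))
```

The same pattern applies to the other methods that take a list of series names in the n argument.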
GET /status
Used both to trigger the
classic forecast computation for a given dataset AND to check whether forecasts are available.
URL: http://api.lokad.com/rest/forecasting3/status/dname
HTTP method: GET
{dname} argument: name of the dataset.
Request:
The request body should be left empty.
Response:
<ForecastStatus>
<ErrorCode/>
<ForecastsReady>true</ForecastsReady>
</ForecastStatus>
Remarks:
Once you have finished uploading your data with
PUT /series, you must
call this method once to trigger the classic forecast computations. Then, we suggest that you call this method every 5min until the status indicates that the forecasts are ready.
Once the forecasts are ready, re-calling this method
will not trigger another immediate forecast recomputation. Indeed, at least one time-series needs to be updated first before the forecast computation can be restarted. This behavior is implemented so that a single double call to
GET /status (for example, caused by a retry policy) does not cause needless re-computation.
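The trigger-then-poll pattern can be sketched as follows. The function name is hypothetical. `is_ready` stands for your own function performing the status request and returning the value of ForecastsReady. The 5-minute interval follows the suggestion above, and the 2-hour timeout mirrors the delivery delay mentioned in the call flow overview.

```python
import time
from typing import Callable


def wait_for_forecasts(is_ready: Callable[[], bool],
                       poll_seconds: float = 300.0,
                       timeout_seconds: float = 2 * 3600.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call `is_ready` until it returns True or the timeout elapses."""
    waited = 0.0
    while True:
        if is_ready():  # the first call also triggers the computation
            return True
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
```

Passing `sleep` as an argument keeps the helper easy to test without actually waiting.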
GET /qstatus
Used both to trigger the
quantile forecast computation for a given dataset AND to check whether forecasts are available.
URL: http://api.lokad.com/rest/forecasting3/qstatus/dname
HTTP method: GET
{dname} argument: name of the dataset.
Request:
The request body should be left empty.
Response:
<ForecastStatus>
<ErrorCode/>
<ForecastsReady>true</ForecastsReady>
</ForecastStatus>
Remarks:
Once you have finished uploading your data with
PUT /series, you must
call this method once to trigger the quantile forecast computations. Then, we suggest that you call this method every 5min until the status indicates that the forecasts are ready.
Once the forecasts are ready, re-calling this method
will not trigger another immediate forecast recomputation. Indeed, at least one time-series needs to be updated first before the forecast computation can be restarted. This behavior is implemented so that a single double call to
GET /qstatus (for example, caused by a retry policy) does not cause needless re-computation.
After uploading the time-series, it is possible to call both /status and /qstatus to trigger both a classic forecast flow and a quantile forecast flow at the same time. A concurrent execution of the two flows is supported by the Forecasting API v3.1. Opting for a concurrent execution reduces the overall delay to get both results, classic and quantile.
GET /forecasts
Retrieves the
classic forecasts associated to a dataset.
Calling this method is charged based on our public pricing. Make sure not to retrieve the same forecasts twice, as it would count as double the consumption.
URL: http://api.lokad.com/rest/forecasting3/forecasts/dname?n=snames
HTTP method: GET
{dname} argument: name of the dataset.
{snames} argument: name(s) of the series to be retrieved. If you want to retrieve multiple forecasts (associated to distinct series) in a single request, you must concatenate the names of the series using the semicolon as a separator:
serie1;serie2;serie3.
Request:
Request body should be left empty.
Response:
<ForecastCollection>
<ErrorCode/>
<Series>
<ForecastSerie>
<Name>lollipop</Name>
<Values>
<ForecastValue>
<Time>2010-05-15</Time>
<Value>18</Value>
<Accuracy>0.17</Accuracy>
</ForecastValue>
<ForecastValue>
<Time>2010-05-22</Time>
<Value>21</Value>
<Accuracy>0.19</Accuracy>
</ForecastValue>
</Values>
</ForecastSerie>
</Series>
</ForecastCollection>
Remarks:
You should not assume that the
ForecastSeries will be returned in the same order as the one implicitly specified in the request. You should rely on the
ForecastSerie/Name element instead.
The
Accuracy represents the expected
MAPE (Mean Absolute Percentage Error) associated with the forecast.
Caveat: our terminology is a bit confusing, one would expect
Accuracy = 1 - MAPE which is not the case here.
Each time a new forecast computation is launched, all forecasts get overwritten. The API does not keep track of the forecasts historically produced. This behavior is left to the client app.
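A response shaped like the sample above can be parsed into a dictionary keyed by series name, which sidesteps any assumption about ordering. This is a sketch: the function name is hypothetical and it assumes the XML carries no namespace, as in the sample.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

SAMPLE = """<ForecastCollection><ErrorCode/><Series><ForecastSerie>
<Name>lollipop</Name><Values>
<ForecastValue><Time>2010-05-15</Time><Value>18</Value><Accuracy>0.17</Accuracy></ForecastValue>
<ForecastValue><Time>2010-05-22</Time><Value>21</Value><Accuracy>0.19</Accuracy></ForecastValue>
</Values></ForecastSerie></Series></ForecastCollection>"""


def parse_forecasts(xml_text: str) -> Dict[str, List[Tuple[str, float, float]]]:
    """Map each series name to its list of (time, value, accuracy) tuples."""
    root = ET.fromstring(xml_text)
    result: Dict[str, List[Tuple[str, float, float]]] = {}
    for serie in root.iter("ForecastSerie"):
        name = serie.findtext("Name")
        result[name] = [
            (fv.findtext("Time"), float(fv.findtext("Value")),
             float(fv.findtext("Accuracy")))
            for fv in serie.iter("ForecastValue")
        ]
    return result


print(parse_forecasts(SAMPLE)["lollipop"])
```

Since the API does not keep past forecasts, this is also the point where a client would store the parsed values if a history is needed.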
GET /quantiles
Retrieves the
quantile forecasts associated to a dataset.
Calling this method is charged based on our public pricing. Make sure not to retrieve the same forecasts twice, as it would count as double the consumption.
URL: http://api.lokad.com/rest/forecasting3/quantiles/dname?n=snames
HTTP method: GET
{dname} argument: name of the dataset.
{snames} argument: name(s) of the series to be retrieved. If you want to retrieve multiple forecasts (associated to distinct series) in a single request, you must concatenate the names of the series using the semicolon as a separator:
serie1;serie2;serie3.
Request:
Request body should be left empty.
Response:
<QuantileCollection>
<ErrorCode/>
<Quantiles>
<QuantileValue>
<Name>lollipop</Name>
<Value>17</Value>
</QuantileValue>
</Quantiles>
</QuantileCollection>
Remarks:
You should not assume that the
QuantileValues will be returned in the same order as the one implicitly specified in the request. You should rely on the
QuantileValue/Name element instead.
The quantile value
QuantileValue/Value is an integer. No fractional forecasts will be returned in the case of quantile forecasts.
Each time a new quantile forecast computation is launched, all forecasts get overwritten. The API does not keep track of the forecasts historically produced. This behavior is left to the client app.
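The quantile response can be parsed in a similar way. In this sketch the function name is hypothetical and the XML is assumed to carry no namespace, as in the sample above.

```python
import xml.etree.ElementTree as ET
from typing import Dict

SAMPLE_QUANTILES = """<QuantileCollection><ErrorCode/><Quantiles>
<QuantileValue><Name>lollipop</Name><Value>17</Value></QuantileValue>
</Quantiles></QuantileCollection>"""


def parse_quantiles(xml_text: str) -> Dict[str, int]:
    """Map each series name to its integer quantile value."""
    root = ET.fromstring(xml_text)
    return {qv.findtext("Name"): int(qv.findtext("Value"))
            for qv in root.iter("QuantileValue")}


print(parse_quantiles(SAMPLE_QUANTILES))  # {'lollipop': 17}
```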