Call us: +1 (716) 989 6531 or email at:

Forecasting Software for retail, wholesale and manufacturing

RSS RSS

Navigation





Search the wiki
»

PoweredBy

Salescast Intermediate SQL schema

RSS
Home » Salescast » Here

Intermediate SQL schema

Intended audience: database administrators, developers

Salescast is a forecasting client webapp that leverages the Lokad Forecasting Services. Salescast can import data from any remotely reachable SQL database. Yet, if you want to avoid the production database to be directly accessed to a 3rd party app such as Salescast, it is usually considered a best practice to setup an intermediate database.

This document details how to setup such an intermediate database, with a strong focus on the expected database schema that will allow Salescast to auto-import your sales data without any customization need on Lokad's side.

This document contains SQL script templates for creating such export databases in the following flavors:

  • Microsoft SQL Azure or Microsoft SQL Server (both systems are nearly the same);
  • MySQL;
  • Postgre SQL 7.3+
  • Oracle

Make sure to customize the template to your needs, before creating the actual database.

If you don't find an SQL template for you favorite database you want to integrate, please drop us an email!


Getting an extra (cloud) database

The process outlined in this document requires an extra database to be setup. This database will be populated with data originating from your current production system. This happens typically through an ETL tool. Data is consumed by Salescast in the end.

Getting an extra database on-demand is now rather straightforward through cloud computing solutions.
  • If your production database is Microsoft SQL Server, we suggest to get a database from Microsoft SQL Azure. Setting up your SQL Azure database instance is a matter of minutes.
  • If your production database is MySQL, we suggest to get a database from Amazon RDS. Setting up your MySQL database instance is also a matter of minutes.

Those cloud-hosted databases are natively designed for remote access, and are best suited for interacting with Salescast.

Firewall access and IP address

Salescast is hosted on Windows Azure, so we don't have a fixed IP addresses. The IP range of the Microsoft data-center is 94.245.*.*. (or 94.245.0.0/16 written with the CIDR notation). With MySQL it is possible to grant an access for the IP range with the following command: GRANT SELECT, LOCK TABLES ON databaseName.* TO username@'94.245.%.%' IDENTIFIED by 'password';

Data: the big picture

The sales data imported by Salescast fall into three SQL tables used for read-only purposes:
  • Lokad_Items
  • Lokad_Orders
  • Lokad_Events

Lokad_Items table should contain all the descriptions of the products, articles or services offered by your company. This table is primarily used to generate a human-readable report with proper labels (not just database identifiers). This table may also contain key inventory-oriented information such as StockOnHand. It will be used to generate reports more oriented toward inventory optimization purposes. In addition this table may also contain Tags to refine the forecasts.

Lokad_Orders table should contain the bulk of the sales history. A line in the Lokad_Orders table typically represents the amount of items sold on a given day.

Lokad_Events table is optional and represents extra information about phenomenon such as retail promotions that are likely to impact the demand. Check the guide about Events for more information.

Convention over configuration

Instead of following a rigid SQL schema that might not be appropriate for your business, Salescast relies on a flexible approach based on SQL naming conventions. In short, Salescast can potentially import a lot of data, but all this data may not be available in your systems (or might not even be relevant to your business).

Salescast lets you choose, which data you want to include in the intermediate database. Nearly all fields are optional. If a field cannot be found, Salescast will just go on processing your data further and skip any missing fields.

Caution: Salescast is case-sensitive, make sure to respect the given capitalization for fields and table names.

Items table

Lokad_Items table should contain descriptions of the articles / SKU / services being offered. The table can have any number of extra columns in addition to the Primary Key (PK). Those extra columns can be numeric (integers, float, ...) or plain text (such as nvarchar).

Additional indexes and constraints could be added as needed for your convenience.

The table must be named Lokad_Items and should contain:

  • Id - the primary key (PK) of type Int64 or bigint. Required.

This table is required.

Display fields

Any field prefixed by the token Label will be inserted into the report that will be produced by Salescast. For example, if your Lokad_Items table contains a field named LabelProductName, then report will include a column named ProductName (the Label part will be removed from the name).

This convention allows to add any arbitrary set of extra fields for reporting purposes. Fields typically include, without limitations: Supplier, Brand, ISBN, etc. These fields are for display only and they do not impact the forecasting operation itself.

All display fields can hold NULL values. The report displays the corresponding cell in the report as blank.

Tagging fields

Salescast uses any field prefixed with the token Tag to assign this tag to the item. Color, size, brand etc. are typical examples of tags used to improve the accuracy of the forecast. Please refer to the guide for tags and events to learn more about which data should be used for tags.

Any tagging field can hold NULL values. When a NULL value is encountered, the tag will simply be omitted.

If you wish to combine both tag and label behaviors for a single field, then you need to prefix the column with TagLabel.

Fields specific for the inventory

Inventory specific fields do not contribute to the forecasting operations. They allow Salescast to perform some simple safety stock reporting using reorder points and stock coverages.

There are 4 fields that are natively associated with a specific semantic:
  • StockOnHand the number of items that are physically available. This value is expected to be a number (eventually a floating number, if the stock represents a quantity of materials).
  • StockOnOrder the number of items that are already ordered, but not present, yet. This value is expected to be a number (eventually a floating number too).
  • LeadTime represents the number of days that the supplier requires to replenish the local storage of items. Caution: the duration is always expressed in days even if your report is setup for weekly or monthly forecasts. Learn more about lead time.
  • ServiceLevel is a number between 0 and 1 that represents the probability of not experiencing a shortage. This field acts as a safety stock policy. To learn more, check the guide about safety stock and the service level page for practical examples.

Slow mover's auto-exclusion rules

When a forecast report is generated on your data, some items being sold might been sold so infrequently they are not worth being forecasted. For example, this situation is typical in eCommerce where very large product catalog are frequent.

Salescast apply default exclusion rules to your report:
  • monthly forecasts: items not sold at least 3x over the last 3 months are excluded.
  • weekly forecasts: items not sold at least 3x over the last 3 weeks are excluded.
  • daily forecasts: items not sold at least 10x over the last 15 days are excluded.

Monthly, weekly and daily refer the level of aggregation of the sales data. This policy does not take into account the frequency of the forecast refreshes. Ex: a weekly 12-weeks ahead forecast executed only once a month is treated equally to a a weekly 12-weeks ahead forecast executed every week.

Also, this policy is stateless: a item can alternate between slow and non-slow as time goes, depending on its sales patterns. Hence, each execution might generate a slightly different set of items to be forecasted.

Those exclusion rules are applied by default, so you don't have to worry about slow mover's massively inflating your Lokad forecast consumption.

You can override those rules by introducing a NULLable boolean field Lokad_Items.IsForecasted (or integral field, if boolean is not available). If the value is set at 0 (false) then the item is excluded from the report; otherwise the item is included in the report. If value is NULL, then the default exclusion rule is applied.

Changes in Table Schema and Data

All changes in Lokad_Items schema (table structure) and data (rows and values) between the Refresh operations are detected and handled by Salescast automatically.

You don't need to create a new solution when columns are added, modified or removed from this table. Salescast will automatically detect such changes. On every Refresh operation it scans the entire table, detecting changes in table schema as well as the actual values. These changes are automatically synchronized with your solution.

Please, do not change table schema while Refresh operation is running. This might result in unpredictable behavior.

Orders table

Lokad_Orders table should contain actual sales history. It is is expected to be comprised of four fields:
  • Id the primary key. Required (not NULL).
  • Item_Id a foreign key toward the Items table (not NULL).
  • OrderDate that represents the date associated with the order quantity. Required (not NULL).
  • Quantity that represents the number of items that are ordered. Required (not NULL).

We suggest to pre-aggregate orders on a daily basis, whenever possible. Lokad does not need all the fine-grained details of sales within one day to produce sales forecasts. A daily aggregation of the daily sales data, however, is likely to reduce the amount of data by factor x10 or more. This will improve performance and reduce stress on the database.

Note that aggregating the data is not a requirement. Salescast will be able to perform the aggregation on its end if needed.

In addition to the PK, Orders Table should have index on Item_Id+OrderDate as shown in the queries for SQL and MySQL.

Events table

Events are part of Lokad's Tags+Events framework. Events help to keep track of phenomenons that impact the demand, such as promotions or shortages.

The optional table Lokad_Events, when present, should contain the following fields:
  • Id - the primary key (PK) of type Int64 or bigint. Required.
  • Item_Id a foreign key toward the Lokad_Items table. Required.
  • StartDate that represents when the event started.
  • TagFoo that represents a tag (descriptor) to be associated with the event. Only the prefix Tag matters, the suffix Foo is left for your convenience.

Although this is not a requirement, we suggest to avoid naked events where:
  • There are no descriptive field prefixed by Tag.
  • There are records where all Tag prefixed fields are null.

Indeed, naked events do not tell Lokad much except that something took place.

Forecasts table

The optional Lokad_Forecasts and Lokad_ForecastSync are Salescast's output tables. If they are present and contain the proper fields, Salescast will append Lokad's forecasts.

Fields of the table Lokad_ForecastSync:
  • Id the primary key.
  • SyncDate is the sync date (date when sync export was started).
  • SyncReference is the unique identifier for the specific sync operation.
  • CompletedDate is the NULL-able date field that is set once export is finished.

The table Lokad_ForecastSync contains the list of all forecasting operations that were performed by Salescast.

Fields of the table Lokad_Forecasts:
  • Id the primary key.
  • Sync_Id a foreign key referencing ForecastSync.
  • Item_Id a reference to Items (note, that this is not a foreign key).
  • ForecastDate that represents date of the forecast.
  • Quantity that represents the forecasted value.
  • Accuracy that represents anticipated precision of the forecast. Lokad Forecasting Service estimates the value itself based on MAPE (Mean Absolute Percentage Error).

The table Lokad_Forecasts contains fine-grained information of the forecasts that produced by Lokad. In particular, every single forecasts is associated with its own accuracy.

The process of Salescast only appends forecast data. In particular, Salescast does not modify any existing data. Depending on your database capacity, you may want to archive or clean-up old forecasts after a while.

Optimized Inventory Table

For rich reporting experience you can optionally include Lokad_OptimizedItems table in your database schema (requires forecast tables to be present). This table would be populated by Salescast following a simple naming convention. It will contain information about the products (i.e. Lokad_Items) at the moment of the report generation (one row per product). Additionally inventory optimization information could be exported into this table, as calculated by Salescast based on the provided inventory-specific values and the actual forecasts.

This behavior is useful for reporting purposes and making various integration scenarios simpler. For instance, ReorderPoint and StockCover values could be passed to the inventory management software. Additionally, this table could be useful for the BI purposes, since it will represent a detailed snapshot of the most important products in your inventory at the moment of report generation.

Data exported into the table Lokad_OptimizedItems depends on the columns being defined in this table. There are a few required columns, product-specific columns (copied from the original Lokad_Items table) and columns containing inventory optimization information. This behavior let you choose what data you wish to be output by Salescast.

Required columns

Required fields of Lokad_OptimizedItems table:
  • Id - the primary key.
  • Sync_Id a foreign key referencing ForecastSync.
  • Item_Id a reference to Items (note, that this is not a foreign key).

Product-specific Columns

Product-specific columns are optional columns for the Lokad_OptimizedItems. They must be NULLable and should have name and data type that matches column name in Lokad_Items table. In this case Salescast will copy values towards from Lokad_Items the Lokad_OptimizedItems for every row that is included in the report. For columns that do not have matching field in Lokad_Items, value will stay NULL.

Inventory Optimization Fields

These fields could be included in Lokad_OptimizedItems in order to have calculated inventory optimization values exported towards this table:

  • StockCover - numeric (floating point) NULLable field to contain length of time (in days) that inventory will last if current usage continues.
  • ReorderPoint - numeric (floating point) NULLable field to contain recommended minimum level of inventory at which a new order must be placed.
  • LeadDemand - numeric (floating point) NULLable field to contain total forecasted demand during the lead time.

Changes in Table Schema

You can customize the table as you see fit, as long as the table specifications are met. Salescast will detect and handle changes on the next Refresh operation.

Please, do not change table schema, while Refresh operation is running. This might result in unpredictable behavior.

Best integration practices

This section provides a general guideline on making data available to Salescast.

3 years of sales history

Organizations frequently have long sales histories that comprise multiple years. When you provide it to the Salescast via an indexed view or database replica, it is sometimes not feasible to push all the available data. You may ask yourself: How much data does Lokad's Forecasting Service need?

The answer is simple: The more data, the better the forecasts. Sales history that comprises, for example, multiple years, allows for forecasting models that capture yearly seasonality patterns, increasing accuracy for holidays, seasonal promotions and other reocurring events that might be important for your industry.

According to our experience, 3 years of sales history provides sufficient data to capture such patterns.

Daily pre-aggregation

The forecasting engine does not need sales data down to individual transactions. Aggregating data at the daily level can reduce the amount of data to be exposed and later to be retrieved by Lokad typically by 10x or more; and this aggregation no downside on the forecast's accuracy.

Treating shortages as event

The Inventory section, mentioned above, does not take shortages into consideration, although a sales shortage typically impacts the demand (first by causing a sales slowdown, secondly by causing a sales burst when inventory is replenished).

Lokad considers shortages using events. You need to step a specific token associated with the status of the item that experiences a shortage The event should then be positioned at the beginning of the actual shortage.

If the shortage comprises multiple days or even multiple weeks, there is no need for replicating the shortage event multiple times. From a statistical viewpoint, a single event, positioned at the beginning of the shortage, is sufficient to capture the phenomenon.

Multi-day promotions

There is no need to replicate the promotion event at each period, if it comprises multiple days, or multiple weeks. A single promotion event, positioned at the beginning of the promotion, is enough for Lokad to capture the statistical pattern.

Script templates

This section contains SQL script templates that can be used to setup your intermediate database.

Microsoft SQL Server / Azure

Salescast can integrate with databases on Microsoft SQL Server and Windows SQL Azure. Data retrieval, forecast exports and customizable inventory optimization exports are supported.

We provide customizable script template for Microsoft SQL that can help you to set up intermediate Oracle database according to the conventions and specifications outlined above.

Learn also how to grant access to Salescast to your Microsoft SQL Server. Since Salescast is hosted on Windows Azure, it requires a static port.

MySQL / Amazon RDS

Salescast supports integration with on-premises MySQL databases and cloud-hosted Amazon RDS instances. Data retrieval, forecast exports and customizable inventory optimization exports are supported.

We provide customizable script template for MySQL to help in setting up intermediate MySQL database according to the conventions and specifications outlined above.

PostgreSQL

Salescast supports automatic integration with PostgreSQL database version 7.3 or later. Data retrieval, forecast exports and customizable inventory optimization exports are supported.

There is a customizable Intermediate Schema template for PostgreSQL.

Oracle

Automatic integration with intermediate Oracle databases is supported by Salescast. You can use customizable intermediate template for Oracle to streamline the setup process.

Content

Does Salescast apply to my company?

What people say

Classical solutions require too much manpower and don't scale correctly over hundreds of thousands of products. Lokad and Windows Azure were exactly the solution my business needed. Pierre-Noël Luiggi, CEO of Oscaro
The Lokad forecasting solution allows us to precisely forecast our sales and to optimize our inventory accordingly. The result is there: we are maintaining a 99% customer satisfaction level and deliver food that is often fresher than what can be found at local pet stores. Anthony Holloway, CEO at k9cuisine
Lokad improved the accuracy of our planning process significantly. The immediate impact was a stock reduction of almost 1 million € at a monthly cost of 150€. It was almost frightening to see our inventory levels getting so low! But what impressed me most is the ease of implementation and use. The integration was painless, and now it takes only a the click of a button and within 10 minutes I receive my forecast. The time saving for me is significant. Thomas Brémont, Head of Supply Chain Bizline

More success stories.