Home »
Salescast » Here
Intermediate SQL schema
Intended audience: database administrators, developers
Salescast is a forecasting client webapp that leverages the Lokad Forecasting Services. Salescast can import data from any remotely reachable SQL database. Yet, if you want to avoid the production database to be directly accessed to a 3rd party app such as Salescast, it is usually considered a best practice to
setup an intermediate database.
This document details how to setup such an intermediate database, with a strong focus on the
expected database schema that will allow Salescast to auto-import your sales data without any customization need on Lokad's side.
This document contains SQL script templates for creating such export databases in the following flavors:
- Microsoft SQL Azure or Microsoft SQL Server (both systems are nearly the same);
- MySQL;
- Postgre SQL 7.3+
- Oracle
Make sure to customize the template to your needs, before creating the actual database.
If you don't find an SQL template for you favorite database you want to integrate, please drop us an email!
Getting an extra (cloud) database
The process outlined in this document requires an
extra database to be setup. This database will be populated with data originating from your current production system. This happens typically through an
ETL tool. Data is consumed by Salescast in the end.
Getting an extra database on-demand is now rather straightforward through
cloud computing solutions.
- If your production database is Microsoft SQL Server, we suggest to get a database from Microsoft SQL Azure. Setting up your SQL Azure database instance is a matter of minutes.
- If your production database is MySQL, we suggest to get a database from Amazon RDS. Setting up your MySQL database instance is also a matter of minutes.
Those cloud-hosted databases are natively designed for remote access, and are best suited for interacting with Salescast.
Firewall access and IP address
Salescast is hosted on Windows Azure, so we don't have a fixed IP addresses. The IP range of the Microsoft data-center is 94.245.*.*. (or 94.245.0.0/16 written with the
CIDR notation). With MySQL it is possible to grant an access for the IP range with the following command: GRANT SELECT, LOCK TABLES ON databaseName.* TO username@'94.245.%.%' IDENTIFIED by 'password';
Data: the big picture
The sales data imported by Salescast fall into three SQL tables used for
read-only purposes:
- Lokad_Items
- Lokad_Orders
- Lokad_Events
Lokad_Items table should contain all the descriptions of the products, articles or services offered by your company. This table is primarily used to generate a
human-readable report with proper labels (not just database identifiers). This table may also contain key inventory-oriented information such as
StockOnHand. It will be used to generate reports more oriented toward inventory optimization purposes. In addition this table may also contain
Tags to refine the forecasts.
Lokad_Orders table should contain the bulk of the sales history. A line in the
Lokad_Orders table typically represents the amount of items sold on a given day.
Lokad_Events table is optional and represents extra information about phenomenon such as
retail promotions that are likely to impact the demand. Check the guide about
Events for more information.
Convention over configuration
Instead of following a rigid SQL schema that might not be appropriate for your business, Salescast relies on a flexible approach based on
SQL naming conventions. In short, Salescast can potentially import a lot of data, but all this data may not be available in your systems (or might not even be relevant to your business).
Salescast lets you choose, which data you want to include in the intermediate database. Nearly all fields are optional. If a field cannot be found, Salescast will just go on processing your data further and skip any missing fields.
Caution: Salescast is case-sensitive, make sure to respect the given capitalization for fields and table names.
Items table
Lokad_Items table should contain descriptions of the articles / SKU / services being offered. The table can have any number of extra columns in addition to the Primary Key (PK). Those extra columns can be numeric (integers, float, ...) or plain text (such as nvarchar).
Additional indexes and constraints could be added as needed for your convenience.
The table must be named
Lokad_Items and should contain:
Id - the primary key (PK) of type Int64 or bigint. Required.
This table is required.
Display fields
Any field prefixed by the token
Label will be inserted into the report that will be produced by Salescast. For example, if your
Lokad_Items table contains a field named
LabelProductName, then report will include a column named
ProductName (the
Label part will be removed from the name).
This convention allows to add any arbitrary set of extra fields for reporting purposes. Fields typically include, without limitations: Supplier, Brand, ISBN, etc. These fields are
for display only and they do not impact the forecasting operation itself.
All display fields can hold NULL values. The report displays the corresponding cell in the report as blank.
Tagging fields
Salescast uses any field prefixed with the token
Tag to assign this
tag to the item. Color, size, brand etc. are typical examples of tags used to improve the accuracy of the forecast. Please refer to the
guide for tags and events to learn more about which data should be used for tags.
Any tagging field can hold NULL values. When a NULL value is encountered, the tag will simply be omitted.
If you wish to combine both tag and label behaviors for a single field, then you need to prefix the column with TagLabel.
Fields specific for the inventory
Inventory specific fields
do not contribute to the forecasting operations. They allow Salescast to perform some simple
safety stock reporting using
reorder points and
stock coverages.
There are 4 fields that are natively associated with a specific semantic:
StockOnHand the number of items that are physically available. This value is expected to be a number (eventually a floating number, if the stock represents a quantity of materials).StockOnOrder the number of items that are already ordered, but not present, yet. This value is expected to be a number (eventually a floating number too).LeadTime represents the number of days that the supplier requires to replenish the local storage of items. Caution: the duration is always expressed in days even if your report is setup for weekly or monthly forecasts. Learn more about lead time.ServiceLevel is a number between 0 and 1 that represents the probability of not experiencing a shortage. This field acts as a safety stock policy. To learn more, check the guide about safety stock and the service level page for practical examples.
Slow mover's auto-exclusion rules
When a forecast report is generated on your data, some items being sold might been sold so infrequently they are not worth being forecasted. For example, this situation is typical in eCommerce where very large product catalog are frequent.
Salescast apply default exclusion rules to your report:
- monthly forecasts: items not sold at least 3x over the last 3 months are excluded.
- weekly forecasts: items not sold at least 3x over the last 3 weeks are excluded.
- daily forecasts: items not sold at least 10x over the last 15 days are excluded.
Monthly, weekly and daily refer the
level of aggregation of the sales data. This policy does not take into account the frequency of the forecast refreshes. Ex: a weekly 12-weeks ahead forecast executed only once a month is treated equally to a a weekly 12-weeks ahead forecast executed every week.
Also, this policy is
stateless: a item can alternate between slow and non-slow as time goes, depending on its sales patterns. Hence, each execution might generate a slightly different set of items to be forecasted.
Those exclusion rules are applied by default, so you don't have to worry about slow mover's massively inflating your Lokad forecast consumption.
You can override those rules by introducing a NULLable boolean field
Lokad_Items.IsForecasted (or integral field, if boolean is not available). If the value is set at 0 (false) then the item is excluded from the report; otherwise the item is included in the report. If value is NULL, then the default exclusion rule is applied.
Changes in Table Schema and Data
All changes in
Lokad_Items schema (table structure) and data (rows and values) between the
Refresh operations are detected and
handled by Salescast automatically.
You don't need to create a new solution when columns are added, modified or removed from this table. Salescast will automatically detect such changes. On every
Refresh operation it scans the entire table, detecting changes in table schema as well as the actual values. These changes are automatically synchronized with your solution.
Please, do not change table schema while Refresh operation is running. This might result in unpredictable behavior.
Orders table
Lokad_Orders table should contain actual sales history. It is is expected to be comprised of four fields:
Id the primary key. Required (not NULL).Item_Id a foreign key toward the Items table (not NULL).OrderDate that represents the date associated with the order quantity. Required (not NULL).Quantity that represents the number of items that are ordered. Required (not NULL).
We suggest to
pre-aggregate orders on a daily basis, whenever possible. Lokad does not need all the fine-grained details of sales
within one day to produce sales forecasts. A daily aggregation of the daily sales data, however, is likely to reduce the amount of data by factor x10 or more. This will improve performance and reduce stress on the database.
Note that aggregating the data is not a requirement. Salescast will be able to perform the aggregation on its end if needed.
In addition to the PK, Orders Table should have index on
Item_Id+OrderDate as shown in the queries for SQL and MySQL.
Events table
Events are part of Lokad's
Tags+Events framework. Events help to keep track of phenomenons that impact the demand, such as promotions or shortages.
The optional table
Lokad_Events, when present, should contain the following fields:
Id - the primary key (PK) of type Int64 or bigint. Required.Item_Id a foreign key toward the Lokad_Items table. Required.StartDate that represents when the event started.TagFoo that represents a tag (descriptor) to be associated with the event. Only the prefix Tag matters, the suffix Foo is left for your convenience.
Although this is not a requirement, we suggest to avoid
naked events where:
- There are no descriptive field prefixed by
Tag. - There are records where all
Tag prefixed fields are null.
Indeed,
naked events do not tell Lokad much except that
something took place.
Forecasts table
The optional
Lokad_Forecasts and
Lokad_ForecastSync are Salescast's
output tables. If they are present and contain the proper fields, Salescast will append Lokad's forecasts.
Fields of the table
Lokad_ForecastSync:
Id the primary key.SyncDate is the sync date (date when sync export was started).SyncReference is the unique identifier for the specific sync operation.CompletedDate is the NULL-able date field that is set once export is finished.
The table
Lokad_ForecastSync contains the list of all forecasting operations that were performed by Salescast.
Fields of the table
Lokad_Forecasts:
Id the primary key.Sync_Id a foreign key referencing ForecastSync.Item_Id a reference to Items (note, that this is not a foreign key).ForecastDate that represents date of the forecast.Quantity that represents the forecasted value.Accuracy that represents anticipated precision of the forecast. Lokad Forecasting Service estimates the value itself based on MAPE (Mean Absolute Percentage Error).
The table
Lokad_Forecasts contains fine-grained information of the forecasts that produced by Lokad. In particular, every single forecasts is associated with its own accuracy.
The process of Salescast only
appends forecast data. In particular, Salescast does not modify any existing data. Depending on your database capacity, you may want to archive or clean-up old forecasts after a while.
Optimized Inventory Table
For
rich reporting experience you can optionally include
Lokad_OptimizedItems table in your database schema (requires forecast tables to be present). This table would be populated by Salescast following a simple naming convention. It will contain information about the products (i.e.
Lokad_Items) at the moment of the report generation (one row per product). Additionally inventory optimization information could be exported into this table, as calculated by Salescast based on the provided inventory-specific values and the actual forecasts.
This behavior is useful for reporting purposes and making various integration scenarios simpler. For instance,
ReorderPoint and
StockCover values could be passed to the inventory management software. Additionally, this table could be useful for the BI purposes, since it will represent a detailed snapshot of the most important products in your inventory at the moment of report generation.
Data exported into the table
Lokad_OptimizedItems depends on the columns being defined in this table. There are a few required columns, product-specific columns (copied from the original
Lokad_Items table) and columns containing inventory optimization information. This behavior let you choose what data you wish to be output by Salescast.
Required columns
Required fields of
Lokad_OptimizedItems table:
Id - the primary key.Sync_Id a foreign key referencing ForecastSync.Item_Id a reference to Items (note, that this is not a foreign key).
Product-specific Columns
Product-specific columns are optional columns for the
Lokad_OptimizedItems. They must be NULLable and should have name and data type that matches column name in
Lokad_Items table. In this case Salescast will copy values towards from
Lokad_Items the
Lokad_OptimizedItems for every row that is included in the report. For columns that do not have matching field in
Lokad_Items, value will stay
NULL.
Inventory Optimization Fields
These fields could be included in
Lokad_OptimizedItems in order to have calculated inventory optimization values exported towards this table:
StockCover - numeric (floating point) NULLable field to contain length of time (in days) that inventory will last if current usage continues.ReorderPoint - numeric (floating point) NULLable field to contain recommended minimum level of inventory at which a new order must be placed.LeadDemand - numeric (floating point) NULLable field to contain total forecasted demand during the lead time.
Changes in Table Schema
You can customize the table as you see fit, as long as the table specifications are met. Salescast will detect and handle changes on the next
Refresh operation.
Please, do not change table schema, while Refresh operation is running. This might result in unpredictable behavior.
Best integration practices
This section provides a general guideline on making data available to Salescast.
3 years of sales history
Organizations frequently have long sales histories that comprise multiple years. When you provide it to the Salescast via an indexed view or database replica, it is sometimes not feasible to push all the available data. You may ask yourself:
How much data does Lokad's Forecasting Service need?The answer is simple:
The more data, the better the forecasts. Sales history that comprises, for example, multiple years, allows for forecasting models that capture yearly seasonality patterns, increasing accuracy for holidays, seasonal promotions and other reocurring events that might be important for your industry.
According to our experience,
3 years of sales history provides sufficient data to capture such patterns.
Daily pre-aggregation
The forecasting engine does not need sales data down to individual transactions. Aggregating data at the daily level can reduce the amount of data to be exposed and later to be retrieved by Lokad typically by 10x or more; and this aggregation no downside on the forecast's accuracy.
Treating shortages as event
The
Inventory section, mentioned above, does not take
shortages into consideration, although a sales shortage typically impacts the demand (first by causing a sales slowdown, secondly by causing a sales burst when inventory is replenished).
Lokad considers shortages using
events. You need to step a specific token associated with the status of the item that experiences a shortage The event should then be positioned at the beginning of the actual shortage.
If the shortage comprises multiple days or even multiple weeks, there is no need for replicating the shortage event multiple times. From a statistical viewpoint, a single event, positioned at the beginning of the shortage, is sufficient to capture the phenomenon.
Multi-day promotions
There is no need to replicate the promotion event at each period, if it comprises multiple days, or multiple weeks. A single promotion event, positioned at the beginning of the promotion, is enough for Lokad to capture the statistical pattern.
Script templates
This section contains SQL script templates that can be used to setup your intermediate database.
Microsoft SQL Server / Azure
Salescast can integrate with databases on Microsoft SQL Server and Windows SQL Azure. Data retrieval, forecast exports and customizable inventory optimization exports are supported.
We provide
customizable script template for Microsoft SQL that can help you to set up intermediate Oracle database according to the conventions and specifications outlined above.
Learn also how to
grant access to Salescast to your Microsoft SQL Server. Since Salescast is hosted on Windows Azure, it requires a static port.
MySQL / Amazon RDS
Salescast supports integration with on-premises MySQL databases and cloud-hosted Amazon RDS instances. Data retrieval, forecast exports and customizable inventory optimization exports are supported.
We provide
customizable script template for MySQL to help in setting up intermediate MySQL database according to the conventions and specifications outlined above.
PostgreSQL
Salescast supports automatic integration with PostgreSQL database version 7.3 or later. Data retrieval, forecast exports and customizable inventory optimization exports are supported.
There is a customizable
Intermediate Schema template for PostgreSQL.
Oracle
Automatic integration with intermediate Oracle databases is supported by Salescast. You can use
customizable intermediate template for Oracle to streamline the setup process.