» HereIntended audience:
data administrators, developers
Phantomscan is an in-store productivity webapp
that helps store employees to get rid of phantom inventory
. In order to do that, Phantomscan can import inventory data from plain text files published on an FTP server in TSV format, as long as the files are provided in a predefined format.
See also our BigFiles
, our file hosting service that supports FTP. Push your files directly to Lokad.
Phantomscan relies on two types of files:
- regular snapshots of inventory state at points in time, where each file is separate snapshot
- continuous history of recorded inventory movements over periods of time, where each file represents a portion of ongoing history.
This document details how to setup these TSV files in a way that will enable Phantomscan to auto-import your inventory data without any further integration needed from Lokad's side. Download sample TSV files
Tab-separated Values (TSV)
TSV stands for Tab-separated Values. It is a format of storing datasets in a specially formatted text files, which can be used by software. Although the format is extremely old, it works surprisingly well for data exchange with large datasets.
TSV format is defined by:
- A file in TSV format consists of lines.
- Each line contains fields separated from each other by TAB characters (horizontal tab, HT, ASCII control code 9).
- "Field" means here just any string of characters, excluding TABs. The point is simply that TABs divide a line into pieces, components.
- Each line must contain the same number of fields.
- The first line contains the names for the fields (on all lines), i.e. column headers.
Since TAB is used as a separator between fields, a field cannot contain a TAB. However, TABs usually don't appear in data items that you wish to tabulate, so this is seldom a restriction. Otherwise, we suggest to strip them upon export, replacing with 4 space characters.
More detailed description (along with instructions to Export from Excel) is available: Tab Separated Values (TSV): a format for tabular data exchange
Numbers and dates
In all TSV files, floating numbers must be formatted using
(dot) as decimal separator. No other digit separators are allowed.
- Good: 1234.00
- Good: -12
- Bad: 1234,12
- Bad: 1.123,12
- Bad: 1,123.12
All dates should be formatted using
- yyyy - 4 year digits
- MM - 2 month digits
- dd - 2 day digits
- HH - 2 digits of time in 24 hour format
- mm - 2 minute digits
- ss - 2 second digits
- Good: 2012-09-18
- Good: 2012-09-18 19:23:00
- Bad: 2012-9-18
- Bad: 9-18-2012
- Bad: 18-9-2012
- Bad: 18-9-2012 5:34 PM
Prices and currencies
All prices are expected to be expressed in the canonical currency unit, following the guidelines here above for floating numbers. A given Phantomscan project supports only a single currency. The currency is defined from the Phantomscan settings.
For example, in order to represent $199.50:
19950 (cents not USD)
Overall file hierarchy
Phantomscan imports two types of files:
- SKU files - include static information about each SKU, generally representing a snapshot of inventory at certain point in time
- inventory events files - include the history of inventory movements during a certain period of time.
These files are to be uploaded to a FTP directory exposed to Phantomscan. This directory can be kept up-to-date by uploading new files, while preserving existing files. Here's how such a directory might look like:
The lines with
represent omitted set of files.
Usually, it's desirable to update data daily, uploading new SKU snapshots and inventory events, which happened since the previous upload. However different intervals (even irregular) are also possible.
If files are hosted on the Lokad FTP Service, Phantomscan might compress them with GZIP, appending
.gz extension. This enables you to use space more efficiently.
An essential property of this file naming convention is that files are never modified
. Once a file has been uploaded, it is considered to be part of history (for the date that appears in its name) and will never be changed again.
Format of Phantomscan_SKUs_YYYY-MM-DD.tsv
file is intended as a snapshot
of all the SKUs being part of the inventory at the date embedded in the file name. Whenever the ambiguity exists, the content is assumed to be associated with the state of the inventory at the end of the day
, that is, after the daily sales or the daily shipment.
Please, keep in mind precise naming pattern of the file. This is needed by Phantomscan in order to properly associate inventory snapshots with dates.
Capturing daily snapshots of the SKU information can seem very redundant, because SKUs change very little from one day to the next. In practice, however, we observe that most ERP or inventory management software do not support fine-grained tracking of all changes that impact inventory, such as tracking that an SKU has been renamed for example. Hence, Phantomscan relies on snapshots, and on snapshots comparison to figure out changes over time.
If your software allows retrieval of fine-grained events about inventory change events, please get in touch with Lokad support.
Phantomscan applies the following interpretations to the SKU files: as long no newer SKU file is provided, the most recent file is considered as being in effect.
Let's illustrate the situation with an example. Let's assume that the FTP repository contains 4 files:
Then, we have the following observations:
- The set of files is valid. There are some days missing but Phantomscan does not require SKU snapshots to be taken every day (although doing that can improve accuracy).
- On December 1st and 2nd, the file
- From December 3rd to December 6th, the file
- If a given SKU appears in
Phantomscan_SKUs_2012-12-01.tsv but not in
Phantomscan_SKUs_2012-12-03.tsv, then it is assumed to have disappeared on December 3rd.
TSV Fields for SKUs
Core fields, must always be present in the extract.
||Required. The identifier associated to the SKU itself. A given file is not supposed to contain two lines with the same ID value. This identifier is used to match SKUs from one snapshot file to another.
||Required. The primary human-readable entry associated to the SKU. Within Phantomscan, this entry is used to refer to the SKU.
||Required. The expected inventory level for the SKU. This quantity should be computed at the end of the day (hence all stock movements for the day are supposed to be already applied).
||Required. A comma (
,) separated list of unique item identifiers (usually, the GTIN barcode) that are part of the SKU.
||The per-unit selling price of the item, no tax applied. May be used by Phantomscan to compute the expected ROI for counting specific SKUs, which in turn may lead to improved counting suggestions.
||An unique identifier for the location where a given SKU is stored. If the column is provided, then blank values are forbidden. If the column is not provided, then a single
"default" location is assumed. Used if you need to support multiple locations, such as multiple stores
||All the fields that start with the prefix
Label are treated properties to be visually associated to the SKU within Phantomscan. For example,
LabelSupplierName could be used to list the name of the supplier. Multiple labels can be used, just replace
Foo with a relevant suffix. This mechanism is intended to offer flexibility in the information to be exposed by Phantomscan.
||Identifier of the first level of hierarchy associated to the item. The name of the hierarchy itself is hold by the suffix. For example,
H1SuperFamily could be introduced to a top level hierarchy named
SuperFamily. Most companies have their own hierarchical terminologies, Phantomscan adapts to your conventions.
H1Foo for nested levels (sub-categories).
H1Foo for nested levels (sub-sub-categories).
Format of Phantomscan_InventoryEvents_YYYY-MM-DD.tsv
The file represents inventory movement events (e.g.: item sold in store, item shipped, item lost), within a certain date range. Unlike the SKU files (which are mostly redundant snapshots), each event file represents a portion of history which will be combined by Phantomscan with the other event files.
Unlike details about SKUs, inventory movements are nearly always properly historically captured within the software. Hence, instead of dealing with snapshots, Phantomscan imports inventory events as recorded over defined periods of time.
Phantomscan applies the following interpretations to the event files: an event file is supposed to contain all events since the end of the previous file
. Let's illustrate the situation with an example. Let's assume that the repository contains 4 files:
Then, we have the following observations:
- As being the early file present in the repository, the file of December 1st might contain all the event history going back several years in the past.
- The file of December 3rd contains all events from December 2nd and December 3rd.
- The file of December 7th contains all events from December 4th until December 7th.
The date ranges covered by the event files should not overlap.
TSV fields for Inventory Events
Inventory events are represented as rows in TSV file. Phantomscan supports several distinct event types. Move types (in and out) represent individual stock changes, usually when a sale, shipment or replenishment occurs, where the quantity that leaves or enters the inventory is precisely known. Count events represent manual inventory counts when the exact inventory becomes precisely known. Two fields are mandatory and should be present for every event:
||Required. The date and time of the event entry.
||Required. The type of the event. Allowed values:
The specific details for each event type are described below. Each event type has its own fields with specific meanings, some of which are mandatory. TSV columns which are not used by a specific event type should be left blank on the corresponding lines.
) inventory event represents cases where a specific quantity of a certain item enters (resp. leaves) the inventory.
||Required. The quantity of items involved. For example, if 2 units are sold, then an
out event with a quantity of
2 should be issued.
||Required. The item involved in the inventory movement. The inventory event will be associated with the SKU that contains that item at the specified location.
||The location where the event took place. If this column is missing or blank, a
"default" location is assumed.
event represents situations when a manual inventory count was performed for a given SKU. This establishes the current stock level for the SKU with certainty.
||Required. The exact current quantity for the SKU, as determined by the manual count. For example, if a manual count determines that there are 11 units in stock, a
count event with a quantity of
11 should be issued.
||Required. The SKU that underwent manual counting.
FAQ (Frequently Asked Questions)
Difference between ItemId and SkuId?
In a store, sales are fundamentally recorded leveraging the barcode
information. However, over time, the barcode of a product might change. Then, since the two products (old and new) can be extremely similar, a single SKU typically aggregates the two products; that is, the store manager manages the two products as if they were a single product. Phantomscan operates its aggregation at the SKU level, but by supporting the notion of ItemId
, one does not need to rewrite the history (ake inventory events) with the new barcode identifiers in case of barcode evolution; only the SKU file needs to be refreshed with the latest mappings.