statement in Envision offers the possibility to load one or more files into a table. This page gathers the different options relating to the
The usual syntax for reading a file in Envision is as follows:
read "/sample/Lokad_Orders.csv.gz" as Orders with
"Order Date" as OrderDate : date
"Quantity" as Qty
The text which comes just after the
keyword is referred to as the path
. The path is intended to match the files that exist in the file storage associated with a given Lokad account.
The columns are not required to be explicitly listed after the
keyword. In this case, the columns are available within the Envision script as long as the column names are valid variable names (for example, no spaces allowed in the column name). The primary purpose of the keyword
is to rename both tables and columns.
option that follows the
is omitted, then, Envision reads this file as part of the items
Reading multiple files
It is possible to read multiple files at once through the use of the wildcard [
] in the specified path:
read "/sample/Lokad_Orders*.gz" as Orders
Multiple wildcards are authorized.
Filter one file among many
In order to restrict the set of files originally captured by the wildcard to a single file, the
statement supports 3 distinct modifiers:
min: only the first file within the set is retained, ordering is based on the file names.
max: only the last file within the set is retained, ordering is based on the file names.
latest: only the most recently updated file within the set is retained.
Using the above modifiers can be done as follows:
read max "/sample/Lokad_Orders*.gz" as Orders
Filter many files among many
In order to restrict the set of files originally captured by the wildcard to a single set of files that share the same folder, the
statement can use the percentage sign (
) - one at most - to indicate the folder match of interest.
read max "/%/Lokad_Orders*.gz" as Orders
statement will capture all the files that match the folder as restricted by the modifier. As a folder does not have a last updated
date in Lokad, only the
modifiers are allowed when the percentage sign is present. Conversely, when the percentage sign is present, one
modifier should be present too.
Multiple options are supported in relation to the
skip (number): Indicates how many lines should be skipped at the beginning of the file while the file is being read. If omitted, the default value is zero.
separator (text): Indicates the separator between cell values to be used when parsing the file. If omitted, Envision will use auto-detection heuristics to determine the most likely separator.
quotes (boolean): If
false is used, the extreme quotes found in a cell token are considered to be part of the token and not escaped. If omitted, the default value is
true and Envision escapes extreme quotes, as is usually done with CSV (comma-separated values) files.
encoding (text): The text encoding to be used while parsing the file. The accepted encoding values are:
latin1. If omitted, Envision will use auto-detection heuristics to determine the most likely encoding.
The script below illustrates how the above options can be specified:
read "/sample/Lokad_Orders.csv.gz" skip:0 separator:"\t" quotes:false encoding:"UTF-8"
Table type options
Envision offers the possibility to define the primary keys for a table, by listing them between parentheses right after specifying the table name. There are only 7 primary key combinations that are allowed:
read "a.csv" as A[*]
read "b.csv" as B[Id]
read "c.csv" as C[Id, *]
read "d.csv" as D[Date]
read "e.csv" as E[Date, *]
read "f.csv" as F[Id, Date]
read "g.csv" as G[Id, Date, *]
The table type option can be omitted. In this case, the default type applied is
[Id, Date, *]
Multiple options are also supported for each column to be read:
- The data type of the column, which can be
distribution. The data type is introduced after a colon (
:) that follows the column name.
date: the date format string
number: the number format string
The following script illustrates the usage of the column options:
read "/sample/Lokad_Orders.csv.gz" as Orders with
date:"dd-MM-yyy" as OrderDate : date
number:"1,000.00" as Qty
Specifying the columns is optional. Specifying the data type of a column is also optional even if the column is specified. When data type is not specified with the
statement, the column data type is inferred from usage within the Envision script.
Distributions are an advanced data type that is only supported by Lokad's Ionic file format
The number format string imitates the actual desired output for the numbers:
- The character after the initial 1 is the thousand separator. It can be omitted.
- The character before the last zeroes is the decimal separator. It can be omitted.
The format string for the
option can be built from the following tokens:
d: day of the month from 1 through to 31
dd: same as d but 0 prefixed
ddd: Abbreviated day of the week (i.e. Mon, Tue...)
dddd: Full day of the week (Monday...)
MM: Month number from 1 through to 12
MMM: Abbreviated name of the month (Jun)
MMMM: Full name of the month (June)
yy: Year number from 00 to 99; ignore the hundreds and thousands
yyyy: year with four digits