Filtering data with Envision - Inventory Optimization Software

Filtering data with Envision












Home » Resources » Here

Data filtering is the bread and butter of commerce analytics. Envision provides extensive support to achieve this. Filtering is supported by the where and when conditions, that can alternatively be used to filter either large blocks of script at once, or to filter individual statements. This page provides a hands-on introduction to these concepts.


An illustrating script

Once again, let’s start with the sample dataset which should be accessible from the path /sample in your Lokad account. The script below is moderately complex and illustrates some of the filtering patterns available in Envision. If you have not done so already, we advise you to read Making calculations with Envision first, before working with information on this page.
read "/sample/Lokad_Items.tsv"
read "/sample/Lokad_Orders.tsv" as Orders

show label "Filtering and Aggregating with Envision" a1f1 tomato

oend := max(Orders.Date)

when date > oend - 365
  LastYearQty = sum(Orders.Quantity)
  where StockOnHand + StockOnOrder > LastYearQty
    show table "Overstocked items, +1 year of stock" a2f3 tomato with 
      Id
      Name
      StockOnHand + StockOnOrder as "Stock"
      LastYearQty

  where Orders.NetAmount > 1000
    show table "Large transactions over $1000" a4f5 tomato with 
      Id
      Name 
      Orders.Date 
      Orders.Quantity
      Orders.NetAmount
      Orders.Client

when date >= monday(oend) - 52 * 7 & date < monday(oend)
  Week.sold := sum(Orders.NetAmount)
  show linechart "Sold by week{$}" a6f7 tomato with Week.sold
We suggest to start by copy-pasting this script into your Lokad account, and running it once in order to observe the resulting dashboard. If everything works correctly, you should see the dashboard illustrated below.

Image

Filter blocks

When one begins working on, for example, all historical data no older than last year, then most probably, there will be many calculations involved. In such a situation, the last year only filter should be applied to all such calculations. For this reason, with Envision, data filtering mostly happens through blocks of code. The script fragment below illustrates two nested filter blocks. A first block that starts on line 1 with a when condition, followed by a second block that starts on line 3 with a where condition.
when date > oend - 365
  LastYearQty = sum(Orders.Quantity)
  where StockOnHand + StockOnOrder > LastYearQty
    show table "Overstocked items, +1 year of stock" a2f3 tomato with
      Id
      Name
      StockOnHand + StockOnOrder as "Stock"
      LastYearQty
Envision is a language that is sensitive to whitespaces found at the beginning of each line. We refer to a sequence of lines that begin with the same count of whitespaces as a code block; and code blocks can be nested, i.e. a block can contain another block.

If it is the first time that you encounter such a programming pattern, this might feel a bit disconcerting. However, in practice, the Envision script editor gives you a lot of built-in support: when you press Enter at the end of a script line that contains a filter condition, the next line gets an automatic indentation with two extra whitespaces.

Developer’s note: Envision adopts a whitespacing pattern quite similar to the one used in Python. This approach fits well with Envision which also intends to provide a concise syntax to handle data. Since Envision has no loop and no branch, the degree of indentation rarely exceed 3 levels, which makes these whitespaces very manageable in practice.

A filter block starts with a condition – more details on that below – which can be true or false for every line of every table. Within the filter block, only the lines where the condition is true keep being present and all the other lines get filtered. As the name suggests, the when condition involves a temporal filtering, and it applies to all the tables indexed with a Date column. The where condition is usually used to filters the items, and when it does, the filter is applied to all the tables indexed with an Id column.

Once a block starts within an Envision script, it is as if all the table lines satisfying the filter were kept. Otherwise, everything else remains the same, and every piece of script that was possible to write outside the block can be moved within the block. In particular, it is possible to define another filter block within a block. This pattern is called nesting. In the example above, the where block starting at line 3 is nested within the when block that starts at line 1.

Composing a condition

A condition is an expression that can be evaluated as true or false. The notion of filtering is based on the composition of conditions which are evaluated with the input data and possibly with results obtained from intermediate calculations. Envision offers the possibility to compose rich conditions as illustrated with the following script:
when date >= monday(oend) - 52 * 7 & date < monday(oend)
  Week.sold := sum(Orders.NetAmount)
  show linechart "Sold by week{$}" a6f7 tomato with Week.sold
In this snippet, the and operator indicates that both expressions left and right of the operator need to be true. The logical operators supported by Envision are: and, or and not. In addition, numbers can be compared with the numeric operators == (equality), != (inequality), <= (lower than or equal), >= (greater than), < (lower than), > (greater than). The following snippet illustrates how these operators are combined and evaluated.
a := 1 > 10 // false
b := not a // true
c := a | b // true
d := a & b // false
e := 10 >= 3 | 5 > 7 // true
show table "Conditions" with a, b, c, d, e
When a filter block is nested within another filter bock, it is as if an and operator was used with the two conditions being placed on the left and right of the and operator.

In the example above, you might have noticed the presence of the monday(oend) syntax. This is a function call to the function named monday. For any date, this function returns the last Monday that happens no later (inclusive) than the date passed as argument. Hence, if a Monday is passed as argument, the function returns the same date.

This monday() function is used to define a time window composed of whole weeks (instead of partial weeks). Indeed, when performing a weekly aggregation, only whole weeks should be considered, otherwise, the first and the last data points might display puzzlingly low values which merely reflect that only a partial week was used.

Filtering tables by date

Envision offers extensive support to filter tables against temporal conditions. The script above filters all the lines that happen to be older than 1 year. Let’s have a closer look at the relevant piece of script.
oend := max(Orders.Date)
when date > oend - 365
  LastYearQty = sum(Orders.Quantity)
Envision internally treats dates as the integral number of days since January 1st, 2001. This offers numerous advantages including the possibility to perform arithmetic operations on dates. For example, subtracting 7 from a date gives the previous week, while subtracting 365 gives an (approximate) last year date.

The date keyword also comes with a special behavior. Implicitly, the date variable refers to all the tables that contain a Date column. As there is only one table involved here, namely the Orders table, it is as if we had written Orders.Date > oend - 365 instead. However, the date syntax is not only more concise, but it also applies to all the relevant tables at once. If instead of looking for the quantities sold, we would have been interested in the purchased quantities, this could have been written with:
read "/sample/Lokad_PurchaseOrders.tsv" as PO // upload of PurchaseOrders

// snipped

when date > oend - 365
  LastYearQty = sum(PO.Quantity)

Filtering items

Beyond filtering lines by date, Envision also offers the possibility of filtering the items, which is another frequent need whenever one seeks to exclude certain products, locations, categories, etc. The script at the top of this page illustrates how the scope can be restricted to items that are most qualified to be referred to as dead inventory. The relevant lines are specified below.
  where StockOnHand + StockOnOrder > LastYearQty
    show table "Overstocked items, +1 year of stock" a2f3 tomato with
      Id
      Name
      StockOnHand + StockOnOrder as "Stock"
      LastYearQty
The where keyword is followed by a condition. The condition applies to three vectors that implicitly belong to the Items table. Indeed, this is the only table that can be referred to without having the table name as a prefix in the variable name. For example, one needs to use Orders.NetAmount and not just NetAmount when referring to the lines in the Orders table. Here, the condition can be read as: include only the items where the sum of the stock-on-hand plus the stock-on-order is greater than the number of units sold last year.

Once a condition is defined for the Items table, every table that includes an Id column – that is, an identifier referring to the items in line with the Envision conventions – gets filtered similarly. This behavior is identical to temporal filtering introduced previously. Indeed, as one begins to filter items, it does not make sense to keep sales order lines or purchase order lines that are no longer attached to any item, as these items have been filtered out of the scope.

Filtering items impacts the definition scope of newly computed vectors. Let’s illustrate this with a tiny piece of script.
where StockOnHand > 5
  GreaterThanFive = "yes"
  show table "Hello" with Name, GreaterThanFive // CORRECT!
// from this line, we are outside the filter block
show table "Hello" with Name, GreaterThanFive // WRONG!
Line 5 is incorrect because the GreaterThanFive vector has only been defined for the lines where the StockOnHand > 5 condition was true. Thus, while this vector is correctly defined within the block, and thus can be used as illustrated by line 3, this vector cannot be used outside the filter block because some its values would be left undefined. This situation can be fixed by making sure that the vector is properly defined for all its item values, as illustrated below.
GreaterThanFive = "no"
where StockOnHand > 5
  GreaterThanFive = "yes"
  show table "Hello" with Name, GreaterThanFive // CORRECT!
// from this line, we are outside the filter block
show table "Hello" with Name, GreaterThanFive // CORRECT!
This snippet begins on line 1 with a proper definition of the vector GreaterThanFive across all items. This definition is revised on line 3 for a subset of the items. However, this revision does not change the fact that the GreaterThanFive vector is explicitly defined for all items, and as a result, the display on line 6 is now correct.

Filtering arbitrary tables

While filtering dates or items is very useful, sometimes, one requires to precisely filter one specific table. Such a task can also be accomplished with the where keyword. Let’s review the lines that illustrates this capability of Envision.
  where Orders.NetAmount > 1000
    show table "Large transactions over $1000" a4f5 tomato with
      Id
      Name
      Orders.Date
      Orders.Quantity
      Orders.NetAmount
      Orders.Client
Here, we are filtering the Orders table to exclude all the table lines that are lower than $1000. The lines that are not filtered out are displayed through show table on lines 2 and 3. The example illustrates how a single table can be filtered. This filter only impacts the Orders table and all the other tables are left untouched by such a condition.

If a vector associated to the Orders table is calculated within a filter block, then the access to this vector is restricted to the block itself. We have already observed this behavior for the items, now let’s take a look at how it applies to arbitrary tables as well.
where Orders.NetAmount > 1000
  Orders.LargeTxn = "yes"
  show table "Large transactions" with Name, Orders.LargeTxn // CORRECT!
  // last block line
// indentation decreased, we are outside the block
show table "Large transactions" with Name, Orders.LargeTxn // WRONG!
Since the Orders.LargeTxn vector is not defined for all the lines of the Orders table, only line 3 is correct, while line 5 is incorrect. Like we did for the previous example, this script can be fixed by properly defining a LargeTxn value for the entire Orders table. This can be achieved with the script below.
Orders.LargeTxn = "no"
where Orders.NetAmount > 1000
  Orders.LargeTxn = "yes"
  show table "Large transactions" with Name, Orders.LargeTxn // CORRECT!
  // last block line
// indentation decreased, we are outside the block
show table "Large transactions" with Name, Orders.LargeTxn // CORRECT!
As a rule of thumb, Envision tries to allow block “leaks” as much as possible: a vector calculated within a block can be used outside the block, as long as this usage does not violate the rule that all the vector values are explicitly defined when the vector appears on the right side of an assignment.

Syntactic sugars for filtering

The syntax pattern used by Envision filter-and-indent for filtering is concise and readable, but when multiple filters are involved, the indentation may become a bit hard to decipher. Thus, Envision provides a couple of syntactic sugar, that is, alternative syntaxes which happen to require fewer indentation. Those syntax are reviewed in this section.

Skipping indentations when multiple filters are used

Envision only requires 1 extra level of indentation per filter if filters are intended to be used separately. If only the most inner scope is of interest, then a single indentation level is needed, as illustrated by:
// each 'where' filter brings its own indentation level
where Orders.Quantity > 10
  where StockOnHand < 100
    show table "Filtered orders" with Orders.Quantity

// but when multiple filters are used, a single indentation is required
where Orders.Quantity > 10
where StockOnHand < 100 // no ident here!
  show table "Filtered orders" with Orders.Quantity
The second block has the same semantic than the first one, but it requires only a single indentation. More generally, the syntax is:
where A
  when B
    where C
      show table "Filtered by A, B and C" with X
// same as
where A
when B
where C
  show table "Filtered by A, B and C" with X

Merging filters with the keyword and

We have already seen that, in Envision, the Boolean operator AND was represented with the symbol &. However, Envision also offers a keyword and which provides a slightly different semantic:
// the two nested 'where' filters
where Orders.NetAmount > 1000 
  where StockOnHand > 10
    show table "Filtered transactions" with Name, Orders.Quantity

// can be rewritten as a single filter with 'and'
where Orders.NetAmount > 1000 and StockOnHand > 10
  show table "Filtered transactions" with Name, Orders.Quantity
The and keyword is strictly equivalent at nesting where filters. Through the and keyword, it is possible to introduce several filters in sequence, with only a single indentation level. More generally, we have:
where A
  where B
   where C
     // snipped

// can be rewritten
where A and B and C
  // snipped
In practice, the and keyword offers the possibility to merge multiple filters which are not intended to be used separately.

No-indent with the keyword keep

A frequent coding pattern consists of introducing filters at the very top of the script in order to restrict the data analysis to a specific scope. While the filter syntax of Envision works relatively well for this scenario, the entire Envision script ends-up written with one or two level of indentations. The keyword keep offers a way to remove those indentations:
// the 'where' filter introduces an indented block
where Orders.Quantity > 10
  // beginning of block
  show table "Inside the filter" with sum(Orders.Quantity)
  // end of block
show table "Outside the filter" with sum(Orders.Quantity)

// but when 'keep' is used, the filter applies without indentation
keep where Orders.Quantity > 10
show table "Inside the filter" with sum(Orders.Quantity)
The keyword keep should be placed before where or when, and indicates that the filter takes effect without an indentation. The filter keeps being active until the end of the scope.
where A
  keep where B
  show table "Filtered by A and B" with X
  // end of A and B filters
show table "Not filtered" with X
Thus, if the keep is placed on a script line without any indentation, then the filter applies until the very end of the Envision script.

Suffixed inline filters

So far, all the filters that we have observed where written as filter blocks. However, Envision also provides an alternative syntax, more compact, which is known as condition suffixes. Let’s go back to the calculation of the quantities sold last year.
when date > oend - 365
  LastYearQty = sum(Orders.Quantity)
This script can be rewritten as:
LastYearQty = sum(Orders.Quantity) when date > oend - 365
For readers familiar with relational databases, this syntax might appear to be somewhat closer to the way the where conditions are written in SQL. Within Envision, this syntax is primarily a syntactic sugar to avoid introducing 1-line blocks when there is only a single statement to be written within the block. Both where and when can be “suffixed” on the rightmost side of an assignment.

The aggregators can be filtered inline with when or where conditions. Envision also offers the possibility to add an else modifier on the condition. For example, it is incorrect to write:
oend := max(Orders.Date)
Week.sold := sum(Orders.NetAmount) when date < monday(oend)
show linechart "Sold by week" with Week.sold // WRONG
Because Week.sold is not defined on the entire scope, as it has been filtered at the line above. Yet, by adding an else option, we are correctly defining Week.sold everywhere, which can then be displayed:
oend := max(Orders.Date)
Week.sold := sum(Orders.NetAmount) when date < monday(oend) else 0
show linechart "Sold by week" with Week.sold // CORRECT