Envision provides an implicit mechanism to join tables whenever those tables contain an
(item index) field; as it is frequently the case to represent sales history or purchase history. However, this implicit mechanism - called natural joins
is not suitable for more complex situations. Thus, Envision offers a more general join mechanism referred to as the left-by
. Through the left-by, it's possible to perform arbitrary joins over tables.
An illustrating script
On this page, we use the sample data set
which should be available from the
path in your Lokad account. If it's not already done, we suggest first to read the entry reading files with Envision
, with a closer look at the file type expectations
as this section is very relevant to better understand the following script.
read "/sample/Lokad_Orders.tsv" as Orders
read "/sample/Lokad_Suppliers.tsv" as Suppliers[*]
Moq by Supplier = same(Suppliers.Moq) by Suppliers.Supplier
show table "Item list" a1d4 tomato with Id, Name, Supplier, Moq
The first three lines are our usual statements for reading files. On line 3, the type expectation
specifies that the suppliers table isn't getting any type expectations - it's an independent table with no primary key and no foreign key. Indeed the file
- unlike all the other
files of the folder - does not contain an
column. As this column is missing, Envision cannot implicitly join this table with the Items
table. As a matter of fact, the supplier's table is aggregated per supplier
with one supplier per line, while all the other files are detailed per item
On line 5, we leverage the left-by
construct to perform a join
table and the
table. We could have similarly written:
Items.Moq by Items.Supplier = same(Suppliers.Moq) by Suppliers.Supplier
However, per Envision's usual conventions, the
table name is omitted. This statement is referred to as a left-by
keyword appears on the left side of the assignment, which is somewhat peculiar, as the
keyword usually appears on the right side of the assignment along with an aggregator.
In this specific example, we have populated a vector
with the data obtained from the table
. A join has been made between the two fields
In the case that a supplier entry found in the
table does not have a counterpart in the
table, a zero value will be used. However, through the optional
keyword, following the usual behavior of the Envision aggregators, other default values could be used. We could have written:
Items.Moq by Items.Supplier = same(Suppliers.Moq) or 0 by Suppliers.Supplier
On line 6, the resulting
table is displayed in order to illustrate the calculation performed by the left-by the line above. As expected, each item is associated with the MOQ value of its respective supplier.
General syntax for the left-by
The general syntax for the left-by is:
T1.A by [T1.X, T1.Y, T1.Z] = agg(T2.B) or T1.C by [T2.X, T2.Y, T2.Z]
Much like the usual aggregators in Envision, the left-by offers the possibility to perform a tuple
match, that is, matching n
fields at a time. Then, all the usual Envision aggregators can be used:
, but also
, etc. The
block can be omitted just like it is done with regular aggregation.
The semantic of the left-by is as follows:
- Build all distinct groups of tuples on the right
[T2.X, T2.Y, T2.Z]. For each one of those groups, if at least one matching group exists on the left, compute the aggregate
- Build all distinct groups of tuples on the left
[T1.X, T1.Y, T1.Z]. For each one of those, assign the matching aggregate from the right.
- For the groups on the left that don't have a matching group on the right, then fall back on the default
Just like regular aggregators, the
statement is optional, and provides the value to be used as a result of the aggregation if the group is empty.
Tips for the left-by
The left-by is a powerful construct that can be leveraged in many situations, not just to join tables from the usual SQL perspective.A table can be left-joined to itself:
Orders.DaySum by Orders.Date = sum(Orders.Quantity) by Orders.Date
illustrates how to compute daily totals over the
table without resorting to the
table.Calendar tables Day and Week can be left-joined
: For example,
Day.Shift by [Day.Id, Day.Date] = sum(Orders.Quantity) by [Orders.Id, Orders.Date - 1]
illustrates how to shift the quantities from one day through a left-by.
Translating the left-by into SQL
For the readers who would already be familiar with SQL, then it can be noted that the Envision expression:
Moq by Supplier = same(Suppliers.Moq) or defaultMoq by Suppliers.Supplier
has the following equivalent in SQL:
LEFT OUTER JOIN Suppliers ON Items.Supplier = Suppliers.Supplier
SET Moq = COALESCE(Suppliers.Moq, defaultMoq)
The Envision syntax emphasizes Excel-like calculations more than the relational algebra itself.