Time series calculations with Envision

Time-series calculations












Home » Resources » Here

Most business operations of any given commerce can be accurately represented with time-series: historical sales, historical purchase orders, historical prices, etc. Since working with time-series is of primary importance for companies, Envision provides extensive native support for this type of scenarios. In particular, Envision allows to aggregate data per day, week and month. Envision also supports more complex time-series analysis such as lagging data, or calculating moving averages. These features are illustrated and documented in this page.

An illustrating example

Once more, we use our sample dataset to illustrate the capabilities of Envision. As this page represents a slightly more advanced usage of Envision, we suggest reading Making calculations and Aggregating data before proceeding with this page.
read "/sample/Lokad_Items.tsv"
read "/sample/Lokad_Orders.tsv" as O
read "/sample/Lokad_PurchaseOrders.tsv" as PO

show label "Time-series calculations" a1f1 tomato

end := max(date)
lastMon := monday(end)

Week.sold := sum(O.NetAmount)
when date >= end - 52 * 7
  show linechart "Weekly sales" a2f4 tomato unit:"$" with 
    Week.sold as "This year"
    Week.sold[-52] as "Last year"

Week.ma := sum(O.NetAmount / 4) over [-3 .. 0]
when date >= end - 52 * 7
  show linechart "Weekly sales with 4 weeks moving average" a5f7 tomato unit:"$" with 
    Week.ma as "This year"
    Week.ma[-52] as "Last year"

Day.cashFlow := sum(O.NetAmount) - sum(PO.NetAmount)
Day.balance := avg(Day.cashFlow) over [-13 .. 0]
when date >= lastMon - 6 * 7 & date < lastMon
  show linechart "Cash flow over the last 6 weeks" a8f10 tomato unit:"$" with 
    Day.balance as "Balance"

PO.Price = PO.NetAmount / PO.Quantity
O.PurchasePrice = latest(PO.Price)
O.Cost = O.PurchasePrice * O.Quantity
O.Profit = O.NetAmount - O.Cost
Week.profitblty := sum(O.Profit) / sum(O.Cost) or 1
when date >= lastMon - 13 * 7 & date < lastMon
  show linechart "Profitability over the last 13 weeks" a11f13 tomato unit:"%" with 
    Week.profitblty
After this script is run with the sample dataset, the following dashboard is obtained.

Image

Virtual calendar tables

Since daily, weekly and monthly aggregations are ubiquitous, Envision was designed to natively support these periodic calendar patterns. More specifically, Envision benefits from three virtual tables named Day, Week and Month respectively. These tables are referred to as “virtual” because they do not have tabular file counterparts: or in other words, these tables only exist for the duration of the execution of the script. The script above leverages these virtual tables in order to display the line charts. Let’s review the relevant lines re-copied below.
Week.sold := sum(O.NetAmount)
end := max(date)
when date >= end - 52 * 7
  show linechart "Weekly sales" a2f4 tomato unit:"$" with 
    Week.sold as "This year"
    Week.sold[-52] as "Last year"
At line 1, the content of the O table is summed into the Week table. Since we are using the scalar assignment := here, there is a single value computed per week. At line 3, a filter is defined to exclude data that is more than 52 weeks old. Finally at lines 4 to 6, a line chart with two time-series displayed is defined. The second time-series Week.sold[-52] comes with a lag operator which is reviewed in the next section. This script can be easily modified for a daily or a monthly aggregation instead. For example, it is possible to add the following lines at the very end of the script:
Day.sold := sum(O.NetAmount)
show linechart "Daily sales" a14f16 tomato unit:"$" with
  Day.sold
Month.sold := sum(O.NetAmount)
show linechart "Monthly sales" a17f19 tomato unit:"$" with
  Month.sold
This block of code displays two more line charts, featuring time-series aggregated at the daily and the monthly levels respectively. The variable Day.sold can be interpreted as the content of the column named sold within the Day table, but it can also be interpreted as an equally spaced time-series of the “1 day” period – as opposed to the regular Envision tables that contain a Date column such as O and that can be interpreted as unevenly spaced time-series.

Displaying lagged time-series

In time-series analysis, “lagging” refers to the operation which consists of moving time-wise the elements of the time-series. One of the most elementary purposes of time-series lagging consists of comparing two distinct periods of time. Envision supports a lag operator precisely targeted at dealing with such situations. Let’s revisit the previous code snippet.
Week.sold := sum(O.NetAmount)
when date >= end - 52 * 7
  show linechart "Weekly sales" a2f4 tomato unit:"$" with 
    Week.sold as "This year"
    Week.sold[-52] as "Last year"
We have two time-series defined at lines 4 and 5. The first time-series Week.sold is the original weekly aggregated sales. The second series comes with an extra suffix [-52]. This suffix is the lag operator itself. It means that the data that goes back 52 weeks in the past is moved forward and finally displayed in the line chart. When the lag operator is applied to the Week table, then the lag argument is an integer expressed in weekly units. Similarly, the applicable units are days and months for the Day and Month tables respectively.

The lag operator benefits from the smart cooperative behavior of the filter operator when. In fact, without this cooperative behavior, the filter when would have already excluded all the data older than 52 weeks, and as a result, lagging the time-series of 52 weeks would have resulted in moving zeroes forward. Yet, as illustrated by the screenshot of the dashboard above, the lag operator is correctly moving the actual one-year old data forward, and not merely the zeroes. This behavior is achieved through the built-in cooperation between the when filter and the lag operator.

If you hover your mouse over the line chart within your dashboard, you should notice that dates and values get displayed. In particular, with the script at the top of this page, the dates reported for the time-series named “Last year” are indeed short of one year compared to the dates of the time-series named “This year”. However, it is important to note that the lag operator does not really preserve the original dates of the time-series. Instead, Envision simply uses a convention to make it work: if the lag operator is defined within the declaration of a line chart, then, and only then, the original dates remain preserved.

Let’s modify the script in order to apply the lag operator outside the linechart. This can be done by introducing a variable named Week.lastYear.
Week.sold := sum(O.NetAmount)
Week.lastYear := Week.sold[-52]
when date >= end - 52 * 7
  show linechart "Weekly sales" a2f4 tomato unit:"$" with 
    Week.sold as "This year"
    Week.lastYear as "Last year"//date display issue
If you run this modified script, and if you hover your mouse over the time-series dots, you should notice that both time-series are reported with the same dates. Indeed, in this case, there is no non-ambiguous semantic for the dates to be displayed. For example, the Week.lastYear time-series could have actually been interpreted as a year-to-year forecast; and as a result, having the same dates for the two time-series would have been what we were looking for in this case. In conclusion, if you wish to preserve the original dates in your line chart for side-by-side comparison of lagged time-series, then the lag operator should be defined within the show statement.

Aggregating data over a time window

In one of our previous guides, we have reviewed how to aggregate data with Envision. With time-series, there is another type of aggregation that is very desirable: the aggregation over a time window. The script at the top of this page illustrates how to compute a moving average of the sales over 4 weeks. The relevant lines of script are re-copied below for the sake of clarity.
Week.ma := sum(O.NetAmount / 4) over [-3 .. 0]
when date >= end - 52 * 7
  show linechart "Weekly sales with 4 weeks moving average" a5f7 tomato unit:"$" with 
    Week.ma as "This year"
    Week.ma[-52] as "Last year"
At line 1, an aggregation with the sum() aggregator is performed, and this aggregation comes with a statement starting with the keyword over at the end. At line 2, data is filtered specifically taking the last 52 weeks of data. Finally, at lines 3 to 5, two time-series are displayed within the line chart. Both time-series are “smoothed” as they are averaged over 4 weeks.

The operator over is used to define the applicable time window, and it is expected to be written as [a .. b] where a and b are integers with a being lower than or equal to b. The unit used for a and b depends on the expression on the left side of the assignment. In the present case, we have the Week table on the left side of the assignment, and as result, -3 and 0 are expressed in weeks.

Gotcha: between the week of index -3 and the week of index 0, there are 4 weeks, not 3 weeks. As a result, we have the following weeks of indices -3, -2, -1 and 0.
The option over can be used with all aggregators. When this option is used, the left side of the assignment is usually a virtual calendar table such as Day, Week and Month. However, this is not a requirement, and any table can be used as long as it is indexed by a Date. Also, by definition, when calendar tables are used, using over [0 .. 0] gives the same results as the default aggregation:
Week.sold := sum(O.NetAmount)
// same result!
Week.same := sum(O.NetAmount) over [0 .. 0]

More complex time-series aggregations

The syntax of Envision offers the possibility to perform more elaborate time-series calculations. For example, it is possible to compute time-series and then, to make further calculation based on those initial time-series. The third code block in the script at the top of this page illustrates this. The relevant lines of script are copied below.
Day.cashFlow := sum(O.NetAmount) - sum(PO.NetAmount)
Day.balance := avg(Day.cashFlow) over [-13 .. 0]
when date >= monday(end) - 42 & date < monday(end)
  show linechart "Cash flow over the last 6 weeks" a8f10 tomato unit:"$" with 
    Day.balance as "Balance"
At line 1, the time-series Day.cashFlow is defined as the difference between the total of sales and the total of purchases. At line 2, the time-series Day.balance is calculated as the moving average over 14 days of Day.cashFlow. At line 3, a filter with two specific conditions is defined: data should not be more than 7 weeks old – counting from last Monday, and data should not be “fresher” than last Monday. Here the use of the monday() function ensures that the scope includes exactly 6 whole weeks. Finally, the time-series Day.balance is displayed as the sole time-series of the line chart defined at lines 4 and 5.

The aggregation taking place at line 2 leverages the over option that we have detailed in the previous section. Here, the time-series is averaged over a time window of 14 days – indeed, from the index -13 to index 0 there are 14 distinct indices. This aggregation is somewhat unlike the previous aggregations we have seen so far because the table Day appears both on the left and the right sides of the assignment, while the more usual aggregations that do not have an over suffix typically aggregate data from one table into another table.

It is also possible to re-aggregate daily time-series into weekly time-series. The script below illustrates how this can be done to calculate a weekly cash flow instead of a daily one.
Day.cashFlow := sum(O.NetAmount) - sum(PO.NetAmount)
Week.balance := sum(Day.cashFlow / 2) over [-1 .. 0]
when date >= monday(end) - 42 & date < monday(end)
  show linechart "Cash flow over the last 6 weeks" a8f10 tomato unit:"$" with 
    Week.balance as "Balance"

Dealing with event-driven data

Event-driven data refers to a certain representation of the historical data that focuses on “changes”. For example, instead of having all the historical prices for every single day of the data history, it is much more practical to only collect the list of historical price changes: each price change gives a new price and a date, and it is assumed that the new price remains unchanged until a new price change is observed. Envision supports scenarios where the historical data is represented as a list of changes.

The sample dataset does not include a table with the historical prices; however, it is possible to approximate this data by looking at the past purchase transactions, and assuming that prices stay unchanged until the next transaction is observed. This is exactly what is taking place in the last block of script in the example at the top of this page. Let’s have a closer look at the relevant lines copied below.
PO.Price = PO.NetAmount / PO.Quantity
O.PurchasePrice = latest(PO.Price)
O.Cost = O.PurchasePrice * O.Quantity
O.Profit = O.NetAmount - O.Cost
Week.profitblty := sum(O.Profit) / sum(O.Cost) or 1
when date >= lastMon - 13 * 7 & date < lastMon
  show linechart "Profitability over the last 13 weeks" a11f13 tomato unit:"%" with 
    Week.profitblty
At line 1, the purchase price per unit is computed for every line of the PO table through a vector operation in the usual Envision style. At line 2, the latest function is used, and this function has a rather specific behavior: for every line of the O table – the table being assigned to – the latest function looks up the most recent PO.Price line available, no more recent that the O line being considered, and copies this value on the left side of the assignment. At lines 3 and 4, we have more calculations involving vectors and aggregations. Finally, at line 5, a filter is defined to restrict the scope of operations to the last 13 whole weeks, and within this filter block, a line chart is displayed, leveraging the Week.profitability time-series that has been previously calculated at line 4.

The latest function is the specific ingredient that supports the calculation being considered here. This function is precisely intended at capturing the semantic of an event stream where a value is assumed to be constant until it is overridden by a new event. In particular, it is also possible to “densify” purchase prices, calculating a price for every single day of the history with:
Day.PurchasePrice = latest(PO.Price)
In practice, the latest function can be used to deal with many situations such as stock-outs, promotions, product lifecycles, etc.