This page lists the aggregators supported by Envision. The aggregators are similar to functions, but they benefit from a richer syntax with options.
There are two main usages for aggregators, whether a join between tables is performed or not.
When there is no join, the general usage for an aggregator
u = agg(v) by [w1, w2, w3] sort [w4, w5] if c or x over [a .. b]
agg is the name of the aggregator.
by is an option indicating the groups to be used when aggregating. It is possible to aggregate against one or more values.
sort is an option indicating the sorting order to be used when aggregating. It is possible to sort against one or more values.
if is an aggregator filter. When present, all lines that are
false are omitted from the aggregation.
or is an option indicating the value to return if the group is empty.
over is an option used to define an aggregation time-span with boundaries expressed in days and relative to the date of the target line.
is not specified, the grouping happens according to the context, typically using the left-side of the statement in the case of an assignment. It is possible to list multiple variables after
; if only one variable is used, then the parentheses are not required.
option has typically no impact on most aggregators, e.g.
, because those operators do not depend on any sort order in the first place. However, we suggest to leverage this option with
is not specified, all lines are included which is equivalent to
is not specified, default values are returned when the grouping is empty:
(zero) for number,
for booleans, the empty string for text and the zero distribution for distributions.
Aggregation with table joins
The general syntax for this second case is similar but different to the one described above:
u = agg(v) by [w1, w2, w3] sort [w4, w5] if c at [z1, z2, z3] or x
option indicates the groups used when joining the tables. Note that there must be a one-to-one correspondence between these groups and the groups used in the
option. As a consequence, the
option is mandatory and the number of entries specified in the
and in the
options must be identical.
The most commonly used aggregators are:
avg: numeric argument only; returns the average line value.
count: counts the number of lines that are either
true (booleans) or non-zero (numbers) or non-empty (text).
distinct: counts the number of distinct values.
first: returns the value associated to the earliest line, as defined per the
last: returns the value associated to the latest line, as defined per the
max: returns the highest value.
median: behaves like the MEDIAN function is Excel.
min: returns the lowest value.
product: numeric argument only; returns the product of line values.
ratio: returns the percentage (between 0 and 1) of Booleans that are true, or of numbers that are non-zero. The ratio of an empty group is 1.
same: expects all the grouped values to be identical (fails otherwise) and returns this value.
sum: numeric argument only; returns the sum of line values.
The aggregators below returns a boolean value:
true if all the input lines are
true and return
false if all the input lines are
false and return
true if there is at least one line passed to the aggregator.
Those aggregators apply only to tables that benefit from a
latest (cross-table, special case): returns the most recent value observed.
latest is not compatible with any of the
The aggregators below provides statistical calculation:
entropy: returns the Shannon Entropy of the group. The value is returned expressed in shannons.
mode: similar to the MODE function in Excel.
percentile (two arguments, special case): similar to the PERCENTILE function in Excel. The second argument is percentile target between 0 and 1.
stdev: similar to the STDEV function in Excel.
stdevp: similar to the STDEV.P function in Excel.
Distribution can be generated through aggregation (see also algebra of distributions
ranvar: returns the empirical random variable associated to the numeric observations.
sumr: returns the sum of additive convolutions of random variables. Just like
sum() is the aggregator associated to the
+ operator, the aggregator
sumr() is associated to the