The Custom Expression Page in the Benchmarking Module gives you access to a feature called "Custom Expressions". The word "Expression" is our term for a textual representation of a benchmarking calculation. A simple expression might just be  DPA01 + DPA02, which would calculate each network's total circuit length, as DPA01 and DPA02 are the AER's codes for overhead and underground circuit lengths respectively. Expressions can also contain numeric values and functions as well as benchmarking data. The way expressions are used is not dissimilar to the concept of formulas in Excel, the major difference being that benchmarking codes are used instead of cell ranges.

Expressions are entered into the text box labelled "Expression". As you enter an expression, suggestions for benchmarking series will be displayed, and can be selected and inserted using the mouse. Below the form, a graphical representation of the expression will be shown; it uses colours and tool tips to identify valid benchmarking series and functions. A list of any errors will also be displayed, explaining what any issues with the expression are.

Benchmarking expressions can be very powerful; they have full access to all of the historical benchmarking data released by the AER, yet are simple to work with. All you need to do is provide a mathematical definition of the calculation you want to perform, and the results will be calculated when you view the Results tab.

There are several more advanced features possible, which are covered further in this article, such as unit analysis and aggregation.

Expression Components

Expressions can contain many types of components that are ultimately calculated together to produce a final set of results. These components include numbers, data, functions, and operations that all define how the results are calculated.

Benchmarking Series

Benchmarking series are the foundation of expressions; their usage determines what data is retrieved and the units of the results. A benchmarking series is ultimately a collection of data for a particular metric, such as line length, customer numbers, Opex, etc.  These series are are aware of each network, and each year data was published. In an expression, a benchmarking series is identified by its AER assigned code, or otherwise by a code that Bespoke Insights has applied to unlabelled data sets. For a comprehensive listing of all of the benchmarking data sets available, see our Benchmarking Data Explorer in the Benchmarking Module.

Mathematical Operations

Operations include Addition +, Subtraction -, Multiplication *, Division /, and Exponentiation ^. Furthermore, parentheses () can be used to control the order of operations more finely. Operations can be applied between any parts of an expression, such as numbers, benchmarking series, and aggregated series.

Functions

Functions can be applied to other expression components (Including the results of other functions), and produce either new data series, aggregated series, or numbers. Some functions are simply numeric transformations, and return the same shape of data that they were given. These functions include abs, log, and ln. Other functions perform aggregation on benchmarking series, and condense per-network, per-year benchmarking data into simpler forms that have just a single value for each network without any year details. These functions operate based on each network's historical data, and can be used to produce a wide range of results. The aggregation functions currently supported are sum, mean, median,std, min, and max.

Numbers

Numbers are the simplest component, and are simply numeric constants that can be useful for modifying more complex data structures in the expression.

Concepts

Aggregation

A raw benchmarking series contains data-points for each network and for each year benchmarking data has been made available. When an aggregation function is applied to a benchmarking series, each network's data is condensed to a single value, producing an aggregated series. Aggregated series can be the final result of an expression, or can be operated on as a part of larger expressions.

An example of using an aggregation could  be calculating the mean of each network's total line lengths. This can be performed by summing the overhead and underground line lengths, then taking the mean. The expression therefore needed is mean(DPA01 + DPA02). Note the results calculated are provided as one value for each network, and are displayed in a bar chart.

Aggregations can be used in more sophisticated expressions too. As aggregated series are derived per-network, they can be used in operations will full benchmarking series, and will operate with every year value. An example of this could be to compare the difference of each years minutes off supply with the median yearly off supply duration. For this, the expression would be EIDNSPR001 - median(EIDNSPR001).

Units

One of the more subtle but significant features of Benchmarking Expression evaluation is the analysis and validation of the units used. Each benchmarking series looked up has an associated unit definition. These can be units such as kilometres, power output, number of customers, and minutes, as well as units already in proportional form, such as power, minutes, or line length per customer. Because of units, benchmarking series may or may not be compatible with each other for basic subtraction and addition operations. We have already seen that series of the same units type can be added together just fine, however it does not make sense to sum different categories of units together; for instance adding the number of customers to a series of gigawatt hours for instance is meaningless and will not be calculated. 

Units can always be expressed as proportions however, such as ratios involving Opex per units of power, customers, and/or revenue.

Examples

Listed here are several examples to help you get started and see what is possible with  Custom Expressions. The simplest expressions can just be lookups of raw benchmarking data series, while complex expressions can transform and aggregate multiple datasets to produce the result.

Total Circuit Length

  • DPA01 + DPA02 
  • Units: Km

Opex per unit Energy Delivered

  • EIOPEX001 / DOPED01
  • Units: $'000 per GWh

Customers per unit Circuit Length

  • EIOPEX001 / (DPA01 + DPA02) 
  • Units: Customers per Km

Simulating 10% Opex increase

  • EIOPEX001 * 1.1 
  • Units: $'000

Simulating 10% Opex increase, per Unit Energy Delivered

  • (EIOPEX001 * 1.1) / DOPED01
  • Units: $'000 per GWh

Median Absolute Deviation (MAD) of Minutes Off Supply

  • median(abs(EIDNSPR001 - median(EIDNSPR001)))
  • Units: Minutes
Did this answer your question?