Scoring
For the purposes of the GO Competition, a power system network model includes network structure, equipment detail and limits of control equipment available (including generators, loads, capacitor banks, LTC taps, etc.).
A scenario is variable input data on that power system network model defining each snapshot in time for that model (defining instantaneous power demand, renewable generation, generator and line availability, etc.).
For more details on what constitutes a power system network model and a scenario, please see the GO Competition detailed modeling framework.
Each of the two main competition phases (Phase 1 and Phase 2) will use four unique datasets: an Original Dataset, two Trial Datasets, and a Final Dataset (for more information, see the GO Competition timeline). In Phase 0, several unique datasets have been posted to the competition portal, each with a unique power system network model and 10 or 100 scenarios (for more information, see the Dataset section).
Each algorithm submitted for evaluation against a dataset is run independently against the network models and scenarios in the dataset. A scenario score is calculated for each scenario in a power system network model. A power system network model score is then calculated by taking the geometric mean of all scenario scores for that network model. A final dataset score is computed by taking the geometric mean of all power system network model scores in a given dataset.
For scenario $i$ on power system network model $A$, we refer to the scenario score as $s_{A,i}$. Three elements influence $s_{A,i}$: the base-case objective value (denoted $c_{A,i}$), the solution feasibility, and the base-case runtime. Base-case runtime is not scored continuously; instead, the score depends on whether the solution is returned within a scenario-dependent time limit, $t_{A,i}$. A solution is considered infeasible if it a) violates constraints or contingencies, or b) does not return a result within the system time-out. For more information on constraint violations, see the section below.
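The scenario score is determined as:
$$s_{A,i} = \begin{cases} c_{A,i} & \text{if the solution is feasible and returned within } t_{A,i} \\ x_{A,i} & \text{if the solution is feasible but returned after } t_{A,i} \\ y_{A,i} & \text{if the solution is infeasible} \end{cases}$$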
A lower score is desirable.
The time limit $t_{A,i}$, the time violation penalty $x_{A,i}$, and the constraint violation penalty $y_{A,i}$ are defined in the section below.
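The scoring rule can be sketched in Python as follows. This is only an illustrative sketch, not the competition's Java scoring code: the function name `scenario_score` and its parameter names are hypothetical, and it assumes that infeasibility (including a time-out, represented here as `runtime=None`) takes precedence over the runtime check, as the worked example below suggests.

```python
from typing import Optional


def scenario_score(
    objective: Optional[float],
    feasible: bool,
    runtime: Optional[float],
    t_limit: float,
    x_penalty: float,
    y_penalty: float,
) -> float:
    """Return the scenario score s_{A,i}; a lower score is better.

    Assumption: an infeasible solution or a time-out (runtime=None)
    scores y_penalty regardless of how long the run took.
    """
    if runtime is None or objective is None or not feasible:
        return y_penalty  # infeasible or timed out
    if runtime > t_limit:
        return x_penalty  # feasible, but returned after t_{A,i}
    return objective  # feasible and on time: score is c_{A,i}
```

For instance, `scenario_score(8000.0, True, 3.0, 5.0, 9e6, 9e7)` returns the objective value unchanged, while the same call with `feasible=False` returns the `y_penalty` argument.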
Suppose a dataset has two power system network models, A and B: network A contains two scenarios, and network B contains three scenarios. Consider an algorithm evaluated on this dataset with the following results:
Power System Network | Scenario | Objective Function | Feasible? | Time | Scenario Score |
A | 1 | $c_{A,1}$ | Y | $\le t_{A,1}$ | $c_{A,1}$ |
A | 2 | $c_{A,2}$ | N | $\le t_{A,2}$ | $y_{A,2}$ |
B | 1 | $c_{B,1}$ | Y | $> t_{B,1}$ | $x_{B,1}$ |
B | 2 | $c_{B,2}$ | Y | $> t_{B,2}$ | $x_{B,2}$ |
B | 3 | N/A | N/A | Time-out | $y_{B,3}$ |
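The power system network model score for A is then:
$$\sqrt{c_{A,1} \cdot y_{A,2}}$$
and for B is:
$$\sqrt[3]{x_{B,1} \cdot x_{B,2} \cdot y_{B,3}}$$
The total dataset score is:
$$\sqrt{\sqrt{c_{A,1} \cdot y_{A,2}} \cdot \sqrt[3]{x_{B,1} \cdot x_{B,2} \cdot y_{B,3}}}$$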
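The two-level geometric mean used in this example can be sketched in Python. The helper names `geometric_mean` and `dataset_score` are hypothetical, and the numbers in the usage note are made up purely for illustration; they are not competition data.

```python
import math
from typing import Dict, List


def geometric_mean(values: List[float]) -> float:
    """Geometric mean of positive values, computed in the log domain
    so that products of large penalty scores do not overflow."""
    if not values:
        raise ValueError("at least one value is required")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def dataset_score(scenario_scores: Dict[str, List[float]]) -> float:
    """Geometric mean of the per-model geometric means.

    scenario_scores maps a network model name to its scenario scores.
    """
    model_scores = [geometric_mean(s) for s in scenario_scores.values()]
    return geometric_mean(model_scores)
```

With the illustrative scores `{"A": [100.0, 10000.0], "B": [10.0, 100.0, 1000.0]}`, the model scores are approximately 1000 for A and 100 for B, and the dataset score is the square root of their product. Note that each network model contributes equally to the dataset score, regardless of how many scenarios it contains.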
In the absence of a time-out, the returned solution is considered feasible if its relative constraint violation is less than an infeasibility threshold for each constraint (this includes violations of bulk power flow). Any solution with a violation larger than the threshold is considered infeasible. In scenario $i$, the relative constraint violation $CV_k$ for constraint or bound $k$ is calculated according to the following table. The symbol $[\,\cdot\,]^{+}$ is defined by $[a]^{+} = \max(a, 0)$.
Constraints | $\mid b_k \mid > 1$ | $\mid b_k \mid \le 1$ |
Inequality constraint $g_k(x) \le b_k$ | $CV_k = [g_k(x) - b_k]^{+} / \mid b_k\mid$ | $CV_k = [g_k(x) - b_k]^{+}$ |
Equality constraint $g_k(x) = b_k$ | $CV_k = \mid g_k(x) - b_k \mid / \mid b_k\mid $ | $CV_k = \mid g_k(x) - b_k \mid$ |
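The table above can be expressed as a short Python sketch. The function names `relative_violation` and `is_feasible` are hypothetical, and the strict `<` comparison follows the wording "less than an infeasibility threshold"; how the official scorer treats a violation exactly equal to the threshold is not stated here.

```python
from typing import Iterable


def relative_violation(g_value: float, bound: float, equality: bool = False) -> float:
    """Relative constraint violation CV_k for g_k(x) <= b_k or g_k(x) = b_k."""
    if equality:
        raw = abs(g_value - bound)        # |g_k(x) - b_k|
    else:
        raw = max(g_value - bound, 0.0)   # [g_k(x) - b_k]^+
    # Scale by |b_k| only when |b_k| > 1; otherwise use the absolute violation.
    return raw / abs(bound) if abs(bound) > 1 else raw


def is_feasible(violations: Iterable[float], threshold: float) -> bool:
    """A solution is feasible if every CV_k is below the threshold."""
    return all(cv < threshold for cv in violations)
```

For example, an inequality constraint with $b_k = 100$ and $g_k(x) = 105$ gives a relative violation of 5/100, whereas with $b_k = 0.5$ and $g_k(x) = 0.6$ the violation is the unscaled difference of about 0.1.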
$t_{A,i}$, $x_{A,i}$, and $y_{A,i}$ are determined as follows:
$t_{A,i} = t_{nom., A} \times$ (time scale)
$y_{A,i} = c_{nom., A} \times$ (constraint violation penalty scale)
$x_{A,i} = c_{nom., A} \times$ (time violation penalty scale)
$t_{nom., A}$ is referred to as the nominal time value. For an individual power system network model, it is chosen by taking the largest run time over all scenarios, as determined by a GAMS reference algorithm, and rounding it up to a single digit. Since the value undergoes significant rounding, the accuracy of the underlying evaluation need not be high. For the Phase 0 IEEE 14-bus dataset, the largest time was from Scenario 22, which took 0.303 seconds for the reference evaluation, so rounding up to the nearest whole digit gives a nominal time of 1 second.
Similarly, $c_{nom., A}$, referred to as the nominal objective value, is chosen by taking the largest objective function value over all scenarios, as determined by the same reference algorithm, and rounding it up to a single digit. For the IEEE 14-bus network, this was 82,108.73 for Scenario 89; rounding up to a single digit gives a nominal objective value of 90,000.
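One way to reproduce the two rounding examples above is sketched below. This is an interpretation, not the documented procedure: the function name `nominal_value` is hypothetical, and the sketch assumes that values of 1 or less round up to 1 (matching 0.303 seconds becoming 1 second) while larger values round up to one leading digit (matching 82,108.73 becoming 90,000).

```python
import math


def nominal_value(largest: float) -> float:
    """Round a positive reference value up to a single leading digit.

    Assumption: values at or below 1 round up to 1, as in the
    nominal-time example; this rule is inferred, not specified.
    """
    if largest <= 0:
        raise ValueError("reference value must be positive")
    if largest <= 1:
        return 1.0
    magnitude = 10 ** math.floor(math.log10(largest))  # e.g. 10000 for 82108.73
    return float(math.ceil(largest / magnitude) * magnitude)
```

Calling `nominal_value(0.303)` and `nominal_value(82108.73)` yields the nominal time and nominal objective quoted for the IEEE 14-bus dataset.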
The values for the nominal time, nominal objective, the time scale, the constraint violation penalty scale, the time violation penalty scale, and the maximum infeasibility threshold may vary with each dataset. The values used for a submission are found in the scorepara.csv file included in the zip file downloaded with each dataset. This information is also included in the scenario_results.csv file included in the tar.gz file returned after each submission as a way of verifying the parameters used to score a specific submission.
For the IEEE 14-bus system, a time scale value of 5 was chosen arbitrarily, as large enough to give a reasonable algorithm sufficiently more time than the reference to complete. In this case, the time violation penalty scale is an order of magnitude smaller than the constraint violation penalty scale, so algorithms that do not return a valid solution are penalized more harshly (this also makes it relatively simple to deduce which penalty was imposed by inspecting the scenario score). The infeasibility threshold is also somewhat arbitrary, though consistent with standard numerical precision. The scoring algorithm Java source code will be made publicly available at the end of the competition.
The scorepara.csv file for the IEEE 14 bus dataset is:
Data set,Phase_0_IEEE14
Nominal time,1.
Nominal objective,90000.
Time scale,5
Constraint violation penalty scale,1000
Time violation penalty scale,100
Max infeasibility,1e-6
Number of scenarios,100
For this dataset, an algorithm evaluated on a scenario that took longer than 5 seconds ($1 \times 5$) would receive a scenario score of 9e+6 ($90{,}000 \times 100$). An algorithm evaluated on a scenario with a $CV_k$ greater than 1e-6 would receive a scenario score of 9e+7 ($90{,}000 \times 1000$).
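The arithmetic above can be checked with a short Python sketch that reads the parameter file and derives the time limit and the two penalties. The function names `load_score_parameters` and `derived_values` are hypothetical, and the file contents are embedded as a string here for self-containment; in practice they would be read from the scorepara.csv file shipped with the dataset.

```python
import csv
import io
from typing import Dict

# Contents of scorepara.csv for the IEEE 14-bus dataset, as listed above.
SCOREPARA_TEXT = """\
Data set,Phase_0_IEEE14
Nominal time,1.
Nominal objective,90000.
Time scale,5
Constraint violation penalty scale,1000
Time violation penalty scale,100
Max infeasibility,1e-6
Number of scenarios,100
"""


def load_score_parameters(text: str) -> Dict[str, str]:
    """Parse the two-column key,value file into a dictionary of strings."""
    rows = csv.reader(io.StringIO(text))
    return {row[0].strip(): row[1].strip() for row in rows if len(row) >= 2}


def derived_values(params: Dict[str, str]) -> Dict[str, float]:
    """Compute the time limit t and the penalties x and y from the parameters."""
    t_nom = float(params["Nominal time"])
    c_nom = float(params["Nominal objective"])
    return {
        "t": t_nom * float(params["Time scale"]),
        "x": c_nom * float(params["Time violation penalty scale"]),
        "y": c_nom * float(params["Constraint violation penalty scale"]),
    }


values = derived_values(load_score_parameters(SCOREPARA_TEXT))
print(values)
```

Because the two penalty scales differ by a factor of ten, a scenario score of 9e+6 indicates a time violation and 9e+7 indicates an infeasible solution or time-out.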
This scoring methodology is for Phase 0 only and may be changed before the launch of Phase 1. Please refer to the GO Competition website frequently for scoring updates.
Note: Please contact the GO Operations Team if you see any significant discrepancies between your time or objective function values and those provided by the evaluation platform. Any discussions related to the scoring methodology should be posted on the Scoring Forum.