
Performance metrics


Measuring performance in robotics is less clear-cut and more multidimensional than in traditional machine learning settings. Nonetheless, to obtain reliable performance estimates, we assess submitted code on several episodes with different initial conditions and compute statistics on the outcomes. We denote by $\objective$ an objective or cost function to optimize, which we evaluate for every experiment. In the following formalization, objectives are assumed to be minimized.

In the following we summarize the objectives used to quantify how well an embodied task is completed. We will produce scores in three different categories:

Performance criteria (P)


Lane following (LF / LFV)


As a performance indicator for both the “lane following task” and the “lane following task with other dynamic vehicles”, we choose the speed $v(t)$ of the Duckiebot along the road (not perpendicular to it), integrated over time. This measures the distance traveled along the road per episode, where the time length of an episode is fixed. It rewards both faster driving and algorithms with lower latency. An episode means running the code from a particular initial configuration.

$$ \objective_{P-LF(V)}(t) = \int_{0}^{t} - v(\tau) \, d\tau $$

The integral of speed is taken over the duration of an episode, up to time $t = T_{eps}$, where $T_{eps}$ is the length of the episode.

The way we measure this is in units of “tiles traveled”:

$$ \objective_{P-LF(V)}(t) = \text{# of tiles traveled} $$
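
For concreteness, here is a minimal sketch of how this objective could be computed from a log of speed samples. The function name, sampling assumptions, and the nominal tile length are illustrative and not part of the official evaluator.

```python
import numpy as np

def lf_performance_objective(speeds, dt, tile_length=0.585):
    """Approximate the LF/LFV performance objective from sampled speeds.

    speeds: speeds v(t) along the road, sampled every dt seconds
    dt: sampling interval in seconds
    tile_length: nominal side length of one tile in meters (assumed value)
    """
    distance = float(np.sum(np.asarray(speeds)) * dt)  # integral of v(t) dt
    objective = -distance  # objectives are minimized, hence the minus sign
    tiles_traveled = distance / tile_length  # score reported in "tiles traveled"
    return objective, tiles_traveled

# Example: a 60 s episode sampled at 10 Hz at a roughly constant 0.2 m/s
obj, tiles = lf_performance_objective([0.2] * 600, dt=0.1)
```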

Autonomous mobility on demand (AMoD)


In an autonomous mobility-on-demand system a coordinated fleet of robotic taxis serves customers in an on-demand fashion. An operational policy for the system must optimize in three conflicting dimensions:

  1. The system must perform at the highest possible service level, i.e., at smallest possible wait times and smallest possible journey times.
  2. The system’s operation must be as efficient as possible, i.e., it must reduce its empty mileage to a minimum.
  3. The system’s capital cost must be as low as possible, i.e., the fleet size must be reduced to a minimum.

We consider robotic taxis that can carry one customer. To compare different AMoD system operational policies, we introduce the following variables:

\begin{align*}
d_E &= \text{empty distance driven by the fleet} \\
d_C &= \text{occupied distance driven by the fleet} \\
d_T = d_C + d_E &= \text{total distance driven by the fleet} \\
N &= \text{fleet size} \\
R &= \text{number of customer requests served} \\
w_i &= \text{waiting time of request } i \in \{1, \dots, R\} \\
W &= \text{total waiting time, } W = \sum_{i=1}^{R} w_i
\end{align*}
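
As a sketch, these quantities could be accumulated from a log of served trips as follows; the `Trip` structure and its field names are illustrative assumptions, not the simulator's API.

```python
from dataclasses import dataclass

@dataclass
class Trip:
    empty_distance: float     # distance driven to reach the customer (no passenger)
    occupied_distance: float  # distance driven with the customer on board
    waiting_time: float       # time the customer waited before pickup

def fleet_statistics(trips, fleet_size):
    """Aggregate d_E, d_C, d_T, N, R and W from a list of served trips."""
    d_E = sum(t.empty_distance for t in trips)
    d_C = sum(t.occupied_distance for t in trips)
    W = sum(t.waiting_time for t in trips)
    return {"d_E": d_E, "d_C": d_C, "d_T": d_E + d_C,
            "N": fleet_size, "R": len(trips), "W": W}
```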

The provided simulation environment is designed in the standard reinforcement learning setting: rewards are issued after each simulation step, and the (undiscounted) sum of all rewards is the final score. The higher the score, the better the performance.

For the AMoD task, there are three championships (sub-tasks), which constitute separate competitions. The simulation environment computes the reward value for each category and concatenates them into a vector of length 3, which is then communicated as feedback to the learning agent. The agent can ignore all but the entry of the reward vector corresponding to the category it wishes to maximize.
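
A minimal sketch of how an agent could accumulate its final score is shown below, assuming a generic gym-style environment loop (the exact simulator interface may differ); the championship index selects which entry of the reward vector to keep.

```python
def run_episode(env, policy, championship_index):
    """Sum the undiscounted rewards of the chosen championship over one episode."""
    obs = env.reset()
    score, done = 0.0, False
    while not done:
        action = policy(obs)
        obs, reward_vector, done, info = env.step(action)
        # reward_vector has length 3; keep only the chosen championship's entry
        score += reward_vector[championship_index]
    return score
```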

The three championships are as follows:

In the Service Quality Championship, the principal goal of the operator is to provide the highest possible service quality at bounded operational cost. Two negative scalar weights $\alpha_1\lt{}0$ and $\alpha_2\lt{}0$ are introduced. The performance metric to maximize is

\begin{align*} \mathcal{J}_{P-AMoD,1} = \alpha_1 W + \alpha_2 d_E \end{align*}

The values $\alpha_1$ and $\alpha_2$ are chosen such that the term $W$ dominates the metric. The number of robotic taxis is fixed at some fleet size $\bar{N} \in \mathbb{N}_{\gt{}0}$.

In the Efficiency Championship, the principal goal of the operator is to perform as efficiently as possible while maintaining the best possible service level. Two negative scalar weights $\alpha_3\lt{}0$ and $\alpha_4\lt{}0$ are introduced. The performance metric to maximize is

\begin{align*} \mathcal{J}_{P-AMoD,2} = \alpha_3 W + \alpha_4 d_E \end{align*}

$\alpha_3$ and $\alpha_4$ are chosen such that the term $d_E$ dominates the metric. The number of robotic taxis is fixed at some fleet size $\bar{N} \in \mathbb{N}_{\gt{}0}$.
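
Both championship metrics share the same weighted-sum form, so a single helper suffices as a sketch; the weight values below are placeholders chosen only so that the intended term dominates, not the official $\alpha_i$.

```python
def amod_metric(W, d_E, alpha_w, alpha_d):
    """Weighted AMoD metric J = alpha_w * W + alpha_d * d_E (to be maximized)."""
    return alpha_w * W + alpha_d * d_E

W, d_E = 1.2e4, 3.5e3  # placeholder episode totals

# Service Quality Championship: waiting-time term dominates.
J1 = amod_metric(W, d_E, alpha_w=-1.0, alpha_d=-0.01)

# Efficiency Championship: empty-distance term dominates.
J2 = amod_metric(W, d_E, alpha_w=-0.01, alpha_d=-1.0)
```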

Traffic law objective (T)


The following rule objectives encode the traffic laws that Duckiebots are supposed to abide by within Duckietown. The corresponding penalties apply to the embodied tasks (LF, LFV).

Comfort objective (C)


Lane following (LF, LFV)


In the single robot setting, we encourage “comfortable” driving solutions. We therefore penalize large angular deviations from the forward lane direction to achieve smoother driving. This is quantified through changes in Duckiebot angular orientation $\theta_{bot}(t)$ with respect to the lane driving direction.

To further encourage good driving behavior, we measure the per-episode median lateral deviation from the right-lane center line.

$$ \objective_{C-LF/LFV}(t) = \text{median}(\{d_{outside}\}), $$

where $\{d_{outside}\}$ is the sequence of lateral distances from the center line.
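
A minimal sketch of this computation, assuming the lateral distances from the center line are sampled at fixed intervals over the episode:

```python
import numpy as np

def comfort_objective(lateral_distances):
    """Per-episode median lateral deviation from the right-lane center line."""
    return float(np.median(np.asarray(lateral_distances)))

# Example: lateral distances (in meters) sampled during one episode
score = comfort_objective([0.02, 0.05, -0.01, 0.03, 0.04])
```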
