Performance metrics


Measuring performance in robotics is less clear-cut and more multidimensional than in traditional machine learning settings. Nonetheless, to achieve reliable performance estimates we assess submitted code on several episodes with different initial settings and compute statistics on the outcomes. We denote by $\objective$ an objective or cost function to optimize, which we evaluate for every experiment. In the following formalization, objectives are assumed to be minimized.

In the following we summarize the objectives used to quantify how well an embodied task is completed. We will produce scores in three different categories:

Performance criteria


Lane following (LF / LFV)


As a performance indicator for both the “lane following task” and the “lane following task with other dynamic vehicles”, we choose the speed $v(t)$ of the Duckiebot along the road (not perpendicular to it), integrated over time. This measures the distance traveled along the road per episode, where the time length of an episode is fixed. It encourages both faster driving and algorithms with lower latency. An episode here means running the code from a particular initial configuration.

$$ \objective_{P-LF(V)}(t) = \int_{0}^{t} - v(t) dt $$

The speed is integrated over the duration of an episode, up to time $t = T_{eps}$, where $T_{eps}$ is the length of the episode; the result is the distance traveled along the road.

The way we measure this is in units of “tiles traveled”:

$$ \objective_{P-LF(V)}(t) = \text{# of tiles traveled} $$
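For illustration, the following sketch approximates the integrated speed from sampled velocities and converts it to tiles traveled; the tile edge length and the sample data are assumed values, not the official evaluator’s.

```python
# Minimal sketch (not the official evaluator): approximate the integrated
# speed over an episode from sampled velocities and express it in tiles.
import numpy as np

TILE_SIZE_M = 0.585  # assumed tile edge length in meters (illustrative value)

def lf_performance(t: np.ndarray, v_along_lane: np.ndarray) -> float:
    """Distance driven along the lane over the episode, in tiles traveled.

    t            -- sample times in seconds, shape (N,)
    v_along_lane -- longitudinal speed along the lane in m/s, shape (N,)
    """
    distance_m = np.trapz(v_along_lane, t)  # discrete approximation of the integral
    return distance_m / TILE_SIZE_M

# Example: a 60 s episode sampled at 10 Hz with a constant 0.2 m/s lane speed.
t = np.linspace(0.0, 60.0, 601)
v = np.full_like(t, 0.2)
print(lf_performance(t, v))  # about 20.5 tiles
```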

Autonomous mobility on demand (AMoD)


In an autonomous mobility-on-demand system a coordinated fleet of robotic taxis serves customers in an on-demand fashion. An operational policy for the system must optimize in three conflicting dimensions:

  1. The system must perform at the highest possible service level, i.e., at smallest possible wait times and smallest possible journey times.
  2. The system’s operation must be as efficient as possible, i.e., it must reduce its empty mileage to a minimum.
  3. The system’s capital cost must be as low as possible, i.e., the fleet size must be reduced to a minimum.

We consider robotic taxis that can carry one customer. To compare different AMoD system operational policies, we introduce the following variables:

\begin{align*}
d_E &= \text{empty distance driven by the fleet} \\
d_C &= \text{occupied distance driven by the fleet} \\
d_T = d_C + d_E &= \text{total distance driven by the fleet} \\
N &= \text{fleet size} \\
R &= \text{number of customer requests served} \\
w_i &= \text{waiting time of request } i \in \{1, \dots, R\} \\
W = \sum_{i=1}^{R} w_i &= \text{total waiting time}
\end{align*}

The provided simulation environment is designed in the standard reinforcement learning framework: rewards are issued after each simulation step. The (undiscounted) sum of all rewards is the final score; the higher the score, the better the performance.

For the AMoD task, there are three championships (sub-tasks), which constitute separate competitions. The simulation environment computes the reward value for each category and concatenates them into a vector of length 3, which is then communicated as feedback to the learning agent. The agent can ignore all but the entry of the reward vector corresponding to the category that it wishes to maximize.
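As a minimal sketch of how an agent might accumulate its score under these assumptions (a gym-style environment returning a length-3 reward vector per step; `env`, `agent`, and `championship_index` are illustrative names, not the official AMoD API):

```python
# Minimal sketch, assuming a gym-style environment that returns a length-3
# reward vector per step; `env`, `agent` and `championship_index` are
# illustrative names, not the official AMoD API.
import numpy as np

def run_episode(env, agent, championship_index: int) -> float:
    """Undiscounted sum of the reward entry for the chosen championship."""
    obs = env.reset()
    score = 0.0
    done = False
    while not done:
        action = agent.act(obs)
        obs, reward_vec, done, info = env.step(action)
        # Ignore the other two entries of the reward vector.
        score += float(np.asarray(reward_vec)[championship_index])
    return score
```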

The three championships are as follows:

In the Service Quality Championship, the principal goal of the operator is to provide the highest possible service quality at bounded operational cost. Two negative scalar weights $\alpha_1\lt{}0$ and $\alpha_2\lt{}0$ are introduced. The performance metric to maximize is

\begin{align*} \mathcal{J}_{P-AMoD,1} = \alpha_1 W + \alpha_2 d_E \end{align*}

The values $\alpha_1$ and $\alpha_2$ are chosen such that the term $W$ dominates the metric. The number of robotic taxis is fixed at some fleet size $\bar{N} \in \mathbb{N}_{\gt{}0}$.

In the Efficiency Championship, the principal goal of the operator is to perform as efficiently as possible while maintaining the best possible service level. Two negative scalar weights $\alpha_3\lt{}0$ and $\alpha_4\lt{}0$ are introduced. The performance metric to maximize is

\begin{align*} \mathcal{J}_{P-AMoD,2} = \alpha_3 W + \alpha_4 d_E \end{align*}

$\alpha_3$ and $\alpha_4$ are chosen such that the term $d_E$ dominates the metric. The number of robotic taxis is fixed at some fleet size $\bar{N} \in \mathbb{N}_{\gt{}0}$.
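Both championship metrics are simple weighted sums of the episode totals $W$ and $d_E$. A hedged sketch of their computation is given below; the default weight values are placeholders (the organizers choose the actual $\alpha_i$ so that the intended term dominates).

```python
# Illustrative computation of the two championship metrics from episode totals.
# The default weights are placeholders: the organizers pick the actual alphas
# so that W (service quality) resp. d_E (efficiency) dominates the metric.
def service_quality_score(W: float, d_E: float,
                          alpha_1: float = -1.0, alpha_2: float = -0.01) -> float:
    """J_{P-AMoD,1} = alpha_1 * W + alpha_2 * d_E (waiting time dominates)."""
    return alpha_1 * W + alpha_2 * d_E

def efficiency_score(W: float, d_E: float,
                     alpha_3: float = -0.01, alpha_4: float = -1.0) -> float:
    """J_{P-AMoD,2} = alpha_3 * W + alpha_4 * d_E (empty distance dominates)."""
    return alpha_3 * W + alpha_4 * d_E
```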

Traffic law objective


The following is a list of rule objectives that the Duckiebots are supposed to abide by within Duckietown. All individual rule violations are summarized in one overall traffic law objective $\objective_{T}$. These penalties apply to the embodied tasks (LF, LFV).

Quantification of “Staying in the lane”


TODO: To be implemented

Picture depicting a situation in which the "staying-in-the-lane rule" applies.

The Duckietown traffic laws say:

“The vehicle must stay at all times in the right lane, and ideally near the center of the right lane.”

We quantify this as follows: let $d(t)$ be the absolute perpendicular distance of the center of mass of the Duckiebot body from the middle of the right lane, such that $d(t)=0$ corresponds to the robot being at the center of the right lane at a given instant. While $d(t)$ stays within an acceptable range, no cost is incurred. When the safety margin $d_{\text{safe}}$ is violated, cost starts accumulating proportionally to the square of $d(t)$, up to an upper bound $d_{max}$. If even this bound is violated, a lump penalty $\alpha$ is incurred.

The “stay-in-lane” cost function is therefore defined as:

$$ \objective_{T-LF}(t) = \int_0^{t} \begin{cases} 0 & d(t) \lt{} d_{safe} \\ \beta \, d(t)^2 & d_{safe} \leq d(t) \leq d_{max} \\ \alpha & d(t) \gt{} d_{max} \end{cases} \, dt $$

An example situation where a Duckiebot does not stay in the lane is shown in Figure 2.2.
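As an illustration, a minimal sketch of this piecewise cost accumulated over sampled time steps is shown below; the thresholds $d_{safe}$, $d_{max}$ and the weights $\beta$, $\alpha$ are placeholder values (the real ones are set empirically).

```python
# Minimal sketch of the stay-in-lane cost: a piecewise penalty on the lateral
# distance d(t), accumulated over the sampled episode. All constants below are
# placeholders; the real values are determined empirically.
import numpy as np

def stay_in_lane_cost(t: np.ndarray, d: np.ndarray,
                      d_safe: float = 0.05, d_max: float = 0.15,
                      beta: float = 1.0, alpha: float = 10.0) -> float:
    """d: absolute lateral distance from the center of the right lane (meters)."""
    penalty = np.where(d < d_safe, 0.0,
                       np.where(d <= d_max, beta * d**2, alpha))
    return float(np.trapz(penalty, t))  # approximate the integral over time
```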

Quantification of “Stopping at red intersection line” and “Stopping at red traffic light”


TODO: To be implemented or removed


There are two different possibilities that force the Duckiebot to stop at an intersection: some intersections have red stop lines, whereas others have traffic lights. However, the stopping behavior in both cases is similar and serves the same purpose. We therefore join the two cases into the “stopping at intersection” rule.

The Duckietown traffic laws say:

“Every time the vehicle arrives at an intersection with a red stop line, the vehicle should come to a complete stop in front of it, before continuing.”

Picture depicting a Duckiebot stopping at a red intersection line.

Likewise, the traffic law says that: “Every time the vehicle arrives at an intersection with a red traffic light, the vehicle should come to a complete stop in front of it, and shall remain at rest as long as the red light is turned on.”

During each intersection traversal, the vehicle is penalized by $\gamma$ if either of the above stopping rules is violated.

Let $\mathbb{I}_{SI1}$ denote the indicator function for a violation of the red-intersection-line stopping rule.

The red-line stopping rule is violated if there was no time $t$ at which the vehicle was at rest ($v(t) = 0$) inside the stopping zone, defined as the rectangular area of the same width as the red line, extending between $a$ and $b$ cm from the start of the stop line (distance measured perpendicular to the stop line, to the center of mass of the Duckiebot). This situation is demonstrated in Figure 2.4. The values of $a$ and $b$ will be determined empirically to ensure reasonable behavior.

$$ \mathbb{I}_{SI1} = \begin{cases} 1, & \nexists t \text{ s.t. } v(t)=0 \wedge p(t) \in \mathcal{S} \\ 0, & \text{otherwise} \end{cases} $$

The condition that the position $p(t)$ of the center of mass of the Duckiebot is in the stopping zone is denoted with $p(t) \in \mathcal{S}$.

Let $\mathbb{I}_{SI2}$ denote the indicator function for a violation of the red-traffic-light stopping rule.

$$ \mathbb{I}_{SI2} = \begin{cases} 1, & p(t) \text{ crosses the intersection} \wedge \text{the traffic light is red} \\ 0, & \text{otherwise} \end{cases} $$

Then we write the objective as the cumulative sum of stopping-at-intersection rule infractions. The sum is over all intersection time periods in which a rule violation may have occurred.

$$ \objective_{T-SI}(t) = \sum_{t_k} \gamma (\mathbb{I}_{SI1} + \mathbb{I}_{SI2}) $$

Here, the sum over time increments $t_k$ denotes the time intervals in which this condition is checked. The rule penalty is only applied once the Duckiebot leaves the stopping zone; only then is it clear that it did not stop within it.

To measure this cost, the velocities $v(t)$ are evaluated while the robot is in the stopping zone $\mathcal{S}$. An example of a Duckiebot stopping at a red intersection line is depicted in Figure 2.4.
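As a sketch of how the red-line part of this check could be evaluated for a single intersection traversal (the evaluator would supply the sampled speeds and the in-zone mask; `gamma` and the rest-speed threshold are placeholders):

```python
# Minimal sketch of the red-line stopping check for a single intersection
# traversal: the penalty is decided only after the Duckiebot has left the
# stopping zone. `gamma` and the rest-speed threshold are placeholders.
import numpy as np

def stop_line_penalty(v: np.ndarray, in_zone: np.ndarray,
                      gamma: float = 5.0, v_rest: float = 1e-3) -> float:
    """v: sampled speeds; in_zone: boolean mask, True while p(t) is in the zone."""
    if not in_zone.any():
        return 0.0  # the stopping zone was never entered
    stopped = np.any((v <= v_rest) & in_zone)  # was the robot ever at rest inside?
    return 0.0 if stopped else gamma
```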

Quantification of “Keep safety distance”


TODO: To be implemented


The Duckietown traffic laws say:

“Each Duckiebot should stay at an adequate distance from the Duckiebot in front of it, on the same lane, at all times.”

We quantify this rule as follows: let $b(t)$ denote the distance between the center of mass of the Duckiebot and the center of mass of the closest Duckiebot in front of it in the same lane. Furthermore, let $b_{\text{safe}}$ denote a cut-off distance beyond which a Duckiebot is deemed “far away”. Let $\delta$ denote a positive scalar weighting factor. Then

$$ \objective_{T-SD}(t) = \int_0^t \delta \cdot \max(0, b_{\text{safe}} - b(t))^2 \, dt. $$
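A minimal sketch of this penalty on sampled data follows; `b_safe` and `delta` are placeholder values.

```python
# Minimal sketch of the safety-distance penalty: cost grows quadratically once
# the gap b(t) to the Duckiebot ahead drops below b_safe. `b_safe` and `delta`
# are placeholder values.
import numpy as np

def safety_distance_cost(t: np.ndarray, b: np.ndarray,
                         b_safe: float = 0.3, delta: float = 1.0) -> float:
    """b: distance to the closest Duckiebot ahead in the same lane (meters)."""
    penalty = delta * np.maximum(0.0, b_safe - b) ** 2
    return float(np.trapz(penalty, t))
```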

Quantification of “Avoiding collisions”


TODO: To be implemented


The Duckietown traffic laws say:

“At any time a Duckiebot shall not collide with a duckie, Duckiebot, or object.”

Picture depicting a collision situation.

Collisions in Duckietown are generally not desired.

The vehicle is penalized by $\nu$ if, within a time interval of length $t_k$, the distance $\ell(t)$ between the vehicle and a nearby duckie, object, or other vehicle is zero or near zero. Here $\ell(t)$ denotes the perpendicular distance between any object and the Duckiebot’s rectangular surface. The collision cost objective therefore is

\begin{align*} \objective_{T-AC}(t) = \sum_{t_k} \nu \, \mathbb{I}_{\exists t' \in [t - t_k, t) \,:\, \ell(t') \lt{} \epsilon} \end{align*}

where $\nu$ is the penalty constant of the collision and $\epsilon$ is a small distance threshold below which the clearance is considered “near zero”.

Time intervals are chosen to allow for maneuvering after collisions without incurring further costs.

An illustration of a collision is displayed in Figure 2.6.
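A sketch of this interval-based accounting is shown below; `t_k`, `eps`, and `nu` are placeholder constants.

```python
# Minimal sketch of the collision objective: the episode is split into
# intervals of length t_k, and each interval in which the clearance l(t)
# drops below epsilon is charged the penalty nu exactly once.
# `t_k`, `eps` and `nu` are placeholder constants.
import numpy as np

def collision_cost(t: np.ndarray, l: np.ndarray,
                   t_k: float = 1.0, eps: float = 0.01, nu: float = 10.0) -> float:
    """l: clearance between the Duckiebot and the nearest object (meters)."""
    bins = np.floor(t / t_k).astype(int)  # interval index of every sample
    cost = 0.0
    for k in np.unique(bins):
        if np.any(l[bins == k] < eps):  # a collision occurred in this interval
            cost += nu
    return cost
```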

Hierarchy of rules


TODO: finalize this section


To account for the relative importance of rules, the factors $\alpha, \beta, \gamma, \delta, \nu$ of the introduced rules will be weighted relative to each other.

Letting $\gt{}$ here denote “more important than”, we define the following rule hierarchy:

$$ \objective_{T-AC} \gt{} \objective_{T-SI} \gt{} \objective_{T-SD} \gt{} \objective_{T-LF} $$

I.e.:

Collision avoidance $\gt{}$ Stop line $\gt{}$ Safety distance $\gt{}$ Staying in the lane.

This constrains the factors $\alpha, \beta, \gamma, \delta, \nu$ whose exact values will be determined empirically to enforce this relative importance.

While the infractions of individual rules will be reported, as a performance indicator all rule violations are merged into one overall traffic law objective $\objective_{T}$. Let $\task$ denote a particular task; then the rule violation objective is the sum of all individual rule objectives $\objective_i$ that belong to that particular task.

$$ \objective_{T} = \sum_i \mathbb{I}_{\objective_i \in \task} \objective_{T-i}, $$

where $\mathbb{I}_{\objective_i \in \task}$ is the indicator function that is $1$ if a rule belongs to the task and $0$ otherwise.
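The following sketch illustrates this masking; the mapping of rules to tasks is illustrative only (for example, the safety-distance rule is only meaningful when other vehicles are present).

```python
# Minimal sketch of the combined traffic-law objective: per-rule costs are
# summed only if the rule belongs to the task being evaluated. The mapping of
# rules to tasks below is illustrative, not the official definition.
RULES_PER_TASK = {
    "LF":  {"T-LF", "T-SI", "T-AC"},
    "LFV": {"T-LF", "T-SI", "T-SD", "T-AC"},
}

def traffic_law_objective(rule_costs: dict, task: str) -> float:
    """rule_costs maps rule names (e.g. 'T-LF') to their accumulated costs."""
    return sum(cost for rule, cost in rule_costs.items()
               if rule in RULES_PER_TASK[task])
```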

Comfort objective


Lane following (LF, LFV)


In the single robot setting, we encourage “comfortable” driving solutions. We therefore penalize large angular deviations from the forward lane direction to achieve smoother driving. This is quantified through the Duckiebot’s angular orientation $\theta_{bot}(t)$ with respect to the lane driving direction.

As a comfort objective, we measure the time-averaged squared angular deviation $|\theta_{bot}(t)|^2$ over the episode (the “good_angle” metric).

$$ \objective_{C-LF/LFV}(t) = \frac{1}{t} \int_0^t |\theta_{bot}(t)|^2 dt $$

As an additional indicator, we calculate the fraction of time the Duckiebot has a “good” angular heading, i.e., a valid direction (VD, the “valid_direction” metric).

$$ \objective_{VD-LF/LFV}(t) = \frac{1}{t} \int_0^t \mathbb{I}_{|\theta_{bot}(t)| \lt{} \theta_{good}} dt, $$

where $\theta_{good}$ corresponds to an angle of 20 degrees (converted to radians).
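Both indicators can be approximated from the sampled heading deviation, as in the sketch below (the sampling and integration scheme are assumptions, not the official evaluator).

```python
# Minimal sketch of the two comfort indicators, computed from the sampled
# heading deviation theta(t) (radians) relative to the lane direction.
import numpy as np

THETA_GOOD = np.deg2rad(20.0)  # 20 degrees, as stated above

def good_angle(t: np.ndarray, theta: np.ndarray) -> float:
    """Time-averaged squared heading deviation (lower is better)."""
    return float(np.trapz(theta**2, t) / (t[-1] - t[0]))

def valid_direction(t: np.ndarray, theta: np.ndarray) -> float:
    """Fraction of the episode with |theta| below 20 degrees (higher is better)."""
    good = (np.abs(theta) < THETA_GOOD).astype(float)
    return float(np.trapz(good, t) / (t[-1] - t[0]))
```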
