Simultaneous Computation with Multiple Prioritizations in Multi-Agent Motion Planning #
January 11, 2026
Multi-agent path finding (MAPF) in large networks is computationally challenging. An approach for MAPF is prioritized planning (PP), in which agents plan sequentially according to their priority. Albeit computationally efficient, PP yields a solution quality that strongly depends on the prioritization. Most prioritizations rely either on heuristics, which do not generalize well, or iterate to find adequate priorities, which costs computational effort. In this work, we show how agents can compute with multiple prioritizations simultaneously. Our approach is general as it does not rely on domain-specific knowledge. The context of this work is multi-agent motion planning (MAMP) with a receding horizon subject to computation time constraints. MAMP considers the system dynamics in more detail compared to MAPF. In numerical experiments on MAMP, we demonstrate that our approach to prioritization comes close to optimal prioritization and outperforms state-of-the-art methods with only a minor increase in computation time. We show real-time capability in an experiment on a road network with ten vehicles in our Cyber-Physical Mobility Lab.
Supplementary Materials #
| Code | github.com/embedded-software-laboratory/p-dmpc |
| Video | youtu.be/Mb59zQ3j3s0 |
1 Introduction #
When solving multi-agent planning problems, agents may share objectives or be constrained by each other. Traditional multi-agent path finding (MAPF) is a multi-agent planning problem in which agents move in an environment represented by a graph \(\graphMap =(\setVerticesMap ,\setEdgesMap )\), consisting of a set of vertices \(\setVerticesMap \), which are locations, and a set of edges \(\setEdgesMap \), which are paths between locations. The objective for each agent \(\anAgent \) is to move quickly from a start vertex \(s_{\anAgent } \in \setVerticesMap \) to a target vertex \(t_{\anAgent } \in \setVerticesMap \). The constraints for each agent are to avoid collisions with other agents. A collision occurs when two agents \(\anAgent \) and \(\anotherAgent \) at a time \(\timestep \) are located at the same vertex \(\aVertex \), represented by a tuple \(\tuple {\anAgent , \anotherAgent , \aVertex , \timestep }\), or when they traverse the same edge \(\edgeUndirected {\aVertex }{\anotherVertex }\in \setEdgesMap \), represented by a tuple \(\tuple {\anAgent , \anotherAgent , \aVertex , \anotherVertex , \timestep }\).
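The two conflict types can be checked directly on the agents' plans. The following Python sketch is purely illustrative (all names are ours, not from the article); it assumes each plan is a list of vertices, one per time step, and that an agent waits at its target vertex after arriving.

```python
def find_conflicts(plans):
    """Return vertex conflicts (i, j, v, k) and edge conflicts (i, j, u, v, k).

    plans: dict mapping agent id to a list of vertices, one per time step.
    """
    conflicts = []
    agents = sorted(plans)
    horizon = max(len(p) for p in plans.values())
    for k in range(horizon):
        for a in range(len(agents)):
            for b in range(a + 1, len(agents)):
                i, j = agents[a], agents[b]
                pi, pj = plans[i], plans[j]
                # agents wait at their target once their plan ends
                vi = pi[min(k, len(pi) - 1)]
                vj = pj[min(k, len(pj) - 1)]
                if vi == vj:  # both at the same vertex at time k
                    conflicts.append((i, j, vi, k))
                if k > 0:
                    ui = pi[min(k - 1, len(pi) - 1)]
                    uj = pj[min(k - 1, len(pj) - 1)]
                    if ui == vj and uj == vi:  # swap along the same edge
                        conflicts.append((i, j, ui, vi, k))
    return conflicts
```

A solution to the MAPF instance is a set of plans for which this check returns no conflicts.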
Solving a MAPF problem optimally is NP-hard (Yu, 2016; Yu & LaValle, 2013). Centralized approaches exploit the problem structure or lazily explore the solution space to increase their computational efficiency (e.g., Barer, Sharon, Stern, & Felner, 2021; Guo & Yu, 2024; Li, Ruml, & Koenig, 2021; Pan, Wang, Bi, Zhang, & Yu, 2024; Sharon, Stern, Felner, & Sturtevant, 2015; Yu, 2020). Prioritized planning (PP), introduced by Erdmann and Lozano-Pérez (1987), is a computationally more efficient method for MAPF. Instead of solving a MAPF problem optimally, PP divides the problem into multiple subproblems, and each agent solves only its subproblem. Since the subproblems are smaller in size compared to the centralized problem, the computation time is reduced. Additionally, agents can compute the solution in a distributed fashion.
1.1 Motivation #
While efficient, PP often produces suboptimal solutions and is not guaranteed to find a solution at all. In the example in Figure 1, out of the six possible prioritizations, only the two in which agent 2 has the lowest priority produce a solution. This is referred to as the incompleteness of PP: while some prioritizations can result in solutions, others may not (Ma, Harabor, Stuckey, Li, & Koenig, 2019). Given an underlying planning method, the prioritization determines the cost of the solution.
If we view prioritization and planning as a two-stage optimization, the solution given a prioritization can be seen as a local optimum. In general, for \(\numAgents \) agents, there exist up to \(\numAgents !\) prioritizations. Let \(\fcnObjectiveNcsA {\fcnPrio }\) denote the networked cost given a prioritization \(\fcnPrio \). In traditional MAPF, the networked cost is measured in terms of the flowtime, i.e., the sum of travel times of all agents, or the makespan, i.e., the maximum travel time of all agents.
The solution given the optimal prioritization is denoted as the global optimum in PP, which notably can be different from the optimal solution to the multi-agent planning problem. A core challenge in PP is finding a prioritization that results in a solution with a low networked cost.
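To make the two-stage view concrete, the following Python sketch brute-forces the outer stage: it enumerates all \(\numAgents !\) prioritizations, calls a prioritized planner (here an assumed black box `plan_with_priorities`, which may return `None` when PP fails), and keeps the prioritization with the lowest networked cost. This is only illustrative; the article's point is precisely that such sequential enumeration is intractable under real-time constraints.

```python
from itertools import permutations

def best_prioritization(agents, plan_with_priorities, networked_cost):
    """Brute-force the global optimum of PP over all N! prioritizations.

    plan_with_priorities: maps a priority dict to a solution or None.
    networked_cost: maps a solution to its cost (e.g., flowtime or makespan).
    """
    best, best_cost = None, float("inf")
    for order in permutations(agents):
        # lower rank = higher priority, matching the prioritization function
        priorities = {a: rank for rank, a in enumerate(order)}
        solution = plan_with_priorities(priorities)
        if solution is None:  # PP is incomplete: this prioritization failed
            continue
        cost = networked_cost(solution)
        if cost < best_cost:
            best, best_cost = priorities, cost
    return best, best_cost
```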
To approach the global optimum, traditional approaches rely on heuristics to reduce the networked cost (e.g., Chen et al., 2023; Kloock, Kragl, Maczijewski, Alrifaee, & Kowalewski, 2019; Scheffe, Dorndorf, & Alrifaee, 2022; Wu, Bhattacharya, & Prorok, 2020; Yao & Zhang, 2018). Such heuristic approaches require domain-specific knowledge, making them unsuitable for generalized application. Further, their performance can be subpar, as heuristics are often tailored towards specific planning problems.
In domain-independent approaches, there is usually no a priori knowledge about which prioritization yields a close-to-optimal solution. Thus, the prioritization must be optimized, which involves computing multiple solutions using different prioritizations. However, computing multiple solutions sequentially might be intractable for real-time applications with constrained computation time. Consequently, a simultaneous computation of multiple solutions is needed.
1.2 Problem Setting #
In this article, we use prioritized multi-agent motion planning (MAMP) for connected and automated vehicles (CAVs) on roads. MAMP is a variant of MAPF in which the kinodynamic constraints of vehicles are taken into account during planning. In the setting of MAMP for CAVs, algorithms must be able to run in a decentralized fashion to account for the lack of a centralized computation entity. Each agent computes its own motion plan. Furthermore, motion plans should respect the nonlinear system dynamics of road vehicles. Due to model mismatch and the highly dynamic road environment, the problem setting requires frequent control updates. In turn, real-time capability requires the algorithms to be fast enough and to have bounded computation time. The former is challenging because motion planning with nonlinear system dynamics and nonconvex collision constraints is computationally demanding; the latter is challenging because in traditional PP approaches, the computation time grows with the number of vehicles and the desired solution quality. A more detailed comparison of MAPF and MAMP is given in Section 4.
This article focuses on the challenge of achieving good solution quality in short computation time with a distributed computation of the motion plans. We keep this focus by making two simplifying assumptions:
- (i) The communication time is an order of magnitude shorter than the planning time. This assumption is justified by the computationally demanding motion planning task of solving a nonconvex optimization problem, and by the technological progress in communication systems.
- (ii) We consider a setting with exclusively connected and automated vehicles, i.e., without uncontrolled vehicles. Such a setting can be achieved in controlled environments, such as warehouses.
1.3 Related Work #
Figure 2 displays the computations of three interacting, sequentially computing agents. The blocks represent a prioritization, i.e., an order of computations, while the thick lines display the computation time.
When selecting the prioritization, heuristics are often consulted. Heuristics aim to improve the prioritization mostly for their domain-dependent objective and are thus less likely to generalize well. The following approaches use objective-based heuristics for prioritization to improve the networked cost. As illustrated in Figure 2a, these approaches solve the PP problem using a single prioritization. In (Wu et al., 2020), the prioritization is determined by the number of possible paths a robot can take to reach its goal: the lower this number, the higher the robot's priority. Thereby, the probability for each robot to find a solution is meant to be increased, which directly affects the networked cost. In (van den Berg & Overmars, 2005), robots with longer estimated travel distances receive higher priority to distribute the travel time more evenly among robots. Zhang, Li, Huang, and Koenig (2022) propose a prioritization algorithm based on machine learning, which performs competitively compared to heuristic algorithms. Our previous work (Kloock, Kragl, et al., 2019) prioritizes agents with the goal of increasing the traffic flow rate at a road intersection through a heuristic based on the remaining time before an agent enters the intersection. In (Yao & Zhang, 2018), automated vehicles approaching an intersection are prioritized using mixed integer programming; both the prioritization and the vehicles' speeds are optimized to minimize travel time. The algorithm in (Chalaki & Malikopoulos, 2022) prioritizes vehicles at intersections using job-shop scheduling and thereby reduces the vehicles' average travel time. Most of the above approaches have in common that the solution for a single prioritization is computed. Consequently, their computation time is low, as the complexity is in the order of the number of agents \(\bigO ( \numAgents )\). However, the solution quality can be lower than with a different prioritization, or these approaches can fail to find a solution even though one exists under a different prioritization.
Instead of finding a prioritization heuristically, other works optimize the prioritization. As illustrated in Figure 2b, these approaches solve the networked planning problem by exploring multiple prioritizations sequentially. These approaches aim to improve the solution quality by investing computation time. Bennewitz, Burgard, and Thrun (2002) propose a randomized hill-climbing search for finding the optimal prioritization. Beginning at an initial prioritization, priorities of agents are swapped randomly to increase the solution quality. Recomputing the solution in each iteration increases the computation effort. Chan, Stern, Felner, and Koenig (2023); Ma et al. (2019) present an algorithm to lazily explore the space of all possible prioritizations. Whenever a conflict between two agents occurs, both possible prioritizations are added to a search tree. Thus, the computation effort depends on the number of needed iterations until a conflict-free prioritization is found. The above approaches can compute solutions to multiple prioritizations, and thus find solutions with higher quality than approaches considering only a single prioritization. However, the solution quality improvement comes at the cost of higher computation time. For example, the worst-case effort of conflict-based approaches is \(\bigO ( \numAgents \cdot 2^{\numAgents ^2} )\), as for each conflict up to \(\numAgents \) plans must be recomputed, and for each of the possible \(\numAgents ^2\) conflicts two options for prioritization of the involved two agents exist.
In traditional approaches to PP, a single, central processing unit computes solutions for all agents in sequence (e.g., Chalaki & Malikopoulos, 2022; Chan et al., 2023; Ma et al., 2019; Sharon et al., 2015; van den Berg & Overmars, 2005; Yao & Zhang, 2018; Zhang et al., 2022). A central processing unit is particularly effective if the computation time for planning is short in comparison to a potential communication time. Even so, PP can be executed by multiple agents, which can be distributed physically, with several processing units on several agents, or logically, with several processing units on a central entity (e.g., Čáp, Novák, Kleiner, & Selecký, 2015; Kloock, Kragl, et al., 2019; Kuwata, Richards, Schouwenaars, & How, 2007; Wu et al., 2020). Distributed computation on multiple agents is suitable for the problem setting described in Section 1.2. With multiple processing units, a sequential computation regarding a prioritization yields periods of idle time in each unit. The problem of reducing idle time due to sequential operation is addressed in production automation. Johnson’s rule (Almhanna, Al-Turaihi, & Murshedi, 2023) and sequencing (Mak, Rong, & Zhang, 2014) are methods to decrease idle time by optimizing the assignment and sequence for jobs on machines in assembly lines. Further, Çam and Sezen (2020) reduce idle time of control systems, i.e., unproductive time of the system, but not idle time during computation. In this article, we will utilize idle time in sequential computation for simultaneous computation of multiple prioritizations in PP, which should increase solution quality without increasing computation time.
1.4 Contribution #
In PP, the optimality of the solution depends on the prioritization. To find the optimal prioritization and therefore the global optimum of the PP problem, the set of possible prioritizations and their corresponding solutions to the PP problem need to be explored. If computation time is limited, such as in real-time applications, the sequential computation of multiple solutions can be intractable. This paper presents an approach that exploits the sequential nature of PP to explore multiple prioritizations simultaneously. Our approach improves the solution quality over sequential computation with a single prioritization while keeping computation time short. We select prioritizations such that
- (i) we always improve or maintain the solution quality of a previously found prioritization, and
- (ii) all explored prioritizations are computed simultaneously.
Figure 2 illustrates the benefit of our approach in an example with three interacting, sequentially computing agents. Blocks display the prioritization; thick lines represent the computation time of agents. Figure 2a shows how approaches based on heuristics solve the PP problem for only a single prioritization. The solution quality strongly depends on the heuristic. Figure 2b shows how the PP problem can be solved multiple times sequentially regarding multiple prioritizations. Even though this can improve solution quality, it prolongs the computation time. In this work, we make use of the fact that in sequential computation the agents idle most of the time: in the example shown in Figure 2a, agents idle two-thirds of the available time. Figure 2c showcases our approach: by utilizing the idle time, we solve the PP problem regarding the three prioritizations from Figure 2b simultaneously.
Applying our strategy to the example in Figure 1, we always explore a prioritization in which agent 2 has the lowest priority.
1.5 Structure of this Article #
The rest of this article is organized as follows. In Section 2, we formalize time-variant coupling and prioritization of agents in a receding horizon planning framework. In Section 3, we formulate requirements for simultaneously computable prioritizations. We further present our algorithm for simultaneous computation, which can both retain prioritizations to utilize the predictive nature of receding horizon planning, and explore new prioritizations to increase the likelihood of finding a solution. In particular, the time required for exploration spreads across consecutive time steps, which keeps computation time at each time step short. We present our multi-agent motion planner for CAVs driving on roads in Section 4. In Section 5, we present a comparison of our approach to state-of-the-art prioritization approaches in numerical experiments with CAVs, and showcase real-time capabilities of our approach in an experiment with ten CAVs in our Cyber-Physical Mobility Lab (CPM Lab).
2 Preliminaries #
This section provides the background on prioritized receding horizon planning. In PP, agents first receive all plans from their predecessors, then solve the planning problem, and finally send their plans to their successors. To determine the predecessors and successors of agents, we couple and prioritize them. An overview of the PP framework is given in Figure 3. The following sections introduce our notation and the background on coupling, prioritization, and receding horizon planning.
2.1 Notation #
In this paper, we refer to agents whenever concepts are generally applicable to PP. We assume that states of agents are observable, and therefore we refer to states instead of outputs. The definitions and methods can be transferred to outputs as well, which we omit for brevity. We denote scalars with normal font, vectors with bold lowercase letters, matrices with bold uppercase letters, and sets with calligraphic uppercase letters. For any set \(\mathcal {S}\), the cardinality of the set is denoted by \(|\mathcal {S}|\). A variable \(x\) is marked with a superscript \(x^{(\anAgent )}\) if it belongs to agent \(\anAgent \). The actual value of a variable \(x\) at time \(\timestep \) is written as \(x({\timestep })\), while values predicted for time \(\timestep +\timestepIterator \) at time \(\timestep \) are written as \(x_{\timestep +\timestepIterator \vert \timestep }\). A trajectory is denoted by replacing the time argument with \((\cdot )\) as in \(x_{\cdot \vert \timestep }\).
This paper addresses motion planning for multiple agents; we refer to the entirety of all agents as a networked control system (NCS). We use graphs as a modeling tool of NCS. Every agent is associated with a vertex, so the terms are used synonymously.
- Definition 2 (Undirected graph). An undirected graph \(\graphUndirected = \left (\setVertices ,\setEdges \right )\) is a pair of two sets, the set of vertices \(\setVertices =\set {1,\dots ,\numAgents }\) and the set of undirected edges \(\setEdges \subseteq \setVertices \times \setVertices \). The edge between \(i\) and \(j\) is denoted by \(\edgeUndirected {i}{j}\).
- Definition 3 (Directed graph). A directed graph \(\graphDirected = \left (\setVertices ,\setEdgesDirected \right )\) is a pair of two sets, the set of vertices \(\setVertices =\set {1,\dots ,\numAgents }\) and the set of directed edges \(\setEdgesDirected \subseteq \setVertices \times \setVertices \). The edge from \(i\) to \(j\) is denoted by \(\edgeDirected {i}{j}\). An oriented graph is a directed graph obtained from an undirected graph by replacing each edge \(\edgeUndirected {i}{j}\) with either \(\edgeDirected {i}{j}\) or \(\edgeDirected {j}{i}\).
2.2 Couple #
If agents are related via their objectives or constraints, we speak of coupled agents. A coupling graph represents this relation between agents.
- Definition 4 (Undirected coupling graph). An undirected coupling graph \(\graphUndirected (\timestep )=\bigl (\setVertices ,\setEdges (\timestep )\bigr )\) is a graph that represents the interaction between agents at time step \(\timestep \). Vertices \(\anAgent \in \setVertices =\set {1,\ldots ,\numAgents }\) represent agents, and edges \(\edgeUndirected {\anAgent }{\anotherAgent }\in \setEdges \) represent coupling objectives and constraints between agents.
An example coupling graph with four agents is displayed by Figure 4a. The application in which PP is used determines which coupling objectives and constraints must be considered in an agent’s planning problem. For example, in MAPF, a constraint is introduced to achieve collision avoidance between two agents. Such a constraint relates the two agents, so they are coupled.
The undirected coupling graph can often be determined before planning starts. For example, in robot applications with a fixed time horizon, the robots’ movement range in this time horizon can be used to determine couplings (Scheffe, Xu, & Alrifaee, 2024). Alternatively, if robots follow fixed paths, the paths’ intersections determine couplings between robots. This can be the case for road vehicles following lanes, or for warehouse robots (Li, Hoang, Lin, Vu, & Koenig, 2023; Ma, Li, Kumar, & Koenig, 2017). If no prior knowledge on the coupling is available, but the coupling is critical, agents can be coupled iteratively, or alternatively, all agents must be connected.
2.3 Prioritize #
In PP, we prioritize agents, which results in clear responsibilities considering coupling objectives and constraints during planning (Alrifaee, Heßeler, & Abel, 2016; Kuwata et al., 2007). A fixed or time-invariant prioritization function \(\fcnPrio \colon \setVertices \to \setNaturalNumbers \) prioritizes every agent. If \(\fcnPrio (\anAgent )<\fcnPrio (\anotherAgent )\), then agent \(\anAgent \) has a higher priority than agent \(\anotherAgent \).
By prioritizing, we can orient the edges of an undirected coupling graph to form a directed coupling graph.
- Definition 5 (Directed coupling graph). A directed coupling graph \(\graphDirected (\timestep )=\bigl (\setVertices ,\setEdgesDirected (\timestep )\bigr )\) is an oriented graph obtained from orienting the edges of an undirected coupling graph at time step \(\timestep \). If an edge \(\edgeDirected {\anAgent }{\anotherAgent }\) is directed from agent \(\anAgent \) to agent \(\anotherAgent \), then agent \(\anotherAgent \) is responsible for considering the respective coupling objective and constraint in its planning problem.
An edge points towards the vertex with lower priority,
\(\seteqnumber{0}{}{1}\)\begin{equation} \label {eq:construction-rule} \edgeDirected {\anAgent }{\anotherAgent } \in \setEdgesDirected \iff \fcnPrio (\anAgent ) < \fcnPrio (\anotherAgent ) \land \edgeUndirected {\anAgent }{\anotherAgent }\in \setEdges . \end{equation}
The prioritization function thus constitutes a strict partial order \(\prec \) on the agent set \(\setVertices \),
\(\seteqnumber{0}{}{2}\)\begin{equation} \anAgent \prec \anotherAgent \iff \fcnPrio (\anAgent ) < \fcnPrio (\anotherAgent ). \end{equation}
The order is partial, as opposed to total, since a relation exists only for coupled agents. For the strict partial order—and thus the orientation of every edge—to be well-defined, a valid prioritization function needs to assign pairwise different priorities to vertices that are connected by an edge:
\(\seteqnumber{0}{}{3}\)\begin{equation} \label {eq:valid_priority} \fcnPrio (\anAgent )\neq \fcnPrio (\anotherAgent ), \quad \forall \anAgent , \anotherAgent \in \setVertices , \forall \edgeUndirected {\anAgent }{\anotherAgent }\in \setEdges , \anAgent \neq \anotherAgent . \end{equation}
Given the construction rule in (2), the directed coupling graph \(\graphDirected \) resulting from the undirected coupling graph \(\graphUndirected \) and a valid prioritization function \(\fcnPrio \) regarding (4) is a directed acyclic graph (DAG) (Scheffe, Kahle, & Alrifaee, 2025).
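A minimal Python sketch of construction rule (2), assuming couplings are given as unordered pairs and priorities as a dict (function and variable names are ours). It also enforces validity condition (4) by rejecting equal priorities on coupled agents; because every edge then points from higher to lower priority, the result is acyclic.

```python
def orient(undirected_edges, priority):
    """Orient each coupling {i, j} as i -> j iff priority[i] < priority[j].

    undirected_edges: iterable of pairs (i, j); priority: dict agent -> rank,
    where a lower rank means a higher priority.
    """
    directed = set()
    for i, j in undirected_edges:
        if priority[i] == priority[j]:
            # condition (4): coupled agents need pairwise different priorities
            raise ValueError("invalid prioritization: coupled agents share a priority")
        directed.add((i, j) if priority[i] < priority[j] else (j, i))
    return directed
```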
Since in PP, an agent requires its predecessors’ plans to consider coupling objectives and constraints, the agents must solve their planning problems in a sequential order. This order, as well as the communication links, are deducible from the directed coupling graph by traversing the graph along its edges. An example directed coupling graph with four agents is displayed by Figure 4b.
2.4 Receding Horizon Planning #
A strategy to further reduce computation time in PP is by introducing a fixed time horizon. Instead of solving the MAPF problem from start to end, only a certain number of time steps is considered. The problem is solved repeatedly over the course of time, a practice known as receding horizon planning or online replanning (Li, Tinka, et al., 2021; Scheffe, Pedrosa, Flaßkamp, & Alrifaee, 2023; Shahar et al., 2021; Silver, 2005). Agents apply actions until the next planning result is available.
In each time step of receding horizon planning, priorities can be reassigned, which results in a time-variant prioritization. We redefine the time-invariant prioritization function to a time-variant prioritization function. It is a function \(\fcnPrio \colon \setVertices \times \setNaturalNumbers \to \setNaturalNumbers \) which prioritizes each agent \(\anAgent \) at each time step \(\timestep \). We indicate a time-invariant prioritization by replacing the time argument with a dash, as in \(\fcnPrio (\anAgent ,-)\). In our approach, the priority ordering is consistent during planning for the planning horizon, but can change whenever agents plan.
In accordance with the usage in the literature (e.g., Ma et al., 2019; Wu et al., 2020), we define completeness as follows.
We extend the considerations in (Ma et al., 2019) from fixed prioritizations to time-variant prioritizations.
- Proof. A counterexample is given in Figure 5. This example requires the highest-priority agent to take a suboptimal action with regard to its own objective in order to create a solution to the MAPF problem. The highest-priority agent, however, always takes an optimal action in prioritized planning. □
- Proof. A time-variant prioritization can mimic a fixed prioritization by using the same ordering at every time step,
\(\seteqnumber{0}{}{4}\)\begin{equation} \fcnPrio (\anAgent ,\timestep ) := \fcnPrio (\anAgent ,-) . \end{equation}
Therefore, every MAPF instance that is solvable with a fixed prioritization is solvable with a time-variant prioritization. □
- Proof. For the MAPF instance displayed in Figure 6, no fixed prioritization results in a solution, whereas a time-variant prioritization does. □
2.5 Computation Time Definition #
In general, the computation time of MAMP is the duration from measuring the states until control inputs are determined, which consists of the planning time in a locally distributed setting, and additional communication time in a physically distributed setting. This article is on prioritization of agents, which influences the required number of sequential computations. Since, in our problem setting, the planning is computationally demanding, we simplify the communication time to be constant and negligible compared to the planning time. The computation time \(\tCompNcs \) of the NCS is thus mainly determined by the time to solve the agents’ planning problems.
With the following definitions, we formalize the computation time of the NCS based on our previous work (Scheffe et al., 2025). Let the sources of \(\graphDirected \) be the set of vertices without incoming edges
\(\seteqnumber{0}{}{5}\)\begin{equation} \setVertices_s = \set {i\in \setVertices \mid \vertexInDegree {i} = 0} \end{equation}
and the sinks be the set of vertices without outgoing edges
\(\seteqnumber{0}{}{6}\)\begin{equation} \setVertices_t = \set {i\in \setVertices \mid \vertexOutDegree {i} = 0}. \end{equation}
Note that since the coupling graph is a DAG, there is at least one source and one sink. Note further that the number of agent classes \(\numLevels \) is the diameter of \(\graphDirected \) plus one. The diameter of a graph is the greatest length of a shortest path, taken over all pairs of vertices in \(\setVertices \times \setVertices \).
- Definition 9 (Path). A path in a graph \(\graphUndirected \) is a sequence of edges connecting distinct vertices,
\(\seteqnumber{0}{}{7}\)\begin{equation} (e_z)_{z=1}^{n} = \bigl ( \edgeDirected {\anAgent_1}{\anAgent_2}, \edgeDirected {\anAgent_2}{\anAgent_3}, \ldots , \edgeDirected {\anAgent_{n}}{\anAgent_{n+1}} \bigr ) , \end{equation}
with the domain \(\setNaturalNumbers \) and the codomain \(\setEdges \). The length of the path is defined as the number of edges \(n\).
To determine the computation time of the NCS, we add a virtual source \(S\) and a virtual sink \(T\) to the set of vertices \(\setVertices \) of \(\graphDirected \), resulting in the set of vertices \(\setVertices ^{\prime }\). We furthermore add edges from the source \(S\) to all vertices in \(\setVertices_s\), and from all vertices in \(\setVertices_t\) to the sink \(T\), resulting in the set of edges \(\setEdgesDirected ^{\prime }\).
To obtain the computation time \(\tCompNcs \), we weigh edges by a weighting function \(f_w \colon \setVertices ^{\prime } \times \setVertices ^{\prime } \to \setRealNumbers \)
\(\seteqnumber{0}{}{8}\)\begin{equation} \label {eq:weight_func} f_w \bigl ( \edgeDirected {i}{j} \bigr ) = \begin{cases} \tComp ^{(i)}, & \text { if } i \in \setVertices , \\ 0, & \text { otherwise}, \end {cases} \end{equation}
with the computation time \(\tComp ^{(i)}\) of an edge’s respective starting vertex. The result is the computation graph \(\graphDirected ^{\prime } = (\setVertices ^{\prime }, \setEdgesDirected ^{\prime }, f_w)\). Figure 7 illustrates a computation graph.
Let \(P_{\max }\subseteq \setVertices ^{\prime } \times \setVertices ^{\prime }\) denote the longest path between \(S\) and \(T\) in \(\graphDirected ^{\prime }\). The weight of this path corresponds to the computation time of the NCS
\(\seteqnumber{0}{}{9}\)\begin{equation} \label {eq:computation_time} \tCompNcs = \sum_{\edgeDirected {i}{j}\in P_{\max }} f_w \left (\edgeDirected {i}{j}\right ). \end{equation}
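Equation (10) can be evaluated with a standard longest-path computation on the DAG. The Python sketch below is illustrative (names are ours): instead of adding the virtual source and sink explicitly, it takes the maximum over the longest chain of per-vertex computation times starting from every vertex, which yields the same path weight because the virtual edges have weight zero.

```python
from functools import lru_cache

def ncs_computation_time(vertices, edges, t_comp):
    """Longest sum of per-agent computation times along any directed path.

    vertices: set of agents; edges: set of directed couplings (i, j);
    t_comp: dict mapping agent i to its computation time.
    """
    successors = {v: [] for v in vertices}
    for i, j in edges:
        successors[i].append(j)

    @lru_cache(maxsize=None)
    def longest_from(v):
        # time of v itself plus the longest chain among its successors
        return t_comp[v] + max((longest_from(s) for s in successors[v]),
                               default=0)

    return max(longest_from(v) for v in vertices)
```

For the four-agent coupling graph of Figure 4b with unit computation times except agent 2 taking 2, the longest path 1 → 2 → 4 gives a computation time of 4.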
3 Exploring Multiple Prioritizations Simultaneously #
This section presents the core idea of this paper, which is to improve the solution quality of a given prioritization by exploring multiple other prioritizations simultaneously, as illustrated by Figure 2c. In the PP framework (Figure 3), our approach replaces both the elements “Prioritize” and “Plan”, as we will identify multiple prioritizations for which agents can plan simultaneously. We state the requirements for prioritizations with which simultaneous computation is possible in Section 3.1. With these requirements, we present in Section 3.2 conditions to identify multiple prioritizations which can be computed simultaneously, which allows us to improve solution quality without increasing computation time. Building on these conditions, we present the algorithm for selecting a set of prioritizations and simultaneously computing with these prioritizations in Section 3.3. In Section 3.4, we show theoretically that our approach can both retain prioritizations to utilize the predictive nature of receding horizon planning and explore new prioritizations to increase the likelihood of finding a solution.
3.1 Parallelizable Prioritizations #
In this section, we establish the requirements for prioritizations with which simultaneous computation is possible. Our goal of improving the solution quality compared to a given prioritization implies the assumption that an initial valid prioritization according to (4), i.e., a prioritization constituting a strict partial order, is given. Finding an initial valid prioritization is simple, e.g., a prioritization that assigns priorities according to unique vertex numbers is valid. Other prioritization algorithms exist (Kuwata et al., 2007; Scheffe et al., 2022, 2025).
The computation order of agents in PP can be structured with agent classes \(\setClasses \). Agents belonging to the same agent class can solve their planning problem in parallel. An undirected coupling graph \(\graphUndirected = (\setVertices ,\setEdges )\) and a prioritization function \(\fcnPrio \) form a directed coupling graph \(\graphDirected \) with (2). By topologically sorting the directed coupling graph, we map agents to classes \(\setClasses \) according to our Algorithm 1.
Require: Directed coupling graph \(\graphDirected = (\setAgents , \setEdgesDirected )\)
Ensure: Agent classes in sequence \(\sequenceClasses \)
1: \(\numLevels \gets 0\) ▷ number of classes
2: \(\setAgents_\text {todo} \gets \setAgents \) ▷ remaining agents
3: while \(\setAgents_\text {todo} \neq \emptyset \) do
4: \(\numLevels \gets \numLevels +1\)
5: \(\classAgents [\numLevels ] \gets \emptyset \)
6: for all \(i \in \setAgents_\text {todo}\) do
7: if \(\vertexInDegree {i} = 0\) then
8: \(\classAgents [\numLevels ] \gets \classAgents [\numLevels ] \cup \set {i}\) ▷ sources can compute
9: end if
10: end for
11: if \(\classAgents [\numLevels ] = \emptyset \) then
12: return FAILURE ▷ graph is no DAG
13: end if
14: append \(\classAgents [\numLevels ]\) to \(\sequenceClasses \)
15: \(\setAgents_\text {todo} \gets \setAgents_\text {todo} \setminus \classAgents [\numLevels ]\)
16: for all \(i \in \classAgents [\numLevels ]\) do
17: for all \(\edgeDirected {i}{j} \in \setEdgesDirected \) do
18: \(\setEdgesDirected \gets \setEdgesDirected \setminus \edgeDirected {i}{j}\)
19: end for
20: end for
21: end while
22: return \(\sequenceClasses \)
The algorithm inspects all vertices to find sources, i.e., vertices with no incoming edges (Line 7). All sources can start their computation, so we add them to a class (Line 8). The resulting class is next in solving its planning problem, so it is added to the computation sequence (Line 14). The processed vertices are removed from the set of vertices to process (Line 15). Finally, all edges connected to the processed vertices are removed from the coupling graph (Lines 16 to 18). The loop repeats until all vertices are assigned to classes. The output of Algorithm 1 is a sequence of agent classes \(\sequenceClasses \) which orders the agent classes \(\setClasses = \set {\classAgentsA {1}, \ldots , \classAgentsA {\numLevels }}\). This computation sequence represents the order in which the agent classes can solve their planning problems.
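Algorithm 1 is essentially a level-wise variant of a topological sort. The following minimal Python sketch illustrates the idea; the function name `agent_classes` and the edge-set representation of the directed coupling graph are illustrative choices, not from the paper.

```python
def agent_classes(agents, edges):
    """Group agents into classes via level-wise topological sorting.

    `edges` is a set of directed couplings (i, j), meaning agent i plans
    before agent j.  Returns the computation sequence as a list of sets;
    raises ValueError if the coupling graph is no DAG.
    """
    todo = set(agents)
    remaining = set(edges)
    sequence = []
    while todo:
        # Sources, i.e., agents without incoming edges, can compute next.
        sources = {i for i in todo if all(j != i for (_, j) in remaining)}
        if not sources:
            raise ValueError("coupling graph is no DAG")
        sequence.append(sources)
        todo -= sources
        # Remove all edges leaving the processed agents.
        remaining = {(i, j) for (i, j) in remaining if i not in sources}
    return sequence
```

For a hypothetical directed coupling graph with edges 1→2, 1→3, 2→4, and 3→4 (consistent with the example result in Figure 8), the sketch yields the sequence ({1}, {2, 3}, {4}).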
Definition 10 (Computation sequence). The set of agent classes \(\setClasses \) is a countable set with \(\abs {\setClasses } = \numLevels \). The computation sequence, defined as
\(\seteqnumber{0}{}{10}\)\begin{equation} \sequenceClasses = \left (\sequenceElement {1},\sequenceElement {2},\dots ,\sequenceElement {\numLevels }\right ) \end{equation}
with the domain \(\set {1, \ldots , \numLevels }\) and the codomain \(\setClasses \), is a permutation of the agent classes in \(\setClasses \).
Figure 8 shows an example of Algorithm 1 for the directed coupling graph with four agents given in Figure 4b. The resulting computation sequence is \(\bigl ( \set {1}, \set {2,3}, \set {4} \bigr )\).
Algorithm 1 returns a computation sequence with the minimum number of sequential computations for the given directed coupling graph. We can also invert this process, i.e., we can generate a valid prioritization corresponding to a computation sequence. W.l.o.g., for two classes \(\classAgentsA {1},\classAgentsA {2}\in \setClasses \), \(\classAgentsA {1}\neq \classAgentsA {2}\), and with \(\anAgent \in \classAgentsA {1}\) and \(\anotherAgent \in \classAgentsA {2}\) it must hold that
\(\seteqnumber{0}{}{11}\)\begin{equation} \fcnPrio (\anAgent ) \neq \fcnPrio (\anotherAgent ). \end{equation}
According to (4), agents in the same class may have equal priorities as they are not connected by an edge. However, since we allow time-variant coupling graphs, we enforce unique priorities among agents (cf. Section 3.4). A valid prioritization corresponding to the computation sequence is given by
\(\seteqnumber{0}{}{12}\)\begin{equation} \label {eq:prio-from-sequence} \fcnPrio (\anAgent ) = Z \cdot \numAgents + i, \quad \anAgent \in \sequenceElement {Z} \in \sequenceClasses , \end{equation}
with the agent index \(i\) of agent \(\anAgent \) and the sequence index \(Z\) of the class containing \(\anAgent \). For the example in Figure 8, this function yields
\(\seteqnumber{0}{}{13}\)\begin{equation} \fcnPrio (1) = 5 , \quad \fcnPrio (2) = 10 , \quad \fcnPrio (3) = 11 , \quad \fcnPrio (4) = 16 . \end{equation}
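Equation (13) can be evaluated directly. A small Python sketch (the function name is illustrative) reproduces the priorities in (14) for the example sequence:

```python
def priorities_from_sequence(sequence, num_agents):
    """Assign unique priorities per (13): p(i) = Z * N_A + i,
    where Z is the 1-based index of agent i's class in the sequence
    and i is the agent index."""
    prio = {}
    for Z, agent_class in enumerate(sequence, start=1):
        for i in agent_class:
            prio[i] = Z * num_agents + i
    return prio
```

For the sequence ({1}, {2, 3}, {4}) with four agents, this returns the priorities 5, 10, 11, and 16 of (14).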
3.2 Increasing Solution Quality while Maintaining Computation Time #
The problem of finding the optimal permutation of agent classes can formally be stated as follows.
Problem 1. Given a set of agent classes \(\setClasses \), find a computation sequence \({\sequenceClasses }^*\) resulting in a prioritization function \(\fcnPrio_{\indexPermutation }\) with (13) that minimizes the sum of the costs of all agents
\(\seteqnumber{0}{}{14}\)\begin{equation} \label {eq:problem-optimal-sequence} \fcnPrio_{\indexPermutation } = \argmin_{\fcnPrio } \fcnObjectiveNcsA {\fcnPrio }. \end{equation}
Note that finding the optimal sequence does not necessarily coincide with finding the optimal prioritization, as the permutations of sequences may not express the permutations of prioritizations.
Problem 1 resembles the Drilling Problem, a problem in combinatorial optimization (Korte & Vygen, 2018, p. 1). Given a set of holes that need to be drilled, the Drilling Problem consists of finding a permutation of the order in which they are drilled that minimizes the total distance the drill needs to travel. As the set of classes contains \(\lvert \setClasses \rvert = \numLevels \) elements, Problem 1 admits \(\lvert \setSequences \rvert =\numLevels !\) different permutations. A possible solution is enumerating all \(\numLevels !\) sequences (Azarm & Schmidt, 1997; Korte & Vygen, 2018): given a set \(\setClasses \) of \(\numLevels \) classes, solve PP problems regarding each of the \(\numLevels !\) sequences and choose the one with minimal cost. However, the computation time of this approach grows factorially with the number of classes.
Solving Problem 1 is computationally intractable under real-time constraints. In line with the goal of this article to increase solution quality without increasing computation time, we aim to solve the following relaxed problem instead.
Problem 2. Given a set of agent classes \(\setClasses \), find \(\numLevels \) computation sequences which the agent classes can compute simultaneously, and select among them the sequence with minimal networked cost.
Our goal in Problem 2 is thus to find \(\numLevels \) sequences out of the \(\numLevels !\) possible ones that can be computed simultaneously. A class \(\classAgentsA {l} \in \setClasses \) can solve exactly one planning problem regarding a specific computation sequence in each time slot.
To formalize the task, we introduce a computation schedule matrix \(\matSchedule \in \setClasses ^{\numLevels \times \numLevels }\). Each cell contains an agent class, and each row contains a computation sequence. Each column of \(\matSchedule \) represents a time slot in the networked computation. The computation schedule matrix thus defines for each agent the prioritization for which it should plan: an agent looks up its agent class and finds the corresponding prioritization, i.e., the computation sequence, in the matrix. The matrix \(\matSchedule \) represents a valid computation schedule if its rows fulfill
\(\seteqnumber{0}{}{15}\)\begin{equation} \matScheduleElement {ij} \neq \matScheduleElement {ik}, i,j,k \in \set {1,\dots ,\numLevels }, \, j \neq k \label {eq:row-distinct} \end{equation}
and its columns fulfill
\(\seteqnumber{0}{}{16}\)\begin{equation} \matScheduleElement {ij} \neq \matScheduleElement {kj}, i,j,k \in \set {1,\dots ,\numLevels }, \, i \neq k . \label {eq:column-distinct} \end{equation}
We ensure valid computation sequences with (16), and we ensure that each class solves exactly one planning problem in each time slot with (17). The matrix resulting from these conditions is also known as a Latin square. Adapted from McKay and Wanless (2005), a Latin square \(\bm {L} \in \set {1,\dots ,N}^{N \times N}\) is a matrix in which each row and each column contains only distinct entries. That is, each row and each column contains every element in \(\set {1,\dots ,N}\) exactly once. For \(\matSchedule \), the property of distinct entries in each row and column is given by (16) and (17), respectively. The problem is related to a Sudoku puzzle: the solution to a Sudoku puzzle is a Latin square with \(N=9\) and the additional rule that each of the nine non-overlapping \(3\times 3\) submatrices contains every element in \(\set {1,\dots ,N}\) exactly once.
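Conditions (16) and (17) amount to a Latin-square check. A Python sketch (function name illustrative) validating a candidate schedule matrix, given as a list of rows:

```python
def is_valid_schedule(M):
    """Check conditions (16) and (17): every row and every column of the
    computation schedule matrix contains pairwise distinct agent classes."""
    n = len(M)
    rows_ok = all(len(set(row)) == n for row in M)
    cols_ok = all(len({M[i][j] for i in range(n)}) == n for j in range(n))
    return rows_ok and cols_ok
```

A cyclic shift of the first row in every subsequent row, for instance, always passes this check.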
An example computation schedule matrix and the corresponding coupling graphs are displayed by Figure 9. Starting from the prioritization and resulting agent classes of the example in Figure 8, a valid computation schedule matrix is presented in Figure 9a. Note how each agent class appears exactly once per column (at the same time, an agent class can only do one computation) and once per row (each computation sequence permutation requires one computation per agent class). The three coupling graphs corresponding to the computation schedule matrix are given in Figure 9b. With three different coupling graphs, three different solutions are computed simultaneously.
3.3 Simultaneous Computation with Multiple Prioritizations #
This section describes our algorithms for simultaneous computation with multiple prioritizations. We first present our decentralized algorithm through which every agent obtains the same computation schedule matrix which fulfills (16) and (17). We then show how agents solve PP problems with multiple prioritizations. Finally, we show how agents select a solution for the PP problems.
Our goal for the computation schedule matrix \(\matSchedule \) is to explore \(\numLevels \) prioritizations in a time step to increase the probability of finding the optimal prioritization according to Problem 1. Algorithm 2 is our decentralized algorithm to find a computation schedule matrix \(\matSchedule \). We restate that a row \(\matScheduleVector {q\cdot }\in \setClasses ^{1 \times \numLevels }\) corresponds to a computation sequence \(q\), and a column \(\matScheduleVector {\cdot m}\in \setClasses ^{\numLevels \times 1}\) corresponds to a time slot \(m\) during the networked computation. First, we initialize the computation schedule matrix \(\matSchedule \) with the size \(\numLevels \times \numLevels \) and fill the first row \(\matScheduleVector {1\cdot }\) with the classes of \(\setClasses \) in sequence (Lines 1 and 2). Note that this initial row can change from one time step to another, as it depends on the initial prioritization. We enforce a common random seed among agents for consistent results of the algorithm (Line 4). The algorithm determines the remaining rows in a loop (Line 5). For each row, the algorithm loops over all \(\numLevels \) columns that need to be populated (Lines 7 and 8). We use the number of available classes in each column that can form a valid computation schedule matrix as a heuristic for the order in which the columns are populated (Line 10). This number of available classes is stored in the options array. It is an array of sets, each containing all elements of \(\setClasses \) that respect (16) regarding the current row \(\indexPermutation \) and that respect (17) regarding the column corresponding to the array index. The next column to populate is the one with the fewest available classes (Line 12). One of these classes is selected at random (Line 17). If all columns are populated with valid entries, we progress to the next row (Line 21).
If there is a column for which no classes are available to select from, the current row is invalid with the rest of the computation schedule matrix (Line 13), so we reset the current row (Line 23) and start over.
Require: Set of agent classes \(\setClasses = \set {\classAgents [1], \classAgents [2], \ldots , \classAgents [\numLevels ]}\)
Ensure: Computation schedule matrix \(\matSchedule \)
1: \(\matSchedule \gets \text {empty matrix of size } \numLevels \times \numLevels \)
2: \(\matScheduleVector {1\cdot } \gets [ \classAgents [1], \classAgents [2], \ldots , \classAgents [\numLevels ] ]\)
3: \(q \gets 1\)
4: initialize random seed with current time step ▷ for consistency across agents
5: while \(q \leq \numLevels \) do
6: is_valid \(\gets \) true
7: \(N_{\text {cols-todo}} \gets \) number of unpopulated columns in \(\matScheduleVector {q\cdot }\)
8: while \(N_{\text {cols-todo}} > 0\) do
9: for \(m = 1, \ldots , \numLevels \) do
10: options[\(m\)] \(\gets \setClasses \setminus \left ( \matScheduleVector {q \cdot } \cup \matScheduleVector {\cdot m} \right )\)
11: end for
12: \(m \gets \) index of the unpopulated column with the fewest options
13: if options[\(m\)] \(= \emptyset \) then
14: is_valid \(\gets \) false
15: break
16: end if
17: \(\matScheduleElement {qm} \gets \) Random(options[\(m\)])
18: \(N_{\text {cols-todo}} \gets N_{\text {cols-todo}} - 1\)
19: end while
20: if is_valid then
21: \(q \gets q + 1\) ▷ progress to the next row
22: else
23: \(\matScheduleVector {q\cdot } \gets \text {empty vector of size } 1 \times \numLevels \)
24: end if
25: end while
26: return \(\matSchedule \)
We illustrate Algorithm 2 with an example in Figure 10. The figure contains four agent classes and depicts the construction of the second row of the computation schedule matrix. Each subsequent row in the figure corresponds either to determining the remaining options of agent classes for each cell (Line 10) or to the random selection of an option (Line 17) from the most constrained cell (Line 12).
Our Algorithm 2 builds a computation schedule matrix row by row such that it remains valid according to (16) and (17). A partly built, valid computation schedule can always be completed. Therefore, our Algorithm 2 eventually converges to a valid computation schedule. Due to the randomness of the approach, there is no bound on the number of iterations required until convergence, which could be addressed with a more sophisticated algorithm. In practice, the computation time to build a computation schedule matrix is negligible compared to the computation time for motion planning.
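Assuming that agent classes are represented by integers and that ties between equally constrained columns may be broken arbitrarily, the row-wise construction with restarts can be sketched in Python as follows (the function name and tie-breaking choices are illustrative, not prescribed by the paper):

```python
import random

def build_schedule(classes, seed):
    """Sketch of Algorithm 2: build a Latin square of agent classes row by
    row.  The first row is the initial computation sequence.  Each further
    row is filled by repeatedly picking the most constrained empty column
    and one of its remaining options at random; a dead end discards the
    partial row, which is then rebuilt from scratch."""
    n = len(classes)
    random.seed(seed)  # common seed so every agent builds the same matrix
    M = [list(classes)]
    while len(M) < n:
        row = [None] * n
        while None in row:
            # Options per empty column m: classes in neither row nor column.
            options = {m: set(classes) - set(row) - {r[m] for r in M}
                       for m in range(n) if row[m] is None}
            m = min(options, key=lambda c: len(options[c]))  # most constrained
            if not options[m]:
                break  # dead end: no class fits; retry this row
            row[m] = random.choice(sorted(options[m]))
        else:  # row completed without a dead end
            M.append(row)
    return M
```

Since a valid partial Latin square of complete rows can always be extended, the random restarts terminate with probability one.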
Computation with multiple prioritizations is similar in concept to computation with a single prioritization. The main difference is that there are \(\numLevels \) different directed coupling graphs \(\graphDirected_{\fcnPrio_{\indexPermutation }}=(\setVertices ,\setEdgesDirected_{\fcnPrio_{\indexPermutation }})\), and with it sets of predecessors \(\setPredecessors {\anAgent }_{\fcnPrio_{\indexPermutation }}\), successors \(\setSuccessors {\anAgent }_{\fcnPrio _{\indexPermutation }}\), and neighbors \(\setNeighbors {\anAgent }_{\fcnPrio_{\indexPermutation }}\), defined as
\(\seteqnumber{0}{}{17}\)\begin{equation} \begin{gathered} \setPredecessors {\anAgent }_{\fcnPrio_{\indexPermutation }} = \set {\anotherAgent \in \setVertices \mid \edgeDirected {\anotherAgent }{\anAgent }\in \setEdgesDirected_{\fcnPrio_{\indexPermutation }}}, \\ \setSuccessors {\anAgent }_{\fcnPrio_{\indexPermutation }} = \set {\anotherAgent \in \setVertices \mid \edgeDirected {\anAgent }{\anotherAgent }\in \setEdgesDirected_{\fcnPrio_{\indexPermutation }}}, \\ \setNeighbors {\anAgent }_{\fcnPrio_{\indexPermutation }} = \setPredecessors {\anAgent }_{\fcnPrio_{\indexPermutation }} \cup \setSuccessors {\anAgent }_{\fcnPrio_{\indexPermutation }} . \end {gathered} \end{equation}
Each agent performs \(\numLevels \) computations in \(\numLevels \) time slots. An agent in class \(\classAgentsA {l}\) finds the sequence \(\matScheduleVector {\indexPermutation \cdot }\) corresponding to a time slot \(m\) by looking for its agent class in the respective column of the computation schedule matrix. For this computation sequence, it must hold that
\(\seteqnumber{0}{}{18}\)\begin{equation} \matScheduleElement {\indexPermutation {}m} = \classAgentsA {l} \,. \end{equation}
From this sequence, the agent constructs the directed coupling graph \(\graphDirected_{\fcnPrio_{\indexPermutation }}\) with (2) and (13). The agent solves its planning problem after receiving the required information from its predecessors \(\setPredecessors {\anAgent }_{\fcnPrio _{\indexPermutation }}\). It then stores the solution cost corresponding to the sequence \(\indexPermutation \) as an element in a solution cost array.
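The lookup in (19) is a scan over one column of the schedule matrix. A Python sketch with 0-based row and column indices (names illustrative):

```python
def sequence_for_slot(M, my_class, m):
    """Return the row index q of the computation sequence that an agent of
    class `my_class` works on during time slot (column) m, per (19)."""
    for q, row in enumerate(M):
        if row[m] == my_class:
            return q
    raise ValueError("class not scheduled in this slot")
```

Because each column of a valid schedule contains every class exactly once, the lookup always succeeds and is unambiguous.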
After all computation sequences are solved, each agent broadcasts its solution cost array. With this information, each agent can determine the solution quality of every computation sequence. We define the solution quality as the networked cost \(\fcnObjectiveNcs (\timestep )\) of a time step \(\timestep \). The networked cost for each computation sequence \(\indexPermutation \) is given as the sum of the costs of all agent planning problems regarding the corresponding computation sequence
\(\seteqnumber{0}{}{19}\)\begin{equation} \label {eq:networked_cost_timestep} \fcnObjectiveNcsA {\fcnPrio_{\indexPermutation }}(\timestep ) = \sum_{\anAgent =1}^{\numAgents } \fcnObjectiveA {\anAgent }_{\fcnPrio_{\indexPermutation }}(\timestep ), \end{equation}
with the solution cost \(\fcnObjectiveA {\anAgent }_{\fcnPrio_{\indexPermutation }}\colon \setNaturalNumbers \to \setRealNumbers \) of an agent \(\anAgent \), the sequence \(\indexPermutation \) corresponding to a prioritization function \(\fcnPrio_{\indexPermutation }\) with (13), and a time step \(\timestep \). After receiving the solution cost arrays of all other agents, the networked cost for each computation sequence is identical across agents. Each agent selects the computation sequence with the highest solution quality, i.e., the minimal networked cost
\(\seteqnumber{0}{}{20}\)\begin{equation} \label {eq:select-prioritization} \fcnPrio_{\text {explore}} = \argmin_{\fcnPrio_{\indexPermutation }} \fcnObjectiveNcsA {\fcnPrio_{\indexPermutation }}(\timestep ) . \end{equation}
If two prioritizations yield the same cost, the algorithm chooses the one with the lower sequence index.
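The selection in (20) and (21), including the tie-break by lower index, can be sketched as follows; `cost_arrays` holds one list of per-sequence costs per agent (names illustrative):

```python
def select_sequence(cost_arrays):
    """Sum the broadcast per-sequence costs over all agents per (20) and
    return the index of the sequence with minimal networked cost per (21);
    Python's min() breaks ties in favor of the lower index."""
    networked = [sum(costs) for costs in zip(*cost_arrays)]
    return min(range(len(networked)), key=lambda q: networked[q])
```

Since every agent receives the same cost arrays, all agents deterministically select the same sequence.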
3.4 Initial Prioritization #
The selection of the initial prioritization, and with Algorithm 1 the initial computation sequence as well, determines which prioritizations can possibly be explored. We identified two objectives for selecting the initial prioritization.
The time-variance of priorities in receding horizon planning has both advantages and disadvantages. An advantage of time-variance is that situations which require priorities to change over time in order to find a solution can be solved, as illustrated by Figure 6. A disadvantage is that the constraints in an agent’s planning problem change with the prioritization. In a receding horizon planning algorithm, an agent’s control inputs are optimized while adhering to constraints in the future, i.e., over the time horizon. A change in the constraints can counteract the predictive nature of a receding horizon planning algorithm, which can affect the solution quality (Scheffe et al., 2025). For example, if an agent must come to a full stop within the time horizon to avoid a collision with a higher-priority agent, it might decelerate. However, if the priorities flip and the other agent avoids the collision instead, the deceleration reduces the solution quality without need. Therefore, one objective is to enable the algorithm to retain priorities, and thus avoid it being forced to change priorities. The other objective is to explore many prioritizations to find the best available solution. A poor selection of an initial set of agent classes can limit the exploration of prioritizations to a fraction of the number of possible prioritizations.
We achieve the first objective by using the computation sequence with the lowest cost according to (20) from a previous time step for the initial prioritization. From this computation sequence, we generate priorities with (13). To guarantee the validity of these priorities according to (4), they must be unique to account for the time-variability of the coupling graph. In the next time step, we orient the undirected coupling graph by prioritizing agents with the retained priorities, thus creating a directed coupling graph. From this directed coupling graph, we construct the initial computation sequence with our Algorithm 1. With this process, it is always possible to maintain the priorities of the previous time step even with a time-variant coupling graph.
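Orienting the undirected coupling graph with retained priorities can be sketched as follows. The direction convention is an assumption here, since (2) is not restated in this section: we let edges point from the agent with the lower priority value, i.e., the higher priority, to the other agent (names illustrative).

```python
def orient_graph(undirected_edges, prio):
    """Orient an undirected coupling graph with retained priorities.
    Assumption: the edge points from the agent with the lower priority
    value (higher priority) to the agent with the higher value."""
    return {(i, j) if prio[i] < prio[j] else (j, i)
            for (i, j) in undirected_edges}
```

With the priorities 5, 10, 11, 16 of (14), any undirected coupling between agents 1 and 3, for example, becomes the directed edge 1→3.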
We took a step towards the second objective by the random construction of the computation schedule matrix in Algorithm 2. However, given a computation sequence, only a subset of computation schedule matrices can be constructed. The question remains: which computation sequences can be explored given an initial sequence? We explore different computation sequences if we build unique computation schedule matrices. Two computation schedule matrices are unique if the sets of their rows differ, i.e., if one matrix contains a computation sequence that the other one does not contain. In other words, they are the same if one can be mapped to the other by rearranging its rows. For \(\numLevels = 1\) and \(\numLevels = 2\), there exists only a single unique computation schedule matrix, so all sequences can be explored and we obtain the optimal computation sequence. For \(\numLevels = 3\), there exist two unique computation schedule matrices which do not share a single row. With the initial prioritization selected by retaining priorities, we can transition from one computation schedule matrix to another between time steps as long as they share an identical row, i.e., an identical computation sequence. Consequently, retaining the result for an initial prioritization for \(\numLevels = 3\) restricts us to examining only three out of six possible computation sequences. More formally, we can construct a computation schedule graph \(\graphUndirected_{c} = \left (\setVertices_{c}, \setEdges_{c} \right )\) with the set of all unique computation schedule matrices as vertices \(\setVertices_{c}\), which are connected by an edge if they share an identical row
\(\seteqnumber{0}{}{21}\)\begin{equation} \label {eq:computation-schedule-graph-edges} \edgeUndirected {i}{j} \in \setEdges_{c} \iff \exists n,m \colon \matScheduleVector {n \cdot }^{(i)} = \matScheduleVector {m \cdot }^{(j)} \end{equation}
with the row \(\matScheduleVector {n \cdot }^{(i)}\) belonging to the computation schedule matrix in vertex \(i\), and the row \(\matScheduleVector {m \cdot }^{(j)}\) belonging to the computation schedule matrix in vertex \(j\).
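The adjacency test (22) compares row sets. A Python sketch (function name illustrative); for \(\numLevels = 3\), the two unique Latin squares mentioned above indeed share no row:

```python
def share_a_row(A, B):
    """Two computation schedule matrices are adjacent in the computation
    schedule graph, per (22), iff they share an identical row."""
    rows_a = {tuple(row) for row in A}
    return any(tuple(row) in rows_a for row in B)
```

Two matrices whose row sets are equal are not unique in the sense above; they represent the same set of computation sequences.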
Theorem 4. Let the initial prioritization be selected by retaining priorities, and let the computation schedule matrix be created with Algorithm 2. For \(\numLevels \geq 4\), we can explore all possible computation sequences of length \(\numLevels \) given enough time.
Proof. We can prove this theorem by showing that the computation schedule graph \(\graphUndirected_{c}\) contains a spanning tree. Let an ascending sequence denote a sequence in which the indices of the agent classes are consecutive for consecutive elements, except for the index \(\numLevels \), which is followed by \(1\). We prove our theorem by showing that all computation schedule matrices are connected to a computation schedule matrix which contains the ascending sequence \((\classAgentsA {1}, \ldots , \classAgentsA {\numLevels })\). All matrices containing the computation sequence \((\classAgentsA {1}, \ldots , \classAgentsA {\numLevels })\) are connected according to (22). All \(\numLevels \) different ascending sequences fit into one computation schedule matrix by shifting the starting index by one for each row. It follows that if any ascending sequence is part of a computation schedule matrix, this matrix is connected to a matrix containing the computation sequence \((\classAgentsA {1}, \ldots , \classAgentsA {\numLevels })\). If the initial sequence does not allow any ascending sequence to be part of the computation schedule matrix, exactly one element of this initial sequence is responsible for making exactly one of the ascending sequences invalid. Put differently: as long as there are two agent classes with consecutive indices in the initial sequence – both of which belong to a single ascending sequence and thus invalidate the same ascending sequence – one ascending sequence must remain valid in the corresponding computation schedule matrix. In the case that the initial sequence does not allow any ascending sequence to be part of the computation schedule matrix, we can place any ascending sequence and swap the element that would make the sequence invalid with its succeeding element. This results in a sequence in which \(\numLevels - 2\) elements are consecutive in their agent class index.
If \(\numLevels \geq 4\), more than two elements of this computation sequence are consecutive in their agent class index. As pointed out above, this matrix is connected to a matrix which contains a valid ascending sequence, which in turn is connected to a matrix containing the sequence \((\classAgentsA {1}, \ldots , \classAgentsA {\numLevels })\). Therefore, a spanning tree of \(\graphUndirected_{c}\) is rooted in the corresponding vertex. □
3.5 Discussion of our Approach for Simultaneous Computation #
There are up to \(\numAgents !\) possible prioritizations \(N_{\text {prio}}\), so the number of possible prioritizations exhibits factorial growth with the number of agents. The number of prioritizations which our approach considers equals the number of agent classes, which is upper bounded by the number of agents. Consequently, the ratio of explored prioritizations to all prioritizations decreases rapidly with the number of agents. However, the number of possible prioritizations is actually related to the connectivity of the coupling graph, which can be estimated by the vertex degree.
Definition 11 (Degree). The degree
\(\seteqnumber{0}{}{22}\)\begin{equation} \vertexDegree {i} = \lvert \setNeighbors {i} \rvert \end{equation}
of a vertex \(i\) denotes the number of its adjacent vertices. The number of incoming edges, called the in-degree, is denoted by
\(\seteqnumber{0}{}{23}\)\begin{equation} \vertexInDegree {i} = \lvert \setPredecessors {i} \rvert . \end{equation}
The number of outgoing edges, called the out-degree, is denoted by
\(\seteqnumber{0}{}{24}\)\begin{equation} \vertexOutDegree {i} = \lvert \setSuccessors {i} \rvert . \end{equation}
Theorem 5. The number of possible prioritizations \(N_{\text {prio}}\) of an undirected coupling graph \(\graphUndirected = (\setVertices , \setEdges )\) is upper bounded by
\(\seteqnumber{0}{}{25}\)\begin{equation} N_{\text {prio}} \leq \left ( \frac {2 \cdot \abs {\setEdges }}{\numAgents } \right )^{\numAgents } \leq \prod_{i\in \setVertices } \left ( \vertexDegree {i} + 1 \right ) . \end{equation}
Proof. The number of possible prioritizations is equal to the number of acyclic orientations of \(\graphUndirected \). The upper bound above is the upper bound on the number of acyclic orientations given in (Manber & Tompa, 1981, Theorem 10). □
Remark 1. According to Theorem 5, the number of possible prioritizations decreases with the sparsity of the coupling graph. Consequently, if we achieve a sparse coupling graph, the ratio of explored prioritizations to all prioritizations can be higher than the worst-case approximation of \(N_{\text {prio}}\) with \(\numAgents !\) suggests.
Our approach only explores a fraction of all possible prioritizations. However, if the networked planning problem changes only slightly from one time step to another, or if the approach is applied repeatedly, it can improve the solution quality incrementally. A significant benefit of our approach is that it is general and does not require any domain-specific knowledge.
4 Motion Planning for CAVs #
MAMP is a variant of MAPF. The evaluation of the approach presented in this paper takes place in the context of MAMP for CAVs. Extending the description of the problem setting in Section 1.2, we state the following differences between traditional MAPF and our formulation of MAMP for CAVs.
1. System dynamics: In MAPF, the dynamics of robots are usually abstracted in the graph search problem (e.g., Banfi, Shree, & Campbell, 2020; Li, Tinka, et al., 2021). In our MAMP, we explicitly consider the system dynamics in the planning problem by encoding motion in a motion primitive automaton (MPA).
2. Objective: In MAPF, the objective of agents is to plan a path to reach a target vertex in the graph representing the environment without collisions. In our MAMP, the objective of agents is to plan motions to stay close to a reference path without collisions. The reference path is the result of a routing layer and does not consider obstacles.
3. Planning time horizon: In MAPF, the planning time horizon is variable and extends until the agent has reached its target vertex. An agent plans once, and the plan reaches from the agent’s start vertex to its target vertex. In our MAMP, the planning time horizon is fixed. This allows for short and predictable computation times. An agent plans at every time step, thereby shifting its planning horizon, an approach known as receding horizon control (RHC).
4. Environment representation: In MAPF, the environment with its obstacles is encoded in a graph, where vertices represent locations at which agents can stop, and edges represent discretized motions of agents. In our MAMP, the environment with its obstacles is represented by a list of polygons. Each agent is required to avoid these polygons when planning motions.
There are works that go beyond traditional MAPF and extend the above-listed categorization; e.g., system dynamics are considered in (Alonso-Mora, Beardsley, & Siegwart, 2018; Le & Plaku, 2018; Luo, Chakraborty, & Sycara, 2016).
We formulate motion planning as an optimal control problem (OCP) in Section 4.1. We present our model for considering kinodynamic constraints in Section 4.2, and we show how we couple agents in Section 4.3. Section 4.4 presents our algorithm to solve the planning problem of each agent, which is based on model predictive control (MPC).
4.1 Motion Planning as an Optimal Control Problem #
This section presents the ordinary differential equations describing the vehicle dynamics and our cost function before both are incorporated into an OCP for motion planning. The variables and equations in this section belong to a single agent \(\anAgent \).
We refer to the entirety of all agents as a networked control system (NCS). We control multiple agents in an NCS with a combination of prioritized planning and distributed model predictive control (DMPC) (Richards & How, 2007). This combination is referred to as prioritized distributed model predictive control (Kloock, Scheffe, et al., 2019; Scheffe et al., 2022; Scheffe, Pedrosa, Flaßkamp, & Alrifaee, 2024; Scheffe, Xu, & Alrifaee, 2024). Finding the agents’ control inputs is modeled as an OCP. We denote by agent OCPs the OCPs solved by the agents, and by networked OCP the OCP for the entire prioritized NCS.
Figure 11 shows an overview of the variables of the nonlinear kinematic single-track model (Rajamani, 2006, p. 21). Assuming low velocities, we model no slip on the front and rear wheels and neglect forces acting on the vehicle. The resulting equations are
\(\seteqnumber{0}{}{26}\)\begin{equation} \label {eq:kst_model} \left \lbrace \begin{aligned} \dot {x}\ofAgent {\anAgent }(t) &= \sysSpeed \ofAgent {\anAgent }(t) \cdot \cos (\psi \ofAgent {\anAgent }(t) + \beta \ofAgent {\anAgent }(t)), \\ \dot {y}\ofAgent {\anAgent }(t) &= \sysSpeed \ofAgent {\anAgent }(t) \cdot \sin (\psi \ofAgent {\anAgent }(t) + \beta \ofAgent {\anAgent }(t)), \\ \dot {\psi }\ofAgent {\anAgent }(t) &= \sysSpeed \ofAgent {\anAgent }(t) \cdot \frac {1}{L} \cdot \tan (\delta \ofAgent {\anAgent }(t)) \cos (\beta \ofAgent {\anAgent }(t)), \\ \dot {\sysSpeed }\ofAgent {\anAgent }(t) &= \sysInSpeed \ofAgent {\anAgent }(t), \\ \dot {\delta }\ofAgent {\anAgent }(t)&= \sysInSteering \ofAgent {\anAgent }(t), \end {aligned} \right . \end{equation}
with
\(\seteqnumber{0}{}{27}\)\begin{equation} \label {eq:beta} \beta \ofAgent {\anAgent }(t) = \tan ^{-1}\left (\frac {L_r}{L} \tan (\delta \ofAgent {\anAgent }(t)) \right ), \end{equation}
where \(x\ofAgent {\anAgent }\in \setRealNumbers \) and \(y\ofAgent {\anAgent }\in \setRealNumbers \) describe the position of the center of gravity (CG), \(\psi \ofAgent {\anAgent }\in [0,2\pi )\) is the orientation, \(\beta \ofAgent {\anAgent }\in [-\pi ,\pi )\) is the side slip angle, \(\delta \ofAgent {\anAgent }\in [-\pi ,\pi )\) and \(\sysInSteering \in \setRealNumbers \) are the steering angle and its derivative respectively, \(\sysSpeed \ofAgent {\anAgent }\in \setRealNumbers \) and \(\sysInSpeed \in \setRealNumbers \) are the speed and acceleration of the CG respectively, \(L \) is the wheelbase length and \(L_r\) is the length from the rear axle to the CG.
The system dynamics defined in (27) are compactly written as
\(\seteqnumber{0}{}{28}\)\begin{equation} \label {eq:dynamics-general-continuous} \dot {\sysState }\ofAgent {\anAgent }(t) := \frac {d}{dt} \sysState \ofAgent {\anAgent } (t) = f \big ( \sysState \ofAgent {\anAgent }(t), \sysControlInputs \ofAgent {\anAgent }(t) \big ) \end{equation}
with the state vector
\(\seteqnumber{0}{}{29}\)\begin{equation} {\sysState \ofAgent {\anAgent }} = \begin{pmatrix} x\ofAgent {\anAgent } & y\ofAgent {\anAgent } & \psi \ofAgent {\anAgent } & \sysSpeed \ofAgent {\anAgent } & \delta \ofAgent {\anAgent } \end {pmatrix}\transposed \in \setRealNumbers ^5, \end{equation}
the control input
\(\seteqnumber{0}{}{30}\)\begin{equation} \label {eq:input_vector_kst} {\sysControlInputs \ofAgent {\anAgent }} = \begin{pmatrix} \sysInSpeed \ofAgent {\anAgent } & \sysInSteering \ofAgent {\anAgent } \end {pmatrix}\transposed \in \setRealNumbers ^2 \end{equation}
and the vector field \(f\) defined by (27). Transferring (29) to a discrete-time nonlinear system representation yields
\(\seteqnumber{0}{}{31}\)\begin{equation} \label {eq:dynamics-general-discrete} \sysState \ofAgent {\anAgent }_{k+1} = f_d \bigl ( \sysState \ofAgent {\anAgent }_{k}, \sysControlInputs \ofAgent {\anAgent }_{k} \bigr ) \end{equation}
with the time step \(k\in \setNaturalNumbers \) and the vector field \(f_d \colon \setRealNumbers ^5 \times \setRealNumbers ^2 \to \setRealNumbers ^5\).
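To make the discretization step concrete, the following sketch integrates the continuous-time kinematic single-track dynamics (27) with a fourth-order Runge–Kutta step to obtain a discrete-time update in the spirit of \(f_d\). It is illustrative Python (the article's implementation is in MATLAB), and the wheelbase parameters are placeholder values, not those of the experiments.

```python
import math

def kst_dynamics(state, inputs, L=0.15, L_r=0.075):
    """Continuous-time kinematic single-track model, cf. (27);
    L and L_r are placeholder geometry values."""
    x, y, psi, v, delta = state
    a, d_delta = inputs                            # acceleration, steering rate
    beta = math.atan((L_r / L) * math.tan(delta))  # side slip angle, cf. (28)
    return [
        v * math.cos(psi + beta),                  # x_dot
        v * math.sin(psi + beta),                  # y_dot
        v / L * math.tan(delta) * math.cos(beta),  # psi_dot
        a,                                         # v_dot
        d_delta,                                   # delta_dot
    ]

def rk4_step(state, inputs, dt=0.2):
    """One discrete-time update of the state via a fourth-order
    Runge-Kutta step, approximating the discrete-time model f_d."""
    def shifted(s, k, h):
        return [si + h * ki for si, ki in zip(s, k)]
    k1 = kst_dynamics(state, inputs)
    k2 = kst_dynamics(shifted(state, k1, dt / 2), inputs)
    k3 = kst_dynamics(shifted(state, k2, dt / 2), inputs)
    k4 = kst_dynamics(shifted(state, k3, dt), inputs)
    return [s + dt / 6 * (a1 + 2 * a2 + 2 * a3 + a4)
            for s, a1, a2, a3, a4 in zip(state, k1, k2, k3, k4)]
```

With zero steering and constant speed of 1 m/s, one step of duration 0.2 s advances the vehicle 0.2 m along its heading.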
We define the cost function for agent \(\anAgent \) to minimize given a prioritization \(\fcnPrio \) in our motion planning problem as
\(\seteqnumber{0}{}{32}\)\begin{equation} \label {eq:cost_mp} \fcnObjective_{\fcnPrio }\ofAgent {\anAgent }(k) = \sum_{\timestepIterator =1}^{\horizonPrediction } \left ( \sysState \ofAgent {\anAgent }\forTimeAtTime {\timestep +\timestepIterator }{\timestep } - \sysRef \ofAgent {\anAgent }\forTimeAtTime {\timestep +\timestepIterator }{\timestep } \right )\transposed \bm {Q} \left ( \sysState \ofAgent {\anAgent }\forTimeAtTime {\timestep +\timestepIterator }{\timestep } - \sysRef \ofAgent {\anAgent }\forTimeAtTime {\timestep +\timestepIterator }{\timestep } \right ) \end{equation}
with the prediction horizon \(\horizonPrediction \in \setNaturalNumbers \), a reference trajectory \(\sysRef \ofAgent {\anAgent }\forTimeAtTime {\cdot }{\timestep } \in \setRealNumbers ^{5}\), and the positive semi-definite, block diagonal matrix
\(\seteqnumber{0}{}{33}\)\begin{equation} \bm {Q} = \begin{pmatrix} \bm {I}_2 & \bm {0}_{2\times 3} \\ \bm {0}_{3\times 2} & \bm {0}_3 \\ \end {pmatrix} \in \setRealNumbers ^{5\times 5} . \end{equation}
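Since \(\bm {Q}\) contains an identity block only for the first two states, the cost above reduces to the squared position deviation from the reference, summed over the horizon. A minimal illustrative sketch (plain Python lists; names are not from the article):

```python
def tracking_cost(states, refs):
    """Tracking cost with Q = blkdiag(I_2, 0_3): only the deviation of the
    position (x, y) from the reference trajectory is penalized; the
    orientation, speed, and steering states carry zero weight."""
    total = 0.0
    for s, r in zip(states, refs):       # iterate over the prediction horizon
        dx, dy = s[0] - r[0], s[1] - r[1]
        total += dx * dx + dy * dy
    return total
```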
The OCP of agent \(\anAgent \) follows in (35). Since multiple agents are involved in the equation, we will include the agent superscript. We assume a full measurement or estimate of the state \(\sysState \ofAgent {\anAgent }(\timestep )\) is available at the current time \(\timestep \).
\(\seteqnumber{0}{}{34}\)\begin{equation} \begin{aligned} \min_{\sysControlInputs \ofAgent {\anAgent }\forTimeAtTime {\cdot }{\timestep }} \quad & \fcnObjective \ofAgent {\anAgent }_{\fcnPrio }(k) \\ \text {s.t.}\quad & \sysState \ofAgent {\anAgent }\forTimeAtTime {\timestep +\timestepIterator +1}{\timestep } = \sysModelDiscrete \!\left ( \sysState \ofAgent {\anAgent }\forTimeAtTime {\timestep +\timestepIterator }{\timestep }, \sysControlInputs \ofAgent {\anAgent }\forTimeAtTime {\timestep +\timestepIterator }{\timestep } \right ), \\ & \timestepIterator = 0,\ldots ,\horizonPrediction -1, \\ & \sysState \ofAgent {\anAgent }\forTimeAtTime {\timestep +\timestepIterator }{\timestep } \in \setFeasibleStates , \\ & \timestepIterator = 1,\ldots ,\horizonPrediction -1, \\ & \sysState \ofAgent {\anAgent }\forTimeAtTime {\timestep +\horizonPrediction }{\timestep } \in \setFeasibleStates_f, \\ & \sysState \ofAgent {\anAgent }\forTimeAtTime {\timestep }{\timestep } = \sysState \ofAgent {\anAgent }(\timestep ), \\ & \sysControlInputs \ofAgent {\anAgent }\forTimeAtTime {\timestep +\timestepIterator }{\timestep } \in \setFeasibleInputs , \\ & \timestepIterator = 0,\ldots ,\horizonControl -1, \\ & \fcnConstraintCoupling \ofAgent {\anAgent \seqCoupling \anotherAgent }\!\left ( \sysState \ofAgent {\anAgent }\forTimeAtTime {\timestep +\timestepIterator }{\timestep }, \sysState \ofAgent {\anotherAgent }\forTimeAtTime {\timestep +\timestepIterator }{\timestep } \right ) \le 0, \\ & \timestepIterator = 1,\ldots ,\horizonPrediction ,\quad \anotherAgent \in \setPredecessors {\anAgent }_{\fcnPrio }(\timestep ). \end {aligned} \end{equation}
Here, \(\sysModelDiscrete \colon \setRealNumbers ^{\numStates }\times \setRealNumbers ^{\numInputs }\to \setRealNumbers ^{\numStates }\) is a vector field representing the discrete-time nonlinear system model with \(\numStates \in \setNaturalNumbers \) as the number of states and \(\numInputs \in \setNaturalNumbers \) as the number of inputs, \(\sysControlInputs \ofAgent {\anAgent }\forTimeAtTime {\cdot }{\timestep } = (\sysControlInputs \ofAgent {\anAgent }\forTimeAtTime {\timestep }{\timestep },\sysControlInputs \ofAgent {\anAgent }\forTimeAtTime {\timestep +1}{\timestep }, \dots , \sysControlInputs \ofAgent {\anAgent }\forTimeAtTime {\timestep +\horizonControl -1}{\timestep })\) is the input trajectory, \(\setFeasibleStates \) is the set of feasible states, \(\setFeasibleStatesTerminal \) is the set of feasible terminal states, and \(\setFeasibleInputs \) is the set of feasible inputs. The coupling constraint \(\fcnConstraintCoupling \ofAgent {\anAgent \seqCoupling \anotherAgent } \colon \setRealNumbers ^{\numStates } \times \setRealNumbers ^{\numStates } \to \setRealNumbers \) in (35) enforces networked constraints among agents. The set of predecessors \(\setPredecessors {\anAgent }_{\fcnPrio }(\timestep )\) contains the neighbors of \(\anAgent \) with higher priority given the prioritization \(\fcnPrio \). An agent \(\anAgent \) solves its OCP after having received the plans of all predecessors in \(\setPredecessors {\anAgent }_{\fcnPrio }(\timestep )\). Afterwards, it communicates its own plan to the agents in \(\setSuccessors {\anAgent }_{\fcnPrio }(\timestep )\). Since the planning problem (35) requires the predecessors’ plans, the agents must solve their OCPs in a sequential order. Each agent solves the OCP (35) repeatedly after a time step duration \(\tSample \) and with updated values for the states and constraints, which establishes the RHC.
Only the first input \(\sysControlInputsA {\anAgent }\forTimeAtTime {\timestep }{\timestep }\) of the input trajectory is applied.
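The resulting sequential order can be sketched as follows: an agent plans only once all of its predecessors have communicated their plans, while agents whose predecessor sets are already satisfied may plan in parallel. The function `solve_ocp` and the dictionary bookkeeping are illustrative placeholders, not the article's implementation.

```python
def plan_sequentially(agents, predecessors, solve_ocp):
    """Prioritized planning: each agent solves its OCP only after all
    higher-priority coupled neighbors (predecessors) have planned.
    `predecessors` maps agent -> set of higher-priority coupled agents;
    `solve_ocp(agent, fixed_plans)` is a placeholder for the agent's OCP,
    taking the predecessors' plans as fixed constraints."""
    plans = {}
    remaining = set(agents)
    while remaining:
        # agents whose predecessors have all planned are ready
        # (they could compute in parallel)
        ready = [a for a in remaining
                 if predecessors.get(a, set()).issubset(plans)]
        if not ready:
            raise ValueError("priority graph is cyclic")
        for a in ready:
            fixed = {p: plans[p] for p in predecessors.get(a, set())}
            plans[a] = solve_ocp(a, fixed)
            remaining.remove(a)
    return plans
```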
4.2 System Modeling with a Motion Primitive Automaton #
This section presents how we model the state-continuous system (29) as an MPA (Frazzoli, Dahleh, & Feron, 2005; Scheffe et al., 2023). The MPA incorporates the constraints of the OCP (35) on the system dynamics, on the control inputs, and on both the steering angle and the speed.
From the system dynamics (27), we derive a finite state automaton which we call MPA and define as follows.
-
Definition 12 (Motion primitive automaton, from (Scheffe et al., 2023)). An MPA is a 5-tuple \((\setAutomatonStates ,\setAutomatonTransitions ,\gamma ,\automatonState_0,\setAutomatonStates_f)\) composed of:
-
• \(\setAutomatonStates \) is a finite set of automaton states \(\automatonState \);
-
• \(\setAutomatonTransitions \) is a finite set of transitions \(\automatonTransition \), also called motion primitives (MPs);
-
• \(\gamma :\setAutomatonStates \times \setAutomatonTransitions \times \setNaturalNumbers \to \setAutomatonStates \) is the update function defining the transition from one automaton state to another, dependent on the time step in the horizon;
-
• \(\automatonState_0\in \setAutomatonStates \) is the initial automaton state;
-
• \(\setAutomatonStates_f\subseteq \setAutomatonStates \) is the set of final automaton states.
-
An automaton state is characterized by a specific speed \(\sysSpeed \) and steering angle \(\delta \). An MP is characterized by an input trajectory and a corresponding state trajectory which starts and ends with the speed and steering angle of an automaton state. It has a fixed duration which we choose to be equal to the time step duration \(\tSample \). MPs can be concatenated to form longer state trajectories by rotation and translation. Our MPA discretizes both the state space with the update function \(\gamma \) and time with the fixed MP duration \(\tSample \). This MPA replaces the system representation (32). Note that the dynamics model on which our MPA is based is interchangeable. Its complexity is irrelevant to the computation time of motion planning, since MPs are computed offline. The MPA design forces vehicles to come to a complete stop at the end of the prediction horizon, which guarantees that solutions to (35) are recursively feasible; for details see our previous work (Scheffe et al., 2022, 2023).
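Concatenating MPs by rotation and translation can be sketched as follows. For brevity, trajectories are reduced to poses \((x, y, \psi )\) given in a primitive-local frame; the names and the waypoint format are illustrative, not the article's data structures.

```python
import math

def apply_primitive(pose, primitive):
    """Append a motion primitive to the end pose of a trajectory.
    `pose` is (x, y, psi); `primitive` is a list of (dx, dy, dpsi)
    waypoints in the primitive's local frame, which starts at the origin
    with zero heading. The primitive is rotated by psi and translated to
    (x, y). Returns the transformed waypoints and the new end pose."""
    x, y, psi = pose
    c, s = math.cos(psi), math.sin(psi)
    points = [(x + c * dx - s * dy, y + s * dx + c * dy, psi + dpsi)
              for dx, dy, dpsi in primitive]
    return points, points[-1]

# two copies of a straight 1 m primitive concatenated into a 2 m trajectory
straight = [(0.5, 0.0, 0.0), (1.0, 0.0, 0.0)]
traj, end = apply_primitive((0.0, 0.0, 0.0), straight)
traj2, end2 = apply_primitive(end, straight)
```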
4.3 Coupling of Vehicles #
Although coupling all vehicles guarantees that coupling constraints for all vehicles will be considered, it also prolongs computation time and increases the number of prioritizations compared to coupling fewer vehicles. Therefore, we only couple vehicles that can potentially collide within the horizon \(\horizonPrediction \). We couple two vehicles at a time step \(\timestep \) if their reachable sets intersect in at least one step of the horizon \(\horizonPrediction \).
-
Definition 13 (Reachable set of a time interval, from Scheffe, Xu, and Alrifaee (2024)). The reachable set \(\setReachableB {\anAgent }\forTimeAtTime {[t_1,t_2]}{t_0}\) of the states of vehicle \(\anAgent \) between time \(t_1\) and time \(t_2\) starting from time \(t_0\) is
\(\seteqnumber{0}{}{35}\)\begin{equation} \label {eq:reachableSet} \setReachableB {\anAgent }\forTimeAtTime {[t_1,t_2]}{t_0} = \biggl \{ \bigcup_{t' \in [t_1, t_2]} \int_{t_0}^{t'} f(\sysState \ofAgent {\anAgent },\sysControlInputs \ofAgent {\anAgent })dt \nonumber \biggm | \sysState \ofAgent {\anAgent }(t_0) \in \setFeasibleStates (t_0), \sysControlInputs \ofAgent {\anAgent } \in \setFeasibleInputs \biggr \}. \end{equation}
The reachable sets \(\setReachableB {\anAgent }\forTimeAtTime {[\timestep +\timestepIterator ,\timestep +\timestepIterator +1]}{\timestep }\) over the prediction horizon of a vehicle can be computed offline with the MPA presented in Section 4.2. The set of coupling edges is
\(\seteqnumber{0}{}{35}\)\begin{equation} \label {eq:isCoupled} \begin{gathered} \setEdges (\timestep ) = \Bigl \{ \edgeUndirected {\anAgent }{\anotherAgent } \Bigm | \anAgent , \anotherAgent \in \setAgents ,\, \anAgent \neq \anotherAgent , \\ \exists \timestepIterator \in \{0, \ldots , \horizonPrediction -1\} \colon \, \setReachableB {\anAgent }\forTimeAtTime {[\timestep +\timestepIterator ,\timestep +\timestepIterator +1]}{\timestep } \cap \setReachableB {\anotherAgent }\forTimeAtTime {[\timestep +\timestepIterator ,\timestep +\timestepIterator +1]}{\timestep } \neq \emptyset \Bigr \}, \end {gathered} \end{equation}
which results in a time-variant coupling graph \(\graphUndirected (\timestep )=\left (\setVertices , \setEdges (\timestep )\right )\).
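A sketch of this coupling-edge construction, with reachable sets approximated by per-step axis-aligned bounding boxes instead of the offline-computed sets; all identifiers are illustrative.

```python
from itertools import combinations

def boxes_overlap(b1, b2):
    """Axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    return not (b1[2] < b2[0] or b2[2] < b1[0] or
                b1[3] < b2[1] or b2[3] < b1[1])

def coupling_edges(reachable, horizon):
    """Two vehicles are coupled if their reachable sets intersect in at
    least one step interval of the prediction horizon.
    `reachable[i][l]` is the bounding box of vehicle i for step l."""
    agents = list(reachable)
    return {(i, j) for i, j in combinations(agents, 2)
            if any(boxes_overlap(reachable[i][l], reachable[j][l])
                   for l in range(horizon))}
```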
4.4 Planning using Monte-Carlo Tree Search in a Model Predictive Control Framework #
It is computationally hard to find the global optimum of the planning problem (35) due to its nonlinearity and nonconvexity. We substitute the system model (32) with an MPA based on our previous work (Scheffe et al., 2023). Since our MPA is general rather than tailored to a specific environment, our motion planning is a receding horizon tree search rather than a graph search. Obstacles are included in the state constraints of our OCP formulation (35). In the Monte Carlo tree search (MCTS), we represent obstacles and predecessors’ trajectories as polygons.
The cost of edges in the tree is given by the cost function (33). In this article, we use a sampling-based approach based on MCTS to find a path in the tree with low cost. Due to its anytime property, MCTS is a popular approach to complex control problems (Baby & HomChaudhuri, 2023; Choudhury, Gupta, Morales, & Kochenderfer, 2022; Kurzer, Zhou, & Marius Zöllner, 2018), as it allows balancing computation effort and solution quality. In each time step, our MCTS randomly expands a fixed number of vertices in the search tree and thus inspects only a part of the complete search tree. After the expansions, the path in the search tree with the lowest cost according to (33) is selected. The algorithm has no optimality guarantee, but its computation time is nearly constant.
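The sampling idea can be sketched as follows: expand a fixed budget of randomly chosen vertices, then return the cheapest root-to-leaf path found. This strips away the selection and evaluation heuristics of a full MCTS and uses illustrative names; it only demonstrates the fixed-budget, anytime character.

```python
import random

def random_tree_search(root, expand, path_cost, budget, rng=None):
    """Sampling-based tree search with a fixed expansion budget:
    repeatedly pick a random known vertex, add one random unseen
    successor, and finally return the root-to-leaf path minimizing
    `path_cost`. `expand(state)` yields candidate successor states."""
    rng = rng or random.Random(0)
    parent = {root: None}                    # child -> parent
    frontier = [root]
    for _ in range(budget):                  # fixed number of expansions
        node = rng.choice(frontier)
        children = [c for c in expand(node) if c not in parent]
        if not children:
            continue
        child = rng.choice(children)
        parent[child] = node
        frontier.append(child)

    def path(node):
        p = []
        while node is not None:
            p.append(node)
            node = parent[node]
        return p[::-1]

    leaves = [n for n in parent if n not in set(parent.values())]
    return min((path(n) for n in leaves), key=path_cost)
```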
Because the MAMP runs online, vehicles require an input at every time step. However, since PP is incomplete (Theorem 1), the MCTS might fail to find a feasible solution. In such cases, vehicles reuse their previous, recursively feasible solution, ensuring safe motion plans (Scheffe et al., 2022).
5 Evaluation #
We evaluate the presented approach for increasing solution quality by exploring multiple prioritizations in the context of multi-agent motion planning for CAVs. In Section 5.1, we describe the experiment setup, a scenario on a road network with up to 20 vehicles. In Sections 5.2 and 5.3, we evaluate the quality of the trajectories of the vehicles in the NCS, and the computation time of the NCS, respectively. In Section 5.4, we demonstrate our approach in an experiment in the CPM Lab, an open-source, remotely-accessible testbed for CAVs (Kloock et al., 2021). In Section 5.5, we discuss the implications of our results.
We compare our prioritization algorithm with five algorithms from the literature. Each algorithm is represented by a prioritization function \(\fcnPrio \colon \setVertices \to \mathbb {N}\). The first function prioritizes vehicles according to their vertex number (Alrifaee et al., 2016) and is denoted by \(\fcnPrio_\text {constant}\). The second function prioritizes vehicles randomly at each time step (Bennewitz et al., 2002) and is denoted by \(\fcnPrio_\text {random}\). The third function prioritizes vehicles with a constraint-based heuristic (Scheffe et al., 2022). A higher number of potential collisions with other vehicles results in a higher priority. The function is denoted by \(\fcnPrio_\text {constraint}\). The fourth function prioritizes vehicles to decrease the number of agent classes, and thus the computation time, via graph coloring (Kuwata et al., 2007; Scheffe et al., 2025). The function is denoted by \(\fcnPrio_\text {color}\). The fifth function evaluates all possible prioritizations by solving the networked OCP for all acyclic orientations of the coupling graph. The function then chooses the prioritization which results in the lowest networked cost. It is denoted by \(\fcnPrio ^{*}\).
5.1 Setup #
In our experiments, vehicles move on the road network shown in Figure 16. It consists of an urban intersection with eight incoming lanes, a highway, and highway on- and off-ramps. With this variety of road segments, we challenge our motion planner with merging, crossing, and overtaking scenarios.
Our MCTS uses the MPA illustrated in Figure 12, which is based on the kinematic single-track model as presented in (Scheffe et al., 2023).
The MPA is discretized with four speed levels.
We set the duration for our numerical experiments to 7 s with a time step size of 0.2 s. The vehicles start at random positions on the road network and follow a randomly selected path which starts and ends at the same point. The reference speeds for the vehicles in the experiments are selected as one of the speed levels of our MPA shown in Figure 12 at random.
During the experiment, the vehicles need to stay within the road boundaries. Lower-prioritized vehicles must avoid collisions with higher-prioritized ones. Therefore, coupling constraints with higher-prioritized vehicles reduce the set of feasible system states \(\setFeasibleStates \) by the system states the higher-prioritized vehicles occupy during a motion primitive in the MPA (Scheffe et al., 2023). Two vehicles in an NCS are coupled whenever their reachable sets overlap (Scheffe, Xu, & Alrifaee, 2024). As the reachable set is limited in size, and vehicles move along the road network, couplings between vehicles appear and disappear over the course of an experiment. Therefore, the coupling graph is time-variant. We initialize priorities in the very first time step of an experiment with the agents’ vertex numbers.
Our algorithms ran on up to 20 Intel NUCs (NUC7i5BNK) connected via Ethernet, one for each participating vehicle. Each NUC was equipped with an Intel Core i5 7260U CPU with 2.2 GHz and 16 GB of RAM. The communication between two individual NUCs was realized with the ROS 2 Toolbox in MATLAB. The open-source algorithms are implemented in MATLAB.
5.2 Solution Quality #
We measure the solution quality with the networked cost \(\fcnObjectiveNcsA {\fcnPrio }\) of a prioritization \(\fcnPrio \), which aggregates the cost of each vehicle's planning problem:
\(\seteqnumber{0}{}{36}\)\begin{equation} \fcnObjectiveNcsA {\fcnPrio } = \sum_{\timestep =0}^{\timestep_{\text {experiment}}} \sum_{\anAgent =1}^{\numAgents } \fcnObjectiveA {\anAgent }_{\fcnPrio }(\timestep ), \end{equation}
with \(\timestep_{\text {experiment}}\) being the number of time steps in the experiment, and \(\fcnObjectiveA {\anAgent }_{\fcnPrio }(\timestep )\) given in (33). Figure 13 shows the cost normalized to that of the prioritization \(\fcnPrio_\text {constant}\).
Our approach, denoted by \(\fcnPrio_{\text {explore}}\), outperforms the approaches from the literature in terms of networked cost across all our experiments. For five and ten vehicles, the networked cost of our approach is within 1% of the cost of the optimal prioritization. For 15 vehicles, we reduce the networked cost by more than 53% compared to the baseline prioritization (\(\fcnPrio_{\text {constant}}\)). There is no value for the cost of the optimal prioritization for 15 and 20 vehicles as the algorithm did not terminate in reasonable computation time.
5.3 Computation Time #
Computation Time Definition #
The computation time when planning with a single prioritization was presented in Section 2.5. Our approach explores one prioritization per agent class. When exploring multiple prioritizations, the networked computation time is at least the maximum over all considered prioritizations. It can be even higher, as the computations of multiple prioritizations are interdependent: agents need to finish their computations before the next prioritization can be computed.
This additional relationship needs to be addressed in the computation graph, as illustrated in Figure 14 for the coupling graphs of Figure 9b. The edges in black indicate the couplings of agents in the three distinct prioritizations. Since an agent can only compute one solution at a time, the subgraphs of these prioritizations are connected by additional edges in teal accounting for the sequential computation. All edges in black and teal are weighted with the computation time \(\tComp ^{(i)}\) of an edge’s respective starting vertex. The weight of the longest path in this graph corresponds to the computation time of the NCS.
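Under the simplifying assumption that every vertex on a path contributes its own computation time, including the last one, the networked computation time can be computed as a longest path in the acyclic computation graph, e.g. with memoized depth-first search; the adjacency-map representation is illustrative.

```python
from functools import lru_cache

def networked_computation_time(successors, t_comp):
    """Longest path in an acyclic computation graph, where every outgoing
    edge of vertex v carries v's computation time t_comp[v], and the final
    vertex of a path also contributes its own computation time."""
    @lru_cache(maxsize=None)
    def longest_from(v):
        succ = successors.get(v, ())
        if not succ:
            return t_comp[v]                 # sink: its computation still counts
        return t_comp[v] + max(longest_from(u) for u in succ)
    return max(longest_from(v) for v in t_comp)
```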
Computation Time Analysis #
For five and ten vehicles, the maximum computation time of our approach is lower than that of the optimal prioritization by a factor of two and 15, respectively, while achieving a similar networked cost. For a larger number of vehicles, planning with the optimal prioritization is computationally too expensive, as the number of possible prioritizations exhibits factorial growth in the number of vehicles. As shown in Figure 15, our approach scales well with the number of vehicles; in particular, the median computation time increases only slightly compared to the baseline prioritization. We discuss the larger increase in maximum computation time in Section 5.5.
5.4 Experiment #
We conducted an experiment with ten vehicles in the CPM Lab (Kloock et al., 2021). A snapshot of the experiment is shown in Figure 16; a video of the experiment is available online. To achieve real-time capability, we combine our approach with our previous work of limiting the number of agent classes (Scheffe, Xu, & Alrifaee, 2024). We chose a limit of \(\numLevels = 2\).
In our experiment, we achieve a networked computation time with a median of 72 ms and a maximum of 111 ms across all time steps. With our time step duration of \(\tSample =\qty {200}{\milli \second }\), this networked computation time allows enough leeway for other subordinate computations and for communication.
5.5 Discussion #
Quality #
Experiments have indicated that constant priorities can yield better solution quality than time-varying priorities even in dynamic environments (Scheffe et al., 2022). A possible explanation is that agents can make full use of the predictive capabilities of MPC, as the agent OCPs remain similar between time steps. However, there are instances in which a different prioritization yields better solutions or is even required to enable progress at all. In contrast to approaches from the literature, our approach can either keep the current prioritization or switch to another one, depending on the networked cost.
Computation time #
As the networked computation time is equal to the weight of the longest path in the computation graph \(\graphDirected ^{\prime }\), the increase in computation time is more noticeable for a larger number of vehicles: with more vehicles, the number of agent classes increases, as shown in Figure 17.
Our approach explores one prioritization per agent class. If we weight all outgoing edges with the computation time associated with solving the vertex’ planning problem, the networked computation time is determined by the longest path in the computation graph, as shown in Figure 14. Through both the increased number of total computations and the higher connectivity of these computations in the computation graph, a larger computation time is more likely than with other prioritization approaches. This effect is especially noticeable when the computation times of the agents are dissimilar.
Communication Effort #
In our approach, the number of prioritizations explored is equal to the number of agent classes. For each prioritization, communication is required. Consequently, the total communication effort increases by a factor of the number of agent classes. However, the plans regarding each prioritization can be communicated in parallel, resulting in the same \(\numColors -1\) sequential communication steps as with a single prioritization. Additionally, agents communicate the cost of their solution to determine the networked cost of each prioritization. In practice, the exchanged data is small, as it only consists of the trajectory over the prediction horizon and the cost of the planning problem. However, if communication resources are scarce, this property needs to be accounted for.
Table 1 displays the communication and single-agent planning times we measured in our experiments. For the sequential approaches, the results support the assumption of a short communication time. With our simultaneous approach, however, the median communication time is half the planning time, and the maximum is more than 2.5 times the planning time. The main difference in communication between the simultaneous and the sequential approaches is the increased number of sent messages. Our implementation with MATLAB and the ROS 2 Toolbox seems to struggle with this higher number of messages, resulting in high latencies.
| | Sequential | | Simultaneous | |
| | med | max | med | max |
| Communication time [ms] | 4 | 33 | 11 | 157 |
| Single-agent planning time [ms] | 25 | 97 | 25 | 59 |
Communication time can be improved through the implementation or through local execution on multiple processing units. Our analysis of the computation time excludes the communication time and thus considers the effects of our approach on the computation time in isolation from secondary effects due to communication.
6 Conclusion #
6.1 Summary #
We presented an approach to utilize idle time in prioritized planning to simultaneously compute with multiple prioritizations. With our approach, we can achieve high solution quality with significantly less computational effort than that of sequential computation of the considered prioritizations. Additionally, our approach is general, as it does not rely on domain-specific heuristics.
Our approach can simultaneously examine as many prioritizations as there are agent classes. The higher the number of agent classes, the more prioritizations are examined. However, the number of possible prioritizations grows more quickly with the number of agent classes than the number of examined ones. Therefore, our approach is more likely to find the optimal prioritization with a lower number of agent classes. Additionally, a lower number of agent classes results in a smaller increase in computation time. These benefits make the approach well-suited to be combined with limiting the number of agent classes for real-time computation (Scheffe, Xu, & Alrifaee, 2024).
6.2 Future Work #
Our approach is general and can thus be transferred to prioritized computations in other applications, where communication is required and planning time outweighs communication time. It needs to be analyzed how well the approach transfers, especially regarding the communication effort and the computation time.
Since the planning problems of multiple prioritizations are solved simultaneously, the communication effort compared to computing with a single prioritization is multiplied by the number of agent classes. In our experiments, agents communicated via LAN. It seems likely that the communication effort is decreased if computations run on the same machine, and increased if agents communicate via WLAN. In any case, the gain in solution quality must outweigh the increase in communication effort compared to computing with a single prioritization. Further studies are required to verify that communication can be time-efficient enough to warrant the distributed execution of our approach for a large number of agents.
We presented solving an OCP as a control method to find control inputs and to generate a motion plan. However, our approach works well with any control method in which agents have similar computation times. Therefore, the approach could also be evaluated with different control methods.
We presented simulations with up to 20 agents and eight agent classes, and an experiment with ten agents and two agent classes. Conducting experiments with a larger number of agents and agent classes would be interesting to study the convergence to the optimal prioritization. An application with a convex OCP might be suitable to still achieve real-time capable computations.
Our algorithm to build a computation schedule matrix selects prioritizations randomly from all valid prioritizations. The algorithm might improve if it instead selects prioritizations based on knowledge of the solution quality of prioritizations it has solved before.
We aimed to increase the solution quality without increasing the computation time by utilizing unused computation resources in distributed PP. Thus, we construct a Latin square to explore \(\numLevels \) prioritizations in \(\numLevels \) sequential computations. However, the approach can be generalized to a desired number of sequential computations or a desired number of explored prioritizations \(N_{\text {explored}}\), making the computation schedule matrix a Latin rectangle. Then, the number of explored prioritizations can be selected according to the available computation time.
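One simple way to obtain such a Latin rectangle is a cyclic construction, sketched below; the article's algorithm instead samples prioritizations randomly from all valid ones, so the cyclic shift and the function name here are purely illustrative.

```python
def computation_schedule(n_classes, n_explored=None):
    """Cyclic Latin rectangle: entry [slot][cls] is the index of the
    prioritization that agent class `cls` computes in sequential slot
    `slot`. With n_explored == n_classes this is a Latin square; a
    smaller n_explored trades solution quality for computation time."""
    n_explored = n_classes if n_explored is None else n_explored
    return [[(cls + slot) % n_classes for cls in range(n_classes)]
            for slot in range(n_explored)]
```

Each row is a permutation of the prioritization indices and no class repeats a prioritization across slots, so every class is busy in every sequential computation.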
Our approach for simultaneous computation of multiple prioritizations is intended for PP where the coupling graph can be determined before planning. With extensions to limit the number of agent classes (Scheffe, Xu, & Alrifaee, 2024), this kind of PP can realize real-time control. When constructing a coupling graph a priori is impossible, PP with ad-hoc construction of a coupling graph can be used (e.g., Chan et al., 2023; Ma et al., 2019). Adapting our algorithm for this setting could increase the utilization of available computation resources.
References #
-
Almhanna, M. S., Al-Turaihi, F. S., & Murshedi, T. A. (2023, October). Reducing waiting and idle time for a group of jobs in the grid computing. Bulletin of Electrical Engineering and Informatics, 12(5), 3115–3123. doi: 10.11591/eei.v12i5.4729
-
Alonso-Mora, J., Beardsley, P., & Siegwart, R. (2018, April). Cooperative Collision Avoidance for Nonholonomic Robots. IEEE Transactions on Robotics, 34(2), 404–420. doi: 10.1109/TRO.2018.2793890
-
Alrifaee, B., Heßeler, F.-J., & Abel, D. (2016). Coordinated Non-Cooperative Distributed Model Predictive Control for Decoupled Systems Using Graphs. IFAC-PapersOnLine, 49(22), 216–221. doi: 10.1016/j.ifacol.2016.10.399
-
Azarm, K., & Schmidt, G. (1997, April). Conflict-free motion of multiple mobile robots based on decentralized motion planning and negotiation. In Proceedings of International Conference on Robotics and Automation (Vol. 4, p. 3526-3533 vol.4). doi: 10.1109/ROBOT.1997.606881
-
Baby, T. V., & HomChaudhuri, B. (2023, May). Monte Carlo Tree Search Based Trajectory Generation for Automated Vehicles in Interactive Traffic Environments. In 2023 American Control Conference (ACC) (pp. 4431–4436). doi: 10.23919/ACC55779.2023.10156530
-
Banfi, J., Shree, V., & Campbell, M. (2020, September). Planning High-Level Paths in Hostile, Dynamic, and Uncertain Environments. Journal of Artificial Intelligence Research, 69, 297–342. doi: 10.1613/jair.1.12077
-
Barer, M., Sharon, G., Stern, R., & Felner, A. (2021, September). Suboptimal Variants of the Conflict-Based Search Algorithm for the Multi-Agent Pathfinding Problem. Proceedings of the International Symposium on Combinatorial Search, 5(1), 19–27. doi: 10.1609/socs.v5i1.18315
-
Bennewitz, M., Burgard, W., & Thrun, S. (2002, November). Finding and optimizing solvable priority schemes for decoupled path planning techniques for teams of mobile robots. Robotics and Autonomous Systems, 41(2), 89–99. doi: 10.1016/S0921-8890(02)00256-7
-
Çam, Ö. N., & Sezen, H. K. (2020, March). The formulation of a linear programming model for the vehicle routing problem in order to minimize idle time. Decision Making: Applications in Management and Engineering, 3(1), 22–29. doi: 10.31181/dmame2003132h
-
Čáp, M., Novák, P., Kleiner, A., & Selecký, M. (2015, July). Prioritized Planning Algorithms for Trajectory Coordination of Multiple Mobile Robots. IEEE Transactions on Automation Science and Engineering, 12(3), 835–849. doi: 10.1109/TASE.2015.2445780
-
Chalaki, B., & Malikopoulos, A. A. (2022). A Priority-Aware Replanning and Resequencing Framework for Coordination of Connected and Automated Vehicles. IEEE Control Systems Letters, 6, 1772–1777. doi: 10.1109/LCSYS.2021.3133416
-
Chan, S.-H., Stern, R., Felner, A., & Koenig, S. (2023, July). Greedy Priority-Based Search for Suboptimal Multi-Agent Path Finding. Proceedings of the International Symposium on Combinatorial Search, 16(1), 11–19. doi: 10.1609/socs.v16i1.27278
-
Chen, D., Hajidavalloo, M. R., Li, Z., Chen, K., Wang, Y., Jiang, L., & Wang, Y. (2023, November). Deep Multi-Agent Reinforcement Learning for Highway On-Ramp Merging in Mixed Traffic. IEEE Transactions on Intelligent Transportation Systems, 24(11), 11623–11638. doi: 10.1109/TITS.2023.3285442
-
Choudhury, S., Gupta, J. K., Morales, P., & Kochenderfer, M. J. (2022, March). Scalable Online Planning for Multi-Agent MDPs. Journal of Artificial Intelligence Research, 73, 821–846. doi: 10.1613/jair.1.13261
-
Erdmann, M., & Lozano-Pérez, T. (1987, November). On multiple moving objects. Algorithmica, 2(1), 477. doi: 10.1007/BF01840371
-
Frazzoli, E., Dahleh, M. A., & Feron, E. (2005, December). Maneuver-based motion planning for nonlinear systems with symmetries. IEEE Transactions on Robotics, 21(6), 1077–1091. doi: 10.1109/TRO.2005.852260
-
Guo, T., & Yu, J. (2024, October). Expected 1.x Makespan-Optimal Multi-Agent Path Finding on Grid Graphs in Low Polynomial Time. Journal of Artificial Intelligence Research, 81, 443–479. doi: 10.1613/jair.1.16299
-
Kloock, M., Kragl, L., Maczijewski, J., Alrifaee, B., & Kowalewski, S. (2019). Distributed Model Predictive Pose Control of Multiple Nonholonomic Vehicles. In IEEE Intelligent Vehicles Symposium (IV) (pp. 1620–1625). doi: 10.1109/IVS.2019.8813980
-
Kloock, M., Scheffe, P., Botz, L., Maczijewski, J., Alrifaee, B., & Kowalewski, S. (2019). Networked Model Predictive Vehicle Race Control. In IEEE Intelligent Transportation Systems Conference (ITSC) (pp. 1552–1557). doi: 10.1109/ITSC.2019.8917222
-
Kloock, M., Scheffe, P., Maczijewski, J., Kampmann, A., Mokhtarian, A., Kowalewski, S., & Alrifaee, B. (2021). Cyber-Physical Mobility Lab: An Open-Source Platform for Networked and Autonomous Vehicles. In European Control Conference (ECC) (pp. 1937–1944). doi: 10.23919/ECC54610.2021.9654986
Korte, B., & Vygen, J. (2018). Combinatorial Optimization: Theory and Algorithms (Vol. 21). Berlin, Heidelberg: Springer Berlin Heidelberg. doi: 10.1007/978-3-662-56039-6
Kurzer, K., Zhou, C., & Marius Zöllner, J. (2018, June). Decentralized Cooperative Planning for Automated Vehicles with Hierarchical Monte Carlo Tree Search. In 2018 IEEE Intelligent Vehicles Symposium (IV) (pp. 529–536). doi: 10.1109/IVS.2018.8500712
Kuwata, Y., Richards, A., Schouwenaars, T., & How, J. P. (2007, July). Distributed Robust Receding Horizon Control for Multivehicle Guidance. IEEE Transactions on Control Systems Technology, 15(4), 627–641. doi: 10.1109/TCST.2007.899152
Le, D., & Plaku, E. (2018, October). Cooperative, Dynamics-based, and Abstraction-Guided Multi-robot Motion Planning. Journal of Artificial Intelligence Research, 63, 361–390. doi: 10.1613/jair.1.11244
Li, J., Hoang, T. A., Lin, E., Vu, H. L., & Koenig, S. (2023, June). Intersection Coordination with Priority-Based Search for Autonomous Vehicles. Proceedings of the AAAI Conference on Artificial Intelligence, 37(10), 11578–11585. doi: 10.1609/aaai.v37i10.26368
Li, J., Ruml, W., & Koenig, S. (2021, May). EECBS: A Bounded-Suboptimal Search for Multi-Agent Path Finding. Proceedings of the AAAI Conference on Artificial Intelligence, 35(14), 12353–12362. doi: 10.1609/aaai.v35i14.17466
Li, J., Tinka, A., Kiesel, S., Durham, J. W., Kumar, T. K. S., & Koenig, S. (2021). Lifelong Multi-Agent Path Finding in Large-Scale Warehouses. Proceedings of the AAAI Conference on Artificial Intelligence, 35(13), 11272–11281.
Luo, W., Chakraborty, N., & Sycara, K. (2016, July). Distributed dynamic priority assignment and motion planning for multiple mobile robots with kinodynamic constraints. In 2016 American Control Conference (ACC) (pp. 148–154). Boston, MA, USA: IEEE. doi: 10.1109/ACC.2016.7524907
Ma, H., Harabor, D., Stuckey, P. J., Li, J., & Koenig, S. (2019, July). Searching with Consistent Prioritization for Multi-Agent Path Finding. Proceedings of the AAAI Conference on Artificial Intelligence, 33, 7643–7650. doi: 10.1609/aaai.v33i01.33017643
Ma, H., Li, J., Kumar, T. K. S., & Koenig, S. (2017, May). Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks. In Proceedings of the 16th Conference on Autonomous Agents and Multi-Agent Systems (pp. 837–845). Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems.
Mak, H.-Y., Rong, Y., & Zhang, J. (2014, May). Sequencing Appointments for Service Systems Using Inventory Approximations. Manufacturing & Service Operations Management, 16(2), 251–262. doi: 10.1287/msom.2013.0470
Manber, U., & Tompa, M. (1981, October). The effect of number of Hamiltonian paths on the complexity of a vertex-coloring problem. In 22nd Annual Symposium on Foundations of Computer Science (sfcs 1981) (pp. 220–227). doi: 10.1109/SFCS.1981.47
McKay, B. D., & Wanless, I. M. (2005, October). On the Number of Latin Squares. Annals of Combinatorics, 9(3), 335–344. doi: 10.1007/s00026-005-0261-7
Pan, Z., Wang, R., Bi, Q., Zhang, X., & Yu, J. (2024, August). RH-ECBS: Enhanced conflict-based search for MRPP with region heuristics. Robotica, 1–11. doi: 10.1017/S0263574724000894
Rajamani, R. (2006). Vehicle Dynamics and Control. New York: Springer Science+Business Media.
Richards, A., & How, J. P. (2007, September). Robust distributed model predictive control. International Journal of Control, 80(9), 1517–1531. doi: 10.1080/00207170701491070
Scheffe, P., Dorndorf, G., & Alrifaee, B. (2022). Increasing Feasibility with Dynamic Priority Assignment in Distributed Trajectory Planning for Road Vehicles. In IEEE International Conference on Intelligent Transportation Systems (ITSC) (pp. 3873–3879). doi: 10.1109/ITSC55140.2022.9922028
Scheffe, P., Kahle, J., & Alrifaee, B. (2025, January). Graph Coloring to Reduce Computation Time in Prioritized Planning (No. arXiv:2501.10812). arXiv. doi: 10.48550/arXiv.2501.10812
Scheffe, P., Pedrosa, M. V. A., Flaßkamp, K., & Alrifaee, B. (2023). Receding Horizon Control Using Graph Search for Multi-Agent Trajectory Planning. IEEE Transactions on Control Systems Technology, 31(3), 1092–1105. doi: 10.1109/TCST.2022.3214718
Scheffe, P., Pedrosa, M. V. A., Flaßkamp, K., & Alrifaee, B. (2024). Prioritized Trajectory Planning for Networked Vehicles Using Motion Primitives. In C. Stiller, M. Althoff, C. Burger, B. Deml, L. Eckstein, & F. Flemisch (Eds.), Cooperatively Interacting Vehicles: Methods and Effects of Automated Cooperation in Traffic (pp. 253–275). Cham: Springer International Publishing. doi: 10.1007/978-3-031-60494-2
Scheffe, P., Xu, J., & Alrifaee, B. (2024, June). Limiting Computation Levels in Prioritized Trajectory Planning with Safety Guarantees. In 2024 European Control Conference (ECC) (pp. 297–304). Stockholm, Sweden: IEEE. doi: 10.23919/ECC64448.2024.10591179
Shahar, T., Shekhar, S., Atzmon, D., Saffidine, A., Juba, B., & Stern, R. (2021, March). Safe Multi-Agent Pathfinding with Time Uncertainty. Journal of Artificial Intelligence Research, 70, 923–954. doi: 10.1613/jair.1.12397
Sharon, G., Stern, R., Felner, A., & Sturtevant, N. R. (2015, February). Conflict-based search for optimal multi-agent pathfinding. Artificial Intelligence, 219, 40–66. doi: 10.1016/j.artint.2014.11.006
Silver, D. (2005). Cooperative Pathfinding. Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 1(1), 117–122. doi: 10.1609/aiide.v1i1.18726
van den Berg, J. P., & Overmars, M. H. (2005, August). Prioritized motion planning for multiple robots. In 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems (pp. 430–435). doi: 10.1109/IROS.2005.1545306
Wu, W., Bhattacharya, S., & Prorok, A. (2020, May). Multi-Robot Path Deconfliction through Prioritization by Path Prospects. In 2020 IEEE International Conference on Robotics and Automation (ICRA) (pp. 9809–9815). Paris, France: IEEE. doi: 10.1109/ICRA40945.2020.9196813
Yao, N., & Zhang, F. (2018, August). Resolving Contentions for Intelligent Traffic Intersections Using Optimal Priority Assignment and Model Predictive Control. In 2018 IEEE Conference on Control Technology and Applications (CCTA) (pp. 632–637). Copenhagen: IEEE. doi: 10.1109/CCTA.2018.8511561
Yu, J. (2016, January). Intractability of Optimal Multirobot Path Planning on Planar Graphs. IEEE Robotics and Automation Letters, 1(1), 33–40. doi: 10.1109/LRA.2015.2503143
Yu, J. (2020, March). Average case constant factor time and distance optimal multi-robot path planning in well-connected environments. Autonomous Robots, 44(3-4), 469–483. doi: 10.1007/s10514-019-09858-z
Yu, J., & LaValle, S. (2013, June). Structure and Intractability of Optimal Multi-Robot Path Planning on Graphs. Proceedings of the AAAI Conference on Artificial Intelligence, 27(1), 1443–1449. doi: 10.1609/aaai.v27i1.8541
Zhang, S., Li, J., Huang, T., & Koenig, S. (2022). Learning a Priority Ordering for Prioritized Planning in Multi-Agent Path Finding. Proceedings of the Symposium on Combinatorial Search, 15(1), 208–216.

