In this post we will walk through how to make least cost maximum flow decisions using linear programming.
One common extension to Maximum Flow problems is to do
it as cheaply as possible. In this article we will extend the maximum
flow example we wrote on in the last
post and include a minimum cost component. The generic problem
formation is below:
Objective \[ Minimize. \sum_{(i,j)\in A}{c_{ij}x_{ij}} \] \[ s.t. \] Node Flow Constraints \[ \sum_{j}{x_{ij}} - \sum_{i}{x_{ji}} = b_i \: \forall i \in N \] Arc Flow Constraints \[ l_{ij} \le x_{ij} \le u_{ij} \: \forall (i,j) \in A \]
Road networks are everywhere in our society. In any given intersection there is a flow of cars, intersections with stop lights, and connections between each. Every road has a feasible limit it can support. In fact, this is often the cause of most congestion. Our goal is to minimize the total time required for all cars to travel from node 1 to node 6 in a fictitious road network.
nodes <- data.frame(id = c(1:6), color = c("green", rep("grey", 4), "red"))
edges <- data.frame(from = c(1, 1, 2, 2, 5, 4, 4, 3, 3),
to = c(2, 3, 5, 4, 6, 5, 6, 5, 4),
lower_bound = 0,
upper_bound = c(800, 600, 100, 600, 600, 600, 400, 400, 300),
time = c(10, 50, 70, 30, 30, 30, 60, 60, 10), # in minutes
flow = 0, #TBD
color = "grey") %>% dplyr::arrange(from, to)
nodes
id color
1 1 green
2 2 grey
3 3 grey
4 4 grey
5 5 grey
6 6 red
edges
from to lower_bound upper_bound time flow color
1 1 2 0 800 10 0 grey
2 1 3 0 600 50 0 grey
3 2 4 0 600 30 0 grey
4 2 5 0 100 70 0 grey
5 3 4 0 300 10 0 grey
6 3 5 0 400 60 0 grey
7 4 5 0 600 30 0 grey
8 4 6 0 400 60 0 grey
9 5 6 0 600 30 0 grey
We can see the upper bounds plotted on the edges of this transportation network below. The width indicates more capacity for flow. Examine the trade-off between time and space for travel between arc (1,2).
g <- igraph::graph_from_data_frame(d = edges,
directed = TRUE,
vertices = nodes)
# Capacity plot
plot(g,
main = "Vehicle Transportion Network Capacities",
edge.label = E(g)$upper_bound,
edge.width = E(g)$upper_bound/150,
vertex.color = V(g)$color,
edge.color = E(g)$color)
# Time plot
plot(g,
main = "Vehicle Transportion Network Travel Time",
edge.label = E(g)$time,
edge.width = E(g)$time/10,
vertex.color = V(g)$color,
edge.color = E(g)$color)

First we assume the average number of cars that flow through this network. This is often recovered from past databases that record the flows through the network.
AVERAGE_FLOW <- 900 # per hour
AVERAGE_FLOW
[1] 900
Next we set up the demand that will be flowing through the network.
This is indicated as the vector b in our model formation
above. This means the initial node has a supply of 900 vehicles, while
the final node has a demand of 900 nodes. The objective is to flow as
many vehicles through the network, in the shortest amount of time.
The next step is to set up the optimization. Let’s do that now. There are 3 key ingredients.
Objective
Constraints
Directions
We want to build the constraint matrix, which has 2 parts.
Arc Capacity Constraints
Node Flow Constraints
The Arc Capacity Constraints are the first we
address.
The Amat, or A matrix, is the arc matrix which contains
the upper bounds. Since linear programming relies on the resource
matrix, we need one row for each arc, our dimensions for variable flow
selection are the number of arcs also. So this means we need an
identity matrix for the rows of arcs and columns of
arcs.
create_upper_arc_constraints <- function(edges){
Amat <- matrix(0, nrow = nrow(edges), ncol = nrow(edges))
Amat_dir <- rep("<=", nrow(Amat))
Amat_rhs <- c()
for(i in 1:ncol(Amat)){
Amat[i,i] <- 1
Amat_rhs <- c(Amat_rhs, edges$upper_bound[i])
}
list(Amat_upper = Amat,
Amat_upper_dir = Amat_dir,
Amat_upper_rhs = Amat_rhs)
}
# This could be higher than zero, but for standard LP this is the default configuration, so not needed.
# create_lower_arc_constraints <- function(edges){
# Amat <- matrix(0, nrow = nrow(edges), ncol = nrow(edges))
# Amat_dir <- rep(">=", nrow(Amat))
# Amat_rhs <- c()
#
# for(i in 1:ncol(Amat)){
# Amat[i,i] <- 1
# Amat_rhs <- c(Amat_rhs, edges$lower_bound[i])
# }
#
# list(Amat_lower = Amat,
# Amat_lower_dir = Amat_dir,
# Amat_lower_rhs = Amat_rhs)
# }
upper_results <- create_upper_arc_constraints(edges)
# lower_results <- create_lower_arc_constraints(edges)
#
# Amat <- rbind(upper_results$Amat_upper, lower_results$Amat_lower)
# Amat_dir <- c(upper_results$Amat_upper_dir, lower_results$Amat_lower_dir)
# Amat_rhs <- c(upper_results$Amat_upper_rhs, lower_results$Amat_lower_rhs)
Amat <- upper_results$Amat_upper
Amat_dir <- upper_results$Amat_upper_dir
Amat_rhs <- upper_results$Amat_upper_rhs
Amat
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 0 0 0 0 0 0 0 0
[2,] 0 1 0 0 0 0 0 0 0
[3,] 0 0 1 0 0 0 0 0 0
[4,] 0 0 0 1 0 0 0 0 0
[5,] 0 0 0 0 1 0 0 0 0
[6,] 0 0 0 0 0 1 0 0 0
[7,] 0 0 0 0 0 0 1 0 0
[8,] 0 0 0 0 0 0 0 1 0
[9,] 0 0 0 0 0 0 0 0 1
Amat_dir
[1] "<=" "<=" "<=" "<=" "<=" "<=" "<=" "<=" "<="
Amat_rhs
[1] 800 600 600 100 300 400 600 400 600
The Node Flow Constraints are the next to take care
of.
The Bmat, or B matrix, is a matrix which contains the
node balance equations codified by flows.
For each node, we need to match it’s from node and
to node with the appropriate inflow and outflow. If it
matches an inflow arc, this is increase, so 1
is in the arc column. If it matches an outflow arch, this
is decrease, so -1 is in the arc column. Otherwise
0 remains as the placeholder. The sign here is
== because they must match (i.e., supply = demand). If we
require excess at certain points we can set this demand to be higher
than zero, but we will not do that here.
create_node_constraints <- function(nodes, edges){
Bmat <- matrix(0, nrow = nrow(nodes), ncol = nrow(edges))
Bmat_dir <- rep("==", nrow(Bmat))
Bmat_rhs <- rep(0, nrow(Bmat))
for(i in 1:nrow(Bmat)){
node_id <- nodes[i, "id"]
for(j in 1:ncol(Bmat)){
edge_from <- edges[j,"from"]
edge_to <- edges[j, "to"]
if(node_id == edge_from){
# print(paste("Outbound for ", node_id, "From: ",edge_from, "to: ", edge_to))
Bmat[i,j] = 1
}
else if(node_id == edge_to){
# print(paste("Inbound for ", node_id, "From: ",edge_from, "to: ", edge_to))
Bmat[i,j] = -1
}
else{
Bmat[i,j] = 0
}
}
Bmat_rhs[i] <- demand[i]
}
list(Bmat = Bmat,
Bmat_dir = Bmat_dir,
Bmat_rhs = Bmat_rhs
)
}
results <- create_node_constraints(nodes, edges)
Bmat <- results$Bmat
Bmat_dir <- results$Bmat_dir
Bmat_rhs <- results$Bmat_rhs
Bmat
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 1 0 0 0 0 0 0 0
[2,] -1 0 1 1 0 0 0 0 0
[3,] 0 -1 0 0 1 1 0 0 0
[4,] 0 0 -1 0 -1 0 1 1 0
[5,] 0 0 0 -1 0 -1 -1 0 1
[6,] 0 0 0 0 0 0 0 -1 -1
Bmat_dir
[1] "==" "==" "==" "==" "==" "=="
Bmat_rhs
[1] 900 0 0 0 0 -900
Next, the objective is going to be the time. This is the cost we have to pay for assigning flow to an arc. Let’s take a look at everything all together.
f.obj <- edges %>% dplyr::pull(time)
f.cons <- rbind(Amat, Bmat)
f.rhs <- c(Amat_rhs, Bmat_rhs)
f.dir <- c(Amat_dir, Bmat_dir)
f.obj
[1] 10 50 30 70 10 60 30 60 30
f.cons
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 0 0 0 0 0 0 0 0
[2,] 0 1 0 0 0 0 0 0 0
[3,] 0 0 1 0 0 0 0 0 0
[4,] 0 0 0 1 0 0 0 0 0
[5,] 0 0 0 0 1 0 0 0 0
[6,] 0 0 0 0 0 1 0 0 0
[7,] 0 0 0 0 0 0 1 0 0
[8,] 0 0 0 0 0 0 0 1 0
[9,] 0 0 0 0 0 0 0 0 1
[10,] 1 1 0 0 0 0 0 0 0
[11,] -1 0 1 1 0 0 0 0 0
[12,] 0 -1 0 0 1 1 0 0 0
[13,] 0 0 -1 0 -1 0 1 1 0
[14,] 0 0 0 -1 0 -1 -1 0 1
[15,] 0 0 0 0 0 0 0 -1 -1
f.rhs
[1] 800 600 600 100 300 400 600 400 600 900 0 0 0
[14] 0 -900
f.dir
[1] "<=" "<=" "<=" "<=" "<=" "<=" "<=" "<=" "<=" "==" "==" "==" "=="
[14] "==" "=="
Now we unite everything together and drop it into the solver.
Remember, we are trying to minimize cost in our objective, the
constraints will maximize the flow. So specify min.
results <- lp(direction = "min",
objective.in = f.obj,
const.mat = f.cons,
const.dir = f.dir,
const.rhs = f.rhs)
edges$flow <- results$solution
edges
from to lower_bound upper_bound time flow color
1 1 2 0 800 10 700 grey
2 1 3 0 600 50 200 grey
3 2 4 0 600 30 600 grey
4 2 5 0 100 70 100 grey
5 3 4 0 300 10 200 grey
6 3 5 0 400 60 0 grey
7 4 5 0 600 30 500 grey
8 4 6 0 400 60 300 grey
9 5 6 0 600 30 600 grey
Now that we have our flow we can do some visualizing and analysis.
There are two key graphics to examine; the
flow vs. capacity and the flow vs. time.
First, the flow vs. capacity will give us insights into
stress on the network. This is because of their implicit advantage they
supply to the optimizer, maximum flows, so naturally, these get flooded
with traffic. Second, the flow vs. time will give us
insights into shortest distance paths (i.e., assuming time is
proportional to distance, which is not always the case). This is because
paths with shorter times will enable more to flow through in any given
time delta. Between these two visuals, a good assessment of the model
output is feasible.
# Set the flow
edges$flow <- results$solution
# If the arc is flowing oil, change to black
edges$color[which(edges$flow > 0)] <- "black"
# If the arc is at capacity change it to red
edges$color[which(edges$flow == edges$upper_bound)] <- "red"
g <- igraph::graph_from_data_frame(d = edges,
directed = TRUE,
vertices = nodes)
# Plot the flow and capacity
plot(g,
edge.label = paste("(",E(g)$flow,
",",
E(g)$upper_bound,
")",
sep=""),
main = "Traffic Flow | (flow, capacity)",
vertex.color = V(g)$color,
vertex.size = 25,
vertex.label.cex = 1.6,
edge.color = E(g)$color,
edge.width = E(g)$flow/200)
# Plot the time
plot(g,
edge.label = paste("(",E(g)$flow,
",",
E(g)$time,
")",
sep=""),
main = "Traffic Flow | (flow, time)",
vertex.color = V(g)$color,
vertex.size = 25,
vertex.label.cex = 1.6,
edge.color = E(g)$color,
edge.width = E(g)$flow/200)

nodes
id color
1 1 green
2 2 grey
3 3 grey
4 4 grey
5 5 grey
6 6 red
visNet_nodes <- nodes %>%
dplyr::mutate(label = paste("Location ", id),
font.size =10,
type = "black",
shape = "circle",
font = 12)
visNet_edges <- edges %>%
dplyr::mutate(label = paste("(",flow,",",upper_bound,")",sep=""),
length = 250,
width = flow/100,
arrows = "to")
visNetwork(visNet_nodes,
visNet_edges,
width = "100%",
main = "Least Time Maximum Vehicle Flow | (flow, capacity)")
Two of the most popular roadways are
2 -> 4 and 5 -> 6. These supply the
least amount of time and the most amount of capacity. Without a doubt,
the model has exploited these as much as possible. The only trick is,
just how the flow gets there. We see up front that 1 ships
off a 200 to 700 flow split from the 900
supply with the heavier allocation toward 2. Why?
2 is connected to a popular
roadway, meaning much more potential to flow (and
quickly).
In order to get to 4, going through 3
will cost much more time (60 oppose to 40).
3 has two outlets, but one is one of the worst
routes on the network due to it’s 60 minute trek, so it doesn’t even get
any flow allocated.
The shortest path shows one very interesting insight to this model;
sending a maximum flow through the network is not all about time. The
shortest path (least time) is the sequence
1 -> 2 -> 4 -> 6. However, from the above, we see
that this isn’t the most stressed path. Why? We aren’t only
interested in short times for the vehicles flowing through the network.
We are also interested in getting them all through it! We assumed there
was a 900 average vehicle flow, and having a macro-level view of this
system, sending them all down the shortest path would not solve it (that
is, we would not send as much as possible, only as cheaply as possible;
we could have pushed more). In order to get the most cars sent through
the network, in the shortest amount of time we also must take advantage
of the popular roadways that the model is straining (or
add incentive to the not so popular roadways with
greater capacity or shorter commute times).
shortest.paths <- igraph::shortest_paths(graph = g, from = 1, to = 6)
s_path <- shortest.paths$vpath[[1]]
s_path
+ 4/6 vertices, named, from ac6f2e8:
[1] 1 2 4 6
[1] 100
E(g)$color <- "black"
E(g, path = s_path)$color <- "blue"
plot(g,
edge.label = paste("(",E(g)$flow,
",",
E(g)$time,
")",
sep=""),
main = "Shortest Path | (flow, capacity)",
vertex.color = V(g)$color,
vertex.size = 25,
vertex.label.cex = 1.6,
edge.color = E(g)$color,
edge.width = E(g)$flow/200)

We can also build a function to reuse for next time.
#
# @lp.mincost.maxflow: data.frame, integer, integer, integer or list -> list
# function outputs an edge list with flows.
#
# @edges: data.frame, should have from, to, and capacity columns for each edge in the network. from and to columns should contain unique node ids as integers from 0 to N.
# @source.id: integer, unique node id
# @dest.id: integer, unique node id
# @flow.demand: integer or list, if integer create a demand signal to push that much from source.id to dest.id, if a list expectation is that it sums to 0 and will contain each nodes supply (if positive) and demand (if negative).
lp.mincost.maxflow <- function(edges, source.id, dest.id, flow.demand){
if(!("from" %in% names(edges)) ||
!("to" %in% names(edges)) ||
!("upper_bound" %in% names(edges)) ||
!("lower_bound" %in% names(edges))){
print("Need from, to, and capacity columns.")
}
else{
# Infer the nodes
nodes <- data.frame(id = sort(unique(c(unique(edges$from), unique(edges$to)))))
# Infer the demand to flow through the network
if(length(flow.demand)>1){
if(sum(flow.demand) == 0){
demand <- flow.demand
}else{
print("Flow demand doesn't add up to 0.")
}
}
else{
demand <- c(flow.demand, rep(0, nrow(nodes)-2), -flow.demand)
}
# Get arc capacity constraints
create_arc_capacity_constraints <- function(edges){
# For upper
Amat <- matrix(0, nrow = nrow(edges), ncol = nrow(edges))
Amat_dir <- rep("<=", nrow(Amat))
Amat_rhs <- c()
for(i in 1:ncol(Amat)){
Amat[i,i] <- 1
Amat_rhs <- c(Amat_rhs, edges$upper_bound[i])
}
# For lower
Bmat <- matrix(0, nrow = nrow(edges), ncol = nrow(edges))
Bmat_dir <- rep(">=", nrow(Bmat))
Bmat_rhs <- c()
for(i in 1:ncol(Bmat)){
Bmat[i,i] <- 1
Bmat_rhs <- c(Bmat_rhs, edges$lower_bound[i])
}
list(Amat = rbind(Amat, Bmat),
Amat_dir = c(Amat_dir, Bmat_dir),
Amat_rhs = c(Amat_rhs, Bmat_rhs))
}
results <- create_arc_capacity_constraints(edges)
Amat <- results$Amat
Amat_dir <- results$Amat_dir
Amat_rhs <- results$Amat_rhs
# Create node flow constraints (in = out)
create_node_constraints <- function(nodes, edges){
Bmat <- matrix(0, nrow = nrow(nodes), ncol = nrow(edges))
Bmat_dir <- rep("==", nrow(Bmat))
Bmat_rhs <- rep(0, nrow(Bmat))
for(i in 1:nrow(Bmat)){
node_id <- nodes[i, "id"]
for(j in 1:ncol(Bmat)){
edge_from <- edges[j,"from"]
edge_to <- edges[j, "to"]
if(node_id == edge_from){
# print(paste("Outbound for ", node_id, "From: ",edge_from, "to: ", edge_to))
Bmat[i,j] = 1
}
else if(node_id == edge_to){
# print(paste("Inbound for ", node_id, "From: ",edge_from, "to: ", edge_to))
Bmat[i,j] = -1
}
else{
Bmat[i,j] = 0
}
}
Bmat_rhs[i] <- demand[i]
}
list(Bmat = Bmat,
Bmat_dir = Bmat_dir,
Bmat_rhs = Bmat_rhs
)
}
results <- create_node_constraints(nodes, edges)
Bmat <- results$Bmat
Bmat_dir <- results$Bmat_dir
Bmat_rhs <- results$Bmat_rhs
# Bring together
f.obj <- edges$cost
f.cons <- rbind(Amat, Bmat)
f.rhs <- c(Amat_rhs, Bmat_rhs)
f.dir <- c(Amat_dir, Bmat_dir)
results <- lp(direction = "min",
objective.in = f.obj,
const.mat = f.cons,
const.dir = f.dir,
const.rhs = f.rhs)
edges$flow <- results$solution
}
list(edges = edges, results = results)
}
edges <- data.frame(from = c(1, 1, 2, 2, 5, 4, 4, 3, 3),
to = c(2, 3, 5, 4, 6, 5, 6, 5, 4),
lower_bound = 0,
upper_bound = c(800, 600, 100, 600, 600, 600, 400, 400, 300),
cost = c(10, 50, 70, 30, 30, 30, 60, 60, 10))
lp.mincost.maxflow(edges = edges, source.id = 1, dest.id = 6, flow.demand = 900)
$edges
from to lower_bound upper_bound cost flow
1 1 2 0 800 10 700
2 1 3 0 600 50 200
3 2 5 0 100 70 100
4 2 4 0 600 30 600
5 5 6 0 600 30 600
6 4 5 0 600 30 500
7 4 6 0 400 60 300
8 3 5 0 400 60 0
9 3 4 0 300 10 200
$results
Success: the objective function is 95000
You can also source the file to make things easier for next time. The code can be found here.
source("lp.mincost.maxflow.r")
edges <- data.frame(from = c(1, 1, 2, 2, 5, 4, 4, 3, 3),
to = c(2, 3, 5, 4, 6, 5, 6, 5, 4),
lower_bound = 0,
upper_bound = c(800, 600, 100, 600, 600, 600, 400, 400, 300),
cost = c(10, 50, 70, 30, 30, 30, 60, 60, 10))
lp.mincost.maxflow(edges = edges, source.id = 1, dest.id = 6, flow.demand = 900)
$edges
from to lower_bound upper_bound cost flow
1 1 2 0 800 10 700
2 1 3 0 600 50 200
3 2 5 0 100 70 100
4 2 4 0 600 30 600
5 5 6 0 600 30 600
6 4 5 0 600 30 500
7 4 6 0 400 60 300
8 3 5 0 400 60 0
9 3 4 0 300 10 200
$results
Success: the objective function is 95000