Maximum Network Flows in R

network flows linear programming r

In this post we will walk through how to make a maximum flow decision using network flows and linear programming.

Blake Conrad https://github.com/conradbm
2022-07-09

Oil Flow Maximization

One classic problem in Network Flows and Optimization is called the Max-Flow Problem. This takes any two nodes in a network, s and t and attempts to send as much of a resource (or multiple) from s to t. This is called a flow, and the flow which maximizes the total bandwidth of the network is called the maximum flow.

First, the problem starts with an objective: to maximize flow. These are denoted as,

\[x_{ij} = flow \: on \: node_i \: to \: node_j \:(or \: on \: arc \: (i,j))\]

Second, the problem has a set of constraints, these are called the arc capacities. These are denoted as,

\[u_{ij} = maximum \:amount \:of \: feasible \: flow \: on \: node_i \: to \: node_j \:(or \: on \: arc \: (i,j))\] Last, the network graph is supplied as a set of connections under the traditional network structure:

\[ G = (N,E) \]

Problem Formation

For our problem, the feasible flow is going to be in units of millions of barrels of oil per hour that will pass through an arc of pipeline.

Network Structure

The source for our network is indicated in green and sink in red.

nodes <- data.frame(id = c(0:4), color = c("green", rep("grey", 3), "red"))
edges <- data.frame(from = c(0,0,1,1,3,2), to = c(1,2,2,3,4,4), capacity = c(2,3,3,4,1,2), color = "grey")

edges
  from to capacity color
1    0  1        2  grey
2    0  2        3  grey
3    1  2        3  grey
4    1  3        4  grey
5    3  4        1  grey
6    2  4        2  grey

Network Visual

g <- igraph::graph_from_data_frame(d = edges,
                                   directed = TRUE,
                                   vertices = nodes)
plot(g, edge.label = E(g)$capacity, 
     main = "Oil Pipeline Capacities", 
     vertex.color = V(g)$color,
     edge.color = E(g)$color)

Big M

In order to model this problem, we need to put a very large capacity from the source node to the sink node, these are s and t mentioned above. We have given them node names 0 and 4 respectively in the dataframe for nodes.

BIG_M <- 1000000

edges <- rbind(edges, data.frame(from = c(4),
                                 to = c(0), 
                                 capacity = c(BIG_M), 
                                 color = "purple"))

# Never hurts to add an id
edges$id <- 0:(nrow(edges)-1)

g <- igraph::graph_from_data_frame(d = edges,
                                   directed = TRUE,
                                   vertices = nodes)
plot(g,
     edge.label = E(g)$capacity, 
     main = "Oil Pipeline Capacities", 
     vertex.color = V(g)$color,
     edge.color = E(g)$color)

Linear Programming (LP)

LP Structure

The next step is to set up the optimization. Let’s do that now. There are 3 key ingredients.

Constraints

We want to build the constraint matrix, which has 2 parts.

The Arc Capacity Constraints are the first we address.

The Amat, or A matrix, is the arc matrix which contains the upper bounds. Since linear programming relies on the resource matrix, we need one row for each arc, our dimensions for variable flow selection are the number of arcs also. So this means we need an identity matrix for the rows of arcs and columns of arcs.

Amat <- matrix(0, nrow = nrow(edges), ncol = nrow(edges))
Amat_dir <- rep("<=", nrow(Amat))
Amat_rhs <- c()

for(i in 1:ncol(Amat)){
  Amat[i,i] <- 1
  Amat_rhs <- c(Amat_rhs, edges$capacity[i])
}

Amat
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]    1    0    0    0    0    0    0
[2,]    0    1    0    0    0    0    0
[3,]    0    0    1    0    0    0    0
[4,]    0    0    0    1    0    0    0
[5,]    0    0    0    0    1    0    0
[6,]    0    0    0    0    0    1    0
[7,]    0    0    0    0    0    0    1
Amat_rhs
[1] 2e+00 3e+00 3e+00 4e+00 1e+00 2e+00 1e+06
Amat_dir
[1] "<=" "<=" "<=" "<=" "<=" "<=" "<="

The Node Flow Constraints are the next to take care of.

For each node, we need to match it’s from node and to node with the appropriate inflow and outflow. If it matches an inflow arc, this is increase, so 1 is in the arc column. If it matches an outflow arch, this is decrease, so -1 is in the arc column. Otherwise 0 remains as the placeholder. The sign here is == because they must match (i.e., supply = demand). If we require excess at certain points we can set this demand to be higher than zero, but we will not do that here.

Bmat <- matrix(0, nrow = nrow(nodes), ncol = nrow(edges))
Bmat_dir <- rep("==", nrow(Bmat))
Bmat_rhs <- rep(0, nrow(Bmat))

for(i in 1:nrow(Bmat)){
  node_id <- nodes[i, "id"]
  for(j in 1:ncol(Bmat)){
    edge_from <- edges[j,"from"]
    edge_to <- edges[j, "to"]
    edge_id <- edges[j, "id"]
    
    if(node_id == edge_from){
      Bmat[i,j] = -1
    }
    else if(node_id == edge_to){
      Bmat[i,j] = 1
    }
    else{
      Bmat[i,j] = 0
    }
  }
}

Bmat
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]   -1   -1    0    0    0    0    1
[2,]    1    0   -1   -1    0    0    0
[3,]    0    1    1    0    0   -1    0
[4,]    0    0    0    1   -1    0    0
[5,]    0    0    0    0    1    1   -1
Bmat_dir
[1] "==" "==" "==" "==" "=="
Bmat_rhs
[1] 0 0 0 0 0

Objective

Next, the objective is going to be 0 for all values, except our final flow. This we want to maximize.

f.obj <- c(rep(0, nrow(edges)-1), 1)
f.obj
[1] 0 0 0 0 0 0 1

Solve the LP

Now we unite everything together and drop it into the solver. Remember, we are trying to maximize flow. So specify max.

f.cons <- rbind(Amat, Bmat)
f.rhs <- c(Amat_rhs, Bmat_rhs)
f.dir <- c(Amat_dir, Bmat_dir)

results <- lp(direction = "max",  
              objective.in = f.obj, 
              const.mat = f.cons, 
              const.dir = f.dir, 
              const.rhs = f.rhs)
results$solution
[1] 1 2 0 1 1 2 3

Visualize Maximum Flow Solution

Since the results$solution contain the maximum flow we can push through the pipes as a system, we can add this into our flow component of the edges dataframe.

The visual indicates that 2/3 of the capacity was shipped to node 2, and 1/2 to node 1. After this the next best transfer was from node 2 to node 4 maxed out at 2/2. This is indicated in red. The arcs that have flow are indicated in black, and no flow is indicated by grey. The maximum flow arc is just an artificial arc that indicates the maximum flow, this is indicated in purple.

Lastly from the node 1 to node 3 1/4 of the capacity was sent. Then from node 3 to node 4 was maxed out at 1/1.

Posterior analysis to this model output tells us that sending 2 million tons of oil from the source to destination 2 and 1 million tons of oil from the source to destination 1 will push as much flow to destination 4 as we possibly can.

# Set the flow
edges$flow <- results$solution

# If the arc is flowing oil, change to black
edges$color[which(edges$flow > 0)] <- "black"

# If the arc is at capacity change it to red
edges$color[which(edges$flow == edges$capacity)] <- "red"

# Last flow is purple
edges$color[nrow(edges)] <- "purple"


g <- igraph::graph_from_data_frame(d = edges,
                                   directed = TRUE,
                                   vertices = nodes)

# Make it look a little more appealing
L = cbind(1:5, 1:5)
CURVE = c(0,0.15, 0.3, 0.45, 0, -0.15, -0.3, 0, 0.15, 0) * 5

# Plot
plot(g,
     layout = L,
     edge.curved = CURVE,
     edge.label = paste("(",E(g)$flow, 
                        ",", 
                        E(g)$capacity, ")", 
                        sep=""), 
     main = "Oil Pipeline Flow | (flow, capacity)", 
     vertex.color = V(g)$color,
     vertex.size = 25,
     vertex.label.cex = 1.6,
     edge.color = E(g)$color,
     edge.width = E(g)$flow*2)

Cleaning Up The Visual

visNet_nodes <- nodes %>% 
                dplyr::mutate(label = paste("Location ", id),
                              font.size =10,
                              type = "black",
                              shape = "circle",
                              font = 12)

visNet_edges <- edges %>% 
                dplyr::mutate(label = paste("(",flow,",",capacity,")",sep=""),
                              length = 250,
                              width = flow*2,
                              arrows = "to") 
  
visNetwork(visNet_nodes, 
           visNet_edges,
           width = "100%",
           main = "Maximum Oil Flow | (flow, capacity)")

Reusable Functionality

If we would like, we can take the code and example above and make a reusable functional API so we don’t need to do this time and again. The code is as follows.

#
# @lp.maxflow: data.frame, integer, integer -> list
#   function outputs an edge list with max flows.
#
# @edges: data.frame, should have from, to, and capacity columns for each edge in the network. from and to columns should contain unique node ids as integers from 0 to N.
# @source.id: integer, unique node id
# @dest.id: integer, unique node id
#
lp.maxflow <- function(edges, source.id, dest.id){
  if(!("from" %in% names(edges)) || !("to" %in% names(edges)) || !("capacity" %in% names(edges))){
    print("Need from, to, and capacity columns.")
  }
  else{
    
    # Infer the nodes
    nodes <- data.frame(id = sort(unique(c(unique(edges$from), unique(edges$to)))))
    
    # Connect source.id to dest.id
    BIG_M = max(edges$capacity)*10
    edges <- rbind(edges, data.frame(from = c(dest.id),
                                 to = c(source.id), 
                                 capacity = c(BIG_M)))
    
    
    
    # Build up edge constraints 
    Amat <- matrix(0, nrow = nrow(edges), ncol = nrow(edges))
    Amat_dir <- rep("<=", nrow(Amat))
    Amat_rhs <- c()
    
    for(i in 1:ncol(Amat)){
      Amat[i,i] <- 1
      Amat_rhs <- c(Amat_rhs, edges$capacity[i])
    }
    
    # Build up node constraints
    Bmat <- matrix(0, nrow = nrow(nodes), ncol = nrow(edges))
    Bmat_dir <- rep("==", nrow(Bmat))
    Bmat_rhs <- rep(0, nrow(Bmat))
    
    for(i in 1:nrow(Bmat)){
      node_id <- nodes[i, "id"]
      for(j in 1:ncol(Bmat)){
        edge_from <- edges[j,"from"]
        edge_to <- edges[j, "to"]
        edge_id <- edges[j, "id"]
        
        if(node_id == edge_from){
          Bmat[i,j] = -1
        }
        else if(node_id == edge_to){
          Bmat[i,j] = 1
        }
        else{
          Bmat[i,j] = 0
        }
      }
    }

    # Join all model parameters
    f.obj <- c(rep(0, nrow(edges)-1), 1)
    f.cons <- rbind(Amat, Bmat)
    f.rhs <- c(Amat_rhs, Bmat_rhs)
    f.dir <- c(Amat_dir, Bmat_dir)
    
    results <- lp(direction = "max",  
                  objective.in = f.obj, 
                  const.mat = f.cons, 
                  const.dir = f.dir, 
                  const.rhs = f.rhs)
    edges$flow <- results$solution
  }
  
  list(edges=edges, flow = results$solution, maxflow = results$objval, results = results)
}

edges <- data.frame(from = c(0,0,1,1,3,2), to = c(1,2,2,3,4,4), capacity = c(2,3,3,4,1,2))
lp.maxflow(edges, source.id = 0, dest.id = 4)
$edges
  from to capacity flow
1    0  1        2    1
2    0  2        3    2
3    1  2        3    0
4    1  3        4    1
5    3  4        1    1
6    2  4        2    2
7    4  0       40    3

$flow
[1] 1 2 0 1 1 2 3

$maxflow
[1] 3

$results
Success: the objective function is 3 

Sourcing

We can also save that function and source it for later use. The code is available on github here.

source("lp.maxflow.r")
edges <- data.frame(from = c(0,0,1,1,3,2), to = c(1,2,2,3,4,4), capacity = c(2,3,3,4,1,2))
lp.maxflow(edges, source.id = 0, dest.id = 4)
$edges
  from to capacity flow
1    0  1        2    1
2    0  2        3    2
3    1  2        3    0
4    1  3        4    1
5    3  4        1    1
6    2  4        2    2
7    4  0       40    3

$flow
[1] 1 2 0 1 1 2 3

$maxflow
[1] 3

$results
Success: the objective function is 3 

References

Example taken from the following sources:

Winston., Wayne. Operations Research, Applications and Algorithms 4th Edition.