Every optimization problem has three components: variables (e.g., ships and ports), constraints on those variables (e.g., a ship can fit only so many containers onboard), and an objective function to be minimized or maximized (e.g., maximize the number of containers shipped). The variables and constraints are often represented as a matrix in which the columns are the variables and the rows are the constraints.
A common technique to decompose such large problems is column generation, in which only a subset of the variables are considered at first, and then new variables — that is, new columns — are generated to more closely approximate the original problem. To help manage this, we developed a software library that analyzes the problem and predicts which columns are best to generate. This library will be open sourced via MathOpt, our mathematical programming framework.
With this in hand, we defined two basic approaches to solve the problem:
Double Column Generation
We considered network design and container routing as two coupled problems, each consisting of a primary selection problem (choose the best option) and a subsidiary generation problem (identify reasonable options). We applied a shortest-path algorithm to each pair of problems to generate reasonable options, followed by a linear program (using our linear programming solver, Glop) to choose the best options for each. We applied the column generation technique to both at the same time, using intermediate results on each problem to influence progress on the other. This double column generation approach enabled us to find provably optimal solutions, but it only scaled well to moderate sized problems.
CP-SAT
We then tried an implementation based on constraint programming, using our CP-SAT constraint programming solver. This also worked well up to mid-sized networks, but did not scale to the worldwide shipping problem.
These two approaches enabled us to find provably optimal solutions, but they only scaled well to small and medium sized problems. To improve their scalability, we applied a heuristic strategy using two variants of local search in which we examine neighborhoods around existing solutions to find opportunities for improvements.
Large neighborhood search
We fixed parts of the solution (e.g., "this vessel will visit Los Angeles on alternate Tuesdays") before applying either of the methods described above. This improves scalability by reducing the search space.
Variable neighborhood search
We explored neighborhoods over both the network and the schedule. We parallelize the exploration and distribute it over multiple machines to evaluate many neighborhoods simultaneously. This also improves scalability by limiting the search space, while also allowing us to incorporate knowledge from Operations Research and the shipping industry.
With both of these approaches, we made use of incrementalism: locking down promising portions of a solution so that we could start from a known good solution to make it better.
Finally, we also took into consideration transit times. Previous attempts to solve this problem didn't take transit times into account, since they make the problem much more difficult to solve. We found that inclusion of transit times significantly improved the solution quality.
https://bxt.org/kitgb