At Veeqo we have a number of core problems that our customers come to us to solve, and one of the most interesting to me has always been the digital picking feature, specifically routing the picker around the warehouse. On the face of it it's a simple problem: present the picker with the list of items to pick in such an order that they spend the minimum time walking between each bin. But when we start breaking this problem down we quickly run into a few complexities. Firstly, how do we know where the bins are in the warehouse? Veeqo allows businesses to set product bin locations as a string, which is a good start, but how does each bin location relate to the others? We know they must go on shelves, but what's the layout of those shelves around the warehouse? And critically, what paths can our pickers take around them?

These were the questions I (naively) thought needed answering as a priority, so I jumped into solving them. Before I knew it I had a gamified way of placing shelving, walls and packing desks, and an implementation of A* (another very interesting algorithm that I'd really recommend looking into, there's a fantastic video on it here: https://www.youtube.com/watch?v=ySN5Wnu88nE) which allowed me to find the shortest path between each pair of points. Sorted, right? Well, unfortunately the biggest problem was just rearing its head: how do we work out which of those individual paths to take to create the shortest round trip of all our bins? And just like that I'd landed on the travelling salesman problem, a problem that has puzzled scientists for nearly a century due to its algorithmic complexity. It's an NP-hard problem, which means it's extremely unlikely a polynomial-time solution exists (certainly not one we can use), or, put more simply, by the time we get past about 11 bins to path between we'd be waiting a really long time.
For a few examples: 11 bins have 1.814 × 10⁶ possible combinations, 15 bins 4.359 × 10¹⁰ and 25 bins 3.102 × 10²³, by which point any chance we had of speeding a picker round the warehouse has instead been spent staring at a spinning wheel on a scanner. So what do we do about this? We approach the problem using heuristics.
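To put numbers on that growth: starting from a fixed bin, a round trip of n bins can be walked in (n − 1)!/2 distinct orders (we divide by 2 because a loop walked backwards is the same loop). A quick sketch of that arithmetic in Ruby (`route_count` is just an illustrative name, not anything in Veeqo):

```ruby
# Distinct round trips through n bins from a fixed starting bin:
# (n - 1)! / 2, since walking a loop in reverse gives the same route.
def route_count(n)
  (1..n - 1).reduce(1, :*) / 2
end

route_count(11) # => 1_814_400, i.e. ~1.814 x 10^6
```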
A heuristic is about accepting that finding the ideal solution just takes way too long, so instead we settle for a non-perfect best guess in a much shorter time. For each of these algorithms we must now consider two things: firstly how long it takes to execute, and secondly how close to perfect it gets. As with all things in life this comes down to a compromise; as we increase the accuracy the execution time also increases, so we need to pick an appropriate heuristic for our use case. Veeqo already does this: by assuming that shelving units are organised such that the bin number increases as you go down a row and the letter advances as you go across shelving units, we can use a simple alphanumeric sort. A12 comes before B5 which comes before D42, you get the idea. With no knowledge of the layout of the warehouse this is a really good heuristic, and we'll use it as our baseline to beat going forwards. For this article I've created a sample warehouse with 9 bins; this is just about within the scope of brute force but allows us to understand how each algorithm works visually. Our baseline is a cost of 126.51.
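For the curious, here's a minimal sketch of that kind of alphanumeric sort, assuming bin codes are always letters followed by digits (the `bin_sort` name is mine). Note that a plain string sort isn't quite enough, as it would put B12 before B5:

```ruby
# Sort bins by their letter part, then by their number part as an integer,
# so B5 correctly comes before B12 (a plain string sort would reverse them).
def bin_sort(bins)
  bins.sort_by { |bin| [bin[/\A[A-Za-z]+/], bin[/\d+/].to_i] }
end

bin_sort(["D42", "B5", "A12", "B12"]) # => ["A12", "B5", "B12", "D42"]
```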
Our first heuristic is the fastest and simplest to implement. Nearest neighbour just means that from each bin we check all the bins we haven't already travelled to and go to the closest one. We repeat this until we run out of bins, then head back to the start. Nice and simple!
Here we can see this greedy algorithm at work. We're still doing some big loops and doubling back on ourselves, but our cost is already down to just 94.94, which goes to show that maybe all that work at the start setting up a warehouse layout and implementing A* was worth it! Here's a pseudo-code-ish implementation of nearest neighbour:
```ruby
list_of_nodes = [a, b, c, d, e, f]

def distance_between_nodes(a, b)
  a.dif(b) # however you measure distance (the A* path cost in our case)
end

def nearest_neighbour(list_of_nodes)
  new_route = []
  current_node = list_of_nodes.first
  new_route << current_node
  list_of_nodes.delete(current_node)

  while list_of_nodes.count > 0
    best_node = list_of_nodes.first
    list_of_nodes.each do |new_node|
      if distance_between_nodes(current_node, new_node) < distance_between_nodes(current_node, best_node)
        best_node = new_node
      end
    end
    current_node = best_node
    new_route << current_node
    list_of_nodes.delete(current_node)
  end

  new_route
end
```
Nearest Neighbour with Two Opt
But we still have a problem: we keep crossing over ourselves, and that's just begging for optimisation. So now we've generated a path with nearest neighbour we want to apply some form of improvement to that route, and two opt is a pretty standard way of approaching this. The principle is simple: take two 'edges' (the bit between two bins), delete them, and reconnect them in the only other way possible. We then check if this actually improves the overall distance; if it does we keep the swap, and if it doesn't we discard it and move on to the next potential swap. We keep looping through the whole path, exhaustively trying to swap every pair of edges, until either we complete a loop with no swaps that improve the route or we reach a predetermined limit of swaps.
Above is a visual example of this in my best Paint skills!
In this case it nets us a small improvement to a cost of 92.07 but still a very worthwhile step, especially when we start increasing the number of bins in our examples.
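Here's how a two-opt pass might look, treating each bin as an [x, y] point and using straight-line distance as a stand-in for the A* path cost (the function names and the swap limit are my own choices):

```ruby
# Straight-line distance between two [x, y] points; in the real warehouse
# this would be the A* path cost between the two bins.
def dist(a, b)
  Math.sqrt((a[0] - b[0])**2 + (a[1] - b[1])**2)
end

# Total length of the round trip, including the hop back to the start.
def tour_length(route)
  route.each_cons(2).sum { |a, b| dist(a, b) } + dist(route.last, route.first)
end

# Two opt: reverse the segment between two edges whenever doing so shortens
# the tour; repeat until no improving swap remains or we hit the swap limit.
def two_opt(route, max_swaps: 1_000)
  swaps = 0
  improved = true
  while improved && swaps < max_swaps
    improved = false
    (0...route.length - 1).each do |i|
      ((i + 1)...route.length).each do |j|
        candidate = route[0...i] + route[i..j].reverse + route[(j + 1)..]
        if tour_length(candidate) < tour_length(route)
          route = candidate
          swaps += 1
          improved = true
        end
      end
    end
  end
  route
end
```

On a unit square visited in a crossing order, e.g. [[0, 0], [1, 1], [1, 0], [0, 1]], a single reversal uncrosses the tour and brings the length from about 4.83 down to the perimeter length of 4.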
Recursive Randomised Nearest Neighbour
This algorithm is a small adaptation of our previous nearest neighbour algorithm: rather than always selecting the nearest bin, we look at the 3 nearest bins and randomly pick one of them. The reason for this is to avoid getting stuck in locally best routes, and instead open up the routing to pick some bins that aren't immediately better for our path but create a globally better solution. The problem with any amount of randomness is that although we might get a better solution occasionally, most of the time it'll be pretty terrible. So we repeatedly try this slightly randomised solution and keep the best version. The exact number of retries I've set at 50, but this is another case of the trade-offs we have to make: a smaller number results in faster execution whilst a larger number means we're more likely to happen across that much better solution.
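A sketch of that in Ruby, again with bins as [x, y] points and Euclidean distance standing in for the A* cost (the names and the `k`/`tries` parameters are illustrative):

```ruby
def dist(a, b)
  Math.sqrt((a[0] - b[0])**2 + (a[1] - b[1])**2)
end

def tour_length(route)
  route.each_cons(2).sum { |a, b| dist(a, b) } + dist(route.last, route.first)
end

# Nearest neighbour, but pick at random among the k closest unvisited bins.
def randomised_nearest_neighbour(nodes, k: 3)
  remaining = nodes.dup
  route = [remaining.shift]
  until remaining.empty?
    pick = remaining.sort_by { |n| dist(route.last, n) }.first(k).sample
    route << remaining.delete(pick)
  end
  route
end

# Repeat the randomised build and keep the shortest tour we stumble across.
def best_randomised_route(nodes, tries: 50)
  (1..tries).map { randomised_nearest_neighbour(nodes) }
            .min_by { |route| tour_length(route) }
end
```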
But even with these repeats our best randomised solution kind of sucks! This is because of something I mentioned earlier: by introducing randomness we're no longer solving the problem locally nearly as efficiently. So we need to optimise our solution locally while preserving the big-picture wins we gained with the randomness, and that's where Two Opt comes back in.
Recursive Randomised Nearest Neighbour with Two Opt
This one's fairly self-explanatory, except that it's critical to apply Two Opt to each solution Randomised Nearest Neighbour gives us before comparing them to each other, rather than just applying it to the best raw solution. With all of this put together we get a very impressive route cost of 88.49!
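Putting the whole pipeline together might look like this, as one self-contained sketch (same assumptions as before: [x, y] bins, Euclidean distance in place of A* costs, and parameter values of my own choosing). The key detail is that `two_opt` runs inside the loop, so every randomised candidate gets polished before we compare:

```ruby
def dist(a, b)
  Math.sqrt((a[0] - b[0])**2 + (a[1] - b[1])**2)
end

def tour_length(route)
  route.each_cons(2).sum { |a, b| dist(a, b) } + dist(route.last, route.first)
end

# Nearest neighbour with a random pick among the k closest unvisited bins.
def randomised_nearest_neighbour(nodes, k: 3)
  remaining = nodes.dup
  route = [remaining.shift]
  until remaining.empty?
    pick = remaining.sort_by { |n| dist(route.last, n) }.first(k).sample
    route << remaining.delete(pick)
  end
  route
end

# Two opt: reverse segments until no reversal shortens the tour.
def two_opt(route)
  improved = true
  while improved
    improved = false
    (0...route.length - 1).each do |i|
      ((i + 1)...route.length).each do |j|
        candidate = route[0...i] + route[i..j].reverse + route[(j + 1)..]
        if tour_length(candidate) < tour_length(route)
          route = candidate
          improved = true
        end
      end
    end
  end
  route
end

# Two-opt each randomised candidate, then keep the best *polished* tour.
def best_route(nodes, tries: 50)
  (1..tries).map { two_opt(randomised_nearest_neighbour(nodes)) }
            .min_by { |route| tour_length(route) }
end
```

For a unit square given in a crossing order, the polished tour comes out at the perimeter length of 4 whichever random candidates were generated, since two opt uncrosses every four-point tour.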
So there we have it: by first mapping the warehouse we can apply a really good heuristic solution to the travelling salesman problem, and reduce the route cost from the 126.51 our alphanumeric sort of warehouse locations gave us down to just 88.49. That's a pretty impressive saving of 30% and proves that all this work might have been worth it after all! The simple fact is a complete solution to the Travelling Salesman Problem is just too costly, and although some extremely clever mathematicians have managed to reduce that cost a little (https://www.quantamagazine.org/computer-scientists-break-traveling-salesperson-record-20201008/) there is still a need for a 'good enough' solution. Other heuristics exist and I'd really recommend taking a look at them; a few examples are genetic algorithms, simulated annealing and branch & bound, and there are plenty more out there. I hope this blog has sparked an interest in the TSP and algorithms in general, as it's certainly become a bit of an obsession of mine!