Data Cleaning¶

In [1]:
import random
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import warnings
warnings.filterwarnings("ignore")
In [2]:
df = pd.read_csv('data/package_locations.csv')
df['Zip Code'] = df['Zip Code'].str[:5]
df.head()
Out[2]:
Order ID Zip Code Longitude Latitude Package ID Width Length Height Weight
0 jQaRKEzCo27dRNnNu 60622 -87.684558 41.896016 mb4cGEBYvSJrsFFQg 16.0 23.0 2.0 4.0
1 KxRjR6QbYrBDHio6i 60618 -87.714895 41.939587 9KWDDLMxtLwvfsHEB 12.0 12.0 12.0 15.0
2 nZ8Ffm4i7QwqgGbga 60076 -87.744326 42.018340 8KCnfP25BcCLiGx4t 12.0 16.0 16.0 15.0
3 NQipARuxiMXKa3WLy 60614 -87.636563 41.913992 b5JpAAwZoGaFhNbbZ 17.0 17.0 10.0 19.0
4 7ey2LZajFsGun3TGp 60126 -87.924939 41.901933 pf6LNNg2gsjgmgMP9 12.0 14.0 8.0 10.0
  • Restrict 'Zip Codes' to five digits.

In [3]:
def summary(df):
    obs = df.shape[0]
    types = df.dtypes
    counts = df.apply(lambda x: x.count())
    uniques = df.apply(lambda x: x.unique().shape[0])
    nulls = df.apply(lambda x: x.isnull().sum())
    min = df.min()
    mean = df.mean()
    max = df.max()
    print('Data shape:', df.shape)

    cols = ['types', 'counts', 'uniques', 'nulls', 'min', 'mean', 'max']
    summary = pd.concat([types, counts, uniques, nulls,
                        min, mean,max], axis=1, sort=True)

    summary.columns = cols
    dtypes = summary.types.value_counts()
    print('___________________________\nData types:')
    print(summary.types.value_counts())
    print('___________________________')
    return summary

summary(df)
Data shape: (15709, 9)
___________________________
Data types:
float64    6
object     3
Name: types, dtype: int64
___________________________
Out[3]:
types counts uniques nulls min mean max
Height float64 15152 11 557 0.0 10.337579 16.0
Latitude float64 15708 13267 1 39.580295 41.910249 42.460326
Length float64 15152 11 557 0.0 15.072004 23.0
Longitude float64 15708 12280 1 -105.163131 -87.859381 88.254208
Order ID object 15709 15659 0 227phsiq9inRmkH9Y NaN zzwkARDjdbHt7gu4K
Package ID object 15709 15709 0 229knFkHDqFWcxwCD NaN zzgcojDfciNMsamua
Weight float64 15036 10 673 0.0 12.911745 25.0
Width float64 15152 10 557 0.0 12.867410 17.0
Zip Code object 15709 282 0 60004 inf 80401
  • Summarize data types and ranges, unique and null values.

In [4]:
plt.figure(figsize=(12,12))

sns.scatterplot(x=df.Longitude, y=df.Latitude, c=['orange'], s=20)

for i, txt in enumerate(df['Zip Code']):
    plt.annotate('idx:'+str(i)+'\nzip:'+txt, (df.Longitude[i], df.Latitude[i]))
  • Inspect outliers.

In [5]:
df.iloc[13685]
Out[5]:
Order ID      RAQ3nvaDY28ARYvQJ
Zip Code                  60586
Longitude             88.254208
Latitude              41.566141
Package ID    ckbnfcdsG7NrH4qqm
Width                      12.0
Length                     16.0
Height                     13.0
Weight                     15.0
Name: 13685, dtype: object
  • The first outlier is missing a negative Longtitude sign and can be corrected.

In [6]:
df.iloc[861]
Out[6]:
Order ID      83g9T7p8ZXGbMRmHY
Zip Code                  60045
Longitude            -89.398528
Latitude              40.633125
Package ID    9pNxBawNfhJPFWnte
Width                      14.0
Length                     16.0
Height                     14.0
Weight                     18.0
Name: 861, dtype: object
  • The second outlier proves more difficult when attempting to identify the error.

  • No simple adjustment matches a reasonable zip code.

  • How many other zip codes do not match their coordinates?

In [7]:
df[df['Longitude'].isna()]
Out[7]:
Order ID Zip Code Longitude Latitude Package ID Width Length Height Weight
4999 GaQPn7fmehJDPte97 60061 NaN NaN NQCzw8kpF9oY8izbC 10.0 16.0 8.0 NaN
  • Missing coordinates will need to be removed.

In [8]:
from geopy import Nominatim

geolocator = Nominatim(user_agent="veho")
bad_add = []

for i in range(len(df)):
    try:
        location = geolocator.reverse(str(df['Latitude'][i])+','+str(df['Longitude'][i]))
        a = location.address.split(',')[-2]
        if str(a).strip() != str(df['Zip Code'][i]).strip():
            bad_add.append(i)
    except:
        pass
In [9]:
df = df.drop(bad_add)
df.reset_index(inplace=True)
df = df.drop(columns='index')
df.to_csv('data/good_addresses.csv')
  • Drop instances where zip code and coordinates don't match. (Ironically, validating addresses proved more time consuming than any advanced algorithm)

In [10]:
df = pd.read_csv('data/good_addresses.csv')
df['Zip Code'] = df['Zip Code'].astype('str')
df = df[df['Longitude'].notna()]
df = df.fillna(0)
df.loc[df['Width']==10, 'Weight'] = int(np.mean(df['Weight']))
df.loc[df['Longitude']>0, 'Longitude'] = df['Longitude'] * -1
df['Demand'] = df['Width'] * df['Length'] * df['Height']
df.reset_index(inplace = True)
df = df.drop(columns=['index','Unnamed: 0'])
  • Drop instance of null coordinates.

  • Fill remaining null values of dimension and weight with zeros.

  • Fill package weight for packages with dimension greater than zero witht the mean weight of all packages.

  • Correct outlying coordinate.

  • Create Demand column based on product of three package dimensions.

Exploratory Data Analysis¶

In [11]:
df = df.sort_values('Zip Code')
In [12]:
g = sns.pairplot(df, hue='Zip Code')
sns.move_legend(g, bbox_to_anchor=(1.1, 0.5), loc=5, ncol=2, title='Zip Code', frameon=False)

Pairplot analysis identifies two clusters of addresses based on zip codes. Beyond that, we can identify trimodal distributions for width and length, as well as bimodal distributions for height and weight.

In [13]:
plt.figure(figsize=(12,12))
sns.heatmap(df.corr(), annot=True)
Out[13]:
<AxesSubplot: >

The strongest correlation exists between Height and Weight. Although surprising, it could be an indication that Height is nearly always the name given to the greatest dimension.

In [14]:
summary(df)
Data shape: (13917, 10)
___________________________
Data types:
float64    7
object     3
Name: types, dtype: int64
___________________________
Out[14]:
types counts uniques nulls min mean max
Demand float64 13917 18 0 0.0 1832.116404 4284.0
Height float64 13917 10 0 0.0 10.072358 16.0
Latitude float64 13917 11797 0 39.580295 41.910101 42.460326
Length float64 13917 10 0 0.0 14.506647 23.0
Longitude float64 13917 10915 0 -105.157648 -87.882223 -87.527975
Order ID object 13917 13878 0 22Azydv6sFAYb9MzG NaN zzwkARDjdbHt7gu4K
Package ID object 13917 13917 0 229knFkHDqFWcxwCD NaN zzgcojDfciNMsamua
Weight float64 13917 9 0 0.0 12.553352 25.0
Width float64 13917 9 0 0.0 12.424014 17.0
Zip Code object 13917 280 0 60004 inf 80219
  • Sanity Check; making sure no null values remain.

Preprocessing¶

In [15]:
chicago = df[df['Zip Code'].str.startswith('6')]
chicago.reset_index(inplace=True)
chicago = chicago.drop(columns=['index'])
denver = df[df['Zip Code'].str.startswith('8')]
denver.reset_index(inplace=True)
denver = denver.drop(columns=['index'])
denver
Out[15]:
Order ID Zip Code Longitude Latitude Package ID Width Length Height Weight Demand
0 Q4Nw6RsBwf2svs5Ab 80005 -105.151411 39.851961 B6wzYfm5FYto9C2sB 12.0 15.0 16.0 25.0 2880.0
1 s9JzL3WLNhxwhNeDL 80014 -104.817076 39.659900 KE9sv24j7qYC5jmHA 11.0 16.0 14.0 18.0 2464.0
2 QJ2AD4z2WCi2e7zPn 80033 -105.101366 39.788385 4iaNQcqcbt5A3Wk6y 12.0 16.0 16.0 15.0 3072.0
3 rpfr5dqSb2ez4vwsj 80122 -104.928185 39.592444 94udkrjMaNKS9ZEJJ 12.0 16.0 10.0 15.0 1920.0
4 RKNEfSFnejQocNKvy 80127 -105.157648 39.580295 sFDGbP3rpny26LDGw 11.0 16.0 14.0 18.0 2464.0
5 rHjFeyvnYkoLSfcaJ 80209 -104.980095 39.716386 CoGZeryPrFpLKxuPP 11.0 16.0 14.0 18.0 2464.0
6 jiohtfbTbbbLQG3Ev 80210 -104.976056 39.686399 vMXZu6L84YkhwrsHq 11.0 16.0 14.0 18.0 2464.0
7 YLkzeu9bqGhpahN7y 80211 -105.015631 39.770947 XpdMYs6Rc37sj7jDx 12.0 16.0 16.0 15.0 3072.0
8 zBKpwE7fHZidvdNQn 80219 -105.045399 39.722450 KTWLxBDZxZiTHwRXX 11.0 16.0 14.0 18.0 2464.0
  • Separate the two zip code clusters into two datasets

In [16]:
plt.figure(figsize=(12,12))

sns.scatterplot(x=denver.Longitude, y=denver.Latitude, hue=denver.Weight, s=50)
Out[16]:
<AxesSubplot: xlabel='Longitude', ylabel='Latitude'>
  • The Denver dataset represents a great primer for algorithm runtimes, because it is only 9 nodes.

In [17]:
plt.figure(figsize=(12,12))

sns.scatterplot(x=chicago.Longitude, y=chicago.Latitude, hue=chicago.Weight, s=50)
Out[17]:
<AxesSubplot: xlabel='Longitude', ylabel='Latitude'>
In [18]:
# number of denver packages
den_n = denver.shape[0]
In [19]:
# number of chicago packages
chi_n = chicago.shape[0]
In [20]:
# minimum number of vehicles
den_k = int(round(sum(denver['Demand'])/57024, -1))
chi_k = int(round(sum(chicago['Demand'])/57024, -1))
In [21]:
# average number of stops per vehicle
l = round(57024/np.mean(df['Demand']), -1)
  • Establish variables for number of nodes, minimum number of routes/vehicles, mean nodes per vehicle

In [22]:
with open(f'den{str(den_n)}.tsp', 'w') as f:
    f.write(f'''NAME : den{str(den_n)}
COMMENT : {str(den_n)} locations in Denver
COMMENT : Veho Data Prompt
TYPE : TSP
DIMENSION : {str(den_n)}
EDGE_WEIGHT_TYPE : EUC_2D
NODE_COORD_SECTION
''')
    for i in denver.index:
        f.write(f'''{i+1} {denver.Latitude[i]:0.3f} {denver.Longitude[i]:0.3f}
''')
    f.write('''EOF''')
In [23]:
with open(f'chi{str(chi_n)}.tsp', 'w') as f:
    f.write(f'''NAME : chi{str(chi_n)}
COMMENT : {str(chi_n)} locations in Chicago
COMMENT : Veho Data Prompt
TYPE : TSP
DIMENSION : {str(chi_n)}
EDGE_WEIGHT_TYPE : EUC_2D
NODE_COORD_SECTION
''')
    for i in chicago.index:
        f.write(f'''{i+1} {chicago.Latitude[i]:0.3f} {chicago.Longitude[i]:0.3f}
''')
    f.write('''EOF''')
  • Create TSP files for solver methods.

In [24]:
 # clustering dataset
from sklearn.cluster import KMeans
from sklearn import metrics

x1 = np.array(df['Latitude'])
x2 = np.array(df['Longitude'])


# create new plot and data
X = np.array(list(zip(x1, x2))).reshape(len(x1), 2)
colors = ['b', 'g']
markers = ['o', 'v']

# KMeans algorithm 
K = 2
kmeans_model = KMeans(n_clusters=K).fit(X)

print(kmeans_model.cluster_centers_)
centers = np.array(kmeans_model.cluster_centers_)


plt.figure(figsize=(12,12))
for i, l in enumerate(kmeans_model.labels_):
    plt.plot(x1[i], x2[i], color=colors[l], marker=markers[l],ls='None', zorder=-i)
    
plt.title('k means centroids')
plt.scatter(centers[:,0], centers[:,1], marker="x", color='r', zorder=0)
plt.show()
[[  41.91152571  -87.87113354]
 [  39.70768521 -105.01920752]]
  • Use Kmeans clustering to establish best coordinates for depot locations.

In [25]:
denver.loc[-1] = ['Depot', 'depot', centers[1,1], centers[1,0], 'none', 0.0, 0.0, 0.0, 0.0, 0.0]  # adding depot
denver.index = denver.index + 1  # shifting index
denver = denver.sort_index()  # sorting by index
In [26]:
chicago.loc[-1] = ['Depot', 'depot', centers[0,1], centers[0,0], 'none', 0.0, 0.0, 0.0, 0.0, 0.0]  # adding depot
chicago.index = chicago.index + 1  # shifting index
chicago = chicago.sort_index()  # sorting by index
In [27]:
with open(f'DEN-n{str(den_n+1)}-k{den_k+1}.vrp', 'w') as f:
    f.write(f'''NAME : DEN-n{str(den_n+1)}-k{den_k+1}
COMMENT : (Denver, Min no of trucks: {den_k+1}, Optimal value: {den_k+1})
TYPE : CVRP
BEST_KNOWN: {den_k+1}
DIMENSION : {str(den_n+1)}
EDGE_WEIGHT_TYPE : GEO
CAPACITY : 57024
NODE_COORD_SECTION
''')
    for i in denver.index:
        f.write(f'''{i} {denver.Latitude[i]:0.3f} {denver.Longitude[i]:0.3f}
''')
    f.write('''DEMAND_SECTION
''')
    for i in denver.index:
        f.write(f'''{i} {denver.Demand[i]:0.3f}
''')
    f.write('''DEPOT_SECTION
''')
    f.write('''0
0
''')
    f.write('''EOF''')
In [28]:
with open(f'CHI-n{str(chi_n+1)}-k{chi_k}.vrp', 'w') as f:
    f.write(f'''NAME : CHI-n{str(chi_n+1)}-k{chi_k}
COMMENT : (Chicago, Min no of trucks: {chi_k}, Optimal value: {chi_k})
TYPE : CVRP
BEST_KNOWN: {chi_k}
DIMENSION : {str(chi_n+1)}
EDGE_WEIGHT_TYPE : GEO
CAPACITY : 57024
NODE_COORD_SECTION
''')
    for i in chicago.index:
        f.write(f'''{i} {chicago.Latitude[i]:0.3f} {chicago.Longitude[i]:0.3f}
''')
    f.write('''DEMAND_SECTION
''')
    for i in chicago.index:
        f.write(f'''{i} {chicago.Demand[i]:0.3f}
''')
    f.write('''DEPOT_SECTION
''')
    f.write('''0
0
''')
    f.write('''EOF''')
  • Create VRP files for solver methods.

Solutions¶

TSP: 2-opt¶

The Traveling Salesman Problem is a well known challenge in Computer Science: it consists on finding the shortest route possible that traverses all cities in a given map only once. Although its simple explanation, this problem is, indeed, NP-Complete. This implies that the difficulty to solve it increases rapidly with the number of cities, and we do not know in fact a general solution that solves the problem. For that reason, we currently consider that any method able to find a sub-optimal solution is generally good enough (we cannot verify if the solution returned is the optimal one most of the times).

2-opt

Among simple local search algorithms, the most famous are 2-Opt and 3-Opt. The 2-Opt algorithm was first proposed by Croes [1958], although the basic move had already been suggested by Flood [1956]. This move deletes two edges, thus breaking the tour into two paths, and then reconnects those paths in the other possible way. The main idea behind it is to take a route that crosses over itself and reorder it so that it does not.

In [29]:
# Calculate the euclidian distance in n-space of the route r traversing nodes c, ending at the path start.
path_distance = lambda r,c: np.sum([np.linalg.norm(c[r[p]]-c[r[p-1]]) for p in range(len(r))])
# Reverse the order of all elements from element i to element k in array r.
two_opt_swap = lambda r,i,k: np.concatenate((r[0:i],r[k:-len(r)+i-1:-1],r[k+1:len(r)]))

def two_opt(nodes,improvement_threshold): # 2-opt Algorithm adapted from https://en.wikipedia.org/wiki/2-opt
    route = np.arange(nodes.shape[0]) 
    improvement_factor = 1 
    best_distance = path_distance(route,nodes) 
    while improvement_factor > improvement_threshold: 
        distance_to_beat = best_distance 
        for swap_first in range(1,len(route)-2):
            for swap_last in range(swap_first+1,len(route)): 
                new_route = two_opt_swap(route,swap_first,swap_last) 
                new_distance = path_distance(new_route,nodes) 
                if new_distance < best_distance: 
                    route = new_route 
                    best_distance = new_distance 
        improvement_factor = 1 - best_distance/distance_to_beat 
    return route 
In [30]:
nodes = np.array([[denver.Latitude[i], denver.Longitude[i]] for i in denver.index])
route = two_opt(nodes,0.001)
In [31]:
plt.figure(figsize=(12,12))
# Reorder the nodes matrix by route order in a new matrix for plotting.
new_nodes_order = np.concatenate((np.array([nodes[route[i]] for i in range(len(route))]),np.array([nodes[0]])))
# Plot the nodes.
plt.scatter(nodes[:,1],nodes[:,0])
# Plot the path.
plt.plot(new_nodes_order[:,1],new_nodes_order[:,0])
plt.show()
# Print the route as row numbers and the total distance travelled by the path.
print("Route: " + str(route) + "\n\nDistance: " + str(path_distance(route,nodes)))
Route: [0 6 7 2 4 5 1 3 8 9]

Distance: 1.1183761343795042

*TSP solvers and .tsp files are traditionally evaluated using 2-dimensional Euclidean distance, which is why the distance above is in degrees and minutes.

**Examine the relationship between NP-Complete problems and Knot Theory.

TSP: Self Organizing Map¶

Self Organizing Map

The original paper released by Teuvo Kohonen in 1998 consists on a brief, masterful description of the technique. In there, it is explained that a self-organizing map is described as an (usually two-dimensional) grid of nodes, inspired in a neural network. Closely related to the map, is the idea of the model, that is, the real world observation the map is trying to represent. The purpose of the technique is to represent the model with a lower number of dimensions, while maintaining the relations of similarity of the nodes contained in it.

To capture this similarity, the nodes in the map are spatially organized to be closer the more similar they are with each other. For that reason, SOM are a great way for pattern visualization and organization of data. To obtain this structure, the map is applied a regression operation to modify the nodes position in order update the nodes, one element from the model () at a time. The expression used for the regression is:

This implies that the position of the node $n$ is updated adding the distance from it to the given element, multiplied by the neighborhood factor of the winner neuron, . The winner of an element is the more similar node in the map to it, usually measured by the closer node using the Euclidean distance (although it is possible to use a different similarity measure if appropriate).

On the other side, the neighborhood is defined as a convolution-like kernel for the map around the winner. Doing this, we are able to update the winner and the neurons nearby closer to the element, obtaining a soft and proportional result. The function is usually defined as a Gaussian distribution, but other implementations are as well. One worth mentioning is a bubble neighborhood, that updates the neurons that are within a radius of the winner (based on a discrete Kronecker delta function), which is the simplest neighborhood function possible.

To use the network to solve the TSP, the main concept to understand is how to modify the neighborhood function. If instead of a grid we declare a circular array of neurons, each node will only be conscious of the neurons in front of and behind it. That is, the inner similarity will work just in one dimension. Making this slight modification, the self-organizing map will behave as an elastic ring, getting closer to the cities but trying to minimize the perimeter of it thanks to the neighborhood function.

Although this modification is the main idea behind the technique, it will not work as is: the algorithm will hardly converge any of the times. To ensure the convergence of it, we can include a learning rate, , to control the exploration and exploitation of the algorithm. To obtain high exploration first, and high exploitation after that in the execution, we must include a decay in both the neighborhood function and the learning rate. Decaying the learning rate will ensure less aggressive displacement of the neurons around the model, and decaying the neighborhood will result in a more moderate exploitation of the local minima of each part of the model. Then, our regression can be expressed as:

Where is the learning rate at a given time, and (g) is the Gaussian function centered in a winner and with a neighborhood dispersion of . The decay function consists on simply multiplying the two given discounts, , for the learning rate and the neighborhood distance.

This expression is indeed quite similar to that of Q-Learning, and the convergence is search in a similar fashion to this technique. Decaying the parameters can be useful in unsupervised learning tasks like the aforementioned ones. It is also similar to the functioning of the Learning Vector Quantization technique, also developed by Teuvo Kohonen.

Finally, to obtain the route from the SOM, it is only necessary to associate a city with its winner neuron, traverse the ring starting from any point and sort the cities by order of appearance of their winner neuron in the ring. If several cities map to the same neuron, it is because the order of traversing such cities have not been contemplated by the SOM (due to lack of relevance for the final distance or because of not enough precision). In that case, any possible ordered can be considered for such cities.

In [32]:
!python ../TSP/som-tsp/src/main.py chi13905.tsp
Problem with 13905 nodes read.
Network of 111240 neurons created. Starting the iterations:
Radius has completely decayed, finishing execution at 38725 iterations
Route found of length 58.826158905022645
In [46]:
!python ../TSP/som-tsp/src/main.py den9.tsp
Problem with 9 nodes read.
Network of 72 neurons created. Starting the iterations:
Radius has completely decayed, finishing execution at 14253 iterations
Route found of length 1.1140670983781675
In [33]:
from PIL import Image
import glob

# Create the frames
frames = []
imgs = glob.glob('diagrams/*.png')
imgs.sort()
for i in imgs:
    image=Image.open(i)
    non_transparent=Image.new('RGBA',image.size,(255,255,255,255))
    non_transparent.paste(image,(0,0),image)
    frames.append(non_transparent)

# Save into a GIF file that loops forever
frames[0].save('chicago.gif', format='GIF',
               append_images=frames[1:],
               save_all=True,
               duration=500, loop=0)

Not only is the SOM beautiful to watch, it is also surprisingly efficient. It's able to find a suboptimal solution for tens of thousands of nodes and millions of parameters in less than a minute.

Because of it's reward system, or more simply put, laws of attraction. It's able to pass the geographical test of the Chicago River better than most. I would like to explore this algo in tandem with 2-opt and 3-opt more closely.

In [34]:
from networkx import DiGraph
from vrpy import VehicleRoutingProblem
import pulp
import itertools
import sklearn
from sklearn.metrics.pairwise import haversine_distances
In [35]:
def matrix(df):
    
    df[['long_radians','lat_radians']] = (
        np.radians(df.loc[:,['Longitude', 'Latitude']])
    )

    dist_matrix = haversine_distances(df[['lat_radians','long_radians']], df[['lat_radians','long_radians']])*3959
    
    # Note that 3959 is the radius of the earth in miles
    df_dist_matrix = (
        pd.DataFrame(dist_matrix,index=df.index, 
                     columns=df.index)
    )
    
    return df_dist_matrix

def melt(matrix):
    
    df_dist_long = (
        pd.melt(matrix, ignore_index=False)
    )

    df_dist_long = df_dist_long.sort_values(by=['value']).reset_index()

    df_dist_long = df_dist_long.rename(columns={'index':'start', 'variable': 'end', 'value':'miles'})

    return df_dist_long
In [36]:
den_mat = matrix(denver)
den_mat
Out[36]:
0 1 2 3 4 5 6 7 8 9
0 0.000000 12.192951 11.244259 7.081326 9.319796 11.477789 2.164204 2.725219 4.375331 1.725953
1 12.192951 0.000000 22.170626 5.133432 21.501273 18.774433 13.057837 14.751565 9.125581 10.571982
2 11.244259 22.170626 0.000000 17.523909 7.529449 18.943161 9.506344 8.651179 13.048077 12.886431
3 7.081326 5.133432 17.523909 0.000000 16.373689 14.686749 8.139617 9.695053 4.709459 5.440164
4 9.319796 21.501273 7.529449 16.373689 0.000000 12.247946 8.998379 6.973890 13.181603 10.935112
5 11.477789 18.774433 18.943161 14.686749 12.247946 0.000000 13.329045 12.129858 15.185133 11.495480
6 2.164204 13.057837 9.506344 8.139617 8.998379 13.329045 0.000000 2.083188 4.216318 3.496032
7 2.725219 14.751565 8.651179 9.695053 6.973890 12.129858 2.083188 0.000000 6.209079 4.449094
8 4.375331 9.125581 13.048077 4.709459 13.181603 15.185133 4.216318 6.209079 0.000000 3.705414
9 1.725953 10.571982 12.886431 5.440164 10.935112 11.495480 3.496032 4.449094 3.705414 0.000000
In [37]:
den_long = melt(den_mat)
den_long
Out[37]:
start end miles
0 0 0 0.000000
1 8 8 0.000000
2 7 7 0.000000
3 6 6 0.000000
4 5 5 0.000000
... ... ... ...
95 2 5 18.943161
96 1 4 21.501273
97 4 1 21.501273
98 2 1 22.170626
99 1 2 22.170626

100 rows × 3 columns

In [38]:
chi_mat = matrix(chicago)
chi_mat
Out[38]:
0 1 2 3 4 5 6 7 8 9 ... 13899 13900 13901 13902 13903 13904 13905 13906 13907 13908
0 0.000000 15.893235 15.886031 14.381582 14.312954 14.462854 12.715274 13.858960 17.760746 14.752824 ... 15.776456 16.048393 16.043210 15.288775 15.385240 15.408879 16.164806 15.954023 20.550840 20.368670
1 15.893235 0.000000 0.549348 2.746167 1.777750 1.588889 3.347681 2.371294 1.912546 1.623212 ... 31.520785 31.754795 31.761233 30.956729 31.077140 31.152491 31.832171 31.593784 36.194259 36.050558
2 15.886031 0.549348 0.000000 2.323587 1.600117 1.871530 3.220005 2.146346 1.882246 1.294533 ... 31.561915 31.802504 31.807056 31.009598 31.126563 31.193489 31.885909 31.651216 36.254804 36.105141
3 14.381582 2.746167 2.323587 0.000000 1.404202 2.843842 1.970107 1.113595 3.943117 1.172021 ... 30.151330 30.414134 30.412123 29.641138 29.745760 29.783308 30.518400 30.297257 34.902248 34.731942
4 14.312954 1.777750 1.600117 1.404202 0.000000 1.440686 1.620413 0.602531 3.479594 0.539163 ... 30.015869 30.261601 30.264638 29.473117 29.587203 29.647417 30.349953 30.118458 34.723772 34.569047
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
13904 15.408879 31.152491 31.193489 29.783308 29.647417 29.654830 28.072661 29.220834 33.048939 30.107277 ... 0.368453 0.718123 0.671162 0.700243 0.484407 0.000000 1.004424 1.044745 5.192472 4.975159
13905 16.164806 31.832171 31.885909 30.518400 30.349953 30.322083 28.784991 29.933675 33.733758 30.816338 ... 0.796354 0.359769 0.466439 0.877334 0.796911 1.004424 0.000000 0.319360 4.386098 4.219272
13906 15.954023 31.593784 31.651216 30.297257 30.118458 30.080368 28.556705 29.705407 33.496666 30.586794 ... 0.944483 0.584881 0.692115 0.690723 0.694394 1.044745 0.319360 0.000000 4.606220 4.458977
13907 20.550840 36.194259 36.254804 34.902248 34.723772 34.677037 33.162840 34.311540 38.098306 35.192701 ... 4.834557 4.516825 4.533542 5.262629 5.168178 5.192472 4.386098 4.606220 0.000000 0.420925
13908 20.368670 36.050558 36.105141 34.731942 34.569047 34.538724 33.003202 34.151859 37.952577 35.034965 ... 4.611800 4.320340 4.327944 5.095934 4.986189 4.975159 4.219272 4.458977 0.420925 0.000000

13909 rows × 13909 columns

In [39]:
chi_long = melt(chi_mat)
chi_long
Out[39]:
start end miles
0 0 0 0.000000
1 7393 7393 0.000000
2 7394 7394 0.000000
3 7395 7395 0.000000
4 7396 7396 0.000000
... ... ... ...
193460276 5381 2428 81.118902
193460277 2430 5381 81.169669
193460278 5381 2430 81.169669
193460279 5381 2427 81.516145
193460280 2427 5381 81.516145

193460281 rows × 3 columns

3D Bin Packing¶

Because the SOM gave us a suboptimal solution to our routing problem, we can now iterate through the route one truck at a time to see the minimum amount of trucks at capacity it will take to complete the route. Unfortunately, the 3D binning algorithm we are using is O(n)2 at best, and O(n)3 at worst. This could be improved by parallel processing at some point, if not by simply improving the algorithm.

In [47]:
with open('route.txt', 'r') as f:
    route = [line.split()[0] for line in f.readlines()]
    dist = 0
    for a, b in zip(route, route[1:]):
        dist += den_mat[int(a)][int(b)]
    
print(dist, 'miles')
87.25918901434022 miles
In [48]:
from py3dbp import Packer, Bin, Item
from collections import deque

queue = deque(route)
packer = Packer()
number = 1

while len(queue) > 0:
    b = Bin(f'Route {number}', 38.49, 38.49, 38.49, 500000)
    packer.add_bin(b)
    if len(b.unfitted_items) == 0:
        i = queue.popleft()
        packer.add_item(Item(i, denver.loc[int(i)]['Width'], denver.loc[int(i)]['Length'], denver.loc[int(i)]['Height'], chicago.loc[int(i)]['Weight']))
        packer.pack()
    else:
        number += 1
In [49]:
b = packer.bins[-1]
print(":::::::::::", b.string())

print("FITTED ITEMS:")
dist = 0
for i,j in zip(b.items, b.items[1:]):
    dist += den_mat[int(i.string()[0])][int(j.string()[0])]
    print("====> ", j.string()[0], ':', dist, 'miles')

print("***************************************************")
print("***************************************************")
::::::::::: Route 1(38.490x38.490x38.490, max_weight:500000.000) vol(57022.169)
FITTED ITEMS:
====>  4 : 9.319795891390228 miles
====>  2 : 16.849244972102095 miles
====>  7 : 25.500423622076333 miles
====>  5 : 37.630281720837175 miles
====>  6 : 50.95932676490349 miles
====>  1 : 64.01716410343339 miles
====>  8 : 73.1427452215115 miles
====>  3 : 77.85220417031624 miles
***************************************************
***************************************************

CVRP: Greedy, Savings, and Pulp Optimization¶

Greedy

Some authors use the name Greedy for Nearest Neighbor, but it is more appropriately reserved for the following special case of the ‘‘greedy algorithm’’ of matroid theory. In this heuristic, we view an instance as a complete graph with the cities as vertices and with an edge of length d(c i,c j) between each pair {c i,c j} of cities. A tour is then simply a Hamiltonian cycle in this graph, i.e., a connected collection of edges in which every city has degree 2. We build up this cycle one edge at a time, starting with the shortest edge, and repeatedly adding the shortest remaining available edge, where an edge is available if it is not yet in the tour and if adding it would not create a degree-3 vertex or a cycle of length less than N. (In view of the intermediate partial tours typically constructed by this heuristic, it is called the multi-fragment heuristic by Bentley [1990a,1992]). The Greedy heuristic can be implemented to run in time Θ(N2 logN) and is thus somewhat slower than NN. On the other hand, its worst-case tour quality may be somewhat better. As with NN, it can be shown that Greedy(I)/OPT (I) ≤ ( 0. 5 ) ([log 2N] + 1 ) for all instances I obeying the triangle inequality [Moore, 1984], but the worst examples known for Greedy only make the ratio grow as ( logN)/( 3 log log N) [Frieze,1979].

Clarke-Wright

The Clarke-Wright savings heuristic (Clarke-Wright or simply CW for short) is derived from a more general vehicle routing algorithm due to Clarke and Wright [1964]. In terms of the TSP, we start with a pseudo-tour in which an arbitrarily chosen city is the hub and the salesman returns to the hub after each visit to another city. (In other words, we start with a multigraph in which every non-hub vertex is connected by two edges to the hub). For each pair of non-hub cities, let the savings be the amount by which the tour would be shortened if the salesman went directly from one city to the other, bypassing the hub. We now proceed analogously to the Greedy algorithm. We go through the non-hub city pairs in non-increasing order of savings, performing the bypass so long as it does not create a cycle of non-hub vertices or cause a non-hub vertex to become adjacent to more than two other non-hub vertices. The construction process terminates when only two non-hub cities remain connected to the hub, in which case we have a true tour. As with Greedy, this algorithm can be implemented to run in time Θ(N 2 logN). The best performance guarantee currently known (assuming the triangle inequality) is CW(I)/OPT (I) ≤ [log 2N] + 1 (a factor of 2 higher than that for Greedy) [Ong & Moore, 1984], but the worst examples known yield the same ( logN)/( 3 log log N) ratio as obtained for Greedy [Frieze, 1979].

In [50]:
D = DiGraph()

for i in range(len(den_long)):
    a = den_long.start[i]
    b = den_long.end[i]
    c = den_long.miles[i]
    if a == 0:
        a = 'Source'
    if b == 0:
        b = 'Sink'
    D.add_edge(a,b,cost=c)

for i in range(1, len(denver)):
    D.nodes[i]['demand'] = denver['Demand'][i]
    
prob = VehicleRoutingProblem(D, load_capacity=57024)

prob.solve()
prob.best_routes
INFO:vrpy.vrp:new upper bound : max num stops = 11
INFO:vrpy.vrp:Clarke & Wright solution found with value 69.9542163781727 and 2 vehicles
INFO:vrpy.vrp:Greedy solution found with value 73.1532298813241 and 1 vehicles
INFO:vrpy.vrp:iteration 0, 69.954
INFO:vrpy.vrp:iteration 1, 69.954
INFO:vrpy.vrp:iteration 2, 69.954
INFO:vrpy.vrp:iteration 3, 69.954
INFO:vrpy.vrp:iteration 4, 69.954
INFO:vrpy.vrp:iteration 5, 69.954
INFO:vrpy.vrp:iteration 6, 69.954
INFO:vrpy.vrp:iteration 7, 69.954
INFO:vrpy.vrp:iteration 8, 69.954
INFO:vrpy.vrp:iteration 9, 69.954
INFO:vrpy.vrp:iteration 10, 69.954
INFO:vrpy.vrp:iteration 11, 69.954
INFO:vrpy.vrp:iteration 12, 69.954
INFO:vrpy.vrp:iteration 13, 69.954
INFO:vrpy.vrp:iteration 14, 69.954
INFO:vrpy.vrp:iteration 15, 69.954
INFO:vrpy.vrp:iteration 16, 68.835
INFO:vrpy.vrp:iteration 17, 68.835
INFO:vrpy.vrp:iteration 18, 68.835
INFO:vrpy.vrp:iteration 19, 68.835
INFO:vrpy.vrp:iteration 20, 68.434
INFO:vrpy.vrp:iteration 21, 68.434
INFO:vrpy.vrp:iteration 22, 68.434
INFO:vrpy.vrp:iteration 23, 68.434
INFO:vrpy.vrp:iteration 24, 67.630
INFO:vrpy.vrp:iteration 25, 67.630
INFO:vrpy.master_solve_pulp:total cost = 67.63099940737503
Out[50]:
{1: ['Source', 5, 4, 2, 7, 6, 8, 1, 3, 9, 'Sink']}
In [ ]:
C = DiGraph()

for i in range(len(chi_long)):
    a = chi_long.start[i]
    b = chi_long.end[i]
    c = chi_long.miles[i]
    if a == 0:
        a = 'Source'
    if b == 0:
        b = 'Sink'
    D.add_edge(a,b,cost=c)

for i in range(1, len(chicago)):
    D.nodes[i]['demand'] = chicago['Demand'][i]
    
prob = VehicleRoutingProblem(C, load_capacity=57024)

prob.solve()
prob.best_routes

CVRP: Christofides, Generalized Assignment, Sweep, 3-opt, Parallel and Sequential Savings, Insertion, Nearest Neighbor, et cetera¶

In [54]:
!python ../VRP/VRP.py -a all DEN-n10-k1.vrp
Solving DEN-n10-k1 with CW64-PS

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 6, 7, 2, 4, 5, 3, 1, 8, 9, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 77.11672962049798


Solving DEN-n10-k1 with RT79-CAWLIP

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 6, 7, 2, 4, 5, 1, 3, 8, 9, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 76.79794263555789


Solving DEN-n10-k1 with Ga67-PS|pi+lamda

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 2, 4, 7, 6, 9, 8, 3, 1, 5, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 85.28389787534057


Solving DEN-n10-k1 with We64-SS

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 6, 7, 5, 4, 2, 8, 1, 3, 9, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 80.72042349303273


Solving DEN-n10-k1 with Pa88-PS|G2P

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 6, 7, 2, 4, 5, 1, 3, 8, 9, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 76.79794263555789


Solving DEN-n10-k1 with HP76-PS|IMS

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 6, 7, 2, 4, 5, 3, 1, 8, 9, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 77.11672962049798


Solving DEN-n10-k1 with vB94-SI

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 1, 3, 8, 9, 6, 4, 2, 7, 5, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 88.14637617371473


Solving DEN-n10-k1 with MJ76-INS

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 6, 7, 2, 4, 5, 1, 3, 8, 9, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 76.79794263555789


Solving DEN-n10-k1 with vB94-PI

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 1, 3, 8, 9, 6, 4, 2, 7, 5, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 88.14637617371473


Solving DEN-n10-k1 with DV89-MM
Restricted license - for non-production use only - expires 2024-10-28
ERROR: Failed to solve DEN-n10-k1 with mbsa because not enough values to unpack (expected 2, got 0)
Solving DEN-n10-k1 with CMT79-2P

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 6, 7, 2, 4, 5, 1, 3, 8, 9, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 76.79794263555789


Solving DEN-n10-k1 with vB95-SNN

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 2, 4, 5, 1, 3, 8, 9, 6, 7, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 81.79283788798156


Solving DEN-n10-k1 with vB95-PNN

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 2, 4, 5, 1, 3, 8, 9, 6, 7, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 81.79283788798156


Solving DEN-n10-k1 with Ty68-NN

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 6, 7, 2, 4, 5, 1, 3, 8, 9, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 76.79794263555789


Solving DEN-n10-k1 with Sweep

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 5, 4, 7, 2, 6, 8, 1, 3, 9, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 84.63887110399507


Solving DEN-n10-k1 with WH72-SwLS

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 6, 7, 2, 4, 5, 1, 3, 8, 9, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 76.79794263555789


Solving DEN-n10-k1 with GM74-SwRI

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 6, 7, 2, 4, 5, 1, 3, 8, 9, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 76.79794263555789


Solving DEN-n10-k1 with Be83-RFCS

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 6, 0, 7, 2, 4, 5, 1, 3, 8, 9, 0]
FEASIBLE: True
SOLUTION K: 2
SOLUTION LENGTH: 80.60808351113056


Solving DEN-n10-k1 with FJ81-GAP

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 6, 7, 2, 4, 5, 1, 3, 8, 9, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 76.79794263555789


Solving DEN-n10-k1 with FR76-1PTL

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 6, 7, 2, 4, 5, 1, 3, 8, 9, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 76.79794263555789


Solving DEN-n10-k1 with SG84-LR3OPT

SIZE: 10
CAPACITY: 57024
DISTANCE: None
SERVICE_TIME: None

SOLUTION: [0, 6, 7, 2, 4, 5, 1, 3, 8, 9, 0]
FEASIBLE: True
SOLUTION K: 1
SOLUTION LENGTH: 76.79794263555789




instance	N	K*	C	tightness	L	st	Be83-RFCS	CMT79-2P	CW64-PS	DV89-MM	FJ81-GAP	FR76-1PTL	GM74-SwRI	Ga67-PS|pi+lamda	HP76-PS|IMS	MJ76-INS	Pa88-PS|G2P	RT79-CAWLIP	SG84-LR3OPT	Sweep	Ty68-NN	WH72-SwLS	We64-SS	vB94-PI	vB94-SI	vB95-PNN	vB95-SNN
DEN-n10-k1	9	1	57024	0.408	None	None	 80.60808351113056	 76.79794263555789	 77.11672962049798	 inf	 76.79794263555789	 76.79794263555789	 76.79794263555789	 85.28389787534057	 77.11672962049798	 76.79794263555789	 76.79794263555789	 76.79794263555789	 76.79794263555789	 84.63887110399507	 76.79794263555789	 76.79794263555789	 80.72042349303273	 88.14637617371473	 88.14637617371473	 81.79283788798156	 81.79283788798156
In [ ]:
!python ../VRP/VRP.py -a all CHI-n13906-k450.vrp

Implemented Heuristics with References¶

2o : Croes & Flood (1958) two phase heuristic.

CF58-2O: G. A. CROES (1958). A method for solving traveling salesman problems. Operations Res. 6 (1958), pp., 791-812.

and

M. M. FLOOD (1956). The traveling-salesman problem. Operations Res. 4 (1956), pp., 61-75.

cmt : Christofides, Mingozzi & Toth (1979) two phase heuristic.

CMT79-2P: Christofides, N., Mingozzi, A., and Toth, P. (1979). The vehicle routing problem. In Christofides, N., Mingozzi, A., Toth, P., and Sandi, C., editors, Combinatorial Optimization, chapter 11, pages 315-338. Wiley.

gap : Fisher & Jaikumar (1981) generalized assignment problem (GAP) heuristic. Requires Gurobi.

FJ81-GAP: Fisher, M. L. and Jaikumar, R. (1981). A generalized assignment heuristic for vehicle routing. Networks, 11(2):109-124.

gm : Gillett & Miller (1974) Sweep algorithm with emering route improvement step.

swp : Sweep algorithm without any route improvement heuristics.

GM74-SwRI: Gillett, B. E. and Miller, L. R. (1974). A heuristic algorithm for the vehicle-dispatch problem. Operations Research, 22(2):340-349.

gpl : Parallel savings algorithm with Gaskell (1967) with the 𝜋 and 𝜆 criteria.

Ga67-PS|𝜋𝜆: Gaskell, T. (1967). Bases for vehicle fleet scheduling. Journal of the Operational Research Society, 18(3):281-295.

gps : Paessens (1988) parametrized parallel savings algorithm.

Pa88-PS|G2P: Paessens, H. (1988). The savings algorithm for the vehicle routing problem. European Journal of Operational Research, 34(3):336-344.

ims : Holmes & Parker (1976) parallel savings supression algorithm.

HP76-PS|IMS: Holmes, R. and Parker, R. (1976). A vehicle scheduling procedure based upon savings and a solution perturbation scheme. Journal of the Operational Research Society, 27(1):83-92.

lr3o : Stewart & Golden (1984) 3-opt* heuristic with Lagrangian relaxations.

SG84-LR3OPT: Stewart, W. R. and Golden, B. L. (1984). A Lagrangean relaxation heuristic for vehicle routing. European Journal of Operational Research, 15(1):84-88.

mbsa : Desrochers and Verhoog (1989) maximum matching problem solution based savings algorithm. Requires Gurobi.

DV89-MM: Desrochers, M. and Verhoog, T. W. (1989). G-89-04 : A matching based savings algorithm for the vehicle routing problem. Technical report, GERAD, Montreal, Canada.

mj : Mole & Jameson (1976) sequential cheapest insertion heuristic with a route improvement phase.

MJ76-INS: Mole, R. and Jameson, S. (1976). A sequential route-building algorithm employing a generalised savings criterion. Journal of the Operational Research Society, 27(2):503-511.

ps : Clarke & Wright (1964) parallel savings algorithm.

CW64-PS:Clarke, G. and Wright, J. W. (1964). Scheduling of vehicles from a central depot to a number of delivery points. Operations Research, 12(4):568-581.

ptl : Foster & Ryan (1976) Petal set covering algorithm. Requires Gurobi.

FR76-1PTL: Foster, B. A. and Ryan, D. M. (1976). An integer programming approach to the vehicle scheduling problem. Journal of the Operational Research Society, 27(2):367-384.

pi, vB94-PI : parallel insertion heuristic as described by van Breedam (1994, 2002).

pn, vB94-PNN : Parallel Nearest Neighbor construction heuristic.

si, vB94-SI : Sequential cheapest insertion heuristic without local search (van Breedam 1994, 2002).

sn, vB94-SNN : Sequential Nearest Neighbor construction heuristic as described by van Breedam (1994).

vB94: Van Breedam, A. (1994). An Analysis of the Behavior of Heuristics for the Vehicle Routing Problem for a Selectrion of Problems with Vehicle-related, Customer-related, and Time-related Constraints. PhD thesis, Faculty of Applied Economics, University of Antwerp, Belgium - RUCA.

and

Van Breedam, A. (2002). A parametric analysis of heuristics for the vehicle routing problem with side-constraints. European Journal of Operational Research, 137(2):348-370.

rfcs : Route-first-cluster-second heuristic of Beasley (1983).

Be83-RFCS: Beasley, J. E. (1983). Route first - cluster second methods for vehicle routing. Omega, 11(4):403-408.

ss : Webb (1964) sequential savings algorithm.

We64-SS: Webb, M. (1964). A study in transport routing. Glass Technology, 5:178-181.

ty : Tyagi (1968) Nearest Neighbor construction heuristic.

Ty68-NN: Tyagi, M. S. (1968). A practical method for the truck dispatching problem. Journal of the Operations Research Society of Japan, 10:76-92.

wh : Sweep heuristic with Wren and Holliday (1972) improvement procedures.

WH72-SwLS: Wren, A. and Holliday, A. (1972). Computer scheduling of vehicles from one or more depots to a number of delivery points. Journal of the Operational Research Society, 23(3):333-344.

TO DO¶

  • Kmeans Constrained Clustering on Chicago data.

  • Visualize 3d binning.

  • Visualize 3-opt algorithm.

In [ ]: