It's Christmas! What better time to work on some programming problems?
Along with the graph radius problem I discussed in my last post, the folks on Reddit posted a "hard" problem dealing with coloring the districts in France. Apparently France is divided into 96 districts similar to states in the U.S. or the counties within U.S. States. The goal of the problem is to assign a color to each district so that no two adjacent districts share the same color, while minimizing the total number of colors used. When coloring political maps, it is important not to have two regions with the same color adjacent to one another to avoid confusion about where the borders lie.
Apparently this sort of problem is known as a graph coloring problem and is subject to the four color theorem. Basically, the theorem states that any configuration of bordering regions like France's Districts or the U.S. states can be colored with 4 or fewer colors without any two adjacent regions having the same color. Adjacency means regions share a common border that is not a point (a corner.). This places an upper limit of 4 on any solution to the problem.
The input given by the problem only deals with a small subset of 8 regions within France:
8
64 40 32 65
65 64 32 31
31 65 32 82 81 11 9
9 31 11 66
66 9 11
40 33 47 32 64
32 40 47 82 31 65 64
11 31 81 34 66 9
In this problem, regions are represented as an adjacency list: a mapping of each region to a list of every region it is adjacent to. For instance, the second row of data means that district 64 is adjacent to districts 40, 32 and 65.
I wasn't sure how to write a good algorithm to find an exact answer to this problem without going out and finding one that someone else wrote, so I started by making a greedy algorithm that starts with the district with the most connections, assigns it a color and continues in this fashion until all districts are colored. It didn't return an optimal solution, even for the small test case.
Since I didn't know an exact solution, I figured I'd come with with a heuristic to make a good guess. I constructed an algorithm to assign colors to all the districts randomly, making sure that bordering districts aren't colored the same and then a a main function to run the random coloring algorithm n times and return the guess that uses the fewest colors.
My code:
import random
import copy
import time
data = [x.strip().split() for x in open("challenge130H.txt").readlines()]
verts = [x[0] for x in data[1:]]
connections = [x[1:] for x in data[1:]]
adjacency_list = {k:v for k,v in zip(verts, connections)}
blank_color_list = {k:v for k,v in zip(verts, [None]*len(verts))}
start=time.time()
#Function assigns a random coloring to the input list.
def random_color(color_list):
remaining = [x for x,y in color_list.items()]
random.shuffle(remaining)
while remaining:
choice = remaining.pop()
connections = adjacency_list[choice]
for x in range(len(adjacency_list)):
bordering = False
for y in connections:
if y not in color_list:
continue
if color_list[y] == x:
bordering = True
break
if bordering:
continue
else:
color_list[choice] = x
break
def check_num_colors(color_list):
return len(set(v for k,v in color_list.items()))
#Produces n random colorings and returns the best one.
def random_coloring_heuristic(n):
best_coloring = []
best_score = len(adjacency_list)
for test in range(n):
clist = copy.copy(blank_color_list)
random_color(clist)
score = check_num_colors(clist)
if score < best_score:
best_score = score
best_coloring = clist
return (best_score, best_coloring)
estimate, coloring_guess = random_coloring_heuristic(100)
print("Number of colors used: ",estimate, coloring_guess)
print(time.time()-start)
Output:
Number of colors used: 3 {'64': 0, '65': 1, '66': 0, '9': 1, '40': 1, '32': 2, '31': 0, '11': 2}
0.04700303077697754 (Runtime)
The random heuristic solved the small test input case with only 100 random color assignments. Someone on the forums also posted a full list of France's departments so I tested my algorithm on that. With 100 assignments, the heuristic's best ordering used 5 colors--a sub-optimal solution given the four color theorem. Bumping n up to 10,000 still gave me an ordering of 5 colors with a running time of 5.5 seconds. At that point I wasn't confident that my solution would every find a 4 color ordering within a reasonable amount of time, but I tried it once more with 50,000 random assignments. This time, it did find a random ordering with 4 colors after about 25 seconds, and succeeded in finding such an ordering in 4 of 5 attempts. It seems a graph with 100 regions is around the limit for my code to find an optimal solution in a reasonable amount of time.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.