==========================================
DTM + EAP = DEAP: a Distributed Evolution
==========================================
As part of the DEAP framework, EAP offers an easy DTM integration. Since the EAP algorithms use a map function stored in the toolbox to spawn the evaluations of the individuals (by default, this is simply the standard Python :func:`map`), the parallelization can be done very easily, by replacing the map operator in the toolbox::

    toolbox.register("map", dtm.map)
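The mechanism behind this is not specific to DTM and can be illustrated with plain Python: the algorithm only ever calls the map stored in the registry, so swapping the registered function is the whole parallelization step. In the following sketch, the ``Toolbox`` class is a simplified stand-in written for illustration (not DEAP's actual implementation), and a thread pool merely plays the role of ``dtm``::

    from functools import partial
    from multiprocessing.pool import ThreadPool

    class Toolbox:
        # Simplified stand-in for DEAP's toolbox: a registry of named callables.
        def register(self, alias, function, *args):
            setattr(self, alias, partial(function, *args))

    def evaluate(x):
        return x * x

    toolbox = Toolbox()
    toolbox.register("map", map)                  # default: the serial built-in map
    serial = list(toolbox.map(evaluate, range(5)))

    pool = ThreadPool(4)
    toolbox.register("map", pool.map)             # drop-in replacement, same call signature
    parallel = list(toolbox.map(evaluate, range(5)))
    pool.close()

    print(serial == parallel)                     # True: only the executor changed

Exactly the same substitution applies with ``dtm.map``, which distributes the calls over MPI workers instead of threads.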
Thereafter, ensure that your main program is enclosed in a Python function (for instance, ``main``), and just add the last line::

    dtm.start(main)
For instance, take a look at the short version of the onemax example. This is how it may be parallelized::
    import array
    import logging
    import random

    from deap import algorithms, base, creator, dtm, tools

    creator.create("FitnessMax", base.Fitness, weights=(1.0,))
    creator.create("Individual", array.array, typecode='b', fitness=creator.FitnessMax)

    toolbox = base.Toolbox()

    # Attribute generator
    toolbox.register("attr_bool", random.randint, 0, 1)

    # Structure initializers
    toolbox.register("individual", tools.initRepeat, creator.Individual, toolbox.attr_bool, 100)
    toolbox.register("population", tools.initRepeat, list, toolbox.individual)

    def evalOneMax(individual):
        return sum(individual),

    toolbox.register("evaluate", evalOneMax)
    toolbox.register("mate", tools.cxTwoPoints)
    toolbox.register("mutate", tools.mutFlipBit, indpb=0.05)
    toolbox.register("select", tools.selTournament, tournsize=3)
    toolbox.register("map", dtm.map)

    def main():
        pop = toolbox.population(n=300)
        hof = tools.HallOfFame(1)
        stats = tools.Statistics(lambda ind: ind.fitness.values)
        stats.register("Avg", tools.mean)
        stats.register("Std", tools.std)
        stats.register("Min", min)
        stats.register("Max", max)

        algorithms.eaSimple(toolbox, pop, cxpb=0.5, mutpb=0.2, ngen=40, stats=stats, halloffame=hof)
        logging.info("Best individual is %s, %s", hof[0], hof[0].fitness.values)

        return pop, stats, hof

    dtm.start(main)  # Launch the first task
As one can see, parallelization requires almost no change at all (an import, the registration of the distributed map and the starting instruction), even in a non-trivial program. This program can now be run on a multi-core computer, on a small cluster or on a supercomputer, without any modification, as long as those environments provide an MPI implementation.
In this specific case, the distributed version would actually be *slower* than the serial one, because of the extreme simplicity of the evaluation function (which takes *less than 0.1 ms* to execute): the small overhead generated by the serialization, load balancing, processing and transfer of the tasks and the results is not compensated by any gain in evaluation time. In more complex, real-life problems (for instance, sorting networks), the benefit of the distributed version is clearly noticeable.
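When the evaluation function is that cheap, a common way to amortize the per-task overhead is to ship whole chunks of individuals per task instead of one individual per task, so that the dispatch and serialization cost is paid once per chunk. The helper names below are hypothetical; this is a generic sketch of the idea, not part of DEAP's or DTM's API::

    def eval_onemax(individual):
        # Same trivial fitness as above: the number of ones.
        return (sum(individual),)

    def eval_chunk(chunk):
        # One task now evaluates a whole slice of the population.
        return [eval_onemax(ind) for ind in chunk]

    def chunked_map(map_fn, population, chunksize):
        # Split the population, map over the chunks, then flatten the results.
        chunks = [population[i:i + chunksize]
                  for i in range(0, len(population), chunksize)]
        return [fit for result in map_fn(eval_chunk, chunks) for fit in result]

    population = [[0, 1] * 5 for _ in range(10)]      # ten 10-bit individuals
    fits = chunked_map(map, population, chunksize=4)  # map_fn could be a distributed map
    print(fits == [eval_onemax(ind) for ind in population])   # True

With a distributed map substituted for the built-in one, each worker then receives ``chunksize`` evaluations per round trip instead of one.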