==========================================
DTM + EAP = DEAP: a Distributed Evolution
==========================================
As part of the DEAP framework, EAP offers an easy DTM integration. Since the EAP algorithms use a map function stored in the toolbox to spawn the evaluations of the individuals (by default, this is simply the standard Python :func:`map`), the parallelization can be done very easily, by replacing the map operator in the toolbox::

    toolbox.register("map", dtm.map)
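The mechanism behind this is not specific to DTM and can be illustrated with plain Python: the algorithm only ever calls the map stored in the registry, so swapping the registered function is the whole parallelization step. In the following sketch, the ``Toolbox`` class is a simplified stand-in written for illustration (not DEAP's actual implementation), and a thread pool merely plays the role of ``dtm``::

    from functools import partial
    from multiprocessing.pool import ThreadPool

    class Toolbox:
        # Simplified stand-in for DEAP's toolbox: a registry of named callables.
        def register(self, alias, function, *args):
            setattr(self, alias, partial(function, *args))

    def evaluate(x):
        return x * x

    toolbox = Toolbox()
    toolbox.register("map", map)                  # default: the serial built-in map
    serial = list(toolbox.map(evaluate, range(5)))

    pool = ThreadPool(4)
    toolbox.register("map", pool.map)             # drop-in replacement, same call signature
    parallel = list(toolbox.map(evaluate, range(5)))
    pool.close()

    print(serial == parallel)                     # True: only the executor changed

Exactly the same substitution applies with ``dtm.map``, which distributes the calls over MPI workers instead of threads.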
Thereafter, ensure that your main program is enclosed in a Python function (for instance, ``main``), and just add the last line::

    dtm.start(main)
For instance, take a look at the short version of the onemax example. This is how it may be parallelized::
    import array
    import logging
    import random

    from deap import algorithms, base, creator, dtm, tools

    creator.create("FitnessMax", base.Fitness, weights=(1.0,))
    creator.create("Individual", array.array, typecode='b', fitness=creator.FitnessMax)

    toolbox = base.Toolbox()

    # Attribute generator
    toolbox.register("attr_bool", random.randint, 0, 1)

    # Structure initializers
    toolbox.register("individual", tools.initRepeat, creator.Individual, toolbox.attr_bool, 100)
    toolbox.register("population", tools.initRepeat, list, toolbox.individual)

    def evalOneMax(individual):
        return sum(individual),

    toolbox.register("evaluate", evalOneMax)
    toolbox.register("mate", tools.cxTwoPoints)
    toolbox.register("mutate", tools.mutFlipBit, indpb=0.05)
    toolbox.register("select", tools.selTournament, tournsize=3)
    toolbox.register("map", dtm.map)

    def main():
        pop = toolbox.population(n=300)
        hof = tools.HallOfFame(1)
        stats = tools.Statistics(lambda ind: ind.fitness.values)
        stats.register("Avg", tools.mean)
        stats.register("Std", tools.std)
        stats.register("Min", min)
        stats.register("Max", max)

        algorithms.eaSimple(toolbox, pop, cxpb=0.5, mutpb=0.2, ngen=40, stats=stats, halloffame=hof)
        logging.info("Best individual is %s, %s", hof[0], hof[0].fitness.values)

        return pop, stats, hof

    dtm.start(main)  # Launch the first task
As one can see, parallelization requires almost no change at all (an import, the registration of the distributed map and the starting instruction), even in a non-trivial program. This program can now be run on a multi-core computer, on a small cluster or on a supercomputer, without any modification, as long as those environments provide an MPI implementation.
In this specific case, the distributed version would actually be *slower* than the serial one, because of the extreme simplicity of the evaluation function (which takes *less than 0.1 ms* to execute): the small overhead generated by the serialization, load balancing, processing and transfer of the tasks and the results is not compensated by any gain in evaluation time. In more complex, real-life problems (for instance, sorting networks), the benefit of the distributed version is clearly noticeable.
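When the evaluation function is that cheap, a common way to amortize the per-task overhead is to ship whole chunks of individuals per task instead of one individual per task, so that the dispatch and serialization cost is paid once per chunk. The helper names below are hypothetical; this is a generic sketch of the idea, not part of DEAP's or DTM's API::

    def eval_onemax(individual):
        # Same trivial fitness as above: the number of ones.
        return (sum(individual),)

    def eval_chunk(chunk):
        # One task now evaluates a whole slice of the population.
        return [eval_onemax(ind) for ind in chunk]

    def chunked_map(map_fn, population, chunksize):
        # Split the population, map over the chunks, then flatten the results.
        chunks = [population[i:i + chunksize]
                  for i in range(0, len(population), chunksize)]
        return [fit for result in map_fn(eval_chunk, chunks) for fit in result]

    population = [[0, 1] * 5 for _ in range(10)]      # ten 10-bit individuals
    fits = chunked_map(map, population, chunksize=4)  # map_fn could be a distributed map
    print(fits == [eval_onemax(ind) for ind in population])   # True

With a distributed map substituted for the built-in one, each worker then receives ``chunksize`` evaluations per round trip instead of one.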