nicholasharris/Distributed-Genetic-Algorithm-with-Google-Sheets

Distributed-Genetic-Algorithm-with-Google-Sheets

A template for a general-purpose distributed genetic algorithm (GA) using Python 3 and the Google Sheets API. The code uses my implementation of Markov Network Brains (from https://github.com/nicholasharris/Markov-Brains-Python) as the stand-in GA, but any custom GA can be used in its place, as long as the chromosomes of the population can be encoded to and decoded from strings and written to the Google Sheet cells.
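To make the string requirement concrete, here is a minimal sketch of a round trip between a chromosome and a cell value. It assumes the chromosome is a flat list of integers and uses a comma-separated format; the function names `encode_chromosome` and `decode_chromosome` are hypothetical and are not taken from this repository, whose own encoding may differ.

```python
from typing import List


def encode_chromosome(genes: List[int]) -> str:
    """Join integer genes into one comma-separated string for a single cell."""
    return ",".join(str(g) for g in genes)


def decode_chromosome(cell: str) -> List[int]:
    """Parse a cell string back into a list of integer genes."""
    if not cell:
        # An empty cell is treated as an empty chromosome (an assumption).
        return []
    return [int(token) for token in cell.split(",")]


# Round trip: what the master would write is what a worker reads back.
genes = [42, 213, 7, 0, 255]
cell_value = encode_chromosome(genes)
assert decode_chromosome(cell_value) == genes
```

Any encoding works as long as decoding reproduces exactly the chromosome that was encoded.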

The code consists of a master process and a slave process. Run exactly one master process: it writes the genomes to the Google Sheet, waits for all genomes to be evaluated, and then performs the usual GA operations to create a new generation. Meanwhile, run multiple slave processes, which divide the population's chromosomes among themselves, read them from the sheet, evaluate them according to your problem domain, and write their fitnesses back to the sheet for the master process.
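The division of labour described above can be sketched without any network access by letting a plain dictionary stand in for the shared sheet. Everything here is illustrative: the names `master_write`, `worker_evaluate`, and `all_evaluated`, the two-column layout, and the toy fitness function are assumptions, not the repository's actual code.

```python
from typing import Callable, Dict, List, Optional

# In-memory stand-in for the shared sheet: one "genome" and one "fitness"
# column, with one row per individual. (Assumed layout, for illustration.)
Sheet = Dict[str, List[Optional[str]]]


def master_write(sheet: Sheet, genomes: List[str]) -> None:
    """Master step: publish a new generation and clear the old fitnesses."""
    sheet["genome"] = list(genomes)
    sheet["fitness"] = [None] * len(genomes)


def worker_evaluate(sheet: Sheet, start: int, chunk: int,
                    evaluate: Callable[[str], float]) -> None:
    """Worker step: evaluate rows [start, start + chunk) and write fitnesses."""
    end = min(start + chunk, len(sheet["genome"]))
    for i in range(start, end):
        sheet["fitness"][i] = str(evaluate(sheet["genome"][i]))


def all_evaluated(sheet: Sheet) -> bool:
    """Master check: has every row received a fitness value?"""
    return all(f is not None for f in sheet["fitness"])


sheet: Sheet = {}
master_write(sheet, ["1,1,0", "0,0,0", "1,1,1", "0,1,0"])

# Toy fitness: number of 1-genes in the encoded chromosome.
count_ones = lambda genome: genome.split(",").count("1")

worker_evaluate(sheet, start=0, chunk=2, evaluate=count_ones)  # first worker
worker_evaluate(sheet, start=2, chunk=2, evaluate=count_ones)  # second worker

if all_evaluated(sheet):
    print(sheet["fitness"])  # the master would now run selection, crossover, mutation
```

In the real setup each function would read from or write to the Google Sheet instead of the dictionary, and the master would poll `all_evaluated` with a delay between checks.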

This project serves as a very high-level implementation of a distributed genetic algorithm. Despite the simplicity of the approach, I saw a significant speedup over a single-process Python GA, even on a single computer running 1 master and 5 slave processes. The code also makes it easy to add more computers, since they do not need to be on the same network; a new computer only needs to know the API key for the common sheet you're using for reading and writing.

I found that reading from and writing to the sheet is quite inefficient, costing up to a minute per generation if the population is large and the chromosomes are very long. However, in many problem domains the time needed to evaluate the chromosomes is by far the dominant cost of running the GA; in these common situations the sheet overhead may amount to only a few percent of the run time for a generation, in which case it is insignificant. The overhead can probably also be reduced through experimentation.
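A quick way to judge whether the sheet overhead matters for your problem is to compute its share of one generation's wall-clock time. The sketch below uses the one-minute worst case mentioned above together with a made-up 30-minute evaluation time; that second figure is purely an assumption for illustration, not a measurement.

```python
def io_overhead_fraction(io_seconds: float, eval_seconds: float) -> float:
    """Share of one generation's wall-clock time spent on sheet reads/writes."""
    return io_seconds / (io_seconds + eval_seconds)


# 60 s of sheet I/O against a hypothetical 30 minutes of fitness evaluation.
fraction = io_overhead_fraction(io_seconds=60, eval_seconds=30 * 60)
print(f"{fraction:.1%}")  # roughly 3% of the generation
```

If evaluation takes only a few seconds per generation, the same calculation shows the sheet I/O dominating, and this approach is a poor fit.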

TO RUN: first go through https://developers.google.com/sheets/api/quickstart/python to enable the Google Sheets API. Create a sheet, and put its sheet ID in the code of each process so that both use the same sheet.

The master process runs with no command line arguments. The slave process runs with one command line argument, the start index: the index on the Google Sheet from which it starts reading. So for a population of 500 genomes and 5 slave processes, launch one process with argument 0, the next with 100, then 200, 300, and 400. (The default chunk of genomes each process grabs is 100; you can change this in the slave process code, in which case increment the start indices by the new amount instead.) A different apportionment scheme may suit your problem domain better, but the default at least lets you get started and is easy to modify.

After that, it's just a matter of pasting in your own problem domain, and you're done! An easy, scalable, distributed genetic algorithm with no difficult-to-use libraries or difficult installation. You can also easily replace my Markov Brain code with your own GA implementation, as long as you can write the chromosomes to the sheet.
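The start indices for a given population and chunk size can be generated rather than worked out by hand. The helper names below are hypothetical, and the mapping to an A1-style range assumes that genomes sit in column A with index 0 on row 1; check the slave process code for the layout it actually uses.

```python
from typing import List


def start_indices(population_size: int, chunk_size: int = 100) -> List[int]:
    """One start index per worker, stepping through the population by chunk."""
    return list(range(0, population_size, chunk_size))


def a1_range(start_index: int, chunk_size: int = 100,
             column: str = "A", first_row: int = 1) -> str:
    """A1-notation range for one worker's chunk (assumed single-column layout)."""
    first = first_row + start_index
    last = first + chunk_size - 1
    return f"{column}{first}:{column}{last}"


for start in start_indices(500):
    # Each start value is the single command line argument for one worker.
    print(start, a1_range(start))
```

For the 500-genome example this yields the arguments 0, 100, 200, 300, and 400, with the second worker covering `A101:A200` under the assumed layout.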

Tested on Python 3.6 and Windows 10.
