http://www.c-sharpcorner.com/UploadFile/mgold/GAAnagramSolver03262006194953PM/GAAnagramSolver.aspx
Is a great example.
This particular example is using genetic algorithms to solve anagrams (similar use to solving a grouping of letters in scrabble).
This example shows a lack of understanding of what their strong and weak points are (especially when they discuss using genetic algorithms to find the best possible answer based upon a scoring system based on the letters used such as in scrabble).
I am going to give here and then detail the three major reasons
1) A Brute Force Search is reasonable (say at a maximum 20 unique characters? one could quickly loop through these)
2) The search is non-deterministic
3) It has a yes or no scoring system
The search space is definately reasonable a large word may have a few thousand permutations but we are not talking about significantly large search spaces. Worse is that since we are using a genetic search our search is non-deterministic, we quite often repeat previous positions (see listing 3 from article)
Listing 3 - Output of the console showing all anagrams for the word rose:
|
Genome 0: ores --> 4
Genome 1: rose --> 4
Genome 2: rose --> 4
Genome 3: sore --> 4
Genome 4: ores --> 4
Genome 5: ores --> 4
Genome 6: sore --> 4
Genome 7: roes --> 4
Genome 8: rose --> 4
Genome 9: rose --> 4 |
“The fitness function turns out to be simple in this case because we can only accept a perfect fitness.“
The biggest problem I have with this particular usage of genetic algorithms is that their largest benefit is in fact being ignored. Genetic algorithms are wonderful at performing a task called hill climbing (that is manipulating themselves slightly to see if the manipulation does slightly better, if so keeping it). The big problem is that the scoring system either returns “correct“ or “incorrect“ there are no gradients of correctness, as such the genetic algorithms have no hill to climb and become nothing more than a random permutation search (I would actually wager the random permutation search would do better in the long run due to less likelyhood of duplication).
In order to better assess this, let us presume that it does not duplicate at all through some magical mechansim; most likely a very large in memory cache it would have to hit but we will forget about that overhead and focus on the search itself. The key to identifying its worth in comparison to a random search would have to be a proof that it is more likely to find another random word by transposing a character in a known good word. For example it would be saying that it is more likely to find a another word by transposing two characters in rose than it would be to just randomly generate a new set of characters (I have yet to see a proof that states it to be more likely, have you?)
As such we are left with a random search.
The alternative classic solution would be a brute force search which would actually work quite well here in that it is
1) Deterministic
2) much less likely to duplicate (could even be done to not duplicate at all)