Particle Swarm Optimization
Some of my findings: a lot of publications. 45000 citations for the most popular one! Which are the ones that matter? Many different standards, but not a single one clearly expressed. Some surprisingly ugly source code.
June 30, 2017
In my previous post, I looked at simulated annealing to calibrate Heston. A less well-known and fancier global minimizer is particle swarm optimization (PSO). I stumbled upon it by accident through a YouTube presentation by James McCaffrey, where he shows a small Python algorithm that solves the travelling salesman problem.
Intrigued, I started to read papers on it. It turns out there are a lot of publications on PSO, not always of great quality. Fortunately, Google Scholar is my friend and pointed me to the most popular papers. Besides the introductory paper by Kennedy and Eberhart (cited more than 45000 times!), which describes the link with the social behavior of a bird flock, the most practical one I could find is “Defining a Standard for Particle Swarm Optimization” by Bratton and Kennedy.
Defining a simple standard is very much needed, as there are many variations around the initial PSO, and it is not clear which one a newcomer to the field should pay attention to. Unfortunately, in the PSO field, authors seem to like defining their own standard, but not precisely enough, which leaves room for many other new standards. For example, there are three other standards from Clerc, not always clear on the appropriate choice of topology, the initial velocity, the behavior when a particle goes beyond the boundaries, the stopping criteria, or the choice of a maximum velocity. In Bratton & Kennedy’s standard, the initial velocity is left undefined and almost nothing is said about stopping the algorithm. Even the simple lbest topology (a ring) is not precisely defined: a particle is informed by its two neighbors, but is it informed by itself as well? Clerc seems to believe it has to be, but this is never mentioned in Bratton & Kennedy’s paper.
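To make the ambiguity concrete, here is a minimal sketch of the two possible ring-neighborhood conventions; the function names and the include_self flag are my own, not part of any standard:

```python
def ring_neighbors(i, swarm_size, include_self=True):
    # lbest ring topology: particle i is informed by its left and right
    # neighbors, and possibly by itself, depending on the convention.
    left = (i - 1) % swarm_size
    right = (i + 1) % swarm_size
    return [left, i, right] if include_self else [left, right]

def local_best(i, pbest_positions, pbest_values, include_self=True):
    # Best personal-best position among the informers of particle i (minimization).
    informers = ring_neighbors(i, len(pbest_positions), include_self)
    return pbest_positions[min(informers, key=lambda j: pbest_values[j])]
```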
The book “Swarm Intelligence” by Eberhart, Shi & Kennedy helps a lot in clarifying some of those issues, as does, to some extent, the book “Particle Swarm Optimization” by Clerc, but neither gives a standard.
Then I started to search for some code, hoping to find some sort of reference implementation of PSO. While there is source code available for some of the “standard” PSOs, for example the “standard” 2006 PSO, the “standard” 2007 PSO or the “standard” 2011 PSO, they look quite dirty to me and don’t inspire much confidence at first sight. The 2006 version is the cleanest, but it does not follow the standard PSO 2007 paper: the topology is not lbest, and it does not explicitly use the constricted formulation…
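For reference, the constricted formulation boils down to the velocity update below, with the constriction coefficient of Clerc & Kennedy and c1 = c2 = 2.05 as in Bratton & Kennedy’s standard. This is a one-dimensional sketch with my own variable names, not code from any of those implementations:

```python
import math
import random

C1 = C2 = 2.05
PHI = C1 + C2  # 4.1
# Clerc & Kennedy constriction coefficient, approximately 0.72984.
CHI = 2.0 / abs(2.0 - PHI - math.sqrt(PHI * PHI - 4.0 * PHI))

def constricted_velocity(v, x, pbest, lbest):
    # New velocity of one coordinate: attraction towards the particle's own
    # best position and towards the best position among its informers.
    r1, r2 = random.random(), random.random()
    return CHI * (v + C1 * r1 * (pbest - x) + C2 * r2 * (lbest - x))
```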
The two cleanest pieces of code I found were on GitHub: a simple Python implementation with support for constraints, and a C implementation offering a choice between different topologies and using position clamping.
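Position clamping, as I understand it, amounts to something like the sketch below; whether the corresponding velocity component is zeroed, reversed or left alone varies between implementations, and the names here are mine:

```python
def clamp_position(x, v, lower, upper):
    # Put a particle that left [lower, upper] back on the boundary and
    # zero its velocity along that dimension (one common variant).
    if x < lower:
        return lower, 0.0
    if x > upper:
        return upper, 0.0
    return x, v
```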
This is in stark contrast with differential evolution, where the paper by Price and Storn contains decent, precise C code, and their book gives an in-depth analysis of the algorithm, with explicit, modular variations.
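For comparison, the classic DE/rand/1/bin generation described by Price and Storn fits in a few lines. This is a sketch with my own names: f is the differential weight and cr the crossover probability.

```python
import random

def de_rand_1_bin_generation(population, fitness, objective, f=0.5, cr=0.9):
    n, dim = len(population), len(population[0])
    next_population, next_fitness = [], []
    for i in range(n):
        # Mutation: three distinct random members, all different from i.
        r1, r2, r3 = random.sample([j for j in range(n) if j != i], 3)
        j_rand = random.randrange(dim)  # guarantees at least one mutated component
        # Binomial crossover between the target vector and the mutant vector.
        trial = [
            population[r1][j] + f * (population[r2][j] - population[r3][j])
            if (random.random() < cr or j == j_rand)
            else population[i][j]
            for j in range(dim)
        ]
        # Greedy one-to-one selection.
        trial_fitness = objective(trial)
        if trial_fitness <= fitness[i]:
            next_population.append(trial)
            next_fitness.append(trial_fitness)
        else:
            next_population.append(population[i])
            next_fitness.append(fitness[i])
    return next_population, next_fitness
```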
All of this sounds negative, but I actually managed to make PSO work decently well on the problem of calibrating Heston. This will be the subject of another blog post.
Besides becoming more familiar with PSO, I discovered an interesting fact: the normal distribution “corresponds” to a uniform distribution of points on a sphere. Concretely, a vector of independent standard normal draws, divided by its Euclidean norm, is uniformly distributed on the unit sphere, thanks to the rotational invariance of the multivariate standard normal density.
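A tiny illustration (the helper name is mine):

```python
import math
import random

def uniform_on_sphere(dim):
    # Independent standard normal draws, normalized to unit length,
    # give a point uniformly distributed on the unit sphere in dimension dim.
    g = [random.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]
```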