On the Quality of Research Publications
February 28, 2018
I spent last weekend reviewing a paper for the journal Expert Systems with Applications. The paper presented a variant of Spider Monkey Optimization, an algorithm in the same spirit as differential evolution or particle swarm optimization. Yes, it could be added to the list of esoteric optimizers at the end of this Quants R Us post.
While the manuscript was relatively interesting in itself, and there was clearly a non-trivial amount of work behind it, it was riddled with errors: in the equations, in the algorithms, in the text, in the examples. Some were easy to spot, others not so much. One of the examples is the “Design of Coil Springs”, originally presented in J. Arora's book Introduction to Optimum Design (a nice book, which can also have applications in finance: optimum design and optimum collateral allocation are not very different). In various papers ([Mezura-Montes & Coello](https://www.google.fr/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&cad=rja&uact=8&ved=0ahUKEwiy7u_sqc3ZAhUBPBQKHUHBAogQFgg2MAE&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.324.3199%26rep%3Drep1%26type%3Dpdf&usg=AOvVaw2zY-0YnxhRwvHtQUcxAfME), Cagnina, Liu et al.) the problem is referred to as “Minimization of the Weight of a Tension/Compression Spring”. I did not know it was some sort of standard example, used in many papers on optimization. The equations are relatively simple. The problem is to minimize

$$ f(x_1,x_2,x_3) = (x_3+2)x_2 x_1^2 $$

subject to

$$ g_1(x_1,x_2,x_3) = 1 - \frac{x_2^3 x_3}{71785 x_1^4} \leq 0 $$

$$ g_2(x_1,x_2,x_3) = \frac{4 x_2^2 - x_1 x_2}{12566 (x_2 x_1^3 - x_1^4)}+\frac{1}{5108 x_1^2} -1 \leq 0 $$

$$ g_3(x_1,x_2,x_3) = 1 - \frac{140.45 x_1}{x_2^2 x_3} \leq 0 $$

$$ g_4(x_1,x_2,x_3) = \frac{x_1+x_2}{1.5}-1 \leq 0$$

with \( 0.05 \leq x_1 \leq 2, 0.25 \leq x_2 \leq 1.3, 2 \leq x_3 \leq 15 \).
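For readers who want to experiment, the problem above can be transcribed almost literally into a few lines of Python. This is only a sketch: the names `weight`, `constraints`, `BOUNDS` and `is_feasible` are mine, not taken from any of the papers, and the test point at the end is arbitrary.

```python
def weight(x1, x2, x3):
    """Objective f(x1, x2, x3) of the spring problem."""
    return (x3 + 2.0) * x2 * x1**2


def constraints(x1, x2, x3):
    """Return (g1, g2, g3, g4); a point is feasible when all are <= 0."""
    g1 = 1.0 - (x2**3 * x3) / (71785.0 * x1**4)
    g2 = ((4.0 * x2**2 - x1 * x2) / (12566.0 * (x2 * x1**3 - x1**4))
          + 1.0 / (5108.0 * x1**2) - 1.0)
    g3 = 1.0 - 140.45 * x1 / (x2**2 * x3)
    g4 = (x1 + x2) / 1.5 - 1.0
    return g1, g2, g3, g4


# Box bounds on (x1, x2, x3) as stated in the problem.
BOUNDS = ((0.05, 2.0), (0.25, 1.3), (2.0, 15.0))


def is_feasible(x, tol=0.0):
    """True when x lies inside the box and every g_i is <= tol."""
    in_box = all(lo <= xi <= hi for xi, (lo, hi) in zip(x, BOUNDS))
    return in_box and all(g <= tol for g in constraints(*x))


# Arbitrary point inside the box, chosen only to exercise the functions.
x = (0.1, 0.5, 10.0)
print("f =", weight(*x))
print("g =", constraints(*x))
print("feasible:", is_feasible(x))
```

Note that \(g_2\) is undefined when \(x_1 = x_2\), so a robust implementation would have to guard against that case.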
One of the errors was the use of the constant 7178 instead of 71785 in \( g_1 \). Another is that the factor 12566 in front of \(x_1^4\) in \(g_2\) is missing. More worryingly, the same errors are present in Cagnina’s paper. It is then not clear which problem those papers actually solve: did they implement a problem with different constraints from the standard one used in other papers, or not? If they did, their solutions obviously cannot be compared.
The manuscript quotes the solutions of at least 10 other papers (each using a different optimizer) to this problem, and makes it look as if its own solution is the best (the better solutions from Cagnina or Liu were removed from the table). Worse, a very simple differential evolution optimizer I coded led to the optimal solution, which is very close to Cagnina’s solution. It cannot be a coincidence: the correct problem must have been used in the code and then transcribed incorrectly on paper.
f=1.26652328e-02 at (0.051689, 0.356718, 11.288961).
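To give an idea of how little code such an optimizer needs, here is a minimal sketch of a differential evolution (rand/1/bin) on the standard problem. It is not the code I used for the review: the constraint handling (Deb's feasibility rules), the settings (`pop_size`, `generations`, `F`, `CR`, `seed`) and all function names are assumptions made for this illustration, and the result of a stochastic run will vary with them.

```python
import random

BOUNDS = ((0.05, 2.0), (0.25, 1.3), (2.0, 15.0))


def weight(x):
    x1, x2, x3 = x
    return (x3 + 2.0) * x2 * x1**2


def violation(x):
    """Sum of the positive parts of g1..g4; 0.0 means feasible."""
    x1, x2, x3 = x
    if x1 == x2:  # g2 is undefined there; treat as infeasible
        return float("inf")
    g = (
        1.0 - (x2**3 * x3) / (71785.0 * x1**4),
        (4.0 * x2**2 - x1 * x2) / (12566.0 * (x2 * x1**3 - x1**4))
        + 1.0 / (5108.0 * x1**2) - 1.0,
        1.0 - 140.45 * x1 / (x2**2 * x3),
        (x1 + x2) / 1.5 - 1.0,
    )
    return sum(max(0.0, gi) for gi in g)


def better(a, b):
    """Deb's rules: feasible beats infeasible, else lower violation or weight."""
    va, vb = violation(a), violation(b)
    if va == 0.0 and vb == 0.0:
        return weight(a) <= weight(b)
    return va <= vb


def differential_evolution(bounds, pop_size=40, generations=1000,
                           F=0.7, CR=0.9, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct members, all different from the target i.
            a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
            j_rand = rng.randrange(dim)  # at least one mutated coordinate
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if j == j_rand or rng.random() < CR:
                    v = a[j] + F * (b[j] - c[j])
                    v = min(max(v, lo), hi)  # clip to the box
                else:
                    v = pop[i][j]
                trial.append(v)
            if better(trial, pop[i]):
                pop[i] = trial
    best = pop[0]
    for p in pop[1:]:
        if better(p, best):
            best = p
    return best


best = differential_evolution(BOUNDS)
print("f =", weight(best), "at", best, "violation =", violation(best))
```

The run uses 40 x 1000 = 40,000 trial evaluations. Stating this number is what makes results comparable between optimizers.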
If I use the problem as stated in their manuscript or in Cagnina's paper, the solution I find is much better, that is, the minimum is much lower. This suggests that whoever wrote the problem in the paper copied all the mistakes from Cagnina’s paper, while whoever actually coded the algorithm did not, which is a bit strange. Also, while it is good that the authors find the correct minimum, they don’t specify the number of iterations used, so it is very difficult to assess the efficiency of the proposed algorithm.
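A quick way to see why the misprinted problem admits a lower minimum is to evaluate both versions of \(g_1\) and \(g_2\) at the solution quoted above. In the sketch below, the function names are mine, and the misprinted \(g_2\) is my reading of the error (the factor 12566 multiplying only \(x_2 x_1^3\) in the denominator), so treat that form as an assumption.

```python
def g1(x1, x2, x3, k=71785.0):
    """g1 with the constant k; the misprint uses k = 7178."""
    return 1.0 - (x2**3 * x3) / (k * x1**4)


def g2_standard(x1, x2):
    return ((4.0 * x2**2 - x1 * x2) / (12566.0 * (x2 * x1**3 - x1**4))
            + 1.0 / (5108.0 * x1**2) - 1.0)


def g2_misprinted(x1, x2):
    # Assumed reading of the misprint: 12566 multiplies only x2 * x1^3.
    return ((4.0 * x2**2 - x1 * x2) / (12566.0 * x2 * x1**3 - x1**4)
            + 1.0 / (5108.0 * x1**2) - 1.0)


# Solution quoted in the post for the standard problem.
x1, x2, x3 = 0.051689, 0.356718, 11.288961

print("g1 standard  :", g1(x1, x2, x3))
print("g1 misprinted:", g1(x1, x2, x3, k=7178.0))
print("g2 standard  :", g2_standard(x1, x2))
print("g2 misprinted:", g2_misprinted(x1, x2))
```

In the standard problem both constraints are essentially active (close to zero) at this point, while in the misprinted version both are comfortably negative. The optimizer then has room to move to a lighter spring, which is consistent with the much lower minimum I observed.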
On another very similar problem from the same book, the “Pressure Vessel Problem”, Cagnina is the only one to provide the correct bound \(0.0625 \leq x_1\), but misspecifies \(g_3\) compared to the other papers (an additional square on \(x_4\)). Other papers provide bounds such as \(1 \leq x_1 \), while their optimal solution lies below 1. The confusion in the bounds stems from the unit (inches), which many papers do not handle correctly. Still, it does not seem to shock anybody that their optimal solution lies outside the quoted solution space.
I thought quantitative finance was not always strict on quality, but it looks much stricter than the various journals on optimization.
I have to offer a bit of a mea culpa here: there are also typos in my book. I recently added an erratum page on the website, thanks to Liam Henry.