## "Spiraling in on Speed-Ups of Genetic Algorithm Solvers for Coupled Non-Linear ODE System Parameterization and DNA Code Word Library Synthesis"

Dan Burns, Kevin May, Dr. Thomas Renz, and Virginia Ross

AFRL/IFTC

:AbstractThis paper reports comparative results and lessons learned by developing Genetic Algorithm (GA) based solvers for two test cases involving hard optimization problems that were obtained by treading the paths from MatLab to C on a PC, from C on a PC to C and MPI on a cluster in one case, and from C on a PC to VHDL on a heterogeneous cluster in another case. Actual speedup in the first case was 3,000, and expected speedup in the second case is 30,000, enabling solutions in only seconds that previously would have taken over 8 hours. We also comment on difficulties encountered related to the lack of speed-up options for double precision floating point math, the continuing need for vendor support in making an "automated" transition from C to VHDL, and a gap in the path from C and MPI on a cluster to a cluster with FPGA's that center on communicating among FPGA's.

The first test case evaluated the performance of a GA guided method that composes and refines the parameters in sets of coupled, non-linear ODE’s to match both simulated and experimental data. This involved simulating and comparing a large number of double precision ODE data sets, which motivates attempts at extreme speed-up. We observed a 100X speed-up by moving from MatLab to C on a PC; linear (~ 30x) speed-ups moving from C on a PC to C/MPI on a cluster when using a distributed, Island Model Genetic Algorithm. Hardware speed-up for this problem has not been attempted, mainly due to perceptions that double precision floating point problems are not appropriate for hardware acceleration.

The second test case involved benchmarking the performance of a GA guided solution for the problem of designing highly constrained sets of Code Words against other methods found in the literature. This problem involves composing large libraries of pairs of short DNA strands that hybridize perfectly within each pair, but poorly with a strand taken from any other pair in the library. Such libraries are useful for applications in bio-molecular computing, microarrays, and for DNA assisted nano-device self-assembly schemes. The problem involves extensive constraint checking of strands in new candidate code word pairs with strands in existing members of the growing library of code words. The primary constraint metric is the string edit distance, as measured by the Levenshtein matrix, and it consumes about 98% of the application's compute time. Approximately linear speed-ups were observed by moving this matrix calculation from C on a PC to C and MPI on a cluster, again using a distributed, Island Model GA.

Present work involves accelerating the Levenshtein matrix calculation of the second test case problem by moving it to a hardware co-processor on an FPGA. A ripple flow through version of this matrix calculation shows an expected 100X speed-up (100ns vs 10us). A systolic array version has also been designed that processes constraint checks on multiple word pairs concurrently, and shows a pipeline latency of about 160ns, after which answers are generated every 10ns. Although this amounts to an expected 1000X speed-up, we would like to move this application to a heterogeneous cluster with FPGA's at each node to obtain further speedup. This step is still problematic due to the lack of vendor support for implementing MPI like communication between FPGA’s in the cluster nodes that is required by the Island Model GA.