Overview
In response to a post on CG Talk, Optimization ideas needed, a few other developers and I us pitched in to help out with ideas to solve a puzzling problem.
The issue was that a loop to attach 7500 objects was taking up to 8 minutes to complete. Where did the problem lie? Was it in the transformation code, the attachment code, or somewhere else?
Well as luck would have it, today's Bank Holiday has been totally washed out with rain, so I had nothing better to than to do some testing and find the cause, and in the process road test my recently-developed TimeStamper struct.
Approach
After turning undo and redraw off, as well as optimizing the transformation code within the script, which cut a 1000-object test scene's processing time in half, surprisingly the 7500-object scene was still taking around 8 minutes.
I suspected that the problem may well be in the fact that as mesh gets larger, perhaps it becomes computationally more expensive to add new meshes, seeing as the entire array of points and faces needs to be evaluated every time a new mesh is added. Cumulatively, this could really add up.
I though that perhaps the right approach would be to split the attaching up into phases, using a 2D loop, and I suspected that perhaps the square-root of the total attaches might yield the optimal performance, ie for 10000 attaches I'd perform a 100 x 100 loop, creating 100 intermediate meshes, then attaching them all at the end.
Results
These are the results for various variations of 2D loops, limiting the amount of objects attached per loop. The results are impressive!
- For 1000 objects, the optimum loop combination was 31 loops of 32, a speed increase of 12 times over a single loop
- For 3000 objects, the optimum loop combination was 55 loops of 54, a speed increase of 22 times over a single loop
- For 5000 objects, the optimum loop combination was 50 loops of 100, a speed increase of 28 times over a single loop
- For 7500 objects, the optimum loop combination was 87 loops of 86, a speed increase of 30 times over a single loop
And yes, the square root approach was most successful 3 out of 4 times.
Here are the results in table format, with the optimum loop highlighted:
1D / 2D loop comparison
Compared to the 1D loop, the 2D loop was 30 times faster on 7500 attachments.
In minutes and seconds, that means a decrease from 6 minutes and 29 seconds to just 12 seconds!
In graph-form that looks like this:
Conclusion
It seems pretty safe to say that in this case, building up intermediate meshes and attaching at the end is the best way to go. It looks likely that adding to already-huge arrays carries with it an overhead that can have an extremely negative effect in a lengthy loop situation.
I've no doubt that this approach most likely extends to other mesh / array / hash based computations as well, so splitting up large cumulatively-expensive calculations into smaller chunks could well be a great time-saver in other tasks as well.
So – a shame about the rain, but perhaps a bonus in disguise!
Resources
- View the Time Stamper page
- Download TimeStamper.ms
- Download the test script that generates the 1000's of spheres that get attached, and performs the benchmarking
- Download the Excel report that was built using TimeStamper getReport() function in Excel 2003 or Excel 2007