![]() |
Myths About Algorithm Design and OptimizationYou should always worry about getting the code working, then worry about optimization later.This is a half truth which is often given as an unqualified recommendation to new programmers. It's good advice, but isn't always appropriate:
You can always just add more memory and use a faster processor.Although this may sometimes work, and can be a reasonable trade-off for a simple system for internal use, it can have extremely high hidden and downstream costs:
Using the inline directive in C++ will improve performance.Inlining a function call has two effects: (a) a call to the inline function is shorter and faster to execute, and (b) the generated code is less compact, since the content of the inlined function is repeated every time the function is called. The effects of (b) on performance are quite subtle. Whereas with (a) a frequently called function will tend to remain in the processor cache, with (b) this is less likely to be true, because as far as the processor is concerned each function call is executing different code. The overhead of a processor cache is typically higher than the overhead of a few extra procedure entry and exit instructions. Assuming many functions are inlined, secondary effects may be seen at the operating system level. The code is less compact, so process working set requirements are higher, and increased load is placed on the paging system. So whether code will actually run faster with an inline directive will depend upon the complexity of the inlined function and the number of different places in which it is called. Generally you should only inline the simplest functions those with one or two lines of code and not involving the creation or destruction of any objects. You should always use the standard libraries.The people who wrote the standard libraries (STL, ATL, MFC) are experts. They knew what they were doing, so it's generally a good idea to reuse their work. However, they didn't know what you were doing, so they were forced to use a general purpose implementation. That's OK if you are using the library in a non-critical manner where space or speed requirements are not critical. However, if the standard library becomes a bottleneck, you need to look carefully at how it works and whether the algorithms used can be improved. We once increased the performance of a client's code by two orders of magnitude by rewriting part of the MFC library. We weren't necessarily smarter than the library's authors: we simply knew an awful lot more about how the library was being used. I copied this implementation from a text book, therefore it must be right.Text book algorithms are generally written from an instructional perspective (What is Quick Sort?), rather than a practical implementation perspective. As a consequence, text book algorithms tend to ignore degenerative cases (what happens if all the values to be sorted are equal?, or what happens if the data is already sorted?), as well as error checking steps, and runtime choices between algorithms which depend upon the size of the data. Consult the sources of any well-written library, and you will find that the implementations have little resemblance to the text book. We were once asked to diagnose a performance problem which was so bad that it made the program appear to be endlessly looping. The algorithm used was a common sorting algorithm had been scrupulously copied from the programmer's course notes. What could be wrong? The answer was that the algorithm had a good average performance on random data. Unfortunately the data to which it was being applied wasn't random, and in fact frequently met the criteria for worst case performance. A simple change to use a library function (which tested for degenerative cases) proved to be all that was required to solve the problem. You can't beat O(NlogN - N) when sorting data.Actually, you can. The problem is that many people remember the above answer without actually remembering the question. What was the question? If you're really stuck, drop us a line using our contact form. Or see Knuth. The problem is NP-Complete. Therefore no implementation is going to be any good.A proof of NP-completeness, which establishes that the problem is equivalent to a number of hard problems in computer science, only establishes that the optimal solution is going to scale badly with problem size. Fortunately it says nothing about approximate solutions. Nor does it necessarily imply that a given size of problem cannot be solved. It's a warning sign about how an optimal solution will scale, nothing more. Recommended ReadingOur recommended reading section lists some interesting books on algorithms. Albion Research Ltd. is based in Ottawa, Canada. Please contact us for more information about our services. |
|
© Albion Research Ltd. 2013 | |