The Digital Lumberjack

Hacking Code When Life Goes Digital

Why Garbage Collection?


Post date 28 Sep 2009

Lisp has jokingly been called "the most intelligent way to misuse a computer". I think that description is a great compliment because it transmits the full flavor of liberation: it has assisted a number of our most gifted fellow humans in thinking previously impossible thoughts
Edsger Dijkstra, CACM, 15:10

Garbage collection sounds as filthy as its name, and it is a feature many computer scientists dismiss as an annoyance. The most common objections are that it encourages bad programming practices, that it keeps you from learning how a program really consumes memory and, finally, that even the most complex problem can be solved with explicit memory deallocation.

Well, all of that is absolutely right. So why care about it? Why on Earth would I try to ease its adoption in such an excellent language as Ada [GarciaEtAl07]? In my opinion, the following assumptions might explain much of its unpopularity:

  • Companies care to do software right
  • Companies only hire good people
  • There is always time and money for the best solution (some would argue that the best solution is the one that best matches the allocated time and money)
  • Garbage collection encourages lazy programming

Software engineering is all about dealing with complexity through several layers of abstraction. Why not make memory handling part of that? Software always evolves into bigger and more complex creatures. Since memory handling is a recurring source of errors, fixing those errors or taking care of proper memory deallocation is a very time-consuming task. As many advocates of Lean methodology would argue, it is not something that really adds value to the final product.

The need to understand how a program uses memory is clear, but the economic forces driving any company make it even clearer that garbage collection is important. Releases can be done faster and with fewer software bugs. Arguably, companies care more about releasing something than about how it was achieved. Hence I do believe that if Ada had garbage collection, it would not be a better language, but it would attract more interest.

C and C++ are facing this dilemma. Many programmers want garbage collection supported in their favorite language, but there is no common agreement about how this should be done. Ellis and Detlefs have long argued for it [EllisD94], and yet the C++0x standard has postponed the issue to a future revision.

What is happening? I think the current solution is partly to blame. A conservative garbage collector [boehmWeb, Boehm88] can easily be deployed with uncooperative languages such as C++. It needs no changes in the compiler and, for the most part, only demands replacing the malloc() and new calls with an alternative function. This has covered the needs of many garbage collection advocates, but has not convinced the mainstream.

Some software products like Firefox have successfully used this alternative, and Zorn has proven that this type of collector really works well [berger01composing, Zorn93]. However, in my humble opinion, it is only a problem of faith. Conservative collectors make many heuristic assumptions. They explore the whole memory trying to guess what looks like a pointer. Obviously, computer scientists feel nervous about losing control of their memory to a mechanism that looks more like black magic, touching here and deallocating there with no clear boundaries. Furthermore, they are not very portable because they tend to rely on special mechanisms of the underlying operating system and/or microprocessor (paging, protected memory regions, etc.).
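To make the guessing concrete, here is a minimal sketch in C++ of the idea behind a conservative mark phase. It is not how any real collector is implemented: the Block record, the find_block and conservative_mark functions and the flat list of blocks are all hypothetical names and simplifications of mine. The scanner walks a range of raw machine words and marks every block that any word might point into, without knowing whether that word is really a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One block handed out by a hypothetical allocator (illustrative only).
struct Block {
    std::uintptr_t start;  // address of the first byte of the block
    std::size_t size;      // length of the block in bytes
    bool marked;           // set when some word appears to point into it
};

// Conservative test: a word "looks like a pointer" if its value falls
// anywhere inside a known block. The scanner cannot tell a real pointer
// from an integer that happens to hold the same bits.
inline Block* find_block(std::vector<Block>& heap, std::uintptr_t word) {
    for (Block& b : heap) {
        if (word >= b.start && word - b.start < b.size) return &b;
    }
    return nullptr;
}

// Scan a range of raw words (standing in for the stack, registers or
// static data) and mark every block that any word might reference.
// Returns the number of blocks newly marked by this scan.
inline std::size_t conservative_mark(std::vector<Block>& heap,
                                     const std::uintptr_t* words,
                                     std::size_t count) {
    assert(words != nullptr || count == 0);
    std::size_t newly_marked = 0;
    for (std::size_t i = 0; i < count; ++i) {
        Block* b = find_block(heap, words[i]);
        if (b != nullptr && !b->marked) {
            b->marked = true;
            ++newly_marked;
        }
    }
    return newly_marked;
}
```

Any block left unmarked after the scan would be treated as garbage. Note that a plain integer whose value happens to fall inside a block keeps that block alive, which is exactly the kind of guesswork that makes people nervous.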

Zorn also made some promising discoveries comparing implicit and explicit memory handling, but I think one important point is missing. There is a huge legacy of Ada/C/C++ code doing explicit memory handling. It is quite likely that a conservative collector will expand without control into the memory managed by such code, adding the cost of implicit memory deallocation to the existing explicit one.

Hence I have my doubts that conservative garbage collection is the answer we will all see in future language standards. I believe that the secret lies in implicitly deallocated containers provided by the languages themselves. By providing garbage collection to a delimited subset of a program, one gets better control of the process, along with an estimate of the increased memory and CPU usage. Such a solution should ease the minds of its detractors.

However, it will not be an easy task. The road is full of conflicts and semantic restrictions. Such containers will demand more information about memory than old languages currently provide. What a garbage collector essentially needs is the ability to extract the pointers to collectable objects from a specific type layout and from the program stack. If the languages don't ease the implementation of custom collectors, there will be little hope for any improvement in the current situation.
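As a rough illustration of what such a container could look like, here is a sketch in C++ of a delimited heap with a precise mark and sweep. Everything in it is an assumption of mine rather than an existing library or a proposed standard: GcObject, Node and GcHeap are hypothetical names, the trace() method stands in for the type layout information a language would have to expose, and roots are registered by hand instead of being found on the program stack.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Every collectable type reports its own pointer fields through trace().
// This plays the role of the type layout information mentioned above.
struct GcObject {
    bool marked = false;
    virtual ~GcObject() = default;
    virtual void trace(std::vector<GcObject*>& out) const = 0;
};

// Example collectable type: a list node with a single pointer field.
struct Node : GcObject {
    int value;
    Node* next = nullptr;
    explicit Node(int v) : value(v) {}
    void trace(std::vector<GcObject*>& out) const override {
        if (next != nullptr) out.push_back(next);
    }
};

// A delimited heap: only objects created through make() are collected.
class GcHeap {
public:
    template <typename T, typename... Args>
    T* make(Args&&... args) {
        std::unique_ptr<T> obj(new T(std::forward<Args>(args)...));
        T* raw = obj.get();
        objects_.push_back(std::move(obj));
        return raw;
    }

    // Roots are registered explicitly here; a real collector would also
    // have to discover them on the stack.
    void add_root(GcObject* obj) {
        assert(obj != nullptr);
        roots_.push_back(obj);
    }
    void remove_root(GcObject* obj) {
        roots_.erase(std::remove(roots_.begin(), roots_.end(), obj),
                     roots_.end());
    }
    std::size_t size() const { return objects_.size(); }

    // Precise mark and sweep. Returns the number of objects freed.
    std::size_t collect() {
        std::vector<GcObject*> work(roots_);
        while (!work.empty()) {  // mark everything reachable from the roots
            GcObject* obj = work.back();
            work.pop_back();
            if (obj->marked) continue;
            obj->marked = true;
            obj->trace(work);
        }
        const std::size_t before = objects_.size();
        objects_.erase(  // sweep: destroy every object left unmarked
            std::remove_if(objects_.begin(), objects_.end(),
                           [](const std::unique_ptr<GcObject>& o) {
                               return !o->marked;
                           }),
            objects_.end());
        for (auto& o : objects_) o->marked = false;  // reset for next cycle
        return before - objects_.size();
    }

private:
    std::vector<std::unique_ptr<GcObject>> objects_;
    std::vector<GcObject*> roots_;
};
```

Because every object lives inside one GcHeap, the cost of a collection is bounded by the size of that container, and the rest of the program keeps its explicit memory handling untouched. Unlike the conservative approach, nothing is guessed: an object survives only if a root or a traced field really points to it, so even unreachable cycles are freed.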

We will probably need two standard revisions of our languages before garbage collection becomes common. The first should provide the means to build precise collectors, and the second should define the semantic restrictions for using a garbage collected container.

Addressing this issue is a good way to extend the life of Ada/C/C++. Since the complexity of software will keep growing, I can think of two historical precedents:

  • Assembly has mostly been replaced by higher-level languages such as C. Despite many people defending real machine code programming, C became the standard. Now even object-oriented languages such as C++ and Java are making their way into the embedded market, pushing C away.

  • When the gap between CPU and hard-drive speed grew too large, complex file systems emerged with more efficient storage and retrieval algorithms and features nobody had thought of before.

In the same way, the gap between CPU and memory speed can grow so large that advanced memory algorithms will emerge. Advantages not considered so far might surprise us. For example, with the increasing demand for portable computing, it would be interesting if memory structures could rearrange themselves. An optimized object layout would improve cache hits and reduce the memory regions that need to be powered, saving energy.

In one academic study, the choice of memory management strategy made a program consume 40 times more energy [Diwan02energyconsumption]. Garbage collectors proved to be quite energy inefficient back then. It is curious, then, that Java is the first choice for many portable devices, although more research is trying to close the gap [CSKVIW02].

Obviously such achievements will take some time. However, our future researchers might lack the fine-grained memory control they need from our languages, like the ability to pinpoint pointers within memory. Improving languages and compilers to provide the infrastructure needed for garbage collection seems a good excuse to start. Future advanced memory algorithms will benefit from this one way or another, and maybe in that future it will look inconceivable that programmers once took care of their own memory.




If you want to read more...

07whatevery

AltedFr10

The Memory Management Reference
http://www.memorymanagement.org/

