Earlier today I finally finished my implementation of a deadlock detector for my year project (a C framework for developing distributed model checking applications). While writing the code to output resource usage statistics, I noticed that I’d forgotten all along to add optimisation flags to my makefiles.
Running the detector built with the -O2 optimisation flag yielded the following statistics:
pcronje@gecko modelchecker $ mpirun -H localhost ./deadlock -q -q
[0] Resource usage statistics:
* Store memory (bytes) : 192000
* User time (work) : 2.568160
* User time (total) : 2.652165
* System time (total) : 0.008000
* Page faults/reclaims : 0/1461
Switching over to -O3 yielded:
pcronje@gecko modelchecker $ mpirun -H localhost ./deadlock -q -q
[0] Resource usage statistics:
* Store memory (bytes) : 192000
* User time (work) : 1.520095
* User time (total) : 1.604100
* System time (total) : 0.016001
* Page faults/reclaims : 0/1463
Not bad: the user time spent on work dropped from about 2.57 to 1.52, roughly 41% less. Force of habit from running a Gentoo system makes me reach for -O2 by default when optimising.
Interestingly enough, running the deadlock detector as two processes (both on localhost) yielded the following statistics (the number in square brackets indicates the process ID of the printing process):
pcronje@gecko modelchecker $ mpirun -H localhost,localhost ./deadlock -q -q
[1] Resource usage statistics:
* Store memory (bytes) : 96000
* User time (work) : 0.872055
* User time (total) : 0.948059
* System time (total) : 0.028001
* Page faults/reclaims : 3/1628
[0] Resource usage statistics:
* Store memory (bytes) : 96000
* User time (work) : 0.880055
* User time (total) : 0.972060
* System time (total) : 0.008000
* Page faults/reclaims : 7/1623
Obviously, the memory usage is neatly halved (since that’s just how the state generator I’m using to test this sucker works), but the interesting thing is the decrease in runtime: each process spends about 0.88 of user time on work, against 1.52 for the single process. I would’ve thought that the communications overhead involved in throwing work items around between processes would’ve made the whole thing about as fast as swimming in tar. My guess is that the smaller memory footprint makes the difference when the analyser reads through the work item store, or something along those lines. It’s something I’ll look at once I start properly evaluating the project.