On PSP and code reviews

Since reading about the PSP, I have continued to collect statistics. Last week, however, I decided to collect additional statistics related to the practice of reviewing code before compilation. The goal of this practice is to catch 100% of your compile errors before compilation by reading through the code. I was unsure about the wisdom of putting this responsibility on the programmer, so I decided to gather some statistics to see whether that doubt was justified.

My first problem (as always) was deciding which statistics to collect. I chose to record the time spent and the defects found in each phase. However, I did not fix any of the problems I found during review; this was to see whether they were more easily or quickly found in the other phases. I also recorded which of the defects found during review I knew remained after each subsequent phase.

If I were to run this again I would sub-divide the times into diagnosis and fix, because the fix time would be roughly the same no matter which phase caught the problem.

I have used the word “phase” a few times, and this was my next problem. I’m not used to working in strict phases; normally I tend to work in tiny increments. This aids defect detection, since the defect is almost always in the bit you just changed. But to make the gathering of statistics practical, I had to lump together the work on each task so that I could collect statistics at the end of each phase.

My first phase was to write all of the code I thought I needed to complete the task. When I was happy with that, I moved on to a review phase, where I used my source control tool to highlight every change I had made. Next I compiled the code, and finally I tested the results. If I found a problem in any phase I recorded the details of the defect: which phase it was found in, what the problem was, and how long it took to rectify. If I found a problem during review I left the time blank for that entry and filled it in later if the defect was detected in another phase.
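A minimal sketch of the per-defect record described above might look like the following. The `Defect` type and its field names are my own illustration, not part of the original experiment:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Defect:
    task: str                            # which task the defect belongs to
    phase_found: str                     # "review", "compile", or "test"
    description: str                     # what the problem was
    fix_minutes: Optional[float] = None  # left blank for review finds,
                                         # filled in if caught in a later phase

# Example log: a defect spotted in review, later confirmed by the compiler
log = [
    Defect("task-1", "review", "missing semicolon"),
    Defect("task-1", "compile", "missing semicolon", fix_minutes=1.0),
]
```

The `Optional` fix time mirrors the rule above: review finds get no time until (and unless) another phase catches them.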

I did not try to apply this process recursively to the fixes. This means that fix times might include several cycles without a review phase, or any statistics gathering for that fix.

I used the defects I found to improve future reviews. This had a definite effect on reducing compilation defects as the week went on, because I tended to make the same mistakes each time. But it had no effect on defects found during testing, since there was no obvious pattern to those.

I gathered review statistics over the week on 22 tasks. The average times for each phase were: 5 minutes for review, 2 minutes for compile and fix, and 14 minutes for test and fix. I found 27 defects during review; 16 of those went undetected by the compiler, and 10 of those 16 did not show up during testing either.
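For clarity, the defect counts above break down as follows (a sketch using only the figures from the text):

```python
review_defects = 27        # found during the review phase
missed_by_compiler = 16    # review defects the compiler did not catch
never_surfaced = 10        # of those 16, never showed up during testing

caught_by_compiler = review_defects - missed_by_compiler   # 11
surfaced_in_testing = missed_by_compiler - never_surfaced  # 6

print(caught_by_compiler, surfaced_in_testing)  # prints: 11 6
```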

Further examination revealed most of those to be overzealous defect reports, in that they did not cause actual problems. An example of such a defect is the failure to directly include a header file that was indirectly included anyway. One borderline case, though, was an unused return value that could be wrong under certain circumstances. Since the value is unused it does not yet produce a defect at run-time, but the potential is there for the future.

This leaves a total of 6 defects that made it through compilation and showed up as bugs during testing. The total time taken to find and fix those was 19 minutes. One case was hard to time fairly, because I already knew the cause of the crash from the review step. I contemplated getting a colleague to diagnose the problem to get a more accurate time, but I had asked him for help on the same type of crash earlier and he had spotted it right away. It does, however, highlight a problem with this experiment that I would address if I were to run it again.

The 11 review defects caught by the compiler were found and fixed in 11 minutes. The total potential benefit of the review step was therefore approximately 30 minutes (a slight overestimate, since the fix time is included in these figures). The cost of the review step was 110 minutes: 5 minutes on average across 22 tasks. In conclusion, despite a couple of minor flaws in the experiment itself, I am confident that reviewing code before compilation is not cost effective.
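The cost/benefit arithmetic works out like this (again, a sketch using the figures reported above):

```python
tasks = 22
review_minutes_per_task = 5   # average review time per task

compile_fix_minutes = 11      # finding/fixing the 11 compiler-caught defects
test_fix_minutes = 19         # finding/fixing the 6 defects found in testing

# Benefit is an overestimate: fix time would be paid in any phase
benefit = compile_fix_minutes + test_fix_minutes  # 30 minutes
cost = tasks * review_minutes_per_task            # 110 minutes

print(benefit, cost)  # prints: 30 110
```

On these numbers the review step cost nearly four times what it could possibly have saved, which is the basis for the conclusion above.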

