If and Only If

Performance Cost of Using If Statements and Branch Predictor

Sara Khandaker
codeburst

--

I was practicing my algorithm and data structures on LeetCode when I ran into something intriguing about “If” statements. I had the following code:

for (let i =0; i<m; i++){
for (let j =0; j<n; j++){
if (matrix[i][j]%2 != 0) count++
}
}

I was looping over every element in a 2D array and increasing the count if the value was odd. Quite simple, and in my head, I wanted to increase the count “if” a condition was true so I used the if statement. My solution worked and Leetcode gave me the following statistics:

  • Runtime: 92 ms, beats 46.58 % of javascript submissions
  • Memory Usage: 41.8 MB, beats 15.53 % of javascript submissions

However, after some research, I came across a solution that essentially did the same thing, but without the if statement. See below:

for (let i =0; i<m; i++){
for (let j =0; j<n; j++){
count += matrix[i][j]%2
}
}

This works because matrix[i][j]%2 will be 0 if the value is even (therefore not increase the count) and 1 if odd. Voila, the same function but without the if statement. Ok, I’ll admit that was clever, and I was upset I didn't come up with it on my own. But surely this small detail wouldn't change the overall performance by much… wrong. See the Leetcode stats for this solution:

  • Runtime: 84 ms, beats 77.17 % of javascript submissions.
  • Memory Usage: 40.8 MB, beats 40.64 % of javascript submissions.

Wow! Simply removing the if condition reduces 8ms of runtime and 1MB of memory. My solution went from better than 46.58% to 77.17% in runtime and from 15.53% to 40.46% for memory usage! Those are some significant improvements. All from removing one simple if statement… why is this?

Cost of Branching

How is the additional if statement slowing us down? First, and more obviously, is the extra work of evaluating the condition in the statement. Now that's not really a lot of extra work though.

The second cost is much more interesting and it relates to branching, When using an if statement, you are essentially telling the computer that there are now multiple paths or branches that could be taken. The next step or the next set of instructions isn't set in stone.

Now modern computers don't like this. They usually have long pipelines, meaning they are loading up the next instructions while performing work from the last instructions. A lot is happening at the same time! But when we come to a fork in the road or a branch, the computer doesn't know which set of instructions to put on next.

The computer now has two choices. First, it can wait for all the previous work to finish and then load up the appropriate next instructions. This results in a stall and can be a cause of performance hits. Hmm, that doesn't sound great and computers don't like waiting around.

So, for the second choice, the computer can do branch predictions. Basically, they can take a guess on which path you might take. This way it can load up the instructions and have something ready to go instead of waiting. If the computer predicted the correct path of execution, those instructions are next in the executable and ready for when the branch happens. No stalling here.

Now what happens when the computer got it wrong? You gotta go back now and load up the correct instructions. Yikes, that's a slowdown. Modern computers try to avoid this by getting better at their guesses and predictions. This is based on which path is more likely. Therefore, if your branches are easily predicted (always taken/not taken, or follows a pattern), you will suffer fewer branch prediction penalties.

I think of it as a long freight train. It can carry a lot of cargo fast in a straight line, but it corners badly . — Guge, 2008

Wikipedia page for more info: Branch Predictor

Another Example:

See the code below, will the second version have a slower runtime?

if (cond1 && cond2){
//do something
}
//vsif (cond1) {
if (cond2) {
//do something
}
}

When there are two if statements, it will result in two branchings instead of one. So technically, yes the second version would require more work on the part of the CPU and possibly run slower. However, the compiler may optimize the code and result in the two versions running the same.

Conclusion

So it seems this little two-letter word can have an impact on your code and may come with performance costs. If it can be avoided, you may see some performance improvements like above.

However, it's important to note that in all my research everyone was quick to say that this type of optimization is usually unnecessary. It's better to focus on readability since these micro-optimizations are unlikely to have major impacts on the end products. In other words, if it ain't broke don't fix it.

References

--

--