Automatic faulty commit hunting with git bisect

Intro
Creating a small Flask app
Suddenly, a wild bug appears
Using git bisect to manually pinpoint the faulty commit
Automating faulty commit hunting

Intro

In simple terms, git bisect is a binary search tool incorporated into git which assists in finding a commit that introduced a bug. The process is pretty straightforward:

1. Start the bisect process.
2. Mark a bad commit (contains the bug).
3. Mark a good commit (does not contain the bug).
4. Proceed with the bisect process while git automatically checks out commits in the middle of the range, until the problematic commit is reached.

If we incorporate a test that runs during the bisect process, we can automate the hunt for the faulty commit.
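
To make the idea concrete, here is a minimal Python sketch of the binary search these steps describe. It is only an illustration under simplifying assumptions (a linear history, and a hypothetical is_bad callback standing in for our manual or automated check), not a description of how git implements bisect internally.

```python
def find_first_bad(commits, is_bad):
    """Return the first bad commit and the number of commits tested.

    `commits` is ordered oldest to newest, all of them come after a known
    good commit, and the last one is known to be bad.
    `is_bad` is a hypothetical callback that tests a single commit.
    """
    low, high = 0, len(commits) - 1  # the first bad commit is in commits[low..high]
    tested = 0
    while low < high:
        mid = (low + high) // 2
        tested += 1
        if is_bad(commits[mid]):
            high = mid       # the bug was introduced at mid or earlier
        else:
            low = mid + 1    # the bug was introduced after mid
    return commits[low], tested


# 15 unknown commits, with the bug introduced in commit 11
# (numbers chosen to mirror the walkthrough in this post)
commits = list(range(1, 16))
first_bad, tested = find_first_bad(commits, lambda c: c >= 11)
print(first_bad, tested)  # prints: 11 4
```

The commits git actually picks at each step may differ from the ones this sketch picks, but the halving principle is the same.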

Let’s see how it works.

Creating a small Flask app

Let’s assume we developed and deployed a small Python (Flask) app. This is only for the sake of the example; it could be any other framework or programming language.

from flask import Flask
app = Flask(__name__)

@app.route('/test', methods=['GET'])
def test():
    return 'Yay, works'

def run_flask_app():
    app.run(host='localhost', port=80)

if __name__ == "__main__":
    run_flask_app()

Suddenly, a wild bug appears

Let’s assume that after deployment, other team members continued to work on this app, committing various features. We neither know nor care what those commits contain. Suddenly, our app has a bug: it just stopped working (and for the sake of the example, assume we didn’t catch it immediately in our CI/CD pipeline). We look at the commit history and find our latest commit, where things were still working, but after it there are 15 commits (it could just as well be 150 or 1500) that we don’t recognize.

Using git bisect to manually pinpoint the faulty commit

First, we’ll start the bisect process: git bisect start.
Git responds with: status: waiting for both good and bad commits.
We’ll mark our latest commit (the current HEAD) as bad with git bisect bad.
Next, we’ll mark the last commit known to be good. Let’s assume it is the last commit we recognize, with the message My last known commit - definitely no issue here.
We’ll mark this commit hash as good: git bisect good 44180754.

We’ll receive the following message:

Bisecting: 7 revisions left to test after this (roughly 3 steps)
[f5a85016286327434ca1b14a760e39d638f2f895] commit 7 - simulating work on the app

We’ll check whether the app is working - the answer is yes, so we’ll mark this commit as good: git bisect good.
Now we’ll be prompted to give feedback on the next commit:

Bisecting: 3 revisions left to test after this (roughly 2 steps)
[b4a79c14292cc2e623c4f7843aee026679145b80] commit 11 - simulating work on the app

In this commit we already see an error, so we mark it as bad: git bisect bad.
Git then checks out commit 9. We don’t observe the faulty behavior there, so we mark it as good, and get:

Bisecting: 0 revisions left to test after this (roughly 0 steps)
[6cbccbca6a1ce5674cf5eaa1fd6cea8c7fb5a868] commit 10 - simulating work on the app

In commit 10 we also don’t observe any issue, so we mark it as good and get the following:

b4a79c14292cc2e623c4f7843aee026679145b80 is the first bad commit
commit b4a79c14292cc2e623c4f7843aee026679145b80
Author: #####
Date:   #####

    commit 11 - simulating work on the app

 src/app.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Inspecting the contents of this commit reveals the cause: the port was changed from 80 to 801, so the app no longer responds to requests, which are still expected on port 80 as in the initial deployment.

After the initial setup, we pinpointed the faulty commit with just 4 good/bad verdicts.
In our example there were only 15 commits, but imagine if there were 1500 commits, or even more.
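
The payoff grows with the size of the history. As a rough, back-of-the-envelope sketch (assuming a linear history and one test per step; the helper name is made up for illustration), the number of commits to test grows with the base-2 logarithm of the number of suspect commits:

```python
def estimated_bisect_steps(suspect_commits):
    """Roughly ceil(log2(n)) tests are needed to isolate one commit out of n."""
    if suspect_commits < 1:
        raise ValueError("need at least one suspect commit")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without float rounding
    return (suspect_commits - 1).bit_length()


for n in (15, 150, 1500):
    print(n, "commits ->", estimated_bisect_steps(n), "steps")
```

Under these assumptions, going from 15 to 1500 commits raises the estimate from about 4 tests to about 11.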

Automating faulty commit hunting

git bisect also supports the run command, which lets us offload the check of whether a commit is faulty to a test (or any other process that reports its result through its exit code).
The git bisect process will keep automatically checking out commits as the binary search progresses, and will report the faulty commit at the end.
It treats exit code 0 as good and any exit code from 1 to 127, except 125, as bad. Exit code 125 means the commit cannot be tested and should be skipped.
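
As a hedged sketch of that exit-code contract, a Python check script could map its outcome to an exit code like this. The names classify and app_responds are hypothetical. The special code 125, which git bisect run interprets as "skip this commit", is useful when a commit cannot be tested at all (for example, when it fails to import).

```python
GOOD = 0    # git bisect run treats exit code 0 as "good"
BAD = 1     # codes 1-127 (except 125) are treated as "bad"
SKIP = 125  # 125 tells git bisect run to skip the current commit


def classify(check):
    """Map the outcome of a hypothetical `check` callable to an exit code."""
    try:
        ok = check()
    except ImportError:
        # The commit cannot even be loaded, so it is neither good nor bad
        return SKIP
    return GOOD if ok else BAD


def app_responds():
    # Placeholder for a real check, e.g. an HTTP request against the app
    return True


# In a real check script, finish with: raise SystemExit(classify(app_responds))
print(classify(app_responds))
```

A unittest-based script gets the 0 and 1 cases without extra work, because unittest.main() exits with a non-zero code when a test fails.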

Let’s write a simple test that checks whether the app is working as expected and responds to API calls:

test.py

import requests
import unittest
import threading
import time
from src.app import run_flask_app


class TestFlaskApp(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        # Start the Flask app in a separate thread
        cls.flask_thread = threading.Thread(target=run_flask_app, daemon=True)
        cls.flask_thread.start()
        # Give some time for the server to start
        time.sleep(1)

    def test_app_is_running(self):
        url = "http://localhost/test"
        try:
            response = requests.get(url)
            self.assertEqual(response.status_code, 200, "App is not running")
        except requests.ConnectionError:
            self.fail("Connection failed, the app might not be running")


if __name__ == "__main__":
    unittest.main()
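
The fixed time.sleep(1) above is the simplest way to wait for the server, but it can be too short on a slow machine, which would make a good commit look bad. One possible alternative, sketched here with a hypothetical wait_for_port helper built only on the standard library, is to poll the port until it accepts connections:

```python
import socket
import time


def wait_for_port(host, port, timeout=5.0, interval=0.1):
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

In setUpClass, time.sleep(1) could then be replaced with a call such as wait_for_port("localhost", 80). This is a suggestion, not part of the original test.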

Now, let’s add the run command to the bisect process:

git bisect start
git bisect bad
git bisect good 44180754
git bisect run python test.py

Now we’ll automatically receive the following output:

running 'python' 'test.py'
... Omitting the output for brevity
.
----------------------------------------------------------------------
Ran 1 test in 3.023s

OK
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[b4a79c14292cc2e623c4f7843aee026679145b80] commit 11 - simulating work on the app
... Omitting the output for brevity
----------------------------------------------------------------------
Ran 1 test in 5.030s

FAILED (failures=1)
Bisecting: 1 revision left to test after this (roughly 1 step)
[76b5c65fc1e7a32e992ea1d86be019d90fcf9845] commit 9 - simulating work on the app
... Omitting the output for brevity
.
----------------------------------------------------------------------
Ran 1 test in 3.024s

OK

Bisecting: 0 revisions left to test after this (roughly 0 steps)
[6cbccbca6a1ce5674cf5eaa1fd6cea8c7fb5a868] commit 10 - simulating work on the app
... Omitting the output for brevity
.
----------------------------------------------------------------------
Ran 1 test in 3.015s

OK
b4a79c14292cc2e623c4f7843aee026679145b80 is the first bad commit
commit b4a79c14292cc2e623c4f7843aee026679145b80
Author: #####
Date:   #####

    commit 11 - simulating work on the app

 src/app.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
bisect found first bad commit

As explained, git bisect leveraged the test to determine whether the currently checked out commit is faulty or not and automated the manual bisect process.
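
If the bisect is part of a larger automation, the result can also be picked up programmatically. As a small sketch (the function name is hypothetical, and it assumes the output format shown above, where the result line reads "<hash> is the first bad commit"), the hash can be extracted with a regular expression:

```python
import re

# Matches the result line printed at the end of a bisect session
FIRST_BAD_RE = re.compile(r"^([0-9a-f]{7,64}) is the first bad commit$", re.MULTILINE)


def parse_first_bad_commit(output):
    """Return the hash reported by git bisect, or None if there is none."""
    match = FIRST_BAD_RE.search(output)
    return match.group(1) if match else None


sample = """OK
b4a79c14292cc2e623c4f7843aee026679145b80 is the first bad commit
commit b4a79c14292cc2e623c4f7843aee026679145b80
"""
print(parse_first_bad_commit(sample))
```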