Bubble Sort
The slowest sort most programmers learn first — and exactly because of that, the clearest possible introduction to what “comparing and swapping” even means.
- The core idea: repeated neighbor swaps
- Why it’s called “bubble” sort
- Code in Python, Java, and C
- The early-exit optimization that actually matters
- Complexity: best, average, and worst case
- Why it’s almost never the right production choice
- Where it’s still genuinely useful
The core idea: repeated neighbor swaps
Bubble sort works on a single, almost embarrassingly simple rule: walk through the list comparing each pair of neighboring elements, and swap them if they’re in the wrong order. Do one full pass like that from the start of the list to the end, then do another pass, and another, repeating until a complete pass goes by with no swaps at all — at which point the list is guaranteed to be fully sorted.
Each full pass guarantees that at least the largest remaining unsorted element gets pushed all the way to its correct position at the end of the list. It’s not a clever algorithm in any technical sense — it doesn’t divide the problem, it doesn’t use auxiliary data structures, and it doesn’t make any attempt to skip unnecessary work beyond the optimization covered further down. What it has going for it is total transparency: there is no step in bubble sort that requires explanation beyond “compare two neighbors, swap if needed.”
Why it’s called “bubble” sort
The name comes directly from the visual effect of running the algorithm: each pass causes the largest remaining unsorted value to “bubble up” to its correct position at the end of the list, the same way a bubble in liquid rises toward the surface. Watching a visualization makes this obvious almost immediately — the tallest bar drifts rightward, one swap at a time, until it settles into its final spot.
The darkest bar “bubbles” to the far right over successive passes — the visual origin of the algorithm’s name.
Code in Python, Java, and C
The Python version below is about as close to plain English as code gets — two nested loops, one comparison, one conditional swap.
def bubble_sort(items): n = len(items) for pass_num in range(n - 1): for i in range(n - 1 - pass_num): if items[i] > items[i + 1]: items[i], items[i + 1] = items[i + 1], items[i] return items
Notice the inner loop’s range shrinks by one each pass — n - 1 - pass_num instead of a fixed n - 1 every time. This is a free optimization: since each pass guarantees the next-largest element has bubbled into its correct position at the end, there’s no reason to keep re-checking that already-settled region.
public static void bubbleSort(int[] items) { int n = items.length; for (int pass = 0; pass < n - 1; pass++) { for (int i = 0; i < n - 1 - pass; i++) { if (items[i] > items[i + 1]) { int temp = items[i]; items[i] = items[i + 1]; items[i + 1] = temp; } } } }
And the C version, structurally identical, makes the in-place, no-extra-memory nature of the algorithm especially visible — only a single temporary variable is ever needed, regardless of how large the input array is.
void bubbleSort(int arr[], int n) { for (int pass = 0; pass < n - 1; pass++) { for (int i = 0; i < n - 1 - pass; i++) { if (arr[i] > arr[i + 1]) { int temp = arr[i]; arr[i] = arr[i + 1]; arr[i + 1] = temp; } } } }
The early-exit optimization that actually matters
The three versions above always run a fixed number of passes, even on data that became fully sorted partway through. A single boolean flag fixes that: track whether any swap happened during a pass, and if a full pass completes with zero swaps, the list is already sorted and the algorithm can stop immediately.
def bubble_sort_optimized(items): n = len(items) for pass_num in range(n - 1): swapped = False for i in range(n - 1 - pass_num): if items[i] > items[i + 1]: items[i], items[i + 1] = items[i + 1], items[i] swapped = True if not swapped: break # nothing moved this pass — already sorted return items
This single change is the difference between bubble sort taking quadratic time on every input, and bubble sort taking linear time on data that’s already sorted or only slightly out of order — a single pass detects that nothing needs to move, and the function returns immediately.
Complexity: best, average, and worst case
| Case | Time complexity | Condition |
|---|---|---|
| Best case | O(n) | Already sorted, with the early-exit flag in place |
| Average case | O(n²) | Random ordering |
| Worst case | O(n²) | Reverse-sorted input |
The quadratic behavior comes from a triangle-number structure: the total number of comparisons across all passes is roughly (n-1) + (n-2) + ... + 1, which sums to a value proportional to n². Every simple comparison-swap sort that walks the list repeatedly shares this same underlying shape, which is exactly why none of them scale well past a few thousand elements.
Why it’s almost never the right production choice
Quadratic time sounds abstract until you put numbers next to it. Sorting a thousand elements with bubble sort takes on the order of a million operations; sorting a million elements takes on the order of a trillion. Quicksort or merge sort would handle that same million-element list in roughly twenty million operations — a difference of nearly five orders of magnitude.
This is precisely why no mainstream standard library uses bubble sort as its default general-purpose sort, and why it almost never appears in performance-sensitive code outside of teaching contexts and toy examples.
Where it’s still genuinely useful
- Teaching. No other sorting algorithm makes “comparison” and “swap” this concrete this quickly — it’s an excellent first algorithm precisely because there’s nothing hidden in it.
- Tiny, nearly-sorted datasets. With the early-exit optimization, bubble sort on data that’s already mostly in order can actually compete with more complex algorithms whose setup overhead isn’t worth paying for a handful of elements.
- Detecting “is this already sorted?” cheaply. A single pass of bubble sort with no swaps is itself a valid way to confirm a list is sorted — the early-exit check does exactly that work as a side effect.
- Building intuition for algorithm analysis. Because its complexity is so easy to derive by hand, bubble sort is a common first example when teaching big-O notation itself, before moving on to algorithms whose complexity is harder to see directly from the code.
