8/29 "Big Oh" notation and bubble sort Some general remarks about the "Big Oh" notation. 1. Consider any polynomial p(n) of degree d. It is the case that p(n) = O(n^d). e.g. 8 n^4 + 10000 n^3 + 12 n^2 + 100 n + 2000 = O(n^4) 2. Let a and b be arbitrary positive constants. Then n^a grows faster than (log_2 n)^b. e.g. n grows faster than (log_2 n)^{100}. Therefore, n + (log_2 n)^{100} = O(n). 3. From the point of view of "Big Oh" notation, the base of the log does not matter. In other words, for any positive constants a and b, log_a n = O(log_b n). This is beause of the "change of base" formula: log_b n log_a n = ------- log_b a In other words, log_a n can be expressed as a constant times log_b n (and vice versa). This basically allows us to drop the base when using the log function inside the "Big Oh" notation. So we can simply write O(log n) instead of O(log_2 n) or O(log_3 n), etc. We will get more practice with the "Big Oh" notation we analyze more algorithms and data structures throughout the semester. Sorting The next topic is sorting. This appears in Chapters 3 and 7 in the texbook. The common sorting algorithms can be partitioned into (i) slow sorting algorithms, which run in O(n^2) time and (ii) fast sorting algorithms, which run in O(n log(n)) time. Since log(n) grows so slowly, as compared with n, O(n log(n)) is hard to distinguish from O(n) for all practical purposes. The slow sorting algorithms include insertion sort, selection sort, and bubble sort. These are discussed in Chapter 3. The fast sorting algorithms include merge sort, quick sort, and heap sort. The first two of these appear in Chapter 7. These are implemented using recursive functions - we will discuss these later in the week. First, we will discuss a slow sorting algorithm (bubble sort) and analyze it. Bubble sort. The algorithms runs in (n-1) phases. In phase 1, the smallest element "bubbles" to the top and occupies its rightful position. In phase 2, the 2nd smallest element "bubbles" to its rightful position, just under the smallest element. This continues until, after n-1 phases all elements have moved to their rightful positions. In each phase, we have an index j that runs from n-1 downwards. In each step we compare the jth element and the (j+1)th element and swap them, if the jth element is larger than the (j+1)th element. Code for bubble sort: for(i = 0; i < n-1; i++) for(j = n-2; j >= i; j--) if(data[j] > data[j+1]) swap(data, j, j+1) Analysis of bubble sort: Consider the inner loop first. The loop index goes from n-2 to i. Therefore, it takes on (n-2) - i + 1 values. This means that the body of the inner loop executes n - (i + 1) times. Note that the running time of the body of the inner-loop is a constant. Therefore, the total running time of the inner loop is A (n - (i+1)) + B where A and B are constants, independent of n or i. Now let us consider the outer loop. For any particular value of i in the range 0 through n-2, one execution of the outerloop takes: A (n - (i+1)) + B + C where A (n - (i+1)) + B is the time for executing the inner loop and C is a constant that is the amount of time it takes to increment the index i and compare it against n-1. So the total running time of the outer loop is D + sum_{i = 0}^{n-2} [A (n-(i+1)) + (B+C)], where D is a constant representing the amount of time it takes to initialize the index i. This expression can be simplified as follows: = D + A * sum_{i = 0}^{n-2} (n-(i+1)) + (B+C)*(n-1) = D + A * sum_{i = 1}^{n-1} i + (B+C) * (n-1) = D + A * n(n-1)/2 + (B+C) * (n-1). = A/2 * n^2 + (B + C - A/2) * n + (D - B - C) = O(n^2).