Monday and Wednesday 11-12:15

Ciww 102

I start at 0 so that when we get to chapter 1, the numbering will agree with the text.

- gottlieb@nyu.edu (best method)
- http://allan.ultra.nyu.edu/~gottlieb (**two** el's in allan)
- 715 Broadway, Room 712

There is a web site for the course. You can find it from my home page, which is http://allan.ultra.nyu.edu/~gottlieb

- You can find these lecture notes on the course home page. Please let me know if you can't find them.
- I mirror my home page on the CS web site.
- I also mirror the course pages on the CS web site. **But** the official site is allan.ultra.nyu.edu. It is the one I personally manage.
- The notes will be updated as bugs are found.
- I will also produce a separate page for each lecture after the lecture is given. These individual pages might not get updated as quickly as the large page.

The course text is Goodrich and Tamassia: "Algorithm Design: Foundations, Analysis, and Internet Examples".

- Available in bookstore.
- I expect to cover nearly all of the first 7 chapters.

- You are entitled to a computer account; please get it asap.
- Sign up for the Mailman mailing list for the course. http://www.cs.nyu.edu/mailman/listinfo/v22_0310_002_fa02
- If you want to send mail just to me, use gottlieb@nyu.edu not the mailing list.
- Questions about the lectures or homeworks should go to the mailing list. You may answer questions posed on the list as well.
- I will respond to all questions; if another student has answered the question before I get to it, I will confirm if the answer given is correct.

The major components of the grade will be the midterm, the final, and homeworks. I will post (soon) the weights for each.

We will have a midterm. As the time approaches we will vote in class for the exact date. Please do not schedule any trips during days when the class meets until the midterm date is scheduled.

If you had me for 202, you know that in systems courses I also
assign labs. Basic algorithms is **not** a systems
course; there are no labs. The homeworks, which are problem sets, are
required, will be graded, and will constitute a portion of your
grade. Very few if any homeworks will require the computer.

There is a recitation session on Tuesdays from 9:30 to 10:45 in room 109. The recitation leader is Sean McLaughlin <seanmcl@cs.nyu.edu>.

Good methods for obtaining help include

- Asking me during office hours (see web page for my hours).
- Asking the mailing list.
- Asking another student, but ...

**Your homeworks must be your own**.

I use the upper left board for homework assignments and announcements. I should never erase that board. Viewed as a file it is group readable (the group is those in the room), appendable by just me, and (re-)writable by no one. If you see me start to erase an announcement, let me know.

It is university policy that a student's request for an incomplete be granted only in exceptional circumstances and only if applied for in advance. Naturally, the application must be before the final exam.

We are interested in designing good **algorithms** (a step-by-step
procedure for performing some task in a finite amount of time) and good
**data structures** (a systematic way of organizing and accessing data).

Unlike v22.102, however, we wish to determine rigorously just
**how good** our algorithms and data structures really
are and whether **significantly better** algorithms are
possible.

We will be primarily concerned with the speed (*time
complexity*) of algorithms.

- Sometimes the *space complexity* is studied.
- The time depends on the input, most often on the size of the input.
- We can run experiments.
  - Must choose *sufficiently many, representative* inputs.
  - Must use identical hardware to compare algorithms.
  - Must *implement* the algorithm.
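
As a rough illustration of the experimental approach, here is a minimal timing sketch. The workload `sum_of_products` and the helper `time_once` are hypothetical names chosen for this example, and the input sizes are arbitrary. The measured times depend on the machine and on the implementation, which is exactly the drawback listed above.

```python
import time


def sum_of_products(n, A, B):
    """Hypothetical workload: a single pass over two lists of length n."""
    total = 0
    for i in range(n):
        total = total + A[i] * B[i]
    return total


def time_once(n):
    """Time one run on one (not necessarily representative) input of size n."""
    A = list(range(n))
    B = list(range(n))
    start = time.perf_counter()
    sum_of_products(n, A, B)
    return time.perf_counter() - start


for n in (1_000, 10_000, 100_000):
    print(n, time_once(n))
```

A single run per size is rarely enough; a real experiment would repeat each measurement and vary the inputs.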


We will emphasize instead an analytic framework that is independent of input and hardware, and does not require an implementation. The disadvantage is that we can only estimate the time required.

- Often we ignore multiplicative constants and small input values.
- So we consider `f(x) = x^3 - 20x^2` equivalent to `g(x) = 10x^3 + 10x^2`.
- Huh??
- It is easy to see that for, say, `x > 100`, `f(x) < 10 g(x)` and `g(x) < 20 f(x)`, so each function is within a constant factor of the other.
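
The "within a constant factor" claim can be spot-checked numerically. The sketch below assumes `f(x) = x^3 - 20x^2` and `g(x) = 10x^3 + 10x^2`; the helper name `within_constant_factor` and the upper limit of 10,000 are choices made for this example. It uses the factor 20 for the second bound, because `10 f(x) = 10x^3 - 200x^2` is always smaller than `g(x)`, whereas `20 f(x) - g(x) = 10x^2 (x - 41)` is positive for every `x > 41`.

```python
def f(x):
    return x**3 - 20 * x**2


def g(x):
    return 10 * x**3 + 10 * x**2


def within_constant_factor(lo, hi):
    """Check f(x) < 10*g(x) and g(x) < 20*f(x) for every integer lo < x <= hi."""
    return all(
        f(x) < 10 * g(x) and g(x) < 20 * f(x)
        for x in range(lo + 1, hi + 1)
    )


print(within_constant_factor(100, 10_000))   # True
```

A finite check is not a proof, but the algebra in the lead-in covers all larger `x` as well.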

**Homework:** Unless otherwise stated homework
problems are from the last section in the current book chapter.
R-1.1 and R-1.2.

Designed for human understanding. Suppress unimportant details and describe some parts in natural language (English in this course).

The key difference from reality is the assumption of a very simple memory model: Accessing any memory element takes a constant amount of time. This ignores caching and paging for example. It also ignores the word-size of a computer (any size number can be stored in one word and accessed in one operation time).

The time required is simply a count of the **primitive
operations** executed. Primitive operations include

- Assign a value to a variable (independent of the size of the value; but the variable must be a scalar).
- Method invocation, i.e., calling a function or subroutine.
- Performing a (simple) arithmetic operation (divide is OK, logarithm is not).
- Indexing into an array (for now just one dimensional; scalar access is free).
- Following an object reference.
- Returning from a method.

Let's start with a simple algorithm (the book does a different simple algorithm, maximum).

    Algorithm innerProduct
        Input: Non-negative integer n and two integer arrays A and B of size n.
        Output: The inner product of the two arrays

    prod ← 0
    for i ← 0 to n-1 do
        prod ← prod + A[i]*B[i]
    return prod

- Line 1 is one op (assigning a value).
- Loop initializing is one op (assigning a value).
- Line 3 is five ops per iteration (mult, add, 2 array refs, assign).
- Line 3 is executed n times; total is 5n.
- Loop termination test is one op (a comparison i<n) each time.
- Loop termination is done n+1 times (n successes, one failure); total is n+1.
- Return is one op.

The total is thus `1+1+5n+(n+1)+1 = 6n+4`.
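
The tally can be reproduced with an instrumented version of the algorithm. This is a sketch in Python rather than the course pseudocode, and the function name `inner_product_counted` is made up for this example. It charges operations exactly as in the list above, so the loop increment is deliberately not charged.

```python
def inner_product_counted(n, A, B):
    """Return (inner product, primitive-operation count) using the tally above."""
    ops = 0
    prod = 0
    ops += 1                 # line 1: one assignment
    i = 0
    ops += 1                 # loop initialization: one assignment
    while True:
        ops += 1             # loop termination test i < n
        if not i < n:
            break
        prod = prod + A[i] * B[i]
        ops += 5             # line 3: two array refs, multiply, add, assign
        i += 1               # increment is not charged, matching the tally
    ops += 1                 # return
    return prod, ops


print(inner_product_counted(3, [1, 2, 3], [4, 5, 6]))   # (32, 22), and 6*3+4 = 22
```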

Let's improve it (a very little bit)

    Algorithm innerProductBetter
        Input: Non-negative integer n and two integer arrays A and B of size n.
        Output: The inner product of the two arrays

    prod ← A[0]*B[0]
    for i ← 1 to n-1 do
        prod ← prod + A[i]*B[i]
    return prod

The cost is `4+1+5(n-1)+n+1 = 6n+1`

**THIS ALGORITHM IS WRONG!!**

If n=0, we access A[0] and B[0], which do not exist. The original
version returns zero as the inner product of empty arrays, which is
arguably correct. The best fix is perhaps to change "Non-negative" to
"Positive". Let's call this algorithm innerProductBetterFixed.
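
The failure for n=0 is easy to reproduce. The sketch below transcribes innerProductBetter into Python, where reading element 0 of an empty list raises `IndexError`. The second function shows one way to enforce the "Positive" precondition; checking it at run time and raising `ValueError` is an illustrative choice made here, not something the pseudocode specifies.

```python
def inner_product_better(n, A, B):
    """Direct transcription of innerProductBetter."""
    prod = A[0] * B[0]              # reads A[0] and B[0] even when n == 0
    for i in range(1, n):
        prod = prod + A[i] * B[i]
    return prod


def inner_product_better_fixed(n, A, B):
    """Same computation, with the 'Positive integer n' precondition checked.

    The explicit check and the ValueError are assumptions of this sketch.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return inner_product_better(n, A, B)


try:
    inner_product_better(0, [], [])
except IndexError as exc:
    print("n = 0 fails:", exc)

print(inner_product_better_fixed(3, [1, 2, 3], [4, 5, 6]))   # 32
```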

What about if statements?

    Algorithm countPositives
        Input: Non-negative integer n and an integer array A of size n.
        Output: The number of positive elements in A

    pos ← 0
    for i ← 0 to n-1 do
        if A[i] > 0 then
            pos ← pos + 1
    return pos

- Line 1 is one op.
- Loop initialization is one op.
- Loop termination test is n+1 ops.
- The if test is performed n times; each is 2 ops.
- Return is one op.
- The update of pos is 2 ops but is done ??? times.
- What do we do?

Let `U` be the number of updates done.

- The total number of steps is `1+1+(n+1)+2n+1+2U = 4+3n+2U`.
- The **best case** occurs when `U=0` (i.e., no numbers are positive) and gives an answer of `4+3n`.
- The **worst case** occurs when `U=n` (i.e., all numbers are positive) and gives an answer of `4+5n`.
- To determine the **average case** result is much harder as it requires knowing the input distribution (i.e., are positive numbers likely) and requires probability theory.
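
The best and worst cases can be observed with an instrumented version of countPositives. This is a Python sketch with a made-up name, `count_positives_counted`, and it charges operations as in the list above: 2 ops per if test, 2 ops per update of pos, and no charge for the loop increment.

```python
def count_positives_counted(n, A):
    """Return (number of positives, primitive-operation count)."""
    ops = 0
    pos = 0
    ops += 1                  # line 1: one assignment
    ops += 1                  # loop initialization
    for i in range(n):
        ops += 1              # successful loop termination test
        ops += 2              # if test: array ref and comparison
        if A[i] > 0:
            pos = pos + 1
            ops += 2          # update of pos: add and assign
    ops += 1                  # final, failing loop termination test
    ops += 1                  # return
    return pos, ops


print(count_positives_counted(4, [-1, -2, -3, -4]))   # (0, 16): best case, 4+3n
print(count_positives_counted(4, [1, 2, 3, 4]))       # (4, 24): worst case, 4+5n
```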

Consider a recursive version of innerProduct. If the arrays are of size 1, the answer is clearly A[0]B[0]. If n>1, we recursively get the inner product of the first n-1 terms and then add in the last term.

    Algorithm innerProductRecursive
        Input: Positive integer n and two integer arrays A and B of size n.
        Output: The inner product of the two arrays

    if n=1 then
        return A[0]B[0]
    return innerProductRecursive(n-1,A,B) + A[n-1]B[n-1]

How many steps does the algorithm require? Let T(n) be the number of steps required.

- If n=1 we do a comparison, two fetches, a product, and a return.
- So T(1)=5.
- If n>1, we do a comparison, a subtraction, a method call, the recursive computation, two fetches, a product, a sum and a return.
- So T(n) = 1 + 1 + 1 + T(n-1) + 2 + 1 + 1 + 1 = T(n-1)+8.
- This is called a **recurrence equation**. In general these are quite difficult to solve in **closed form**, i.e. without T on the right hand side.
- For this simple recurrence, one can see that T(n)=8n-3 is the solution.
- We will learn more about recurrences later.
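
The closed form can be seen by unrolling: each of the n-1 recursive levels adds 8 steps on top of T(1)=5, so T(n) = 5 + 8(n-1) = 8n-3. The Python sketch below checks this by counting steps alongside the computation. The name `inner_product_recursive_counted` is made up for this example, and the per-case charges of 5 and 8 are taken from the list above.

```python
def inner_product_recursive_counted(n, A, B):
    """Return (inner product of the first n terms, step count T(n)); needs n >= 1."""
    if n == 1:
        # comparison, two fetches, product, return: 5 steps
        return A[0] * B[0], 5
    rest, rest_steps = inner_product_recursive_counted(n - 1, A, B)
    # comparison, subtraction, call, two fetches, product, sum, return: 8 steps
    return rest + A[n - 1] * B[n - 1], rest_steps + 8


print(inner_product_recursive_counted(4, [1, 2, 3, 4], [1, 1, 1, 1]))   # (10, 29), and 8*4-3 = 29
```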