Introduction to Computer Science

Start Lecture #1

Chapter 0: Administrivia

I start at 0 so that when we get to chapter 1, the numbering will agree with the text.

0.1: Contact Information

0.2: Course Web Page

There is a web site for the course. You can find it from my home page, which is http://cs.nyu.edu/~gottlieb

0.3: Textbook

The course text is Liang, Introduction to Java Programming (Brief Version), Eighth Edition (8e)

0.4: Computer Accounts and the Mailman Mailing List

0.5: Grades

Grades are based on the labs and the final exam, with each very important. The weighting will be approximately
(but see homeworks below).

0.6: The Upper Left Board

I use the upper left board for lab/homework assignments and announcements. I should never erase that board. Viewed as a file it is group readable (the group is those in the room), appendable by just me, and (re-)writable by no one. If you see me start to erase an announcement, let me know.

I try very hard to remember to write all announcements on the upper left board and I am normally successful. If, during class, you see that I have forgotten to record something, please let me know. HOWEVER, if I forgot and no one reminds me, the assignment has still been given.

0.7: Homeworks and Labs

I make a distinction between homeworks and labs.

Labs are

Homeworks are

0.7.1: Homework Numbering

Homeworks are numbered by the class in which they are assigned. So any homework given today is homework #1. Even if I do not give homework today, the homework assigned next class will be homework #2. Unless I explicitly state otherwise, all homework assignments can be found in the class notes. So the homework present in the notes for lecture #n is homework #n (even if I inadvertently forgot to write it to the upper left board).

0.7.2: Doing Labs on non-NYU Systems

You may solve lab assignments on any system you wish, but ...

I believe you all have accounts on i5.nyu.edu. Your username and password should be the same as on home.nyu.edu (at least that works for me).

0.7.3: Obtaining Help with the Labs

Good methods for obtaining help include

  1. Asking me during office hours (see web page for my hours).
  2. Asking the mailing list.
  3. Asking another student, but ...
    Your lab must be your own.
    That is, each student must submit a unique lab. Naturally, simply changing comments, variable names, etc. does not produce a unique lab.

0.7.4: Computer Language Used for Labs

Your labs must be written in Java.

0.8: A Grade of Incomplete

The rules for incompletes and grade changes are set by the school and not the department or individual faculty member. The rules set by CAS can be found in http://cas.nyu.edu/object/bulletin0608.ug.academicpolicies.html. They state:

The grade of I (Incomplete) is a temporary grade that indicates that the student has, for good reason, not completed all of the course work but that there is the possibility that the student will eventually pass the course when all of the requirements have been completed. A student must ask the instructor for a grade of I, present documented evidence of illness or the equivalent, and clarify the remaining course requirements with the instructor.

The incomplete grade is not awarded automatically. It is not used when there is no possibility that the student will eventually pass the course. If the course work is not completed after the statutory time for making up incompletes has elapsed, the temporary grade of I shall become an F and will be computed in the student's grade point average.

All work missed in the fall term must be made up by the end of the following spring term. All work missed in the spring term or in a summer session must be made up by the end of the following fall term. Students who are out of attendance in the semester following the one in which the course was taken have one year to complete the work. Students should contact the College Advising Center for an Extension of Incomplete Form, which must be approved by the instructor. Extensions of these time limits are rarely granted.

Once a final (i.e., non-incomplete) grade has been submitted by the instructor and recorded on the transcript, the final grade cannot be changed by turning in additional course work.

0.9: Academic Integrity Policy

This email from the assistant director describes the policy.

    Dear faculty,

    The vast majority of our students comply with the
    department's academic integrity policies; see

      www.cs.nyu.edu/web/Academic/Undergrad/academic_integrity.html
      www.cs.nyu.edu/web/Academic/Graduate/academic_integrity.html

    Unfortunately, every semester we discover incidents in
    which students copy programming assignments from those of
    other students, making minor modifications so that the
    submitted programs are extremely similar but not identical.

    To help in identifying inappropriate similarities, we
    suggest that you and your TAs consider using Moss, a
    system that automatically determines similarities between
    programs in several languages, including C, C++, and Java.
    For more information about Moss, see:

      http://theory.stanford.edu/~aiken/moss/

    Feel free to tell your students in advance that you will be
    using this software or any other system.  And please emphasize,
    preferably in class, the importance of academic integrity.

    Rosemary Amico
    Assistant Director, Computer Science
    Courant Institute of Mathematical Sciences
  

0.10: An Introduction to Programming with a Programming Prerequisite

How weird is this?

The formal prerequisite for 0101 is 0002, which teaches the Python programming language. (I had a tiny, insignificant part in the development of Python when I first arrived at NYU, 30 years ago.)

If instead of taking 0002, you have programmed in some other language (say C/C++), that is fine.

If, however, you are already a wizard Java programmer (or even a mere expert), you are taking the wrong course—you would be wasting somebody's money and, more significantly, wasting much of your time.

Chapter 1: Introduction to Computers, Programs, and Java

1.1: Introduction

This course, indeed the CS major sequence, emphasizes software, i.e., computer programs, rather than hardware, i.e., the physical components of a computer.

We teach a little hardware in 201, Computer Systems Organization, giving a high-level, non-detailed view, and present much more in 436, Computer Architecture.

In general, the NYU course sequence offers a top-down view: we first show you how to program in high-level languages such as Java and Python, later we present the assembly language that is essentially the language understood by the computer itself, and later still we describe how the electronic components in a computer are able to actually execute these programs.

Many universities follow this approach. Others provide a bottom-up sequence beginning with the components, then low-level (assembly) languages, and then high-level languages.

1.2: What is a Computer?

Computers store and process data.

Modern computers store the programs on the same media as the data.

Figure 1.1 in the book is quite dated: It shows the design of 1980s computers. Modern machines do not have a single bus over which all information must travel. Compare the diagrams in sections 1.3 and 1.3.6 of my OS class notes.

1.2.1: Central Processing Unit (CPU)

The CPU contains the electronic components that actually execute the instructions given to the computer.

First of all, the CPU decodes the instruction, i.e., determines what is to be done. For example, the instruction might specify that the contents of two CPU memory units called registers are to be added and the result placed in another register. Other instructions require the CPU to access additional components of the computer, e.g., the central memory.

In addition to determining the action needed, the CPU performs many of the operations required. For example the ALU (Arithmetic/Logic Unit) portion of the CPU contains an adder and thus performs the register add mentioned above.
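The decode-then-execute idea can be sketched in Java. This is only a toy model, not how any real CPU is built: the class name ToyCpu, the opcode value OP_ADD, and the use of an int array for the registers are all assumptions made for illustration.

```java
// A toy model of "decode, then execute" (hypothetical; not a real CPU design).
public class ToyCpu {
    // Hypothetical opcode value chosen only for this sketch.
    public static final int OP_ADD = 1;

    // Decode the opcode; for an add, the "ALU" adds two registers
    // and places the result in a third register.
    public static void execute(int[] registers, int opcode,
                               int src1, int src2, int dest) {
        if (opcode == OP_ADD) {
            registers[dest] = registers[src1] + registers[src2];
        } else {
            throw new IllegalArgumentException("unknown opcode " + opcode);
        }
    }

    public static void main(String[] args) {
        int[] registers = new int[4];   // four registers, all initially 0
        registers[0] = 5;
        registers[1] = 7;
        execute(registers, OP_ADD, 0, 1, 2);
        System.out.println(registers[2]);   // prints 12
    }
}
```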

1.2.2: Memory

Within a computer all data is stored as a sequence of bits, each of which can take on one of two values. Computers today mostly represent numbers as words, each consisting of 32 or 64 bits.

A 32-bit word can take on 2^32 (approximately 4 billion) different values.

I very much believe that you should remember just one value, 2^10 = 1024. Then you can deduce that
    2^32 = 2^2 * 2^30 = 4 * 2^(3*10) = 4 * 2^10 * 2^10 * 2^10 = 4*1024*1024*1024,
which is a little more than 4*1000*1000*1000 = 4,000,000,000.
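The arithmetic above can be checked with a few lines of Java. This is a minimal sketch; the class name PowersOfTwo and the helper powerOfTwo are names invented here, and the helper relies on the left-shift operator, which we have not yet discussed.

```java
public class PowersOfTwo {
    // Returns 2^n for 0 <= n <= 62; shifting 1 left by n bits doubles it n times.
    public static long powerOfTwo(int n) {
        return 1L << n;
    }

    public static void main(String[] args) {
        long kilo = powerOfTwo(10);              // the one value to remember
        long product = 4 * kilo * kilo * kilo;   // 4 * 1024 * 1024 * 1024
        System.out.println(kilo);                        // prints 1024
        System.out.println(product);                     // prints 4294967296
        System.out.println(product == powerOfTwo(32));   // prints true
    }
}
```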

Modern computers cannot access a single bit of memory. They can access a single word and most computers (including all those we shall consider) can access a smaller unit called a byte, which consists of 8 bits.

Since a byte is the smallest unit of memory that can be referred to directly, modern computers are called byte-addressable.

Since the bytes can be accessed in any order (not just sequentially in order byte #1, byte #2, ...), the memory is said to support random access and is called random access memory or RAM.
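As a rough analogy, a Java byte array behaves like a tiny byte-addressable memory: the index plays the role of the address and elements can be touched in any order. The sketch below also prints the sizes, in bits, of some built-in Java types. The class name and the array size of 16 are arbitrary choices for illustration.

```java
public class ByteAddressableMemory {
    public static void main(String[] args) {
        // A byte array as a stand-in for RAM; the index acts as the address.
        byte[] memory = new byte[16];
        memory[9] = 42;     // bytes can be written in any order ...
        memory[3] = 7;
        System.out.println(memory[3] + memory[9]);   // ... and read in any order; prints 49

        // Sizes of some built-in types, in bits.
        System.out.println(Byte.SIZE);      // prints 8
        System.out.println(Integer.SIZE);   // prints 32
        System.out.println(Long.SIZE);      // prints 64
    }
}
```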

1.2.3: Storage Devices

Computers can access any byte in RAM quite quickly, which is wonderful. However, there are at least three problems with RAM.

  1. Limited size.
  2. Volatile.
  3. Non-transportable.

Limited Size

Today's (personal) computers have around a gigabyte (GB) of RAM. The exact size of a gigabyte is controversial. It is either a billion (10^9) bytes or the binary equivalent (2^30 bytes). When you purchase a gigabyte of RAM you are getting the latter, but when you purchase a gigabyte of disk storage you are getting only the former.

It is clearly nonsensical that an 80GB disk cannot hold 10 copies of the data contained in an 8GB RAM. Nonetheless, it is true. In fact, the proper terminology is that the RAM contains 8GiB (abbreviating 8 gibibytes), not 8GB. However, common usage is still 8GB.
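The claim can be verified numerically. The following sketch assumes, as stated above, that the disk is measured in decimal gigabytes (10^9 bytes) and the RAM in binary ones (2^30 bytes); the class and constant names are invented for this example.

```java
public class GigabyteSizes {
    public static final long GB  = 1000L * 1000L * 1000L;   // 10^9 bytes (disk usage)
    public static final long GIB = 1024L * 1024L * 1024L;   // 2^30 bytes (RAM usage)

    public static void main(String[] args) {
        long disk = 80 * GB;             // an "80GB" disk
        long tenCopies = 10 * 8 * GIB;   // ten copies of an "8GB" RAM
        System.out.println(disk);               // prints 80000000000
        System.out.println(tenCopies);          // prints 85899345920
        System.out.println(tenCopies > disk);   // prints true: the copies do not fit
    }
}
```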

Although a gigabyte of RAM is huge by historical standards, it is still insufficient to hold all the data we want on a computer system. For example, my (lavishly equipped) laptop has 8GB, which can store one movie in standard definition (one DVD) but not one hi-def movie (one Blu-ray).

Disk Drives

Disks (i.e., so-called hard drives) provide several hundred times more bytes per dollar than does RAM. Disks are not byte addressable (i.e., you can't refer to a single byte stored on a disk). Instead the smallest addressable unit is called a sector, which is typically 512 bytes. Disks form the primary storage medium for most computer systems that are at least as big as a laptop.
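To see what sector addressing means in practice, the sketch below computes which sector holds a given byte and where the byte sits within that sector. It assumes the typical 512-byte sector mentioned above and ignores everything a real file system and disk controller add on top; the names SectorMath, sectorOf, and offsetInSector are hypothetical.

```java
public class SectorMath {
    public static final int SECTOR_SIZE = 512;   // typical sector size, in bytes

    // Number of the sector containing the given byte (sectors counted from 0).
    public static long sectorOf(long byteOffset) {
        return byteOffset / SECTOR_SIZE;
    }

    // Position of the byte within its sector.
    public static int offsetInSector(long byteOffset) {
        return (int) (byteOffset % SECTOR_SIZE);
    }

    public static void main(String[] args) {
        long byteOffset = 1300;
        // To obtain this one byte, the whole sector containing it must be read.
        System.out.println(sectorOf(byteOffset));         // prints 2
        System.out.println(offsetInSector(byteOffset));   // prints 276
    }
}
```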

Volatile

Current RAM does not maintain its contents when the power is shut off and hence is not suitable for storing permanent data. Hard drives, various types of CDs, and flash storage do maintain their contents without power.

CDs

The book's words are a little garbled. CDs come in basically three flavors: read-only, write-once, and rewritable. (In this course CDs refer to data CDs; audio CDs organize the stored data in a different manner.) DVDs and Blu-ray discs are (for us) simply higher-density CDs (in 202 you will learn that the filesystems stored on DVDs differ from those on CDs).

Flash Drives

Flash drives are physically small storage units (they are often called thumb drives due to their size and shape). Unlike disks and CDs, flash drives have no moving parts and are thus potentially much faster. Like disks they are not byte addressable; their smallest accessible unit is called a block.

Blocks can be rewritten a large number of times. However, the large number is not large enough to be ignored.

Flash drives are sometimes called solid-state disks.

Tape Drives

These are becoming less important and we will not discuss them.

Non-transportable

RAM cannot be moved easily from one machine to another. You would lose the data present (due to volatility) and if done often or carelessly, might damage the device. Some disk drives (called external disks) can be transported, but CDs and flash drives are much better in this regard.

1.2.4: Input and Output Devices

Note the CPU-centric terminology. Devices that produce output, such as mice and keyboards, are called input devices and devices that accept input such as monitors are called output devices.

How does moving a mouse cause the pointer to move?

How does a keyboard send an 'X' as opposed to an 'x'?

Screen resolution and dot pitch of a monitor are defined correctly in the book, but the statements about quality and clarity are too simplistic; the size of the monitor must be considered as well.

1.2.5: Communication Devices

We will not study these. The book is somewhat dated here. Some homes (e.g., mine) have LANs; a typical NIC now is at least 100 megabits per second, not 10 (many are now 1000 megabits per second).

1.3: Programs

I assume you have written programs (perhaps in 0002) and thus know what they are.

1.4: Operating Systems

An operating system (OS) is a software system that raises the level of abstraction provided by the hardware to a more convenient virtual machine that other software can then use. For example, when we write programs accessing disk files, we do not worry about (or even have knowledge of) how the data is actually stored on the disk. Indeed, the very concept of a file is foreign to a disk and is an abstraction provided by the OS.
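The file abstraction is visible from Java. The sketch below writes a line of text to a file and reads it back using classes from java.io; nothing in it mentions sectors or any other disk detail. The class name FileAbstraction, the method roundTrip, and the temporary file name are choices made only for this illustration.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class FileAbstraction {
    // Writes text to the file, then reads back the first line.
    // The OS (via the Java library) hides where the bytes land on the disk.
    public static String roundTrip(File file, String text) throws IOException {
        FileWriter writer = new FileWriter(file);
        try {
            writer.write(text);
        } finally {
            writer.close();
        }
        BufferedReader reader = new BufferedReader(new FileReader(file));
        try {
            return reader.readLine();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("demo", ".txt");   // scratch file for the sketch
        file.deleteOnExit();
        System.out.println(roundTrip(file, "Hello, file."));   // prints Hello, file.
    }
}
```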

The OS also acts as a resource manager permitting multiple users to share the hardware resources.

Naturally much more detail is provided in my OS class notes. A short summary is in section 1.1 of those notes.

1.4.1: Controlling and Monitoring System Activities

1.4.2: Allocating and Assigning System Resources

1.4.3: Scheduling Operations

1.5: Java, World Wide Web, and Beyond

Java is a very popular, modern, general-purpose programming language. It comes with an extensive standard library that aids in writing graphical programs, especially those, called applets, that are invoked from browsers, e.g., Firefox.

Java has extensive support for the modern software development methodology called object-oriented programming.

Java is a full-featured, and thus large, programming language. In its entirety, Java is not simple; but we will be able to avoid most of the tricky parts.

1.6: The Java Language Specification, API, JDK, and IDE

Any programming language needs a detailed, precise specification describing the syntax and semantics of the language. It is basically the rules that determine a legal Java program. We will not need this level of precision.

Changing the specification essentially changes the language. The Java spec is stable.

The Application Program Interface (API) is defined by the standard library that comes with Java. It is comparatively easy to extend the API—write another library routine—and this does occur.
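As a small taste of the API, the sketch below calls three routines from the standard library (Math.max, Math.sqrt, and the String method toUpperCase). The class name ApiDemo is invented for this example; the library routines themselves are part of Java.

```java
public class ApiDemo {
    public static void main(String[] args) {
        // Each call below is to a routine supplied by the standard library.
        System.out.println(Math.max(3, 8));          // prints 8
        System.out.println(Math.sqrt(16.0));         // prints 4.0
        System.out.println("Hello".toUpperCase());   // prints HELLO
    }
}
```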

There are several versions of Java; we use Java SE 1.6, which we will just call Java.

The programs used to compile and run Java programs are part of the Java Development Kit (JDK).

Instead of using the JDK, one can use an Integrated Development Environment (IDE). Several IDEs are available. I will use only the JDK. You may develop your labs using either the JDK or an IDE, but the final product must be a Java program that can be run with just the JDK.

1.7: A Simple Java Program

// Hello world in Java
public class Hello {
    public static void main (String[] args) {
        System.out.println("Hello, world.");
    }
}

// Hello world in the C programming language
#include <stdio.h>
int main(int argc, char *argv[]) {
    printf("Hello, world.\n");
    return 0;
}

Above we see a simple Java program that prints the sentence Hello, world. This program is contained in the file Hello.java.

For comparison, the corresponding C program is below it. I put this program in a file called hello.c, but it could have been in a file called xyxxy.c (the .c is important).

Although they may look different, these two programs are basically the same. We now discuss briefly the Java version, line by line.

  1. This line is a comment and is, in a sense, not part of the program. It is there to aid anyone reading the program. A comment begins with two consecutive slashes and ends at the end of the line.
  2. This line introduces the class named Hello. Java is case sensitive and, by convention, class names are capitalized. We will have much more to say about classes later. Now we just note that they can contain data (this simple class does not) and methods (this class has the method main). Methods in Java are akin to procedures in other programming languages.
    Hello is public, which means it can be accessed from any other class. Many simple .java files contain just a single public class. In this case the file must be named X.java, where X is the name of the class.
    The { at the end of the line marks the beginning of the body of the class.
  3. This line introduces the method main. The name main is special. When a program is run the system begins by executing the main method. This line tells us several things about main: The { at the end of the line marks the beginning of the body of the method.
  4. This line (the bulk of the body of main) invokes the method println on System.out, the standard output object found in the class System.
  5. This line ends the method main.
  6. This line ends the class Hello.
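The file-naming rule and the JDK workflow can be seen together in the sketch below. The class name Greeting is hypothetical; javac and java are the compiler and launcher programs supplied with the JDK, and the commands in the comments assume they are run in the directory containing the file.

```java
// This file must be named Greeting.java, since it holds the public class Greeting.
// Using only the JDK, from a command line:
//     javac Greeting.java     (compiles, producing Greeting.class)
//     java Greeting           (runs the main method of Greeting)
public class Greeting {
    public static void main(String[] args) {
        System.out.println("Compiled and run with just the JDK.");
    }
}
```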

Homework: 1.1, 1.3.

For the benefit of those students with the 7th edition, here are the problems.

1.1 (Displaying three messages) Write a program that displays Welcome to Java, Welcome to Computer Science, and Programming is fun.

1.3 (Displaying a pattern) Write a program that displays the following pattern:


      J     A     V     V    A     
      J    A A     V   V    A A    
  J   J   AAAAA     V V    AAAAA   
   J J   A     A     V    A     A  
  

Unless otherwise stated, homeworks are from the Programming Exercises at the end of the current chapter. They are not from the Review Questions.