Introduction to Programming

X52.9239-003

Dr. Jean-Claude Franchitti

New York University

School of Continuing Education

Session 9 Class Notes

Goals and Topics:

Questions and comments from last class

Structured design, structure charts, data flow diagrams

Understand various types of file structures

Use the C language to open and close files

Study input/output statements to files

Understand database structures

Review C functions, function arguments, return types

Review from Previous Class

The setjet.c program, arrays and array indexes in C, structured data, data structures: stack, queue, tree, linked list, searching and sorting.

Questions

Structured Design

The principles of structured programming or modular design were organized into a set of techniques called structured design.

Structured design breaks the program down into manageable modules that are treated as black boxes (no concern about the internal workings).

The modules are arranged in a hierarchy, which is shown using a structure chart.

Structure Chart

The structure chart is an improvement on flowcharts (which do not work well for larger programs). The structure chart shows:

Modularization of program

Hierarchy of modules

Data communications between modules

Here are the graphical elements used to make up a structure chart:

Structure charts are arranged to show processing of inputs to outputs, with input modules on the left and output modules on the right:

Data Flow Diagram

The second diagram used in Structured Design is the Data Flow Diagram (DFD), which concentrates on the flow of data in a program, and the processes that transform the data. The notation for DFDs is shown below:

Each of the three top level processes here could be shown at a finer level of detail by a DFD for that process:

Data Dictionary

The data dictionary contains a description of all the data objects used in the program. This allows you to plan the structure, type, and names of the data.

Customer record is name
    and customer address
    and customer account

Customer address is street
    and city
    and state
    and zipcode

Customer account is credit rating
    and current balance
    and past due 30 days

Credit rating is
    0 never past due
    1 past due 1 … 3 times
    2 past due 4 … 10 times
    3 no credit extended

Current balance is dollars

Video Store Example

Introduction to Files in C

Files provide persistent, nonvolatile storage of data for computers. Files are used for secondary storage of programs and data, which is transferred to main memory storage (RAM) for direct access by the processor. An application program will interface with the computer’s operating system (OS) to manipulate files. The OS gives a uniform interface to the file management system, so you can easily use a variety of physical devices which can store files. The OS queues file requests, fills file buffers (cache) based on predictive algorithms, and handles I/O requests.

Standard File I/O Operations

FILE *fp; /* define a pointer to FILE */

fp = fopen("test.dat", "r");

fopen: function name

test.dat: file name

"r": mode to use file

The FILE data structure is declared in the header file stdio.h. It holds a collection of information about an open file. The pointer returned by fopen (fp here) identifies the file "test.dat" in all later operations on that file.

The mode "r" describes permitted interactions with the file. Some allowed modes for fopen are:

"r" read only, existing file

"w" write only, create file (an existing file is overwritten)

"a" append to end of existing file (the file is created if it does not exist)

Some of these modes will overwrite an existing file with a new file. If you had a database file, for example, which should be created if it doesn’t exist, or preserved if it does exist, use a statement like:

if ( ( fp = fopen (FILENAME, "ab+")) != NULL)

It is good programming style to always check the results of fopen as shown below.

if ((fp = fopen ("test.dat", "rt")) != NULL)

{

/* process the file */

…

}

else

{

printf("Error opening file test.dat\n");

exit(EXIT_FAILURE); /* exit the program */

}

To read a single character from an opened file, you can use fgetc:

int ch; /* note that ch is an int for EOF (-1) */

if (( fp = fopen (FILENAME, "rt")) != NULL)

{

while (( ch = fgetc(fp)) != EOF)

putchar(ch); /* process the character; here it is echoed */

fclose(fp);

}

ch: int return value of fgetc(), must be int to accept the EOF value.

fgetc: function to read a single char from the given file structure

EOF: defined constant returned by fgetc() when the end of file is reached

Example Using Standard I/O – Writing

I recommend using a #define constant for fixed file names, because this makes it easy to change the file name at a later time. This program opens a file and writes some characters to the file.

#include <stdio.h>

#define FILENAME "testdos.dat"

FILE *fp; /* define pointer to file */

int main(void)

{

/* open a file for writing, create or overwrite */

/* an existing file */

if ( ( fp = fopen (FILENAME, "w") ) != NULL)

{

fputs ("test test", fp); /* write a string */

fputc ('\n', fp); /* write a char */

fclose (fp);

}

else

printf ("Unable to open %s.\n", FILENAME);

return 0;

}

To read the file, use the following program:

#include <stdio.h>

#include <ctype.h>

#define FILENAME "testdos.dat"

int main(void)

{

int i = 0;

int ch; /* note that ch is an int for EOF (-1) */

FILE *fp; /* define a pointer to FILE */

/* open a file to read in binary mode */

if ( ( fp = fopen (FILENAME, "rb") ) != NULL)

{

while (( ch = fgetc(fp)) != EOF)

printf("byte %2d: %c %3d\n", i++, ch, ch);

/* report the EOF value that ended the loop */

printf("byte %2d: EOF %3d\n", i++, ch);

fclose(fp);

}

else

printf("Error opening file %s.\n", FILENAME);

return (0);

}

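Program output (the file was written in text mode under DOS, so the single '\n' was stored as two bytes, carriage return 13 and line feed 10):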

Byte 0: t 116

Byte 1: e 101

Byte 2: s 115

Byte 3: t 116

Byte 4: 32

Byte 5: t 116

Byte 6: e 101

Byte 7: s 115

Byte 8: t 116

Byte 9: 13

Byte 10: 10

Byte 11: EOF -1

File Example: tabout.c

This program removes the tabs from a source code file so that your indented statements will line up vertically. Printing tabs can be a problem if the tab stops are set differently on your printer and on your editor. This program fixes the situation by removing all the tab characters in a file, and replacing them with blank spaces. The number of spaces inserted for each tab character is set by the statement

#define NUM 4 /* four spaces per tab */

This defines the constant "NUM" to mean "4" in this program. If you use a different number for the tab stop in your editor, change this line appropriately. This program must be used from the DOS command line prompt.

Then you can use redirection to rid a file of those annoying tabs. For example, to remove the tabs from the file "hw5.c", and create a new file "hw5nt.c", use this command at the DOS prompt:

tabout < hw5.c > hw5nt.c

This command will take the program input using redirection from the file "hw5.c", and redirect the program output to the file "hw5nt.c". The output file "hw5nt.c" can now be printed sans tabs. It still compiles, can still be edited, and otherwise looks completely normal.

#include <stdio.h> /* standard I/O info */

#define NUM 4 /* four spaces per tab */

int main (void)

{

int ch, i;

while (( ch = getchar( )) != EOF) /* until end-of-file */

{

if (ch == 9) /* found a tab, ASCII value 9 */

{

for (i = 0; i < NUM; i++)

printf("%c", 32); /* print a blank space */

}

else /* not a tab, echo character read */

printf("%c", ch);

} /* end while */

return 0;

}

Databases

A database is an organized collection of information in a file. A Database Management System (DBMS) is a program which allows for managing the database files to allow for data entry, manipulation, generation of reports, and responses to queries. A Relational Database Management System (RDBMS) uses relational algebra (mathematical set operations) for ad hoc (highly flexible) manipulation of the database files.

A single database file is often called a table, since the information stored in the file can be visualized as a two dimensional table of rows and columns. A row of information is called a record, and individual columns of data are called fields. For example, a table may contain address information for singers:

table singers

rec	name	address	city	state	zip
1	Bolton, Michael	8980 Wilshire Blvd.	Beverly Hills	CA	90212
2	Jagger, Mick	110 W. 57th St.	New York	NY	10019
3	Sinatra, Frank	3701 Wilshire Blvd.	Beverly Hills	CA	90010
4	Simon, Paul	1619 Broadway #500	New York	NY	10023
5	Lauper, Cindy	853 7th Ave.	New York	NY	10022

An index is a sorted listing of one or more fields of a table, with each entry pointing back to its record, which allows for easier access to records in the table. If we select the name field and the zip field as keywords to generate indexes for the above table, we would create the following index files:

Keyword name field	Indexed record
Bolton, Michael	1
Jagger, Mick	2
Lauper, Cindy	5
Simon, Paul	4
Sinatra, Frank	3

Keyword zip field	Indexed record
10019	2
10022	5
10023	4
90010	3
90212	1

Using an indexed database makes it possible to find information quickly (since the index is sorted in some useful order) without having to reorganize the whole table, which might contain a very large amount of data.

The DBMS provides different ways to interact with the database:

Through GUI or forms-based interfaces. This allows for normal prearranged reporting and data display.

Through command or SQL interfaces. This allows for unstructured searches and processing.

Through an Applications Programming Interface (API). This allows for an interface with an application program written in a programming language, such as C. The program can manipulate information through the DBMS.

Relational databases are widely used in current programs. Earlier databases used a hierarchical or networked structure, where there was a rigid, fixed relationship between the records and fields in the database. In the early 1970s, an IBM researcher, E. F. Codd, applied the relational algebra of set theory (unions, intersections, etc.) to databases to develop the relational database. Relational databases allow you to set up logical links between the data.

For example, suppose we added the following table to our singer database above:

table climate

record	city	state	degreeDays	AveRainFall
1	New York	NY	345	35.9
2	Los Angeles	CA	586	17.3
3	Beverly Hills	CA	568	17.9
4	Jersey City	NJ	355	33.0

We can query the relational database with the question "what is the average rainfall at Michael Bolton’s address?". The RDBMS will process the information in both of the tables, by relating the city and state fields between the two tables, and produce the answer "17.9".

Structured Query Language

Structured Query Language (SQL) can be used to request information from a database. SQL was developed by IBM in the 1970s and was first standardized by ANSI in 1986; the 1992 revision (SQL-92) is a widely supported baseline, and further revisions have followed. SQL gives a standard format to request information from a database file in a format something like this:

select field, field, …

from tablename

where search condition

For the singers table, we could write some SQL requests like these:

select name, city, state

from singers

where city = 'Beverly Hills'

Which would produce the output:

Bolton, Michael Beverly Hills CA

Sinatra, Frank Beverly Hills CA

The search condition can be much more complex, involving relational expressions, such as:

select name, city, state

from singers

where city = 'New York' and zip < 10020

Which would produce the output:

Jagger, Mick New York NY

These SQL commands can be issued through a user interface to the DBMS. To use SQL commands within an application program, we can use embedded SQL programming. With this technique, we write the SQL commands into a C program using a special prefix. A precompiler converts the embedded SQL commands to C function calls before the program is compiled.

Another way to incorporate a database interface in your programs is to use the functions and library interface supplied by the DBMS vendor for your language. Here is a C language function call to Novell’s Btrieve database which reads a record from a database file:

while (try < 3)

{

result = Btrieve(GET_FIRST,handle, table, dataLength, key);

if (result == 0) /* got data */

break;

delay(500); /* wait half a second */

try = try + 1;

}

This loop uses a library function named Btrieve to read the first record (GET_FIRST) of a database file. It stops as soon as a read succeeds, or after it has tried three times.

Functions

The modules or subprograms used in structured design are called functions in the C language. In mathematics, a function transforms one set of values to another set of values.

In other languages, a subprogram is called a procedure or subroutine. Some languages reserve the term function for a subprogram that returns a value, and call a subprogram that does not return a value a procedure.

C only has functions, which may return a single value, or no value. A function is a small, self-contained piece of code which accomplishes a single task.

To use a function in a program, you call or invoke the function by giving its name in a statement or expression. One good place to use a function is to replace repetitive code.

The following program prints a separator between output data:

#include <stdio.h>

int main(void)

{

int a, b, c;

a = 1000;

b = 2000;

c = 3000;

printf("****************************************\n");

printf("the value of a is %d\n", a);

printf("****************************************\n");

printf("two times b is %d\n", 2*b);

printf("****************************************\n");

printf("the value of c - b is %d\n", c - b);

printf("****************************************\n");

return(0);

}

Instead of printing the border of asterisks four times, we could use a separate function called print_border.

#include <stdio.h>

void print_border(void); /* function prototype */

int main(void)

{

int a, b, c;

a = 1000;

b = 2000;

c = 3000;

print_border();

printf("the value of a is %d\n", a);

print_border();

printf("two times b is %d\n", 2*b);

print_border();

printf("the value of c - b is %d\n", c - b);

print_border();

return(0);

}

void print_border(void) /* print a border of asterisks */

{

printf("****************************************\n");

return;

}

When the executing program sees this function name, it makes an unconditional branch (jumps) to the section of code which implements the function. Upon completion of the function code, the program makes an unconditional branch (jumps) back to the statement that follows the function call.

We could modify the print_border function to use two arguments, one to control the number of characters printed, and the second to control the character printed (asterisk in the above):

#include <stdio.h>

void print_border(int, char); /* function prototype */

int main(void)

{

int a, b, c;

a = 1000;

b = 2000;

c = 3000;

print_border(30, '*');

printf("the value of a is %d\n", a);

print_border(40, '+');

printf("two times b is %d\n", 2*b);

print_border(50, '#');

printf("the value of c - b is %d\n", c - b);

print_border(40, '@');

return(0);

}

/* print a border of max copies of the character ch */

void print_border(int max, char ch)

{

int i;

for ( i = 0; i < max; i++)

printf("%c", ch);

printf("\n");

return;

}

Return Type

Functions do not have to return a value back to the calling program, but if they do, it is the value given in the return statement. The return value allows a function call to be used in any expression where a data type returned by the function can be used.

For example, a function with a return type of int could be used anywhere an int constant or variable could be used.

The function prototype gives the compiler advance notice about how the function call will work. The function prototype should appear at the top of the program file right below the #include lines. The function prototype lists the return type of the function, the name of the function, and the data types of the arguments. With this information, the compiler can validate the later function calls (type-checking).

Global and Local Variables

Variables which are defined within the braces of a function are local variables. Variables which are defined outside of all the functions are global variables. Local variables can only be used within their function, but global variables may be used by any function.

Standard Library Functions

C programs can use programmer defined functions, such as print_border above, or use Standard Library Functions which are supplied in the compiler library. These are standardized functions for common operations (input/output, etc.) which have been written, tested, compiled, and collected in library files to be linked into programs.

The main Function

The function main is a special function which the programmer writes. It is the starting point for program execution. The linker adds a startup code object file to the program; that code gets the program up and running, then calls main, so execution begins with the first statement in main. When the function main returns, it passes control back to the DOS command line, or back to the IDE which invoked the program.

For Next Class

Read chapters 10, 11, and appendices (optional)

Take home final examination handout