Perl Programming

Contents:

Perl Programming

Writing Safe Code

Declare your variables with my

Enforcing the usage of my by the strict module

Using the warnings module or the -w switch

PERL-CGI

Guidelines for HTML page generation by a CGI-program

Sending the contents of a file to the user

Counters

Mailing List

A complete example

Checkboxes

Vocabulary

Perl-DBI

 

VARIOUS

# Notes from Weizmann Institute, Israel http://bioinformatics.weizmann.ac.il/courses/prog/

# February, 2002

# This is not meant to be an exhaustive Perl course; there are many proper ones out there. These are notes on what I considered interesting…

            \n         newline

            \t          tab

            \a         bell

 

** exponentiation

(e.g. 2**3 is 2 to the power of 3, which gives 8)

% modulus

(e.g. the value of 10%3 is the remainder when 10 is divided by 3, which is 1)

 

length

-------

  $result = length ("university"); #$result gets 10
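
A small sketch tying together the escape sequences, the two operators, and length shown above. The variable names are illustrative only and do not come from the course.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $power     = 2 ** 3;                 # exponentiation: 8
my $remainder = 10 % 3;                 # modulus: 1
my $chars     = length("university");   # number of characters: 10

# \t separates the two columns, \n ends each line
print "power\t$power\n";
print "remainder\t$remainder\n";
print "length\t$chars\n";
```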

 

ARRAYS

===========

foreach $i (@some_array) {

   statement_1;

   statement_2;

   statement_3;

}

 

An empty list:

 

@list = ();

 

sort

---------

  @array2  =  sort      (@array1);

  _______     ____      _________

# return      function  argument

# value
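
One detail worth illustrating: by default sort compares its arguments as strings, so numbers need an explicit comparison block. The following is a minimal sketch with made-up data, not an example from the course.

```perl
use strict;
use warnings;

my @names   = ("John", "David", "Gadi");
my @numbers = (10, 9, 100, 1);

my @sorted_names = sort @names;                   # David, Gadi, John
my @as_strings   = sort @numbers;                 # 1, 10, 100, 9  (string order)
my @as_numbers   = sort { $a <=> $b } @numbers;   # 1, 9, 10, 100  (numeric order)

print "@sorted_names\n";
print "@as_strings\n";
print "@as_numbers\n";
```

Inside the comparison block, $a and $b are special variables supplied by sort, which is a good reason to avoid those two names for ordinary variables.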

 

push

-----

$a = 5;

$b = 7;

@array = ("David", "John", "Gadi");

push (@array, $a, $b);

 

# @array is now ("David", "John", "Gadi", 5, 7)

 

shift

-----

@array = ("David", "John", "Gadi");

$k = shift (@array);

 

# @array is now ("John", "Gadi");

# $k is now "David"

 

$numberofelements = @array;            # an array assigned to a scalar gives its number of elements

$numberofelements = scalar (@array);   # the same, stated explicitly

Use the special variable $#array_name to get the index value of the last element of @array_name.

Example:

@fruits = ("apple", "orange", "banana", "melon");

$a = $#fruits;      # $a is now 3;
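
The array operations above can be combined as in the following sketch. The array name and the extra element "Ruth" are invented for illustration.

```perl
use strict;
use warnings;

my @queue = ("David", "John", "Gadi");

push(@queue, "Ruth");              # add to the end
my $first = shift(@queue);         # remove from the front: "David"

my $count      = scalar(@queue);   # 3 elements remain
my $last_index = $#queue;          # index of the last element: 2

foreach my $name (@queue) {
    print "$name\n";
}
print "served $first; $count waiting; last index $last_index\n";
```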

 

Functions

sub subroutine_name {
   my (@list) = @_;
   my ($no_of_elem, $result);
   my ($sum) = 0;      
   statement_1;
   statement_2;
   statement_3;
   ...
}

 

$var2= subroutine_name ($var1);

 

·        Use local (my) variables inside subroutines!

·        Upon subroutine invocation, a special array variable called @_ is formed, and it lasts throughout the duration of the subroutine.

·        The return function is used inside subroutines to specify the value that will be returned by the subroutine to the outside program (or to the place in the program from which the subroutine was invoked).

·        Any type of variable can be passed to a subroutine; all the arguments are flattened into the single list @_.

 

When the arguments are to be captured in a combination of array and scalar variables, special care has to be taken. In this case, only one array is allowed, and it has to be the last variable in the list to which @_ is assigned.

sub good {
   my ($first, $second, @group) = @_; # capturing arguments
   print "first:  $first\n",
         "second: $second\n",
         "group:  @group\n";
}

Some functions need not receive any arguments, and are invoked as:

function_name();
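
As a worked illustration of the points above (copying @_ into my variables, returning a value, and placing the single array last), here is a minimal sketch. The subroutine names average and describe are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the average of a list of numbers.
sub average {
    my (@list) = @_;                 # copy the arguments into a local array
    my $no_of_elem = scalar(@list);
    return 0 if $no_of_elem == 0;    # avoid dividing by zero
    my $sum = 0;
    foreach my $value (@list) {
        $sum += $value;
    }
    return $sum / $no_of_elem;
}

# Two scalars first, then one array that takes whatever is left.
sub describe {
    my ($label, $unit, @values) = @_;
    return "$label: " . average(@values) . " $unit";
}

my @lengths = (2, 4, 6);
print describe("mean length", "cm", @lengths), "\n";
```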

Writing Safe Code

Declare your variables with my

Local Variables need to be declared, but not initialised.

Note: If a list of variables is declared with my, the list must be enclosed in parentheses. If only one variable is declared with my, do not use parentheses: they put the right-hand side in list context, which can cause problems in statements like my ($line) = <FILEHANDLE>;, which we will learn later (there the whole file is read and only its first line is kept).

e.g.

 
   my (@list) = @_;
   my ($no_of_elem, $result);
   my $sum;

Enforcing the usage of my by the strict module

The strict module imposes several restrictions on your Perl program. It will send you compile-time error messages and will prevent program execution in cases where your program does not obey the restrictions. You may choose to use strict for creating safer code and to help you debug your programs.

use strict 'vars';
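
To see what strict does with an undeclared variable without aborting the whole script, the offending code can be compiled inside a string eval. This is only a demonstration sketch; the variable $total is made up.

```perl
use strict;
use warnings;

# eval on a string compiles the code at run time, so the compile-time
# error raised by strict can be caught and shown instead of ending the script.
my $ok = eval q{
    use strict 'vars';
    $total = 5;      # never declared with my
    1;
};
my $error = $@;

print "compiled: ", ($ok ? "yes" : "no"), "\n";
print "message:  $error";
```

In a normal script the same mistake simply stops the program before it starts running, with the message printed to standard error.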

Using the warnings module or the -w switch

The warnings module may help you debug your program and write safer code, by sending warnings during program compilation and execution.
However, it will not stop the program execution.

#!/usr/local/bin/perl -w
or
use warnings;

Looks more useful than strict.
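
A minimal sketch of a warning in action. To make the warning visible in the program's own output it is collected with a __WARN__ handler; normally it would just be printed to standard error. The variable names are illustrative.

```perl
use strict;
use warnings;

my @caught;
# Collect warnings in an array instead of printing them to standard error.
local $SIG{__WARN__} = sub { push @caught, $_[0] };

my $count;                 # declared but never given a value
my $next = $count + 1;     # triggers a warning, but execution continues

print "next is $next\n";
print "warning: $_" for @caught;
```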

Last but not least about my:

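my creates a local (lexically scoped) variable. local doesn't.

Because local only saves the value of a global variable and restores it when the enclosing block ends, rather than creating a new local variable, it is rarely what you want.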
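
The difference shows up when a subroutine calls another one. The following sketch uses a hypothetical package variable $level and made-up subroutine names.

```perl
use strict;
use warnings;

our $level = "global";

sub show_level { return $level }   # always reads the package variable

sub with_local {
    local $level = "local";        # temporarily replaces the global value
    return show_level();           # the callee sees "local"
}

sub with_my {
    my $level = "my";              # a new variable, private to this block
    return show_level();           # the callee still sees "global"
}

print with_local(), "\n";   # local
print with_my(), "\n";      # global
print show_level(), "\n";   # global (the old value was restored)
```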

PERL-CGI

Static HTML page

Stored as an HTML file on the Web server.

Dynamic HTML page

Created "on the fly" by a computer program at the server side.

CGI (Common Gateway Interface)

The mechanism that connects computer programs to the Web server.

It also defines a set of environment variables. Commonly, the program will generate some HTML, which will be passed back to the browser but it can also request URL redirection.

CGI allows the returned HTML (or other document type) to depend in any arbitrary way on the request. The CGI program can, for example, access information in a database and format the results as HTML. A CGI program can be any program that can accept command-line arguments. Some HTTP servers require CGI programs to reside in a special directory, often "/cgi-bin", but better servers provide ways to distinguish CGI programs so they can be kept in the same directories as the HTML files to which they are related.

Whenever the server receives a CGI execution request, it creates a new process to run the external program. If the process fails to terminate for some reason, or if requests are received faster than the server can respond to them, the server may become swamped with processes.

 

Guidelines for HTML page generation by a CGI-program

  1. In order for your program to send its standard output to the Web, it should be placed in a special directory on the Web server, usually the cgi-bin directory.
     You can sometimes get the same effect if you give your program a name ending in .cgi and place it in your public_html directory.
     Consult your system administrator or Internet provider about that.
  2. If your program output is text (e.g. in HTML format), you must include in your program the following print command before any other print commands:
     print "Content-type: text/html\n\n";
  3. Remember to use <BR> or <P> tags to mark the end of text lines.
     \n "newlines" will not be visible on Web pages.
  4. In quoted strings, remember to place a backslash before quotes, e.g.
     print "<A HREF=\"home.html\">Back to Home Page</A>";
  5. Do not forget to make the file containing your program (e.g. hello.cgi) executable by "the world" using the Unix command:
     chmod a+x hello.cgi
  6. To execute your program from a Web browser, use the following URL (assuming that your program is in your public_html directory):
     http://host_computer_name/~userid/hello.cgi
     where userid is your own userid.

From: http://www.cc.ukans.edu/~acs/docs/other/forms-intro.shtml

 

An HTTP URL may identify a file that contains a program or script rather than an HTML document. That program may be executed when a user activates the link containing the URL. Such programs are sometimes called HTTP scripts or "Common Gateway Interface" (CGI) scripts. On some HTTP servers these CGI programs are stored in a directory called cgi-bin, and so they are also sometimes called "cgi-bin scripts."

 

from http://www.cc.ukans.edu/~acs/docs/other/cgi-with-perl.shtml and http://www.yahoo.com/Computers/World_Wide_Web/Programming/Perl_Scripts/

also see: http://www.speakeasy.org/~cgires/cgi-tour.html

 

Perl may be used to return an HTML document by using a program like

 

#!/usr/local/bin/perl

print "Content-type:text/html\n\n";

print <<WEB_PAGE;

    <html>

       

    </html>

WEB_PAGE

 

Note also that any script that returns an HTML document to the Web browser must print the string Content-type:text/html\n\n to standard output before anything else.

If this script is stored in a file called script.pl within the cgi-bin directory within the username home directory on mrc-lmb.cam.ac.uk, the URL for the script would look like this:

http://www.mrc-lmb.cam.ac.uk/scripts.pl

 

IMPORTANT: Your script files should probably give access permissions only to the owner. You can assign these permissions by entering

chmod 700 script_filename

 

Perl allows you to call a script from within a Perl script. In general, you can run a script and send the output of that script to the user with a Perl print command like

print `script_filename`;

where script_filename is the name of the file that holds the script to be executed by the Perl script. This facility can be used to run file-resident shell commands as well as other Perl or shell scripts. For example, you can use this approach to include the current date in a document:

 

#!/usr/local/bin/perl

print "Content-type: text/html\n\n";

print "The current date is ";

print `/bin/date`;
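
For the date in particular, an external command is not strictly needed. As an alternative sketch (not part of the original notes), Perl's built-in localtime returns a similar string without starting another program:

```perl
use strict;
use warnings;

# In scalar context localtime gives a string such as "Thu Feb  7 10:15:00 2002".
my $now = scalar(localtime);

print "Content-type: text/html\n\n";
print "The current date is $now\n";
```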

 

Sending the contents of a file to the user

For example, suppose you have a signature file named

/home/username/public_html/signature.file

 

that contains a signature like

    <p>

    See <a href="http://falcon.cc.ukans.edu/~username/">

    my homepage</a> for more information.

 

and you want to use this signature with a page you send to the user from a script. You could do that with a sequence of print commands:

#!/usr/local/bin/perl

print "Content-type:text/html\n\n";

print <<WEB_PAGE;

    <html>

    <title>My Thank You Page</title>

    <h1>Thank you for reading this document.</h1>

WEB_PAGE

print `cat /home/username/public_html/signature.file`;

print "</html>\n";

 

Alternatively, you might first use the UNIX cat command to copy the signature file into a Perl variable and then print that variable within a Perl here document:

#!/usr/local/bin/perl

print "Content-type:text/html\n\n";

$SIG = `cat /home/username/public_html/signature.file`;

print <<WEB_PAGE;

    <html>

    <title>My Thank You Page</title>

    <h1>Thank you for reading this document.</h1>

$SIG

    </html>

WEB_PAGE
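
The same result can be had without the UNIX cat command by reading the file with Perl's own open. The sketch below is one possible way to do it; read_file and thank_you_page are hypothetical helper names, and the page text is copied from the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the whole contents of a file as one string.
sub read_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    local $/;                 # no line separator: read everything at once
    my $contents = <$fh>;
    close($fh);
    return defined $contents ? $contents : "";
}

# Hypothetical helper: wraps a signature in the thank-you page.
sub thank_you_page {
    my ($signature) = @_;
    return <<WEB_PAGE;
<html>
<title>My Thank You Page</title>
<h1>Thank you for reading this document.</h1>
$signature
</html>
WEB_PAGE
}
```

A script would then print the Content-type line followed by thank_you_page(read_file("/home/username/public_html/signature.file")). Unlike the backtick version, a missing file stops the script with an error message instead of silently producing an empty signature.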

Counters

#!/usr/local/bin/perl

print "Content-type:text/html\n\n";

#

#First, increment the counter:

#

open(COUNTER, "+< /home/smith/counter.file");

             # open the counter file with read and write access.

$COUNT= <COUNTER>;           #read the current value.

$COUNT++;                    #increment it by one.

seek(COUNTER, 0 , 0);        #rewind the file.

print COUNTER $COUNT;        #write the new value to the file.

close COUNTER;

#

#Then use the counter in the returned HTML.

#

print <<END_OF_PAGE;

<html>

<title>My first return page</title>

<h1>Thank you for selecting this document.</h1>

You are visitor number $COUNT

</html>

END_OF_PAGE

 

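If two users request the page at the same moment, two copies of the script may read and write the counter file at once and corrupt it. Perl includes facilities for synchronizing file accesses. Among others, the flock function allows users to lock a file for private use. You could use the line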

flock (COUNTER, 2);

to lock the file immediately after opening the file and the line

flock (COUNTER, 8);

to unlock the file immediately before closing it. (The values 2 and 8 correspond to the constants LOCK_EX and LOCK_UN exported by the Fcntl module.)
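
Putting the counter and the locking together might look like the following sketch. It is not the original author's script: next_count is a hypothetical helper, sysopen is used so the counter file is created on first use, and the file is truncated before the new value is written.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);   # O_RDWR, O_CREAT, LOCK_EX

# Hypothetical helper: adds one to the number stored in $path and returns it.
sub next_count {
    my ($path) = @_;
    # Open for reading and writing; create the file if it does not exist yet.
    sysopen(my $fh, $path, O_RDWR | O_CREAT) or die "Cannot open $path: $!";
    flock($fh, LOCK_EX) or die "Cannot lock $path: $!";   # same as flock(..., 2)

    my $line  = <$fh>;                                    # read the current value
    my $count = (defined $line && $line =~ /^(\d+)/) ? $1 : 0;
    $count++;                                             # increment it by one

    seek($fh, 0, 0);          # rewind the file
    truncate($fh, 0);         # discard the old value
    print $fh "$count\n";     # write the new value

    close($fh);               # closing the file also releases the lock
    return $count;
}
```

The CGI script would call next_count("/home/smith/counter.file") and interpolate the result into the page.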

Mailing List

Suppose you want to build a form that collects e-mail addresses from users interested in vegetarianism. The following is an example.

 

<form method="post" action="http://www.cc.ukans.edu/cgiwrap/grobe/send-veggi-info.pl">

<P>

If you would like more information about vegetarianism,

please enter your name and e-mail address below.<P>

Please enter your name:<br>

<input type="text" name="name" size="35"><br>

Please enter your e-mail address:<br>

<input type="text" name="address" size="35"><p>

<input type="submit" value="send address">

<input type="reset" value="start over">

</form>

 

To actually use the data, you must use the HTML form field names in a special way within the Perl script. For example, within the following script, $q->param('name') refers to the contents of the "name" field in the example form above. The result page returned to the user will contain the actual name entered when the fieldname "name" is used in this fashion:

thank.you.pl:

#!/usr/local/bin/perl

use CGI;

$q = new CGI;

$name = $q->param('name');

$address = $q->param('address');

print "Content-type:text/html\n\n";

print <<END_OF_MESSAGE;

<html>

<title>Thank You Page</title>

<h1>Thank you for filling out my form!</h1>

Thank you, $name, for filling out my form!

I will mail information to $address right away.

</html>

END_OF_MESSAGE
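
Because $name and $address come straight from the user, any HTML they contain would become part of the returned page. A common precaution is to escape the special characters first. The sketch below uses a hand-written helper, escape_html (a hypothetical name); the CGI module also provides an escapeHTML function for the same purpose.

```perl
use strict;
use warnings;

# Hypothetical helper: replaces the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    $text = "" unless defined $text;
    $text =~ s/&/&amp;/g;     # must come first, or the entities below get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

my $name = escape_html('<b>Megen</b> & "friends"');
print "Thank you, $name, for filling out my form!\n";
```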

 

Mailing information from a script

sendmail.pl:

use CGI;

$q = new CGI;

$address = $q->param('address');

$name = $q->param('name');

open (MAIL, "| /usr/lib/sendmail -oi -n -t" );

print MAIL <<MESSAGE_TO_USER;

To:$address

From:cvogel\@mrc-lmb.cam.ac.uk

Dear $name:

Here are some books that talk about vegetarianism:

     Diet for a New America

     How to Avoid Beef

MESSAGE_TO_USER

close MAIL;
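
Since sendmail -t takes the recipients from the message itself, a form value containing a line break could add extra header lines. The following sketch shows one way to guard against that before the text is piped to sendmail. build_message is a hypothetical helper, and the address check is deliberately rough rather than a full validation.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the text that would be piped to sendmail.
# Returns undef if the address could smuggle in extra header lines.
sub build_message {
    my ($address, $name) = @_;
    return undef if $address =~ /[\r\n]/;           # no line breaks inside a header
    return undef unless $address =~ /^\S+\@\S+$/;   # very rough shape check only
    $name =~ s/[\r\n]+/ /g;                         # keep the greeting on one line
    return "To: $address\n"
         . "\n"                                     # a blank line ends the headers
         . "Dear $name:\n"
         . "Here are some books that talk about vegetarianism:\n"
         . "     Diet for a New America\n"
         . "     How to Avoid Beef\n";
}
```

The script would print the returned text to the MAIL filehandle only when build_message returns a defined value.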

 

To record data in a file, you can use a Perl fragment like the following:

open (DATAFILE, ">> /home/smith/filename.txt");

flock (DATAFILE, 2);

print DATAFILE <<LIST_ITEM;

 blabla

LIST_ITEM

flock (DATAFILE, 8);

close DATAFILE;

 

The information written may contain HTML mixed with strings instructing Perl to insert information collected on a form. For example, a string like $name appearing in the HTML indicates the name variable set previously in your program should be included in the HTML written to the file. This approach might be used in a typical guestbook application as

guestbook.pl:

#!/usr/local/bin/perl

use CGI;

$q = new CGI;

$name = $q->param('name');

$address = $q->param('address');

open (DATAFILE, ">> /home/smith/public_html/guestbook.html");

flock (DATAFILE, 2);

print DATAFILE <<RECORD_ITEM;

Name: $name

Address: $address

RECORD_ITEM

flock (DATAFILE, 8);

close DATAFILE;

A complete example

The script collects the name and address information submitted by the user, records that information in a data file, returns a summary of the record to the user who entered the information, and sends a mail message to the form creator to let her know more data has arrived. This script uses the CGI module to collect the form information submitted by the user and relayed to the script by the Web server.

This example would be referenced with the URL

http://raven.cc.ukans.edu/cgiwrap/grobe/send-veggi-info.pl

if it were to be run from the public_html/cgi-bin directory within the home directory for the account grobe.

 

#!/usr/local/bin/perl

#

# Written by Michael Grobe...5-8-96.

# Updated by Jeff Long...9-15-2000.

#

# Send the html document MIME type.

#

print "Content-type: text/html\n\n";

# parse the input information and retrieve arguments from the form.

#

use CGI;

$q = new CGI;  # get the form arguments.

$name = $q->param('name');

$address = $q->param('address');

#

# respond with an html file to the user.

#

print <<WEB_PAGE;

<html>

<h1>Vegetarians unite!</h1>

A list of books about vegetarianism will be sent to $name

at $address.

</html>

WEB_PAGE

#

# Mail the information to the user using the Unix sendmail command.

#  sendmail -n ignores alias file.

#  sendmail -t examines stdin for To: list of addressees

#  sendmail -oi does not stop with a line containing only a period

#

open (MAIL, "| /usr/lib/sendmail -oi -n -t" );

print MAIL <<MESSAGE_TO_USER;

To:$address

From:grobe\@raven.cc.ukans.edu

Dear $name:

Here are some books that talk about vegetarianism:

     Diet for a New America

     How to Avoid Beef

MESSAGE_TO_USER

close MAIL;

#

# Record the request in a datafile in comma delimited format.

#

open (DATAFILE, ">> /homea/grobe/list-of-recipients.txt");

flock (DATAFILE, 2);

print DATAFILE <<LIST_ITEM;

\"$name\",\"$address\"

LIST_ITEM

flock (DATAFILE, 8);

close DATAFILE;

#

# send a message to the script owner that info has been sent to the user.

#

#  sendmail -n ignores alias file.

#  sendmail -t examines stdin for To: list of addressees

#  sendmail -oi does not stop with a line containing only a period

#

open (MAIL, "| /usr/lib/sendmail -oi -n -t" );

print MAIL <<MAIL_MESSAGE;

To:grobe\@raven.cc.ukans.edu

From:grobe\@raven.cc.ukans.edu

You have just sent info about vegetarianism to $name at

$address.  The requestor's name and address have been

recorded in /homea/grobe/list-of-recipients.txt.

MAIL_MESSAGE

close MAIL;

exit;    # end of user script.
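
The comma-delimited record in the script above breaks if a name itself contains a double quote. A possible refinement, sketched here with a hypothetical helper csv_line and made-up sample values, doubles any embedded quotes, which is a common CSV convention:

```perl
use strict;
use warnings;

# Hypothetical helper: formats fields as one comma-delimited record.
# Quotes inside a field are doubled; line breaks are replaced by spaces.
sub csv_line {
    my (@fields) = @_;
    my @quoted;
    foreach my $field (@fields) {
        my $copy = defined $field ? $field : "";
        $copy =~ s/"/""/g;
        $copy =~ s/[\r\n]+/ /g;      # keep each record on a single line
        push @quoted, "\"$copy\"";
    }
    return join(",", @quoted) . "\n";
}

print csv_line('Ann "Annie" Smith', 'ann@example.org');
```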

 

Checkboxes

<html>

<body>

<FORM METHOD=POST

   ACTION="http://www.ukans.edu/cgiwrap/grobe/test-checkbox.pl">

Please help us to improve KUfacts by filling in the following questionnaire:

<P>

Your organization? <INPUT NAME="org" TYPE=text SIZE="48">

<P>Which browsers do you use?

<OL>

   <LI>Lynx <INPUT NAME="browsers" TYPE=checkbox VALUE="Lynx">

   <LI>Mosaic <INPUT NAME="browsers" TYPE=checkbox VALUE="Mosaic">

   <LI>Netscape <INPUT NAME="browsers" TYPE=checkbox VALUE="Netscape">

   <LI>Internet Explorer <INPUT NAME="browsers" TYPE=checkbox

               VALUE="Explorer">

   <LI>Others <INPUT NAME="browsers" TYPE=checkbox VALUE="Others">

</OL>

Your e-mail address: <INPUT NAME="address" SIZE="42">

<P>Thanks for your input.

    <P><INPUT TYPE=submit value="Submit survey"> <INPUT TYPE=reset>

</FORM>

</body>

</html>

 

The following script could be used to return a summary of the respondent's reply. This script collects the form information and then prints a list of each browser that was checked. The CGI module will return all of the values for the variable browsers in a single array that can be obtained by @BROWSERS = $q->param('browsers').

 

#!/usr/local/bin/perl

use CGI;

$q = new CGI;

print "Content-type:text/html\n\n";

@BROWSERS = $q->param('browsers');

$org = $q->param('org');

$address = $q->param('address');

print "<html>";

print "<h1>Thanks for filling out the survey</h1>";

print "We will record your information as follows:<p>";

print "$org uses the following browsers:";

print "<ol>";

foreach $BROWSER (@BROWSERS) { print "<li>$BROWSER \n"; }

print "</ol>";

print "The contact address is:<br>$address";

print "</html>";

exit;

 

 

now:

if ($q->param('name') eq "Megen")

  {

     print "Hi Megen!";

  }

etc fancy stuff in Perl…

 

Vocabulary

HTML Tags Related to Forms Mode

The tags added to HTML to allow for HTML forms are:

<FORM>. . . </FORM>

Define an input form.
Attributes: ACTION, METHOD, ENCTYPE

<INPUT>

Define an input field.
Attributes: NAME, TYPE, VALUE, CHECKED, SIZE, MAXLENGTH

<SELECT> . . . </SELECT>

Define a selection list.
Attributes: NAME, MULTIPLE, SIZE

<OPTION>

Define a selection list selection (within a SELECT).
Attribute: SELECTED

<TEXTAREA> . . . </TEXTAREA>

Define a text input window.
Attributes: NAME, ROWS, COLS

Possible Input Tag Data Types

TEXT

For entering a single line of text. The SIZE attribute can be used to specify the visible width of the field. The MAXLENGTH attribute can be used to specify the maximum number of characters that can be typed into the field.

CHECKBOX

For Boolean variables, or for variables which can take multiple values at the same time. When a box is checked, the value specified in its VALUE attribute is assigned to the variable specified in its NAME attribute. If several checkbox fields each specify the same variable NAME, they can be used to assign multiple values to the named variable, since each checkbox field may have a VALUE attribute.

RADIO

For variables which can take only a single value from a set of alternatives. If several radio buttons have the same NAME, selecting one of the buttons will cause any already selected button in the group to be deselected.

SUBMIT

Selecting this link or pressing this button submits the form.

RESET

Selecting this link or pressing this button resets the form's fields to their initial values as specified by their VALUE attributes.

HIDDEN

For passing state information from one form to the next or from one script to the next. An input field of type HIDDEN will not appear on the form, but the value specified in the "VALUE" attribute will be passed along with the other values when the form is submitted.

IMAGE

For displaying an image map within a form and returning the coordinates of a mouse click within the image.

multiple choice:

    <SELECT NAME="browser">

       <OPTION> Cello

       <OPTION> Lynx

       <OPTION> X Mosaic

       <OPTION> Mac Mosaic

       <OPTION> Win Mosaic

       <OPTION> Line Mode

       <OPTION> Some other

    </SELECT>

 

       <TEXTAREA NAME="address" ROWS=6 COLS=60>

           Academic Computing Services

           The University of Kansas

           Lawrence, Kansas 66045

       </TEXTAREA>

 

POST-Method Example

<html>

<head>

<title>This is a practice form.</title>

</head>

<body>

<FORM METHOD=POST

               ACTION="http://www.cc.ukans.edu/cgi-bin/post-query">

Please help us to improve the World Wide Web by filling in

the following questionnaire:

   <P>Your organization? <INPUT NAME="org" TYPE=text SIZE="48">

   <P>Commercial? <INPUT NAME="commerce" TYPE=checkbox>

           How many users? <INPUT NAME="users" TYPE=int>

   <P>Which browsers do you use?

   <OL>

    <LI>Cello <INPUT NAME="browsers" TYPE=checkbox VALUE="cello">

    <LI>Lynx <INPUT NAME="browsers" TYPE=checkbox VALUE="lynx">

    <LI>X Mosaic <INPUT NAME="browsers" TYPE=checkbox VALUE="mosaic">

    <LI>Others <INPUT NAME="others" SIZE=40>

   </OL>

A contact point for your site: <INPUT NAME="contact" SIZE="42">

<P>Many thanks on behalf of the WWW central support team.

    <P><INPUT TYPE=submit> <INPUT TYPE=reset>

</FORM>

</body>

</html>

 

When the form is submitted, the client sends the values that were filled in as a "POST" query addressed to the program residing in the file at "/cgi-bin/post-query". Post-query is a script that simply echoes the values it receives.

 

The server takes the incoming data and passes it to the program post-query, which uses it to construct a file to return to the client. The reply may be HTML, an image file, or any other kind of document, though returning an HTML document is most common. The script's response to the example query is an HTML document that lists the variable values it received. The HTML looks like:

 

Content-type: text/html

     * a blank line *

<H1>Query Results</H1>

You submitted the following name/value pairs:

<ul>

<li>org = Academic Computing Services

<li>users = 10000

<li>browsers = cello

<li>browsers = lynx

<li>browsers = mosaic

<li>others = Mac Mosaic, Win Mosaic

<li>contact = Michael Grobe grobe@kuhub.cc.ukans.edu

</ul>

GET-Method

Form data may be sent to scripts for processing by using the GET method as well as the POST method. For example, the first form example above could have been encoded as

 

<FORM METHOD=GET

        ACTION="http://www.cc.ukans.edu/cgi-bin/post-query">

 

If a GET method is used, an HTTP request from the client would look something like:

 

GET /cgi-bin/post-query?org=Academic%20Computing%20Services

&users=10000&browsers=lynx&browsers=cello&browsers=mosaic

&others=MacMosaic%2C%20WinMosaic

&contact=Michael%20Grobe%20grobe@kuhub.cc.ukans.edu HTTP/1.0

Accept: www/source

Accept: image/gif

Accept: application/postscript

User-Agent:  Lynx/2.2  libwww/2.14

From:  grobe@www.cc.ukans.edu

     * a blank line *

 

In general, GET should probably be used when a URL access will not change the state of a database (by, for example, adding or deleting information) and POST should be used when an access WILL cause a change. However, due to bugs in some server software you might not be able to use a GET method if the query is too long.
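
To make the encoding of a GET query concrete, the sketch below decodes a query string by hand into a hash of arrays, so that a name that occurs several times (such as browsers) keeps all of its values. In practice the CGI module does this work for you; parse_query is a hypothetical helper written only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: turns "a=1&b=2&b=3" into { a => [1], b => [2, 3] }.
sub parse_query {
    my ($query) = @_;
    my %values;
    foreach my $pair (split /&/, $query) {
        next unless length $pair;                   # skip empty pieces
        my ($name, $value) = split /=/, $pair, 2;
        $value = "" unless defined $value;
        foreach ($name, $value) {
            tr/+/ /;                                # "+" stands for a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # decode %XX escapes
        }
        push @{ $values{$name} }, $value;
    }
    return \%values;
}

my $form = parse_query(
    "org=Academic%20Computing%20Services&users=10000"
  . "&browsers=lynx&browsers=cello&browsers=mosaic"
);
print "org: $form->{org}[0]\n";
print "browsers: @{ $form->{browsers} }\n";
```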

 

Perl-DBI

Thanks a lot to Bernard de Bono, Cambridge, UK:

Having mastered CGI by now, you decide you need to involve your DBMS, e.g. MySQL, to manage all the data going in and out.

Example DBI script:


#!/usr/bin/perl -w

 

use DBI;

use strict;

 

my (@data1)=();

 

my($dbh) = DBI->connect('DBI:mysql:sample','tmpl','tmp') or die "Couldn't connect to database: " . DBI->errstr;

 

 

my $sth1 = $dbh->prepare("select * from student where sex=?") or die "Couldn't prepare statement: " . $dbh->errstr;

 

  $sth1->execute('M');

 

  while(@data1 = $sth1->fetchrow_array()){

    print join("\t", @data1)."\n";

  }

 

  $sth1->finish;

 

 

$dbh->disconnect;