Appendix A -- CGI Reference
CONTENTS
Output
Headers
MIME
No-Parse Header
Input
ISINDEX
Environment Variables
Getting Input from Forms
This appendix provides a reference for the CGI protocol and related
variables, including MIME types, environment variables, and hexadecimal
encoding for nonalphanumeric characters.
Output
To output something from a CGI application, print to stdout. You
format output as follows:
headers
body/data
Headers
Each header consists of the HTTP header's name followed by a colon,
a space, and the value. Each header line should end with a carriage
return and a line feed (\r\n), and so should the blank line that
follows the last header.
Header name: header value
The headers of a CGI response must include at least one of the following:
Location: URI
Content-Type: MIME type/subtype
Status: code message
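As a rough illustration of the format described above, the following
Python sketch assembles a response from a list of headers and a body.
The function name build_cgi_response and the sample values are
assumptions made for this example; they are not part of CGI itself.

```python
import sys


def build_cgi_response(headers, body):
    """Return header lines, a blank line, and then the body.

    `headers` is a list of (name, value) pairs. The function name and
    argument layout are illustrative assumptions for this sketch.
    """
    lines = [f"{name}: {value}" for name, value in headers]
    # Each header line ends with \r\n; the extra \r\n produces the
    # blank line that separates the headers from the body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


if __name__ == "__main__":
    # A CGI program sends its response by printing to stdout.
    sys.stdout.write(
        build_cgi_response(
            [("Content-Type", "text/html")],
            "<html><body><h1>Hello</h1></body></html>\n",
        )
    )
```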
You can include additional headers, including any HTTP-specific
headers (such as Expires or Server) and any custom headers. See
Chapter 4, "Output," for a discussion of the Location header.
Table A.1 lists the status codes, which tell the client whether the
transaction was successful or not and what to do next. See Chapter 8,
"Client/Server Issues," for more about status codes.
Table A.1. Valid HTTP status codes.
Status Code  Definition
200          The request was successful and a proper response has been sent.
201          If a resource or file has been created by the server, it sends a 201 status code and the location of the new resource. Of the methods GET, HEAD, and POST, only POST is capable of creating new resources (for example, file uploading).
202          The request has been accepted although it might not have been processed yet. For example, if the user requested a long database search, you could start the search, respond with a 202 message, and inform the user that the results will be e-mailed later.
204          The request was successful but there is no content to return.
301          The requested document has a new, permanent URL. The new location should be specified in the Location header.
302          The requested document is temporarily located at a different location, specified in the Location header.
304          If the client requests a conditional GET (that is, it only wants to get the file if it has been modified after a certain date) and the file has not been modified, the server responds with a 304 status code and doesn't bother resending the file.
400          The request was bad and incomprehensible. You should never receive this error if your browser was written properly.
401          The client has requested a file that requires user authentication.
403          The server understands the request but refuses to fulfill it, most likely because either the server or the client does not have permission to access that file.
404          The requested file is not found.
500          The server experienced some internal error and cannot fulfill the request. You often will see this error if your CGI program has some error or sends a bad header that the server cannot parse.
501          The command requested has not been implemented by the server.
502          While the server was acting as a proxy server or gateway, it received an invalid response from the other server.
503          The server is too busy to handle any further requests.
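To show how a status code from Table A.1 might be sent, here is a
minimal Python sketch that formats a Status header and builds a
redirect from it. The helper names status_header and redirect are
hypothetical. The message text comes from Python's http.HTTPStatus
table, which is an assumption of convenience; a server may accept
other wording for the message.

```python
from http import HTTPStatus


def status_header(code):
    """Format a CGI Status header such as 'Status: 404 Not Found'."""
    # HTTPStatus supplies a standard message for a numeric status code.
    return f"Status: {code} {HTTPStatus(code).phrase}"


def redirect(url, permanent=False):
    """Build a header-only response that sends the client to `url`."""
    # 301 marks a permanent move, 302 a temporary one; both carry the
    # new address in the Location header.
    code = 301 if permanent else 302
    return f"{status_header(code)}\r\nLocation: {url}\r\n\r\n"
```

A call such as redirect("http://www.example.com/new.html", permanent=True)
produces a 301 response; the URL here is only a placeholder.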
MIME
MIME headers look like the following:
type/subtype
where type is one of the following:
text
image
audio
video
application
multipart
message
The subtype provides specific information about the data format in
use. A subtype preceded by an x- indicates an experimental subtype
that has not yet been registered. Table A.2 contains several MIME
type/subtypes. A complete list of registered MIME types is available
at URL: ftp://ftp.isi.edu/in-notes/iana/assignments/media-types.
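The type/subtype structure can be sketched in a few lines of Python.
The function below splits a MIME string into its two parts and reports
whether the subtype carries the experimental x- prefix. The name
parse_mime_type and the tuple it returns are assumptions chosen for
this illustration.

```python
def parse_mime_type(value):
    """Split 'type/subtype' into (type, subtype, is_experimental)."""
    type_, slash, subtype = value.strip().partition("/")
    if not slash or not type_ or not subtype:
        raise ValueError(f"not a type/subtype string: {value!r}")
    # MIME types are conventionally written in lowercase.
    type_, subtype = type_.lower(), subtype.lower()
    # A subtype that begins with x- has not been registered.
    return type_, subtype, subtype.startswith("x-")
```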
Table A.2. MIME types/subtypes.
Type/Subtype                       Function
text/plain                         Plain text. By default, if the server doesn't recognize the file extension, it assumes that the file is plain text.
text/html                          HTML files.
text/richtext                      Rich Text Format. Most word processors understand rich text format, so it can be a good portable format to use if you want people to read it from their word processors.
text/enriched                      The text enriched format is a method of formatting similar to HTML, meant for e-mail and news messages. It has a minimal markup set and uses multiple carriage returns and line feeds as separators.
text/tab-separated-values          Text tab delimited format is the simplest common format for databases and spreadsheets.
text/sgml                          Standard General Markup Language.
image/gif                          GIF images, a common, compressed graphics format specifically designed for exchanging images across different platforms. Almost all graphical browsers display GIF images inline (using the <img> tag).
image/jpeg                         JPEG is another popular image compression format. Although a fairly common format, JPEG is not supported internally by as many browsers as GIF is.
image/x-xbitmap                    X bitmap is a very simple pixel-by-pixel description of images. Because it is simple and because most graphical browsers support it, it can be useful for creating small, dynamic images such as counters. Generally, X bitmap files have the extension .xbm.
image/x-pict                       Macintosh PICT format.
image/tiff                         TIFF format.
audio/basic                        Basic 8-bit, ulaw compressed audio files. Filenames usually end with the extension .au.
audio/x-wav                        Microsoft Windows audio format.
video/mpeg                         MPEG compressed video.
video/quicktime                    QuickTime video.
video/x-msvideo                    Microsoft Video. Filenames usually end with the extension .avi.
application/octet-stream           Any general, binary format that the server doesn't recognize usually uses this MIME type. Upon receiving this type, most browsers give you the option of saving the data to a file. You can use this MIME type to force a user's browser to download and save a file rather than display it.
application/postscript             PostScript files.
application/atomicmail
application/andrew-inset
application/rtf                    Rich Text Format (see text/richtext above).
application/applefile
application/mac-binhex40
application/news-message-id
application/news-transmission
application/wordperfect5.1         WordPerfect 5.1 word processor files.
application/pdf                    Adobe's Portable Document Format for the Acrobat reader.
application/zip                    The Zip compression format.
application/macwriteii             Macintosh MacWrite II word processor files.
application/msword                 Microsoft Word word processor files.
application/mathematica
application/cybercash
application/sgml                   Standard General Markup Language.
application/x-www-form-urlencoded  Default encoding for HTML forms.
multipart/mixed                    Contains several pieces of many different types.
multipart/x-mixed-replace          Similar to multipart/mixed except that each part replaces the preceding part. Used by Netscape for server-side push CGI applications.
multipart/form-data                Contains form name/value pairs. Encoding scheme used for HTTP File Upload.
For example, the header that announces HTML content is
Content-Type: text/html
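When a program returns files of several kinds, the Content-Type value
can be looked up from the filename. The sketch below uses Python's
standard mimetypes module for the lookup. The function name
content_type_header and the application/octet-stream fallback are
assumptions of this example, not something CGI prescribes.

```python
import mimetypes


def content_type_header(filename, default="application/octet-stream"):
    """Return a Content-Type header line chosen from the file extension."""
    # guess_type returns (type, encoding); the type is None when the
    # extension is not recognized.
    mime_type, _encoding = mimetypes.guess_type(filename)
    return f"Content-Type: {mime_type or default}"
```

Falling back to application/octet-stream asks most browsers to save the
data rather than display it, which seems a cautious choice for unknown
formats.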
No-Parse Header
No-Parse Header (nph) CGI programs communicate directly with the Web
browser. The CGI headers are not parsed by the server (hence the name
No-Parse Header), and buffering is usually turned off. Because the CGI
program communicates directly with the browser, it must contain a
valid HTTP response header. The first header must be
HTTP/1.0 nnn message
where nnn is the three-digit status code and message is the status
message. Any headers that follow are standard HTTP headers such as
Content-Type.
You generally specify NPH programs by preceding the name of the
program with nph-.
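The difference from an ordinary CGI response is the first line. A
minimal Python sketch of an NPH response follows; the function name
build_nph_response is hypothetical, and the program would be saved
under a name beginning with nph- as described above.

```python
def build_nph_response(code, message, headers, body):
    """Return a complete HTTP/1.0 response for an NPH program.

    `headers` is a list of (name, value) pairs.
    """
    # The status line comes first because the server does not add one.
    lines = [f"HTTP/1.0 {code:03d} {message}"]
    lines += [f"{name}: {value}" for name, value in headers]
    # A blank line ends the headers, exactly as in a parsed response.
    return "\r\n".join(lines) + "\r\n\r\n" + body
```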
Note that HTTP is at version 1.0 currently, but 1.1 is being worked on
as this book is being written, and some features and headers from 1.1
have already been implemented in some browsers and servers.
Input
CGI applications obtain input using one or a combination of three
methods: environment variables, standard input, and the command
line.
ISINDEX
ISINDEX enables you to enter keywords. The keywords are appended to
the end of the URL following a question mark (?) and separated by plus
signs (+). CGI programs can access ISINDEX values either by checking
the environment variable QUERY_STRING or by reading the command-line
arguments, one keyword per argument.
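As an illustration, the following Python sketch recovers ISINDEX
keywords from the value of QUERY_STRING. The function name
isindex_keywords is an assumption. Treating any query that contains an
equal sign as form data rather than keywords is also an assumption of
this sketch.

```python
from urllib.parse import unquote


def isindex_keywords(query_string):
    """Return the ISINDEX keywords found in a QUERY_STRING value."""
    # Assumption: a query containing "=" holds form name/value pairs,
    # not ISINDEX keywords, so it yields no keywords here.
    if not query_string or "=" in query_string:
        return []
    # Keywords are separated by plus signs; other special characters
    # arrive percent-encoded and are decoded one keyword at a time.
    return [unquote(word) for word in query_string.split("+")]
```

In a running program the argument would typically be
os.environ.get("QUERY_STRING", "").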
Environment Variables
CGI environment variables provide information about the server,
the client, the CGI program itself, and sometimes the data sent
to the server. Tables A.3 and A.4 list some common environment
variables.
Table A.3. CGI environment variables.
Environment Variable  Description
GATEWAY_INTERFACE     Describes the version of CGI protocol. Set to CGI/1.1.
SERVER_PROTOCOL       Describes the version of HTTP protocol. Usually set to HTTP/1.0.
REQUEST_METHOD        Either GET or POST, depending on the method used to send data to the CGI program.
PATH_INFO             Data appended to a URL after a slash. Typically used to describe some path relative to the document root.
PATH_TRANSLATED       The complete path of PATH_INFO.
QUERY_STRING          Contains input data if using the GET method. Always contains the data appended to the URL after the question mark (?).
CONTENT_TYPE          Describes how the data is being encoded. Typically application/x-www-form-urlencoded. For HTTP File Upload, it is set to multipart/form-data.
CONTENT_LENGTH        Stores the length of the input if you are using the POST method.
SERVER_SOFTWARE       Name and version of the server software.
SERVER_NAME           Host name of the machine running the server.
SERVER_ADMIN          E-mail address of the Web server administrator.
SERVER_PORT           Port on which the server is running, usually 80.
SCRIPT_NAME           The name of the CGI program.
DOCUMENT_ROOT         The value of the document root on the server.
REMOTE_HOST           Name of the client machine requesting or sending information.
REMOTE_ADDR           IP address of the client machine connected to the server.
REMOTE_USER           The username if the user has authenticated himself or herself.
REMOTE_GROUP          The group name if the user belonging to that group has authenticated himself or herself.
AUTH_TYPE             Defines the authorization scheme being used, if any, usually Basic.
REMOTE_IDENT          Displays the username of the person running the client connected to the server. Works only if the client machine is running IDENTD as specified by RFC 931.
Table A.4. Common HTTP variables.
Environment Variable  Description
HTTP_ACCEPT           Contains a comma-delimited list of MIME types the browser is capable of interpreting.
HTTP_USER_AGENT       The browser name, version, and usually its platform.
HTTP_REFERER          Stores the URL of the page that referred you to the current URL.
HTTP_ACCEPT_LANGUAGE  Languages supported by the Web browser; en is English.
HTTP_COOKIE           Contains cookie values if the browser supports HTTP cookies and currently has stored cookie values. A cookie value is a variable that the server tells the browser to remember to tell back to the server later.
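A CGI program reads these variables from its process environment. The
Python sketch below gathers a few of them into a dictionary. The
function name describe_request, the chosen keys, and the default
values are all assumptions for illustration, since a server may leave
any given variable unset.

```python
import os


def describe_request(environ=None):
    """Collect a few CGI variables, with defaults for unset ones."""
    if environ is None:
        environ = os.environ
    return {
        "method": environ.get("REQUEST_METHOD", "GET"),
        "script": environ.get("SCRIPT_NAME", ""),
        "path_info": environ.get("PATH_INFO", ""),
        "query": environ.get("QUERY_STRING", ""),
        # Prefer the host name and fall back to the IP address.
        "client": environ.get("REMOTE_HOST") or environ.get("REMOTE_ADDR", ""),
        "user_agent": environ.get("HTTP_USER_AGENT", ""),
    }
```

Passing a plain dictionary instead of os.environ makes the function
easy to try outside a Web server.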
A full list of HTTP 1.0 headers can be found at the following
location:
http://www.w3.org/hypertext/WWW/protocols/HTTP/1.0/spec.html
Getting Input from Forms
Input from forms is sent to the CGI application using one of two
methods: GET or POST.
Both methods by default encode the data using URL encoding. Names and
their associated values are separated by equal signs (=), name/value
pairs are separated by ampersands (&), and spaces are replaced with
plus signs (+), as follows:
name1=value1&name2=value2a+value2b&name3=value3
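A string in this format can be decoded with Python's standard
urllib.parse module, as the short sketch below suggests. The wrapper
name decode_form_data is an assumption of this example. Each name maps
to a list because a form may send the same name more than once.

```python
from urllib.parse import parse_qs


def decode_form_data(encoded):
    """Decode URL-encoded form data into a dict of name -> list of values."""
    # parse_qs splits on "&" and "=", turns "+" back into a space, and
    # decodes %XX sequences; keep_blank_values retains empty fields.
    return parse_qs(encoded, keep_blank_values=True)
```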
Every other nonalphanumeric character is URL encoded. This means that
the character is replaced by a percent sign (%) followed by its
two-digit hexadecimal equivalent. Table A.5 contains a list of
nonalphanumeric characters and their hexadecimal values.
Table A.5. Nonalphanumeric characters and their hexadecimal values.
Character  Hexadecimal
Tab        09
Space      20
"          22
(          28
)          29
,          2C
.          2E
;          3B
:          3A
<          3C
>          3E
@          40
[          5B
\          5C
]          5D
^          5E
`          60
{          7B
|          7C
}          7D
~          7E
?          3F
&          26
/          2F
=          3D
#          23
%          25
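The encoding rule can be written out by hand to make it concrete. The
Python sketch below follows the rule as this appendix states it:
letters and digits pass through, a space becomes a plus sign, and
everything else becomes a percent sign and two hexadecimal digits. The
function name url_encode is an assumption. Library encoders such as
urllib.parse.quote_plus leave a few more characters unencoded, so
their output can differ from this one.

```python
from urllib.parse import unquote_plus


def url_encode(text):
    """Encode text using the rule described in this appendix."""
    pieces = []
    for byte in text.encode("utf-8"):
        char = chr(byte)
        if char.isascii() and char.isalnum():
            pieces.append(char)            # letters and digits are kept
        elif char == " ":
            pieces.append("+")             # a space becomes a plus sign
        else:
            pieces.append(f"%{byte:02X}")  # anything else becomes %XX
    return "".join(pieces)


# Decoding with the standard library reverses the encoding.
assert unquote_plus(url_encode("a b&c=d")) == "a b&c=d"
```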
The GET method passes the encoded input string to the environment
variable QUERY_STRING. The POST method passes the length of the input
string to the environment variable CONTENT_LENGTH, and the input
string is passed to the standard input.
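Putting the two methods together, a program might read its form input
as in the following Python sketch. The function name read_form_data
and its parameters are assumptions for illustration. The input stream
is passed in as an argument so that the function can be tried without
a Web server, and it is assumed to be a binary stream such as
sys.stdin.buffer.

```python
import os
import sys
from urllib.parse import parse_qs


def read_form_data(environ, stdin):
    """Return decoded form fields for a GET or POST request."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # CONTENT_LENGTH says how many bytes to read from standard input.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("ascii", errors="replace")
    else:
        # With GET, the encoded string arrives in QUERY_STRING.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw, keep_blank_values=True)


if __name__ == "__main__":
    fields = read_form_data(os.environ, sys.stdin.buffer)
    sys.stdout.write("Content-Type: text/plain\r\n\r\n")
    for name, values in fields.items():
        sys.stdout.write(f"{name} = {', '.join(values)}\n")
```

Reading exactly CONTENT_LENGTH bytes matters because the server is not
required to signal end-of-file after the input string.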