Outline of Today’s Class Web Servers Static and Dynamic Web Pages CGI Programming What makes the CGI work?
FORM GET and POST Methods QUERY_STRING and CONTENT_LENGTH
SGML, HTML and XHTML XML and DTD XML Examples
Web Servers How does a web server work?
You contact the web server and request a file. The server returns the file.
Web Server
Files /myDir/index.html /myDir/foo.html /myDir/bar.html
PC-1
PC-2
GET foo.html
Foo.html
GET index.html
Index.html
Web Servers
Most web servers are very simple. They just return files to the PC that requests them
The web browser does the hard work of translating a file into pretty pictures
See “View->Source” for the file actually returned
Web Servers
It would be a Bad Thing if anyone on the internet could retrieve any file on the web server.
The files are kept in a special directory — requests for files are relative to that directory.
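The idea of resolving every request relative to one special directory can be sketched in Python as follows. This is only an illustration, not how any particular web server is implemented: the function name resolve_request and the directory /srv/www are assumptions made for the example.

```python
import posixpath
from typing import Optional


def resolve_request(document_root: str, request_path: str) -> Optional[str]:
    """Map a requested path onto a file below document_root.

    Returns None when the request would climb out of the root
    (for example via ".." segments). Hypothetical helper for illustration.
    """
    root = posixpath.normpath(document_root)
    # Treat the request as relative to the root, then collapse "." and "..".
    candidate = posixpath.normpath(
        posixpath.join(root, request_path.lstrip("/"))
    )
    if candidate != root and not candidate.startswith(root + "/"):
        return None
    return candidate


if __name__ == "__main__":
    # /srv/www is an assumed document root, not taken from the slides.
    print(resolve_request("/srv/www", "/myDir/foo.html"))
    print(resolve_request("/srv/www", "/../../etc/passwd"))
```

A request that stays inside the directory yields a full file path; a request that tries to escape it yields None, which a server could answer with an error.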
Dynamic Web Pages
Do Computation Generate HTML page with results of computation
Return dynamically generated HTML file
Request service
CGI and Web Forms
How to write the HTML that sends data to the server?
What does the server have to do to process this information?
The most common method to handle this is CGI -- Common Gateway Interface
Request Method: Get
GET requests can include a query string as part of the URL:
GET /cgi-bin/finger?hollingd HTTP/1.0
Request Method
Resource Name
Delimiter
Query String
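As a rough sketch of how the parts labelled above could be separated, the following Python splits a request line into method, resource name, query string and HTTP version. The names RequestLine and parse_request_line are made up for this example.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str
    resource: str
    query_string: str
    version: str


def parse_request_line(line: str) -> RequestLine:
    """Split an HTTP request line into its labelled parts."""
    method, target, version = line.strip().split(" ")
    # "?" is the delimiter between the resource name and the query string.
    resource, _delimiter, query_string = target.partition("?")
    return RequestLine(method, resource, query_string, version)


if __name__ == "__main__":
    print(parse_request_line("GET /cgi-bin/finger?hollingd HTTP/1.0"))
```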
CGI URLs
There is a mapping between URLs and CGI programs provided by a web server. The exact mapping is not standardized (web server admin can set it up)
Typically: requests that start with /CGI-BIN/ , /cgi-bin/
or /cgi/, etc. refer to CGI programs (not to static documents).
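Since the mapping is set up by the server admin, the check below is only one possible convention, sketched in Python. The prefix list and the case-insensitive comparison are assumptions for illustration.

```python
from typing import Tuple

# Assumed prefixes; a real server reads these from its configuration.
CGI_PREFIXES: Tuple[str, ...] = ("/cgi-bin/", "/cgi/")


def is_cgi_request(path: str, prefixes: Tuple[str, ...] = CGI_PREFIXES) -> bool:
    """Return True when the URL path should be handled by a CGI program."""
    # Lower-casing lets /CGI-BIN/ and /cgi-bin/ map to the same rule.
    return path.lower().startswith(prefixes)


if __name__ == "__main__":
    for path in ("/cgi-bin/finger", "/CGI-BIN/finger", "/myDir/index.html"):
        print(path, is_cgi_request(path))
```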
CGI Programs
When the user hits the “submit” button the data is sent to the web server
The CGI program that handles it on the web server is specified in the HTML Form tag
<FORM method=post action="http://unix.aml.yorku.ca/cgi-bin/formProcessor.pl">
CGI Programs
Anything special about the program?
The web server has to have permissions set to allow the program to be executed. Typically this is only turned on in a few directories, e.g. /cgi-bin
Has to comply with the usual security requirements for that system.
CGI Programs
What kind of program does it need to be? Can be written in any language—C++, C,
perl, etc. Just has to be able to process the attribute-value pairs.
Perl is excellent for its pattern matching and text processing capabilities.
CGI Programs The data is sent to the CGI program in a specific format of attribute-value pairs. The attribute is the name of the field in the HTML tag, the values are what the user inputs
firstName=lee middleName=harvey lastName=oswald
First name: <input type="text" name="firstName"> Middle name: <input type="text" name="middleName"><br> Last name: <input type="text" name="lastName"><br>
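A minimal Python sketch of processing the attribute-value pairs, assuming they arrive joined by "&" as in a URL-encoded form submission (the slides mention Perl; Python is used here only for illustration, and parse_pairs is a made-up name):

```python
from urllib.parse import parse_qsl


def parse_pairs(encoded: str) -> dict:
    """Decode URL-encoded form data into a {field name: value} dict."""
    # keep_blank_values keeps fields the user left empty.
    return dict(parse_qsl(encoded, keep_blank_values=True))


if __name__ == "__main__":
    data = "firstName=lee&middleName=harvey&lastName=oswald"
    print(parse_pairs(data))
```

The keys are the name attributes of the input tags; the values are what the user typed.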
CGI Programs
Strengths: A simple method to send data to the server. Dynamically generates HTML pages.
Weaknesses: All the processing happens on the server. Takes time to launch the CGI process on the server. Each request is handled by a separate process instead of a thread.
Web Forms Overview of Web forms
HTML form components
GET & POST methods
Server-side processing with forms
CGI-based Web Application
Get Data
HTTP Request
HTTP Document
Output (HTML)
HTML forms to invoke CGI scripts
CGI Scripts/ Applications
Web Browser Web Server
Database Return data
Form Interaction with CGI
Web Browser Web Server
CGI Program
User requests form
Returns form to client
User submits form Forwards to CGI program
Returns results to server Returns results to client
Network Server
Forms
Forms work in a different and slightly more complex way than standard HTML pages.
Forms consist of a number of separate data entry components such as menus and text areas.
The user can select different options from the menus and enter text in the text entry fields.
A single form can contain many text entry fields and/or many menus.
To differentiate the menus and text areas from each other, each one is given a unique name, selected by the Web form designer.
HTML Forms
Each form includes a METHOD that determines what http method is used to submit the request.
Each form includes an ACTION that determines where the request is made.
HTML Forms
HTML includes elements or tags for creating forms on Web pages.
There are three stages to creating a form:
define the form data [a set of variables]
design the form itself
define the method for processing the form’s data on the server-side
When the Web page containing the form is loaded, the user can:
enter data into the form
then submit that data to the Web server [usually by clicking a submit button on the form]
HTML Form Variables A variable has:
a name a value
A form contains one or more variables. When the user fills in the form, values are assigned to these variables.
When the user clicks the submit button, the set of variable names & corresponding values is sent to the Web server in an HTTP request. The Web server can extract the set of variables & values from the HTTP request, and can do something with them...
Example for HTML Form
<html>
<head> <title>Query Form</title> </head>
<body>
<h2>Query Form</h2>
<form method="GET" action="doquery.php">
<p>Your name: <input name="name" type="text" size=30></p>
<p>Your ID: <input name="id" type="text" size=15></p>
<p><input type="submit" value="Submit your query"></p>
<p><input type="reset" value="Clear your query"></p>
</form>
</body>
</html>
Note that this form contains two variables
name & id
Example for HTML Form
<input name="name" type="text" size=30>
<input name="id" type="text" size=15>
<input type="submit" value="Submit your query"> <input type="reset" value="Clear your query">
Forms
<?xml version = "1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns = "http://www.w3.org/1999/xhtml"> <head> <title>Web Engineering - Feedback Form</title> </head> <body><h1>Feedback Form</h1> <p>Any comments please.</p> <form method = "post" action = "/cgi-bin/feedbackform"><p> <input type = "hidden" name = "recipient" value = "webeng@xhtmllecture.com" /> <input type = "hidden" name = "subject" value = "Feedback Form" /> <input type = "hidden" name = "redirect" value = "main.html" /> </p> </form> <p>
Each form must begin and end with form tags.
The method attribute specifies how the form’s data is sent to the Web server. The post method appends form data
to the browser request.
The value of the action attribute specifies the URL of a script on
the Web server.
Input elements are used to send data to the script that processes the form.
A hidden value for the type attribute sends data that is
not entered by the user.
Forms <label>Name: <input name = "name" type = "text" size = "25" maxlength = "30" /> </label></p> <p><form> <input type = "submit" value = "Submit comments" /> <input type = "reset" value = "Clear comments" /> </p> </form></body></html>
The value attribute displays a name on the buttons created.
The maxlength attribute gives the maximum number of
characters the user can input.
The size attribute gives the number of characters
visible in the text box.
The label element describes the data the user needs to enter in the text box.
Forms
Text box created using
input element.
Reset button created
using input element.
Submit button created
using input element.
Table & Form
<TABLE FRAME = none>
<TR><TD ALIGN = right> Name:<BR> Card number:<BR> Expires:<BR> Telephone:<BR>
<TD ALIGN=left><BR>
<FORM method="POST" action="/cgi-bin/myscript.cgi">
<INPUT NAME="name" SIZE=18><BR>
<INPUT NAME="cardnum" SIZE=18><BR>
<INPUT NAME="expires-month" SIZE=2>/ <INPUT NAME="expires-year" SIZE=2><BR>
<INPUT NAME="phone" SIZE=18>
</FORM>
</TABLE>
Form Methods
The method attribute on the form tag specifies how the Web Browser should send the data to the Web server.
Two options:
GET: pass the data in an HTTP GET request
POST: pass the data in an HTTP POST request
In an HTTP GET request, the browser appends the form data to a URL. For example:
http://www.yorku.ca/jhuang/doquery.cgi?name=joe+bloggs&id=1234
Note how the variable names & values are appended to the URL. Any spaces in a value are converted to +.
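The encoding a browser applies for a GET form can be imitated with Python's standard library. This is a sketch for illustration; build_get_url is a made-up helper name, and the URL is the one from the example above.

```python
from urllib.parse import urlencode


def build_get_url(action: str, fields: dict) -> str:
    """Append URL-encoded form fields to the action URL, as a GET form does."""
    # urlencode joins pairs with "&" and converts spaces to "+".
    return action + "?" + urlencode(fields)


if __name__ == "__main__":
    url = build_get_url(
        "http://www.yorku.ca/jhuang/doquery.cgi",
        {"name": "joe bloggs", "id": "1234"},
    )
    print(url)
```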
Form Actions
The action attribute on the form tag specifies what the Web server should do with the form data.
Common options: email the data to someone [the mailto action]; pass the data to a script or program
The script will be passed the variables & values, and can then process them.
For example, the CGI script could use the name & id to look up student info in a database.
Form Actions <form method="GET" action="mailto:jhuang@yorku.ca">
Until you can actually use scripts on the server, use the
mailto action. It operates in the same way as the mailto that you have used in the HTML document.
When used in a form, the mailto action will send an email to the email address of the person specified. The mailto action is of limited use for complicated forms but works adequately for simple forms.
The email received contains all of the names and values in one long list.
What a CGI will get
The query (from the environment variable QUERY_STRING) will be a URL-encoded string containing the name-value pairs of all form fields.
The CGI must decode the query and separate the individual fields.
GET vs. POST
The GET method delivers data (query) as part of the URL
When using forms, it’s generally better to use POST:
there are limits on the maximum size of a GET query string (environment variable)
a POST query string doesn’t show up in the browser as part of the current URL
CGI reading POST
If REQUEST_METHOD is POST, the query arrives on STDIN.
The environment variable CONTENT_LENGTH tells us how much data to read.
CGI Method Summary
GET: REQUEST_METHOD is “GET”; QUERY_STRING is the query
POST: REQUEST_METHOD is “POST”; CONTENT_LENGTH is the size of the query (in bytes); the query can be read from STDIN
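The summary above can be sketched as a single Python function. It takes the environment and the input stream as arguments so it can be tried without a web server; the name read_form_data and the UTF-8 decoding are assumptions for this example.

```python
import io
from typing import BinaryIO, Dict, List, Mapping
from urllib.parse import parse_qs


def read_form_data(environ: Mapping[str, str], stdin: BinaryIO) -> Dict[str, List[str]]:
    """Return the decoded form fields for a GET or POST CGI request."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # CONTENT_LENGTH says how many bytes of query to read from STDIN.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("utf-8")
    else:
        # For GET the query is in the QUERY_STRING environment variable.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw, keep_blank_values=True)


if __name__ == "__main__":
    body = b"name=joe+bloggs&id=1234"
    post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))}
    print(read_form_data(post_env, io.BytesIO(body)))
    get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "name=joe+bloggs&id=1234"}
    print(read_form_data(get_env, io.BytesIO(b"")))
```

In a real CGI program the arguments would typically be os.environ and sys.stdin.buffer.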
HTTP Form Processing
1. user fills in form & clicks submit
2. Browser sends GET http://www.yorku.ca/jhuang/doquery.cgi?name=joe+bloggs&id=1234 (over the internet)
3. server runs the script doquery.cgi passing form data to it
4. server sends script results to Browser
5. Browser displays the script results*
*The script results will usually be HTML text
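What a script might hand back to the server can be sketched as below: a Content-Type header, a blank line, then the HTML text. The function name render_response and the page layout are invented for illustration; they are not the actual doquery.cgi.

```python
import html
from typing import Mapping


def render_response(fields: Mapping[str, str]) -> str:
    """Build a CGI-style response: header, blank line, then an HTML page."""
    # Escape user input so it cannot inject markup into the page.
    rows = "".join(
        f"<li>{html.escape(name)}: {html.escape(value)}</li>"
        for name, value in fields.items()
    )
    body = f"<html><body><h2>Query results</h2><ul>{rows}</ul></body></html>"
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    print(render_response({"name": "joe bloggs", "id": "1234"}))
```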
A More Complex Form Example
Password field
Radio buttons
Drop-down list
Check boxes
Text area
Text field
Buttons
Form Processing & Results The easiest way to deal with form data is to simply email it to an
email address using a mailto form action:
<form method="POST" action="mailto:name@where.com">
More often, we want to process the data on the server-side, using a program or script.
The old way is to use a so-called CGI Script, usually with a URL something like:
<form method="POST" action="/cgi-bin/myscript.cgi">
The newer way is to use a server-side technology such as Java Servlets, or an HTML-embedded scripting language such as JSP or ASP. We’ll look at how to use Servlets later in the course...
Alternatives for Generating Dynamic Pages
Java Servlets
Java Server Pages
Active Server Pages (ASP)
Can pages be generated dynamically in other ways?
Dynamic Web Pages
CGI program
other program
( application )
WWW server
API
WWW client
Java servlet
Java applet
script ( embedded in HTML )
SSI
HTTP
server side
client side
CGI
SGML
Standard Generalized Markup Language Developed by a committee! Led by Charles Goldfarb, 1978-1986 A grammar to define the structure of documents
Rules define the construct or structure Terminals are <tags> and strings
HTML & XML
HTML is an application of SGML with a single shared DTD
HTMLDOC::=(<html> HEAD BODY </html>)
XML is a subset of SGML with many DTDs allowed
XML Uses tags to identify semantics of data looks like HTML, but isn’t
<slide><title>Introduction</title> <author><first>Jimmy</first> <last>Huang</last> </author> <content>XML this and that</content> </slide>
is license free, platform-independent and well-supported
HTML
Hypertext Markup Language
Presents documents via WWW browsers Specifies document layout and hyperlink
Predefines set of tags (ie. Common DTD)
<HTML>
<TITLE>Statistics Canada</TITLE>
<BODY>
<H3>Welcome to Stats Canada</H3>
Statistics Canada ……. .
<p> We like numbers…..
<img src="mapleleaf.gif">
<ul>What we do
<li><a href="census.html">Census</a>
<li><a href="special.html">Special surveys</a>
<li><a href="online.html">Online data</a>
</ul>
</BODY>
</HTML>
HTML: An Example
HTML HTML - Advantages
Simple - fixed set of tags Portable - used with all browsers Linking - within and to external documents
HTML - Disadvantages
Limited tag set Can’t separate the presentation from content Can’t define structure of contents
XHTML Basics
Very few real changes from HTML But more strict
All tags are in lowercase All tags must be closed
Empty tags Paired tags
XHTML tags
Start tags and end tags
Start tags - delimited by < and >
End tags - delimited by </ and > <h1>This is a Large Heading</h1>
<br />This text starts on a new line.
Some start tags also include attributes which further define information about the element.
!DOCTYPE
HTML 3.2
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Draft//EN">
Netscape's HTML standard
<!DOCTYPE HTML PUBLIC "-//WebTechs//DTD Mozilla HTML 2.0//EN">
Not strictly necessary for HTML, but highly recommended
Future browsers can still attempt to display your older documents (written to previous HTML standards) in the way that was originally intended, even though the HTML language may have evolved
XHTML
<?xml version = "1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
!DOCTYPE
<?xml version = "1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- Comments: name_of_webpage.html -->
<html xmlns = "http://www.w3.org/1999/xhtml">
<head> <title> Web Engineering: XHTML I </title> </head>
<body> <p>Welcome to XHTML!</p> </body>
</html>
!DOCTYPE Title tags
Body tags
Images
<?xml version = "1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <!-- Pictures with XHTML --> <html xmlns = "http://www.w3.org/1999/xhtml"> <head> <title>Web Engineering - pictures</title> </head> <body> <p><img src = "angelheart.jpg" height = "251" width = "367" alt = "An angel" /> <img src = "grail.jpg" height = "180" width = "130" alt = "A chalice" /></p> </body> </html>
The value of the src attribute
of the image element is the
location of the image file.
The value of the alt attribute gives a description of the image. This description
is displayed if the image cannot be displayed.
The height and width attributes of the
image element give the height
and width of the image.
Colours
<BODY TEXT="aqua">
<BODY TEXT="#00FF00">
<FONT COLOR = "#rrggbb" | "colour name">text</FONT>
Colour names: aqua, black, blue, fuchsia, gray, green, lime, maroon, navy, olive, purple, red, silver, teal, white, yellow
Hex values: 000000 = BLACK, 00FF00 = BRIGHT-GREEN, FFFFFF = WHITE
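The #rrggbb notation packs the red, green and blue components into three two-digit hexadecimal numbers. A small Python sketch of the decoding (parse_hex_colour is a made-up name):

```python
from typing import Tuple


def parse_hex_colour(value: str) -> Tuple[int, int, int]:
    """Convert "#rrggbb" (or "rrggbb") into (red, green, blue), each 0-255."""
    digits = value.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected six hex digits, got {value!r}")
    # Each pair of hex digits is one colour component.
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return red, green, blue


if __name__ == "__main__":
    for colour in ("#000000", "#00FF00", "#FFFFFF"):
        print(colour, parse_hex_colour(colour))
```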
Inline Styles
<h1 style="color:blue; font-style: italic">First Stylesheet Example</h1>
<p>The first example of stylesheets uses an inline style.</p>
<h1>Second Stylesheet Example</h1>
<p>The second example of stylesheets uses a document-level style.</p>
<h1>Third Stylesheet Example</h1>
<p> The third example of stylesheets uses an external stylesheet.</p>
XML Introduction
The Extensible Markup Language (XML) is a document processing standard proposed by the World Wide Web Consortium (W3C), which is related to Standard Generalised Markup Language (SGML).
Possible to search, sort, manipulate and render XML using Extensible Stylesheet Language (XSL).
Highly portable
Files end in the .xml extension.
XML & W3C • XML has been in development since the 1960s through its parent called SGML (Standard Generalized Markup Language) which is also the parent for HTML
• XML is a streamlined version of SGML designed for transmission of structured data over the Web by a working group in the World Wide Web Consortium (W3C) in 1996
• Passed as W3C standard in Feb 1998
- www.w3.org/xml - www.xml.com/axml/axml.html (annotated version)
XML-related Technologies DTD (Document Type Definition) and XML Schemas are
used to define legal XML tags and their attributes for particular purposes
CSS (Cascading Style Sheets) describe how to display HTML or XML in a browser
XSLT (eXtensible Stylesheet Language Transformations) and XPath are used to translate from one form of XML to another
DOM (Document Object Model), SAX (Simple API for XML), and JAXP (Java API for XML Processing) are all APIs for XML parsing
From HTML to XML.. • HTML major drawback – information loses its structure when translated into HTML
• HTML is a presentation-oriented markup language, so information embodied in it is difficult to process
• Information and knowledge servers are overloaded since we have to search information and perform format processing
• Servers often answer the same request many times if users request several views on the same data
• HTML: - Lacks extensibility – can’t create tags or attributes to parameterise or semantically qualify data - Lacks structure – does not support the specification of deep structures needed to represent database schemas or object-oriented hierarchies - Lacks validation – does not support language specification that lets applications check imported data’s structural validity
XML Goals As a portable, platform independent data storage
• support a wide variety of applications, • easy to use across the Internet, • compatible with SGML, • easy to create programs that process XML, • clear and legible (self-describing), • XML documents should be easy to create • XML designs should be quickly prepared, formal & concise etc.
XML..
• XML is not for displaying information but for managing information.
• A working group of the World Wide Web Consortium (W3C) created XML as a standard for creating markup languages.
• Designed for distributing structured documents over the Web
• A kind of “light” SGML (Standard Generalized Markup Language) simplified to meet Web requirements
• Unlike HTML, XML lets users:
⇒ Extract data from a document
⇒ Define their own tags and attributes
⇒ Define data structures and nest document structures to any complexity level
⇒ Make applications that validate a document’s structure. Any XML document can contain an optional description of its grammar for use by applications that perform structural validation
XML..
The problem that XML helps us to solve is how to transfer data between servers, or between the client and the server.
It is a Markup language for describing structured data – content is separated from presentation.
XML documents contain only data Applications decide how to display the data
Language for creating markup languages Can create new tags
XML documents contain only data, not formatting instructions, so applications that process XML documents must decide how to display the document’s data.
For example a PDA (personal digital assistant) may render an XML document differently than a wireless phone or desktop computer would render that document.
HTML and XML
XML stands for eXtensible Markup Language HTML is used to mark up text so it can be displayed to users
XML is used to mark up data so it can be processed by computers
HTML describes both structure (e.g. <p>, <h2>, <em>) and appearance (e.g. <br>, <font>, <i>)
XML describes only content, or “meaning”
HTML uses a fixed, unchangeable set of tags
In XML, you make up your own tags
XML..
XML is a meta-language
With HTML, existing markup is static: <HEAD> and <BODY>, for example, are tightly integrated into the HTML standard and cannot be changed, and are extremely difficult to extend.
XML, on the other hand, allows you to create your own markup tags and configure each to your liking: for example <WebEngHeading> <WebEngSummary>
<WebEngReallyWildFont>
Each of these elements can be defined through user-defined document type definitions (DTDs), and stylesheets can be applied to one or more XML documents.
There are no ‘correct’ tags for an XML document, except those defined by the author
Some Code
Schema
Entity: Passport Details
SubEntities: Last Name, First Name, Address
Entity: Address
SubEntities: Street, City, Town, State, Province, ……..
<!ELEMENT passport_details (last_name,first_name+,address)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT address (street,(city|town),(state|province),(ZIP|postal_code),country,contact_no?,email*)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT town (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT province (#PCDATA)>
<!ELEMENT ZIP (#PCDATA)>
<!ELEMENT postal_code (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ELEMENT contact_no (#PCDATA)>
<!ELEMENT email (#PCDATA)>
DTD
Internal DTD and Instance
<?xml version='1.0'?>
<!DOCTYPE passport_details [
<!ELEMENT passport_details (last_name,first_name+,address)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT address (street,(city|town),(state|province),(ZIP|postal_code),country,contact_no?,email*)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT town (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT province (#PCDATA)>
<!ELEMENT ZIP (#PCDATA)>
<!ELEMENT postal_code (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ELEMENT contact_no (#PCDATA)>
<!ELEMENT email (#PCDATA)>
]>
<passport_details>
<last_name>Smith</last_name>
<first_name>Jo</first_name>
<first_name>Stephen</first_name>
<address>
<street>1 Great Street</street>
<city>GreatCity</city>
<state>GreatState</state>
<postal_code>1234</postal_code>
<country>GreatLand</country>
<email>jhuang@yorku.ca</email>
</address>
</passport_details>
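To see what the content models (sequence ",", choice "|", and the "+", "?", "*" occurrence marks) actually demand, they can be rewritten by hand as regular expressions over the list of child tag names. The Python sketch below does this for the two non-leaf elements only. It is not a DTD validator (Python's standard xml.etree.ElementTree parser does not validate), and the names CONTENT_MODELS, children_match and conforms are invented for the example.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of the two DTD content models into regular
# expressions; every child tag name is followed by one space.
CONTENT_MODELS = {
    "passport_details": r"last_name (first_name )+address ",
    "address": (
        r"street (city |town )(state |province )(ZIP |postal_code )"
        r"country (contact_no )?(email )*"
    ),
}


def children_match(element: ET.Element) -> bool:
    """Check one element's children against its content model, if any."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return True  # (#PCDATA) leaf elements are not checked in this sketch
    sequence = "".join(child.tag + " " for child in element)
    return re.fullmatch(model, sequence) is not None


def conforms(xml_text: str) -> bool:
    """Return True when every element's children follow the models above."""
    root = ET.fromstring(xml_text)
    return all(children_match(element) for element in root.iter())


GOOD = (
    "<passport_details><last_name>Smith</last_name>"
    "<first_name>Jo</first_name><first_name>Stephen</first_name>"
    "<address><street>1 Great Street</street><city>GreatCity</city>"
    "<state>GreatState</state><postal_code>1234</postal_code>"
    "<country>GreatLand</country><email>jhuang@yorku.ca</email>"
    "</address></passport_details>"
)

if __name__ == "__main__":
    print(conforms(GOOD))
    # first_name+ requires at least one first_name, so removing both fails.
    print(conforms(GOOD.replace("<first_name>Jo</first_name>", "")
                       .replace("<first_name>Stephen</first_name>", "")))
```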
Shared DTD
XML Document specifies the DTD
<?xml version='1.0'?>
<!DOCTYPE passport_details SYSTEM "PassportExt.dtd">
<passport_details>
<last_name>Smith</last_name>
<first_name>Jo</first_name>
<first_name>Stephen</first_name>
<address>
<street>1 Great Street</street>
<city>GreatCity</city>
<state>GreatState</state>
<postal_code>1234</postal_code>
<country>GreatLand</country>
<email>jo@theworldaccordingtojo.com</email>
</address>
</passport_details>
XML Examples
XML Source File http://www.yorku.ca/jhuang/xml/04.adhoc.topics.xml
XML Style language
http://www.yorku.ca/jhuang/xml/04.adhoc.topics.xsl
Parsing and rendering XML with IE5+
http://www.yorku.ca/jhuang/xml/04.adhoc.topics_xsl.xml
XML Applications XML permits document authors to create markup for
virtually any type of information.
Authors can create entirely new markup languages for describing specific types of data, including mathematical formulas, chemical molecular structures, music, recipes etc.
- XHTML
- VoiceXML (for speech)
- MathML (for mathematics)
- SMIL (the Synchronized Multimedia Integration Language, for multimedia presentations)
- CML (Chemical Markup Language, for chemistry)
- XBRL (Extensible Business Reporting Language, for financial data exchange)
XML Parsers
Processing an XML document requires a software program called an XML parser (or processor). These are available at no charge in many languages (Java, Python, C++ etc.).
http://www.xml.com/programming/
Parsers check an XML document’s syntax and enable software programs to process marked-up data. XML parsers can support the Document Object Model (DOM) or the Simple API for XML (SAX).
DOM: Build a tree structure containing the XML document’s data
SAX: Process the document and generate events
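The difference between the two styles can be seen with Python's standard library, used here purely as an illustration on the slide example from earlier. The helper names title_via_dom, TagCollector and tags_via_sax are made up for this sketch.

```python
import xml.dom.minidom
import xml.sax

SLIDE_XML = (
    "<slide><title>Introduction</title>"
    "<author><first>Jimmy</first><last>Huang</last></author>"
    "<content>XML this and that</content></slide>"
)


def title_via_dom(text: str) -> str:
    """DOM: build the whole tree first, then navigate it."""
    document = xml.dom.minidom.parseString(text)
    title = document.getElementsByTagName("title")[0]
    return title.firstChild.data


class TagCollector(xml.sax.ContentHandler):
    """SAX: receive one event per start tag while the parser reads."""

    def __init__(self) -> None:
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


def tags_via_sax(text: str) -> list:
    handler = TagCollector()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.tags


if __name__ == "__main__":
    print(title_via_dom(SLIDE_XML))
    print(tags_via_sax(SLIDE_XML))
```

DOM keeps the full document in memory, which is convenient for navigation; SAX only reports events as it reads, which suits large documents.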
XML-related Vocabulary
SGML: Standard Generalized Markup Language
XML: Extensible Markup Language
DTD: Document Type Definition
element: a start and end tag, along with their contents
attribute: a name-value pair given in the start tag of an element
entity: a representation of a particular character or string
PI: a Processing Instruction, to possibly be used by a program that processes this XML
namespace: a unique identifier (typically a URI) used to qualify element and attribute names
well-formed XML: XML that follows the basic syntax rules
valid XML: well-formed XML that conforms to a DTD
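Well-formedness can be checked by any XML parser, since a parser must reject text that breaks the basic syntax rules. A minimal Python sketch (is_well_formed is a made-up name; checking validity against a DTD would need a validating parser, which this is not):

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True when the text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed("<p>Welcome to XHTML!<br /></p>"))
    # An unclosed <br> is fine in old HTML but not well-formed XML.
    print(is_well_formed("<p>Welcome to XHTML!<br></p>"))
```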