- To study what the browser sends, we run a fake web server on a machine and submit an HTTP request containing form data to it. This lets us examine the format of the HTTP message sent by a browser.
- The example is a simple cookie order form, orderForm2.html, with the cgi-bin scripts processOrder.pl and confirmOrder.pl. You can try it at http://cs.uccs.edu/~cs401/hw/hw1/orderForm2.html
- For exercise 1, we will modify a similar form, orderForm.html.
- Telnet to cs.uccs.edu. You can carry out this assignment remotely.
- Copy ~cs401/public_html/hw/hw1/orderForm.html to your public_html directory. Assuming your login is fewatson, use the following command:
cp ~cs401/public_html/hw/hw1/orderForm.html ~fewatson/public_html
- After you copy orderForm.html, replace the following line
<FORM METHOD="POST" ACTION="http://cs.uccs.edu:8345/cgi-bin/cs301/processOrder.pl">
with
<FORM METHOD="POST" ACTION="http://cs.uccs.edu:8352/cgi-bin/cs301/processOrder.pl">
where we assume that 352 are the last 3 digits of your SS# and 8352 is the port number the fake web server will use.
- Since we are not the super user, we cannot run a web server on port 80, so we run it on a port number > 1024.
- Change directory by entering the "cd ~cs401/bin" command.
- Start the fake web server by executing "ws 8<last 3 digits of your SS#>". Here 8<last 3 digits of your SS#> is the port number on which the fake web server process listens for incoming HTTP requests. If the last 3 digits of your SS# are 352, enter "ws 8352" as the command. The program will wait for a web client to send a request.
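To make the role of the fake web server concrete, here is a minimal Python sketch of a program that listens on a port and prints whatever bytes a client sends. It is not the source of the ws program; the names describe_chunk and serve_once are made up for illustration, and unlike ws it never prompts for a reply.

```python
import socket


def describe_chunk(data: bytes) -> str:
    """Format one received chunk in the style of the trace shown in this handout."""
    # iso-8859-1 maps every byte to a character, so decoding never fails.
    text = data.decode("iso-8859-1")
    return f"rcvd msg {len(data)} bytes-->\n{text}"


def serve_once(port: int) -> None:
    """Accept one connection on `port` and print each chunk the client sends."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("", port))   # a port above 1024 needs no super user rights
        listener.listen(1)
        conn, _addr = listener.accept()
        with conn:
            while True:
                data = conn.recv(4096)
                if not data:        # the client closed the connection
                    break
                print(describe_chunk(data))
```

Under these assumptions, calling serve_once(8352) would play the part of "ws 8352". Because the sketch never sends a reply, the browser keeps waiting until the connection is closed.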
- In a web browser, type "http://cs.uccs.edu/~fewatson/orderForm.html" as the URL.
- You will receive the order form web page. Enter the information in the text areas and click the order button.
- On the fake server, you will observe incoming requests similar to
cs.uccs.edu> ws 8345
argv[1]=8345
portno=8345
before getsockname socket has port #8345
socket has port #8345
rcvd msg 348 bytes-->
POST /cgi-bin/cs401/processOrder.pl HTTP/1.0
Referer: http://www.uccs.edu/~cs401/hw/hw1/orderForm.html
Connection: Keep-Alive
User-Agent: Mozilla/4.7 [en] (WinNT; I)
Host: cs.uccs.edu:8345
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
Accept-Encoding: gzip
Accept-Language: en
Accept-Charset: iso-8859-1,*,utf-8
What is your reply ("$" to break connection)?
rcvd msg 351 bytes-->
Content-type: application/x-www-form-urlencoded
Content-length: 277
butter=1&chocolate=1&mint=3&mm=5&peanut=0&name=C.+Edward+Chow&email=chow%40cs.uc
cs.edu&telephone=%28719%29262-3110&creditCardType=Visa&creditCardNo=3333+2222+44
44+5555&expireDate=02%2F02&shippingAddress=ENS+186%0D%0AUCCS%0D%0AColorado+Sprin
gs%2C+CO+80918%0D%0AUSA&.submit=order
What is your reply ("$" to break connection)?
- Note that the request arrives in two pieces: either it was sent in two packets, or only part of it was read by the first read.
- The message consists of two parts. The part from POST to the empty line after "Content-length: 277" is the header of the HTTP message. The rest, from "butter" to "order", is the data of the HTTP message. There are 277 bytes in the data portion of the HTTP request.
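Because a request may arrive in more than one read, a server has to keep reading until it has the empty line and then as many data bytes as Content-length announces. The following Python sketch shows one way to split a message and to check whether it is complete. The function names are hypothetical and the sample bytes in the usage note are shortened, made-up data rather than the trace above.

```python
def split_http_message(raw: bytes):
    """Split at the first empty line (CRLF CRLF) into (header, body).

    Returns None while the empty line has not been received yet.
    """
    header, separator, body = raw.partition(b"\r\n\r\n")
    if not separator:
        return None
    return header, body


def content_length(header: bytes) -> int:
    """Return the value of the Content-length field, or 0 if it is absent."""
    for line in header.split(b"\r\n")[1:]:      # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            return int(value.strip())
    return 0


def is_complete(raw: bytes) -> bool:
    """True once the header and all announced data bytes have arrived."""
    parts = split_http_message(raw)
    if parts is None:
        return False
    header, body = parts
    return len(body) >= content_length(header)
```

For example, if the first read returns only header lines and the second read returns "Content-length: 11", the empty line, and 11 bytes of data, is_complete is false after the first read and true after the second.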
- The HTTP header has many "fields" separated by CRLF (carriage return and line feed characters).
- The first line in the HTTP request is called the request line. It has the syntax
Request-Line = Method SP Request-URI SP HTTP-Version CRLF
In our example, it is
"POST /cgi-bin/cs401/processOrder.pl HTTP/1.0"
The request Method is POST. Other popular methods include GET, PUT, and DELETE.
The Request-URI is /cgi-bin/cs401/processOrder.pl
- The rest of the header lines are called meta headers or header fields. Each header field has a meta header name and an associated value, separated by ':'.
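As an illustration of the request line syntax and the name/value structure of the header fields, the following Python sketch parses a request header into its parts. The function name parse_request_header is an assumption made for this example, and the sketch ignores details such as header lines continued over several lines.

```python
def parse_request_header(header: str):
    """Parse a request header into (method, request_uri, http_version, fields).

    `header` is the text from the request line up to, but not including,
    the empty line. Lines are separated by CRLF.
    """
    lines = header.split("\r\n")
    # Request-Line = Method SP Request-URI SP HTTP-Version
    method, request_uri, http_version = lines[0].split(" ")
    fields = {}
    for line in lines[1:]:
        if not line:
            continue
        # Split at the first ':' only, since values such as
        # "cs.uccs.edu:8345" contain a ':' themselves.
        name, _, value = line.partition(":")
        fields[name.strip().lower()] = value.strip()
    return method, request_uri, http_version, fields
```

Field names are stored in lower case here because HTTP header field names are case-insensitive.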
- The definitions of these meta headers can be found at http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.
- The form data submitted by the web browser is encoded in a special format called x-www-form-urlencoded, where most special characters are encoded as a 3-character sequence: '%' followed by a 2-digit hexadecimal code.
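To see the encoding at work, the Python standard library can decode and encode x-www-form-urlencoded data. The sketch below decodes two fields taken from the sample request above (with the terminal line wrap in the email address removed). The variable names are chosen for this example only.

```python
from urllib.parse import parse_qs, urlencode

# Two name=value pairs from the sample request, joined by '&'.
body = "email=chow%40cs.uccs.edu&expireDate=02%2F02"

# parse_qs returns a dictionary that maps each name to a list of decoded values.
fields = parse_qs(body)
email = fields["email"][0]
expire_date = fields["expireDate"][0]

# urlencode performs the opposite conversion.
encoded_again = urlencode({"email": email, "expireDate": expire_date})
```

You can use the same two functions on other fields of the sample request to check your answers to the questions below.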
- For details of the HTTP 1.1 protocol specification, read http://www.w3.org/Protocols/rfc2616/rfc2616.html
- Questions:
- 1.1. How is the space character encoded? What character or character sequence is sent by the browser to the web server?
- 1.2. How is '(' encoded?
- 1.3. The above request example was submitted by Netscape Navigator version 4.7. Use Microsoft IE to request http://cs.uccs.edu:8345/~cs401/ and document the differences between the two request messages, in particular the Method used in the request line and the number and types of header fields. Is there a message body?
- Create a hw1part1.htm web page that includes the answers to the above questions.
Exercise 2. HTTP response
- We can use the telnet command to study how the web server responds to an HTTP request.
- Log in to a CS Unix machine, such as redcloud or wetterhorn.
Execute "telnet cs.uccs.edu 80"
Here 80 is the port number of the server process that the telnet client will try to connect to. If we do not specify the port number, telnet connects to the default port 23.
You will see the web server machine reply with its address and system ID. Then type the following two lines:
GET / HTTP/1.0<CR>
<CR>
Here <CR> is the Enter key. Note that GET has to be all upper case. There should be a space character between the GET command and each of its two parameters, i.e., the URI and the protocol version number.
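The same two-line request can also be sent from a program instead of telnet. The Python sketch below builds the request bytes and sends them over a TCP connection. The names build_request and fetch are hypothetical, and the sketch assumes the server closes the connection after its reply, which is the usual behavior for HTTP/1.0.

```python
import socket


def build_request(method, uri, version="HTTP/1.0", headers=None) -> bytes:
    """Build the request line, optional header fields, and the final empty line."""
    lines = [f"{method} {uri} {version}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Each line ends with CRLF; one extra CRLF marks the end of the header.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch(host: str, port: int, request: bytes) -> bytes:
    """Send `request` to host:port and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:            # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

For example, fetch("cs.uccs.edu", 80, build_request("GET", "/")) would correspond to typing the two lines above into "telnet cs.uccs.edu 80".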
You can try "telnet cs.uccs.edu 80 > response1" and repeat the typing of the above two-line request. You will not see the web server response, only your HTTP request, but the request and response will all be saved in a file named response1.
See the returned web page in the response1 file. Identify the header and payload of the returned HTTP response.
Create a hw1part2.htm web page that includes the content of the response1 file and indicates the header/payload boundary of the returned HTTP response.
- You should see the following HTTP response from the web server
Trying 128.198.162.68...
Connected to cs.uccs.edu.
Escape character is '^]'.
GET / HTTP/1.0
HTTP/1.1 200 OK
Date: Tue, 26 Feb 2002 04:17:06 GMT
Server: Apache/1.3.22 (Unix) (Red-Hat/Linux) mod_python/2.7.6 Python/1.5.2 mod_ssl/2.8.5 OpenSSL/0.9.6b DAV/1.0.2 PHP/4.0.6 mod_perl/1.24_01 mod_throttle/3.1.2
Last-Modified: Thu, 01 Nov 2001 20:51:45 GMT
ETag: "8d4f-b4a-3be1b5e1"
Accept-Ranges: bytes
Content-Length: 2890
Connection: close
Content-Type: text/html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"><HTML>
<HEAD>
<TITLE>Test Page for the Apache Web Server on Red Hat Linux</TITLE>
</HEAD>
- Similar to the HTTP request, the HTTP response message has a header followed by a message body (an HTML document in this case).
- The first line of the HTTP response is called the Status-Line and has the following syntax:
Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
In the above response, the status code is 200, which means the server was successful in retrieving the document. For other status codes, see http://www.w3.org/Protocols/rfc2616/rfc2616-sec6.html#sec6 or http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.1
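The Status-Line syntax and the header/body boundary can be illustrated with a short Python sketch that takes a raw response apart. The function name parse_response is an assumption for this example. The input is expected to be the bytes sent by the server only, without the "Trying..." and "Connected..." lines printed by telnet.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into status line parts, header lines, and body."""
    # The first empty line (CRLF CRLF) separates the header from the message body.
    header, _, body = raw.partition(b"\r\n\r\n")
    lines = header.decode("iso-8859-1").split("\r\n")
    # Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase
    # The reason phrase may contain spaces, so split at most twice.
    http_version, status_code, reason_phrase = lines[0].split(" ", 2)
    return http_version, int(status_code), reason_phrase, lines[1:], body
```

Applied to the response shown above, this would return "HTTP/1.1", 200, and "OK", followed by the header lines from Date to Content-Type and the HTML document as the body.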
- Questions
- 2.1. Try "telnet csweb.uccs.edu 80" and type in the same "GET / HTTP/1.0<CR><CR>" request. Note that here we are accessing the csweb server, not the cs server. From the Server header field, can you tell which company's web server is running there?
- 2.2. Try "telnet cs.uccs.edu 80" with the "HEAD / HTTP/1.0<CR><CR>" request. Compare the HTTP response with the one above. What is the main difference?
- 2.3. Try "telnet cs.uccs.edu 80" with the "GET / HTTP/1.0<CR>If-Modified-Since: Fri, 14 Jun 2002 07:39:58 GMT<CR><CR>" request.
- What status code is returned?
- Explain how this can be used in a cache server.
- 2.4. Try "telnet cs.uccs.edu 80" with the "GET / HTTP/1.0<CR>If-Modified-Since: Thu, 19 Oct 2000 17:00:50 GMT<CR><CR>" request.
- What status code is returned?
- Why is this different from the previous request?
- You can use "telnet cs.uccs.edu 80 > response2" to save the returned response in a file called response2. <CR> is the "Enter" key. It generates the CRLF character sequence.
- Include the above results in your hw1part2.htm web page.
Exercise 3. Web Page Creation
Email the URLs of hw1part1.htm and hw1part2.htm to chow@cs.uccs.edu
Hint on HW#1:
Here is a short PowerPoint presentation on Content Delivery Networks (CDN). It gives a brief introduction to the use of caches (client cache, client-side cache server, edge cache server, and mirror server). It should help you understand 2.3 better.
Q&A on HW#1:
Change the access rights of your directories to allow web server access.
-----Original Message-----
From: jing jacobs [mailto:jingjacobs@yahoo.com]
Sent: Friday, August 24, 2001 6:38 PM
To: chow@cs.uccs.edu
Subject: questions about homwork
Dear professor,
I'm working on exercise 1 of hw1 of cs522 at home. I first log on to brain.uccs.edu, from there log on to wetterhorn by typing "telnet wetterhorn.uccs.edu", then created a directory "public_html" under "students/[my login name]", then copied the orderForm.html to my "public_html" directory. Then I just followed the instructions on the handout, but when I typed the http request, I got a message that says the page cannot be found. What do you think could possibly have happened?
Thank you
Jing Yang
- Jing,
I just checked your home directory.
You need to change the access rights of your directories to allow the web server, which runs under a different account, to access the web pages in your public_html directory. From your home directory, use "chmod 755 ../jyang" and "chmod 755 public_html" to change the access rights.
The access rights for your home directory are 700 or, as in the listing below, 711:
cs.uccs.edu> ls -al /usr2/students/ | grep jyang
drwx--x--x 23 jyang jyang 4096 Aug 24 19:45 jyang/
The first d indicates that this is a directory.
The first rwx indicates that the owner of the directory has r (read), w (write), and x (execute) rights.
The second 3-letter code indicates that the group (jyang) has only the x (execute) right, which for a directory means permission to enter it but not to list its contents.
The third 3-letter code indicates that other users likewise have only the x (execute) right.
After applying "chmod 755 ../jyang", you should see
cs.uccs.edu> ls -al
drwxr-xr-x 23 jyang jyang 4096 Aug 24 19:45 jyang/
The web server needs the r and x rights to be able to look into your home directory.
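To check how an octal mode such as 755 maps to the letters printed by ls, you can use a small Python sketch like the following. The function name directory_mode_string is made up for this example; stat.filemode is part of the Python standard library.

```python
import stat


def directory_mode_string(mode: int) -> str:
    """Return the ls-style permission string for a directory with the given mode."""
    # S_IFDIR marks the entry as a directory, which produces the leading 'd'.
    return stat.filemode(stat.S_IFDIR | mode)
```

Each octal digit covers one 3-letter code: r counts 4, w counts 2, and x counts 1. Calling directory_mode_string(0o711) and directory_mode_string(0o755) reproduces the permission strings of the two listings above.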
Let me know if you have any problems.
Edward