School of Computing and Information Systems
COMP30023: Computer Systems
Practical Week 8
1 HTTP and HTML
In this section we will use our browser to view HTTP headers.
Instructions will be provided for Firefox and Chromium-based browsers. You are also welcome to complete the
lab using a different browser (e.g. Safari, non-Chromium Edge), though your demonstrator will not be able to
help you in this regard.
1. Firefox: From the menu in the top-right of the browser, select Web Developer near the bottom, and
then Toggle Tools near the top of the opened menu.
Chrome: From the menu in the top-right of the browser, hover over More Tools near the bottom, and
then select Developer tools from the bottom.
2. Select the Network tab from the top bar, and visit the URL https://www.google.com.au.
3. This will show one line per web request/response. Select one; the request and response headers will be
shown at the bottom right.
1.1 Using cURL
If you prefer using the command line, you can also use the curl tool.
First, you may have to install the curl tool: $ sudo apt install curl
You can perform a GET request by issuing the following command:
$ curl -s -vv google.com
* Trying 142.250.66.238:80
* TCP_NODELAY set
* Connected to google.com (142.250.66.238) port 80 (#0)
> GET / HTTP/1.1
> Host: google.com
> User-Agent: curl/7.68.0
> Accept: */*
* Mark bundle as not supporting multiuse
< HTTP/1.1 301 Moved Permanently
< Location: http://www.google.com/
< Content-Type: text/html; charset=UTF-8
< Date: Fri, 12 Mar 2021 02:58:57 GMT
< Expires: Sun, 11 Apr 2021 02:58:57 GMT
< Cache-Control: public, max-age=2592000
< Server: gws
< Content-Length: 219
< X-XSS-Protection: 0
< X-Frame-Options: SAMEORIGIN
301 Moved
The document has moved
here.
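The request that cURL sends in the output above is only a few lines of text. As a rough illustration, the following Python sketch rebuilds those bytes. The function name build_get_request and its parameters are assumptions made for this example, and the default User-Agent simply copies the value from the sample output.

```python
def build_get_request(host: str, path: str = "/",
                      user_agent: str = "curl/7.68.0") -> bytes:
    """Return the bytes of a minimal HTTP/1.1 GET request.

    Hypothetical helper: it mirrors the four request lines in the
    sample cURL output and nothing more.
    """
    lines = [
        f"GET {path} HTTP/1.1",       # request line: method, target, version
        f"Host: {host}",              # required in HTTP/1.1
        f"User-Agent: {user_agent}",
        "Accept: */*",
        "",                           # empty line ends the header section
        "",
    ]
    # HTTP header lines are terminated by CRLF, not a bare newline.
    return "\r\n".join(lines).encode("ascii")


request = build_get_request("google.com")
print(request.decode("ascii"))
```

Writing these bytes to a TCP connection on port 80 of the host is, in essence, what the command above does before it reads the response.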
The > symbol at the start of a line marks a request header line that is being sent to the destination server,
while the < symbol marks a response header line that is being received on your host. Lines starting with *
are informational messages from cURL itself. Using cURL is very common practice when debugging HTTP API
calls or solving any protocol messaging issue that relies on HTTP. You will notice that the response body
does not actually contain the HTML page for google.com, because the server replied with a 301 Moved
Permanently response code.
You can ask cURL to follow redirects:
$ curl -s -vv -L google.com
The -L flag tells cURL to follow redirects. The -vv flag increases the verbosity of cURL to provide more
information. The -s flag suppresses the progress meter.
The output of this command is shown below. You can see that it now contains the body, but cURL does not
fetch the components of the page that are not part of the HTML.
$ curl -s -vv -L google.com
* Trying 142.250.66.238:80...
* TCP_NODELAY set
* Connected to google.com (142.250.66.238) port 80 (#0)
> GET / HTTP/1.1
> Host: google.com
> User-Agent: curl/7.68.0
> Accept: */*
* Mark bundle as not supporting multiuse
< HTTP/1.1 301 Moved Permanently
< Location: http://www.google.com/
< Content-Type: text/html; charset=UTF-8
< Date: Fri, 12 Mar 2021 03:01:08 GMT
< Expires: Sun, 11 Apr 2021 03:01:08 GMT
< Cache-Control: public, max-age=2592000
< Server: gws
< Content-Length: 219
< X-XSS-Protection: 0
< X-Frame-Options: SAMEORIGIN
* Ignoring the response-body
* Connection #0 to host google.com left intact
* Issue another request to this URL: 'http://www.google.com/'
* Trying 142.250.76.100:80...
* TCP_NODELAY set
* Connected to www.google.com (142.250.76.100) port 80 (#1)
> GET / HTTP/1.1
> Host: www.google.com
> User-Agent: curl/7.68.0
> Accept: */*
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Fri, 12 Mar 2021 03:01:08 GMT
< Expires: -1
< Cache-Control: private, max-age=0
< Content-Type: text/html; charset=ISO-8859-1
< P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
< Server: gws
< X-XSS-Protection: 0
< X-Frame-Options: SAMEORIGIN
< Set-Cookie: 1P_JAR=2021-03-12-03; expires=Sun, 11-Apr-2021 03:01:08 GMT; path=/; domain=.google.com; Secure
< Set-Cookie: NID=211=oXUedXlqFeEqBXzfzqMBFhIVdnG9-J5jiAJ6iYqB3v6eGM6iCn7cXE7o5fDsOLT8i1uD9drWjf4pfjW-jInJJag-eEidIiLLoayv0yrJ8Eizu6tPQZe6fOdOwg8wAYzOH6ljcc0p9Pf8WF69mRflhVHrlGOxPzaGnpwzKvOUsCQ; expires=Sat, 11-Sep-2021 03:01:08 GMT; path=/; domain=.google.com; HttpOnly
< Accept-Ranges: none
< Vary: Accept-Encoding
< Transfer-Encoding: chunked
Rest of HTML body
You are welcome to explore cURL further by running $ man curl
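Because every line of the verbose output starts with a marker, it is easy to process mechanically. The sketch below is one possible way to group the lines and list the status code and Location header of each response. The names split_verbose and redirect_chain are invented for this example, the sample text is a shortened copy of the output above, and the parsing is a heuristic rather than a complete parser.

```python
def split_verbose(output: str) -> dict:
    """Group lines of verbose cURL output by their leading marker."""
    groups = {"info": [], "request": [], "response": [], "other": []}
    for line in output.splitlines():
        if line.startswith("* "):
            groups["info"].append(line[2:])       # cURL's own messages
        elif line.startswith("> "):
            groups["request"].append(line[2:])    # sent to the server
        elif line.startswith("< "):
            groups["response"].append(line[2:])   # received from the server
        elif line.strip() not in ("", ">", "<"):
            groups["other"].append(line)          # e.g. body text
    return groups


def redirect_chain(response_lines):
    """Return one (status_code, location) pair per response, in order."""
    chain = []
    for line in response_lines:
        if line.startswith("HTTP/"):
            # Status line, e.g. "HTTP/1.1 301 Moved Permanently"
            chain.append([int(line.split()[1]), None])
        elif line.lower().startswith("location:") and chain:
            chain[-1][1] = line.split(":", 1)[1].strip()
    return [tuple(item) for item in chain]


# Shortened copy of the sample output, for illustration only.
sample = """\
* Connected to google.com (142.250.66.238) port 80 (#0)
> GET / HTTP/1.1
> Host: google.com
< HTTP/1.1 301 Moved Permanently
< Location: http://www.google.com/
* Issue another request to this URL: 'http://www.google.com/'
> GET / HTTP/1.1
> Host: www.google.com
< HTTP/1.1 200 OK
< Content-Type: text/html; charset=ISO-8859-1
"""

groups = split_verbose(sample)
print(redirect_chain(groups["response"]))
# [(301, 'http://www.google.com/'), (200, None)]
```

The result shows the same sequence that -L follows: a 301 pointing at http://www.google.com/, then a 200 with no further redirect.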
1.2 Using wget
Similarly, you can use wget to download web resources.
e.g. $ wget https://www.google.com
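wget can also download everything a page needs in order to display correctly: -p fetches page requisites
such as images and stylesheets, and -k converts links in the downloaded files so that they work locally.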
e.g. $ wget -p -k https://cis.unimelb.edu.au
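To get a feel for what counts as a page requisite, the following Python sketch scans an HTML document for images, scripts, stylesheets and icons and resolves their URLs. It is only an approximation written for this lab, not a description of how wget works internally. The class name RequisiteFinder, the sample HTML and the example.com URLs are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RequisiteFinder(HTMLParser):
    """Collect URLs of resources a browser would fetch to render a page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.resources.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            # Only some <link> elements are fetched for rendering.
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel or "icon" in rel:
                self.resources.append(urljoin(self.base_url, attrs["href"]))


# Hypothetical page used only for this example.
page = """<html><head>
<link rel="stylesheet" href="/css/site.css">
<script src="https://cdn.example.com/app.js"></script>
</head><body>
<img src="images/logo.png" alt="logo">
<a href="/about">About</a>
</body></html>"""

finder = RequisiteFinder("https://example.com/index.html")
finder.feed(page)
for url in finder.resources:
    print(url)
```

The ordinary hyperlink to /about is not listed, since following it is navigation rather than part of rendering the current page.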
1.3 Questions to ask yourself
In a new tab, look up the meaning of each header.
You may find the site https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers useful, or you can
search with your favourite search engine.
Look at the page. How many bytes do you think it would take to implement this page? Compare your estimate
with the sum of the sizes of the requests shown in the Network tab.
Note how many requests there are for such a simple page. Can you identify what each request is for? Which
ones do you think are cacheable?
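One way to reason about cacheability is to read the Cache-Control, Date and Expires headers of each response. The sketch below estimates how long a response may be reused, using header values copied from the cURL output earlier in this section. It is deliberately simplified: the function names are invented for this example, no-cache is treated as "not reusable" even though it really means "revalidate first", and other rules that real caches apply are ignored.

```python
from email.utils import parsedate_to_datetime


def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control value into {directive: argument-or-None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name] = arg.strip('"') or None
    return directives


def freshness_lifetime(headers: dict) -> int:
    """Return the number of seconds a response may be reused (0 = none)."""
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in cc or "no-cache" in cc:
        return 0  # simplification, see the note above
    if cc.get("max-age") is not None:
        return int(cc["max-age"])
    if "Expires" in headers and "Date" in headers:
        try:
            delta = (parsedate_to_datetime(headers["Expires"])
                     - parsedate_to_datetime(headers["Date"]))
        except (TypeError, ValueError):
            return 0  # invalid date such as "Expires: -1"
        return max(0, int(delta.total_seconds()))
    return 0


# Header values taken from the 301 and 200 responses shown earlier.
redirect = {"Cache-Control": "public, max-age=2592000",
            "Date": "Fri, 12 Mar 2021 02:58:57 GMT",
            "Expires": "Sun, 11 Apr 2021 02:58:57 GMT"}
homepage = {"Cache-Control": "private, max-age=0",
            "Date": "Fri, 12 Mar 2021 03:01:08 GMT",
            "Expires": "-1"}

print(freshness_lifetime(redirect) // 86400, "days")   # 30 days
print(freshness_lifetime(homepage), "seconds")         # 0 seconds
```

By this estimate the redirect may be reused for 30 days, while the home page itself must be fetched again every time.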
How many different domain names are used to build this page?
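If you copy the URL and size of each request out of the Network tab, counting domains and summing sizes takes only a few lines. The following is a minimal sketch. The function name summarise is invented here, and the URLs and byte counts are placeholders, not measurements of any real page.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def summarise(requests):
    """Map each hostname to (number_of_requests, total_bytes)."""
    per_host = defaultdict(lambda: [0, 0])
    for url, size in requests:
        host = urlsplit(url).hostname
        per_host[host][0] += 1
        per_host[host][1] += size
    return {host: tuple(counts) for host, counts in per_host.items()}


# Placeholder data: replace with the rows from your own Network tab.
observed = [
    ("https://www.example.com/", 14200),
    ("https://www.example.com/images/logo.png", 5300),
    ("https://static.example.net/js/app.js", 48100),
    ("https://static.example.net/css/site.css", 9700),
]

summary = summarise(observed)
total_bytes = sum(size for _, size in observed)

print(len(summary), "domain names,", total_bytes, "bytes in total")
for host, (count, size) in summary.items():
    print(f"{host}: {count} requests, {size} bytes")
```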
1.4 Viewing the HTML
In a new window, open https://www.google.com again. Press CTRL+U. This will bring up the page source
in a new tab.
Search for the string