Practice Exercise: Browsers, wget, and curl
Objectives
- Learn how to use web browsers, wget, and curl to access web content from the command line.
- Understand the differences between these tools and their practical applications.
Scenario
As a Linux enthusiast, you'll often encounter scenarios where you need to access web content or download files directly from the command line. This exercise will help you become proficient in using web browsers, wget, and curl for various web-related tasks.
Tasks
Task 1: Using wget
- Open a terminal.
- Use wget to download a file from a URL. For example: wget https://example.com
- Observe the download progress and the downloaded file.
[intern@intern-a1t-inf-lnx1 ~]$ wget https://example.com
--2023-09-19 21:02:24--  https://example.com/
Resolving example.com (example.com)... 93.184.216.34, 2606:2800:220:1:248:1893:25c8:1946
Connecting to example.com (example.com)|93.184.216.34|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1256 (1.2K) [text/html]
Saving to: ‘index.html’

index.html          100%[================================>]   1.23K  --.-KB/s    in 0s

2023-09-19 21:02:25 (51.3 MB/s) - ‘index.html’ saved [1256/1256]
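Beyond the basic invocation, wget has options that control how and where the file is saved. The commands below are an optional sketch using standard wget flags; the local filename is just an illustrative choice:

# Save the page under a chosen name instead of the default index.html
wget -O example.html https://example.com

# Suppress progress output (handy in scripts)
wget -q -O example.html https://example.com

# Resume an interrupted download of the same URL
wget -c https://example.com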
Task 2: Downloading Files with curl
- In the terminal, use curl to fetch the same URL as in the previous task. For example: curl https://example.com (to save the response to a file named after the remote file, add -O, e.g. curl -O https://example.com/sample.txt).
- Compare the curl download process to wget.

[intern@intern-a1t-inf-lnx1 ~]$ curl !$
curl https://example.com
<!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <style type="text/css">
    body {
        background-color: #f0f0f2;
        margin: 0;
        padding: 0;
        font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
    }
    div {
        width: 600px;
        margin: 5em auto;
        padding: 2em;
        background-color: #fdfdff;
        border-radius: 0.5em;
        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);
    }
    a:link, a:visited {
        color: #38488f;
        text-decoration: none;
    }
    @media (max-width: 700px) {
        div {
            margin: 0 auto;
            width: auto;
        }
    }
    </style>
</head>

<body>
<div>
    <h1>Example Domain</h1>
    <p>This domain is for use in illustrative examples in documents. You may use this
    domain in literature without prior coordination or asking for permission.</p>
    <p><a href="https://www.iana.org/domains/example">More information...</a></p>
</div>
</body>
</html>
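As the transcript shows, curl writes the response body to standard output by default, whereas wget saves it to a file. A few optional variations using standard curl flags (the local filename below is only an example):

# Write the response to a file you choose
curl -o example.html https://example.com

# Include the HTTP response headers in the output
curl -i https://example.com

# Silent mode: no progress meter or error messages
curl -s https://example.com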
Task 3: Downloading a Web Page
- Use wget to download an entire web page (HTML) and its related assets (e.g., images, stylesheets). For example: wget --recursive --page-requisites https://example.com
- Explore the downloaded files and folder structure.
[intern@intern-a1t-inf-lnx1 ~]$ wget --recursive --page-requisites https://example.com
--2023-09-19 21:03:42--  https://example.com/
Resolving example.com (example.com)... 93.184.216.34, 2606:2800:220:1:248:1893:25c8:1946
Connecting to example.com (example.com)|93.184.216.34|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1256 (1.2K) [text/html]
Saving to: ‘example.com/index.html’

example.com/index.html  100%[================================>]   1.23K  --.-KB/s    in 0.001s

2023-09-19 21:03:44 (2.36 MB/s) - ‘example.com/index.html’ saved [1256/1256]

FINISHED --2023-09-19 21:03:44--
Total wall clock time: 2.7s
Downloaded: 1 files, 1.2K in 0.001s (2.36 MB/s)
[intern@intern-a1t-inf-lnx1 ~]$ ls example.com/
index.html
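example.com has few assets, so the recursive download stays small. For a richer page, a few additional standard wget options make the result easier to browse offline; the recursion depth and target directory below are illustrative choices:

# Limit recursion depth, rewrite links for offline viewing,
# and store everything under a directory named "mirror"
wget --recursive --level=1 --page-requisites --convert-links \
     --directory-prefix=mirror https://example.com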
Task 4: Exploring wget and curl Options
- Investigate additional options for both wget and curl by running wget --help and curl --help.
- Experiment with different options to customize downloads or requests.
[intern@intern-a1t-inf-lnx1 ~]$ wget --help | head
GNU Wget 1.21.2, a non-interactive network retriever.
Usage: wget [OPTION]... [URL]...

Mandatory arguments to long options are mandatory for short options too.

Startup:
  -V,  --version                   display the version of Wget and exit
  -h,  --help                      print this help
  -b,  --background                go to background after startup
  -e,  --execute=COMMAND           execute a `.wgetrc'-style command
[intern@intern-a1t-inf-lnx1 ~]$ curl --help | head
Usage: curl [options...] <url>
 -d, --data <data>          HTTP POST data
 -f, --fail                 Fail silently (no output at all) on HTTP errors
 -h, --help <category>      Get help for commands
 -i, --include              Include protocol response headers in the output
 -o, --output <file>        Write to file instead of stdout
 -O, --remote-name          Write output to a file named as the remote file
 -s, --silent               Silent mode
 -T, --upload-file <file>   Transfer local FILE to destination
 -u, --user <user:password> Server user and password
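As a starting point for experimentation, the commands below combine a few of the options shown in the help excerpts (plus curl's -L for following redirects); they are suggestions rather than required steps, and the output filename and POST data are made up for illustration:

# curl: follow redirects, run silently, and write to a chosen file
curl -sL -o page.html https://example.com

# curl: send a simple POST body (the data here is purely illustrative)
curl -d "name=intern" https://example.com

# wget: print the fetched page to standard output instead of saving it
wget -q -O - https://example.com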
Conclusion
By completing these exercises, you've gained practical experience in using web browsers, wget, and curl for various web-related tasks. These tools are valuable for accessing web content and automating downloads and requests from the command line in a Linux environment.