Practice Exercise: Browsers, wget, and curl

Objectives

  • Learn how to use web browsers, wget, and curl to access web content from the command line.
  • Understand the differences between these tools and their practical applications.

Scenario

As a Linux user, you will often need to access web content or download files directly from the command line. This exercise builds proficiency with web browsers, wget, and curl for those tasks.

Tasks

Task 1: Using wget

  • Open a terminal.
  • Use wget to download a file from a URL. For example: wget https://example.com. Because this URL has no file name in its path, wget saves the page as index.html.
  • Observe the download progress and the downloaded file.
    [intern@intern-a1t-inf-lnx1 ~]$ wget https://example.com
    --2023-09-19 21:02:24--  https://example.com/
    Resolving example.com (example.com)... 93.184.216.34, 2606:2800:220:1:248:1893:25c8:1946
    Connecting to example.com (example.com)|93.184.216.34|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 1256 (1.2K) [text/html]
    Saving to: ‘index.html’
    
    index.html               100%[================================>]   1.23K  --.-KB/s    in 0s
    
    2023-09-19 21:02:25 (51.3 MB/s) - ‘index.html’ saved [1256/1256]
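
The name of the saved file follows from the URL: in the transcript above, https://example.com/ has no file name in its path, so wget falls back to index.html. The sketch below approximates that rule in plain shell. The function name wget_default_name is hypothetical, and the sketch ignores query strings, redirects, and existing files (wget appends .1, .2, and so on instead of overwriting).

```shell
# Approximate the file name wget picks when no -O option is given.
# Hypothetical helper for illustration; ignores query strings and redirects.
wget_default_name() {
    url=$1
    rest=${url#*://}                 # strip the scheme, e.g. "https://"
    case $rest in
        */*) path=${rest#*/} ;;      # keep what follows the host name
        *)   path= ;;                # the URL has no path at all
    esac
    name=${path##*/}                 # last path component
    [ -n "$name" ] || name=index.html
    printf '%s\n' "$name"
}

wget_default_name https://example.com
wget_default_name https://example.com/sample.txt
```

To choose the name yourself rather than rely on this rule, pass -O, for example wget -O example.html https://example.com.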
    

Task 2: Downloading Files with curl

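  • In the terminal, use curl to fetch the same URL as in the previous task. For example: curl https://example.com. (In the transcript below, the shell expands !$ to the last argument of the previous command.)
  • Compare the result with wget: by default, curl prints the response to standard output instead of saving it. To save it, name a file with -o, for example curl -o index.html https://example.com.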
    [intern@intern-a1t-inf-lnx1 ~]$ curl !$
    curl https://example.com
    <!doctype html>
    <html>
    <head>
        <title>Example Domain</title>
    
        <meta charset="utf-8" />
        <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
        <meta name="viewport" content="width=device-width, initial-scale=1" />
        <style type="text/css">
        body {
            background-color: #f0f0f2;
            margin: 0;
            padding: 0;
            font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
    
        }
        div {
            width: 600px;
            margin: 5em auto;
            padding: 2em;
            background-color: #fdfdff;
            border-radius: 0.5em;
            box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);
        }
        a:link, a:visited {
            color: #38488f;
            text-decoration: none;
        }
        @media (max-width: 700px) {
            div {
                margin: 0 auto;
                width: auto;
            }
        }
        </style>
    </head>
    
    <body>
    <div>
        <h1>Example Domain</h1>
        <p>This domain is for use in illustrative examples in documents. You may use this
        domain in literature without prior coordination or asking for permission.</p>
        <p><a href="https://www.iana.org/domains/example">More information...</a></p>
    </div>
    </body>
    </html>
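
One practical difference is visible in the transcripts: wget saved the page to index.html, while curl printed it to the terminal. Because curl writes to standard output by default, its output can be piped straight into other commands. A minimal sketch of that idea follows; the helper name extract_title is hypothetical, and it assumes the opening and closing title tags sit on the same line, as they do in the page above.

```shell
# Print the contents of the first <title> element read from standard input.
# Hypothetical helper; assumes <title> and </title> share one line.
extract_title() {
    sed -n 's:.*<title>\(.*\)</title>.*:\1:p' | head -n 1
}

# Typical use (requires network access):
#   curl --silent https://example.com | extract_title
```

The --silent option suppresses curl's progress meter, so only the page body reaches the pipe.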
    

Task 3: Downloading a Web Page

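  • Use wget to download a web page (HTML) and its related assets (e.g., images, stylesheets). For example: wget --recursive --page-requisites https://example.com. Here --page-requisites fetches the assets the page needs, and --recursive follows links on the same host, to a default depth of 5.
  • Explore the downloaded files and folder structure. wget creates a directory named after the host; because the example.com page references no assets or same-host links, that directory contains only index.html.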
    [intern@intern-a1t-inf-lnx1 ~]$ wget --recursive --page-requisites https://example.com
    --2023-09-19 21:03:42--  https://example.com/
    Resolving example.com (example.com)... 93.184.216.34, 2606:2800:220:1:248:1893:25c8:1946
    Connecting to example.com (example.com)|93.184.216.34|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 1256 (1.2K) [text/html]
    Saving to: ‘example.com/index.html’
    
    example.com/index.html   100%[================================>]   1.23K  --.-KB/s    in 0.001s
    
    2023-09-19 21:03:44 (2.36 MB/s) - ‘example.com/index.html’ saved [1256/1256]
    
    FINISHED --2023-09-19 21:03:44--
    Total wall clock time: 2.7s
    Downloaded: 1 files, 1.2K in 0.001s (2.36 MB/s)
    [intern@intern-a1t-inf-lnx1 ~]$ ls example.com/
    index.html
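
On a larger site, an unbounded recursive download can fetch far more than one page. The sketch below wraps the same wget command with a few limiting options and adds a helper for listing what was downloaded. Both function names are hypothetical, and the chosen values (one level of recursion, a one-second wait between requests) are assumptions to adjust per site.

```shell
# Hypothetical wrapper: fetch one page plus its assets, with limits.
#   --level=1        follow links only one level deep
#   --convert-links  rewrite links so the local copy can be browsed offline
#   --no-parent      never ascend above the starting directory
#   --wait=1         pause one second between requests
mirror_page() {
    wget --recursive --level=1 --page-requisites \
         --convert-links --no-parent --wait=1 "$1"
}

# List every file under a download directory, sorted for easy reading.
list_downloaded() {
    find "$1" -type f | sort
}

# Typical use (requires network access):
#   mirror_page https://example.com
#   list_downloaded example.com
```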
    

Task 4: Exploring wget and curl Options

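  • Investigate additional options for both wget and curl by running wget --help and curl --help. Recent curl versions print only the most common options by default; curl --help all lists every option.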
  • Experiment with different options to customize downloads or requests.
    [intern@intern-a1t-inf-lnx1 ~]$ wget --help | head
    GNU Wget 1.21.2, a non-interactive network retriever.
    Usage: wget [OPTION]... [URL]...
    
    Mandatory arguments to long options are mandatory for short options too.
    
    Startup:
      -V,  --version                   display the version of Wget and exit
      -h,  --help                      print this help
      -b,  --background                go to background after startup
      -e,  --execute=COMMAND           execute a `.wgetrc'-style command
    
    [intern@intern-a1t-inf-lnx1 ~]$ curl --help | head
    Usage: curl [options...] <url>
     -d, --data <data>          HTTP POST data
     -f, --fail                 Fail silently (no output at all) on HTTP errors
     -h, --help <category>      Get help for commands
     -i, --include              Include protocol response headers in the output
     -o, --output <file>        Write to file instead of stdout
     -O, --remote-name          Write output to a file named as the remote file
     -s, --silent               Silent mode
     -T, --upload-file <file>   Transfer local FILE to destination
     -u, --user <user:password> Server user and password
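
The full help output is long, so it is often quicker to search it for a keyword than to read it from the top. A small sketch of that approach, assuming the command accepts --help; the function name find_option is hypothetical.

```shell
# Show the help lines of a command that mention a keyword (case-insensitive).
# Hypothetical helper; assumes the command accepts --help.
find_option() {
    tool=$1
    keyword=$2
    "$tool" --help 2>&1 | grep -i -e "$keyword"
}

# Typical use:
#   find_option wget output
#   find_option curl silent
```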
    

Conclusion

By completing these exercises, you've gained practical experience in using web browsers, wget, and curl for various web-related tasks. These tools are valuable for accessing web content and automating web-related tasks from the command line in a Linux environment.