The Linux wget command: download files from the terminal

Introduction

In Linux, the terminal lets you run tasks quickly without depending on a graphical environment. One of the best-known tools for fetching files from the Internet is wget, a non-interactive download program that supports the HTTP, HTTPS and FTP protocols. Its popularity comes from its simplicity, its robustness and the large number of options that let you adapt a download to almost any scenario. This article goes from the most basic syntax to advanced tricks that will make your work with wget more efficient.

Basic syntax

To start a download, simply type wget followed by the URL of the resource. For example:

wget https://ejemplo.com/documento.pdf

When you run that line, the program connects to the server, requests the file and saves it in the current directory under its original name. If the server redirects the request, wget follows the redirection automatically. If a file with the same name already exists, wget does not overwrite it by default: it saves the new copy with a numeric suffix (for example documento.pdf.1) unless an option such as -O, -nc or -N says otherwise.
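
A script that may run more than once usually needs to decide what happens when the target file already exists. The following is a minimal sketch, assuming GNU wget and a POSIX shell; fetch_once and the DRY_RUN variable are hypothetical names introduced here for illustration, and the URL is the placeholder used above.

```shell
#!/bin/sh
# Hypothetical helper (not part of wget): fetch a URL only when no local copy exists.
# -nc (--no-clobber) makes wget skip the download if the file is already present.
fetch_once() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of contacting the network.
        echo "wget -nc $1"
    else
        wget -nc "$1"
    fi
}
```

Calling fetch_once https://ejemplo.com/documento.pdf twice should leave a single documento.pdf: on the second call wget reports that the file is already there and does not retrieve it again.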

Most commonly used options

  • -o file: writes the log messages to the indicated file instead of printing them in the terminal.
  • -O file: sets the name of the output file, overriding the default name.
  • -c: continues a previously interrupted download, reusing the part already downloaded.
  • -r: enables recursive mode, which lets you download complete directories or websites.
  • -l number: sets the maximum recursion depth when combined with -r.
  • -k: converts the links in the downloaded documents so that they work locally.
  • -p: downloads all the resources needed to display an HTML page correctly, such as style sheets and scripts.
  • -U agent: sets a custom user agent, useful when some servers block generic requests.
  • --limit-rate=speed: restricts the bandwidth used by the download; accepts values such as 200k or 1.5m.
  • --wait=seconds: inserts a pause between successive requests to reduce the load on the remote server.
  • --retry-connrefused: tells wget to keep trying if the connection is initially refused.
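
Several of these options are often combined in a single call. The sketch below is one possible combination, not a recommended default: robust_fetch, DRY_RUN and the default log name wget.log are hypothetical names chosen for this example, and the rate and retry values are arbitrary.

```shell
#!/bin/sh
# Hypothetical wrapper combining several wget options.
# Usage: robust_fetch URL [LOGFILE]
robust_fetch() {
    url="$1"
    log="${2:-wget.log}"
    # -c                   resume a partial download if one exists
    # -o "$log"            write messages to a log file instead of the terminal
    # --retry-connrefused  treat "connection refused" as a temporary error
    # --tries=5            give up after five attempts
    # --limit-rate=200k    cap the bandwidth at about 200 KB/s
    set -- -c -o "$log" --retry-connrefused --tries=5 --limit-rate=200k "$url"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Show the command that would run, without touching the network.
        echo "wget $*"
    else
        wget "$@"
    fi
}
```

Running DRY_RUN=1 robust_fetch https://ejemplo.com/copia_de_seguridad.iso prints the full command line, which is a convenient way to check the options before starting a long download.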

Practical examples

  • Download a file and rename it when saving:
    wget -O informe_final.pdf https://ejemplo.com/informe.pdf
  • Resume a large download that was interrupted by a power cut:
    wget -c https://ejemplo.com/copia_de_seguridad.iso
  • Get a full website to browse offline:
    wget -r -l 2 -k -p https://ejemplo.com/blog/
  • Limit the speed to 250 KB/s to avoid saturating the link while you work:
    wget --limit-rate=250k https://ejemplo.com/actualizacion.zip
  • Change the user agent to simulate a modern browser:
    wget -U NavegadorWeb https://ejemplo.com/detalles.html
  • Download only PDF files from a directory:
    wget -r -l 1 -A pdf https://ejemplo.com/documentos/
  • Retry up to five times before giving up:
    wget --tries=5 https://ejemplo.com/archivo_inestable.gz
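
When wget gives up after its retries, a script can look at the exit status to find out why. The sketch below maps the status values documented in the "Exit Status" section of the GNU wget manual to short messages; describe_wget_status is a hypothetical helper name, and the message wording is my own.

```shell
#!/bin/sh
# Hypothetical helper: translate a wget exit status into a short message.
describe_wget_status() {
    case "$1" in
        0) echo "success" ;;
        1) echo "generic error" ;;
        2) echo "command-line or configuration parse error" ;;
        3) echo "file I/O error" ;;
        4) echo "network failure" ;;
        5) echo "SSL verification failure" ;;
        6) echo "authentication failure" ;;
        7) echo "protocol error" ;;
        8) echo "server returned an error response" ;;
        *) echo "unknown status $1" ;;
    esac
}

# Example use (commented out so the sketch does not touch the network):
# wget --tries=5 https://ejemplo.com/archivo_inestable.gz
# echo "wget finished: $(describe_wget_status $?)"
```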

Tips and good practices

  • Always check the URL before running wget to avoid accidental downloads of unwanted content.
  • When you use recursive mode, combine -l with -np so that wget does not climb to parent directories.
  • Use the log file (-o) to debug connection or authentication problems.
  • If you need authentication, add the options --user and --password or, better still, use a .netrc file to avoid leaving credentials in the shell history.
  • For scheduled downloads, combine wget with cron or with systemd timers.
  • Remember that wget does not run JavaScript, so some modern sites may require additional tools, or requests sent with specific headers from wget or curl.
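
The .netrc tip can be sketched as follows. wget reads ~/.netrc for HTTP and FTP credentials; each entry has the form "machine HOST login USER password PASS". The helper name add_netrc_entry and the sample host and credentials are hypothetical, and the optional fourth argument exists only so the sketch can be tried on a scratch file.

```shell
#!/bin/sh
# Hypothetical helper: append credentials for one host to a netrc-style file.
# Usage: add_netrc_entry HOST USER PASSWORD [FILE]   (FILE defaults to ~/.netrc)
add_netrc_entry() {
    host="$1"; user="$2"; pass="$3"
    file="${4:-$HOME/.netrc}"
    # Create the file with a restrictive umask, then make sure only the owner can read it.
    ( umask 077; touch "$file" ) || return 1
    chmod 600 "$file" || return 1
    printf 'machine %s login %s password %s\n' "$host" "$user" "$pass" >> "$file"
}

# Example use (hypothetical host and credentials):
# add_netrc_entry ejemplo.com usuario secreto
# wget https://ejemplo.com/privado/informe.pdf
```

Keeping the file at mode 600 matters because the password is stored in plain text.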

Limitations and considerations

  • wget does not process JavaScript, so sites that depend on scripts to load content may require other tools such as headless browsers (curl has the same limitation).
  • Some servers limit the number of simultaneous connections; sending too many requests with -r can get you blocked.
  • For very large downloads, use -c (--continue) and verify integrity with checksums.
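
Checksum verification can be sketched like this, assuming the sha256sum utility from GNU coreutils is installed and that the site publishes a SHA-256 digest for the file. verify_sha256 is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
# Returns 0 on match, 1 on mismatch, 2 if the file cannot be read.
verify_sha256() {
    file="$1"; expected="$2"
    [ -r "$file" ] || return 2
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual="$(sha256sum "$file" | cut -d ' ' -f 1)"
    [ "$actual" = "$expected" ]
}

# Example use (the digest must come from the publisher of the file):
# wget -c https://ejemplo.com/copia_de_seguridad.iso
# verify_sha256 copia_de_seguridad.iso "<published digest>" && echo "OK"
```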

Conclusion

wget remains one of the most reliable and versatile tools for downloading files from the Linux terminal. Its wide range of options covers everything from fetching a single file to cloning a complete website for offline use. Mastering its syntax and knowing the most common options will save you time and give you more control over your downloads. Now that you have the basic concepts and several practical examples, you are ready to incorporate wget into your daily workflow.

This work is licensed under a Creative Commons Attribution 4.0 International License by Francesc Roig (francesc@vivaldi.net).
