How to recursively wget on specific TLDs?

Is it possible to recursively download files only from specific TLDs using wget?

In particular, I am trying to download the full text of the Code of Massachusetts Regulations. The actual text of the regulations is stored in many files spread across several domains, so I would like to start a recursive download from the index page but only follow links to .gov and .us domains.

1 answer

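Following the section of the wget documentation on spanning hosts, I got this working with the -H and -D flags. -H lets the recursive download follow links to other hosts, and -D restricts it to the listed domain suffixes: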

wget -r -l5 -H -D.us,.gov http://www.lawlib.state.ma.us/source/mass/cmr/index.html
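
To get a feel for which links a domain list like .us,.gov lets through, here is a rough sketch of the suffix test in plain shell. It only approximates the matching and is not how wget implements it; host_allowed is a hypothetical helper, and www.mass.gov and www.example.com are made-up sample hosts rather than links taken from the index page.

```shell
#!/bin/sh
# Illustration only: approximates the domain-suffix filter of -D.us,.gov.
# host_allowed is a hypothetical helper, not part of wget.
host_allowed() {
    case "$1" in
        *.us|*.gov) echo "follow" ;;  # host ends in an accepted suffix
        *)          echo "skip" ;;    # any other host is not followed
    esac
}

# The first host comes from the command above; the other two are
# assumed sample hosts used only for illustration.
host_allowed www.lawlib.state.ma.us
host_allowed www.mass.gov
host_allowed www.example.com
```

The first two hosts print "follow" and the last prints "skip", which is the behavior to expect from the real command when it meets links to other TLDs.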