To reiterate what I said in my email, globalchange.gov itself does not host datasets; rather, it links to datasets hosted on other websites. Because space is an issue, I suggested that it would therefore be sensible not to download the datasets via globalchange.gov, since they should already have been downloaded from their home sites. That is what I was referring to in what you quoted.
Additionally, as I explained via email, what globalchange.gov really has to offer is on data.globalchange.gov. That site provides graph data linking datasets with authors, models, and other attributes, and provides a graph query interface over them. For this reason, simply using wget to mirror the website will still lose most of the data the site has to offer: the relationships live in the underlying database, not in the rendered pages. To that end, I have emailed globalchange.gov asking whether it would be possible to get a database dump of their site, so that we could make a fully functioning clone without losing any data. I have not yet received a response, however.
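To make the contrast concrete, here is a minimal sketch of what walking the structured records might look like, assuming the site exposes JSON representations when ".json" is appended to a resource path (an assumption I have not verified, as are the field names used in the loop). A wget mirror captures only whatever rendered HTML it happens to reach; the linked attributes would have to be pulled out record by record:

    import json
    import urllib.request

    BASE = "https://data.globalchange.gov"

    def fetch_json(path):
        # Assumption: appending ".json" to a resource path returns the
        # machine-readable record; check this against the site's docs.
        req = urllib.request.Request(BASE + path + ".json",
                                     headers={"Accept": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    # Walk the dataset listing and pull each full record.  The field
    # names ("uri", "identifier") are illustrative guesses, not a
    # confirmed schema.
    for entry in fetch_json("/dataset"):
        record = fetch_json(entry["uri"])
        print(record["identifier"])

Even then, a crawl like this can only recover what the public interface exposes, which is why a database dump from them would be preferable.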
Since it's a site that is dynamically generated from a database, one never knows whether the set of URLs is finite. You can always append another query parameter, and another, and another, etc., and each combination yields a nominally distinct URL.
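A standard way to keep such a crawl bounded is to canonicalize URLs before adding them to the frontier, so that variants differing only in parameter order or repetition collapse to one entry. A minimal sketch:

    from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

    def canonicalize(url):
        # Sort and deduplicate query parameters, and drop fragments, so
        # URLs differing only in parameter order or repetition collapse
        # to a single frontier entry.
        parts = urlsplit(url)
        query = urlencode(sorted(set(parse_qsl(parts.query))))
        return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))

    seen = set()

    def should_fetch(url):
        key = canonicalize(url)
        if key in seen:
            return False
        seen.add(key)
        return True

This only helps, of course, where the repeated parameters really do return the same page; whether that holds here is exactly what needs checking.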
When I come back from work, I'll take a closer look at it to see whether there is reasonable hope that it's finite. (The *.climate.gov site suffers from the same problem; for the last day or more, the crawl has apparently been finding many ways to reach the same set of reports.)
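One cheap way to confirm that suspicion is to fingerprint the response bodies and count how many URLs map to each fingerprint. A sketch, assuming the pages have already been mirrored to a local directory (the "mirror" path is hypothetical):

    import hashlib
    from collections import defaultdict
    from pathlib import Path

    # Group mirrored files by a hash of their contents; any group with
    # more than one member is one page reachable via several URLs.
    by_hash = defaultdict(list)
    for f in Path("mirror").rglob("*.html"):   # "mirror" is illustrative
        by_hash[hashlib.sha256(f.read_bytes()).hexdigest()].append(f)

    for digest, files in by_hash.items():
        if len(files) > 1:
            print(len(files), "URLs for", files[0].name)

In practice one would want to strip timestamps or session tokens from the pages first, since otherwise identical reports can hash differently.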