du17qeo69
Rank: More and more talking
Joined: 29 Apr 2011
Posts: 105
Topics read: 0
Warnings: 0/5
From: England

Posted: Sun 15:17, 29 May 2011    Post subject: Powerful tool wget to download, detailed
A. wget downloading, in detail

If you have not used the download tool wget yet, give it a try; it is very powerful. The details follow:

$ wget [link widoczny dla zalogowanych]

It can also walk an FTP site and download an entire web directory tree at every level. Of course, if you are not careful, you may end up mirroring the whole site together with every other site it links to:

$ wget -m [link widoczny dla zalogowanych]

Since the tool can download recursively, you can run it on a server as a site-mirroring tool. To make it build a correct mirror, you can restrict the kinds of links it follows and the file types it downloads. For example, to follow only relative links and ignore the associated GIF images:

$ wget -m -L --reject=gif [link widoczny dla zalogowanych]

wget can also resume interrupted downloads (the -c option); of course, this requires support from the remote server:

$ wget -c [link widoczny dla zalogowanych]file

Resuming can be combined with mirroring, so that a mirror interrupted many times can, over several runs, selectively fetch a large site. How to automate this is discussed further below.

If constant disconnections while downloading disturb your work, you can limit the number of retries wget makes:

$ wget -t 5 [link widoczny dla zalogowanych]

This gives up after five retries; use -t 0 to never give up and keep retrying.
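As an offline illustration of what -t does, the retry-then-give-up behavior can be sketched in plain shell; the always-failing `false` command stands in for a download attempt that keeps breaking (this is a demonstration of the semantics, not how wget implements it):

```shell
# Simulate "-t 5": retry a failing command up to 5 times, then give up.
# `false` stands in for a download attempt that always fails.
tries=0
until false || [ "$tries" -ge 5 ]; do
  tries=$((tries + 1))
done
echo "gave up after $tries tries"
```

With a real download, the command inside the loop would be the wget invocation itself.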
B. What about proxy servers?

A proxy can be specified either through the http_proxy environment variable or through the .wgetrc configuration file, and wget will then download through it. But there is a problem: resuming a download (-c) through a proxy may fail several times. If a download through a proxy is interrupted, the proxy's cache keeps only an incomplete copy of the file, so when you try to resume, the proxy consults its cache and wrongly reports that the file is already complete. You can push the proxy past its cache by adding a specific header to the request:

$ wget -c --header="Pragma: no-cache" [link widoczny dla zalogowanych]

The --header option lets you add headers in any number and variety, and through them change how the web server or proxy server behaves. For example, some sites refuse to serve a file to requests arriving from outside; the content is delivered only when the request comes via some other page on the same site. In that case you can append a Referer: header naming such a page:

$ wget --header="Referer: [link widoczny dla zalogowanych]" [link widoczny dla zalogowanych]
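The two ways of naming a proxy can be sketched as follows; the proxy address here is a made-up placeholder, not one taken from the article:

```shell
# 1) Per-session: wget honors the http_proxy environment variable.
#    (proxy.example.com:8080 is a hypothetical address)
export http_proxy="http://proxy.example.com:8080/"
# 2) Persistent: the same setting in ~/.wgetrc (shown here as comments):
#      use_proxy = on
#      http_proxy = http://proxy.example.com:8080/
echo "proxy set to $http_proxy"
```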
C. How do I set the download time?

If you share a network connection with colleagues in the office and want to download some large files, but do not want their work slowed down by a crawling network, then you should try to avoid peak hours. Of course, you need not wait in the office until everyone has gone home, nor remember after dinner at home that the download still has to be started over the Internet. The at command can schedule the work for you:

$ at 2300
warning: commands will be executed using /bin/sh
at> wget [link widoczny dla zalogowanych]
at> press Ctrl-D

This schedules the download for 11 pm. For the arrangement to work, make sure the atd daemon is running.
D. Does the download take a lot of time?

When you need to download large amounts of data and do not have enough bandwidth, you will often find that the workday begins before your scheduled download has finished. As a good colleague, you can only stop those tasks and restart them in the evening; repeating this by hand quickly becomes too cumbersome, so it is best to automate it with crontab. Create a plain text file, called crontab.txt, containing for example:

0 23 * * 1-5 wget -c [link widoczny dla zalogowanych]
0 6 * * 1-5 killall wget

A crontab file specifies tasks to be executed periodically. The first five columns of each line state when to run the command, and the rest of the line tells cron what to execute. The first two columns give the time of day: here wget starts at 11 pm, and at 6 am every wget is killed, stopping all downloads. The * in the third and fourth columns means the task runs every day of every month, and the fifth column specifies the days of the week: 1-5 is Monday through Friday.

So at 11 pm on each working day the download starts, and at 6 am any running wget task is stopped. To activate this schedule, execute:

$ crontab crontab.txt

Thanks to the -c option, wget continues from the already-downloaded part each night; once the local file matches the complete remote file, the download stops, because the entire file has been fetched. I have used this method repeatedly on a shared telephone dial-up line to download many ISO image files; it is quite practical.
E. How do I download dynamic web pages?

Some pages are generated on request and change several times a day. Technically the target is then no longer a file and has no fixed file size, so the -c option loses its meaning. Example: a frequently changing, PHP-generated page from Linux Weekly News:

$ wget [link widoczny dla zalogowanych]/bigpage.php3

Network conditions in my office are often poor and cause me a lot of download trouble, so I wrote a simple script that detects whether the dynamic page has been fetched completely:

#!/bin/sh
# create it if absent
touch bigpage.php3
# check if we got the whole thing
while ! grep -qi '</html>' bigpage.php3
do
rm -f bigpage.php3
# download LWN in one big page
wget [link widoczny dla zalogowanych]/bigpage.php3
done

This script keeps downloading the page until it contains a closing </html> tag, that is, until the page has arrived completely.
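The completeness test that the script relies on can be demonstrated offline; this snippet builds a partial page, checks it, then completes it (the file name is arbitrary):

```shell
# A page counts as "complete" once it contains a closing </html> tag,
# which is what the grep in the download loop tests for.
printf '<html><body>partial page' > page.html
if grep -qi '</html>' page.html; then first=complete; else first=incomplete; fi
printf '</body></html>\n' >> page.html
if grep -qi '</html>' page.html; then second=complete; else second=incomplete; fi
echo "$first then $second"
rm -f page.html
```

The first check reports incomplete, the second complete, mirroring how the loop keeps re-fetching until the tag appears.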
F. What about ssl and cookies?

If you want to fetch content over ssl, the web site address will start with https; the curl tool, used below, handles such addresses and is readily available. Some sites force users to carry a cookie set in the browser, so you must supply the site's Cookie header with the request for the download to work. For the cookie file format used by lynx and Mozilla, the cookie can be extracted like this:

$ cookie=$(grep nytimes ~/.lynx_cookies | awk '{printf("%s=%s;", $6, $7)}')

This builds the cookie needed to download content from the site matched by the grep; it assumes you have already completed registration on the site with your browser. w3m uses a different, more compact cookie file format:

$ cookie=$(grep nytimes ~/.w3m/cookie | awk '{printf("%s=%s;", $2, $3)}')

Now you can download with:

$ wget --header="Cookie: $cookie" [link widoczny dla zalogowanych]

or use the curl tool:

$ curl -v -b $cookie -o supercomp.html [link widoczny dla zalogowanych]
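The awk extraction can be tried offline against a fabricated cookie line; the domain, cookie name, and value below are invented for the demonstration, and a lynx-style cookie file keeps name and value in columns 6 and 7:

```shell
# Build a fake Netscape-style cookie line: domain, flag, path, secure,
# expiry, name, value -- tab-separated, as lynx stores them.
printf 'www.nytimes.com\tFALSE\t/\tFALSE\t0\tRMID\tabc123\n' > fake_cookies
cookie=$(grep nytimes fake_cookies | awk '{printf("%s=%s;", $6, $7)}')
echo "Cookie header value: $cookie"
rm -f fake_cookies
```

The resulting string, RMID=abc123;, is exactly what goes after "Cookie: " in the wget header.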
G. How do I create address lists?

So far we have downloaded single files or mirrored entire sites. Sometimes we need to download a large number of files linked from one page, without mirroring the whole site; for instance, downloading the top 20 songs from a chart of 100. Note that the --accept and --reject options will not work here, because they only operate on file names. Instead, use lynx -dump:

$ lynx -dump ftp://ftp.ssc.com/pub/lg/ | grep 'gz$' | tail -10 | awk '{print $2}' > urllist.txt

The output of lynx can be processed with any of the GNU text-processing tools. In the example above, we extract the link addresses ending in gz and write the last ten of them to a file. A small shell loop then downloads each URL listed in that file:

$ for x in $(cat urllist.txt)
> do
> wget $x
> done

In this way we successfully download the latest 10 issues from the Linux Gazette site (ftp://ftp.ssc.com/pub/lg/).
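The grep/tail/awk pipeline can be exercised offline on a fabricated listing in the two-column shape lynx -dump produces (the file names are invented for the demonstration):

```shell
# A lynx -dump reference list has a number column and a URL column;
# keep only the .gz links, take the last two, print the URL column.
cat > dump.txt <<'EOF'
 1. ftp://ftp.ssc.com/pub/lg/readme.txt
 2. ftp://ftp.ssc.com/pub/lg/lg-100.tar.gz
 3. ftp://ftp.ssc.com/pub/lg/lg-101.tar.gz
 4. ftp://ftp.ssc.com/pub/lg/lg-102.tar.gz
EOF
grep 'gz$' dump.txt | tail -2 | awk '{print $2}' > urllist.txt
n=$(wc -l < urllist.txt)
echo "selected $n URLs"
rm -f dump.txt
```

urllist.txt then contains only the last two .gz URLs, ready for the wget loop above.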
H. Expanding the use of bandwidth

If the file you are downloading is bandwidth-limited on the server side, the download becomes slow for reasons beyond your control. The following trick can greatly shorten the download, but it requires the curl tool and several mirrors that each offer the file. For example, suppose you want to download Mandrake 8.0 from the following three addresses:

url1=[link widoczny dla zalogowanych]
url2=[link widoczny dla zalogowanych]Mandrake/iso/Mandrake80-inst.iso
url3=[link widoczny dla zalogowanych]

The file is 677,281,792 bytes long, so start three curl processes, each fetching a different byte range (the -r option) from a different server:

$ curl -r 0-199999999 -o mdk-iso.part1 $url1 &
$ curl -r 200000000-399999999 -o mdk-iso.part2 $url2 &
$ curl -r 400000000- -o mdk-iso.part3 $url3 &

This creates three background processes, each transferring a different part of the ISO file from a different server. When the three processes finish, a simple cat command joins the three parts: cat mdk-iso.part? > mdk-80.iso. (Checking the md5 sum before burning is strongly recommended.)
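The carve-and-reassemble idea can be checked offline; this snippet cuts a known file into three ranges with dd (standing in for the three curl -r fetches) and joins them with cat, using invented file names:

```shell
# Split a file into three byte ranges, as curl -r would fetch them
# from three mirrors, then reassemble with cat and compare.
printf 'abcdefghijklmnopqrstuvwxyz' > whole.dat
dd if=whole.dat of=mdk.part1 bs=1 count=10 2>/dev/null          # bytes 0-9
dd if=whole.dat of=mdk.part2 bs=1 skip=10 count=10 2>/dev/null  # bytes 10-19
dd if=whole.dat of=mdk.part3 bs=1 skip=20 2>/dev/null           # bytes 20-end
cat mdk.part? > rebuilt.dat
if cmp -s whole.dat rebuilt.dat; then status=ok; else status=mismatch; fi
echo "reassembly: $status"
rm -f whole.dat rebuilt.dat mdk.part1 mdk.part2 mdk.part3
```

The shell glob mdk.part? expands in lexical order, so the parts are concatenated in the right sequence, just as with the ISO parts above.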
Conclusion

Do not worry that downloading in non-interactive mode will hurt your results. However hard web designers rack their brains to stop us from downloading from their sites, we have free tools that can automate the task. This greatly enriches our experience of the web.
This post was praised 0 times.