A Download Manager in MSH
I recently stumbled upon this blog entry that expanded on a piece I wrote a few days ago: Command Line Shortcut for Repetitive Operations. (Hankatsu?)’s entry is in Japanese, so I don’t know what it says. In fact, for all I know, he or she could be making fun of me. In any case, the code included with the blog entry shows a quick way to download sequentially numbered files from the internet – such as File001.jpg, File002.jpg, etc. That’s a great use of the technique, and we can improve it even further with a useful script that acts as a download manager.
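To make the "sequentially numbered files" idea concrete, here is a minimal sketch in Python (the scripts in this post are MSH; the host name and filename pattern below are made-up placeholders):

```python
# Generate a batch of sequentially numbered URLs, zero-padded to
# three digits -- File001.jpg, File002.jpg, and so on.
base = "http://example.com/images/File{0:03d}.jpg"
urls = [base.format(n) for n in range(1, 4)]
# urls -> [".../File001.jpg", ".../File002.jpg", ".../File003.jpg"]
```

A list like this, written one URL per line into a text file, is exactly the kind of batch the download manager below consumes.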
This was one of the first Monad scripts I wrote (about 2.5 years ago), and I’ve faithfully ported it through every one of the many breaking changes that have happened since then :) It originally relied heavily on the Windows port of wget, but I was finally able to remove that dependency a few weeks ago when I noticed that the .NET Framework now supports the WebClient.DownloadFile() method.
It’s one of my most heavily used scripts – it’s not very complex, but it sure is useful.
## download-queue.msh
## Acts as a download manager, to download batches of files.
##
## 1) Create a directory, and place "download-queue.msh" in it.
## 2) Create a subdirectory, called "Queue"
## 3) Inside the "Queue" directory, place .txt files that contain only URLs in them.
##
## Download-queue.msh will use the name of the text file to create a new subdirectory.
## It will place the downloaded files inside that subdirectory.
## Ensure the System.Net and System.Web DLLs are loaded
[void] [Reflection.Assembly]::LoadWithPartialName("System.Net")
[void] [Reflection.Assembly]::LoadWithPartialName("System.Web")
## Keep on processing the queue directory, while there are batches
## remaining
while(@(get-childitem Queue\*.txt).Count -gt 0)
{
    ## Get all of the .txt files in the queue directory
    foreach($file in $(get-childitem Queue\*.txt))
    {
        write-host "Processing: $file"

        ## Create a directory, based on the filename (minus extension)
        ## of the text file
        $name = $file.Name.Replace(".txt", "")
        $null = new-item -name $name -type Directory
        set-location $name

        ## Download each item in the file
        foreach($url in (get-content $file))
        {
            ## Strip the filename out of the URL
            if($url -match ".*/(?<file>.*)")
            {
                $filename = $matches["file"]
                $filename = combine-path "$(get-location)" "$([System.Web.HttpUtility]::UrlDecode($filename))"

                write-host "   Downloading: $url"
                $webClient = new-object System.Net.WebClient
                $webClient.DownloadFile($url, $filename)
            }
            else
            {
                write-host "$url is not a valid URI."
            }
        }

        ## Move the file list into the directory, also
        move-item (combine-path "..\Queue" ($file.Name)) .
        set-location ..
    }
}
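The most delicate step in the script is stripping the filename out of each URL and URL-decoding it (the script uses -match with a named group, then System.Web.HttpUtility’s UrlDecode). Here is the same logic sketched in Python, with a made-up URL, just to illustrate what the regular expression and decoding step produce:

```python
import re
from urllib.parse import unquote

# Made-up example URL with an escaped space in the filename.
url = "https://example.com/images/My%20Coffee%20Mug.gif"

# Same pattern the script uses: the greedy ".*" swallows everything
# up to the last "/", so the named group captures just the filename.
match = re.match(r".*/(?P<file>.*)", url)
filename = unquote(match.group("file")) if match else None
# filename is now "My Coffee Mug.gif"
```

Decoding matters because a URL like the one above would otherwise produce a file literally named "My%20Coffee%20Mug.gif" on disk.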
For now, you’re on your own for generating the queue files. Right-clicking a link in your browser and choosing “Copy Shortcut” is a great way to collect URLs. Batching them this way is many times faster than downloading each file individually.
Here it is in action:
MSH:297 C:\Temp >md Queue

    Directory: FileSystem::C:\Temp

Mode    LastWriteTime    Length Name
----    -------------    ------ ----
d----   Aug 02 21:10            Queue

MSH:299 C:\Temp >echo "https://www.leeholmes.com/blog/images/rssButton.gif" > Queue\LeeHolmes.com.txt
MSH:300 C:\Temp >echo "https://www.leeholmes.com/blog/images/xmlCoffeeMug.gif" >> Queue\LeeHolmes.com.txt
MSH:301 C:\Temp >download-queue
Processing: C:\Temp\Queue\LeeHolmes.com.txt
   Downloading: https://www.leeholmes.com/blog/images/rssButton.gif
   Downloading: https://www.leeholmes.com/blog/images/xmlCoffeeMug.gif
MSH:302 C:\Temp >dir LeeHolmes.com

    Directory: FileSystem::C:\Temp\LeeHolmes.com

Mode    LastWriteTime    Length Name
----    -------------    ------ ----
-a---   Aug 02 21:12        107 LeeHolmes.com.txt
-a---   Aug 02 21:12       1025 rssButton.gif
-a---   Aug 02 21:12       1486 xmlCoffeeMug.gif
Stay tuned – in the near future, I’ll write a post that shows how to parse all of the URLs out of a web page.
[Edit: I’ve updated the script, to make it a little less sensitive to URLs with funky characters.]
[Edit: I’ve now posted my link parser script, so you don’t have to generate these files manually anymore.]
[Edit: Monad has now been renamed to Windows PowerShell. This script or discussion may require slight adjustments before it applies directly to newer builds.]