Saturday, September 15, 2012

Grabbing Photos from the Web with Groovy

I recently had occasion to bulk download some photos from a web-based photo album. By "had occasion" I of course mean that my wife told me she would very much like to have the photos, and I was therefore inclined to get them for her.


The Hooligans
The particular site for this effort offers up the photos in a custom JavaScript slideshow. Intentionally or otherwise, it obscures the actual location of the photos by burying the URLs. I assume this is an effort to keep the average user from easily downloading the photos, leaving them dependent on the provider for digital copies and/or printing, though it could just be a byproduct of the site design. A quick peek under the hood, however, showed that the exact URL of every photo was tucked into a variable in the JavaScript. I copied the list of URLs out of the JavaScript, pasted it into a text file, and proceeded to write a short piece of Groovy code to go get them all and deposit them on my local machine.

Both file handling and URLs are things that Groovy does pretty well, so I didn't figure it would take much. First things first, define the input file:


def inputFile = new File('/SomePath/YourInputHere.txt')


Next up, run through each line of the file, each of which is the URL of a unique photo. This is easily achieved with the "eachLine" methods Groovy gives us, one of which feeds two arguments into a closure: the line and the line number (counting from 1 by default):

inputFile.eachLine { lineContent, lineNumber -> ....}
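To see exactly what the closure receives, here is a small self-contained sketch. The temporary file and the example.com addresses are placeholders made up for illustration, not the real album.

```groovy
// Build a throwaway input file so the sketch runs anywhere (placeholder URLs)
def sample = File.createTempFile('urls', '.txt')
sample.deleteOnExit()
sample.text = 'http://example.com/a.jpg\nhttp://example.com/b.jpg\n'

def seen = []
sample.eachLine { lineContent, lineNumber ->
    // lineNumber starts at 1 unless a different first number is supplied
    seen << "${lineNumber}: ${lineContent}".toString()
}

seen.each { println it }
```

If zero-based numbering suits you better, eachLine also accepts a starting number as its first argument, as in sample.eachLine(0) { line, n -> ... }.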


Define an output stream to deposit the photo into once it has been read. The file name is made unique by embedding the line number variable in a GString:

def outFile = new File("/Path/photo${lineNumber}.jpg").newOutputStream()


Lastly, read from the URL into the output file and close up the output file:

outFile << new URL(lineContent).openStream() 
outFile.close()
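One thing worth noting about the two lines above: the input stream opened from the URL is never closed, and the output stream stays open if the copy throws. As a side note, a closure-based alternative that handles both might look like the sketch below. The local temp file standing in for a remote photo is an assumption made so the example runs without a network connection.

```groovy
// Stand-in for a remote photo: a local temp file exposed as a file: URL
def source = File.createTempFile('source', '.jpg')
source.deleteOnExit()
source.bytes = [1, 2, 3, 4] as byte[]
def photoUrl = source.toURI().toURL()

def target = File.createTempFile('copy', '.jpg')
target.deleteOnExit()

// withOutputStream and withInputStream close their streams when the
// closure finishes, even if the copy throws an exception
target.withOutputStream { out ->
    photoUrl.withInputStream { input ->
        out << input
    }
}

println "Copied ${target.length()} bytes"
```

For a one-off script the simpler form is fine; the closure form mostly pays off when the list of URLs is long.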


When it is all assembled it looks like this:

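def inputFile = new File('/SomePath/YourInputHere.txt')

// Each line of the input file is the URL of one photo
inputFile.eachLine { lineContent, lineNumber ->
    // The line number keeps each output file name unique
    def outFile = new File("/Path/photo${lineNumber}.jpg").newOutputStream()
    // Copy the bytes from the URL into the output stream
    outFile << new URL(lineContent).openStream()
    outFile.close()
}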

So, five pretty easy lines of code and the wife was happy.  Gotta love Groovy.
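The short version assumes every line is a good URL and every download succeeds. For a larger or messier list, a more defensive variant could look something like the sketch below. It is an illustration under assumptions, not the code that was actually run: the downloadAll helper is a hypothetical name, and the demo uses local file: URLs as stand-ins so it works offline.

```groovy
import java.nio.file.Files

// Hypothetical helper: fetch every URL listed in urlList into targetDir
def downloadAll(File urlList, File targetDir) {
    targetDir.mkdirs()
    def saved = []
    urlList.eachLine { lineContent, lineNumber ->
        def address = lineContent.trim()
        if (!address) return   // skip blank lines (returns from the closure only)
        def photo = new File(targetDir, "photo${lineNumber}.jpg")
        try {
            // Both streams are closed automatically by the with... methods
            photo.withOutputStream { out ->
                new URL(address).withInputStream { input -> out << input }
            }
            saved << photo
        } catch (IOException e) {
            // One bad link should not stop the rest of the batch
            System.err.println "Skipping line ${lineNumber}: ${e.message}"
            photo.delete()
        }
    }
    saved
}

// Demo with local stand-ins: one good "photo", one blank line, one dead link
def workDir = Files.createTempDirectory('photos').toFile()
def fake = new File(workDir, 'fake.jpg')
fake.bytes = [9, 8, 7] as byte[]
def list = new File(workDir, 'urls.txt')
list.text = "${fake.toURI().toURL()}\n\nfile:/no/such/photo.jpg\n"

def result = downloadAll(list, new File(workDir, 'out'))
println result*.name
```

The file names still come from the line numbers, so a skipped line simply leaves a gap in the numbering.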
