Newseum Daily Headlines and Automator
One of my clients wanted to download the front pages of several newspapers each day and make them easily accessible to the entire staff. When they asked me for help, they were doing it all by hand: visiting the Newseum’s website, downloading the front page of each paper (via bookmarks in their web browser), and then using PDF Combiner to output 18 front pages as a single PDF. Then they would email the 10-20MB file to everyone.
There were a number of issues with that approach, including the time-consuming manual process and the large file, which was filling up outboxes and inboxes. At first we explored how to make the files smaller, perhaps by using a JPG of each page instead of a PDF. But this was problematic, as the fine print became hard to read if someone actually wanted to read an article rather than just skim the headlines.
I did some research and found an Automator script written by Jason called Newseum: Today’s Front Page. To my amazement, the script did 98% of what we wanted. I guess a lot of people are interested in having their front pages in one place! I had to customize it a little as we needed FTP upload.
I found this FTP Upload Automator action by Peter Dekker and after a little fiddling, it did just what it said it would.
The next step was cleaning up the fairly large (1-2MB) individual PDF files after the script runs. A number of people in the comment area of the original Today’s Front Page script had a similar issue. I used this modified version of the script, which included some clean-up steps.
Then I had to modify the script a little more as I wanted to first rename the combined PDF to “news.pdf” and save it in the temp directory where all the single downloaded PDFs were located. This way the FTP portion of the script could upload news.pdf and overwrite the old version on the site. Then staff members could always get to the latest headline file by visiting the URL somecompany.com/daily/news.pdf
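For anyone who would rather script this step than build it in Automator, here is a minimal Python sketch of the same idea: copy the combined PDF into the temp directory under the fixed name news.pdf, then upload it over FTP so it replaces yesterday's copy. This is not what the Automator workflow does internally. The function names, the "daily" remote folder (guessed from the URL above), and the host and login in the usage note are all assumptions for illustration, and it assumes the FTP server overwrites an existing file on STOR, which most do.

```python
import shutil
from ftplib import FTP  # only needed for the real upload shown in the usage note
from pathlib import Path


def stage_as_news_pdf(combined_pdf: Path, temp_dir: Path) -> Path:
    """Place the combined PDF in temp_dir under the fixed name news.pdf.

    Hypothetical helper: any older news.pdf in temp_dir is overwritten.
    """
    target = temp_dir / "news.pdf"
    if combined_pdf.resolve() == target.resolve():
        return target  # already in place, nothing to do
    shutil.copyfile(combined_pdf, target)  # overwrites an existing target
    combined_pdf.unlink()  # remove the original so only news.pdf remains
    return target


def upload_news_pdf(ftp, local_pdf: Path, remote_dir: str = "daily") -> None:
    """Upload local_pdf into remote_dir using an already logged-in FTP object.

    Assumes the server replaces an existing file of the same name on STOR.
    """
    ftp.cwd(remote_dir)
    with local_pdf.open("rb") as fh:
        ftp.storbinary(f"STOR {local_pdf.name}", fh)
```

A run might look like the lines below, with a placeholder host, user, password, and paths that you would swap for your own:

```python
news = stage_as_news_pdf(Path("/tmp/frontpages/combined.pdf"), Path("/tmp/frontpages"))
with FTP("ftp.example.com") as ftp:
    ftp.login("username", "password")
    upload_news_pdf(ftp, news, remote_dir="daily")
```

Because the file name never changes, the link staff members use stays the same from day to day.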
Once it was all working, I created an iCal event that repeats daily and runs the Automator script.
One issue I still have is that the initially downloaded PDF files are stored in some weird directory inside /private/var/folders/. I don’t know if this directory will just keep filling with files or whether they are cleaned out by the OS on restart or something. I will have to monitor it and see what happens. I can delete the folder and files manually, but you have to be root to do it and I am not sure if it’s “OK” to do that or not. And I wonder if Automator will allow me to automatically clean that (root-owned) directory.
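If the directory does turn out to keep growing, one option would be a small clean-up script run on the same daily schedule. The Python sketch below is only an illustration of that idea, not something I have wired into the workflow: the function name and the seven-day threshold are my own assumptions, and you would have to pass in whatever folder Automator actually uses on your machine. It defaults to a dry run that only lists what it would delete. It also does nothing clever about permissions, so on root-owned files the delete will fail with a PermissionError unless the script runs with enough privileges.

```python
import time
from pathlib import Path
from typing import List


def remove_stale_pdfs(folder: Path, max_age_days: float = 7.0,
                      dry_run: bool = True) -> List[Path]:
    """Return PDFs under folder older than max_age_days; delete them unless dry_run.

    Hypothetical helper: age is judged by each file's modification time.
    """
    cutoff = time.time() - max_age_days * 86400  # 86400 seconds per day
    stale = []
    # sorted() collects the matches first, so deleting while looping is safe
    for pdf in sorted(folder.rglob("*.pdf")):
        if pdf.is_file() and pdf.stat().st_mtime < cutoff:
            stale.append(pdf)
            if not dry_run:
                pdf.unlink()  # raises PermissionError on files you cannot delete
    return stale
```

Running it once with dry_run left at True and reading the returned list is a cheap way to check that it only matches the downloaded front pages before letting it delete anything.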

