The Portugal News Epaper Engine and Website

Project Description

There is more to this project than meets the eye. Not because of the complexity of the website, 90% of which was taken from existing software, but because everything from the production of the newspaper pages to the publishing of those pages on the epaper.theportugalnews.com portal is automatic.

The project is divided into two mechanisms:

The Website

The flash content of the site comes from the software that produces the presentation, so there is nothing new there apart from a bit of customisation and re-branding. The coded part I spent the most hours on was the library section (second image from the slider). The HTML produced by the software that processes the flash did not output any kind of library, so I had to modify the template to accommodate browsing of past editions, since The News Group wanted to make them available.
So with a little help from the marvellous JavaScript tool jQuery I managed to produce an interface that presents current and old editions, with their respective edition numbers, on a yearly basis.
Because a new edition is uploaded every week, the script was programmed with an algorithm I wrote so that it knows the exact edition number and date, since there are a couple of scenarios where the newspaper does not come out in a particular week.
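
A minimal sketch of how such a calculation could look, in JavaScript. This is not the original script: the function name, the base edition and the list of skipped weeks are all assumptions made up for illustration. The idea is to count publication weeks from a known edition and leave out the weeks in which no paper came out.

```javascript
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Parse "YYYY-MM-DD" as UTC midnight so daylight saving changes
// cannot shift a week boundary.
function utc(isoDate) {
  return Date.parse(isoDate + "T00:00:00Z");
}

// Returns { number, date } for the latest edition published on or before `today`.
// `base` is a known edition (hypothetical values in the usage below) and
// `skippedDates` lists the publication days on which no paper came out.
function currentEdition(today, base, skippedDates) {
  const weeksElapsed = Math.floor((utc(today) - utc(base.date)) / WEEK_MS);
  if (weeksElapsed < 0) throw new RangeError("date is before the base edition");

  const skipped = new Set(skippedDates);
  let number = base.number;
  let date = base.date;
  for (let w = 1; w <= weeksElapsed; w++) {
    const day = new Date(utc(base.date) + w * WEEK_MS).toISOString().slice(0, 10);
    if (!skipped.has(day)) {
      number += 1; // only weeks with a paper advance the edition number
      date = day;
    }
  }
  return { number, date };
}

// Usage with made-up numbers: one skipped week between the base and "today".
const edition = currentEdition(
  "2012-11-26",
  { number: 1000, date: "2012-11-03" },
  ["2012-11-17"]
);
console.log(edition); // { number: 1002, date: '2012-11-24' }
```

Keeping the skipped weeks as plain data means a cancelled edition only needs a new entry in the list, not a change to the logic.
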
The algorithm let me keep an accurate date / edition record and instruct the system to load a new paper every week on that specific day. It also checks for the presence of the correct files, folders and cover image; if any of those items cannot be found, the edition is not displayed.
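
The presence check can be sketched along these lines. The folder layout and file names (`cover.jpg`, `index.html`, `edition.pdf`) are assumptions, not the real ones, and `exists` is a stand-in for whatever lookup the site actually performs.

```javascript
// Hypothetical layout: <year>/<number>/ holds the flash page, cover image and PDF.
function requiredPaths(edition) {
  const dir = `${edition.year}/${edition.number}`;
  return [`${dir}/`, `${dir}/cover.jpg`, `${dir}/index.html`, `${dir}/edition.pdf`];
}

// Lists every required item that the `exists` predicate cannot find.
function missingItems(edition, exists) {
  return requiredPaths(edition).filter((path) => !exists(path));
}

// An edition is shown only when nothing is missing.
function isDisplayable(edition, exists) {
  return missingItems(edition, exists).length === 0;
}

// Usage: the set stands in for the server's real file listing.
const onServer = new Set(["2012/1190/", "2012/1190/index.html", "2012/1190/edition.pdf"]);
const check = (path) => onServer.has(path);
console.log(isDisplayable({ year: 2012, number: 1190 }, check)); // false
console.log(missingItems({ year: 2012, number: 1190 }, check));  // [ '2012/1190/cover.jpg' ]
```
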

The Engine

So every week, what we got was a bunch of PDF files named with the edition number and page, in a folder, somewhere, on some network server... So basically, the person in charge of the epaper had to do the following on a weekly basis:

    1. Keep an accurate online library with as many editions as possible (meaning: do the process below x number of times)
    2. Account for the PDFs that come out of production every week, and make sure there are no missing pages...
    3. Merge those PDFs in the correct order, yes, because the flash software only accepts one input...
    4. Make one or two hacks (you don't need to know)...
    5. Produce a downloadable version (shrink the original production size to a quarter)
    6. Send the merged copy to the software that produces the flash.
    7. Wait for it, it takes 30 min to complete...
    8. Once completed, create specific folders with year and edition numbers on the web server and upload the flash and PDF versions via FTP.
    9. Hope that everything goes according to plan and that this particular employee does not take leave or go off sick...
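
Steps 2 and 3 are the easiest to get wrong by hand, so here is a small sketch of them. The file name pattern `<edition>_<page>.pdf` is an assumption (the real naming scheme is not shown here), and `planMerge` is a hypothetical helper that only decides the order; the merge itself is left to whatever PDF tool is used.

```javascript
// Given the file names found in the production folder, return the pages of one
// edition in merge order plus the page numbers that are missing.
// `expectedPages` is optional; without it, gaps after the last page found
// cannot be detected.
function planMerge(fileNames, edition, expectedPages) {
  const pattern = new RegExp(`^${edition}_(\\d+)\\.pdf$`, "i");
  const byPage = new Map();
  for (const name of fileNames) {
    const match = pattern.exec(name);
    if (match) byPage.set(Number(match[1]), name);
  }

  const lastPage = expectedPages ?? Math.max(0, ...byPage.keys());
  const ordered = [];
  const missing = [];
  for (let page = 1; page <= lastPage; page++) {
    if (byPage.has(page)) ordered.push(byPage.get(page));
    else missing.push(page);
  }
  return { ordered, missing };
}

// Usage: page 3 is absent and page 10 must come after page 2, not before it.
const plan = planMerge(
  ["1190_10.pdf", "1190_2.pdf", "1190_1.pdf", "1189_3.pdf", "notes.txt"],
  1190,
  4
);
console.log(plan.ordered); // [ '1190_1.pdf', '1190_2.pdf' ]
console.log(plan.missing); // [ 3, 4 ]
```

Walking the page numbers as integers avoids the classic trap of a plain alphabetical sort, which would place page 10 before page 2.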

So by now you should be thinking "what a mission". Hell yeeaaah!!! Mainly because it's time consuming and it takes several hours to instruct someone to do this manually, not to mention that missing one step kills the whole process...

So the idea at the time, without the knowledge or skills I have now, was to create a PHP script and run it via the CLI.

Because this required several applications to be installed, I created a virtual machine with a dedicated OS for it, especially because I didn't want to redo all of the scripting and settings if the hardware had to be moved or suddenly crashed.
With the help of the algorithm that calculates editions and dates, I could easily extend the process of uploading one edition to upload all the remaining editions we had sitting on our network server and place them on the web server with the correct dates and years.

And that was the whole point! Even if I had to rebuild the library on the web server today, the only problem would be the time it would take; the "engine" would be capable of making that happen in a single run.

That is mainly because, before it starts its journey of compiling and producing the flash version, it makes several pre-checks. It connects to the web server via FTP and to the local network, looks for all the available editions and makes sure that the web server has as many editions online as possible, comparing both locations for missing or incomplete editions and adding them to a queue for later processing. This way we can guarantee that the online library matches the in-house one.
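
The comparison itself boils down to a set difference. The sketch below is in JavaScript for illustration only (the engine is a PHP script), and the `year/number` ids, the `complete` flag and the `buildQueue` name are assumptions; it takes the two listings as plain data and leaves the FTP and network calls out.

```javascript
// `localEditions`: ids of editions available in-house, e.g. "2012/1190".
// `remoteEditions`: what the web server reports, with a flag saying whether
// every expected file of that edition is present.
function buildQueue(localEditions, remoteEditions) {
  const completeOnline = new Set(
    remoteEditions.filter((e) => e.complete).map((e) => e.id)
  );
  // Anything not fully online is queued: missing and incomplete editions alike.
  return localEditions
    .filter((id) => !completeOnline.has(id))
    .sort((a, b) => a.localeCompare(b));
}

// Usage with made-up ids: 1189 is half-uploaded and 1190 is not online at all.
const queue = buildQueue(
  ["2012/1190", "2012/1188", "2012/1189"],
  [
    { id: "2012/1188", complete: true },
    { id: "2012/1189", complete: false },
  ]
);
console.log(queue); // [ '2012/1189', '2012/1190' ]
```
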

After that it is simply lines of code that automate the process described above. When it finishes processing an edition, it uploads it via FTP to the web server and decompresses it; the website then does all the work of making it public.

 

Project Information

Categories: Web Development

Url: http://epaper.theportugalnews.com

Client: The Portugal News Group

Project Timeline

Start: Nov, 2012

End: Dec, 2012

 

Contacts

João Vieira

Skype: jcv.pt

Email: info@joao-vieira.pt

About

This is my personal page. Here you will find IT-related projects, discussions and reviews. Feel free to comment and leave your input.