theo-armour edited this page Apr 16, 2012 · 27 revisions

##File Scraping

The site is password protected using cookies.

When you use a PHP routine to access the data, PHP opens a new tab/window, and this prompts the site to request a name, email address, etc.

I am not yet familiar enough with cookies in PHP to code a routine that would deal with this issue.

There may well be some easy stuff I should learn. See http://www.w3schools.com/php/php_cookies.asp.

I therefore resorted to a "human being simulator" - AutoHotKey.

This app was used to call up the web pages and save them to particular folders: patterns-url-download-to-file.ahk
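For reference, here is a minimal sketch of how cookies might be carried between requests in plain PHP. It is not the routine used here (AutoHotKey did the actual work), and it makes assumptions: the function names `parseSetCookieHeaders`, `buildCookieHeader` and `fetchWithCookies` are hypothetical, and the site's real sign-in form is not modelled at all. The idea is simply to read the `Set-Cookie` lines from one response and send them back in a `Cookie` header on the next request.

```php
<?php
// Hypothetical helpers; none of these names come from the original routines.

// Collect name => value pairs from the "Set-Cookie" lines of a response.
function parseSetCookieHeaders(array $headers): array
{
    $cookies = [];
    foreach ($headers as $header) {
        if (stripos($header, 'Set-Cookie:') !== 0) {
            continue; // not a cookie header
        }
        // Keep only the leading "name=value" part; ignore attributes such as Path.
        $pair = trim(explode(';', substr($header, strlen('Set-Cookie:')), 2)[0]);
        if (strpos($pair, '=') === false) {
            continue;
        }
        [$name, $value] = explode('=', $pair, 2);
        $cookies[trim($name)] = trim($value);
    }
    return $cookies;
}

// Build the "Cookie" request header that sends the stored cookies back.
function buildCookieHeader(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return 'Cookie: ' . implode('; ', $pairs);
}

// Fetch a page, sending any stored cookies. Returns [body, updated cookies];
// body is false if the request fails.
function fetchWithCookies(string $url, array $cookies = []): array
{
    $options = ['http' => ['method' => 'GET']];
    if ($cookies) {
        $options['http']['header'] = buildCookieHeader($cookies);
    }
    $body = file_get_contents($url, false, stream_context_create($options));
    // $http_response_header is filled in by PHP's HTTP stream wrapper.
    $newCookies = parseSetCookieHeaders($http_response_header ?? []);
    return [$body, array_merge($cookies, $newCookies)];
}
```

Passing the cookies returned by one call into the next call keeps the session alive across page requests, which is roughly what the browser does when driven by AutoHotKey.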

##Image Scraping

Luckily it is possible to view and grab any of the images with a URL:

I used this PHP routine to grab the images and diagrams: patterns-content-find-images.php
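A rough sketch of this kind of image grab is shown below. It is an illustration only, not the contents of patterns-content-find-images.php: the function names `findImageSources` and `saveImages` are hypothetical, the scan uses a simple regular expression rather than a real HTML parser, and the image URLs are assumed to be absolute.

```php
<?php
// Hypothetical sketch; function names are assumptions, not the original routine.

// Return the src attribute of every <img> tag found in an HTML string.
function findImageSources(string $html): array
{
    preg_match_all('/<img\b[^>]*\bsrc\s*=\s*["\']([^"\']+)["\']/i', $html, $matches);
    // Drop duplicates while keeping the order of first appearance.
    return array_values(array_unique($matches[1]));
}

// Download each image into $folder, keeping the original file name.
// Assumes each URL is absolute and that $folder already exists.
function saveImages(array $urls, string $folder): int
{
    $saved = 0;
    foreach ($urls as $url) {
        $path = parse_url($url, PHP_URL_PATH);
        if (!is_string($path) || basename($path) === '') {
            continue; // no usable file name in this URL
        }
        $data = @file_get_contents($url);
        if ($data !== false) {
            file_put_contents($folder . '/' . basename($path), $data);
            $saved++;
        }
    }
    return $saved;
}
```

Calling `saveImages(findImageSources($html), $folder)` for each saved page would place that pattern's images and diagrams next to its text.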

##Making folders

Each pattern is in its own folder. This allows for easy storage of text and images:

I used this routine a number of times to create the 253 folders: patterns-new-folders.php
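Creating the folders could look something like the sketch below. The function name `makePatternFolders`, the base directory and the `pattern-NNN` naming scheme are all assumptions made for illustration; the actual folder names used by patterns-new-folders.php may differ. Existing folders are skipped, so the routine can be run several times without harm.

```php
<?php
// Hypothetical sketch: the base directory and "pattern-NNN" names are assumptions.

// Create one folder per pattern and return the paths that were newly created.
function makePatternFolders(string $baseDir, int $count = 253): array
{
    $created = [];
    for ($i = 1; $i <= $count; $i++) {
        $path = $baseDir . '/' . sprintf('pattern-%03d', $i);
        // Skip folders that already exist so the routine is safe to re-run.
        if (!is_dir($path) && mkdir($path, 0777, true)) {
            $created[] = $path;
        }
    }
    return $created;
}
```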

##HtmlTidy

The original files are very messy.

I used this routine to run through all the folders and tidy the HTML: patterns-html-tidy.php
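As an illustration of the idea, the sketch below walks the pattern folders and repairs each HTML file in place. It assumes PHP's tidy extension is installed, that the HTML files sit one level below the base directory with an `.html` extension, and that the listed configuration values are acceptable; the function name `tidyHtmlInFolders` is hypothetical and this is not the contents of patterns-html-tidy.php.

```php
<?php
// Hypothetical sketch: requires PHP's tidy extension. The folder layout,
// the *.html pattern and the config values are assumptions.

// Tidy every HTML file one level below $baseDir; return how many were rewritten.
function tidyHtmlInFolders(string $baseDir): int
{
    if (!extension_loaded('tidy')) {
        throw new RuntimeException('The tidy extension is not installed.');
    }
    $config = ['indent' => true, 'wrap' => 0, 'output-xhtml' => true];
    $count = 0;
    foreach (glob($baseDir . '/*/*.html') ?: [] as $file) {
        // tidy_repair_string() returns the cleaned markup, or false on failure.
        $clean = tidy_repair_string(file_get_contents($file), $config, 'utf8');
        if ($clean !== false) {
            file_put_contents($file, $clean);
            $count++;
        }
    }
    return $count;
}
```

Because the files are overwritten, it may be worth keeping a copy of the untidied originals before running something like this.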

##More Cleanup

Even after the HTML tidy, there
