theo-armour edited this page Apr 16, 2012 · 27 revisions

###File Scraping
The site is password protected using cookies.

When you use a PHP routine to access the data, the PHP opens a new tab/window, and this prompts the site to request a name, email address, etc.

I am not yet sufficiently capable of dealing with cookies in PHP to code a routine that would deal with this issue.

There may well be some easy stuff I should learn...
Hello. See W3Schools.

I therefore resorted to a "human being simulator" - AutoHotKey.

This app was used to call up the web pages and save them to particular folders:
patterns-url-download-to-file.ahk
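
For anyone facing the same cookie problem, a script can often keep the site's cookies between requests instead of driving a browser. Below is a minimal sketch in Python, not the PHP or AutoHotKey used for this project. The names `make_session` and `fetch_to_file` are hypothetical, and whether this is enough depends on how the site issues its cookies, which I have not verified.

```python
import urllib.request
from http.cookiejar import CookieJar
from pathlib import Path


def make_session():
    """Return an opener that stores cookies and sends them back on later requests."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def fetch_to_file(opener, url, target):
    """Download url through the cookie-aware opener and save the bytes to target."""
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)  # create the pattern folder if needed
    with opener.open(url) as response:
        target.write_bytes(response.read())
    return target
```

The idea would be to request the sign-in page first with the same `opener`, so that any cookie the site sets is already in the jar when the pattern pages are fetched.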

###Image Scraping
Luckily it is possible to view and grab any of the images with a URL:

I used this PHP routine to grab the images and diagrams:
patterns-content-find-images.php
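
The general idea of finding the images in a saved page can be sketched as follows. This is Python rather than the PHP routine named above, and it is only an illustration: `ImageCollector` and `find_images` are hypothetical names, and the base URL is whatever address the page was saved from.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <IMG SRC=...> also matches.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.image_urls.append(urljoin(self.base_url, src))


def find_images(html, base_url):
    """Return the absolute URLs of all images referenced in html."""
    collector = ImageCollector(base_url)
    collector.feed(html)
    return collector.image_urls
```

Each returned URL can then be downloaded into the folder of the pattern it belongs to.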

###Making folders
Each pattern is in its own folder. This allows for easy storage of text and images:

I used this routine a number of times to create the 253 folders:
patterns-new-folders.php
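
Creating one folder per pattern is simple to script. The sketch below is in Python, not the PHP routine named above. The `pattern-001` naming scheme and the function name `make_pattern_folders` are assumptions for illustration, since the actual folder names are not given here.

```python
from pathlib import Path


def make_pattern_folders(root, count=253):
    """Create one folder per pattern under root; safe to run more than once."""
    root = Path(root)
    created = []
    for number in range(1, count + 1):
        folder = root / f"pattern-{number:03d}"  # naming scheme is an assumption
        folder.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(folder)
    return created
```

Because `exist_ok=True` is set, the routine can be run a number of times without disturbing folders that already hold text and images.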

###HtmlTidy
The original files are very messy.

I used this routine to run through all the folders and tidy the HTML:
patterns-html-tidy.php
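
A rough sketch of the same walk-and-tidy idea, written in Python and calling the HTML Tidy command-line program instead of the PHP tidy functions. It assumes a `tidy` executable is on the path. The function names are hypothetical, and the options chosen (`-quiet`, `-modify`, `-indent`) are my guess at a sensible default rather than the settings actually used.

```python
import shutil
import subprocess
from pathlib import Path


def tidy_command(html_file):
    """Build the HTML Tidy command line for one file: quiet, modify in place, indent."""
    return ["tidy", "-quiet", "-modify", "-indent", str(html_file)]


def tidy_all(root):
    """Run tidy on every .htm/.html file under root and return the files found."""
    files = sorted(p for p in Path(root).rglob("*")
                   if p.suffix.lower() in {".htm", ".html"})
    if shutil.which("tidy") is None:
        return files  # tidy is not installed: only report what would be processed
    for html_file in files:
        # tidy exits with 1 when it only has warnings, so a non-zero code is not fatal here
        subprocess.run(tidy_command(html_file), check=False)
    return files
```

Since `-modify` rewrites files in place, it is worth keeping a copy of the originals before running it over all the folders.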

###More Cleanup
Even after the HTML tidy, there was still a lot of detritus in the files.

I greatly improved my regex skills in building this file:
patterns-preg_replace-files.php
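
The approach amounts to running a list of search-and-replace patterns over every file. Here is a small sketch of that idea in Python (the routine above uses PHP's `preg_replace`). The specific rules are illustrative guesses at typical leftovers, not the rules in the actual file, and `clean_html` is a hypothetical name.

```python
import re

# Each rule is (pattern, replacement). These rules are examples only.
CLEANUP_RULES = [
    (re.compile(r"<!--.*?-->", re.DOTALL), ""),                    # HTML comments
    (re.compile(r"</?font\b[^>]*>", re.IGNORECASE), ""),           # <font> wrappers
    (re.compile(r"<p>\s*(?:&nbsp;)?\s*</p>", re.IGNORECASE), ""),  # empty paragraphs
    (re.compile(r"[ \t]+\n"), "\n"),                               # trailing whitespace
    (re.compile(r"\n{3,}"), "\n\n"),                               # runs of blank lines
]


def clean_html(text):
    """Apply every cleanup rule in order and return the cleaned text."""
    for pattern, replacement in CLEANUP_RULES:
        text = pattern.sub(replacement, text)
    return text
```

Order matters: removing comments and wrappers first can leave behind empty paragraphs and blank lines, which the later rules then sweep up.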

###Cleaning Up the Menu File
The menu is in a separate HTML file.

I used this routine to replace the ALL UPPERCASE TEXT with something nicer:
patterns-clean-navigation.php
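
One way to do this kind of replacement, sketched in Python rather than PHP: convert only the text between tags, and only when it is entirely uppercase, so that tag names and `href` values are left alone. `soften_caps` is a hypothetical name, and title case is my assumption about what "something nicer" means.

```python
import re

# Text between a closing ">" and the next "<", i.e. a text node.
TEXT_NODE = re.compile(r">([^<>]+)<")


def soften_caps(html):
    """Convert text nodes that are entirely uppercase to title case; leave tags alone."""
    def fix(match):
        text = match.group(1)
        if text.isupper():
            text = text.title()
        return f">{text}<"
    return TEXT_NODE.sub(fix, html)
```

Note that `str.title()` capitalizes the letter after an apostrophe (`DON'T` becomes `Don'T`), so entries like that would need a manual fix or an extra rule.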

###Extract Categories from Menu Files
The menu files contained something like a dozen sub-categories for patterns.

I used this file to find and extract the categories:
patterns-extract-cats.php

###Creating Summaries
There are a number of web sites that will allow you to supply a link or text and return a generated summary of that text.

[smmry.com](http://smmry.com) seemed the simplest and provides an API to boot.

Even with the API I had issues accessing the results using PHP, so again I resorted to using AutoHotKey:
patterns-smmry-com-get-and-download-to-file.ahk
