XULRunner and Crowbar – Crawling of sorts?

by Keiron on November 30, 2008

This was going to be a tutorial on getting these two things running to achieve everything I want. Sadly, I can’t work out how to get the last step working, which is to navigate the returned Ajax page so that I can extract different information.

As such, this is more of a guide to getting the two things installed and working – if you have any more luck than I did getting Ajax navigation working, then let me know!

XULRunner

First things first, I downloaded the Windows version of XULRunner from (look in the runtimes directory!):

http://releases.mozilla.org/pub/mozilla.org/xulrunner/releases/

(Unpacking takes a while: the 8.23MB download contained 302 items totalling 18.8MB!)

Crowbar

Not such a simple download for the uninitiated. It’s not actually released as a package; the files live in a Subversion repository, so you’ll need a Subversion client to download it. I don’t have one on the machine I’m working on, so another post will cover the ins and outs of downloading Crowbar with Subversion.

All Downloaded and Unpacked – Onwards we go!

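Back to the instructions here, which tell me that once I’ve done all this I should open a command prompt (thankfully a place I’m familiar with) and run the following (as I read it, %XULRUNNER_HOME% and %CROWBAR% stand for the folders the two downloads ended up in):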

c:\> %XULRUNNER_HOME%\xulrunner.exe --install-app %CROWBAR%\xulapp
c:\> cd %CROWBAR%\xulapp
c:\> %XULRUNNER_HOME%\xulrunner.exe application.ini

Windows Firewall blocked the program, but that was kind of expected, so I unblocked it.
I now have a Crowbar window and an Error Console; apparently I can use Crowbar by visiting:

http://127.0.0.1:10000/

On doing so, a nice little web page pops up, much like a web proxy, asking me what page I want to fetch.
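If the page doesn’t come up, it’s worth checking whether anything is actually listening on that port before blaming the browser. Here’s a minimal sketch in Python (my choice of language, nothing to do with Crowbar itself); the function name is made up, and the host and port are simply the ones from the address above.

```python
import socket


def crowbar_is_listening(host="127.0.0.1", port=10000, timeout=2.0):
    """Return True if something accepts TCP connections on host:port.

    This only proves that a server is listening there; it does not
    prove that the server is Crowbar.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or blocked (e.g. by a firewall).
        return False


if __name__ == "__main__":
    print("Crowbar port open:", crowbar_is_listening())
```

A False here with the Crowbar window open would make me look at the firewall again.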

I entered the URL of my Ajax-based page and the next thing I know, I’m being presented with all the source code for that page, including all the output from the JavaScript that wasn’t there when I fetched the page with PHP’s curl!

Now apparently I can also drive this from curl (why do I get the feeling I’ll have to install a fair bit of software on my laptop to get this all working over there?).
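For what it’s worth, my understanding of the curl route is that it boils down to POSTing a form-encoded `url` field to the same address, optionally with a `delay` in milliseconds to give the page’s scripts time to run. Treat both parameter names as an assumption on my part and check them against the Crowbar instructions. A rough Python equivalent, with function names of my own invention, might look like this:

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Address of the locally running Crowbar server (from the step above).
CROWBAR_ENDPOINT = "http://127.0.0.1:10000/"


def build_crowbar_request(target_url, delay_ms=3000, endpoint=CROWBAR_ENDPOINT):
    """Build a POST request asking Crowbar to fetch and render target_url.

    The 'url' and 'delay' field names are assumptions, not confirmed here.
    """
    body = urlencode({"url": target_url, "delay": delay_ms}).encode("ascii")
    # Supplying data makes urllib send a POST instead of a GET.
    return Request(endpoint, data=body)


def fetch_rendered(target_url, delay_ms=3000, timeout=60):
    """Return the page source as Crowbar hands it back (needs Crowbar running)."""
    request = build_crowbar_request(target_url, delay_ms)
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Placeholder URL; swap in the Ajax page you actually want.
    print(fetch_rendered("http://example.com/")[:500])
```

Keeping the request-building separate from the fetching means the first half can be checked without Crowbar running at all.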

OK, all well and good: we’ve fetched one page. But that page has a dropdown box on it that changes the entire page when you use it – how do I go about “Crowbarring” my way around that?
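This doesn’t solve the navigation problem, but a first step is at least finding out what the dropdown offers. Assuming the source Crowbar returns contains an ordinary `<select>` element (a guess; your page may build its dropdown differently), a small sketch with Python’s built-in HTML parser can list the options. The class and function names are mine.

```python
from html.parser import HTMLParser


class SelectOptionParser(HTMLParser):
    """Collect (select name, option value, option label) tuples from HTML."""

    def __init__(self):
        super().__init__()
        self.options = []
        self._in_select = False
        self._select_name = None
        self._current = None  # [value, label parts] for the open <option>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select":
            self._in_select = True
            self._select_name = attrs.get("name")
        elif tag == "option" and self._in_select:
            self._close_option()  # closing </option> tags are optional in HTML
            self._current = [attrs.get("value"), []]

    def handle_data(self, data):
        if self._current is not None:
            self._current[1].append(data)

    def handle_endtag(self, tag):
        if tag == "option":
            self._close_option()
        elif tag == "select":
            self._close_option()
            self._in_select = False
            self._select_name = None

    def _close_option(self):
        if self._current is None:
            return
        value, parts = self._current
        label = "".join(parts).strip()
        # An <option> without a value attribute submits its label text.
        self.options.append(
            (self._select_name, label if value is None else value, label)
        )
        self._current = None


def list_select_options(html):
    parser = SelectOptionParser()
    parser.feed(html)
    parser.close()
    return parser.options
```

If the dropdown turns out to simply reload the page with a different query string, each value could be fetched through Crowbar as its own URL; if it fires off a script instead, this gets you no further than I got.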

With little documentation I can’t see a way… Back to the drawing/scraping board?

{ 3 comments }

1 Using Subversion to get Crowbar | Skillett.com 11.30.08 at 10:51 am

[...] post is just a reference point for another post, a Subversion client is needed for downloading Crowbar, so I downloaded TortoiseSVN available [...]

2 Dan 01.14.09 at 3:51 pm

FireWatir might be a better tool for you:

http://wiki.openqa.org/display/WTR/FireWatir

Dan

3 Keiron 01.14.09 at 5:06 pm

I eventually resorted to outsourcing it via rentacoder to an absolutely excellent coder in the US.

He provided exactly what I needed in PHP to the spec I provided with no extensions or the like! Was really pleased with his work – I think he was kind of surprised when I had no complaints or changes that needed making as well!
