Dec 24

Web Toolbox

I've collected some helpful tools for web developers

I've collected various website testing tools into a webtoolbox repository on GitHub. This includes my earlier red_spider work as well as a few other utilities that have come in handy:

red_spider

A spider based on Mark Nottingham's redbot. It produces a nice HTML report of page cacheability and, optionally, HTML validation, and since it uses the same nbhttp library it's pretty fast, too. There are a number of filtering options, and it can save lists of page and media URLs for use with tools like wk-bench or tornado-bench.

log_replay

If you need to replay webserver log files at something approximating real time, log_replay is your friend. It uses Tornado's non-blocking HTTP client (based on pycurl; at some point it would be good to refactor down to just that) to fetch all of the URLs, but will sleep any time it gets too far ahead of the simulated time.
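The pacing idea can be sketched in a few lines. This is an illustration rather than the actual log_replay code: it assumes Common Log Format timestamps, uses only the standard library, and the names parse_clf_time and seconds_to_wait (along with the speedup and max_lead parameters) are made up for the example.

```python
from datetime import datetime

# Timestamp layout used inside [...] in Common Log Format (an assumption
# about the input; adjust for other log formats).
CLF_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_clf_time(line):
    """Return the timestamp found between '[' and ']' in a log line."""
    start = line.index("[") + 1
    end = line.index("]", start)
    return datetime.strptime(line[start:end], CLF_TIME_FORMAT)


def seconds_to_wait(first_time, entry_time, elapsed_wall, speedup=1.0, max_lead=0.0):
    """Return how long to sleep before replaying the entry at entry_time.

    first_time   -- timestamp of the first entry in the log
    elapsed_wall -- real seconds since the replay started
    speedup      -- hypothetical knob: 2.0 replays twice as fast as the log
    max_lead     -- how far ahead of simulated time we tolerate before sleeping
    """
    simulated_offset = (entry_time - first_time).total_seconds() / speedup
    lead = simulated_offset - elapsed_wall
    return lead if lead > max_lead else 0.0


if __name__ == "__main__":
    first = parse_clf_time('127.0.0.1 - - [24/Dec/2010:10:00:00 +0000] "GET / HTTP/1.1" 200 512')
    later = parse_clf_time('127.0.0.1 - - [24/Dec/2010:10:00:10 +0000] "GET /a HTTP/1.1" 200 128')
    # Four real seconds in, an entry logged ten seconds after the first is not due yet.
    print(seconds_to_wait(first, later, elapsed_wall=4.0))
```

A replay loop would call seconds_to_wait before issuing each request and sleep for the returned value; when the replay has fallen behind, the function returns 0.0 and the request goes out immediately.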

tornado-bench

This program also uses Tornado's non-blocking HTTP client: it takes a big list of URLs and retrieves them as quickly as possible.
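For a rough idea of the "fetch a list of URLs as fast as possible" approach, here is a minimal sketch. Note that it swaps Tornado's non-blocking client for a standard-library thread pool so that it runs without extra dependencies; the fetch and bench names and the workers parameter are assumptions for this example, not part of tornado-bench.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.error import HTTPError
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Fetch one URL and return (url, status or None, seconds taken)."""
    start = time.perf_counter()
    try:
        with urlopen(url, timeout=timeout) as response:
            response.read()  # drain the body so the timing covers the full download
            status = response.status
    except HTTPError as error:
        status = error.code  # 4xx/5xx responses still carry a status code
    except Exception:
        status = None  # DNS failure, timeout, malformed URL, etc.
    return url, status, time.perf_counter() - start


def bench(urls, fetcher=fetch, workers=16):
    """Fetch every URL concurrently; return (results in input order, total seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fetcher, urls))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    import sys

    # Usage: python bench_sketch.py urls.txt  (one URL per line)
    with open(sys.argv[1]) as handle:
        url_list = [line.strip() for line in handle if line.strip()]
    results, total = bench(url_list)
    failures = sum(1 for _, status, _ in results if status is None or status >= 400)
    print(f"{len(results)} URLs in {total:.2f}s, {failures} failures")
```

Threads are a simpler but heavier substitute for an event loop; for very large URL lists a non-blocking client like the one tornado-bench uses should scale better.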

wk-bench

A Mac OS X-specific tool which measures user-perceived page-load performance. It uses PyObjC to load a full WebKit browser, processes a list of URLs and reports the time taken from the start of the HTTP request until the browser fires the didFinishLoadForFrame event, which includes things like image loading, Flash, JavaScript, etc. It is also useful for catching JavaScript errors: they are logged to the console, where they can easily be extracted to verify that you don't have on-load errors site-wide.