An idea

Here is an idea that has been bubbling in the back of my mind for a while.

I don’t have the time at the moment to make it a reality, but if there is a smart developer or two out there looking for a project, give me a shout.

The problem

It is very difficult to ensure that all of the pages in a large, dynamic website contain valid HTML, especially as changes are made to those pages over time.


There are a few reasons:

  • It is very time-consuming to manually validate a large number of HTML pages, and it’s hard to make the case for spending extra testing time on this.
  • Dynamic pages change (duh!) so it is difficult to validate all of the different permutations.
  • Pages that require users to post information in a form or log in cannot be easily accessed by an automated test script, so they are often not validated at all.

The solution

While developing and testing a site, or even just while browsing, we typically visit a large number of different pages.

The proposed tool will run in the background while we use the website and capture the details of each HTTP request and corresponding response. The HTML that is returned can then be validated asynchronously by the tool.
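A rough sketch of the "validate asynchronously" part, in Python. Everything here is hypothetical: the `CapturedPage` structure, the `BackgroundValidator` class, and especially `placeholder_validate`, which only stands in for a real validator. The point is just that captured pages go onto a queue and a worker thread checks them, so browsing is never held up.

```python
import queue
import threading
from dataclasses import dataclass, field


@dataclass
class CapturedPage:
    """One captured request/response pair (hypothetical structure)."""
    url: str
    html: str
    problems: list = field(default_factory=list)


def placeholder_validate(html):
    """Stand-in check only; a real tool would call a proper validator here."""
    return [] if "<title>" in html.lower() else ["missing <title> element"]


class BackgroundValidator:
    """Validates captured pages on a worker thread so browsing is not blocked."""

    def __init__(self, validate=placeholder_validate):
        self._validate = validate
        self._pending = queue.Queue()
        self.results = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def capture(self, url, html):
        """Called by the capture layer for each HTML response; returns at once."""
        self._pending.put(CapturedPage(url, html))

    def _run(self):
        while True:
            page = self._pending.get()
            page.problems = self._validate(page.html)
            self.results.append(page)
            self._pending.task_done()

    def wait(self):
        """Block until every captured page has been validated."""
        self._pending.join()
```

The capture layer (browser extension, proxy, or whatever it turns out to be) would call `capture()` for each HTML response, and the result viewer would read `results` whenever the user asks for a report.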

At the end of the browsing session (or even during, if required) the tool will provide a list of the pages that have been visited and a count of the number of warnings or errors contained on each. The user will be able to drill into the details of any page to investigate the cause of the warnings or errors.

This would allow developers and testers to quickly and easily validate a large number of pages, including those that require users to post form information and log in, without any extra work over and above what they are already doing.

Required Components

1. A program that runs in the background collecting details of each HTTP request and HTML response

Perhaps this would be a Firefox extension? Or, a Windows application that hosts IE in a sub-window?

The user interface should be very simple: e.g. a start/stop button, plus perhaps an indicator of the page count and/or total warnings/errors, or a summary of the previous request à la the Firefox HTML Validator add-on. (btw, does anybody know of an equivalent add-on which works on a Mac?)

2. An XHTML Validator

This would ideally give equivalent results to the W3C validation tool, which will require an SGML parser (but if this is too painful it could just implement HTML Tidy-like validation for starters).

This could also be extended in the future to validate CSS, JavaScript, and also run standard accessibility tests.
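To give a feel for the "for starters" end of that range, here is a minimal sketch of a tag-balance check built on Python's standard `html.parser`. It is nowhere near W3C-equivalent validation and it is not how HTML Tidy works; it only flags tags that are never closed or closed without being opened, which is a reasonable strictness for XHTML. The names `TagBalanceChecker` and `check_tag_balance` are made up for this example.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class TagBalanceChecker(HTMLParser):
    """Tracks open tags on a stack and records anything left unbalanced."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of (tag name, line number where it opened)
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_tags.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        if tag not in [name for name, _ in self.open_tags]:
            line = self.getpos()[0]
            self.problems.append(f"line {line}: </{tag}> has no matching open tag")
            return
        # Anything opened after the matching tag was never closed.
        while self.open_tags:
            name, line = self.open_tags.pop()
            if name == tag:
                break
            self.problems.append(f"line {line}: <{name}> was never closed")


def check_tag_balance(html):
    """Return a list of human-readable problems; an empty list means balanced."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    for name, line in checker.open_tags:
        checker.problems.append(f"line {line}: <{name}> was never closed")
    return checker.problems
```

For example, `check_tag_balance("<p><b>hi</p>")` reports that the `<b>` on line 1 was never closed. A function like this could be swapped out later for a proper SGML-based validator without changing anything else in the tool.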

3. Result viewer

A simple web interface with two views would do the trick:

  • List view
    • Identify each page by URL and/or HTML title
    • Identify pages which contain errors (perhaps using simple green, orange and red light icons)
  • Details view
    • Lists all warnings/errors for the selected page (with HTML snippets)
    • View full HTTP request and response details
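The list view could be driven by something as small as the sketch below. The `PageResult` fields and the colour mapping (red for any errors, orange for warnings only, green for a clean page) are my assumptions about how the light icons would work, not a settled design.

```python
from dataclasses import dataclass


@dataclass
class PageResult:
    """Validation totals for one visited page (hypothetical structure)."""
    url: str
    title: str
    warnings: int
    errors: int


def light(result):
    """Map a page's totals to a traffic-light colour (assumed mapping)."""
    if result.errors:
        return "red"
    if result.warnings:
        return "orange"
    return "green"


def list_view(results):
    """Return one text row per page, worst pages first."""
    ordered = sorted(results, key=lambda r: (-r.errors, -r.warnings, r.url))
    return [
        f"[{light(r)}] {r.errors} errors, {r.warnings} warnings: {r.url} ({r.title})"
        for r in ordered
    ]
```

Sorting the worst pages to the top means the user sees what needs attention first; the details view would then be a lookup by URL.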

Extra for experts

It would be nice to allow the user to compare results for a given page to the results from previous sessions, so that trends can be identified.
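As a sketch of what that comparison might look like, assume each session is stored as a simple mapping from URL to the number of problems found (an assumption; the real storage format is undecided). The function name `compare_sessions` is hypothetical.

```python
def compare_sessions(previous, current):
    """Label each page in the current session relative to the previous one.

    Both arguments map URL -> total number of warnings and errors.
    """
    changes = {}
    for url, count in current.items():
        if url not in previous:
            changes[url] = "new"
        elif count > previous[url]:
            changes[url] = "worse"
        elif count < previous[url]:
            changes[url] = "better"
        else:
            changes[url] = "unchanged"
    return changes
```

Pages that were visited last time but not this time are simply left out, since there is nothing new to say about them.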

Comments welcome

What do you think?

Would something like this be useful?

Does something like this already exist?

If you have any thoughts or suggestions comment away. :-)

10 thoughts on “An idea”

  1. I still haven’t done my own website but I look after several. The joys of being busy. Your comment about an XHTML background validator rang some bells with me. I think it would be an excellent idea, and maybe some sort of JavaScript and CSS validation as well. A plugin seems the obvious choice.

    I leave my Firefox JavaScript console open while I’m developing pages; it shows me errors in scripts as well as CSS and some other errors. Then I go to some other internet site and the errors just pour out. The front page is pretty good, it only has six (all minor CSS errors). The BNZ has 18. That’s a worry.

    Best of luck in the new job.

  2. Have you seen Charles from Karl at Cactus Lab? This is an HTTP proxy which already logs all the traffic from IE or Firefox. Maybe he should plug in a validator to this so it would do what you want. That would be an interesting tool.

  3. Validation on the Mac – I use the Web Developer Toolbar, which has:

    – Validate HTML
    – Validate Local HTML

    plus validation for CSS, feeds, links, accessibility.

    It submits your content to W3C and others.

  4. On a bespoke CMS I looked after a few years ago, the system had added validation of any new pages or edits through Tidy as part of its workflow. Docvert can use Trang as part of the pipeline process.

    Your idea sounds like a validation version of Live HTTP Headers

  5. Reading your post I immediately thought that would be a cool addition to Charles. But Glen beat me to communicating that. The Firefox Web Developer Toolbar has ‘Validate Local HTML’ but this requires you to actually do it. A report at the end of a browsing session, or in real time, of pages that have errors would be great.

  6. I guess I’d write this as a Firefox plugin that posts RESTfully to a web-server API hosted at TradeMe. People could install the plugin and it’d send back any pages that had validation errors.

    To think about this more generically though (not just about TradeMe) users would install a ‘Validation Feedback’ plugin and when visiting a new site with a meta tag of [meta name=”ValidationFeedback” content=”url=”/] it would ask the user whether they wanted to assist and – if so – it’d do so transparently in the future. This way it’s more abstract than just about TradeMe.

    It doesn’t seem like there’d be any point to storing up lots of validation feedbacks to send in batch… I suppose there’d be some redundancy over multiple files that a zip archive could take advantage of. Seems like a nice optimisation rather than a core feature though, pretty negligible difference. Also if you get overwhelmed it’d be easy to tell users to stop sending in feedback… just ditch the meta tag.

    (BTW, can we get a preview comment option on this blog? I was going to post some HTML but then I don’t know what filters you have Rowan)

    Oh, and everyone should use Docvert, ok? It just makes sense :)


  7. You want to be able to automate it in your build process. Sure, have an extension in Firefox to create some of the scripts in the first place. But what would be really cool is if you could tweak the scripts afterward to make it easier to test a range of scenarios, and then automate the running of them each time you check a change into your source repository.

  8. It could also be an interesting service to offer via a website. Publicly accessible websites could be indexed by the engine, and a report mailed to the siteadmin with all the errors/warnings.
