December 13, 2006

Validation considered harmful

Filed under: semanticweb,webarch — Mark Baker @ 2:10 am

We don’t get a lot of enjoyment out of trampling sacred cows here at Coactus, honest we don’t. But we see so much bad practice out there – more so recently – that we feel compelled to speak out.

Today’s sacred cow is document validation, as performed by technologies such as DTDs and, more recently, XML Schema and RELAX NG.

Surprisingly though, we’re not picking on any one particular validation technology. XML Schema has been getting its fair share of bad press, and rightly so, but for different reasons than the ones we’re going to talk about here. We believe that virtually all forms of validation, as commonly practiced, are harmful; they are anathema at Web scale. Specifically, our argument is this:

Tests of validity which are a function of time make the independent evolution of software problematic.

Why? Consider the scenario of two parties on the Web that want to exchange a certain kind of document. Party A has an expensive support contract with BigDocCo that ensures that they’re always running the latest-and-greatest document processing software. But party B doesn’t, and so typically lags a few months behind. During one of those lags, a new version of the schema is released which relaxes an earlier stanza that constrained a certain field to the values “1”, “2”, or “3”; “4” is now a valid value. So party A, with its new software, happily fires off a document to B as it often does, but this document includes the value “4” in that field. What happens? Of course B rejects it; by B’s older schema it’s an invalid document, and an alert is raised with the human administrator, dramatically increasing the cost of document exchange. All because evolvability wasn’t baked in, because a schema was used in its default mode of operation: to restrict rather than to permit.
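To make the scenario concrete, here is a minimal sketch in Python. The names (`ALLOWED_V1`, `ALLOWED_V2`, `strict_validate`, and the field name `field`) are our own inventions for illustration, not part of any real schema language; the only point is that the same document is valid or invalid depending on which version of the rules the receiver happens to be running.

```python
# Hypothetical schema versions: the allowed values for one field.
ALLOWED_V1 = {"1", "2", "3"}        # what party B's older software enforces
ALLOWED_V2 = {"1", "2", "3", "4"}   # what party A's newer software emits


def strict_validate(document, allowed):
    """Reject the whole document if the field holds an unlisted value."""
    value = document["field"]
    if value not in allowed:
        raise ValueError(f"invalid value {value!r} for 'field'")
    return document


document_from_a = {"field": "4"}

# Party A's view: the document is valid under the new schema.
strict_validate(document_from_a, ALLOWED_V2)

# Party B's view: the same document is rejected under the old schema.
try:
    strict_validate(document_from_a, ALLOWED_V1)
    rejected_by_b = False
except ValueError:
    rejected_by_b = True  # this is where the human administrator gets paged
```

Nothing about the document changed between the two calls; only the date of the validator did.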

Just because it only makes sense for a field in a document to contain the values “1”, “2”, or “3” today does not necessarily mean that “4”, “0”, or “9834” won’t be valid tomorrow. Similarly, just because a document doesn’t contain a field named “Blarg” now doesn’t mean it won’t later. A good rule of thumb in document design is to avoid making assumptions about what won’t be there in the future, and a good rule of thumb for software is to defer checking extension fields or values until you can’t any longer.
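One way that rule of thumb for software might look in practice is sketched below, again in Python. Everything here is an assumption made for illustration: the field names `id` and `status`, the helper names `read_document` and `act_on_status`, and the fallback behaviour are hypothetical. The idea is simply that unknown fields are carried along rather than rejected, and a value is only checked at the point where the software actually has to act on it.

```python
KNOWN_FIELDS = {"id", "status"}   # hypothetical fields this software understands
KNOWN_STATUS = {"1", "2", "3"}    # values it knows how to act on today


def read_document(document):
    """Split a document into fields we understand and extensions we do not.

    Nothing is rejected here: unknown fields are kept, not treated as errors.
    """
    understood = {k: v for k, v in document.items() if k in KNOWN_FIELDS}
    extensions = {k: v for k, v in document.items() if k not in KNOWN_FIELDS}
    return understood, extensions


def act_on_status(understood):
    """Check the status value only at the point where we must act on it."""
    status = understood.get("status")
    if status in KNOWN_STATUS:
        return f"handled status {status}"
    # An unfamiliar value is not an invalid document; fall back instead of failing.
    return "status not understood; passed along unchanged"


message_from_the_future = {"id": "42", "status": "4", "Blarg": "something new"}
understood, extensions = read_document(message_from_the_future)
result = act_on_status(understood)
```

What the fallback should be is a design decision for each application; the sketch only shows that the decision can be made late, by the code that needs the value, rather than up front by a validator.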

On the Web, you need to be able to process messages from the future.

P.S. if you’re wondering what time-independent validity looks like, we’ll cover that at a later date (check the tags of this post for a hint).

• • •

November 1, 2005

Principled Sloppiness

Filed under: integration,semanticweb,webarch,webservices — Mark Baker @ 11:52 am

Adam Bosworth has published a great article in ACM Queue titled Learning from THE WEB. As you might expect, it’s very much in the vein of previous missives from Adam, where he highlights the advantages of Web based development, much as we espouse here.

As surely comes as no surprise to my regular readers, I largely agree with the points Adam makes in this article. But rather than focus on those points, I’d like to instead make an observation about his writing style, and the descriptors he uses for the Web, since I feel this makes for an interesting contrast with my own style.

Adam uses words like “sloppy” to describe what I would describe as the principled application of must-ignore style extensibility (as seen throughout Web architecture). It’s not that either descriptor is “better” or “worse” necessarily, just different sides of the same coin … which isn’t recognized nearly enough IMO, by those who promote “better” alternatives to Web based development, e.g. Web services. But then that’s the same point Dick Gabriel was getting at with his schizophrenic and self-contradictory ramblings on Worse is better.

Update: the one part of the article I took issue with – the critique of the Semantic Web – is well countered by Danny Ayers.

• • •