One observation; it looks like David’s using the term “versioning” to mean what I call “explicit versioning”, where the consumer of the document needs to know what “version” the document is in order to understand it. I think that’s an important difference between the general versioning problem which can have “no explicit versions” as a solution. I haven’t looked at the latest draft TAG finding to understand that view, but I suspect they use the term similarly to how I use it.
March 25, 2007
March 8, 2007
It’s been a week since the W3C Workshop on Web of Services for Enterprise Computing
wrapped up, and I’ve had some time to reflect on it.
On the plus side, there’s been some very welcome progress in Web services vendors moving towards the Web. Having Web services stalwarts Chris Ferris and Glen Daniels acknowledge that the use of URIs and HTTP GET was a better way to request data, gave me the warm-and-fuzzies. But they didn’t go beyond that, to consider the value of the mutating-side of the uniform interface. Dave Orchard also reiterated his position that Web services need service-specific writes.
On the down side though, I don’t think much was accomplished at the workshop itself. The point I was trying to make in my presentation, about SOA/WS not separating interface and implementation, must have gone totally over peoples’ heads because nobody commented on it, either to agree or disagree. And the recommendations seemed largely toothless with respect to doing something about this mess within the W3C… though I suppose the WS-Core proposal to consolidate maintenance of the W3C WS-* specs would save some resources.
As always though, it was great to get together with this group of folks; despite our technical disagreements, and the occasional hurt feeling over the past few years, there’s still a lot of mutual respect there. It was also nice to meet some people face-to-face for the first time, such as Pete and Dims.
Speaking of Pete, he also had, IMO, the best line of the workshop. On the topic of description languages, he commented;
Stop trying to stop the world and describe it
February 28, 2007
It’s a short presentation because I anticipated (and received!) lots of questions. The objective was to drill down to the primary technical reason why I believe the Web is better; it’s the same reason I’ve been writing about a lot recently.
January 2, 2007
Starting with the Web
A position paper for the W3C Workshop on Web of Services for Enterprise Computing, by Mark Baker of Coactus Consulting.
For those that might not be aware of my position regarding Web services, I will state it here for clarity.
I believe that the Web, since its inception, has always been about services, and therefore that “Web services” are redundant. Worse than that though, while the Web’s service model is highly constrained in order to provide important degrees of loose coupling, Web services are effectively unconstrained, and therefore much more tightly coupled. As a result, Web services are unsuitable for use at scale; certainly between enterprises, but even between departments in larger enterprises.
With that in mind …
I begin from the assumption then, that Web services are not part of the solution, and that the Web is. And while I believe that the Web is far more capable as a distributed system than commonly believed, there is also little doubt that it is lacking in some important ways from an enterprise perspective … just not as much as some believe.
I suggest that the problem statement is this;
How can the Web be used to meet the needs of the enterprise?
Part of the Solution
When looking at how to use the Web – how to add new “ilities” (architectural properties) or modify amounts of existing ones – we need an architectural guide to help us understand the tradeoffs being made by either relaxing existing constraints, or adding new ones. I suggest that the REST architectural style is a good starting point, because in addition to being rigorously defined and providing well documented kinds and degrees of properties, it is also the style used to craft the foundational specifications of the Web; HTTP and URIs.
I anticipate that most of the improvements that are developed will be REST extensions; solutions which provide additional constraints atop REST, and therefore don’t require disregarding any of REST’s existing constraints. In my experience, this is a desirable situation in general because of the emphasis of REST on providing simplicity, and the role many of its constraints have in providing it. In addition, on the Web, it means that network effects are maximized and centralization dependencies (and other independent evolvability issues) avoided. It’s also often achievable in practice, as long as the problem continues to involve document exchange, in my experience… but not always.
For example, when performance is a make-or-break proposition, and other avenues for improving it have been exhausted, sometimes the stateless constraint will need relaxing. This comes with reasonably large costs – reliability, scalability, and simplicity are reduced for example – but if those costs are fully appreciated and the benefits worth it, then it is the right choice. Because, of course, the solution isn’t REST as a panacea, it’s REST as a starting point, a guide.
The Enterprise Web?
So what features do enterprises need exactly? Some of the common ones mentioned include reliability, security, and transactions. Let’s dig a little deeper into those …
Reliability is often mentioned by Web services proponents as a property in which the Web is lacking. But what is meant by “reliability” in that context? Commonly, it’s used to refer to a quality of the messaging infrastructure; that HTTP messages are not, themselves, reliable. That’s true of course, but it’s equally true of any message sent over a network which is not under the control of any single authority, like the Internet; sometimes, messages are lost.
The issue, therefore, is how to build reliable applications atop an unreliable network.
In general, there is little that can be done for a Web services based solution. As arbitrary Web services clients and servers share knowledge of little more than a layer 6 message framing technology and processing model (SOAP), the only option for addressing the problem is at this level, using that knowledge. So we can take actions such as assigning messages unique identifiers in an attempt to detect loss or duplication, but not much else. Solutions at this level tend to do little more than tradeoff latency (i.e. wait time for automated retries) for slightly improved message reception rates. But even then, applications still have to deal with the inevitable case of message loss. Some might even argue that “reliable messaging” is an oxymoron, and that RM solutions are commonly attempts to mask the unmaskable.
Web servers and clients though, share a lot more than just a message framing and processing standard; they share a standardized application interface (HTTP GET, PUT, POST, etc.. as well as associated headers and features), and a naming standard (URIs), both richer (layer 7) artifacts that a generic Web services solution can’t take advantage of. As a result, the options for providing reliability are greater and, in my experience, simpler.
Consider that a reliable HTTP GET message would make little sense, as HTTP GET is defined to be safe; if a GET message is lost then another can be dispatched without fear of the implications of two or more of those messages arriving successfully at their destination. The same also holds for PUT and DELETE messages; though they’re not safe, they are still idempotent. Known safe and/or idempotent operations are a key tool in enabling the reliable coordination of activities between untrusted agents. Most important though, is that there exist known operations, as this makes a priori coordination possible in the first place.
HTTP POST messages, however, due to POST being neither safe nor idempotent, have some of the same reliability characteristics as do Web services. But despite this, the use of POST as an application layer semantic, together with standardized application identifiers, permits innovative solutions such as POE (POST Once Exactly), which uses single-use identifiers to ensure “only once” behaviour.
The “404” response code is also sometimes mentioned as a reliability problem with HTTP, but it is not at all; quite the opposite, in fact. The 404 response code, once received, reliably tells the client that the publisher of the URI has, either temporarily or permanently, detached the service from that URI. A distributed system without an equivalent of a 404 error is one which cannot support independent evolvability, and therefore is inherrently brittle and unreliable.
Security on the Web today is likely the best example of how the Web itself has been limited by the canonical “Web browsing” application. The security requirements for browsing were primarily to provide privacy, and the simplest solution providing privacy was to use SSL/TLS as a secure transport layer beneath HTTP (aka “HTTPS”).
For the enterprise, and for enterprise-to-enterprise integration in particular, privacy is necessary, but insufficient in the general case as requirements such as non-repudiation, intermediary based routing, and more are added (others can better speak to these requirements than myself). Of course, TLS doesn’t really begin to address these needs, and work is clearly needed.
Luckily, I’m not the first person to point this out. There have been at least two initiatives to add message-level security (MLS) to HTTP; SEA from 1996, HTTPsec from 2005. One might also consider S/MIME and OpenPGP as simple forms of MLS insofar as they’re concerned with privacy and sender authentication.
Interestingly, message-level security is actually more RESTful than transport level security in two respects; that visibility is improved through selective protection of message contents (versus encrypting the whole thing with transport layer security), and that layering and self-descriptiveness (possibly including statelessness) are improved by using application layer authentication instead of (potentially – when it’s used) transport layer authentication.
This is obviously an immensely broad topic that we couldn’t begin to cover in a position paper in detail sufficient to actually make a point. Suffice it to say though, that as with reliability, my position is that the space of possible solutions is transformed considerably by the use of a priori known application semantics (and otherwise within the constraints of the REST architectural style), and therefore that solutions will look quite unlike those used for Web services.
Perhaps there will be an opportunity to work through a scenario at the workshop, if there’s enough interest.
The Web offers a more highly constrained, more loosely coupled service model than do “Web services”. While enterprises can often afford the expense of maintaining and evolving tightly coupled systems, inter-enterprise communication and integration is inherrently a much larger scale task and so requires the kind of loose coupling provided by the Web (not to suggest the Web has a monopoly on it, of course). Even within the enterprise though, loose coupling using existing pervasively deployed standards (and supporting tooling, both open source and commercial) could offer considerable benefits and savings. REST should be our guide toward that end.
December 13, 2006
We don’t get a lot of enjoyment about trampling sacred cows here at Coactus, honest we don’t. But we see so much bad practice out there – more so recently – that we feel compelled to speak out.
Surprisingly though, we’re not picking on any one particular validation technology. XML Schema has been getting its fair share of bad press, and rightly so, but for different reasons than we’re going to talk about here. We believe that virtually all forms of validation, as commonly practiced, are harmful; an anathema to use at Web scale. Specifically, our argument is this;
Tests of validity which are a function of time make the independent evolution of software problematic.
Why? Consider the scenario of two parties on the Web which want to exchange a certain kind of document. Party A has an expensive support contract with BigDocCo that ensures that they’re always running the latest-and-greatest document processing software. But party B doesn’t, and so typically lags a few months behind. During one of those lags, a new version of the schema is released which relaxes an earlier stanza in the schema which constrained a certain field to the values “1”, “2”, or “3”; “4” is now a valid value. So, party A, with its new software, happily fires off a document to B as it often does, but this document includes the value “4” in that field. What happens? Of course B rejects it; it’s an invalid document, and an alert is raised with the human adminstrator, dramatically increasing the cost of document exchange. All because evolvability wasn’t baked in, because a schema was used in its default mode of operation; to restrict rather than permit.
Just because it only makes sense for a field in a document to contain the values “1”, “2”, or “3” today, does not necessarily mean “4”, “0”, or “9834” won’t be valid tomorrow. Similarly, just because a document doesn’t now contain a field named “Blarg”, it doesn’t mean it won’t later. A good rule of thumb in document design is to avoid making assumptions about what won’t be there in the future, and a rule of thumb for software is to defer checking extension fields or values until you can’t any longer.
On the Web, you need to be able to process messages from the future.
P.S. if you’re wondering what time-independent validity looks like, we’ll cover that at a later date (check the tags of this post for a hint).
December 4, 2006
I’ve elaborated on an earlier blog post about reuse over at InfoQ with an article titled The Lost Art of Separating Concerns. Check it out and let me know if this helps you understand REST’s value-add over SOA/WS any better.
October 17, 2006
I’ve been doing a lot of writing recently (most of it yet to be published) on a specific topic that I first wrote about here a while back. The topic is important because it’s the answer to a question I was asked (offline) a few weeks ago;
In your opinion, what is the one thing that most clearly separates REST from SOA?
The answer I gave was… generality.
If you ignore identifiers, hypermedia, and all those other Webby things, and just look at REST and SOA as two architectural styles, the main difference is that REST internalizes the best practice that generalized interfaces are better (in the general case of document oriented distributed systems). That’s really 90% (to pick a large number out of the air) of the difference. If you’ve come from an RPC or OO-RPC background (CORBA, DCE, DCOM, RMI, Web services, etc..), and understand that, then you understand the Web. And if you don’t, you don’t.
November 10, 2005
A common technique in software development, but particularly in distributed software development, is the “separation of interface from implementation”, whereby software components are developed to have dependencies only upon the interfaces (the connectors, in architectural terms) of other components, and not their implementations. The purpose of this technique is to isolate the impact of a change of implementation to just the changed component, thereby reducing coupling between components. The technique is so pervasively referenced and evangelized, that it’s practically axiomatic.
Unfortunately, it’s not commonly practiced.
As mentioned, the purpose of the technique is to isolate the impact of implementation changes. Yet this WSDL contract for a new home buyers database places extreme restrictions on how the implementation can vary because it is so specific to, well, a new home buyers database. Need an interface to an historical database of home purchases, or one to a database of recent automobile sales? Too bad, you need a new interface, and therefore new code to use that service. So much for that separation of concerns!
The lesson here, I suggest, is the following;
True separation of interface from implementation requires designing broadly reusable interfaces.
November 1, 2005
Adam Bosworth has published a great article in ACM Queue titled Learning from THE WEB. As you might expect, it’s very much in the vein of previous missives from Adam, where he highlights the advantages of Web based development, much as we espouse here.
As surely comes as no surprise to my regular readers, I largely agree with the points Adam makes in this article. But rather than focus on those points, I’d like to instead make an observation about his writing style, and the descriptors he uses for the Web, since I feel this makes for an interesting contrast with my own style.
Adam uses words like “sloppy” to describe what I would describe as the principled application of must-ignore style extensibility (as seen throughout Web architecture). It’s not that either descriptor is “better” or “worse” necessarily, just different sides of the same coin … which isn’t recognized nearly enough IMO, by those who promote “better” alternatives to Web based development, e.g. Web services. But then that’s the same point Dick Gabriel was getting at with his schizophrenic and self-contradictory ramblings on Worse is better.
Update: the one part of the article I took issue with – the critique of the Semantic Web – is well countered by Danny Ayers.
October 27, 2005
(Sorry, my personal blog (and the machine that runs it) died while I’m on the road. I should have it up again Sunday or Monday).
Some sage advice from Steve Ballmer on services;
Ballmer also suggested that the corporate network and the Web shouldn’t be thought of as two entities that get searched separately and that the two should be tied together to provide one seamless search experience.
Web Architecture dictates that resources should be identified with
URIs. Thus, use of the abstract properties of an EPR other than
wsa:address to identify resources is contrary to Web Architecture. In
certain circumstances, use of such additional properties may be convenient
or beneficial, perhaps due to the availability of QName-based tools. When
building systems that violate this principle, care must be taken to weigh
the tradeoffs inherent in deploying resources that are not on the Web.
Here’s hoping that this might serve to motivate Web services proponents to realize the Web meets their requirements – for a loosely coupled, document oriented, platform independent distributed computing framework – better than Web services themselves.