Towards truly document oriented Web services
In the beginning
For the first year or two in the life of Web services, which was all of their history up to that point, they were about remote procedure calls (RPC): exposing remote APIs across the Internet in order to facilitate machine-to-machine communication and, ultimately, business-to-business integration over the Internet.
It didn’t take very long, however, for Web services proponents to realize that they needed to distance themselves from RPC and its well-deserved reputation as a poor architectural style for large-scale integration, a reputation earned by the failure of systems such as CORBA, DCOM, and RMI to see any widespread use on the Internet. So, sometime in 2000/2001, collective wisdom in the space shifted towards a preference for “document oriented” services. Vendors quickly jumped on board with upgraded toolkits, and that was that; documents were the New Big Thing.
Unfortunately, the basic architectural assumptions underlying Web services at the time didn’t change nearly enough to distance Web services from the problems of RPC.
What is “Document oriented”?
Respected Web services guru Anne Thomas Manes succinctly (and unknowingly, it appears) describes the differences between RPC and document orientation:
Document style:
<env:Body>
  <m:purchaseOrder xmlns:m="someURI">
    ...
  </m:purchaseOrder>
</env:Body>
RPC style:
<env:Body>
  <m:placeOrder xmlns:m="someURI">
    <m:purchaseOrder>
      ...
    </m:purchaseOrder>
  </m:placeOrder>
</env:Body>
The bigger difference is how you encode the message. […]
While the encodings used were certainly different, each with its own not-insignificant pros and cons, what Anne failed to point out is that the RPC example included an operation name (“placeOrder”) while the document oriented example did not. This constitutes an extremely significant architectural difference, as it tells us that Anne’s document example uses a state transfer style, while the RPC example does not.
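To make that difference concrete, here is a minimal Python sketch that parses both of Anne's bodies and reports the first element inside env:Body. It assumes a placeholder envelope namespace (the quoted fragments don't declare one) and replaces the elided "..." content with empty elements; the helper names are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the quoted fragments do not declare the env prefix, so a
# placeholder namespace is bound here purely to make them parseable.
ENV = "urn:example:envelope"

# The elided "..." order content is left out; only the structure matters here.
DOCUMENT_STYLE = f"""
<env:Body xmlns:env="{ENV}">
  <m:purchaseOrder xmlns:m="someURI"/>
</env:Body>
"""

RPC_STYLE = f"""
<env:Body xmlns:env="{ENV}">
  <m:placeOrder xmlns:m="someURI">
    <m:purchaseOrder/>
  </m:placeOrder>
</env:Body>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def first_child_name(body_xml):
    """Return the local name of the first element inside the body."""
    body = ET.fromstring(body_xml.strip())
    return local_name(body[0].tag)


if __name__ == "__main__":
    # Document style: the body leads with the data itself.
    print(first_child_name(DOCUMENT_STYLE))
    # RPC style: the body leads with an operation name wrapping the data.
    print(first_child_name(RPC_STYLE))
```

In the RPC body the receiver has to dispatch on "placeOrder" before it ever sees the order; in the document body there is nothing to dispatch on, only data.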
State Transfer
State transfer styles, which include MOM, EDI, pipe and filter, and others, are characterized primarily by one architectural constraint: all the components expose the same application interface. Actually, in most cases, including those three, the application interface is constrained to providing a single operation that one might call “processData” (it’s actually called “putData” in that pipe-and-filter description). Each server component exposes this operation, enabling any client to submit data to it for processing. In addition, because there’s only one operation, its use is implicit and therefore needn’t be included in the message.
Allow me to reiterate my main point; Anne’s document oriented example above includes an implicit (“processData”) operation.
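As a rough illustration of the single-operation constraint, the Python sketch below models a few components that all expose the same "process_data" operation. The class and function names are hypothetical and the string transformations are arbitrary; the only thing it is meant to show is that a message consists of data alone, because the operation is implied.

```python
class Component:
    """Hypothetical component exposing the one shared operation."""

    def process_data(self, data):
        raise NotImplementedError


class Uppercase(Component):
    def process_data(self, data):
        return data.upper()


class Exclaim(Component):
    def process_data(self, data):
        return data + "!"


def send(component, data):
    # The "message" is just the data. No operation name travels with it,
    # because every component understands exactly one operation.
    return component.process_data(data)


def pipeline(components, data):
    """Pass the data through each component in turn, pipe-and-filter style."""
    for component in components:
        data = send(component, data)
    return data


if __name__ == "__main__":
    print(pipeline([Uppercase(), Exclaim()], "order"))
```

Because every component has the same interface, any client can submit data to any component, and components can be rearranged without either side learning a new API.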
REST
REST – REpresentational State Transfer – is, as the name suggests, also a state transfer style. One of the interesting ways that REST differs from the others, is that rather than constrain the interface to the single “processData” operation, it allows any operation which is meaningful to all components (referred to as the “uniform interface”). An interesting side-effect of allowing more than one operation, is that it requires messages be explicit about the operation in use, since there obviously needs to be a way to disambiguate messages with the same document, but different operations.
HTTP is the application protocol most closely associated with REST, largely because it was developed to respect many of REST’s constraints. As it relates to the uniform interface and explicit operations, HTTP provides a “POST” operation which is an alias for the aforementioned “processData” operation. So, back to Anne’s example again, this HTTP message is semantically identical to her document oriented example:
POST some-uri HTTP/1.1
Host: some-host.example.org
Content-Type: application/x-purchase-order+xml

<env:Body>
  <m:purchaseOrder xmlns:m="someURI">
    ...
  </m:purchaseOrder>
</env:Body>
Moreover, note that if the HTTP operation were different – say, if it were “PUT” instead of “POST” – then the message would no longer have semantics identical to Anne’s original document oriented example. Yes, this means that the semantics of the message are a function of the application protocol being used, unlike conventional wisdom with Web services which suggests that message semantics should be “protocol independent”.
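A small Python sketch may help show why the method matters. It models a toy in-memory service, not a real HTTP server; the class name, the "/orders" URI, and the returned strings are all assumptions made for illustration. The same document is sent twice, once with POST and once with PUT, and the service does something different each time.

```python
class OrderService:
    """Toy in-memory stand-in for an HTTP server (hypothetical)."""

    def __init__(self):
        self.stored = {}      # URI -> document, written by PUT
        self.processed = []   # (URI, document) pairs submitted via POST

    def handle(self, method, uri, document):
        if method == "POST":
            # "Process this data": hand the document over for processing.
            self.processed.append((uri, document))
            return "processed"
        if method == "PUT":
            # "Store this data at that URI": the document becomes the
            # state held at the URI, replacing whatever was there.
            self.stored[uri] = document
            return "stored"
        raise ValueError(f"unsupported method: {method}")


if __name__ == "__main__":
    service = OrderService()
    document = "<m:purchaseOrder xmlns:m='someURI'/>"
    # Identical document, different operation, different meaning.
    print(service.handle("POST", "/orders", document))
    print(service.handle("PUT", "/orders", document))
```

The document is byte-for-byte the same in both calls; only the operation supplied by the protocol changes what the message means.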
Conclusion
Hopefully this little note helps put in context the architectural relationship between the Web and document oriented Web services. The relationship is closer than it appears in some important ways, yet more distant in others, likely as a result of the fact that Web services began with RPC, rather than with a truly document oriented architectural style. Perhaps spelling this out explicitly, as I hope I’ve done here, will help more Web services proponents realize the importance of the Web to their objectives of integrating systems across the Internet.
Mark,
Can you please expand on “Moreover, note that if the HTTP operation were different – say, if it were “PUT” instead of “POST” – then the message would no longer have semantics identical to Anne’s original document oriented example.”?
thanks,
dims
Comment by Davanum Srinivas — July 18, 2005 @ 2:38 pm
Hi Dims. Well, using PUT, the message would mean “store this data at that URI” instead of “process this data” if POST were used. Anne’s example means the latter, not the former AFAICT by extrapolating from her “placeOrder” semantic in the RPC example. Or put another way, “placeOrder” generalizes down to POST not PUT, just as “getStockQuote” generalizes down to GET, not POST.
Comment by Mark Baker — July 18, 2005 @ 2:46 pm
Not sure what the point is … I feel doc and RPC styles have their pros and cons. Docs free us from the complex RPC encodings, but RPC semantics allow us to carry over a message over multiple hops, and still make sense to the ultimate receiver as to what operation needs to be invoked. Ultimately, I think web services as a whole will have to take the advantages of both and leave the disadvantages and unify into one model.
Comment by Seetharama Durbha — July 20, 2005 @ 7:14 pm
“but RPC semantics allow us to carry over a message over multiple hops, and still make sense to the ultimate receiver as to what operation needs to be invoked”
Document oriented solutions can do that too. No RPC required.
Comment by Mark Baker — July 20, 2005 @ 9:32 pm
[…] The blogosphere, at least, is injecting some sanity into the discussion. Mark Baker wrote a nice piece, Towards truly document oriented Web services, which does a good job of explaining why the web’s basic architecture has been and continues to be a perfect medium for “Web services” since long before the term was even coined. […]
Pingback by Web Architecture Roundup [@lesscode.org] — July 23, 2005 @ 1:46 am
[…] Mark Baker provides one of the most succinct and clearest explanations of REST that I’ve seen till date. In particular, he clearly shows how REST is dependent on HTTP for its messaging semantics: […]
Pingback by Prashanth Rao’s Weblog » Blog Archive » Document Oriented Web Services Explained — July 26, 2005 @ 11:21 am
[…] In a RESTful use of HTTP, HTTP provides the operations; there is nothing more to be encoded, domain-specifically or not. Only the data payload – the document representing the state of the job, or whatever it might be – is required, as I previously explained. It’s an odd oversight, because Table 6, which details the RESTful messages, makes no mention of any operation other than the HTTP ones. […]
Pingback by Integrate This»Blog Archive » Ian Foster on state and REST — September 7, 2005 @ 2:35 pm
Are you saying that with the document oriented approach that since the operation is missing, it is implicit in the protocol used? I don’t see that in the above example, PUT and POST are really two different types of RPCs, that don’t involve the protocol on the wire. I agree that one needs operations, not simply documents, but somehow I don’t see your point. A document could have operation instructions included in the document, too. I thought that is really what REST does. Perhaps I’m missing something here, but this seems to be a discussion about a whole lot of nothing.
Comment by David Forslund — September 16, 2005 @ 11:38 pm
Hi David.
“Are you saying that with the document oriented approach that since the operation is missing, it is implicit in the protocol used?”
If you mean Anne’s example, no. I’m saying that the operation is implicit, period, independent of the underlying protocol. I’m not saying that it’s a good thing, just making that observation.
“A document could have operation instructions included in the document, too. I thought that is really what REST does.”
No, a RESTful approach (on the Web) takes the operations solely from HTTP. Everything else is just data.
“Perhaps I’m missing something here, but this seems to be a discussion about a whole lot of nothing.”
I hear ya. This is an extraordinarily subtle point, but extraordinarily architecturally significant too.
Comment by Mark Baker — September 17, 2005 @ 1:17 pm
Response posted here
Comment by Anne Thomas Manes — September 17, 2005 @ 7:17 pm
REST “takes the operations solely from HTTP”. I don’t understand this statement. The operations can’t be in the http, they must be in the data transmitted. There is no operation in the http, or at least there is no way to know what the “operation” is without defining it outside of the fact that it is an http call. If the operation is contained in the data then it is important to distinguish the two. If it is not in the data, how is one to know what the operation is?
Dave
Comment by David Forslund — September 17, 2005 @ 11:21 pm
David – rather than having this discussion in two places, let’s just take it to the mailing list.
Comment by Mark Baker — September 18, 2005 @ 4:49 pm
[…] Here’s hoping that this might serve to motivate Web services proponents to realize that the Web meets their requirements – for a loosely coupled, document oriented, platform independent distributed computing framework – better than Web services themselves. […]
Pingback by Integrate This»Blog Archive » Ballmer on services — October 27, 2005 @ 11:30 am