Saturday, July 16, 2005

WITSML not "Document Oriented"

Some more refining of a subtle problem with WITSML. The 1.30 API tightens up the specifications of how servers handle update requests on growing objects like logs. Good! The server's responsibilities are a little clearer now.

The problem is that the server has responsibilities. The server has to understand what a log object is. If you append some <data> elements and only some curves are present, the server must fill in empty curves with the null value. Probably the server must also update the start and end indexes as another side effect (still never explicitly stated in the API).

Even if we someday exhaustively specify server responsibilities, these complex responsibilities are bound to be implemented differently by each company, and incorrectly in many cases. It's an interoperability nightmare.

Contrast that with a document-oriented approach: a client would retrieve the data it needs, modify it as necessary, and update the log object. No side effects: the client submits fully correct <data> elements, including nulls for the omitted curves; it may update the start and end indexes. In this scenario, the client's expectations are never disappointed! Any mistakes are the client's responsibility. A really good server would enforce integrity and prevent clients from making inconsistent changes. Yet a poorly implemented server that did not enforce that integrity would still work well with a well-behaved client. (And anyway, integrity requirements on one server may be quite different from those on another server; under a document-oriented approach, it is fine for integrity constraints to be server-specific.)
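To make the client's side of that bargain concrete, here is a minimal sketch of what "submitting fully correct <data> elements" could look like. It assumes each row is a comma-separated string with the index in the first column, and that -999.25 is the log's null value; the function names and curve mnemonics are made up for illustration and are not part of the WITSML API.

```python
NULL_VALUE = "-999.25"  # assumed null marker; a real log declares its own


def complete_rows(all_curves, present_curves, rows, null_value=NULL_VALUE):
    """Expand partial rows to cover every curve in the log.

    all_curves     -- every curve mnemonic in the log, in column order
    present_curves -- the curves the client actually has values for
    rows           -- comma-separated value strings, one per index value
    """
    position = {name: i for i, name in enumerate(present_curves)}
    completed = []
    for row in rows:
        values = row.split(",")
        if len(values) != len(present_curves):
            raise ValueError(f"row has {len(values)} values, "
                             f"expected {len(present_curves)}: {row!r}")
        # Keep the value where the client has one, otherwise write the null.
        completed.append(",".join(
            values[position[curve]] if curve in position else null_value
            for curve in all_curves
        ))
    return completed


def index_range(rows):
    """Return (start, end) taken from the first column of each row."""
    indexes = [float(row.split(",")[0]) for row in rows]
    return min(indexes), max(indexes)


if __name__ == "__main__":
    log_curves = ["DEPTH", "ROP", "WOB", "RPM"]   # hypothetical mnemonics
    new_rows = complete_rows(log_curves, ["DEPTH", "WOB"],
                             ["1000.0,12.5", "1000.5,12.7"])
    print(new_rows)
    print(index_range(new_rows))
```

The client does the null filling and works out the index range before it sends anything, so the server can store exactly what it receives and no two servers have to agree on how to fill the gaps.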

In case you're wondering about efficiency: the document-oriented approach doesn't mean you have to send complete log objects back and forth just to update. You can still use an update protocol as we do now. It just shifts responsibility for performing complex transformations on the data to the guy who wants to do them -- the client.

Document orientation is a shift in outlook that would make servers less complex to build, and more widely available.

Saturday, July 02, 2005

Versioning, Messages, and Models

There's a dangerous idea creeping into WITSML services. It smells a little like the problems I outlined in Messages not Models: Is WITSML morphing into a data model?

A virtue of WITSML has been its document-oriented focus. But now the discussion at the last SIG concerned delivering WITSML documents to clients understanding only legacy versions of the schema -- in particular we expect a 1.20 client connecting to a 1.30 server to receive version 1.20 documents.

The creeping dangerous idea here is that there is a server that has abstracted the data and stored it in some version-independent model. And now there necessarily are some canonical mappings that need to be promulgated. And we are a hair's breadth away from defining a logical data model for well data.

A purely document oriented approach would regard the WITSML documents we send around as, well, documents. If you go to a web site and retrieve a PDF file, you don't ordinarily expect to be able to ask the server to convert it to MS Word format before delivering it. The document is just the bits as someone put them up there, and the server has no idea what they are.

We should think of WITSML documents the same way. If I insert a WITSML version 1.20 well document, it's reasonable to say that anyone retrieving that document should get the version 1.20 document -- not the contents of that document mapped -- possibly with losses or errors -- to version 1.30. For the same reason, it was a mistake to require WITSML servers to perform units of measure transformations for a client.

That said, it is still possible, but not required, for web servers to perform media type transformations, using content negotiation. If the server has several different formats in which it can deliver the content, it can supply these choices to the client. That would be an appropriate way for mutually consenting clients and servers to agree on the format of a returned document. Yet another idea from web architecture we should incorporate into WITSML (cf. WITSML+REST).
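As a rough sketch of how that negotiation might work, assume the server holds one or more stored representations of a document, keyed by media type, and the client sends an HTTP-style Accept header with optional q-values. The media type names below are invented for illustration; nothing like them is defined by WITSML. The sketch also ignores wildcards such as */*.

```python
def parse_accept(header):
    """Parse an Accept-style header into (media_type, q) pairs."""
    preferences = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        pieces = [p.strip() for p in part.split(";")]
        q = 1.0  # a missing q parameter means full preference
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        preferences.append((pieces[0], q))
    return preferences


def negotiate(accept_header, available):
    """Pick the stored representation the client prefers most.

    available -- dict mapping media type to the stored document bytes
    Returns the chosen media type, or None if nothing acceptable is stored.
    """
    best, best_q = None, 0.0
    for media_type, q in parse_accept(accept_header):
        if media_type in available and q > best_q:
            best, best_q = media_type, q
    return best


if __name__ == "__main__":
    # Hypothetical media types; the server stores only a 1.20 document.
    V120 = "application/vnd.example.witsml-1.20+xml"
    V130 = "application/vnd.example.witsml-1.30+xml"
    stored = {V120: b"<wells version='1.2.0'/>"}
    print(negotiate(f"{V130}, {V120};q=0.5", stored))
```

The server only ever chooses among documents it actually has. When nothing matches, it returns None (an HTTP server would answer 406 Not Acceptable) rather than inventing a mapped version of the document.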

Point: We should be thinking about documents, not models, as the resources we are manipulating on the server.