Sunday, November 26, 2006

Rating WS-*

I blogged back in January on the lack of a WS-Contract, so I thought it was time to review the landscape and see what WS-* standards there are these days, and how much wasted effort there has been. I'm excluding all the vertical standards from this (like Translation Web Services et al) as they address specific business problems and so are at least an attempt to work from a business perspective. The goal here is to look at the technical standards.

To this end I'm going to split the WS-* standards into five groups
  1. Good Idea - The standards that make sense and I can see being used
  2. Good Or Dumb - The standards that have potential to go either way
  3. DOA - The standards that have either died or shouldn't have been looked at
  4. Dangerous Idea - The standards that are downright dangerous and will cause more harm than good
  5. MIA - The standards that just don't exist, but which should
Good Idea
  • WSDL - It's the interface, it's a good idea, and WSDL 2.0 has some decent extensions that will cause issues (callbacks) but are there for a decent reason
  • WS-Addressing - It was needed and it will help more complex deployments
  • WS-Policy - Great idea, shame it's not being universally adopted elsewhere
  • WS-BPEL 2.0 - Just need that human workflow in there chaps
  • WSDM - Good idea, and it seems like the implementation might be on the way too.
  • WS-RM - Reliability is a good idea
  • WS-Trust, WS-Security etc - Damn good idea to do this in a standardised way
  • WS-Federation - Federated things are good, centralised things are bad
Good Or Dumb
  • Semantic Web Services - Could very easily turn into intellectual masturbation and very little real world practicality. Could also be fronted by decent tools and provide a great way to add more formalism to a contract.
  • WS-Context - Looks like a good spec on paper, but the devil in this one will be in the implementation. I'm a bit worried about the WS-CTX service, which looks like a big central attempt to manage context, and about the lack of reference to process standards such as WS-BPEL. Could be DOA in a year.
  • UDDI - Never realised its grand vision. Sure it's still going inside some decent products, but it clearly isn't the standard it was cracked up to be
  • WSN - Web Service Notification, again it's down to the implementations here. I haven't seen much out in the wild that indicates it's going strong, even though it's a 1.3 standard.
  • WSRF - Resource management, it sounds beguiling but I'd bet on this moving into Dangerous as product implementations start coming out and making resource state into some complex beast. They could however make it trivial and please the "stateless" invocation crews.
DOA
  • WS-Choreography - sounded so good, but just doesn't seem to have the weight behind it
  • FWSI - Hadn't even heard of it till I started this post
  • WSRP - RSS does a better job
  • Web Service Quality Model - Sounded good... but has it gone anywhere?
  • WS-Reliability et al - killed off by the better WS-RM standards
  • WS-Discovery - Jini broadcast discovery for Web Services... oh god.
Dangerous Idea
  • WS-TX - Two Phase Commit via Web Services, I'm sorry this is in my dumb ideas group. People will expose this via the internet and then "wonder" why they are having massive performance problems. If something is so closely bound in that it needs 2PC, then build it tightly bound, not using Web Services.
  • WS-BusinessActivity - Shifts logic around into the ether, not a great idea
MIA
  • WS-Contract - Defining business contracts on services, including pre-conditions, post-conditions and invariants
  • WS-SLA - defining the SLA for both the service, and the invocation.
So there are quite a few good ideas, and a whole heap of not very good ideas. But it is good to see that the basics of security, reliability and identity are being covered. To be honest it's better than I expected to see, and I've deliberately excluded all of the random WS-* proposals that have never made it into a multi-consortium group or a standards body.

Any I've missed or any ratings that are wrong?


Wednesday, November 22, 2006

Want to be cool? Learn REST. Want a career? Learn WS

I've been reading about, and playing with, various different REST and WS approaches recently. I have to admit that when knocking up a quick demo of something over a website where I control both sides, REST is very nice and quick, and WS is still more complex than it should be in the tools.

But as with any technical discussion there is another thing that should drive you when you look at your decisions. This is whether you want to get fired for a decision when something goes wrong, and whether you want your skills to be relevant in 3 years' time.

REST v WS is pointless from a technical perspective IMO, and will become more so as tooling sets improve.

From a business perspective however the choice is much more stark, and really doesn't come down well for the folks in the REST camp.

Out in the big wide world of the great employed of IT there are four dominant software players, these people represent probably the majority of IT spend in themselves and influence probably a good 95% of the total IT strategy out there on planet earth. Those four companies are SAP, Oracle, IBM and Microsoft.

These are the companies whose dinners and product strategy presentations your CIO sits through, and what they are pushing is WS-* in all its ugly glory. This means that in 3 years' time you 100% will have WS-* in your company, in a company you work with, or in a company you want to work with. Sure you can argue that it's harder and more difficult than REST, in the same way as you can argue that Eiffel, PHP or Ruby are more productive languages. Some people will get to use these languages commercially, some people will get to use REST commercially.

Everyone will have to use WS-* commercially if they want to interact with systems from the major software vendors.

I'm not saying it's right, just that it's reality. The best technology isn't the technically purest, the most productive or the easiest, it's the one that the most people use and which has the widest acceptance and adoption. For shifting data across the internet this means it's what SAP, Oracle, IBM & Microsoft say. It's also what the various vertical standards (which the big boys aim to implement "out of the box") have all gone for: WS-*.

The technical discussion is pointless, the commercial discussion is moot. But hey, let's continue having the discussion on REST v WS because it makes us feel cool and trendy. It's about time that IT people realised that we need to have discussions based on commercial realities, not on technical fantasies.


Tuesday, November 21, 2006

Using SOA to understand competitive advantage

One of the bits that I've talked about at various conferences and in the book is using SOA to understand the business value that various services have.

I was chatting on the phone today with someone around this topic and I thought it was worth a post on why treating SOA as a technology thing misses the real power and opportunity of what IT can deliver. There was a research report from the Economist recently that said that IT would need to move away from reducing cost and towards delivering value, and of course that the business and IT have different views on how this will happen and what the barriers would be.

But let's start first with the IT department being expected to deliver value. This means that you have to understand the bits that add value, and of course the bits that don't. If you are viewing SOA as a technology thing then it really isn't going to help, as you can't really start ascribing value to a WSDL or a BPEL process. It's just too low level to consider investment or cost cutting down there.

This means you have to have some way of understanding the business, some way of understanding what you will need to deliver to the business, and therefore some way of understanding the different values and drivers of the different parts of the business. Some people might say "That is what enterprise architecture is for" but I'd have to disagree, as it's really just the first step in enterprise architecture and it's also the first step in actually managing an IT department, and I don't see the concept of "value" early on in the likes of Zachman or TOGAF.

This is why I argue for a simple approach to the business service architecture, because it quickly gets us to the stage where we can understand what the services are and we can then use that information to understand the business value.

By understanding which services are below the line and which are above the line you can quickly understand where you should cut cost, and where the business would appreciate suggestions as to added value. Speaking to someone from work a week or so ago, he mentioned working with someone who did a similar exercise a few years back and found that they had over 50 projects in areas below the line.

As the business starts expecting IT to deliver more value it's going to be essential for IT to understand more about which parts of the business deliver actual value, and which are just IT things that we think add value but which in fact no-one actually cares about.

For instance, how many Finance or HR services would live above the line? How much competitive advantage is there in having a customised Invoicing process? How much advantage is there in having an EAI tool or an ESB? How much value is there in REST v WS? Once you understand the value you can start delivering actual benefit and realising that the technology isn't important, it's the end-game that matters.

For SOA to give competitive advantage it means knowing what advantage looks like.


Thursday, November 16, 2006

XML is not human readable

Over the last couple of weeks I've heard the same phrase said by about five different people:
"The advantage of XML is that it's human readable, this is why Web Services are better than previous technologies."
Now I'm not going to get into a readability contest of WSDL v IDL (hint: the winner isn't WSDL). But I think it's worth examining the whole concept of XML and whether it should be human readable, particularly when it comes to business processes, service descriptions and service contracts.

So should a "good" Web Service description be human readable? Let's examine the purpose of that description

  1. To enable consumers to call the service correctly
  2. errr... that is pretty much it
So given that goal what is the best way to show this to both systems and to people? The answer is of course to have a common technical language that enables accurate exchange, and then have this rendered for different types of people and systems, so Java tooling turns it into Java, C# tooling turns it into C#, and for people it gets rendered into a nice picture that shows the methods and the constraints.

WSDL and BPEL (especially BPEL) are examples of that technical language. There was never a goal for them to be human readable, they are aiming to be machine readable. The Geo Ripping wsdl is a very simple self contained example as to why XML isn't designed for humans to read.
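To make the "one technical language, many renderings" point concrete, here is a minimal Python sketch that takes a small WSDL 1.1 fragment and renders the operations as plain lines a person could scan. The sample WSDL, the `GeoLookup` interface and the helper names `short_name` and `describe` are all invented for illustration and are not the Geo Ripping WSDL itself.

```python
import xml.etree.ElementTree as ET

# Namespace used by WSDL 1.1 documents.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hypothetical, heavily trimmed service description (assumption, not a real service).
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="urn:example:geo" targetNamespace="urn:example:geo" name="GeoService">
  <portType name="GeoLookup">
    <operation name="findByName">
      <input message="tns:findByNameRequest"/>
      <output message="tns:findByNameResponse"/>
    </operation>
    <operation name="findNear">
      <input message="tns:findNearRequest"/>
      <output message="tns:findNearResponse"/>
    </operation>
  </portType>
</definitions>"""


def short_name(message_ref):
    """Return the message name without its namespace prefix."""
    if message_ref is None:
        return "nothing"
    return message_ref.get("message", "").split(":")[-1]


def describe(wsdl_text):
    """Render each portType and its operations as human-oriented lines."""
    root = ET.fromstring(wsdl_text)
    lines = []
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        lines.append(f"Interface {port_type.get('name')}")
        for op in port_type.findall("wsdl:operation", WSDL_NS):
            takes = short_name(op.find("wsdl:input", WSDL_NS))
            gives = short_name(op.find("wsdl:output", WSDL_NS))
            lines.append(f"  {op.get('name')}({takes}) -> {gives}")
    return lines


summary = describe(SAMPLE_WSDL)
print("\n".join(summary))
```

The XML stays the machine-facing format; the rendering is a separate, throwaway view that could just as easily be a diagram or generated client code.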

Sure when it comes to debugging you can print out the XML and a skilled person can spot some of the errors, but then you could do this with RMI, CORBA, DCOM and even C (using a hex editor in the latter case). The idea of "human readable" is that anyone could read the SOAP messages or BPEL process context, and this 100% isn't true.

Or to look at it another way....

XML is not human readable, it's not designed to be human readable and you shouldn't try and make it human readable. Just because something is in Unicode doesn't mean that anyone can read it. French, Chinese, Klingon (WTF?), Japanese, German, English, Urdu and many other languages can be written in Unicode, and XML should be viewed in the same way but as a language with lots of unrequired syntax, no real semantics and pretty random grammar in general.

Think of XML as being English spoken by a sulky French teenager, lots and lots of grunts that mean nothing to anyone and the occasional fragment of something that no-one actually properly understands.

Reading BPEL is like trying to understand a conversation between a sulky French teenager and a sulky American teenager... in Chinese when they've only had two lessons and you don't speak Chinese.

XML is as hum


Tuesday, November 07, 2006

What Geo ripping means to the enterprise

The other reason for Geo ripping Wikipedia was to explore what can be done with the unstructured information that is created inside organisations and how easy it would be to:
  1. Re-purpose the information
  2. Give credence to quality
  3. Turn human focused information into systems focused information
The first piece that is critical is that the Wikipedia information isn't truly unstructured. I was really taking templated information out, which meant it was much easier than truly unstructured information. But this is a pretty standard case when you think about information that is stored in Access databases or Excel sheets, where templated or semi-structured information is the norm. That makes it a reasonable use case for thinking about how current information in things like Excel et al can be turned into information that can be directly used elsewhere in the enterprise.

So that is stage one, which leads directly to stage two, namely the question of data quality and provenance. If I release information that is manually created in a spreadsheet (but on which critical decisions are currently based) and allow that to be directly integrated elsewhere without the human judgement and oversight, how do consumers know the quality or provenance of the data? How do I state on my Web Service "this service shouldn't be used for anything serious like nuclear power or making actual decisions" without it becoming the standard shrink wrapped license that all software vendors tag on, and everyone ignores?

The threat here is that Line of Business (LOB) will use this sort of approach to create a web service like "Current Sales Budget" which contains not only out of date information, but information that has incorrect assumptions. This will then be consumed by others who think it is the "real" current sales budget. This is a big risk in businesses especially if used for modelling and the like as small errors in one place can lead to massive errors at the end. Data provenance is going to be a big issue in this world of "easy" to develop Web Services.

The final element is about going the other way from the previous goal of IT which has tended to turn systems information into human focused information. The goal here is to take all of the information created in these collaborative and participative systems and turn it back into something that the enterprise can use, hence the reason I wanted to take a Wiki and put the information into a database.

So my little experiment proved that it can be done, and that it's liable to be an issue in terms of data and provenance. Not sure on the solution yet, but at least it gives me something to think about.


Geo-ripping Wikipedia

As part of my ongoing quest to stop my drift into senility and powerpoint (the difference is marginal) and make sure that when I recommend things to clients they actually work, I went in search of some Web Services to play with. Now there used to be a useful bunch over at Capescience, now they've just got the Global Weather one which is okay, but I could do with more than one (and I hate stock quote examples). I also wanted to see what could be done to get some interesting information out of Wikipedia, so I hatched a plan.

The idea was to create a very simple Web Service which took the Wikipedia Backup file and then extracted from it the geolocation information that now exists on lots of the pages.

Stage one was doing the georip, which was very simple. I elected to use a StAX parser (the reference implementation in fact) as the file is excessively large. Using StAX was certainly quick (it takes Windows longer to copy the file somewhere than it does StAX to do the parse) but there are a few odd elements in using it (which I'll probably put a separate post on). That gave me a simple database schema (created using Hibernate 3 as an EJB 3 implementation and using inheritance in the objects mapped down onto the tables, very nice implementation by the Hibernate folks).
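As a rough illustration of the streaming approach (this is not the original Java StAX code), here is a minimal Python sketch using the standard library's `iterparse`, which fills a similar role by handing over elements as they complete instead of loading the whole file. The element names (`page`, `title`, `text`), the tiny inline sample and the helper name `iter_pages` are assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, text) for each <page>, one page at a time."""
    title, text = None, ""
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, ""
            elem.clear()  # drop the finished page's content before moving on


# Tiny stand-in for the real backup file (assumed layout).
SAMPLE = b"""<mediawiki>
  <page><title>London</title><revision><text>{{coord|51.5|-0.12}}</text></revision></page>
  <page><title>Paris</title><revision><text>No coordinates here</text></revision></page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE)))
for page_title, page_text in pages:
    print(page_title, "->", page_text)
```

For the real file you would pass an open file handle instead of the `BytesIO` sample and process each page inside the loop instead of collecting them into a list.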

Next up was the bit that I thought should be simple. After all I had my domain model, I had my queries, I had even built some service facades, so how long could it possibly take to get this onto a server? The answer was far too long, particularly if you try and use Axis2 v1.0 (which I just couldn't get to work). Switching to Axis 1.4 picked up the pace considerably, and thanks to some server coaching by Mr Hedges it's now up and running.

There are around 70,000 geolocations that I managed to extract from Wikipedia. Some of these aren't accurate for several reasons:
1) They just aren't accurate in Wikipedia
2) There were multiple co-ordinates in the page, so I just picked the first one
3) There are 8 formats that I identified, there could be more which would be parsed wrongly.
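The "multiple formats, take the first one" problem can be sketched as below. This is a hedged Python example that handles only two template shapes, a decimal form and a degrees/minutes/seconds form with hemisphere letters. Both shapes, the regular expression and the function names are assumptions for illustration and are not the eight formats found in the original rip. Anything the sketch doesn't understand is skipped, not guessed at.

```python
import re

# Matches a simple {{coord|...}} template with no nested templates inside it.
COORD = re.compile(r"\{\{\s*coord\s*\|([^{}]*)\}\}", re.IGNORECASE)


def parse_fields(fields):
    """Turn template fields into (lat, lon), or None if not understood."""
    values, current = [], []
    for field in (f.strip() for f in fields):
        if field in ("N", "S", "E", "W"):
            if not current:
                return None
            # current holds degrees, then optional minutes and seconds
            degrees = sum(part / 60 ** i for i, part in enumerate(current))
            values.append(-degrees if field in ("S", "W") else degrees)
            current = []
        else:
            try:
                current.append(float(field))
            except ValueError:
                break  # named parameters such as 'display=title' end the numbers
    if len(values) == 2:
        return tuple(values)
    if not values and len(current) == 2:
        return tuple(current)  # plain decimal form: lat|lon
    return None


def first_coordinate(text):
    """Return the first coordinate on the page that parses, else None."""
    for match in COORD.finditer(text):
        parsed = parse_fields(match.group(1).split("|"))
        if parsed is not None:
            return parsed
    return None


decimal = first_coordinate("{{coord|51.5|-0.12|display=title}}")
dms = first_coordinate("x {{Coord|51|30|26|N|0|7|39|W}} y {{coord|1|2}}")
missing = first_coordinate("no template on this page")
print(decimal, dms, missing)
```

Returning `None` for unknown shapes makes the third accuracy problem visible: a format nobody anticipated shows up as a missing location, not as a wrong one.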

So there it is, extracting 70,000 locations out of unstructured information and repurposing it via a Web Service for external access. Couple of notes

1) The Wikipedia -> DB translation is done offline as a batch
2) Name based queries are limited to ten returns
3) Any issues let me know
4) Don't rely on any information in this for anything serious.

The next stage is to get some Web Services up that run WS-Security and some other WS-* elements so there are more public testing services for people to use.


Thursday, November 02, 2006

Heisenberg's SOA - the uncertainty of state

Heisenberg's uncertainty principle in physics basically says that you can't know both where something is and how fast it is going: the more accurately you know one, the less accurately you know the other.

The thread reference above was talking about statelessness and whether this is a good or a bad thing (answer: depends) but it raised an interesting thought in my mind about back-end reconciliation and the non-availability of actual present state.

Like the busy beaver (you can't know how much resources something will take) and the halting problem (or when it will stop) there is another problem in complex systems in that as a consumer you cannot know what the current internal state of a service is, and indeed you shouldn't care.

A few examples:

When I place an order I get a new order ID, before I make the call I have no idea what that ID will be, but it is a critical piece of information for me as it is my key into the tracking process. That ID represents a critical piece of state information that is internal to the service.

How the company "picks" its products in the warehouse and handles its inventory also impacts when I will get the product. If they decrement inventory at sale then I should get it quickly if it said it was in stock. If they decrement at pick then I'm in a race condition with others to get the element. Again this is a question of the internal representation of state and its processing by the service. As a consumer I have no way of knowing that state before I ask the service the question, and I cannot guarantee that the answer will be the same when I make my next request to the service.

All very obvious of course, but it does have an impact on the concept of stateless systems. It means that if you try and be stateless by having "before" and "after" requests then they are liable to fail, as something else will alter the "before" state independently of what you are trying to do. It's even more complex in those areas that do backend consolidation, either based on timestamps or (in the case of a certain airline) when they receive the request on the mainframe. In these cases it's impossible to say at 1pm what the 1pm answer actually is. The system doesn't know yet, and you certainly don't know.
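The "before" and "after" problem can be shown with a toy example. This is a minimal Python sketch with an invented `InventoryService`. The class, its methods and the starting order ID are all hypothetical and stand in for whatever the real service does internally.

```python
import itertools


class InventoryService:
    """Hypothetical service that keeps stock and order IDs as internal state."""

    def __init__(self, stock):
        self._stock = dict(stock)
        self._order_ids = itertools.count(1001)  # consumers cannot predict these

    def in_stock(self, item):
        # The answer is only true at the moment it is given.
        return self._stock.get(item, 0) > 0

    def place_order(self, item):
        # The service decides against its own current state, not the caller's view.
        if self._stock.get(item, 0) <= 0:
            return None
        self._stock[item] -= 1
        return next(self._order_ids)


service = InventoryService({"widget": 1})

saw_stock = service.in_stock("widget")       # my "before" request
rival_order = service.place_order("widget")  # someone else gets in first
my_order = service.place_order("widget")     # my "after" request

print(saw_stock, rival_order, my_order)
```

The "before" answer was correct when it was given and still useless by the time it was acted on. The only reliable answer is the one `place_order` returns, which is the service managing its own state on behalf of its consumers.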

This uncertainty of state is actually a good thing, it is letting the service manage its world in the most effective way that it can on behalf of its consumers. If the state had to be stored externally to the service (in some process message or something) then the level and degree of complexity would be unmanageable.

So maybe the uncertainty principle in SOA is that the simpler the interaction with the service, the less you know about the impact.

Statelessness is good sometimes, but it's not something to pursue at the expense of simplicity.
