I've been meaning to write this article for a number of weeks now and have finally been inspired by attending the IBM WebSphere user group session yesterday. I decided to follow the Web 2.0 and "Mashup" track, because I feel this is becoming increasingly relevant in telecoms.
It’s interesting to see how technologies such as IBM Mashup Centre, sMash and iWidgets make it possible to build short-lived (“situational”) applications very quickly, aggregating, consuming and repurposing content using visual drag-and-drop paradigms.
I recently proved this to myself, just for fun, in a very simple way, by assembling what you might call an "application" - frygle.com - prompted by the Twitterings of Stephen Fry. (For those of you who haven't heard of twitter.com, that ISN'T an insult.)
The "application" (for want of a better word) is part blog, part feed aggregator and part search engine. Specifically, the search engine, based on Google, not only searches the web but also gives additional weighting to the blog content and Twitterings of Mr Fry himself.
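To give a flavour of what "additional weighting" means in practice, here is a toy sketch of that kind of re-ranking: ordinary web results keep their base score, while hits from favoured sources float to the top. The source names, weights and scores are purely illustrative, not how frygle.com is actually wired up.

```python
# Illustrative weighted re-ranking of search results.
# Sources and multipliers below are invented for this sketch.
FAVOURED_SOURCES = {
    "twitter.com/stephenfry": 3.0,   # extra weight for Mr Fry's Twitterings
    "frygle.blogspot.com": 2.0,      # extra weight for the blog content
}

def boost(url: str) -> float:
    """Return the weighting multiplier for a result URL."""
    for source, weight in FAVOURED_SOURCES.items():
        if source in url:
            return weight
    return 1.0  # ordinary web result, no boost

def rerank(results):
    """Sort (url, base_score) pairs by boosted score, best first."""
    return sorted(results, key=lambda r: r[1] * boost(r[0]), reverse=True)

results = [
    ("http://example.com/fry-article", 0.9),
    ("http://twitter.com/stephenfry/status/123", 0.4),
    ("http://frygle.blogspot.com/post", 0.5),
]
print(rerank(results))  # the twitter and blog hits outrank the generic page
```

The point is simply that a mashup of this sort is a few lines of glue around existing services, which is why the whole thing could go live in an hour.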
This whole concept only took an hour to put together, including buying the domain name and going live on the Blogger platform. So the question is, how does some of this apply to the Telco world?
Well, I do think there are currently some issues around the widget concept for the voice world.
Firstly, the concept of multiple widgets works well in the visual paradigm, but it does not translate to the synchronous world of audio streams. More fundamental than that, though, is the fact that we are seeing some of the processing work moved to the browser itself. In the next-generation Telco network the media server takes the place of the user's browser. This increased separation into the model-view-controller way of doing things makes good sense, even though in the early days client-side scripting was considered bad practice because it hampered accessibility and SEO, and was subject to the vagaries of individual browser quirks.
However, in the voice world, where the real-time delivery of content is absolutely essential to the integrity of the voice user interface, additional processing load on the media server is still something most architects are going to shy away from. Furthermore, whilst we still carry the legacy of traditional telephony, we are still constrained to dimensioning systems in terms of concepts such as "ports". We have to guarantee performance and footprint, and therefore need the behaviour of media servers to be utterly predictable.
However, I do think the concept of widgets in a slightly more abstract sense still applies. And this is where we come to the discussion around "service creation" versus "service assembly". What the Mashup paradigm shows us is that it is possible to assemble services from higher-order building blocks, rather than writing individual lines of code. And this is exactly where we need to be in the Telco world.
Current voice service creation toolkits (and the Vicorp toolkit is undoubtedly the best example) go some way to achieving this by allowing the creation of reusable components. These can be built into increasingly higher-order applets of functionality, such that services can be built from rich building blocks without the need to work at the detailed UI level. This is definitely a step in the right direction, even though there are some ideological challenges to overcome.
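The idea of assembling a service from published building blocks, rather than writing dialogue logic by hand, can be sketched in a few lines. Everything here is hypothetical - the block names, the stubbed recognition result and the composition helper are invented for illustration, not taken from any real toolkit.

```python
# Hypothetical sketch of "service assembly": each building block is a
# reusable callable over a shared context, and a voice service is just
# a composition of published blocks.
from typing import Callable, Dict

Block = Callable[[Dict], Dict]

def welcome(ctx: Dict) -> Dict:
    ctx["prompts"] = ctx.get("prompts", []) + ["Welcome to the service."]
    return ctx

def collect_account(ctx: Dict) -> Dict:
    ctx["prompts"] += ["Please say your account number."]
    ctx["account"] = "12345"   # stubbed speech-recognition result
    return ctx

def read_balance(ctx: Dict) -> Dict:
    ctx["prompts"] += [f"The balance on account {ctx['account']} is 42 pounds."]
    return ctx

def assemble(*blocks: Block) -> Block:
    """Compose building blocks into a service, left to right."""
    def service(ctx: Dict) -> Dict:
        for block in blocks:
            ctx = block(ctx)
        return ctx
    return service

# Assembling a balance-enquiry service from three published blocks:
balance_service = assemble(welcome, collect_account, read_balance)
print(balance_service({})["prompts"])
```

The appeal is that `assemble(welcome, collect_account, read_balance)` is the whole service definition: swap a block for a richer published one and the service changes, with no work at the detailed UI level.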
For example, my vision would be to see an explosion of these building blocks, in the same way that we see an explosion of widgets, so that service assembly can be achieved using readily available published components. However, this is a pill that Telcos are probably going to find hard to swallow, not least because the average Telco has a completely different mindset to the average Web architect. As I like to point out in my presentations, this mindset is one that says stability is achieved by not changing anything; i.e. if it ain't broke don't fix it.
This is completely at odds with the web and SOA world, where applications are born, live, breathe and die on an almost daily basis. It's going to be quite a challenge for anyone involved in telephony to start to embrace this concept. (Which introduces a whole new discussion around just how embedded-in-the-network applications should be.) However, I am in no doubt that we are going to have to, particularly as we see the move to SIP-based services under the direct control of the application server rather than the media server.
But this is not the only challenge. The current generation of service creation toolkits do not go far enough in providing point-and-click Mashup capabilities: i.e. they do not provide the capability to consume and re-purpose content without resorting to writing the code by hand.
What we have are two separate worlds of the web and voice, with the voice world borrowing concepts from the web world. Although we have long talked about "voice as an application" as a means to bring these two worlds into the same domain, we are still very much at an early stage of achieving it. So it is with much interest that I am watching initiatives such as Ribbit, recently bought by BT, to see how it develops and whether it really will bring voice service assembly to the masses.
When I originally started at BT, working on their network-embedded voicemail service (“CallMinder”), it took three years to bring the project to fruition – which included developing both the application and the high-availability platform to run it on. (Bear in mind, these were the days when a 1GB disk in your server was bleeding edge.)
The dream now is to do it in three weeks. Could we?