Categories
technology

Native XML support in ECMAScript (E4X)

Yet another interesting nugget of information pulled from Jon Udell’s site. Makes you wonder how many bloggers are merely human aggregators of other people’s blogs. Eventually there’s one part content to an O(n^n) level of repetition, like P2P only worse, as the info is wrapped with ‘opinion’ by each subsequent blogger. There’s a study that could be done here using a combination of the Google API and Bloglines. Blog information is distributed virally? Discuss…
E4X is a native data type for XML in ECMAScript. More information here

Categories
technology

Web Services standardisation (or trying to pass a herd of overkeen elephants through the eye of a needle)

In a previous life as a research manager in an Irish research group called TSSG I wrote a piece about the semantic web for a technology column in a local paper. It’s the usual non-critical high-level look at a technology, but the excitement at the promise of the semantic web is very real.

However, I’m less than convinced about the current web services standardisation effort. In comments to another blog I was scathingly critical of the original WS technology (SOAP & XML-RPC) and the malaise of WS standards and specs. I’ve also been keenly following the wise words of Steve Vinoski, Chief Engineer of IONA Technologies, another company I used to work for. Before I digress onto another topic entirely I’m going to reiterate some of my original comments about WS standardisation, mirroring Steve’s feelings about the lessons that can be learned from CORBA regarding tool & vendor support. So without wanting to offend too many of the great people involved in the process, here are my considered thoughts:

  • “Web Services” is a brand name for a range of disparate and relatively unfocused technologies.
  • The technology was hugely overhyped without accepted standards to back it up
  • XML messages were touted as human-readable. If you know that many humans who read large XML schemas in their spare time you need to get yourself and your friends “to a nunnery”. OK, maybe not but you get the point 😉
  • It often seems that around 20 years of distributed systems thinking was ignored in their creation. Hence SOAP was misnamed “Simple”. “Incomplete” would have been more appropriate.
  • With usefulness comes complexity. With complexity comes unwieldiness and with unwieldiness comes confusion. The secret is normally appropriate abstraction but it’s early days yet
  • The standardisation effort is frustrating and feels uncoordinated. All too often standards are hurriedly created to plug holes in other standards. Often it feels like the wheel is being reinvented, as if nobody in the effort knows that RPC has been done before. I hear Vinoski’s cries for an overarching Architecture spec so that we have both a map and a flashlight
  • Almost none of this matters as the major industry players are now behind it in a bid to recapture the goldrush of the late 90s with a ‘must-have’ technology. For this reason alone the tool support will hide much of the complexity and encourage utilisation. This is already happening. Thank you Microsoft, IBM, HP, BEA, IONA, SUN etc.
  • The most loosely coupled thing about WS/SOA is often the standardisation process. There could be trouble ahead

However, there’s hope for us all in the form of REST. It may correct several issues with web services (including the lengthy standardisation process). WS piping is so incredibly powerful that it can’t be overlooked. Also, REST provides some neat answers to security issues, automation and the semantic web, and may just bring about world peace given an appropriate level of vendor support.

Arguably the URI is the reason the web took off in the first place. There were better transport and application layer protocols and more elegant markup grammars, but the idea of the URI is compelling. With REST, the semantic web & canonical URIs we may just be getting somewhere. I believe that these technologies will determine the success or failure of the web service initiative and everything else is pretty much window dressing.

Categories
technology

MySQL & XML Output

Pulled this from DECAFBAD… Nice tasty article all about the -X command line switch in mysql. Wonderful, I thought. XML-based power in the world’s most popular free relational database. I heard the sounds of dreams coming true... well, maybe 😉

However, just when I thought everything was peachy I started playing with this feature with less than spectacular results. Some intensive googling yielded the answer.
It seems that using the -X command line option to export data in XML format produces invalid XML. It assumes XML-escaped data in the DB!.. On what grounds??? MySQL only encloses the query results in XML element tags, but doesn’t encode the contents inside the tags.

In XML, using one of the characters <, >, & etc. literally inside an element is not valid. If you want to use one of those characters, you have to use the respective entity instead. MySQL doesn’t seem to do that, so selecting tagged data or markup like “<foo>red & green</foo>” with the -X command line option will always lead to invalid XML.
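The problem is easy to reproduce outside the database. The following is a minimal sketch in Python, assuming a made-up `<row><field>…</field></row>` wrapper (not the exact markup mysql emits) and a hypothetical helper called `is_well_formed`. It only shows that a bare & inside an element makes a parser reject the document, while the entity form is accepted.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Illustrative wrapper only; the element names are assumptions.
unescaped = "<row><field>red & green</field></row>"
escaped = "<row><field>red &amp; green</field></row>"

print(is_well_formed(unescaped))
print(is_well_formed(escaped))
```

Running a check like this over the exported file is a cheap way to find out whether the data in a given table trips the problem at all.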
An uncool workaround would be to perform some string replacements for every selected column when using the -X option:

  • replace all & by &amp;
  • replace all < by &lt;
  • replace all > by &gt;
  • replace all " by &quot;
  • replace all ' by &#39;
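As a rough sketch of that workaround done on the client side rather than in SQL, here is a small Python function (the name `escape_xml` is made up for illustration). The one detail that matters is the order: & has to be replaced first, otherwise the ampersands introduced by the other entities would get escaped a second time.

```python
from xml.sax.saxutils import escape

# Order matters: "&" must come first so the entities added by the
# later replacements are not escaped again.
REPLACEMENTS = [
    ("&", "&amp;"),
    ("<", "&lt;"),
    (">", "&gt;"),
    ('"', "&quot;"),
    ("'", "&#39;"),
]


def escape_xml(value: str) -> str:
    """Apply the five string replacements listed above, in order."""
    for raw, entity in REPLACEMENTS:
        value = value.replace(raw, entity)
    return value


sample = "<foo>red & green</foo>"
print(escape_xml(sample))

# The standard library does the same job; quotes are passed as extras.
print(escape(sample, {'"': "&quot;", "'": "&#39;"}))
```

In practice the standard library call is the safer choice; the hand-rolled version is only there to mirror the list.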

Other stuff, like language-specific characters (umlauts etc.), has to be encoded as well, or has to be handled by declaring or applying a different character set when post-processing the XML output.
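One way to sidestep the character set question when post-processing is to turn every non-ASCII character into a numeric character reference, which is valid whatever encoding the XML declares. A minimal Python sketch, assuming the value has already been decoded to a string (the function name `to_ascii_xml` is hypothetical):

```python
def to_ascii_xml(value: str) -> str:
    """Replace non-ASCII characters with numeric character references."""
    # "xmlcharrefreplace" is a built-in error handler for str.encode().
    return value.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_ascii_xml("Müller"))
```

This only deals with characters outside ASCII; the markup characters <, > and & still need the entity replacements described above.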

So the command produces invalid XML as invalid chars haven’t been escaped… Now this is a shame, as writing some code to escape it in the DB server coulda been done quite easily. A combination of escapes and using different charsets (perhaps as a command line option) along the lines of mysql --xml --xmlcharset=mycharset would be sweet. We’ll see what happens in the next release.