Using XSLT in the WSO2 Mashup Server

Maybe it’s just my long history with the technology, but I often find XSLT a convenient technology for manipulating XML.  When I scrape a Web page with the WSO2 Mashup Server, I generally use XSLT to extract and manipulate the values (e.g. my National Geographic Picture of the Day Feed mashup which I described here).  I find it performs better than extracting values individually through an XPath or RegExp filter or through E4X itself.

But what about when you’re not scraping a page?  While E4X does most XML manipulation tasks pretty well, it doesn’t provide some higher-level functions such as sorting.  The last thing I want to do is implement a sorting algorithm in JavaScript, when XSLT already does the task very well.

The Mashup Server doesn’t have direct access to an XSLT processor, but you can use the Scraper object to execute a transformation even without performing a scrape.

Here’s a simple function that you can add to a mashup to support transformations inside your mashup.

transform.visible = false;
function transform(source, stylesheet) {
  var config =
    <config>
      <var-def name="response">
        <xslt>
          <xml>
            <template>{source.toXMLString()}</template>
          </xml>
          <stylesheet>
            <template>{stylesheet.toXMLString()}</template>
          </stylesheet>
        </xslt>
      </var-def>
    </config>;

  var scraper = new Scraper(config);
  var result = scraper.response;

  // strip off xml declaration and any PIs, E4X can’t parse them
  while (result.indexOf("<?") == 0)
    result = result.substring(result.indexOf("?>")+2);

  return new XML(result);
}

Usage is simple:

var xslt = <xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
             …
           </xsl:stylesheet>;

var xml = <source/>;
var result = transform(xml, xslt);

Note that the function strips off any XML declarations or leading processing instructions, but you can (and should) also include <xsl:output method="xml" omit-xml-declaration="yes"/> inside your XSLT to make that extra cleanup step unnecessary.  Enjoy!

Blog Constellations Mashup

I finally got around to putting a few finishing touches on my Blog Constellations mashup.  It provides a visualization of a set of feeds, providing an intuitive sense of activity in a set of blogs, including frequency and size of posts, cross-links between the feeds, and highlighting links to "domains of interest."  This information proves useful in providing a feedback loop on the "quality" of blogging in support of the promotion of specific web sites, something WSO2 relies on as an open source company.

Some notes about the design of the mashup:

  • The blog analysis is performed by the mashup service (try it here), which has the following capabilities:
    • Subscription operations: add a feed to the analysis (trackBlog), remove a feed from the analysis (unTrackBlog), and list the feeds being analyzed (showTrackedBlogs).
    • Groups: Feeds are grouped under usernames so multiple groups of feeds are supported.  Groups can also be listed (listGroups) or removed (removeGroup).
    • Passwords: If you supply a password when creating a group, that password is required to add or remove items from the group or to delete the group entirely.  Groups without password protection are also supported.
    • A utility method (fetchBlog) fetches a blog and analyses it in terms of size and the links it contains.
    • The analysis for a group of blogs can be easily obtained (getActivity).
  • The graphics are created entirely in the browser using the <canvas> tag and the excanvas.js library for IE.  This works pretty well, although the graphic items don’t stay dynamic (and therefore can’t easily be used as links).
  • The code contains some interesting performance techniques that may be useful for other mashup authors:
    • Reuses the feedCache mashup service to cache feeds for faster response.
    • Does some local caching of group analysis as well for faster response.
    • The mashup calls itself (getActivity calls fetchBlog) asynchronously to enable feeds in a group to be fetched in parallel.
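
The local caching mentioned above can be illustrated with a small time-limited cache. This is only a sketch under assumptions: the names makeCache, maxAgeMs, now, and compute are hypothetical and do not come from the mashup's source, and the clock is passed in as a function so the expiry behaviour is easy to see.

```javascript
// Hypothetical sketch of a time-limited cache; none of these names are
// taken from the Blog Constellations source.
function makeCache(maxAgeMs, now) {
  var entries = Object.create(null); // no inherited keys to collide with

  return {
    // Return the cached value for key, or call compute(key) and store it.
    get: function (key, compute) {
      var entry = entries[key];
      var time = now();
      if (entry && time - entry.storedAt < maxAgeMs) {
        return entry.value; // still fresh: skip the expensive call
      }
      var value = compute(key);
      entries[key] = { value: value, storedAt: time };
      return value;
    }
  };
}
```

A caller would create one cache per kind of result, for example makeCache(60000, Date.now), and wrap the expensive analysis call in cache.get(groupName, analyseGroup), where analyseGroup is again a stand-in name.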

The mashup is of course hosted on mooshup.com, our online version of the WSO2 Mashup Server.  This means you can try it out online or download it and run it locally on your Mashup Server, or use the service with your own front end or whatever.  The mashup page has all these links, metadata, script libraries, and everything you need to reuse this service or its source code.

Want to make some improvements?  How about these:

  • True multi-user capability - each user has their own groups, and can keep those groups private or make them public.
  • A mashup that finds a blogroll and publishes it into this mashup as a new group.
  • Adding animation (e.g. smooth zooming and spinning) and linking to the graphic (hover over a dot to see a preview of the blog?)
  • URLs for groups including options (so you can bookmark a particular visualization.)

Enjoy!

Mooshup blazes

Keith upgraded mooshup.com to the 1.5.1 release.  I used to find it pretty sluggish, but no longer!  Try it out…

WSO2 Mashup Server heap space fix

Thanks to Tyrell for finally locating the source of the elusive Java heap space error, which I came across sporadically during my daily Mashup Server stress testing.  Tyrell describes the fix that you can add to your 1.5.1 installation here.

The Open Source Social Contract

Now that I’ve had the opportunity to observe the Open Source industry from the inside for a couple of years, I find myself musing a fair bit about the economics - broadly speaking the exchange of value - involved.  Much of the customer appeal for open source solutions comes from the sticker price, which is generally zero.  But as the adage goes, if something sounds too good to be true, it generally is.  If open source software is to be high quality and broadly applicable, the customer demand for low-cost software needs to be matched with incentives for vendors and individuals to continue to produce high-quality software.  When the transaction isn’t based on actual currency, what is the commodity that makes this a successful transaction for both sides?

For individual open source authors the incentives might include the joy of having a large and loyal user base.  It might include fame and the development of skills that lead to greater personal satisfaction and a more impressive and marketable resume.  But I’d like to focus here on the incentives for professional software development businesses to invest in producing more and better open source software.

The research, development and deployment of software is a complex and costly process.  How are open source vendors able to accomplish these goals without getting licensing revenue in return?  I believe there are many ways a vendor can receive value from a user besides an exchange of currency.  Although the commodities exchanged aren’t tangible the exchange is rarely a zero-sum game, and can enrich the supplier without depriving or depleting the customer.  No matter how many smiles you give, you never run out.

Enforcement of non-tangible exchanges is impossible, and thus the exchange relies on the good will of the customer to give back.  My intention in this post is to enumerate some of the ways this non-monetary economy works, and I hope to encourage users to participate more fully and consciously in holding up their end of the transaction, and thus to perpetuate a virtuous cycle of open source software development.  If you use open source software, please consider one of the following ways you can remunerate the creator.

What the user gives = How the creator benefits

  • Tell the author whether you liked the product or not = Reduced cost of soliciting customer feedback, ability to target new features more cost-effectively
  • Tell a friend or blog about it = Reduced awareness-marketing costs
  • Rate the technology positively on sites like Digg or Ohloh = Reduced awareness-marketing costs
  • Lend an eyeball to a promotion or advertisement = Makes marketing expenditures more productive
  • Become a registered user = Reduces costs of contacting users, helps accurately judge the popularity of the product and thus the level of continuing investment
  • Ask a question on the mailing list = Reduces costs of getting customer feedback
  • Answer someone else’s question on the mailing list = Reduces general product support costs
  • Write an article or blog about creative uses of the product = Reduces documentation and marketing costs
  • File a bug = Reduces QA costs
  • Send a patch = Reduces development costs
  • Implement a new feature = Reduces development costs
  • Download additional products = Reduces marketing costs and strengthens the business
  • Consider purchasing other products or services from the author = Improves profitability and increases ongoing R&D
  • Be grateful for the software = Increases everyone’s karma ;-)

The laws of economics state that the more rewards there are for a product or service, the more of that product and service will be produced.  By increasing the rewards for vendors to create useful and high-quality open source software, you encourage more of that software in the future.  Isn’t that worth an investment of a little time?  It doesn’t even lighten your wallet!

New toy :-D

As I posted back in 2005:

The US automakers don’t seem tapped into [the fuel economy] trend at all.  They still seem to think circumventing mileage minimums by pumping out SUVs is the way to sustainable revenues.  Last week Ford and GM were put on notice that they were wrong.  At least the blue half of this country, and I suspect lots of export markets, are willing to invest their automobile acquisition budget in a choice that reduces pump costs, unsightly and unhealthy smog, and reduces our dependence on foreign oil, and maybe even get a bit of value appreciation while they’re at it.  They’re even more motivated to vote with their dollars since their election votes haven’t provided much of a visible return.  Yet despite plenty of urging by the environmental community, Ford and GM seem to have ignored the inevitabilities of the long-term.  More and more of those purchasing dollars will head straight to Japan.  I suspect the next 15 years could be pretty rough as our automobile designers adapt.

My new Prius 8-)

High fuel prices have accelerated the timeline beyond what I had imagined, and Prius sales have accordingly boomed.

I’m proud to at last announce I’ve joined those voting with their wallets for fuel efficiency and low emissions (and unfortunately against our domestic automakers) with my new purchase!

Let’s hope American competitiveness is up to the challenge, and hope that they do a much better job of recognizing and capitalizing on long-term trends in the future.

"Spontaneous Reflections" Podcast Launched!

Today I’m launching my new musical podcast of short piano or keyboard improvisations.  I call it Spontaneous Reflections, and am planning to update it approximately weekly.  It’s available free in the iTunes store or you can subscribe directly here.

For those of you who don’t know, I started my career in music school and playing keyboard in the showrooms of Reno, but found the vocational aspects much less exciting than I hoped, and switched over to a more intellectually and financially rewarding career in computer software.

After a long period of playing primarily pipe organ, a couple of years ago I returned to piano and keyboard and though I don’t have my 20-year old chops (and pipe organ tends to ruin your sense of rhythm) I am exploring some new directions in my improvisation, really trying to dig into the instrument as a percussion instrument.

Live music is a funny thing though - each note goes out into the Ether and dissolves, gone forever except as ghosts in memory.  And I find that playing is quite a different experience than listening, heavily colored with the subjective.  To grow further I thought it valuable to be able to listen to some of my improvisations and see what works and what doesn’t, and to help me practice more discipline about the overall shape of a piece.  This podcast thus is an audio journal, not polished or refined, but genuinely spontaneous.

I hope you enjoy it!

Slightly Spastic (from Off the Path)

Variations on a rhythmic theme over a trivial chord sequence tarted up with some flavorful intervals.

 
Slightly Spastic [4:16m]

“Why SOA Architects Should Care” about Enterprise Mashups

Can’t help but repeat a few choice bits from a recent article Enterprise Mashups Part II: Why SOA Architects Should Care by Chris Warner and John Crupi.  Too bad I didn’t see this before the Enterprise Capabilities of the WSO2 Mashup Server 1.5 webinar I gave this morning - I could have cribbed a few bits of wisdom! Like these:

Mashups bring a “user” into the SOA mix. … By having the business build mashups upon the foundation of your service-oriented architecture, they essentially become SOA champions without knowing it.

This is a point we tried to hit strongly in the WSO2 Mashup Server, which can help bring individuals, teams, or enterprises together into a community around services, and strengthen the concept of a "user" and help build a diverse set of user experiences tied to services.  Until the SOA platform gets connected to the business users, the promises of SOA (business agility) won’t be fully realized.

In addition to SOA-friendly formats, such as RSS and Atom plus REST and SOAP, mashup creators can publish mashups to spreadsheets, as WSRP-compliant portlets, wiki- and blog-friendly widgets, or even into a mobile phone as a micro-application. Mashups can become the vehicle through which services become part of the everyday tools of the enterprise business user.

Which is why we’ve added on to the RSS, Atom, REST, and SOAP support in 1.0 the ability to interact with spreadsheets and databases (through Data Services), and are starting our foray into the world of portlets, widgets, and gadgets with support for Google gadgets.

Mashups can go well beyond leveraging an SOA by becoming part of that SOA, allowing developers to create customized “service skins” from core services.

Because mashups can be exposed as REST-, WSDL- and JSON-based services, they look and feel like a real SOA-based service to developers who want to consume them. … The mashup, created by the developer, becomes the tailored service which is directly aligned with their particular need. Major enhancements to core services can be accommodated with a reformulated or updated mashup by the mashup creators themselves.

Yes, we’ve found our users (and ourselves) using the WSO2 Mashup Server for more than just combining services into a new one, but for rapid development of new services, or for customizing services to a particular scenario.  From the outside, these don’t just look like real services - they are real services - for example in the 1.5 release high levels of security can be applied to the services even though they are so easy to write.  A good example of a "skinned" service - the digit2image sample service that comes with the mashup server tailors a very general purpose and complicated service (Flickr) for a particular narrow purpose - to find a random image, with appropriate copyright, for a single numeric digit.  By providing a narrow API, this capability can be exposed more easily and efficiently (some caching is performed by the service to speed up returns).  And it allows innovation without touching the original Flickr service which we don’t have control over.  For instance, I could search beyond Flickr for appropriate images without breaking the service contract or rewriting the Flickr service.

Toward the end, the authors write:

At this point any SOA architect worth his WSDL should see that mashups can greatly enhance an SOA and don’t necessarily ignore or break the principles that made your SOA great. But governance, granularity and scope for mashups do require subtle tweaks and adjustments to your enterprise toolset.

Indeed, governance is important as your SOA grows and chaos theory has a chance to get a toehold.  The WSO2 Mashup Server is taking the first steps into governance by relying on the WSO2 Registry to store mashups in a versioned repository, manage ownership, users and roles, and facilitate communication and the building of trust in an Enterprise SOA community.  But we still have to do a better job of working with the Registry’s features for supporting multiple versions, product lifecycles, and dependency management.  For instance, while the Mashup Server stores all its mashups, comments, tags, ratings, etc. in the Registry, when you deploy a new mashup, it doesn’t automatically populate the registry with the WSDL.  We’re looking at much better integration with the Registry as a primary feature of our next Mashup Server release.

Kudos to the authors for clearly presenting not just the case for service mashups in an SOA architecture, but why any SOA architecture that doesn’t expand into the mashup space is missing out on huge opportunities.

WSO2 Mashup Server 1.5.1 release

As announced on the mooshup blog, the WSO2 Mashup Server team completed a new release, which includes 50 bug fixes.  Most of these are small, but a few are worthy of note:

  • Issues with Email host object in v1.5 fixed.
  • Private key management features.
  • Fixes in the SMTP and JMS transports (Keith is working on an article about how to enable them.)
  • WSRequest host object now supports a fire-and-forget message exchange pattern.
  • Supporting http proxies more consistently (e.g. in the Feed object).
  • Lots of little UI nits and polish.

I’m very pleased with this release, as it is often the case that some of the fit and finish work gets postponed in the last stages of a major release like 1.5.  Open source is about rapid and continual improvement, and this release fits that bill nicely!

Download it today!

Holding On

Spare minimalistic piece allowing you to sink into the rich tones of the Yamaha CP300. All sounds played live (no dubbing.)

 
Holding On [2:15m]

Upcoming Webinar: Enterprise capabilities in the WSO2 Mashup Server 1.5

As announced here:

Title: Enterprise Capabilities in the WSO2 Mashup Server 1.5
Date: August 12, 2008
Start Time: 9.00 am PST
Duration: 60 min
Presenter: Jonathan Marsh

The WSO2 Mashup Server provides a powerful platform for developing and deploying mashups. In this Webinar, we’ll provide a brief introduction to the Mashup Server architecture and Javascript-based development model, as well as highlight the new features in the 1.5 release that support enterprise deployments. These include support for message level security (authentication, encryption, signing) on mashup services, and the ability to expose a data source such as a relational table, an Excel spreadsheet, or a simple CSV file as a zero-coding Data Service.

Attend this Webinar if you:

  • have services, web pages, or other information sources available but want smarter ways to use those services.
  • want to expose or compose services in a simple yet powerful way, complete with enterprise-level security.
  • are an existing WSO2 Mashup Server user and would like an overview of the new features in the 1.5 release.

Your presenter, Jonathan Marsh, is the Director of Architecture for Mashup Technologies, and led the design and development of the WSO2 Mashup Server. Jonathan, a product designer and devoted script hacker, has spent over ten years developing and standardizing XML and Web service technologies, most recently at Microsoft where he contributed to the standardization of XML, XSLT, XPath, WSDL, and other technologies.

Register now!

A Confluence of Fortuitous Circumstances

1) Project Auburn, a celebration of civic pride and elbow grease, in addition to renovating the classic State Theatre, has renovated the rear garden of OLAS, making it a great site for sculpture, relaxing, and entertainment.  I helped put the fence up and did some of the repainting a few weeks ago when hundreds of volunteers turned out to make Auburn a better place.  Kudos to the Rotary Club and other civic organizations for making our town a better place!

2) The Auburn Art Walk, a recurring evening of open studios and galleries, occurs every Second Thursday throughout the summer.

3) OLAS is celebrating the new landscaping, in conjunction with the Art Walk, with a show entitled From Earth To Sky, August 14th from 6:00 to 9:00 PM.

4) I acquired a new keyboard, a Yamaha CP300, at long last giving me some mobility to my music.  (And making it easier to get quality recordings, more about which later.)

The outcome of all these fortuitous circumstances?

Jason and I will be playing our usual eclectic mix of improvisatory world/new age/jazz/undefinable music for the event.

If you’re in the neighborhood, come on by!

Mashups that work despite cross-site scripting (XSS) browser restrictions

What is XSS and why is it restricted?

[Note: also published on the Mooshup.com blog.]

Disclaimer: I’m not a security guru, so what follows is my opinion, observation, and experience.  Please feel free to comment and correct!

Modern browsers protect against release of private information to a third (possibly malicious) party by imposing what I’ll call “cross-site scripting” or “XSS” restrictions (strictly speaking, XSS is the name of the attack, and the browser rule that counters it is the same-origin policy). The basic attack is that a browser is pointed to a web site that is trusted (to some degree) by the user. In using the web site the user may provide some confidential information to that web site, such as a password, bank account number, or some other information that could be exploited by a malicious entity. Naturally, the user would hesitate to provide this information to a site that he doesn’t trust (the user can be fooled through a set of techniques known as phishing, but that’s a different story). Here we assume that the user has gone to a site that is authentic.

However, it is possible that even a trustworthy site, through poor construction or through compromised delivery mechanisms, could be “hacked” by a third party. For instance, accessing the site through an open (but malicious) wireless network may allow the page to be subtly changed during transmission. This change might be to insert a bit of script code that records the interactions the user has with the page, including information he enters such as a password, and also information that is provided by the website to the user. The inserted script could collect this private information, and then "phone home" to the attacker.  HTTPS can mitigate such attacks by securing the communication channel, but interactions with plain HTTP sites may still disclose user secrets of various levels of sensitivity.

Attacks can also come, and generally do come, over a trusted internet connection, even possibly through HTTPS. Anytime user-generated content appears in a page (e.g. comments on a blog, etc.) there is a possibility that third party, and thus untrustworthy, content is piggybacked on a trusted site.  Plain text third-party content is benign (what you see is what you get), but if the content can be submitted in html, it is possible that such html can include malicious scripts. For this reason, a trustworthy site that allows user-generated content must scrub any user-generated content provided to it, removing anything that could be executed as script.
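
As a rough illustration of that scrubbing step, here is a minimal sketch. It takes the simplest route, which is an assumption of this example rather than anything prescribed above: treat every piece of user-generated content as plain text and escape the characters that HTML gives special meaning to, instead of trying to strip the dangerous parts out of submitted HTML. The function name escapeHtml is hypothetical.

```javascript
// Hypothetical helper: make user-supplied text safe to place inside HTML
// element content or a quoted attribute by escaping markup characters.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")   // must run first so later entities survive
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
```

With this in place a comment containing a script element is displayed as literal text rather than executed. A site that wants to allow some HTML in comments needs a real sanitizer, which is a considerably harder problem than this sketch addresses.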

From the perspective of browser vendors there are a lot of sites out there and not all of them consider the security implications of user-generated content adequately. To help protect the user, XSS counter-measures in the browser attempt to limit the ability of any scripts within the page to “phone home”. This is accomplished by preventing HTTP POST (the method used to submit forms and upload data) access to any web site domain called on by the page other than the one from which the main page originated. For instance, a page from http://wso2.org can access “safe” content from alternate domains like http://wso2.com, including images, stylesheets, even script libraries. The page however won’t be able to post a form containing user input to anywhere but http://wso2.org. The XMLHttp object provides a way to POST from script, and also prevents information from being posted to any domain other than the page domain.

So as a user, having a browser watch out for these types of attack and prevent them seems useful.  But let’s consider situations where they get in the way of useful, trustworthy work.

 

Why a developer might want to access alternate domains

For a web developer (especially a mashup developer) XSS can be quite a pain, as it limits your ability to write a page that spans domains.  It limits your ability to host AJAX and Web Service interactions (powered by the XMLHttp object) anywhere other than your primary domain.  For instance, you can’t host a Web service on mooshup.com and use it within your own application (at least directly from the browser).  Even though both sites may be trusted by you as the web site author, the browser enforces a blanket restriction on this access. (Each browser has mechanisms that may loosen this in some circumstances, but nothing that is both zero-config and cross-browser.)

This restriction limits applications such as gadget pages (e.g. iGoogle.com) that aggregate information from a large number of sources.  The Google Gadget framework, for instance, provides a way to GET information through a proxy on the trusted server, but currently disallows similar capabilities for POSTing.

 

The Loophole - Script Injection

Don’t start feeling too secure as a user, or too disappointed as a developer trying to do legitimate work – there are some loopholes that can be exploited.

As described above, an HTTP GET operation is assumed to be safe across domains, while HTTP POST is not.  If one could masquerade a POST as a GET, one could circumvent security restrictions.  In particular, script can be fetched regardless of domain.  This powers important functionality, such as third party libraries, an important feature supporting simplified development, analytics, and advertising.  Basically one needs to translate the body of the POST into url parameters on the GET (recognizing there are length and encoding issues to deal with), insert a <SCRIPT> tag dynamically into the page which uses the GET, and the server on the external domain can access the "posted" information.  It can even send a response back in the form of a block of script (essentially a callback).  Of course, you need to insert script into the page initially to get the ball rolling - which can be pretty difficult over a secure connection or for sites that properly sanitize user-generated content.  But if you’re the owner of the original site, it’s not terribly difficult once the technique has been mastered.
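
The masquerade described above can be sketched in a few lines. This is a minimal illustration, not the code of any particular library: the function names (buildInjectionUrl, injectScript), the "payload" and "callback" parameter names, and the endpoint used in the usage note are all assumptions made for the example.

```javascript
// Hypothetical helper: turn what would have been a POST body into GET
// parameters. The "payload" and "callback" parameter names are assumptions;
// a real server would define its own.
function buildInjectionUrl(endpoint, body, callbackName) {
  var separator = endpoint.indexOf("?") === -1 ? "?" : "&";
  return endpoint + separator +
    "payload=" + encodeURIComponent(body) +
    "&callback=" + encodeURIComponent(callbackName);
}

// Hypothetical helper: fetch the URL by adding a <script> element to the
// page. The server is expected to answer with script that calls the named
// callback, which is how the response gets back to the page.
function injectScript(doc, url) {
  var script = doc.createElement("script");
  script.type = "text/javascript";
  script.src = url;
  doc.getElementsByTagName("head")[0].appendChild(script);
  return script;
}
```

In a page, you would first define a global function such as handleReply, then call injectScript(document, buildInjectionUrl("http://example.org/service", requestBody, "handleReply")). Because everything travels in the URL, the length limits mentioned above apply to the encoded body.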

Naturally, a user can protect themselves reliably against these attacks by turning off Javascript in the browser.  Or cutting the internet wires.  Or burying the computer in the backyard and raising carrier pigeons.  All quite practical alternatives, don’t you think?

 

WSO2 Mashup Server 1.5

The new WSRequest.js in the new WSO2 Mashup Server 1.5 release has facilities to “exploit” this loophole when the developer wants to access a mashup from a page within another domain.  If an "access denied" error is returned from the XMLHttp object, a script injection is attempted instead.  This allows you to use the Mashup Server’s convenient stubs within a page or a gadget without encountering the XSS pain.  There are some restrictions in the fine print of course - only asynchronous calls are allowed, message size is limited, and the wire-level messages no longer are conformant to open standards for Web services, but those aren’t unreasonable considering the alternatives.
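
The fallback idea can be pictured roughly as follows. This is not the WSRequest.js source, only a sketch under assumptions: postWithXhr and sendByInjection are hypothetical stand-ins for the two transports, the request object's async flag is an invented convention, and the cross-domain failure is assumed to surface as a thrown exception.

```javascript
// Sketch only: try the normal XMLHttp route, and fall back to script
// injection when the browser refuses the cross-domain call.
// postWithXhr and sendByInjection are hypothetical transport functions.
function callService(request, postWithXhr, sendByInjection, onReply) {
  try {
    postWithXhr(request, onReply);
    return "xhr";
  } catch (e) {
    // Assumed here: the browser signals the blocked call by throwing
    // (for example an "access denied" error). A fuller version would
    // inspect e and rethrow anything that is not a security failure.
    if (!request.async) {
      throw new Error("script injection only supports asynchronous calls");
    }
    sendByInjection(request, onReply);
    return "injection";
  }
}
```

The return value simply reports which route was taken, which is handy when debugging; the asynchronous-only check mirrors the first restriction listed above.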

 

Conclusion: time to drop XSS restrictions?

So, my question is, if XSS restrictions are so painful, yet circumvented with a modest bit of work (hey I’m no genius at this stuff and I did it) why are the XSS restrictions in place at all?  Instead of trading off convenience for security, you’re imposing inconvenience without actually making a meaningful contribution to the user’s security.  The additional level of security provided by making cross domain access simply obscure rather than truly prohibited doesn’t seem worth it.  Is it time to dump XSS restrictions?  Or do we need to add a new (and further inconvenient) restriction against inserting <SCRIPT> tags into a page dynamically?  As long as there is any cross-domain access I don’t think I’ll be completely secure.  And eliminating it would rule out advertisement insertions, which I don’t think is going to happen anytime soon!