In celebration of the new WSO2 Mashup Server 1.5.2 release I’ve been looking through some of my old mashups and maintaining them and adding some new features where appropriate. Here is a summary:
- Sri Lanka Incident Mashup – the page I was scraping changed (factored into multiple pages) so for older history I had to update my scraping code. I added unique identifiers for each data point and an operation to add a manual override in cases where the English-to-integer parsing was insufficient, and added controls from the UI to make it easy to add these corrections.
- National Geographic Picture of the Day Mashup – I added operation safety to make it easier to use as a RESTful service, added a global variable to hide the test operations when not in “debug mode”, fixed the links back to the picture-of-the-day page, updated the internal logic to use the Feed object (I wrote this before the feed object could handle media modules), stripped some script tags from the description, and added a format parameter that determines whether the feed should be returned in RSS 2.0 or in Atom 0.3.
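The English-to-integer parsing with a manual override, mentioned for the Sri Lanka mashup, could look roughly like the sketch below. This is not the mashup's actual code: the names parseEnglishCount and resolveCount, the shape of the overrides object, and the limit to numbers below 1000 are all assumptions made for illustration.

```javascript
// Hypothetical sketch: word tables for numbers below 1000.
var SMALL = {
    zero: 0, one: 1, two: 2, three: 3, four: 4, five: 5, six: 6, seven: 7,
    eight: 8, nine: 9, ten: 10, eleven: 11, twelve: 12, thirteen: 13,
    fourteen: 14, fifteen: 15, sixteen: 16, seventeen: 17, eighteen: 18,
    nineteen: 19
};
var TENS = {
    twenty: 20, thirty: 30, forty: 40, fifty: 50,
    sixty: 60, seventy: 70, eighty: 80, ninety: 90
};

// Returns an integer, or null when the text is not a number we understand.
function parseEnglishCount(text) {
    var cleaned = String(text).toLowerCase().replace(/-/g, " ").trim();
    if (/^\d+$/.test(cleaned)) return parseInt(cleaned, 10);
    var words = cleaned.split(/\s+/);
    var total = 0;
    var seen = false;
    for (var i = 0; i < words.length; i++) {
        var w = words[i];
        if (w === "and" || w === "") continue;
        if (SMALL.hasOwnProperty(w)) { total += SMALL[w]; seen = true; }
        else if (TENS.hasOwnProperty(w)) { total += TENS[w]; seen = true; }
        else if (w === "hundred" && seen) { total *= 100; }
        else return null; // unknown word: leave it for a manual override
    }
    return seen ? total : null;
}

// A manual override, keyed by the data point's unique identifier,
// always wins over the parsed value.
function resolveCount(id, text, overrides) {
    if (overrides && overrides.hasOwnProperty(id)) return overrides[id];
    return parseEnglishCount(text);
}
```

Returning null rather than guessing is what makes the override useful: any data point that comes back null is a candidate for a manual correction from the UI.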
These are still pretty useful mashups to me. (Along with my southwest agent which I’ve improved a lot but can’t offer publicly.) Too bad they both rely on scraping, which makes regular maintenance a necessity to prevent bit-rot. We need more Web services alongside Web sites!
Today a minor release of the WSO2 Mashup Server went out. Nothing dramatic, but a few nasty bugs have been fixed and annoyances addressed, including:
- A memory leak that some of us experienced when running periodic page scrapes over a long period.
- Case insensitive searching.
- An “ignoreUncited” annotation to allow the HTTP binding to POST parameters in the body instead of as URI parameters – which have a 2K limit in IE, and limits in other browsers as well.
- Recent Activity queries on a specific mashup now work – a typeahead helps you identify the mashup you want to track.
- Feeds and getXML now support basic authentication.
This clears the way for the larger project – bringing mashup capability into the WSO2 Carbon platform.
Enjoy!
[Also posted to the mooshup blog.]
The third installment of my introduction to the WSO2 Mashup Server is now live.
This week’s episode augments last week’s episode (on scraping a web page) by composing the scraping service with a simple storage service, and adding periodic execution to generate an ongoing history of the number of downloads of my podcast.
For best results (when you have more bandwidth) view the screencast in its original size and quality here or click below. [Note: this appears to require an OxygenTank login – a mistake that should be corrected soon.]
If your bandwidth is constrained and you’re willing to put up with some blurriness, a smaller version has been posted on YouTube here.
[Also posted on the mooshup blog. 10/29: Updated links.]
Following up on last week’s screencast, the second in the series is also live now.
This episode shows a simple but fairly realistic scenario for the WSO2 Mashup Server: analyzing a web page using Firefox/Firebug, creating a service to scrape information out of the page, and exposing it as a service, including a little live debugging! It also looks a little deeper into the type annotations and their effect on the generated WSDL. Hope you enjoy it!
For best results (when you have more bandwidth) view the screencast in its original size and quality here or click below.
If your bandwidth is constrained and you’re willing to put up with some blurriness, a smaller version has been posted on YouTube here.
[Also posted on the mooshup.com blog. 10/29 Updated links.]
Last week we launched the first in a series of screencasts I created, showing how to get started on the WSO2 Mashup Server, and then jumping in to some fairly realistic scenarios for building a powerful mashup.
Screencast #1 is the very basics – building a Hello World service using nothing more than the WSO2 Mashup Server and Notepad.
For best results (when you have more bandwidth) view the screencast in its original size and quality here or click below.
If your bandwidth is constrained and you’re willing to put up with some blurriness, a smaller version has been posted on YouTube here.
I wondered how Southwest found my auto-checkin mashup so quickly. Until I did a Google search of “southwest automatic checkin” and found that my corresponding blog entry showed up at number 6. I have little doubt that Southwest subscribes to such a feed and uses it to notify their legal department of any new potential infractions.
So was my mashup caught by a mashup? Four out of five conspiracy theorists say yes! (The fifth doesn’t want to cross his secret CIA paymasters.)
This Southwest incident has made me somewhat introspective. As I started to muse in my previous post:
… it would be nice to have not just Terms and Conditions that allowed the site and its content to be reused freely. … It’s unfortunate they can’t do this without changing their checkin and boarding procedures in some way.
I really wonder whether in the long term Southwest’s procedure can withstand the onslaught of “bots” like mine. It seems doubtful. But the consequences of being overrun by “bots” are already causing alarm in some places.
For a very thoughtful look at the risks, the consequences, and maybe some ways to approach solutions, I highly recommend checking out the Long Now Foundation’s excellent podcast featuring Daniel Suarez talking about Bot-Mediated Reality. Based on that I can’t wait for his book to be rereleased ;-).
That didn’t take long! My (now deleted) southwestAutoCheckin mashup has generated some controversy. Within a few days of posting it, WSO2 received the following email from Southwest Airlines (minus headings and signature):
October 9, 2008
VIA EMAIL
Dear Sanjiva Weerawarana Ph. D,
I represent the Brand Protection Team for Southwest Airlines Co. (“Southwest”). Jonathan, Director of Mashup Technologies at WSO2, recently blogged about offering a service in which your company, without Southwest’s permission, offers online check-in services and boarding passes for Southwest flights. The name of this website is Mooshup.com.
The program offered on Mooshup.com uses southwest.com to check Southwest Customers in for a Southwest Airlines flight in an effort to obtain for them an “A” boarding pass. By using the Southwest Airlines website you are bound by the Terms and Conditions and the Use Agreement which all users are subject to in exchange for using southwest.com. Mooshup.com is in direct violation of the Use Agreement including, but not limited to, the following:
Southwest’s web sites and any Company Information is available to you only for your personal use to determine the availability of goods and services offered on the web sites and to transact business with Southwest. Unless you are an authorized Southwest travel agent, you may not use Southwest’s web sites for any commercial use or other purpose.
…
You may not use Southwest’s web sites for or in connection with offering any third party product or service not authorized or approved by Southwest. For example, online check-in service providers may not use the Southwest web sites to check-in Customers online or attempt to obtain for them a boarding pass in any certain boarding group.
…
For any use not specifically listed herein, you may not use Southwest’s web sites after Southwest notifies you that your use is not authorized and requests that you cease such use.
As your company is using southwest.com in an unauthorized manner which Southwest believes to be harmful to its business and its Customers, we are asking you to discontinue this activity immediately.
Southwest hopes that this matter can be quickly and amicably resolved. I’m asking that by October 23, 2008 you either sign and return a copy of this letter to indicate your agreement with the terms, or reply back to this e-mail with an indication that you have discontinued offering this service and removed the entry from Mooshup.com homepage and the blog posting from MSN Windows Live (auburnmarshes.spaces.live.com/blog).
Please be advised if we do not hear from you by the specified date, Southwest reserves the right to take whatever action we deem necessary to enforce our rights.
Very truly yours,
I did the mashup to demonstrate a simple workflow including scraping and notifications. I wasn’t really trying to undermine Southwest’s business model. Everyone knows Southwest is my favorite airline. So, where did I go wrong? Southwest above outlines two complaints: personal use only, no online check-in service providers.
First of all, personal use seems pretty tricky. A user of my service is actually using it for personal use, to interact with Southwest’s web sites. I don’t really think providing the user with a better tool, essentially a better “browser” tailored to Southwest’s site, changes this. At no time does the mashup contact Southwest except under specific instructions from the user. So I hold the mashup itself blameless.
However, their second complaint seems a bit more concrete - offering the southwestAutoCheckin service online. I can see that Southwest would consider mooshup.com, or even my user account on it, as an “online check-in service provider”, and that could be argued as a violation of the second item quoted in the terms above, especially since I made it publicly available.
Are these Terms and Conditions reasonable? I think they fit within the current norms of society, reflecting the fact that an online service should be used in a way consistent with its purpose and limiting abusive behavior. If, for instance, a random user had posted the southwestAutoCheckin service, our mooshup.com Terms and Conditions would have allowed us to remove it for the simple reason of avoiding even a semblance of conflict. If every service provider had to consider countermeasures for every possible type of abuse we wouldn’t have the array of services we do now.
And I can see if my service became popular that it would cause significant upheaval in Southwest’s customer experience (if 50% of a flight used my service and 50% didn’t, the latter would be disadvantaged in their seat selection, with no recourse through Southwest if they thought the procedure unfair.)
However, I think the norms have some evolution ahead of them, and it would be nice to have not just Terms and Conditions that allowed the site and its content to be reused freely, but also real Web Services to make this valuable information (some of which is owned by the user after all) available in an easier manner. It’s unfortunate they can’t do this without changing their checkin and boarding procedures in some way.
The moral of the story – carefully read the terms and conditions of any site you scrape, and make sure you stay within those terms. If you don’t like the terms, find a different provider, or agitate for better terms. This is a real danger for mashup authors.
In this case, there is no percentage for me in conflict, real or perceived, with Southwest – definitely not for a simple Mashup Server demo. So the southwestAutoCheckin mashup is permanently offline. I don’t see anything in the Terms and Conditions that justifies removing any blog post, and I’m confident that updating the previous post and providing context here will satisfy Southwest while remaining fully transparent.
[Update 9 Oct 2008: this mashup has been pulled off of mooshup.com. See here for details.]
I wanted to go to the head of the line on my recent Southwest Airlines flight. Although Southwest doesn’t have assigned seats, those who check in first get to board first. The last end up with yucky middle seats. Especially useful in playing this system is the 24-hour advance checkin online. If you can check in early, you get the lowest boarding number and your pick of the choicest seats. But virtually everyone seems to know this now and there’s a bit of a grab for the best boarding numbers. Wait two or three hours and half the passengers have checked in.
This time I successfully checked in online within 5 minutes of the opening of online checkin, even though I was actually in the air on another flight at the time! My new and still under-development southwestAutoCheckin mashup worked brilliantly and gave me one of the lowest boarding numbers I’ve seen. It works like this:
- Use the try-it page to input information about your upcoming flight (confirmation number, passenger name, airport codes and so forth) into the mashup when you make your reservation.
- The mashup will alert you that it is tracking your flight for automatic checkin.
- 48 hours before the flight, the mashup will remind you it’s planning to check you in automatically.
- 24 hours before the flight, the mashup will actually check you in, although it can’t print your boarding pass for you.
- 3 hours before the flight, the mashup will alert you of the time of your flight.
- Reprint your boarding pass at an airport kiosk.
- After the scheduled flight time, the mashup will alert you that it has completed its work and is deleting the flight from its watch queue.
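The timeline above boils down to a handful of offsets from the departure time. Here is a minimal sketch of that calculation. It is not the code of the (now removed) mashup: the function name checkinSchedule and the action labels are made up for illustration, and the departure time is assumed to be known already in GMT/UTC.

```javascript
var HOUR_MS = 60 * 60 * 1000;

// Given the departure time (a Date in UTC), list when each step should fire.
// Action names are hypothetical labels, not identifiers from the mashup.
function checkinSchedule(departureUtc) {
    var t = departureUtc.getTime();
    return [
        { action: "remind",    at: new Date(t - 48 * HOUR_MS) },
        { action: "checkin",   at: new Date(t - 24 * HOUR_MS) },
        { action: "alertTime", at: new Date(t - 3 * HOUR_MS) },
        { action: "cleanup",   at: new Date(t) }
    ];
}

// Steps whose time has arrived, for a periodic job to act on.
function dueActions(schedule, now) {
    return schedule.filter(function (step) {
        return step.at.getTime() <= now.getTime();
    });
}
```

A periodically executed function would call dueActions with the current time, perform each due step, send a notification, and drop the flight from the queue once "cleanup" has run.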
When I say “alert” I mean that the southwestAutoCheckin service uses the alertme service to distribute a notification to you. You can register with the alertme service to send alerts through any set of email addresses, instant messenger accounts (MSN and Yahoo currently supported), or a Twitter feed. Pretty cool, no?
Still lots of improvements that can be made – more IM providers and even SMS for the alertme service, as well as a nice HTML interface; scraping the information out of a GMail message instead of having to input it manually, as well as automatically calculating the GMT time for the flight based on local time and the airport name. But the basic mashup seems to have worked brilliantly in its first real world trial.
So yesterday when my flight started boarding, was I there proudly at the head of the line? Ironically, no. Boarding started early, and in the few moments while I was packing up my laptop and charger, all 18 other passengers had boarded. Yes, despite having the best boarding pass obtained by my personal digital agent, I was the very last to board. But I still had 119 empty seats to choose from ;-).
Upcoming flight on Southwest? Let the southwestAutoCheckin mashup give you a helping hand!
Mihoko asks a question about creating a custom Google Gadget for a mashup using the WSO2 Mashup Server, and displaying it in the Dashboard:
I have a question about Dashboard feature. Is it possible to display images on my own Dashboard as the Try-it Gadget? … For example, about digit2image Mashup, this gadget is able to get the URL information stored image, but not able to display the image. Is it possible to display the image on my own Dashboard?
I thought this would be a good opportunity to explain some of the widget/gadget features now available in the WSO2 Mashup Server and walk through building a super-simple Google Gadget.
What is a gadget or widget?
A widget or gadget is a little program, usually with a cute and compact UI, that runs inside a widget engine. The widget engine generally has the characteristic that a number of widgets can be viewed at once, allowing a user to construct their own digital dashboard of relevant information sources. There are a number of widget technologies under various names – gadgets, widgets, portlets. Some examples of widget engines include the Google Desktop, iGoogle, Windows Vista Sidebar, Windows Live widgets, Yahoo! Widgets, Apple’s Macintosh Dashboard, and lots more.
Most widgets or gadgets today seem to be simple user interfaces for online services – tracking weather, the stock market, movie times, quotes-of-the-day, etc. We heard from a number of users that the WSO2 Mashup Server was a powerful tool for building the on-line part of these applications, and in the 1.5 release we added features to make it easier to create, test, and even host widgets.
WSO2 Mashup Server widget support
Of course we had to start small, so we focused on one widget engine, Google Gadgets, which can be used in the Google Desktop or iGoogle, and provided some great features:
- A Google Gadget version of the try-it for a mashup
- A source code template for building a custom Google Gadget for a mashup
- A link to the custom Google Gadget for a mashup
- Support in the Javascript stubs for cross-domain access to a mashup
- A customizable Dashboard for each user, which can host Google Gadgets from the Mashup Server or from other places.
Tyrell wrote a fairly definitive guide to these gadget features in Converting your WSO2 Mashup Server to a Personalize Dashboard, which I recommend. I’ll give a brief and rather simplistic summary here for those of us with short attention spans!
Adding a Google Gadget to the Dashboard
You can find your customized dashboard for a local installation of the Mashup Server at https://localhost:7443/dashboard/ or you can get a free account (just supply a valid email address) at http://mooshup.com and access your customizable dashboard at http://mooshup.com/dashboard. A few 3rd party gadgets are installed by default, but you can close any you dislike and replace them with your own. Just select “Add Gadgets” and either type in the url of the gadget (available from the gadget publisher), or select one of the local mashups for a “try-it” gadget for that service.
Note that not all services can fit into the limited size of the gadgets yet – try out the “version” gadget for one that fits pretty well. We have some work to do to enable gadgets to be resized, and to try to squeeze more into the limited size available for mashups with (e.g.) too much documentation (as if there were such a thing!)
For a mashup with a custom gadget, you can find the url of the gadget on the main page for that mashup. For instance, the built-in upgradeChecker service has a custom gadget which you can add to your dashboard by pasting the link http://localhost:7762/services/system/upgradeChecker/gadget.xml into the “Add Gadgets” page.
Creating a custom Google Gadget
Let’s create a sample custom gadget to show how easy it is to provide the cute and customized UI that is the hallmark of successful gadgets. You could take the gadget try-it code generated by the Mashup Server, save it as {mashupname}.resources/www/gadget.xml, and start to modify it, but it is usually easier to start from a smaller template.
I’m going to show you how to make a simple gadget for the built-in digit2image service, which when given a digit and a size, returns a URL to an appropriately licensed Flickr image of that digit.
Go to the mashup page https://localhost:7443/mashup.jsp?author=system&mashup=digit2image (if you’re logged on as an administrator, otherwise you’ll have to copy the digit2image mashup to your own user space and adjust the above url accordingly.) You’ll notice a “Source code template for building a Google Gadget for this service" link. You can download the template and edit it with a text editor, saving it as digit2image.resources/www/gadget.xml. But today I’ll demonstrate the built-in editor instead.
So, click on the “Edit this Mashup” link in the task pane, and select the tab for “Gadget UI code.” The editor will indicate that there isn’t a custom gadget yet, but by clicking the “Generate Template” button farther down the page, we can get the basic gadget definition written for us automatically.
The template even creates source code examples for an asynchronous invocation of each operation, so it’s easy to just tweak the code a bit and do new things: (some comments removed)
// Demonstrates calling an operation of the 'digit2image' Mashup
function init() {
    // Set up a callback and an error handler for the digit2image operation.
    digit2image.digit2image.callback = showPayload;
    digit2image.digit2image.onError = handleError;
    digit2image.digit2image(/* (string) digit */ "0", /* (string) size */ "small");
}
// Sample invocations (unused) for other operations.
function samples() {
}
// Displays the result in the "result-console" part of the page
function showPayload(payload) {
    log("result-console", payload);
}
This automatically generated sample code asynchronously invokes the first operation in the service with some sample values, and includes a “samples” function to show how other operations can be invoked. The results are serialized into the “result-console” part of the gadget page. Some services won’t work out of the box because of additional constraints on the inputs of an operation (e.g. another operation must be invoked first, or a parameter with a declared type such as string must also be a valid username, and so forth). The first thing we usually do is clean up the operations we don’t want, order the ones we do, and provide meaningful input values.
In this case, let’s use the values “9” and “thumbnail”, and display the resulting url as an image tag. Replace the above code (all three functions), with this:
// Display an image for the digit "9"
function init() {
digit2image.digit2image.callback = function (payload) {
var output = document.getElementById("display");
output.innerHTML = "<img src='" + payload + "'>";
};
digit2image.digit2image.onError = handleError;
digit2image.digit2image("9", "thumbnail");
}
And let’s simplify the body of the mashup to just the image display and a place for errors to be displayed:
<div id="body">
<div id="display">
<!-- This div will contain the resulting image -->
</div>
<div id="error-console">
<!-- This div will contain a description of any errors encountered. -->
</div>
</div>
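The callback shown earlier builds the image tag by plain string concatenation, which is fine for a demo. If the returned URL could ever contain quotes or angle brackets, a small escaping helper keeps the markup intact. The sketch below is only an illustration: escapeAttr and imageTag are hypothetical helpers, not part of the generated template, and the payload is assumed to convert cleanly to a string.

```javascript
// Escape the characters that would break out of a double-quoted attribute.
function escapeAttr(value) {
    return String(value)
        .replace(/&/g, "&amp;")
        .replace(/"/g, "&quot;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

// Build the markup that the callback assigns to the "display" div.
function imageTag(url) {
    return '<img src="' + escapeAttr(url) + '">';
}
```

Inside the callback, the assignment would then read output.innerHTML = imageTag(payload);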
When you save the gadget and return to the mashup page, you will see a new link "View the custom Google Gadget for this service." Copy this link and use the “Add Gadget” link on the Dashboard as I described earlier, and you will see the simple custom gadget as something like this:
There are naturally many improvements that need to be made to the Dashboard, and to our ability to power gadgets, widgets and portlets. But I hope you find our experimental support intriguing and give us some feedback on the Mashup Server User Forum, just as Mihoko did!
Maybe it’s just me and my Vista, or my version of the Java Runtime (1.5.0_10) but anytime I get an iTunes upgrade (such as the recent version 8 which I just installed), the WSO2 Mashup Server fails to launch. Hate that! Now I don’t know much about Java and environment variables, but here’s what I do to solve this problem:
- Right click on "Computer" and get properties, then select "Advanced System Settings…". Or, open up the Control Panel and search for "environment variables" and select the "Edit the system environment variables" task.
- Click on "Environment Variables."
- Double click on the "CLASSPATH" variable and find the segment (separated by ‘;’) that contains QTJava. Delete this path segment and close all the windows.
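If you would rather not edit the value by eye, the same cleanup can be expressed as a tiny string function. This is just a sketch of the edit described in the steps above. The function name is made up, and it assumes the Windows convention of ‘;’ between CLASSPATH segments. It does not read or write the environment itself.

```javascript
// Drop every segment of a Windows CLASSPATH that mentions the given text
// (case-insensitive). Empty segments are dropped as well.
function removeClasspathSegments(classpath, needle) {
    var lowered = needle.toLowerCase();
    return classpath.split(";").filter(function (segment) {
        return segment !== "" && segment.toLowerCase().indexOf(lowered) === -1;
    }).join(";");
}
```

Paste the current value in as the first argument with "QTJava" as the second, and copy the result back into the Environment Variables dialog.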
The Mashup Server should now launch fine. Now back to work…
Keith has created a nice REST demo to show how WSDL 2.0 can be used to describe a RESTful interaction, and posted the resulting mashup here. This is in response to an old post he found from Stefan Tilkov.
One of the areas Stefan explores is the difference between the conceptual models of "operations" versus "resources":
It seems to me that the right thing would be to get rid of the operations (or map them to the HTTP verbs, which is essentially the same thing as getting rid of them).
Keith indeed mapped each combination of "verb", "uri template" into a WSDL operation. So what Stefan describes as "GET on /customer/{id} - get customer details" Keith maps to a "getCustomerDetails" operation which takes an "id" parameter. I think this is a very reasonable mapping, and one that looks a lot like the "getCustomerDetails(id)" construct which is present in some form in every programming language.
I don’t call this "getting rid of the operations" either, if by that is meant writing a WSDL that has only four operations (GET, PUT, POST, DELETE). I would instead say that an operation encapsulates the combination of the http method, the http location uri pattern or template, and the input and output types into a construct that maps well to familiar programming constructs and provides a level of abstraction that can prove valuable (e.g. the uri location can change, security can be applied, even the transport protocols can change without perturbing the development experience).
Having a WSDL 2.0 description of the service in terms of operations also had the beneficial effect of documenting which combinations of verb/uri template are supported, with the effect of also documenting which combinations aren’t supported. Stefan had to notate these as "unused" in his diagram, important because some of the combinations aren’t obvious (why can’t a customer be deleted?)
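To make the idea concrete, here is a rough sketch of an operation table that pairs each operation name with its HTTP method and URI template. It is not Keith's mashup or anything generated from WSDL 2.0. Only getCustomerDetails and /customer/{id} come from the discussion above, while updateCustomer and the buildRequest helper are hypothetical.

```javascript
// Each operation encapsulates one (verb, uri template) combination.
var operations = {
    getCustomerDetails: { method: "GET", template: "/customer/{id}" },
    // Hypothetical second operation, for illustration only.
    updateCustomer: { method: "PUT", template: "/customer/{id}" }
};

// Turn an operation call into the HTTP request it stands for.
function buildRequest(name, params) {
    if (!operations.hasOwnProperty(name)) {
        // Anything not listed is, by definition, an unsupported combination.
        throw new Error("Unsupported operation: " + name);
    }
    var op = operations[name];
    var uri = op.template.replace(/\{(\w+)\}/g, function (match, key) {
        if (!params || !params.hasOwnProperty(key)) {
            throw new Error("Missing parameter: " + key);
        }
        return encodeURIComponent(params[key]);
    });
    return { method: op.method, uri: uri };
}
```

The caller writes buildRequest("getCustomerDetails", { id: "42" }) and never touches the verb or the path, so either can change in the table without disturbing calling code. The absence of, say, a deleteCustomer entry documents that the combination isn't supported.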
It still baffles me why there isn’t more demand for WSDL 2.0 and its REST description features. Hopefully Keith’s post helps demonstrate the value of this technology.
Maybe it’s just my long history with the technology, but I often find XSLT a convenient technology for manipulating XML. When I scrape a Web page with the WSO2 Mashup Server, I generally use XSLT to extract and manipulate the values (e.g. my National Geographic Picture of the Day Feed mashup which I described here). I find it performs better than extracting values individually through an XPath or RegExp filter or through E4X itself.
But what about when you’re not scraping a page? While E4X does most XML manipulation tasks pretty well, it doesn’t provide some higher-level functions such as sorting. The last thing I want to do is implement a sorting algorithm in Javascript, when XSLT already does the task very well.
The Mashup Server doesn’t have direct access to an XSLT processor, but you can use the Scraper object to execute a transformation even without performing a scrape.
Here’s a simple function that you can add to a mashup to support transformations inside your mashup.
transform.visible = false;
function transform(source, stylesheet) {
    var config =
        <config>
            <var-def name='response'>
                <xslt>
                    <xml>
                        <template>{source.toXMLString()}</template>
                    </xml>
                    <stylesheet>
                        <template>{stylesheet.toXMLString()}</template>
                    </stylesheet>
                </xslt>
            </var-def>
        </config>;

    var scraper = new Scraper(config);
    var result = scraper.response;

    // strip off xml declaration and any PIs, E4X can't parse them
    while (result.indexOf("<?") == 0)
        result = result.substring(result.indexOf("?>") + 2);

    return new XML(result);
}
Usage is simple:
var xslt =
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        …
    </xsl:stylesheet>;

var xml = <source/>;
var result = transform(xml, xslt);
Note that the function strips off any XML declarations or leading processing instructions, but you can (and should) also include <xsl:output method="xml" omit-xml-declaration="yes"/> inside your XSLT to make that extra cleanup step unnecessary. Enjoy!
I finally got around to putting a few finishing touches on my Blog Constellations mashup. It provides a visualization of a set of feeds, providing an intuitive sense of activity in a set of blogs, including frequency and size of posts, cross-links between the feeds, and highlighting links to "domains of interest." This information proves useful in providing a feedback loop on the "quality" of blogging in support of the promotion of specific web sites, something WSO2 relies on as an open source company.
Some notes about the design of the mashup:
- The blog analysis is performed by the mashup service (try it here), which has the following capabilities:
- Subscription operations: add a feed to the analysis (trackBlog), remove a feed from the analysis (unTrackBlog), and list the feeds being analyzed (showTrackedBlogs).
- Groups: Feeds are grouped under usernames so multiple groups of feeds are supported. Groups can also be listed (listGroups) or removed (removeGroup).
- Passwords: When you create a group with a password, adding or removing items from that group, or deleting the group entirely, requires that password. Groups without password protection are also supported.
- A utility method (fetchBlog) fetches a blog and analyses it in terms of size and the links it contains.
- The analysis for a group of blogs can be easily obtained (getActivity).
- The graphics are created entirely in the browser using the <canvas> tag and the excanvas.js library for IE. Works pretty well although the graphic items don’t stay dynamic (and therefore can’t easily be used as links.)
- The code contains some interesting performance techniques that may be useful for other mashup authors:
- Reuses the feedCache mashup service to cache feeds for faster response.
- Does some local caching of group analysis as well for faster response.
- The mashup calls itself (getActivity calls fetchBlog) asynchronously to enable feeds in a group to be fetched in parallel.
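The local caching of group analysis can be pictured as a small time-limited cache in front of the expensive computation. The sketch below shows the general technique only and is not taken from the Blog Constellations source. The createCache helper, its ttlMs parameter, and the injectable clock are assumptions for illustration.

```javascript
// A tiny time-limited cache: compute once, then reuse until the entry expires.
function createCache(ttlMs, now) {
    var entries = Object.create(null);
    now = now || function () { return Date.now(); };
    return {
        get: function (key, compute) {
            var hit = entries[key];
            if (hit && now() - hit.storedAt < ttlMs) {
                return hit.value; // still fresh: skip the expensive work
            }
            var value = compute(key);
            entries[key] = { value: value, storedAt: now() };
            return value;
        }
    };
}
```

An operation like getActivity could wrap its per-group analysis in cache.get(groupName, analyzeGroup), where analyzeGroup is a stand-in for whatever does the real work. Repeated requests for the same visualization within the time limit then return immediately.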
The mashup is of course hosted on mooshup.com, our online version of the WSO2 Mashup Server. This means you can try it out online or download it and run it locally on your Mashup Server, or use the service with your own front end or whatever. The mashup page has all these links, metadata, script libraries, and everything you need to reuse this service or its source code.
Want to make some improvements? How about these:
- True multi-user capability - each user has their own groups, and can keep those groups private or make them public.
- A mashup that finds a blogroll and publishes it into this mashup as a new group.
- Adding animation (e.g. smooth zooming and spinning) and linking to the graphic (hover over a dot to see a preview of the blog?)
- URLs for groups including options (so you can bookmark a particular visualization.)
Enjoy!
Keith upgraded mooshup.com to the 1.5.1 release. I used to find it pretty sluggish, but no longer! Try it out…