I was just doing a little research to answer a question about XML Base, a spec which I edited (and of whose minimal length I remain proud!). There is a current controversy about XInclude adding xml:base attributes whenever an inclusion is done. If your schema doesn’t allow those attributes to appear, your document won’t validate. This surprises some people, since the invalid attributes were added by a previous step in the processing chain (in this case XInclude), rather than by hand. As if that makes a difference to the validator!
Norm Walsh, after a false start, correctly points out that this behavior was intentional. But he doesn’t take the next step and say that this behavior is vital! The reason xml:base attributes are inserted is to keep references and links from breaking. If the included content has a relative URI, and the xml:base attribute is omitted, the link will be resolved against the including document’s base URI instead of the included document’s. It will then fail to resolve or, worse, resolve to the wrong thing. Can you say "security hole"?
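To make the breakage concrete, here is a minimal sketch in Python of how the same relative reference resolves with and without the xml:base attribute. The URIs and the `resolve` helper are made up for illustration, and the helper only models a single level of xml:base (it is not an XInclude or XML Base implementation); the resolution itself is ordinary relative-URI resolution as done by `urllib.parse.urljoin`.

```python
from urllib.parse import urljoin

# Hypothetical URIs, chosen only for illustration.
INCLUDING_DOC = "http://example.com/book/main.xml"
INCLUDED_DOC = "http://example.org/chapters/ch1.xml"


def resolve(href, document_uri, xml_base=None):
    """Resolve href against the base URI in effect.

    If an xml:base value is present, it is itself resolved against the
    document URI and then becomes the base for href. Otherwise the
    document URI is the base. Only one level of xml:base is modeled.
    """
    base = urljoin(document_uri, xml_base) if xml_base else document_uri
    return urljoin(base, href)


# A relative link that appeared inside the included chapter.
href = "images/fig1.png"

# XInclude added xml:base pointing at the included document: the link
# still resolves relative to where the chapter actually lives.
with_base = resolve(href, INCLUDING_DOC, xml_base=INCLUDED_DOC)

# xml:base omitted: the link silently resolves relative to the
# including document, which may be a different resource entirely.
without_base = resolve(href, INCLUDING_DOC)

print(with_base)
print(without_base)
```

In this sketch the second result points at a different host altogether, which is exactly the "resolve to the wrong thing" case.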
Sure, it’s inconvenient to fail validation when xml:base attributes are added, especially when there are no relative URIs in the included content (and thus the xml:base attributes are unnecessary). But hey, if you wanted people or processes to add attributes to your content, you should have allowed them in the schema! Validation against a closed content model is the first line of defense for many applications accepting XML content, and a backdoor way to add attributes (even ones dear to my heart like xml:base) is simply inconsistent with this.
For these reasons, I urge you first to allow sufficient extensibility in your content model, and second to consider carefully what you’re doing before using mechanisms to circumvent normal validation rules.