Permanent URLs, Addresses and Names

I found a link to an article by Taylor Cowan about persistent URLs on the web. It was mostly about what happens to metadata assertions (such as RDF statements) when links break, but there was a little something on persistent links and URNs, too. The article compared this with Amazon.com and how books are referenced these days, and described a way to map an ISBN to a URN (URN:ISBN:0-395-36341-1 was mapped to a location by the PURL service, in this case http://purl.org/urn/isbn/0-395-36341-1), which is quite cool and, in my opinion, both manageable and practical.
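To make the mapping concrete, here is a minimal sketch in Python; the path layout under purl.org is an assumption based on the single example URL above, so treat it as an illustration rather than a description of how the PURL service actually works.

    # A minimal sketch of the ISBN-to-URN-to-PURL mapping described above.
    # The purl.org path layout is assumed from the one example URL; the
    # actual PURL service may organise or resolve things differently.

    def isbn_to_urn(isbn: str) -> str:
        """Express an ISBN as a URN, e.g. 'urn:isbn:0-395-36341-1'."""
        return f"urn:isbn:{isbn}"

    def urn_to_purl(urn: str) -> str:
        """Map a urn:isbn:* name to a location under the assumed PURL path."""
        scheme, namespace, value = urn.split(":", 2)
        assert scheme == "urn" and namespace == "isbn"
        return f"http://purl.org/urn/isbn/{value}"

    print(urn_to_purl(isbn_to_urn("0-395-36341-1")))
    # -> http://purl.org/urn/isbn/0-395-36341-1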

The author thought otherwise, however: “But on the practical web, we don’t use PURLs or URNs for books, we use the Amazon.com url. I think in practical terms things are going to be represented on the web by the domain that has the best collection with the best open content.”

Now, what’s wrong with this? At first, it may seem reasonable that Amazon.com, probably the domain with the largest collection of book titles, authors, and so on, should be used. Books are their business and they depend on offering as many titles as possible. In the everyday world, if you want to find a book, you look it up at Amazon.com. I do it, you do it, and the author does it. So what’s wrong with it?

Well, Amazon.com does not provide persistent content per se; they provide a commercial service funded by whatever books they sell. At any time, they may decide to change the availability of a title, relocate its page, offer a later version of the same title, or even some other title altogether. The latter is unlikely, of course, but since we are talking about URLs (addresses) rather than URNs (names), pointing at the URL when discussing what is essentially a name is about as relevant as pointing at the worn bookshelf in my study when discussing the Chicago Manual of Style.

Yes, I realise that my example is a bit extreme, and I realise that it’s easy enough to make the necessary assertions in RDF to properly reference something described by the address rather than the address itself, but to me, this highlights several key issues:

  • An address, by its very nature, is not persistent. Therefore, a “permanent URL” is, to me, a bit of an oxymoron: a contradiction in terms.
  • Even if we accept a “permanent URL” approach, should we accept that the addresses are provided and controlled by a commercial entity? One of the reasons why some of us advocate XML so vigorously is that it is open and owned by no-one. Yes, I know perfectly well that we always rely on commercial vendors for everything from editors to databases, but my point here is that we still own our data; the commercial vendors don’t. I can take my data elsewhere.
  • Now, of course, in the world of metadata it’s sensible to give a “see-also” link (indeed that is what Mr Cowan suggests), but the problem is that the “see-also” link is another URL with the same implicit problems as the primary URL.
  • URLs have a hard time addressing (yes, the pun is mostly intentional) the problem of versioning a document. How many times have you looked up a book at Amazon.com and found either the wrong version or a list of several versions, some of which even list the wrong book?

Of course, I’m as guilty as anyone because I do that, too. I point to exciting new books using a link to Amazon.com (actually I order my books from The Book Depository, mostly) because it’s convenient. But if we discuss the principle rather than what we all do, it’s (in my opinion) wrong to suggest that the practice is the best way to solve a problem that stems from addressing rather than naming. It’s not a solution, it merely highlights the problem.

I Want A Nokia N900

I’ve been waiting to get my hands on a Nokia N900 smartphone for a couple of months now. Nokia released it in November or December (depending on who you choose to believe), and here in Sweden in January, but the phones have been in very short supply. I’ve been asking around, but so far there’s been no sign of the N900 anywhere I shop. The other week I finally placed an order at The PhoneHouse. I was told that there were currently six (6) phones available for 114 stores, but that I could expect it in a week and a half or so. And if I didn’t want it, the guy said he could sell it anyway…

The phone itself is a nerd’s wet dream. It runs Maemo, a Debian GNU/Linux-based distro (yes, it can run Debian apps, even though the screen might be ill-suited for some of them), and is actually more of a computer with a built-in mobile phone than the other way around. People have successfully managed to get OpenOffice to run on it, so I’m thinking that I can probably make some kind of XML editor work on it.

A fellow XML’er in the UK has had the phone for months now and doesn’t miss a chance to tell the world about it on Twitter. I’m jealous and I want one. Now.

Footnotes

Those familiar with my old schemas and DTDs will now probably raise an eyebrow, but I have finally succumbed to the lure of footnotes in the inline content model of my all-purpose personal DTD.

What finally convinced me was my need to create multiple references to a single note that, while interrupting the text flow and thus unwelcome in the text itself, was too short to place in a section of its own. There was no logical way to semantically identify that note in a form or in a place that would allow me to reference it from several different points in my text. Footnotes (and footnote references) solve that problem very neatly, and they allow me to present my footnotes as end notes using a different stylesheet.
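For what it’s worth, the pattern looks something like the sketch below. The element and attribute names (footnote, footnoteref, id, idref) are illustrative stand-ins only; I’m not reproducing the actual DTD here, but the sketch shows how several references can point at one note and how the notes can then be gathered up as end notes.

    # A rough sketch of the footnote pattern: several references point at a
    # single note via an idref, and the notes can be pulled out afterwards
    # and presented as end notes. The element and attribute names used here
    # (footnote, footnoteref, id, idref) are illustrative stand-ins only.
    import xml.etree.ElementTree as ET

    doc = ET.fromstring("""
    <section>
      <p>First mention<footnoteref idref="fn1"/> of the point.</p>
      <p>Second mention<footnoteref idref="fn1"/> of the same point.</p>
      <footnote id="fn1">A short note, too brief for a section of its own.</footnote>
    </section>
    """)

    # Collect each note once, then list the references and the end notes.
    notes = {fn.get("id"): fn.text for fn in doc.iter("footnote")}
    for number, ref in enumerate(doc.iter("footnoteref"), start=1):
        print(f"[{number}] -> {ref.get('idref')}")
    print("End notes:")
    for note_id, text in notes.items():
        print(f"  {note_id}: {text}")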

Coffee

Coffee, as you all know, is the lifeblood of any office. Well, our coffee machine is dead, and while I would have liked to say that it didn’t suffer, the trail of dried-up coffee along the floor suggests otherwise.

Expect a slow day here after the Easter holidays.

XML for the Long Haul

There will be a one-day symposium on the theme XML for the Long Haul right before the Balisage conference in Montréal this year. I’ve been thinking about this lately.

First of all, isn’t this what XML is about? The ability for information to survive the proprietary tools used to conserve it? The means to make that happen, regardless of what happens to your software? I’ve preached about this for a long time to my customers, listeners, and those who just couldn’t get away. My point was that if a disaster happened to your software, if it was somehow wiped out in spite of your best efforts, it would only take a few days to build something that could parse most of the information in an XML file. Maybe another few days to produce output from it, but provided that you could read the written language and the structure was designed by someone with at least a basic idea of what XML (and SGML; this isn’t new) is about, it wouldn’t take more than a few days at most to see what that lost information was about.
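To make that claim a little less hand-wavy, here is the kind of throwaway tool I have in mind; a minimal sketch that assumes nothing about the vocabulary beyond well-formedness.

    # A minimal sketch of a "disaster recovery" reader: given a leftover XML
    # instance and no schema, print an indented outline of element names and
    # their text so a human can start making sense of the lost information.
    import sys
    import xml.etree.ElementTree as ET

    def outline(element, depth=0):
        text = (element.text or "").strip()
        print("  " * depth + element.tag + (f": {text}" if text else ""))
        for child in element:
            outline(child, depth + 1)

    if __name__ == "__main__":
        outline(ET.parse(sys.argv[1]).getroot())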

Second, the points above pretty much summarise my views here, but I really mean it: this is what XML is about.

But is it really that simple? Is markup really that descriptive? Well, not always. There’s plenty of markup out there that is obscure and hard to read. For example, is a namespace going to make your leftover instances easier to read? Are your element type names descriptive? What about your attributes? Do you include comments or annotations with your schema? Do you include wrappers that group element types in a semantically meaningful way? Does each group include everything required for it to be complete? Have a look at one of your instances with fresh eyes and see if it makes sense. Does one type of information relate to another? How would you format this lost instance if you had just come across it? If a thousand years had passed and you could understand the language but not the culture, would you understand the meaning of the information? Could you print it and explain what went on back then?

Don’t laugh. Pretend that you really are viewing your structures from the outside. Pretend that you don’t have the schema at hand. Pretend that you don’t know the semantics, even though you can understand the contents. Pretend that you really are studying the information as an outsider. Does it all make sense?
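One cheap way to run that kind of check is to inventory the names an instance actually uses and then ask whether an outsider could guess what they mean. A small sketch, again assuming nothing about the vocabulary:

    # List every element type and attribute name an instance uses, so you can
    # judge the vocabulary the way an outsider would: no schema, no comments,
    # no tribal knowledge, just the names themselves.
    import sys
    from collections import Counter
    import xml.etree.ElementTree as ET

    root = ET.parse(sys.argv[1]).getroot()
    elements, attributes = Counter(), Counter()
    for el in root.iter():
        elements[el.tag] += 1
        for name in el.attrib:
            attributes[name] += 1

    print("Element types:", ", ".join(sorted(elements)))
    print("Attributes:   ", ", ".join(sorted(attributes)))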

I think this is a worthwhile reality check. I think that we all should ask this of the schemas we create, every time we do an information analysis. Are our schemas understandable? Are they legible?

I would really like to be in Montréal in August this year. I think it’s important.

Back from XML Prague

I’m back home from XML Prague. It’s been a fabulous weekend with many interesting talks and lots of good ideas, and I’m still trying to sort my impressions. So many things I want to try, so many technologies I want to learn. The feedback from my talk on Film Markup Language alone is enough to keep me busy for a few weeks.

More later, but for now, suffice to say that I’m already thinking of a subject for a presentation next year.

It’s Quite Possible to Lose Your Way in Prague

I drove to Prague for XML Prague, yesterday. I left Göteborg on Wednesday evening, taking the ferry to Kiel, and then spent most of Thursday on the Autobahn. It all went without a hitch; not that I’m that good but my GPS is. I would probably have ended up in Poland without it because I often miss the road signs when on my own. Some of my business trips before the GPS era were truly memorable.

So today I took a walk around central Prague, shopping for gifts and seeing the sights. And a wonderful city it is, one of my favourite cities in Europe. All that history, all that architecture, the bridges… and small, narrow streets that are never straight. They are practically organic (and probably feed off the gift shops, since those are everywhere), and it’s very difficult to find your way. It’s a labyrinth we are talking about.

Yes, I lost my way. The third time I came back to that innocent-looking Kodak shop (and there are a lot of shops with Kodak signs in central Prague, I might add), I knew I was in trouble. I was walking in circles, my feet aching while a particularly wet mixture of snow and rain poured down, and had no idea where I was. And I kept thinking about my GPS, safely tucked away back in my hotel room, remembering that I actually considered bringing it along for the walk but then shrugging, thinking “how hard can it be?”

I found shelter in a mall I hadn’t seen before (well, I think I hadn’t seen it before) and considered my next move while high-heeled ladies tried lipsticks and wondered what the out-of-place stranger was doing in the cosmetics department. I could ask someone, I suppose, some friendly local…

Then I remembered: I have a GPS in my mobile. It took a few minutes for it to find the satellites it required but after that, I only had to walk for a few more minutes to find a familiar landmark. In a counter-intuitive direction, I might add.

The wisdom in this story? Thank goodness for GPS devices. Oh, and XML Prague starts tomorrow morning.

Automating Cinemas at XML Prague

I’ve been busy writing the slides and some example XML documents for my presentation on Automating Cinemas Using XML at XML Prague in about a week and a half. I’m slightly biased, I know, but I think the presentation actually does make a good case for XML-based automation of cinemas. I know how primitive today’s automation is, in spite of the many technological advances, and I know where to improve it. The question I’m pondering right now is how to explain the key points to a bunch of XML people who’ve probably never seen a projection booth, and do it in twenty minutes.

The opposite holds true, of course, if I ever want to sell my ideas to theatre owners. They know enough about the technology (I hope) but how on earth will I be able to explain what XML is?

There’s still stuff to do (for one, it would be nice to finish the XSLT conversions required and be able to demonstrate those, live, at the conference), but the presentation itself is practically finished and the DTD and example documents are coming along nicely. I suppose I need to update the whitepaper accordingly and publish it here when I’m done.

See you at XML Prague!

Developing SGML DTDs: From Text To Model To Markup

Quite by accident, I discovered that Eve Maler and Jeanne El Andaloussi’s Developing SGML DTDs: From Text To Model To Markup is available online. I’m one of the people lucky enough to own a hard copy, but if you aren’t as fortunate, read it at http://www.xmlgrrl.com/publications/DSDTD/. It’s one of the best books ever written about information analysis, that (far too) little-used skill required to write a good DTD. In my ever-so-humble opinion, the book should be mandatory reading for anyone involved in a markup-related project of any kind; that’s how good it is.

(Yes, I know it was written before XML came out 12 years ago, but XML is SGML, really, and the book remains as useful today as it was when it came out in 1995.)