Federating identity

In last week's column, I suggested that individuals and corporations should be the authoritative sources of basic information about themselves. That way, if an application needs my name, address, and phone number, I can refer it to a source that I control and guarantee to be correct. But how many applications really need my name, address, and phone number? Capturing the identity of individuals, along with personal information about them, has become a habit. In a climate of increasing concern about privacy, it's a bad habit we must learn to resist.

Consider a transaction at a liquor store. To prove your age, you present your driver's license -- the all-purpose credential. The card displays two items the clerk requires: your picture (biometric proof of identity) and your birth date (proof of age). It also displays facts that the clerk doesn't need to know: your name and address. A printed card can't selectively disclose only the required facts. But an electronic identity token can.
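To make the idea concrete, here is a minimal sketch of selective disclosure in Python. It is an illustration only, not any real token format: the `IdentityRecord` class, the `disclose` function, the `over_21` claim name, and the sample data are all assumptions invented for this example. The point is that the holder answers the clerk's actual question (old enough?) without handing over the birth date, name, or address.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class IdentityRecord:
    """Hypothetical record held by the identity owner, never shown whole."""
    name: str
    address: str
    birth_date: date


def age_on(birth_date: date, today: date) -> int:
    """Whole years elapsed between birth_date and today."""
    years = today.year - birth_date.year
    # Subtract a year if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def disclose(record: IdentityRecord, requested: list, today: date) -> dict:
    """Return only the requested claims; everything else stays with the holder."""
    available = {
        # A derived yes/no fact, so even the birth date is not revealed.
        "over_21": lambda: age_on(record.birth_date, today) >= 21,
        "name": lambda: record.name,
        "address": lambda: record.address,
    }
    return {claim: available[claim]() for claim in requested if claim in available}


# Made-up sample data for illustration.
record = IdentityRecord("Pat Example", "1 Main St", date(1980, 5, 17))
claims = disclose(record, ["over_21"], today=date(2004, 6, 1))
print(claims)  # {'over_21': True}
```

A real token would also need a signature from the issuing authority so the clerk could trust the answer; that part is deliberately left out of this sketch.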

Last week, at a PKI summit hosted by Dartmouth College, I heard quite a lot about Shibboleth, an approach to federation of identity that's rooted in the idea of selective disclosure. Little-known in the commercial world, Shibboleth -- a project of the Internet2 consortium's Middleware Architecture Committee for Education -- is gaining real traction in the realm of higher education. The most widely publicized deployment enables students at Penn State University to use their home credentials to log in to Napster.

Here's the drill. A student requesting access at Napster (the "target") declares an affiliation with Penn State. Napster redirects to a Penn State server (the "origin") that locally authenticates the student. (An origin server can use any means of authentication available to it: LDAP, certificate, hardware token, and so on.) Then the university redirects back to Napster with an opaque handle that pseudonymously identifies the student. Now that Napster knows it's dealing with a Penn State student, it asks the university for the one additional fact it requires: Is the person associated with that handle permitted by the university to use Napster? Although it's probably unnecessary in this case, the protocol can also let the user review, and perhaps deny, the release of the attributes the target server needs to make its authorization decision.
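The exchange described above can be sketched in a few lines of Python. This is a toy, in-memory model under stated assumptions, not the Shibboleth protocol itself: real deployments exchange SAML messages over HTTP redirects, whereas the `Origin` and `Target` classes, the password check, the `music_service_entitled` attribute, and the sample user here are all hypothetical names made up for illustration.

```python
import secrets


class Origin:
    """Hypothetical stand-in for the home institution's server."""

    def __init__(self, users: dict, attributes: dict):
        self._users = users            # username -> password (toy local authentication)
        self._attributes = attributes  # username -> {attribute name: value}
        self._handles = {}             # opaque handle -> username, known only to the origin

    def authenticate(self, username: str, password: str) -> str:
        """Authenticate locally, then issue an opaque handle for the target."""
        if self._users.get(username) != password:
            raise PermissionError("local authentication failed")
        handle = secrets.token_urlsafe(16)  # random, so it reveals nothing about the user
        self._handles[handle] = username
        return handle

    def query_attribute(self, handle: str, attribute: str, allowed_by_user: bool = True):
        """Release one attribute for a handle, unless the user denied its release."""
        username = self._handles.get(handle)
        if username is None or not allowed_by_user:
            return None
        return self._attributes.get(username, {}).get(attribute)


class Target:
    """Hypothetical stand-in for the remote service."""

    def __init__(self, origin: Origin, required_attribute: str):
        self.origin = origin
        self.required_attribute = required_attribute

    def grant_access(self, handle: str) -> bool:
        """Decide on the one fact the service needs; no name is ever seen."""
        return self.origin.query_attribute(handle, self.required_attribute) is True


# Made-up sample data for illustration.
origin = Origin(
    users={"jdoe": "s3cret"},
    attributes={"jdoe": {"music_service_entitled": True}},
)
target = Target(origin, "music_service_entitled")

handle = origin.authenticate("jdoe", "s3cret")  # step 1: origin authenticates locally
print(target.grant_access(handle))              # step 2: target asks about the handle
```

Note that the target only ever holds the handle and a yes-or-no answer. The mapping from handle to person stays at the origin, which is what keeps the service from accumulating tracking data tied to a name.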

Shibboleth's protocols overlap in many respects with those of the Liberty Alliance Project. Both use SAML (Security Assertion Markup Language), albeit in slightly different ways. Both can use opaque handles to prevent the accumulation of tracking data linked directly to a name. And in the latest round of specs, Liberty moves closer to the Shibboleth philosophy that users should control the release of their attributes.

How do they differ, then? The Shibboleth model is simpler because access to resources is granted by institutional role, not by individual identity. A user need not have an account on the target service. Liberty, in contrast, assumes that two (or more) services do require individual accounts and provides a way to link those accounts.
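The contrast between the two models can be shown side by side. The sketch below is a loose illustration under my own assumptions, not code from either specification: `authorize_by_role`, `LinkedAccounts`, the role names, and the provider and account identifiers are all hypothetical.

```python
def authorize_by_role(asserted_roles, permitted_roles) -> bool:
    """Role-based model: access depends only on a role vouched for by the
    remote authority. The service keeps no per-user account at all."""
    return bool(set(asserted_roles) & set(permitted_roles))


class LinkedAccounts:
    """Account-linking model: each service keeps its own individual accounts,
    and a shared pseudonym ties a remote identity to a local one."""

    def __init__(self):
        self._links = {}  # (provider, pseudonym) -> local account id

    def link(self, provider: str, pseudonym: str, local_account: str) -> None:
        self._links[(provider, pseudonym)] = local_account

    def resolve(self, provider: str, pseudonym: str):
        """Return the linked local account, or None if no link exists."""
        return self._links.get((provider, pseudonym))


# Role-based: nothing is stored about the individual.
print(authorize_by_role(["student"], ["student", "faculty"]))  # True

# Account linking: the service must hold state for each individual.
links = LinkedAccounts()
links.link("airline.example", "x7f3q", "hotel-account-1042")
print(links.resolve("airline.example", "x7f3q"))  # hotel-account-1042
```

The first function needs no storage, which is why the role-based model is the simpler of the two. The second necessarily accumulates a record for every linked user.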

It's true that we often need to establish individual identity. But beyond cross-university resource sharing, there are plenty of cases where role-based access, certified by a remote authority, will suffice. Look for them, because the best way to sidestep liability for collecting too much information is to avoid capturing it in the first place.

Jon Udell is lead analyst at the InfoWorld Test Center.