May 11 2010

Under the covers of OAuth 2.0 at Facebook

For the past three years, the Facebook Platform has been built on top of a session-based authentication system that many developers found complex. In order to make any API calls, developers have to understand the details of signature algorithms. It’s a common source of problems for new developers using the Facebook Platform. We have been searching for a way to make it simpler.

The OAuth community has faced similar issues. If a new developer just wants to start using the Twitter API, suddenly they have to understand things like HMAC-SHA1 and how to sign their base string. That is just overkill for a simple web API.

Because of this complexity, I am really excited about the next version of OAuth. OAuth 2.0 largely solves the performance and usability issues from OAuth 1.0a. It relies on SSL instead of signatures as the default way to interact with applications, which means you can play with it in your browser. While there are drawbacks to exclusive reliance on SSL, it is so simple to get started that it can’t be beat for most developers. OAuth 2.0 also splits out flows for different security and performance contexts (i.e., desktop apps vs. web apps vs. mobile apps). That means that when a developer starts coding, they can see exactly what they need and start running.

How did we get here? Last fall, a small group wrote a proposal called OAuth WRAP, which introduced the core innovations of what would become OAuth 2.0 - using SSL and multiple flows. After Bret Taylor implemented OAuth WRAP in Friendfeed, we realized that it lived up to the promise. In the time since, we have been working with Yahoo, Twitter, Google, Microsoft, and many community members within the IETF to produce a new draft.

At f8, Facebook shipped a new Graph API, which relies entirely on OAuth 2.0. In this post, I’ll go into detail about exactly what we shipped and our plans for the future.

The access token is king

The OAuth 2.0 spec is divided into two sections:

* First, you get an access token
* Second, you use the token to access protected resources.

Using a token is the easy part, so we’ll start there.

To use an access token, just append it to a protected resource URL with the oauth_token parameter. (To alleviate some developer confusion, Facebook also accepts access_token as the parameter name.)

All calls are required to go over HTTPS.
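As a concrete sketch, here is how such a URL could be built with Python's standard library. The helper name and the token value are made up for illustration; only the host and parameter name come from the post:

```python
from urllib.parse import urlencode

def graph_url(path, access_token=None):
    """Build a Graph API URL; token-bearing calls must go over HTTPS."""
    if access_token:
        return ("https://graph.facebook.com" + path + "?" +
                urlencode({"access_token": access_token}))
    # Without a token, only public fields are visible.
    return "http://graph.facebook.com" + path

print(graph_url("/lukeshepard"))
print(graph_url("/lukeshepard", access_token="abc123"))
```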

The new Graph API demonstrates how easy it is to get started. You can fetch data about any object on Facebook by hitting https://graph.facebook.com/. For instance, this is my profile without an access token - all you can see are a few public fields:

http://graph.facebook.com/lukeshepard

{
   "id": "2901279",
   "name": "Luke Shepard",
   "first_name": "Luke",
   "last_name": "Shepard",
   "link": "http://www.facebook.com/luke.shepard"
}

Now, if you append an access token (notice the protocol switches to https), you can see much more information about me, such as my employer:

https://graph.facebook.com/lukeshepard?access_token=....

{
   "id": "2901279",
   "name": "Luke Shepard",
   "first_name": "Luke",
   "last_name": "Shepard",
   "link": "http://www.facebook.com/luke.shepard",
   "work": [
      {
         "employer": {
            "id": 109719699048814,
            "name": "Facebook"
         },
         "location": {
            "id": 104022926303756,
            "name": "Palo Alto, California"
         },
         "position": {
            "id": 109981385691387,
            "name": "Software Development Engineer"
         },
         "start_date": "2007-05"
      },
      ....

The new Graph API only accepts OAuth tokens. All existing API calls are backwards compatible with our existing auth system, but they do accept an OAuth token as well. So the Graph API is a carrot to encourage OAuth adoption.

Getting an Access Token

It used to be that people just accessed services from their desktop or laptop computers, but in the past few years, that has changed. Now, people use laptops, all sorts of web browsers, mobile phones, devices connected to TVs, etc. One of the most common criticisms of OAuth 1.0a is that there is only one way to do things - OAuth 2.0 addresses that by allowing multiple types of flows to get a token.

In this initial launch, Facebook supports three ways of getting a token.

Web server flow

The web server flow is intended for use by server-side developers who don't really like JavaScript. The whole flow works by redirecting the user to the authorization server (Facebook) and back to your site. It is baked into the Facebook docs as the default auth flow (if you don't pass a "type" param, this flow is assumed).

You must pre-register a "Connect URL" with the domain and path of your site. When the user tries to authorize, Facebook checks the "redirect_uri" to make sure it begins with the URL registered for the given "client_id".

You can also pass an optional "display" parameter to customize the authorization page, primarily for mobile devices. Accepted values are "page" (the default), "popup", "wap", and "touch" (the last two for mobile sites).
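A rough sketch of the two halves of the flow in Python. The endpoint paths and parameter names here are my reading of the docs at the time, so treat them as illustrative rather than authoritative:

```python
from urllib.parse import urlencode

AUTHORIZE_URL = "https://graph.facebook.com/oauth/authorize"
TOKEN_URL = "https://graph.facebook.com/oauth/access_token"

def authorize_redirect(client_id, redirect_uri, display=None):
    """Step 1: send the browser here; Facebook redirects back with ?code=..."""
    params = {"client_id": client_id, "redirect_uri": redirect_uri}
    if display:
        params["display"] = display  # "page", "popup", "wap", or "touch"
    return AUTHORIZE_URL + "?" + urlencode(params)

def token_request_url(client_id, client_secret, redirect_uri, code):
    """Step 2: your server exchanges the code for an access token over HTTPS."""
    return TOKEN_URL + "?" + urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "redirect_uri": redirect_uri,  # must match the value used in step 1
        "code": code,
    })
```

Note that the client secret only ever appears in the server-to-server call in step 2; it never touches the browser.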

The Web Server flow.


User-agent flow

While the code for web applications typically lives on the web server, far away from the user, sometimes it lives on the user’s machine. The main examples are desktop apps (Tweetdeck) and JavaScript-based applications (StreamDiff). Because the code actually runs on the client device, it can’t really rely on embedded secret keys for security - in JavaScript, anyone can look at the source and trivially extract the secret. So, we need something else.

The user agent flow is designed for applications that cannot embed a secret key. The access token is returned directly in the redirect response instead of requiring an extra server call. Security is handled in two ways:
* Facebook makes sure that the access token is not sent to a random webserver by validating the redirect_uri matches a pre-registered URL.
* The access token never goes across the wire in the clear. Even if redirect_uri is an HTTP url, the token itself is returned after the fragment (#) and so the browser will never send it to the server.

Facebook encourages the User Agent flow for use in desktop applications. We plan to incorporate it into the JavaScript client library.
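To illustrate why the fragment trick works, here is a small Python sketch (the callback URL and token are made up) that parses a token out of the fragment, the way a client-side library would:

```python
from urllib.parse import urlparse, parse_qs

def token_from_fragment(redirect_url):
    """Pull the access token out of the URL fragment.

    Because the token rides after the '#', the browser never sends it to
    the web server - even when redirect_uri is a plain HTTP URL.
    """
    fragment = urlparse(redirect_url).fragment
    tokens = parse_qs(fragment).get("access_token", [])
    return tokens[0] if tokens else None

url = "http://example.com/callback#access_token=abc123&expires_in=3600"
print(token_from_fragment(url))  # abc123
```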

The User Agent Flow


Client Credentials flow

This is perhaps the simplest flow - just exchange your client_id and secret for an access token, no user involved. We support this for accessing application-only resources. In particular, it’s required to use our new subscriptions API, modify developer settings, etc.
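A minimal sketch of that exchange; the "type=client_cred" flow selector is my recollection of the draft-era convention, so take the parameter name as an assumption:

```python
from urllib.parse import urlencode

def client_credentials_url(client_id, client_secret):
    """No user in the loop: trade app credentials for an app-level token."""
    return "https://graph.facebook.com/oauth/access_token?" + urlencode({
        "type": "client_cred",       # assumed flow selector from the draft
        "client_id": client_id,
        "client_secret": client_secret,
    })
```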

Session Exchange flow

Finally, for backwards compatibility, there is an endpoint where developers can exchange existing session keys for access tokens. We considered using the “Assertion Flow” for this, but we didn’t because a) it uses a bunch of parameters and is optimized for SAML tokens, and b) this is just a migration strategy and not really core to the spec.

Open Areas

The OAuth 2.0 spec is in a draft state, which means it’s still subject to change - especially around the edges. Since it’s much easier to add a feature later than to remove or change one, we have tried to implement the stable core while postponing controversial or unstable features. In particular, the following areas are really interesting:

Identity. As David Recordon posted, we would love to “get OAuth 2.0 to the point – fairly quickly – where we can start to architect the next version of OpenID on top of it.” Since Facebook provides both identity as well as authorization, it’s critical that we get this piece right in order to have a complete solution.

Signatures. Most of the benefit of OAuth 2.0 is its lack of signatures - however, there are some use cases where it is difficult or nonperformant to make SSL requests, and in those situations we want to use signatures. Signatures are also required for identity - the server needs to be able to verify that a given message came from Facebook without making another HTTP request. We will eventually add support for signatures.

Immediate mode. In order to do single sign in with JavaScript, we need the ability to check if the user is currently logged in with an iframe. Today, Facebook uses an endpoint called login_status.php to handle this, but we plan to use the immediate mode in the future.

Device flow. I am intrigued by the Netflix-style device flow, and I think it’s important for someone to implement it and try it out in the wild.

Refresh tokens. Most of the access tokens issued by Facebook are short-lived - they last for an hour or at most a day. In the OAuth 2.0 spec, the official way to handle long lived tokens is by issuing a “refresh token”, which is then exchanged repeatedly for new, short term access tokens. In the Facebook API, the developer just asks for the “offline_access” extended permission, and then their access token just lasts forever (or until the user revokes it). We may look into using refresh tokens in the future.

Error formats. The error messages are currently JSON encoded, but they should be form-encoded - we are planning to fix that.

Client state parameter. The “state” parameter was removed and then added back in, but we don’t support it. I believe that client state should be tracked in the redirect_uri, as it offers more protection.

Display parameter. We use a “display” parameter to support mobile flows, even though the spec does not officially include it (yet). We hope to see it included in a future draft.

What comes next?

The OAuth 2.0 draft is still a work in progress. As we work through the open issues, the spec will evolve. I’m looking forward to other companies shipping their own endpoints, and eventually building OAuth 2.0 into the fundamental fabric of how the Internet works. Please leave a comment or join the mailing list to get involved!


Jun 9 2009

Geeks in Vegas? Woot!

Last week, Facebook sponsored the TopCoder world finals in Las Vegas. I was lucky enough to go and hang out for a few days with some of the world’s best coders, and it was a fantastic experience. Sadly, I didn’t bring a camera, so I’m linking to these pics from the Flickr photostream.

TopCoder is the world’s premier online coding competition. Before this trip, I was of course familiar with the algorithm competitions, but I didn’t know that they have a wide variety of events, including lightning bug fixing and grueling eight-hour marathon matches. For the more right-brained geeks out there, they have also branched out to a design competition in which Photoshop wizards make pretty-looking mocks.

For the contest, the competitors sat up on a raised platform like the one below. Their screens were mirrored on a bank of monitors off to the side, so spectators could cluster around and watch them.

I was surprised by how much strategy went into it. For example, there is a challenge round, where your competitors can look at your code and try to find bugs. If they see one, they can suggest an input that would expose the bug - if they are right, then the author loses the round and they get points. But if they are wrong, then the challenger loses points. Some folks try to put extra whitespace or obfuscate their code to lure potential bug finders in, and then they catch the edge case somewhere else just to nail them. It’s exactly the opposite of the purely collaborative environment in which I usually code.

Facebook did some recruiting:

Geek schwag, of course


Most of the top folks were not from the United States. Many were from Argentina (including another Lucas, with whom I practiced my Spanish), Russia, Ukraine, and China. Makes me wonder what the spooks were doing there:

If you're going to collect people's info, wouldn't you rather do it at Facebook where it's less creepy?

An ongoing obsession among many of the engineers was the economic game The Settlers of Catan; as far as I could tell, it was being played constantly throughout the competition. Jack and I played a game with jdmetz and others. While I was strategizing in general terms like “Wood is now expensive”, I could tell that some of the math wizards at the table were calculating precise economic values for the various commodities and counting cards to keep track of what everyone had in their hands.

Reminds me of my own high school days playing Risk. Of course, who’s to say I’m not still playing?

Anyway, on the second night I gave a talk about some of the technical details of Facebook Connect. Some of the more technically inclined may find it interesting.

Verdict: Topcoder is some hot stuff. Even worth missing the hackathon for.


May 22 2009

Logout: the other half of the identity equation

This week, Facebook began accepting OpenID for single sign on. At the Internet Identity Workshop, many people raised a lot of questions about Facebook’s implementation, and in general the relationship between single sign in and sign out. In this post, I’ll argue that sign in is only half the battle; if we want OpenID to represent a complete identity solution, then it needs to support sign out just as robustly as it does sign in.

First, let me define what I mean by “automatic single sign in”. Here’s how it works:

  1. The user first establishes a connection between an identity provider (i.e., Google) and a relying party (i.e., Facebook).
  2. On subsequent visits, if the user is signed into their provider, then the relying party detects that and automatically logs them into their own site.

So then, the user sees something like this on the relying party:

Logout link at the top of a page

What does that “logout” link do?

There are three possible choices:

  1. Log you out of the current site. (clear cookies on the current domain)
  2. Log you out of the current site AND the identity provider. (clear cookies and use cross-domain communication to clear the parent)
  3. Give the user a choice between #1 and #2.

As I argued last fall, the only logical choice is #2: for it to log you out of not only the current site, but the identity provider as well.

For #1 or #3 (which involve clearing cookies on just the current domain), we leave the user in a logically inconsistent state, which exposes them to security and usability problems.

The contrapositive

One of the basic rules of logic says that if an if/then statement is true, then its contrapositive must also be true.

IF P => Q THEN ~Q => ~P

So if we have:

P = “You are logged into the provider”
Q = “You are logged into the relying party”

Then the contrapositive would be:

~Q = “You are NOT logged into the relying party”
~P = “You are NOT logged into the provider”

In English:

If you are NOT logged into the relying party, then you should NOT be logged into the identity provider.

Otherwise, we end up with a contradiction. The user clicks “logout”, and perhaps they even see a logged out page, but when they refresh or navigate away, the relying party will grab the parent’s session and log them right back in.
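The rule itself is easy to verify exhaustively; this little Python check confirms that P => Q agrees with ~Q => ~P for every truth assignment:

```python
from itertools import product

def implies(p, q):
    # Material implication: false only when p is true and q is false.
    return (not p) or q

# P => Q must agree with ~Q => ~P for all four truth assignments.
for p, q in product([True, False], repeat=2):
    assert implies(p, q) == implies(not q, not p)

print("contrapositive verified")
```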

In short: for relying parties and providers that support automatic single sign in, they need to also support automatic single sign out.

The examples

We can look to the web to see how real sites have implemented this. My favorite example is Google, Youtube, and Blogger. These are all owned by the same company, and if the user sets it up, then they all share the same identity provider (Google). However, I think to most consumers, they are branded sufficiently different that it approximates the world in which the provider and the relying party have no relationship.

Link your Blogger account with Google, then log into Gmail. Then go to Blogger - you’re automatically logged in. Now, click “logout”. You are logged out of Blogger and Google. (Incidentally, you’re also logged out of YouTube, Gmail, and all other Google services).

As another example, go to Citysearch. Log in with Facebook Connect. Now log out, and you are logged out of Facebook as well as all other Connect sites that implement automatic login detection.

Implementation details

Single Sign In

Single sign in can be implemented in different ways. With OpenID, it is accomplished with an immediate mode call. A relying party can do an immediate call either in a background iframe, or with a full page redirect (if there is only one provider).

In Facebook Connect, the background iframe request is done automatically by the Facebook Javascript libraries. An application can support single sign in by assigning a callback that refreshes the page when the user is logged in.

Single Sign Out

But, how do we do logout? For Facebook Connect, a site calls FB.Connect.logout(). That will do a background ping to Facebook that clears the user’s current session. It also briefly pops up a screen that says “You are logging out of Facebook.”

There is no equivalent support in OpenID, but I argue that there should be. I would like to see two request modes in addition to checkid_setup and checkid_immediate.

* logout_teardown: Inverse of checkid_setup, allows user interaction for the logout. If a provider chooses, they could show the user a page before logging them out, and perhaps give the user a choice.
* logout_immediate: Inverse of checkid_immediate, logs out without any interaction. This could be done in a background iframe or full page redirect.

It would be up to the provider to support one, both, or none of these additional modes.
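Since these modes are only a proposal, any concrete request format is speculative. Here is a sketch of what such a request might look like if it reused OpenID 2.0's parameter conventions; the mode names come from this post and are NOT part of the spec:

```python
from urllib.parse import urlencode

def logout_request_url(op_endpoint, return_to, immediate=False):
    """Build a (hypothetical) logout request in OpenID 2.0 style.

    "logout_teardown" and "logout_immediate" are proposed modes only.
    """
    return op_endpoint + "?" + urlencode({
        "openid.ns": "http://specs.openid.net/auth/2.0",
        "openid.mode": "logout_immediate" if immediate else "logout_teardown",
        "openid.return_to": return_to,
    })
```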

In the example of Facebook Connect, a provider could display the “You are being logged out” notice via logout_teardown.

The choice

But wait, you say! I don’t like this behavior! I don’t want to be logged in and out at will - I want choice to navigate throughout the web.

First, let me assert that most people don’t care that much. They just want it to work. And single sign in / sign out is a conceptually easier model to handle (once we get the kinks out) than having to keep track of accounts across the web. Single sign in is kind of the point of OpenID and related technologies.

That said, there are legitimate reasons why you might not want to support automatic login. For example, Mint.com handles very sensitive data, so it may not want to trust third-party credentials exclusively. Or Netflix, which has its own very developed sense of user identity, including credit cards on file, may choose not to accept single sign on. These companies may find it useful to link with other identities, but single sign on is just not part of the equation.

That’s fine, actually - they don’t have to. But the developer does need to pick a side. If you don’t want to support sign in, then you don’t need to support sign out. But for those sites that DO support sign in, they DO need to support signout. You’re either in the club or not.

Imagine there’s a “single sign in cloud”. When a user logs into their identity provider, they log into the entire cloud; when they log out, they log out of the whole thing. Sites choose whether to join the cloud.

Single sign in cloud

Let’s review the contrapositive rule. IF a site supports automatic login THEN it must support automatic logout. It’s a choice. If you don’t want to log the user out of the provider when they hit “logout”, then you can’t automatically log them in. You must require them to click a button or enter a password or some such to get in - each and every time.

But regardless, we want to make sure that OpenID supports automatic login and logout, even if it’s not always required.

Update: Shibboleth and Single Sign Out

There’s already been some extensive analysis of single signout within other identity communities. That post makes the very valid point that single signout is much harder than it seems. However, I don’t think that argues that it shouldn’t be attempted, just that we need to watch out and use the right tools.


Apr 15 2009

Making OpenID more useful: let’s detect logged-in state

One of the biggest issues with OpenID is its usability. Many relying parties are currently faced with a difficult choice: how do you let the user know the provider they should be using? Users may be familiar with the brands of Facebook, Google, Yahoo, etc, but if your site doesn’t show it, then it may as well not exist. For the time being, at least, the OpenID brand is invisible (and in fact, sometimes is considered a negative).

As I see it, the fundamental problem we would like to solve is to figure out what are the most likely OpenID providers that the user would like to use? There are a few approaches to this:

  • Show logos for the most popular providers. This is currently a common option, popularized by Janrain’s RPX and numerous other sites. This scales up to maybe 4-8 providers - beyond that, we get into what Chris Messina dubs the “OpenID Nascar”.

    Janrain RPX displays icons of various top identity providers

  • Show a dropdown of popular providers. This scales to maybe a few more, but starts to suffer from the value problem. What’s the first element of the dropdown? Do users understand what the “openid” moniker means that sits in front of it? (the answer: nope)

    Ma.gnolia used a dropdown of common providers

  • Let the user type in a url. You can provide a simple textbox, or better yet, a typeahead, in which the user can enter the service provider. The problem is that without prompting, users don’t really understand the value or what to write - in fact, they will type just about anything in there. You can see what this looks like here.

    Blank box offers infinite choice and confusion

The core problem is that we are relying entirely on the user to tell us what their provider is, but we can’t really ask them an intelligent question without explaining the value first. And the catch-22 is that the value sometimes depends on the provider! For instance, I may not know what OpenID is or internet identity or whatever - I just want to log in, okay?

To a user of Google products, this is how the interface should look


A solution: detect the user’s provider

Okay, so let’s not just ask the user. Let’s ask the user’s browser, and let it tell us who their provider is. There are a few brainstormed techniques for doing this, none of which is particularly widespread at the moment:

  • One idea that’s commonly suggested but not implemented is to use some strange browser history hacks to determine what sites the user has visited recently. If an OpenID provider appears in the list, then target the message to them. Fantastic!

    While this would work with today’s technology, I think most sites have shied away from it because the information comes via an exploited bug rather than intentional, informed consent.

  • We could build support into the OpenID spec for detecting the user’s current logged-in state. Relying parties could iterate through a whole bunch of possible providers, and if the user is currently logged in, they can show them a login screen. The performance would hardly be as bad as people think - requests can go in the background, in parallel, and you could theoretically query dozens or hundreds of providers within a few seconds (although I admit I haven’t tested it). See more discussion below.
  • We could create a centralized authority or authorities for managing user OpenID provider preferences. People could set a single cookie with their preferred provider, and then every relying party could check that cookie via a cross-domain call. Some Google engineers have initiated discussion of this based on a similar model used for advertising, but it suffers from a severe chicken-and-egg problem. I think we first need to feel some pain from these other approaches before everyone gets fed up and moves to a centralized model.

OpenID needs a middle state

I think #2 is the best option of the three. OpenID already supports two types of requests. First, modal calls, with a mode of “checkid_setup”, tell the provider to ask the user to log in. This is the standard mode that most people are familiar with, and much of the user experience discussion (including my previous post about the popup UI) has focused on it.

The second type of call is triggered with a mode of “checkid_immediate”, and it, well, always returns immediately. This mode is used by a relying party to re-authorize a user that has previously visited the site. When a provider receives an immediate request, it can send only one of two replies:

  1. Yes: Yes, the user is currently logged in, and they have previously authorized your website, so here’s their identifier. (mode is “id_res”)
  2. No: Nope, don’t know anything about them, you have to send them over to find out. (mode is “setup_needed”).
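For reference, a checkid_immediate request is just a set of query parameters sent to the provider's endpoint. A minimal Python sketch, with a made-up endpoint and identifier:

```python
from urllib.parse import urlencode

def checkid_immediate_url(op_endpoint, claimed_id, return_to):
    """Background re-auth ping: the provider must answer yes or no at once."""
    return op_endpoint + "?" + urlencode({
        "openid.ns": "http://specs.openid.net/auth/2.0",
        "openid.mode": "checkid_immediate",
        "openid.claimed_id": claimed_id,
        "openid.identity": claimed_id,
        "openid.return_to": return_to,
    })
```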

By contrast, Facebook Connect offers more functionality. If an application (relying party) queries the Facebook servers for a given user, the response can indicate one of three states:

  1. Yes, the user is logged into Facebook and they previously authorized your website.
  2. The user is logged into Facebook, but they haven’t authorized your site.
  3. No, we don’t know anything about this user.

That middle state ends up being pretty important. When a site sees it, it can place a Facebook login more prominently than it otherwise would. Also, because the relying party knows the user is already logged in (and so won’t be entering any credentials), it can issue a nice neat iframe dialog instead of the heavyweight popup window. Because an iframe can’t get blocked or hide behind the window, it makes for a nicer, less confusing user experience, and it looks better too. (It’s not really a security problem - if the OP wants to collect a password, it can always pop out of the iframe with Javascript).
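A relying party's handling of the three states could be sketched like this; the state names are illustrative, not Facebook's actual API values:

```python
def choose_login_ui(state):
    """Map a Connect-style three-state answer to a UI decision."""
    if state == "connected":      # logged in AND previously authorized
        return "log the user in automatically"
    if state == "logged_in":      # the middle state: logged in, not yet authorized
        return "feature the Facebook button; use an iframe dialog"
    return "fall back to the standard popup login flow"

print(choose_login_ui("logged_in"))
```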

There’s currently no way to communicate that middle state via the OpenID protocol. What I would like to see is an additional parameter that’s part of the “setup_needed” mode, which says in effect “the user is logged in, so if you present them with a dialog, it will be an easier experience than with other providers”.

Objection: what if a provider doesn’t want to?

Releasing logged-in state can be an optional feature, but there should be a standard way to return it within the context of an OpenID transaction. Right now there isn’t one, so providers that want to release that state have to invent their own mechanism.

Of course, the OpenID community has been fighting an uphill battle to even get the basic checkid_immediate call to work. Google, MyOpenID, and many others behave correctly, but even top providers like Myspace, Yahoo, and Microsoft always return a negative response to immediate requests - even when the user is logged in and authorized! We clearly still have a lot of work to do to make checkid_immediate a standard across all top providers.

Nonetheless, the spec should make it possible for providers who WANT to offer this additional state to do so. Just as some providers choose to offer checkid_immediate and others don’t, likewise we should allow those who want to the ability to return a more nuanced reply.

Objection: what if the user is logged into more than one provider?

So let’s say this becomes widespread. A user shows up to a relying party, and the background ping reveals that the user is logged into Facebook, Google, and Yahoo. Now what?

I think this would be a great problem to have. The relying party could present the user with choices if it wanted to, except confident in the fact that the user is familiar with all the choices. Or, a relying party could choose the provider that it has found gives it consistently better data and user experience. Or it could choose based on security preferences. The point is, the interface is still in the control of the relying party, but now the RP has strictly more information to make a better experience for its users.


Mar 4 2009

A proposal for a conceptual “Open Stack”

Last summer, John McCrea and Joseph Smarr put together a diagram of the “open stack”. The image showed up in numerous talks throughout last year, culminating in an Open Stack Meetup in December. Last week, Marc Canter sent an email asking for thoughts on crafting a new revision to the “open stack” graphic. I’d like to propose a new stack based on the underlying concepts rather than the specific, possibly obsolete technology.

Here’s the open stack graphic from summer 2008:

The Original Open Stack


Here’s the problem: how do I read this? If I’m an average businessperson or developer who has never read a spec, how do I know what these terms mean? Some of the ideas are in there … ID, Auth, Contacts … but others don’t make any sense. XRDS-Simple? My eyes start to glaze over - that doesn’t seem all that simple. I have to lean over to my friend and ask him what all these things mean.

Technology changes.

The technologies involved here change rapidly. In just the past six months, we’ve added draft specs for PortableContacts, ActivityStreams, OpenID / OAuth Hybrid, and OpenID User Experience. These are all draft specs, which means they will change (some more than others) in the near future. Other specs, like XRDS-Simple, are actively being deprecated in favor of newer versions, like LRDD. And the work is far from done - there will likely be even more specs developed and revved before the whole thing really starts to gel.

Do we really want every version of this graphic to be out of date shortly after it’s released? Or do we want something that is compelling, demonstrates the vision, and gets people thinking about how to do it themselves? We should use the underlying concepts to communicate ideas instead of specific technologies.

The "Real" Open Stack .. of papers.

For an example of a different approach, I looked at the messaging around Facebook Connect. A developer who’s deciding whether to implement Connect will see the three main benefits: Identity, Friends, and Feed. Sure, it’s much more complicated under the covers, and sure there are some pieces that aren’t covered (lots, actually), but those are the main points that everyone should think about. They will go home and think: “How can I fit each of these pieces into my own website?”


The Facebook Connect value stack.

The “open stack” embodies a similar set of concepts, but they aren’t entirely the same. For example, to participate in a many-to-many decentralized web made up of open standards, Discovery becomes a really important element. Someone who views the diagram should be able to tell what’s going on without having to look up a bunch of terms or be involved with the community. They should also be able to immediately understand most of the terms and apply them to their own use cases.

Here’s my proposed conceptual open stack:


A conceptual open stack.

Let’s stack it up, with the highest-level concepts on top and the foundations for those concepts on the bottom. Thus we have:

  • Streams. Read recent activities that people are doing around the web. Can be implemented with Atom, RSS, or the newer ActivityStreams.
  • Friends. Get information about people you are connected to. Alternately, this could be called Contacts, although I think that word tends to turn people off since they think it means contact information (which it doesn’t always). Can be implemented with PortableContacts or the OpenSocial “People and Friends” API.
  • Identity. How does someone prove they are who they say they are? This one is solidly covered by OpenID.
  • Profile. All the information that goes along with an identity - name, profile picture, birthday, whatever. There are a bunch of ways of getting this, and it hasn’t really settled yet. The most popular now are OpenID Simple Registration and OpenID Attribute Exchange. Another possibility is to use the OpenSocial “People and Friends” API with the OpenID-OAuth hybrid (although to my knowledge nobody has implemented this yet).
  • Authorization. Allow someone to have access to private data. Alternately, this could be called Privacy. The open standard for authorization is clearly OAuth.
  • Discovery. In a decentralized system, we need a way to figure out where everything is. As we get more and more providers and consumers, a smooth discovery process becomes ever more important. It has hopped around, from simple link tags in OpenID 1.1, to XRDS-Simple in OpenID 2.0. Now Eran Hammer-Lahav is working on a Link-based discovery mechanism, which will hopefully replace the other forms of discovery going forward.
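Discovery is the piece most likely to be unfamiliar, so here's a minimal sketch of the oldest form - the OpenID 1.1 link tags - in Python (the page contents and endpoint URLs are invented for illustration):

```python
from html.parser import HTMLParser

class OpenIDLinkFinder(HTMLParser):
    """Collects <link rel="openid.*"> endpoints from a claimed identifier's page."""
    def __init__(self):
        super().__init__()
        self.endpoints = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        rel, href = attrs.get("rel"), attrs.get("href")
        if rel and href and rel.startswith("openid."):
            self.endpoints[rel] = href

def discover(html):
    """Return {rel: href} for every openid.* link tag found in the page."""
    finder = OpenIDLinkFinder()
    finder.feed(html)
    return finder.endpoints

# Hypothetical page a user might host at their claimed identifier URL
page = """<html><head>
  <link rel="openid.server" href="https://www.myopenid.com/server">
  <link rel="openid.delegate" href="https://example.myopenid.com/">
</head><body>my homepage</body></html>"""

print(discover(page)["openid.server"])  # https://www.myopenid.com/server
```

XRDS-Simple and the newer Link-based proposals replace the HTML parse with a dedicated document or HTTP header, but the shape of the problem - URL in, endpoint out - is the same.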

As you can see, the world of the open stack is constantly changing, but the underlying concepts are maturing. We should ask people to think about how they can apply the concepts to their own business or project, rather than asking them to check off a bunch of arcane technology names.

I’m sure that lots of people have thoughts on this. Let me hear ‘em!


Feb 4 2009

How to accept OpenID in a popup without leaving the page

For most sites that accept OpenID today, the user experience is one of two things:

  • User is redirected to the OpenID provider, and then redirected back to the original site. This is the most popular one, but it’s a particularly jarring experience for the user.
  • User is given a Javascript browser popup, but when the popup returns, it still refreshes the whole page. I haven’t actually seen this in the wild, but I’ve heard it discussed.

There has been some discussion lately about how the OpenID experience can work within a popup window. Next week, Facebook is hosting an OpenID design summit to work through some of the issues around cohesive design within a popup. However, one question that I’ve heard several times is “How does the popup work for Facebook Connect? Can it be done for OpenID?”

In theory, yes. In this blog post, I’ll walk through an approach for how the existing OpenID 2.0 spec can be used to do the entire exchange within a browser popup, so that the page doesn’t even have to refresh. Once the user is logged in, she can just keep doing what she was doing. At the moment this is just theoretical, but I will put up sample code and a demo once I get it all working.

Update: Brian Ellin implemented this idea in, like, an hour. His sample code implements a slightly simpler version of the technique described here. It’s really quite good.

See it in action here.

How it works

Okay, so suppose the user is on a page, and they see a “Sign in with OpenID” button. I’ll assume they have some way of choosing their provider. In the image below, I use the “Sign in with Yahoo” button. The user clicks the button, which triggers a Javascript handler. That handler sets up a callback (I’ll get to that later) and then calls window.open to initiate the transaction.

The first part of any OpenID transaction is discovery - that is, given a URL like “yahoo.com”, where do I go to actually log the user in? The RP also needs to establish a secure association, so that it can later verify the signature on the response.

We open the popup onto a helper file located in the domain of our site. That helper file does the appropriate discovery, and then redirects to the OpenID provider, staying within the browser popup. The OpenID provider walks the user through the steps to log in. It doesn’t even know it’s in a popup - as far as the provider is concerned, this is a full page (although there are some discussions about how to let the provider know it’s in a popup, nothing has made it into the spec yet).

The OpenID provider within the popup looks something like this:

Finally, the provider redirects back to the openid.return_to url. Remember in the first step, before the popup was even opened, the Javascript handler set up a callback? Well, the return_to url is a specially encoded cross domain url. It encodes information about that callback so that it can be decoded on the reply.

Here’s an example of a return_to cross-domain url:


http://open.sociallipstick.com/openid/xd_receiver.htm#fname=_opener
&%7B%22t%22%3A3%2C%22h%22%3A%22openIDresponse%22%2C%22sid%22%3A%220.672%22%7D

The cross-domain URL reads the response parameters, and just passes them directly to the parent page using Javascript. Because the cross-domain receiver and the parent page are on the same domain, the communication can proceed.
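For illustration, here's how the receiver might pick that fragment apart - a Python sketch of the decoding, assuming the fname-plus-JSON layout shown in the example above:

```python
import json
from urllib.parse import unquote

def decode_xd_fragment(url):
    """Split an xd_receiver URL's fragment into the target frame name
    and the JSON-encoded callback metadata that rides along with it."""
    fragment = url.split("#", 1)[1]
    fname_part, meta_part = fragment.split("&", 1)
    fname = fname_part.split("=", 1)[1]
    meta = json.loads(unquote(meta_part))
    return fname, meta

url = ("http://open.sociallipstick.com/openid/xd_receiver.htm"
       "#fname=_opener"
       "&%7B%22t%22%3A3%2C%22h%22%3A%22openIDresponse%22%2C%22sid%22%3A%220.672%22%7D")
fname, meta = decode_xd_fragment(url)
print(fname, meta["h"])  # _opener openIDresponse
```

In the real flow this decoding happens in Javascript inside the receiver page, but the encoding is the same either way.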

What next? The parent document now has all the OpenID parameters, but it’s in Javascript. The last step is to verify those parameters. The RP can make an Ajax call to a helper script, which looks up the association and verifies the signature. It can then perform any logging in, and pass back a success variable to the Javascript to let it know that everything went well. Of course, if the Javascript wants it can then go ahead and refresh the page, but if the user is in the middle of something, it can wait until the user is done to do so.
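If the RP doesn't have a cached association for this OP, the helper script can instead use OpenID's stateless check_authentication mode: resend the signed parameters to the provider and ask it directly. Here's a hedged Python sketch of just building that request body (parameter names are per OpenID 2.0; the sample values are invented and the actual HTTP POST is omitted):

```python
from urllib.parse import urlencode

def build_check_auth_request(response_params):
    """Given the positive assertion relayed up from the popup, build the
    POST body for a direct check_authentication call to the OP. Per
    OpenID 2.0, the fields are resent with openid.mode switched."""
    params = dict(response_params)
    params["openid.mode"] = "check_authentication"
    return urlencode(params)

# Invented sample values, just to show the shape of the assertion.
assertion = {
    "openid.mode": "id_res",
    "openid.op_endpoint": "https://www.myopenid.com/server",
    "openid.sig": "AbCdEf==",
    "openid.signed": "op_endpoint,claimed_id,identity,return_to",
}
body = build_check_auth_request(assertion)
print("openid.mode=check_authentication" in body)  # True
```

With an association, the RP skips this round trip entirely and verifies the HMAC signature locally, which is why caching associations matters for performance.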

Performance

What about performance? This flow involves the following HTTP requests:

  1. The initial request for the RP page.
  2. The load of the helper page in the popup (which does background discovery)
  3. The redirect to the OP
  4. Probably, at least one form submit within the OP. (The user enters their name and password, and submits)
  5. The load of the receiver file when the OP redirects back
  6. The final Ajax call to the RP server to validate the signature

At least two of these HTTP requests can be optimized away:

  • Load of the helper popup. In the common case, users will be using an OpenID provider that the RP has seen before. For example, if someone clicks the “Yahoo” button, then the RP doesn’t need to do discovery on Yahoo.com again. In fact, the RP should cache both the server endpoint and the secure association, and make them available in Javascript. If that were the case, then the popup could open directly onto the OP site, and skip this request entirely.
  • Load of the receiver page. The client needs to redirect to a page that lives on the RP domain - but that doesn’t mean that it has to load that page from the server. If the RP serves the cross-domain receiver as static HTML with long cache headers, then the user’s browser will cache the page. As long as the query string doesn’t change, the browser won’t need to fetch the receiver again.

    But how can the query string not change? After all, the OpenID parameters need to be sent back in the query string, right?

    The way around this is to send the parameters back not in the query string, but in the fragment. So the return would look like:

    http://sociallipstick.com/receiver.htm#handler_info&openid.ns=…..

    Because the part before the fragment doesn’t change, this file never needs to be reloaded, and this HTTP request is saved.
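To see why the cache keeps working, note that browsers key their cache on the URL without the fragment; a quick Python check (with invented parameter values):

```python
from urllib.parse import urldefrag

# Two returns from the OP with different OpenID parameters in the fragment.
first = "http://sociallipstick.com/receiver.htm#handler_info&openid.ns=http://specs.openid.net/auth/2.0"
second = "http://sociallipstick.com/receiver.htm#handler_info&openid.sig=XyZ"

# The fragment never reaches the server, so the cacheable URL is identical.
print(urldefrag(first).url == urldefrag(second).url)  # True
```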

Conclusion

The techniques laid out here can help the OpenID user experience reach the same level of fluidity as that achieved by Facebook Connect. Now it just remains to get some working code, and put it in practice!


Jan 8 2009

Macbook Wheel isn’t all that far-fetched for some

Apple just released the Macbook Wheel, a computer with only a single input device. Well, it actually has multiple inputs - an entire scroll wheel, plus a button in the middle.

But it’s not that ridiculous of an idea. In Beautiful Code, there’s a chapter about how some researchers needed to design software for paraplegics (like Stephen Hawking). They had the most restrictive user interface possible: a single input button. In fact, it was a breakthrough when the researchers realized that the user could potentially hold down the button for variable lengths of time:

From observing Professor Hawking use Equalizer, I discovered a new mode of operation: besides simply just clicking the button, he could hold down the button and release it at a strategic moment. The button, in effect, is not merely a binary input device, but actually an analog one, for it can provide a signal of varying duration. We thought long and hard about how best to use this new power we were presented with: we could now get more information out of a click than a simple bit. We could, for instance, allow the user to pick from a list of choices. A short click would now be used for the default action, while a long click opened up many other options.

Crazy! It is a testament to the skill of Hawking and the engineers that they were able to get typing speed up to a reasonable pace.
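The core trick - turning one button's press duration into more than one bit - can be sketched in a couple of lines (Python; the 400 ms threshold is my invention, not from the book):

```python
def classify_click(duration_ms, long_threshold_ms=400):
    """Map a single button's press duration onto two distinct inputs:
    a short click triggers the default action, a long click opens a
    menu of other options. The threshold is illustrative only."""
    return "default_action" if duration_ms < long_threshold_ms else "open_menu"

print(classify_click(120))  # default_action
print(classify_click(900))  # open_menu
```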

Here’s some of the pages from the book. You’ll need to buy it to read the whole chapter (which I highly recommend).
http://books.google.com/books?id=gJrmszNHQV4C&pg=RA6-PA183&lpg=RA6-PA183&dq=steven+hawking+open+source+one+button+input+beautiful+code&source=bl&ots=rKRWvwYauh&sig=8YxP_NlgaagT0vGOU-pSEz4HsUU


Dec 17 2008

I’m running for the OpenID board of directors

I’m running for the OpenID board of directors. I’m a little nervous, having never done any sort of political thing before. So let me try to answer some questions.

Q. Cool! Can I vote for you?

Anyone who is a member of the Foundation is eligible to vote. Membership in the foundation costs $25, and requires an OpenID (Yahoo or Google will work fine if you have an account at one of those sites).

If you’re interested in OpenID and the future of the web, please join! And then vote for me. Thanks!

Click here to join and vote

Q. Huh? What is OpenID?

If you aren’t familiar with OpenID, don’t worry - you’re not alone. It’s a geeky protocol that says how to use an account from one website to log into another website. It was designed initially as a way to avoid typing a username and password into every site around the web. You should watch the same video I did that convinced me of its potential.

Q. I don’t get it.

You can try it out by visiting a Blogger blog, like my brother Scott’s. You can leave a comment using your Google account, or any OpenID. Providers currently include Yahoo, AOL, Livejournal, and a few others. Here’s a screenshot:

Logging into Blogger using my yahoo.com OpenID

As you can see, the user experience leaves a little to be desired, as it’s not really obvious how it all works to a non-geek.

Q. What is Facebook Connect?

I’ve been working on this project for several months. It basically lets you log into a website using your Facebook account instead of making a new username and password. You also get tons of cool benefits like seeing what stuff your friends are doing, and publishing activities back to Facebook automatically (but only if you want to). Give it a shot at Citysearch or Techcrunch. Or on the Comment form on my blog (if you’re reading this on Facebook, click “view original post” and then check out the comment form).

Q. But I heard that Facebook and OpenID were competitors. Why would they want you on their board?

Well, they aren’t competitors so much as just working on the same problem from different angles. You could say it’s complicated.

OpenID is a protocol, like HTTP, SSL, or 802.11b. Facebook Connect is a product offered by a single company. But as far as products go, I think we did a pretty good job of it, and I’ve learned a lot that can be shared with the community.

Ultimately, I would love to see a world in which your information and identity follows you around. If you interact with a website, or a store, or a phone, then it knows who you are – to the extent that you want it to. All your data should be privacy protected, so that the user is ultimately in control of who gets to see what. But we can remove a lot of the friction that gets in the way of people sharing their data with who they want to share it with. I don’t think that Facebook can get to this world all by itself – that’s why they built the platform, and now Connect. I hope that by joining the board I can establish a tighter connection and increase communication between competitors and allies alike.

Q. Watch out for this shoe!

*ducks*

Q. What do you think OpenID needs to do to improve adoption?

The message of OpenID has generally been “make it easier for consumers to log into multiple sites without a new password”. Well, after a few years, it’s pretty clear that that is not enough to get people to adopt it.

The primary competitor to OpenID is not Facebook Connect, Google Friend Connect, or any of these new systems. It’s old-fashioned email. When a site gets your email address, they get both an identifier and a way to contact you. When they get an OpenID, all they get is an identifier. As long as an OpenID is less valuable than an email address, it will not be adopted widely. So we need to make it more valuable to websites.

There are elements of the “open stack” that can layer on top of OpenID and provide not only a way to contact the user, but also their profile info, friend data, and distribution among their friends. These are all available via Facebook Connect, and they offer real value: websites are impressed by how much data they get from their users, and by how much more content those users contribute. For example, Govit reported that more than half of their new users sign up with Connect; they all arrive with names and profile pictures, and they can publish their stories back into their Facebook News Feed.

Unfortunately, OpenID providers aren’t there yet in providing all that value. There are extensions to OpenID that help with this: Simple Registration, Attribute Exchange, OAuth, Portable Contacts, …. Sorry, did I lose you? These different pieces are really confusing and inconsistently applied, and as long as that is true, it will be really difficult for relying parties to, well, rely on them being there. Of the big providers, Yahoo is the only one that offers Simple Registration, and even they haven’t released it publicly (although they soon will). I hope that during 2009 the breadth of providers offering the full “open stack” will expand dramatically, so that relying parties can come to expect a consistent experience from the average OpenID provider - an experience as consistent as it is with Facebook Connect.

Q. What would you do as a board member?

My understanding is that the board is actually somewhat disjoint from the mechanics of actually moving the technology forward. The board members meet once or twice a quarter, plan and manage finances, and set the strategic direction and overall goals of the OpenID brand and organization. I think I can help particularly in representing the needs of big companies (Facebook specifically), and the top 100 websites, and making sure that their voice is heard within the OpenID board meetings. I’m also really interested in learning about the goings-on in the technology, and talking with representatives from other stakeholders in OpenID.

Regardless of whether I make the board, I plan to work within the community … help with the OAuth extension, continue to evangelize OpenID and other elements of the open stack within Facebook. I’ve done a significant amount of work towards that end already and plan to continue.


Dec 11 2008

Lessons from Facebook Connect

Last week we finally launched Facebook Connect to the general public. In the time since I joined the team last May, I’ve definitely been surprised by a few things, which I thought I’d share.

think big

This time last year, I thought “Man, wouldn’t it be cool if Facebook became an OpenID provider? Maybe if we just put it out there, then eventually people could redirect back to Facebook, and we could expand the platform incrementally.” Moments of surprise:

  • When Wei told me that we were going to do the entire login flow in an iframe on the remote site, without any redirects needed. And he would build an extension to HTML, rendered via Javascript, that could display social data on a remote site. Whoa, I didn’t even know that was possible.
  • I started to think, “Wow, maybe some sites will actually use this.” Then Josh, Matt, and Dave told me that they had already talked to Citysearch, Digg, CNN, CBS, … I thought, “Wow, real sites? Like ones my friends would use?”
  • As we got closer to launch, the dialog kept growing and shrinking. The feed form kept changing. Engineers would add checkboxes, options, and then Julie would smack them down. Or if she didn’t, Zuck would. After several iterations, I started to grasp the vision. This wasn’t just about letting users share their blog comments if they wanted to. This was about radically changing human behavior so that everything they do is shared through Facebook. Everything. Ultimately, that’s the goal. It blew me away.

marketing matters

The team built a great product, if I don’t say so myself. But it would have been impossible without the partner managers. Zhen, Josh, and Anand were out there every day for six months talking to partners, nursing them, giving us feedback. Somehow they kept track of a software product that was constantly changing. They cajoled and explained to the folks at shopping sites, media companies, newspapers, tech sites, and bloggers. They figured out what each one needed and let them know that Facebook would make them money.

Most importantly, they explained it to us. I didn’t really know what we were building back in June. It was the partner managers that gave us the early Citysearch mocks that let the team know we were onto something. And as the product rolled on, they helped us prioritize. I think most features we built between August and November were geared towards one partner or another, whose feature requests represented the voices of huge swaths of developers who would never ask, but simply would not use our products. For example, Citysearch operates on multiple subdomains - chicago.citysearch.com and miami.citysearch.com - and we needed to build out support for that. But because we did, it made the product that much better.

speedy is as speedy does

The engineers on this team are quick! I think every day for months someone was checking in code. There were some days with over 30 commits, and code being pushed two or three times a week. This is only possible because the whole company is geared towards speed, speed, speed. I credit our main pusher, Chuck Rossi, for launching us at least a month earlier than we would have otherwise. Of course this also meant incredible pragmatism when it came to churning out features. Very little work was wasted in the end, which was a tribute to the product management.

listen to developers

As the first few … dozen … partners rolled out, the team cheered. Each of them represents hours of engineering time on our end working with developers, helping them solve problems, debug issues, just get a mental model for how XFBML is supposed to work. Internet Explorer is not kind to cross-domain Javascript developers, I’ll say that much. But it was worth it. In the past two months we’ve seen what kind of errors people kept making, and tried really hard to reduce the code necessary to make simple things happen. Our initial partners spent so much of their time helping us smooth things out. David Recordon and Jonah Schwartz were rock stars really early on.

Wei and I made a video demonstrating Connect today. The code in this video would have taken hundreds of lines of Javascript only two months ago.


Nov 25 2008

Data driven decision making: Netflix or Blockbuster?

My wife and I have been Netflix subscribers for years, during which we have rented hundreds of movies. We are considering a switch to Blockbuster, but one of the holdups has been that Blockbuster supposedly only has best sellers, while Netflix has lots of niche and foreign movies that make it more attractive. Then I realized it doesn’t really matter what the selection is in the abstract; what matters is, are the movies we want available? So I wrote a quick Perl script to help answer that question. It was fun so I thought I’d share my methodology and results.

  1. First download my Netflix account history to an HTML file: https://www.netflix.com/RentalActivity?all=true
  2. Extract the movie titles from the HTML file:
grep "http://www.netflix.com/Movie" RentalActivity.html \
  | sed "s|.*Movie/||" | sed "s|/.*||" \
  | sed "s/_/ /g" > netflix_history

Then, go to the Blockbuster search page, and figure out their search endpoint. Blockbuster doesn’t have an API like Netflix does, so we have to scrape their page. Since we’re looking for a relatively simple answer, this is not bad. Playing around, I can get the answer with this command:

curl "http://www.blockbuster.com/search/movie/movies?keyword=The+Sopranos+Season+2+Disc+2" \
  | grep "results containing"

I figured out the regular expression and whipped up a quick Perl script that pulled out the number of results and the title of the first result.

#!/usr/bin/perl
use strict;
use warnings;

# Read one movie title per line, search Blockbuster for it, and print
# a tab-separated line: original title, result count, first result title.
while (my $title = <STDIN>) {
  chomp $title;

  my $date = `date`;
  chomp($date);
  print STDERR $date . "   " . $title . "\n";

  # URL-encode spaces so the keyword is a valid query string.
  (my $query = $title) =~ s/ /+/g;
  my $url = "http://www.blockbuster.com/search/movie/movies?keyword=" . $query;
  my $result = `curl -s "$url"`;  # -s keeps curl's progress bar out of the output

  # Pull out the number of results...
  my $num = -1;
  if ($result =~ m/(\d+)&nbsp;results containing/) {
    $num = $1;
  }
  # ...and the title of the first result, if any.
  my $new_title = '';
  if ($result =~ m|<dt class="titleInfo">.*?<a href="/catalog/movieDetails/\d+" title="(.*?)"|) {
    $new_title = $1;
  }
  print $title . "\t" . $num . "\t" . $new_title . "\n";
  sleep 15;  # be polite: one query every 15 seconds
}

In an attempt not to get shut down by any rate limiters, I only ran one query every 15 seconds.

After getting the data this morning, I loaded it into Excel and did some manual scrubbing. Sometimes the data was wrong; occasionally I’d get back 0 results even though the movie did exist in their catalog. So I manually re-ran about 20 or 30 searches on the questionable items, just to make sure everything was accurate.

The net result: only eight of our 327 movies were not available from Blockbuster. Most of these were the Up Series, an old British documentary series dating from the ’60s, so I’m not terribly surprised. The remaining few missing movies were:

Besides those, Blockbuster had them all. They had all our seasons of Freaks and Geeks, Buffy, Sopranos, Angel, Gilmore Girls, Sex and the City, and Six Feet Under. They had The Clan of the Cave Bear, The End of Suburbia, “sex, lies, and videotape”, Yo Soy Boricua Pa Que Tu Lo Sepas, and UChicago’s own Proof.

So in short, I think I’m switching to Blockbuster. Here’s to data-driven decision making.

Outcome