
Tuesday, February 24, 2015

A 'Robust' Schema Approach for SCIM

This article was originally posted on the Oracle Fusion Blog, Feb 24, 2015.

Last week, I had a question about SCIM's (System for Cross-domain Identity Management) approach to schema. How does the working group recommend handling message validation? Doesn't SCIM have a formal schema?

To answer, I first had to realize that the question assumed a different style of schema than SCIM supports: it assumed that "schema" is, as in XML, a formal definition used to validate documents.

Rather than focus on validation, SCIM's model for schema is closer to what one would describe as a database schema, much like many other identity management directory systems of the past. Yet SCIM isn't simply a new web protocol for accessing a directory server; it is also meant to make provisioning easy for web applications. The SCIM schema model is "behavioural": it defines the attributes and associated attribute qualities a particular server supports. Do clients need to discover schema? Generally speaking, they do not. Let's take a closer look at schema in general and at how SCIM's approach addresses cross-domain schema issues.

Many Definitions of Schema and Many Schema Practices

Looking at the definition in Wikipedia, schema is a very broadly defined term. It can describe a software interface, a document format (such as XML Schema), a database, a protocol, or even a template. There is even a new proposal called JSON Schema. This too is very different from XML Schema: it has some elements that describe data objects, but JSON Schema focuses much more on defining a service and more closely resembles another schema format, WADL.

With XML Schema, the bias seems to be toward "enforcement" and "validation" of documents or messages. Yet for many years the REST/JSON community has been proud of resisting a formal "schema". Maybe it just hasn't happened yet. This does appear to be an old debate between two camps: one claims the key to interoperability is strict definition and validation; the other claims it is flexibility, or "robustness", per Jon Postel's law [from RFC 793]:

“Be conservative in what you do, be liberal in what you accept from others.” 

Some 12 years ago, Aaron Swartz blogged "Postel's law has no exceptions!". I also found Tim Bray's post from 2004 to be enlightening: "On Postel, Again". So, what is the right approach for SCIM?


The Identity Use Case

How does SCIM balance the "robustness" vs. "verifiability" to achieve inter-operability in a practical and secure sense? Consider that:

  • There is often a cross-domain governance requirement by client enterprises that information be reasonably accurate and up-to-date across domains.
  • Because the mix of applications and users in each domain differs, the schema in one domain will never be exactly the same as in another domain.
  • Different domains may have different authentication methods, and data to support those methods, and may even support federated authentication from another domain.
  • A domain or application that respects privacy tends to keep and use only the information it has a legitimate need for, rather than a standard set of attributes.
  • An identifier that is unique in one domain may not be unique in another. Each domain may need to generate its own local identifier(s) for a user.
  • A domain may have value-added attributes that other domains may or may not be interested in.

SCIM’s Approach

SCIM's approach is to allow a certain amount of "specified" robustness that enables each domain to accept what it needs, while providing some level of assurance that information is exchanged properly. This means a service provider is free to drop attributes it doesn't care about when being provisioned from another domain, while the client can be assured that the service provider has accepted its provisioning request. Another example is a simple user-interface scenario where a client retrieves a record, changes an attribute, and puts it back. In this case the SCIM service provider sorts out whether some attributes are to be ignored because they are read-only, and updates the modifiable attributes. The client is not required to ask what data is modifiable and what isn't. This isn't a general free-for-all where the server can do whatever it wants; instead, the SCIM specifications state how this robust behaviour is to work.
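
To make that retrieve-modify-replace pattern concrete, here is a minimal Python sketch using the requests library (the endpoint, token, and attribute values are illustrative assumptions, not from any particular deployment). The client simply PUTs the whole resource back and lets the service provider ignore whatever it treats as read-only:

  import requests

  BASE = "https://example.com/scim/v2"           # hypothetical SCIM service provider
  HEADERS = {"Authorization": "Bearer <token>",  # placeholder credential
             "Content-Type": "application/scim+json"}

  # Retrieve the current representation of the user
  user = requests.get(f"{BASE}/Users/2819c223", headers=HEADERS).json()

  # Change one attribute locally; read-only attributes (id, meta, etc.) stay in place
  user["displayName"] = "Babs Jensen"

  # PUT the whole resource back; the service provider ignores attributes it
  # treats as read-only and returns the final state of the resource
  resp = requests.put(f"{BASE}/Users/2819c223", json=user, headers=HEADERS)
  final_state = resp.json()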

With that said, SCIM still depends largely on compliance with the HTTP protocol and the exchange of valid, JSON-parsable messages. Where SCIM draws the line is at abstract "validation" of information content in the way XML Schema does.
Does SCIM completely favour simplicity for SCIM clients? Not exactly. Just as a service provider needs to be flexible in what it accepts, so too must SCIM clients be when a service provider responds. When a SCIM service provider responds to a client request, the client must be prepared to accept some variability in SCIM responses. For example, if a service provider returns a copy of a resource that has been updated, the representation always reflects the final state of the resource on the service provider. It does not echo back exactly what the client requested. Rather, the intent is that the service provider informs the client of the final state of a resource after a SCIM request is completed.

Is this the right model?

Let’s look at some key identity technologies of the past, their weak points and their strong points:

  • X.500 was a series of specifications developed by the ITU in 1988. X.500 had a strict schema model that required enforcement. One of the chief frustrations for X.500 developers (at least for me) was that while each server had its own schema configuration, clients were expected to adjust their requests to match each deployment. This became particularly painful if you were trying to code a query filter that would work against multiple server deployments. If you didn't first "discover" the server configuration and adjust your code, your calls were doomed to fail. Searching became infuriating when common attributes weren't supported by a particular server deployment, since the call would be rejected as non-conformant. Any deviation was cause for failure. In my experience, X.500 services were extremely brittle and difficult to use in practice.
  • LDAP, developed by the IETF in 1996, was based on X.500 but loosened things up somewhat. Aside from being built for TCP/IP, LDAP took the progressive step of simply assuming that if a client specified an undefined attribute in a search filter, there was no match. This tiny change meant that developers did not have to adjust code on the fly, but could instead build queries with "or" clauses profiling common server deployments such as Sun Directory Server, Microsoft Active Directory, and Oracle Directory (a sketch of such a filter follows this list). Yet LDAP still carried too many constraints and ended up with some of the same brittleness as X.500. In practice, the more applications integrated with LDAP, the less able a deployer was to change schema over time. Changing schema meant updating clients and doing a lot of staged production testing. In short, LDAP clients still expected LDAP servers to conform to standard profiles.
  • In contrast to directory or provisioning protocols, SAML is actually a message format for sending secure assertions. To be successful, SAML had to allow a lot of optionality, which depended on "profile" specifications to clearly define how and when assertions could be used. A key to its success has been the clear definition of MUST understand vs. MUST ignore. In many cases, if you don't understand an assertion value, you are free to ignore it. This opens the door to extensibility. On the other hand, if as a relying party you do understand an attribute assertion, then it must conform to its specification (schema).
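
For a flavour of the "or"-clause style of filter mentioned in the LDAP point above, here is a small sketch using the python-ldap library (the host, bind credentials, and attribute choices are assumptions). Because LDAP treats an unknown attribute in a filter as a non-match rather than an error, one filter can profile several server products:

  import ldap  # python-ldap

  conn = ldap.initialize("ldap://ldap.example.com")   # hypothetical server
  conn.simple_bind_s("cn=reader,dc=example,dc=com", "secret")

  # One filter profiling several deployments: servers that don't define
  # sAMAccountName simply treat that clause as a non-match instead of failing.
  filt = "(|(uid=bjensen)(sAMAccountName=bjensen)(cn=Barbara Jensen))"
  results = conn.search_s("dc=example,dc=com", ldap.SCOPE_SUBTREE, filt)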

In our industry, we tend to write security protocols in strict forms in order to assure security. Yet we've often achieved only brittleness and lack of usability. Because the information relationships around identity, and the attributes consumed, are constantly variable, history appears to show that identity protocols with robust features are incrementally more successful. I think SCIM, as a REST protocol, moves the ball forward by embracing a specified robust schema model, bringing significant usability improvements over the traditional use of LDAP.

Post-note: I mentioned in my last blog post that SCIM had reached 'last call'. The working group has felt that this issue is worth more attention and is currently discussing clarifications to the specifications as I have discussed above.

Tuesday, December 16, 2014

Standards Corner: IETF SCIM Working Group Reaches Consensus

On the Oracle Fusion blog, I blog about the recent SCIM working group consensus, SCIM 2's advantages, and its position relative to LDAP.

Friday, May 30, 2014

Standards Corner: Preventing Pervasive Monitoring

On Wednesday night, I watched NBC's interview of Edward Snowden. The past year has been a tumultuous one in the IT security industry. There have been some amazing revelations about the activities of governments around the world, and we have had several instances of major security bugs in key security libraries: Apple's 'gotofail' bug, the OpenSSL Heartbleed bug, not to mention Java's zero-day bug, and others. Snowden's information showed that the IT industry has been underestimating the need for security, and highlighted a general trend of lax use of TLS and poorly implemented security on the Internet. This did not go unnoticed in the standards community, and in particular the IETF.
Last November, the IETF (Internet Engineering Task Force) met in Vancouver, Canada, where the issue of "Internet hardening" was discussed in a plenary session. Presentations were given by Bruce Schneier, Brian Carpenter, and Stephen Farrell describing the problem, the work done so far, and potential IETF activities to address the problem of pervasive monitoring. At the end of the presentation, the IETF called for consensus on the issue. If you know engineers, you know that it takes a while for a large group to arrive at a consensus, and this group numbered approximately 3000. When asked whether the IETF should respond to pervasive surveillance attacks, there was an overwhelming response of 'Yes'. When it came to 'No', the room echoed in silence. This was just the first of several consensus questions that were each overwhelmingly in favour of a response. This is the equivalent of a unanimous opinion for the IETF.
Since the meeting, the IETF has followed through with the recent publication of a new "best practices" document on pervasive monitoring (RFC 7258). The document is careful in its approach and separates the politics of monitoring from the technical issues:
Pervasive Monitoring (PM) is widespread (and often covert) surveillance through intrusive gathering of protocol artefacts, including application content, or protocol metadata such as headers. Active or passive wiretaps and traffic analysis, (e.g., correlation, timing or measuring packet sizes), or subverting the cryptographic keys used to secure protocols can also be used as part of pervasive monitoring. PM is distinguished by being indiscriminate and very large scale, rather than by introducing new types of technical compromise.
The IETF community's technical assessment is that PM is an attack on the privacy of Internet users and organisations. The IETF community has expressed strong agreement that PM is an attack that needs to be mitigated where possible, via the design of protocols that make PM significantly more expensive or infeasible. Pervasive monitoring was discussed at the technical plenary of the November 2013 IETF meeting [IETF88Plenary] and then through extensive exchanges on IETF mailing lists. This document records the IETF community's consensus and establishes the technical nature of PM.
The document goes on to further qualify what it means by "attack", clarifying that:
The term is used here to refer to behavior that subverts the intent of communicating parties without the agreement of those parties. An attack may change the content of the communication, record the content or external characteristics of the communication, or through correlation with other communication events, reveal information the parties did not intend to be revealed. It may also have other effects that similarly subvert the intent of a communicator.
The past year has shown that Internet specification authors need to put more emphasis on information security and integrity. The year also showed that specifications alone are not good enough: implementations of security and protocol specifications have to be of high quality and backed by thorough testing. I'm proud to say Oracle has been a strong proponent of this, having already established its own secure coding practices.

Cross-posted from Oracle Fusion Blog.

Wednesday, April 9, 2014

Standards Corner: Basic Auth MUST Die!

Basic Authentication (part of RFC2617) was developed along with HTTP1.1 (RFC2616) when the web was relatively new. This specification envisioned that user-agents (browsers) would ask users for their user-id and password and then pass the encoded information to the web server via the HTTP Authorization header.

The Basic Auth approach quickly died in popularity in favour of form-based login, where browser cookies were used to maintain the user session rather than repeated re-transmission of the user-id and password with each web request. Basic Auth was clinically dead and ceased being the "state-of-the-art" method for authentication.

These days, now that non-browser based applications are increasing in popularity, one of the first asks by architects is support for Basic Authentication. It seems the Basic Authentication "zombie" lives on. Why is this? Is it for testing purposes?

Why should Basic Authentication die?

Well, for one, Basic Auth requires that web servers have access to "passwords", which have continually been shown to be one of the weakest points in security architecture. Further, it requires that the client application ask users directly for their user-id and password, greatly increasing the points of attack available to a hacker. A user giving an application (whether a mobile application or a web site) their user-id and password is allowing that application to impersonate the user. Further, we now know that password re-use continues to undermine this simple form of authentication.

There are better alternatives.

A better alternative uses "tokens", such as the cookies I mentioned above, to track client/user login state. An even better solution, not easily done with Basic Auth, is to use an adaptive authentication service whose job is to evaluate not only a user's id and password but also multiple additional factors for authentication. This goes beyond something you know, to something you are and something you have. Many service providers are even beginning to evaluate network factors as well, such as whether the user has logged in from this IP address and geographical location before.

In order to take advantage of such an approach, the far better solution is to demand OAuth2 as a key part of your application security architecture for non-browser applications and APIs. Just like form-based authentication dramatically improved browser authentication in the 2000s, OAuth2 (RFC6749 and 6750), and its predecessor, Kerberos, provide a much better way for client applications to obtain tokens that can be used for authenticated access to web services.
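
To make the contrast concrete, here is a hedged Python sketch (the URL and token values are placeholders) of the same API call made with Basic Auth and then with an OAuth2 bearer token; only the first requires the client to hold the user's password:

  import requests

  api = "https://api.example.com/v1/me"   # hypothetical protected resource

  # Basic Auth: the client must possess and transmit the user's password
  r1 = requests.get(api, auth=("alice", "her-password"))

  # OAuth2 bearer token: a scoped, expirable token obtained from an
  # authorization server stands in for the password
  r2 = requests.get(api, headers={"Authorization": "Bearer mF_9.B5f-4.1JqM"})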

Token authentication is far superior because:
  • Tokens cleanly separate user authentication and delegation from the application's activities with web services.
  • Tokens do not require that clients impersonate users. They can be highly scoped and restrictive in nature.
  • The loss of a token means only a single service is compromised, whereas the loss of a password compromises every site where that user-id and password are used.
  • Tokens can be issued by multi-factor authentication systems.
  • Tokens do not require access to a password data store for validation.
  • Tokens can be cryptographically generated and thus can be validated by web services in a "stateless" fashion (not requiring access to a central security database).
  • Tokens can be easily expired and re-issued.
RFC 2617 Basic Authentication is not only dead. It needs to be buried. Stop using it. You can do it!

Cross-posted from Oracle Fusion Blog.

Friday, February 14, 2014

New IETF SCIM drafts - Revision 03 Details

Yesterday, the IETF SCIM (System for Cross-domain Identity Management) Working Group published new draft specification revisions.

This revision was essentially a clean-up of the specification text into IETF format, along with a series of clarifications and fixes that greatly improve the maturity and interoperability of the SCIM drafts. SCIM has had a number of outstanding issues to resolve, and in this revision we managed to knock off a whole bunch of them - 27 in all! More change-log details are available in the appendix of each draft.

Key updates include:

  • New attribute characteristics: 
    • returned - When are attributes returned in response to queries
    • mutability - Are attributes readOnly, immutable, readWrite, or writeOnly
    • readOnly - this boolean has been replaced by mutability
  • Filters
    • A new "not" negation operator added
    • A missing ends with (ew) filter was added
    • Filters can now handle complex attributes, allowing multiple conditions to be applied to the same value of a multi-valued complex attribute. For example (a request sketch follows this list):
      • filter=userType eq "Employee" and emails[type eq "work" and value co "@example.com"]
  • HTTP
    • Clarified the response to an HTTP DELETE
    • Clarified support for HTTP Redirects
    • Clarified impact of attribute mutability on HTTP PUT requests
  • General
    • Made server root level queries optional
    • Updated examples to use '/v2' paths rather than '/v1'
    • Added complete JSON Schema representation for Users, Groups, and EnterpriseUser.
    • Reformatting of documents to fit normal IETF editorial practice
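
As a rough sketch of how a client might send the complex filter shown in the list above (the service URL and token are assumptions), the filter is simply passed as a query parameter and URL-encoded by the HTTP library:

  import requests

  BASE = "https://example.com/scim/v2"            # hypothetical SCIM endpoint
  HEADERS = {"Authorization": "Bearer <token>"}   # placeholder credential

  # Complex filter: match Employees with a work email at example.com
  params = {
      "filter": 'userType eq "Employee" and '
                'emails[type eq "work" and value co "@example.com"]'
  }
  resp = requests.get(f"{BASE}/Users", params=params, headers=HEADERS)
  for resource in resp.json().get("Resources", []):
      print(resource["id"], resource.get("userName"))
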
Thanks to everyone in the working group for their help in getting this one out!

Monday, November 4, 2013

Standards Corner: OAuth WG Client Registration Problem

Update: Cross-Posted on the Oracle Fusion Middleware blog.

This afternoon, the OAuth Working Group will meet at IETF88 in Vancouver to discuss some topics important to the maturation of OAuth. One of them is the OAuth client registration problem.

OAuth (RFC6749) was initially developed with a simple deployment model where there is only a monopoly or singleton cloud instance of a web API (e.g. there is one Facebook, one Google, one LinkedIn, and so on). When the API publisher and API deployer are the same monolithic entity, it is easy for developers to contact the provider and register their app to obtain a client_id and credential.

But what happens when the API is for an open source project where there may be thousands of deployed copies of the API (e.g. WordPress)? In these cases, the authors of the API are not the people running the API. In these scenarios, how does the developer obtain a client_id?

An example of an "openly deployed" API is OpenID Connect. Connect defines an OAuth-protected resource API that can provide personal information about an authenticated user, in effect creating a common API that identity providers like Facebook, Google, Microsoft, Salesforce, or Oracle could all offer. In Oracle's case, Fusion applications will soon have RESTful APIs that are deployed in many different ways in many different environments. How will developers write apps that can work against an openly deployed API with whose deployer the developer can have no prior relationship?

At present, the OAuth Working Group has two proposals to consider:

Dynamic Registration

Dynamic Registration was originally developed for OpenID Connect and UMA. It defines a RESTful API in which a prospective client application with no client_id creates a new client registration record with a service provider and is issued a client_id and credential along with a registration token that can be used to update registration over time.
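
To give a sense of the shape of that API, here is a minimal Python sketch of a registration call in the style of the dynamic registration drafts (the endpoint and metadata values are illustrative assumptions, not a normative example):

  import requests

  reg_endpoint = "https://server.example.com/connect/register"  # hypothetical

  # The prospective client describes itself; no client_id exists yet
  metadata = {
      "client_name": "Example Mobile App",
      "redirect_uris": ["https://app.example.com/callback"],
      "token_endpoint_auth_method": "client_secret_basic",
  }
  resp = requests.post(reg_endpoint, json=metadata)
  reg = resp.json()

  # The server responds with a newly minted client_id, a credential, and
  # (in the draft) a registration access token for later updates
  client_id = reg.get("client_id")
  client_secret = reg.get("client_secret")
  registration_token = reg.get("registration_access_token")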

As proof of success, the OIDC community has done substantial implementation of this spec and feels committed to its use.  Why not approve?

Well, the answer is that some of us had some concerns, namely:
  1. Recognizing instances of software - dynamic registration treats all clients as unique. It has no defined way to recognize that multiple copies of the same client are being registered, other than assuming that if the registration parameters are similar it might be the same client.
  2. Versioning and Policy Approval of open APIs and clients - many service providers have to worry about change management. They expect to have approval cycles that approve versions of server and client software for use in their environment.  In some cases approval might be wide open, but in many cases, approval might be down to the specific class of software and version.
  3. Registration updates - when does a client actually need to update its registration?  Shouldn't it be never?  Is there some characteristic of deployed code that would cause it to change?
  4. Options lead to complexity - because each client is treated as unique, it becomes unclear how clients and servers will agree on which credential forms are acceptable and which OAuth features are allowed or disallowed.  Yet the reality is that developers will write their application to work in a limited number of ways. They can't implement all the permutations and combinations that potential service providers might choose.
  5. Stateful registration - if the primary motivation for registration is to obtain a client_id and credential, why can't this be done in a stateless fashion using assertions?
  6. Denial of service - with so much stateful registration and the need for multiple tokens to be issued, will this not lead to a denial-of-service risk or resource depletion?  At the very least, because of the information gathered, it would be difficult for service providers to clean up "failed" registrations and distinguish active from inactive or false clients.
  7. There has yet to be much wide-scale "production" use of dynamic registration other than in small closed communities.

Client Association

A second proposal, Client Association, has been put forward by Tony Nadalin of Microsoft and myself. We took a look at existing usage patterns to come up with a new proposal. At the Berlin meeting, we considered how WS-STS systems work. More recently, I reviewed how mobile messaging clients work. I looked at how Apple, Google, and Microsoft each handle registration with APNS, GCM, and WNS, and a similar pattern emerges. The pattern is to use an existing credential (mutual TLS auth) or a client bearer assertion and swap it for a device-specific bearer assertion.

In the client association proposal, the developer registers with the API publisher (as opposed to the party deploying the API) and obtains a software "statement". Or, if there is no "publisher" that can sign a statement, the developer may include their own self-asserted software statement.

A software statement is a special type of assertion that serves to lock application registration profile information into a signed assertion. The statement is included with the client application and can then be used by the client to swap for an instance-specific client assertion, as defined by section 4.2 of the OAuth Assertion draft and profiled in the Client Association draft. The software statement provides a way for a service provider to recognize and configure policy to approve classes of software clients, and it simplifies the actual registration to a simple assertion swap. Because the registration is an assertion swap, registration is no longer "stateful" - meaning the service provider does not need to store any information to support the client (unless it wants to).
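
Purely as an illustration of the assertion-swap idea (the endpoint and parameter names below are hypothetical and only loosely modelled on the drafts, not their actual wire format), the exchange might look something like this:

  import requests

  assoc_endpoint = "https://as.example.com/oauth2/associate"   # hypothetical

  # The client ships with a publisher-signed software statement (placeholder
  # JWT below) and swaps it for an instance-specific client credential.
  software_statement = "eyJhbGciOiJSUzI1NiJ9.eyJzb2Z0d2FyZV9pZCI6IC4uLn0.sig"

  resp = requests.post(assoc_endpoint, data={
      "software_statement": software_statement,    # hypothetical parameter name
  })
  instance = resp.json()
  client_id = instance.get("client_id")            # instance-specific identity
  client_assertion = instance.get("client_assertion")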

Has this been implemented yet? Not directly. We've only delivered draft 00 as an alternate way of solving the problem using well-known patterns whose security characteristics and scale characteristics are well understood.

Dynamic Take II

At roughly the same time that Client Association and Software Statement were published, the authors of Dynamic Registration published a "split" version of Dynamic Registration (draft-richer-oauth-dyn-reg-core and draft-richer-oauth-dyn-reg-management). While some of the concerns above are addressed, some differences remain. Registration is now a simple POST request. However, it defines a new method for issuing client tokens, whereas Client Association uses RFC6749's existing extension point. The concern here is whether future client access token formats would be addressed properly. Finally, dyn-reg-core does not yet support software statements.

Conclusion

The WG has some interesting discussion ahead to bring this back to a single set of specifications. Dynamic Registration has significant implementation experience behind it, but Client Association could be a much better way to simplify implementation of the overall OpenID Connect specification and improve adoption. In fairness, the existing editors have already come a long way. Yet there are those with significant investment in the current draft. And there are many who have expressed that they don't care; they just want a standard. There is a lot of pressure on the working group to reach consensus quickly.

And that folks is how the sausage is made.

Note: John Bradley and Justin Richer recently published draft-bradley-stateless-oauth-client-00, which on first look gets closer. Some of the details seem less well defined, but the same could be said of client-assoc and software-statement. I hope we can merge these specs this week.

Tuesday, August 27, 2013

New Draft for Enabling OAuth2 To Be Used for Authentication

In my last blog post, I discussed the issue of OAuth2 and authentication:  Simple Authentication for OAuth 2? What is the Right Approach? As promised, I submitted a draft to the IETF for discussion in Berlin at the beginning of the month. While the working group didn't get a lot of time in the meeting to talk about the authentication issue (it wasn't formally on the charter), the submission did receive quite a bit of discussion in the hallways and on the IETF OAuth WG mailing list -- some of which I have already addressed in my last blog post.

Since the Berlin meeting, I have reviewed the feedback and have submitted an update. Draft 01 is now aligned to be compatible with OpenID Connect. In other words:
  • If you are a web developer, and the only problem you want to solve today is "how can I use OAuth to authenticate my users?", the new User Authentication and Consent for Clients draft is intended to solve just that problem.
  • But if you decide you want to add attribute provider (user profile) services to your requirements, you can upgrade to the OpenID Connect drafts as an extension to authentication without impacting existing code.
The new draft also allows clients to request a minimum Level of Assurance (per NIST-800-63 or ISO 29115) for the authentication -- a useful feature if the client would like the service provider to perform a higher-assurance authentication (such as multi-factor or biometric) than it might otherwise normally perform.

My thanks to Tony Nadalin and Mike Jones of Microsoft for their contributions to this draft.

Thursday, February 28, 2013

Standards Corner: Tokens. Can You Bear It?

This week's post is all about tokens. What are the different types of tokens that may be used in RESTful services? How are they the same/different from browser cookies? What are access tokens, artifacts, bearer tokens, and MAC tokens?

If I asked you what tokens are used for, many of you would answer "authentication". But there is a bit more to it than that. First, I'd like to point you to a post I wrote on my personal blog called "3 Parts to Authentication".

In this post, authentication is described as a process broken down into 3 parts:
1. Registration
2. Credential Presentation
3. Message Authentication

What's important here is that many often confuse the process of credential presentation with message authentication. Credential presentation is the process whereby a user or an HTTP client application demonstrates (with one or more factors) that the user or client in question is the same one that was previously registered. Having successfully completed the credential presentation process, the authenticator issues a cookie or token which can be used for a period of time as a means of message authentication -- creating a single-sign-on session.

Today's post focuses on step 3: using cookies or tokens to access web resources. In browsers, cookies are added to requests in order to allow web sites to perform message authentication, in effect providing single sign-on. HTTP client applications use tokens in much the same way. They pass tokens, given to them by an authorization server, in the HTTP Authorization header of requests to achieve the same thing cookies do for browsers. In the case of tokens issued by an OAuth2 authorization server (as with Kerberos and others), we call these tokens "access tokens" because they are used to access web resources.

Broadly speaking, there are two categories of tokens web sites may accept: bearer tokens and proof tokens. Bearer tokens work very much like browser cookies. They can be a simple unique identifier (aka an artifact), or they can be encoded strings that have meaning to the web sites they are intended for. To the client, however, these tokens are just opaque text strings that need to be passed to the web site in order to access a resource (message authentication). Because the client doesn't have to do anything but attach a bearer token to its request, bearer tokens are very simple for client developers to use. For more details on bearer tokens, check out RFC 6750, The OAuth 2.0 Authorization Framework: Bearer Token Usage.

While bearer tokens are incredibly simple and easy to use, there is a downside. Any client that obtains a copy of the bearer token may use it. Simple possession is enough to access the web resource (hence the term bearer). So, a critical limitation of bearer tokens is that they SHOULD NOT be used over plain HTTP, since they can be sniffed and copied. Web sites accepting tokens should consider whether there is a possibility that access tokens could be sniffed or otherwise shared, and whether that imposes a risk. Because of this, the IETF OAuth Working Group is now working on requirements for Holder-of-Key tokens (aka proof tokens). This document describes in detail the kinds of problems that could be solved and attempts to arrive at a set of use case requirements for a final token specification.

Proof tokens require an HTTP client to perform some kind of calculation that shows that only it could have used the token (such as with a private key or other shared secret). With a HoK token, a client could be required to generate a request signature, and even add a counter to prevent playback attacks, in addition to providing simple proof of the client's right to use the token. An example of this is the MAC token draft. The OAuth2 Working Group is debating whether this specification should move forward or whether a simpler specification based on JSON Web Tokens (JWT) should be developed.
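
To illustrate the proof-of-possession idea in the abstract (this is a conceptual sketch, not the MAC token draft's actual format; the key, token id, and string-to-sign layout are assumptions), the client signs characteristics of each request with a secret only it holds:

  import hashlib, hmac, time, uuid

  # Issued by the authorization server along with the access token
  mac_key = b"shared-secret-issued-with-token"     # placeholder secret
  access_token = "h480djs93hd8"                    # placeholder token id

  # Sign the request characteristics so the token alone is not enough
  nonce = uuid.uuid4().hex
  timestamp = str(int(time.time()))
  string_to_sign = "\n".join(["GET", "/resource/1", "example.com",
                              timestamp, nonce])
  signature = hmac.new(mac_key, string_to_sign.encode(),
                       hashlib.sha256).hexdigest()

  # The request carries the token id, nonce, timestamp, and signature;
  # a copied token is useless without the key needed to re-compute the MAC.
  auth_header = (f'MAC id="{access_token}", ts="{timestamp}", '
                 f'nonce="{nonce}", mac="{signature}"')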

So, in many ways, tokens build on the experience the industry has had for many years with browsers and single-sign-on cookies. Tokens wielded by HTTP clients accessing RESTful web resources achieve the same features we've taken for granted with browsers. Bearer tokens are easy for most clients to use, but require secure connections to prevent sniffing. Proof/HoK tokens can be used where connections to web resources are unprotected, or where further proof of the right to use a token is needed.

Cross-posted from OracleIDM.

Thursday, February 14, 2013

3 Parts to Authentication

At the IETF85 meeting in Atlanta, I ran into Phillip Hallam-Baker after a meeting on HTTP Authentication (you may recall, Phillip is one of the editors of RFC2617 - Basic and Digest Access Authentication). We were talking about how the term "authentication" is very poorly defined and means different things to different people and different service components.

Phil pointed me to a WG draft he put together as input to the HTTP Working Group - "HTTP Authentication Considerations". In section 2, he points out that authentication has several parts:

  • registration, 
  • credential presentation, and 
  • message (aka session) authentication.

The term 'authentication' tends to cause confusion due to the fact that there are actually three separate activities that it can refer to. When Alice establishes an account at a Web site, the site may verify her email address is correct. When Alice presents her username and password, the site verifies the data and if correct issues a HTTP cookie. When Alice re-visits the same Web site to make further requests, the cookie is verified each time. Each of these verifications is a form of authentication but they are totally different in character and only the last of these is a form of authentication that is directly related to HTTP.
Attempts have been made to distinguish between these as 'initial authentication' and 're-authentication' but this also creates confusion as some people consider the first contact with the user as the 'initial' authentication and others consider that to be the start of a Web session.
I have seen a lot of confusion about the purpose of access tokens in OAuth. Some feel that there should be strong authentication with every HTTP request. But when you look at how Phil lays it out, it's clear that the job of message authentication is simply to maintain session state across multiple HTTP requests. It has very different requirements from credential presentation.

Phil also has some interesting commentary on some of the problems with passwords like promiscuity, recovery, phishing, and lock-in issues. Many of us have for some time been on a rant about killing the password - but Phil talks plainly about what 'passwords get right'.

This is a great read.

OAuth2: Is OAuth the End of SAML? Or a New Opportunity?

I mentioned in my year-in-review post that rather than spell the end of SAML, OAuth2 might in fact greatly expand SAML's adoption. Why is that?

The OAuth2 Working Group is nearing completion on the OAuth2 SAML Bearer draft, which defines how SAML bearer assertions can be used with OAuth2, essentially replacing less secure user-ids and passwords with more secure federated assertions.

Before I describe how this works, here is some quick terminology:
  • Resource Service - A service offering access to resources, some or all of which may be "owned" or "controlled by" users known as "Resource Owners".
  • Resource Owner - An end user, who is authorizing delegated scoped access by a client to resources offered by a Resource Service
  • Client - An application (e.g. mobile app, or web site) that wants to access resources on a Resource Service on behalf of a Resource Owner.
  • Authorization Service - A service authorized to issue access tokens to Clients on behalf of a resource server.
While the resource service and the authorization service may be authenticated by means of a TLS domain name certificate, both the client application and the end user often need to be authenticated as well. In "classic" OAuth, you can use simple user-ids and passwords for both. The SAML2 Bearer draft describes how federated SAML assertions can be used instead.

A typical scenario goes much like this.
[Diagram: OAuth2 SAML Bearer Flow (OAuth2 with SAML)]
  1. Alice (resource owner) accesses a corporate travel booking application. 
  2. In order to log into the corporate travel application, Alice is redirected to her employer's Identity Provider to obtain a SAML Authentication Assertion. 
  3. Upon logging in to the Corporate Travel Application, Alice wishes to update her seat preferences with her selected airline. In order to do this, the corporate travel application goes to the authorization server for the airline. The travel application provides two SAML authentication assertions: 1) An assertion representing the identity of the client application, and 2) an assertion representing Alice. The scope requested is "readProfile seat". 
  4. Upon verifying the SAML assertions and delegated authority requested, the authorization server issues an access token enabling the corporate travel application to act on behalf of Alice.
  5. Upon receiving the access token, the corporate travel app is then able to access the frequent flyer account web resource by passing the token in the header of the HTTP Request. The Access token, acts as a session token that encapsulates the fact that the travel app is acting for Alice with scope read & seat update. 
This SAML bearer flow is actually very similar to the classic OAuth 3-leg flow. However, instead of redirecting the user's browser to the authorization server in the first leg, the corporate travel app works with the user's IDP to obtain a delegation (or simple authentication) assertion directly. Instead of swapping a code in the second leg, the client app now swaps a SAML bearer assertion for the user.
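
A hedged sketch of steps 3 and 4 follows (the token endpoint, scope, and assertion contents are placeholders, loosely following the shape of the OAuth SAML bearer and assertion drafts). The travel application presents one assertion for Alice as the grant and another as its own client credential:

  import base64
  import requests

  token_endpoint = "https://airline.example.com/oauth2/token"   # hypothetical

  def b64url(xml_assertion: str) -> str:
      """Base64url-encode a SAML assertion for transport in a form parameter."""
      return base64.urlsafe_b64encode(xml_assertion.encode()).decode()

  resp = requests.post(token_endpoint, data={
      # Assertion representing Alice (the resource owner)
      "grant_type": "urn:ietf:params:oauth:grant-type:saml2-bearer",
      "assertion": b64url("<saml:Assertion>...Alice...</saml:Assertion>"),
      # Assertion representing the corporate travel application (the client)
      "client_assertion_type":
          "urn:ietf:params:oauth:client-assertion-type:saml2-bearer",
      "client_assertion": b64url("<saml:Assertion>...TravelApp...</saml:Assertion>"),
      "scope": "readProfile seat",
  })
  access_token = resp.json().get("access_token")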

OAuth2's ability to leverage different authentication systems makes it possible for SAML to enhance OAuth2 security, going even further to eliminate the propagation of dreaded user-ids and passwords in much the same way SAML did for classic federated web sign-on. Rather than making SAML redundant, OAuth2 has in fact increased SAML's utility.

Note: this post cross-posted from the OracleIDM blog.

Wednesday, March 14, 2012

SCIM - What Should A New SCIM WG Address?

In my last blog post, I mentioned that SCIM 1.0 defines a simple provisioning API for cloud application service providers. SCIM is architecturally oriented as a connector API specification in a hub-and-spoke architecture, typically with an enterprise provisioning system at the hub and a cloud application service provider as a spoke. Other variations could include provisioning for on-premise SaaS applications as well as directory synchronization. For each cloud application, the enterprise IDM hub should be able to simply invoke the SCIM RESTful API of a target application's SCIM provisioning end-point.

But is SCIM about to repeat much of the history of SPML? Has it corrected some missteps? Yes, definitely. Is that enough? Let's look at some of the historical issues that are relevant to the evolution of SCIM. Just to be clear, my comments are not to suggest that SCIM adopt SPML features; they are intended so that SCIM can learn from SPML's history.

The Value Problem

SPML 1 was very much like SCIM 1.0 is now: a simple API that supported basic CRUD operations. When SPML 1 was developed, the proposed value proposition was that if applications adopted a standard IDM protocol, provisioning of enterprise applications would become easier. The value to application developers was that a simpler, standardized management API would not be specific to individual IDM vendors and could inter-operate with any IDM provisioning product.

From the enterprise perspective SPML 1 made a lot of sense since it would make all applications provision the same way. They could pick and choose the IDM product they wanted to use. More importantly, enterprises would not have to pay for custom coding when attempting to provision to proprietary APIs of applications.

SPML 1 was somewhat successful, but before it could be broadly adopted, several new requirements emerged and SPML 2 was defined (though SPML 1 remains dominant). SPML 2 introduced many new features such as:
  • the clean separation of payload from protocol; 
  • the introduction of new common IDM operations (e.g. password operations);
  • a formalized DSML/XSD profile;
  • targeting - the ability to provision accounts through a gateway; and,
  • an extension mechanism for registering capabilities so that contributed capabilities could be made inter-operable.
Yet application vendors wanted more: they wanted standard schema conventions, and they wanted a standard that freed them from introducing individual IDM vendor dependencies. If they could write an SPML provider once and be done with it, their costs would go down.

Many IDM vendors were concerned that SPMLv2 had gone too far. In the end, either the perceived complexity was too great or the basic value proposition was not enough for SPML to succeed.

Has SCIM moved the ball forward? On one important point, the answer is yes. SCIM has put forward a well-defined schema with a clear definition of attributes and their use or meaning. The RESTful style of SCIM also keeps schema cleanly separated from protocol.

The Information Semantics Fidelity Trade-off

IDM provisioning product developers have always faced an engineering trade-off. Would a standardized provisioning protocol/API lower development costs? Each application is unique, and therefore each application's API often has highly specific semantics and contextual meanings. While a standardized SCIM or SPML API saves money initially, does it mean a loss of "fidelity" or functionality? Do different systems treat the notion of person or user in the same way? What does "delete person" mean? In translating information semantics, is the mapping intelligence in the hub, in the spoke, or somewhere in between? The engineering question is: should the provisioning system understand the true nature of the application, or should the application understand provisioning systems and behave like an identity store? In my experience, there's no clear answer. It depends on the nature of the application.

Does SCIM help in this regard? That is yet to be determined. The SCIM community will need to discuss how to handle high-level IDM operations such as suspend vs. delete, password resets, federation, and other deeply IDM-specific concerns, and how these are operationally mated with a diverse community of application service APIs.

The Gateway Problem

Corporations that are organized into divisions often end up with different independent IT organizations and outsourced providers, especially after corporate re-organizations, acquisitions, and divestitures. In these cases, single-hub provisioning systems often become impractical. While some may view this as a rare situation, the whole idea of cloud-based apps hosted externally makes this situation de rigueur.

In these cases a key provisioning architecture element is the ability to support provisioning gateways and hub-to-hub provisioning. Gateways (or proxies) serve a dual purpose: they firewall direct access to internal services, and they greatly simplify network complexity for inter-organization communication. As well as solving basic firewalling issues, gateways can also support mapping functions, translating a standardized provisioning protocol like SCIM into application-specific connector protocols for services like CRM OnDemand that may or may not have built-in support for a protocol such as SCIM.

Since a gateway acts as a "proxy" to other connected SaaS services, SCIM needs the ability to route or "target" operations to specific application end-points. SPML 2.0, and now RESTpml/SIMPLEST, supports targeting. Targeting enables a provisioning "hub" to indicate to a provisioning "gateway" that a particular person requires an account in a particular target system. In the diagram above, Alice, employee 1234, is to be provisioned into the "Finance" application.

SCIM with routing/targeting becomes a critical communication protocol for hub-to-hub and hub-to-gateway provisioning. Unlike SPML implementations of the past, inter-operability becomes a key requirement, because in the world of cloud provisioning it is more likely that gateway and hub implementations will come from different provisioning product developers.

The Cloud Does Change Everything

SPML was built for a world where everything occurred inside an enterprise. But the requirements for cloud identity management are substantially different. Cloud based provisioning architecture must take into account:
  • Performance and Scalability – A lightweight HTTP protocol such as REST/JSON is a cornerstone requirement when provisioning cloud environments with hundreds of millions of users.
  • Firewall requirements – securely connecting directly to application APIs (standardized or not) will likely require some special sauce. It's not reasonable to expect all application end-points to be able to support this in the cloud.
  • Cloud Providers are often "hubs" themselves – since cloud providers offer more than one application service, cloud providers may behave more like "hubs" than spokes.
  • Cloud Providers With Value-Added Data – some cloud providers may have provisioning and identity management systems of their own. This suggests that cloud hubs may need to flow identity data back to the enterprise.
  • Entitlement Reporting – A big requirement for provisioning these days, driven by SOX, is entitlement reporting. Further, when you are paying an external cloud provider for services rendered, you want to make sure you are paying for the correct employees to use cloud services. A key function of provisioning systems is to report back the assigned rights of all users across all applications, especially through cloud "hubs".
  • Inter-operability – no longer can we assume hubs and gateways are provided by a single vendor. Cloud-based provisioning will almost always be multi-vendor based.
What Should The New SCIM WG Address?

The main success of SCIM has been a standardized schema. It defines the attributes and says what each means -- something that application vendors always wanted. This is goodness. Yet, there are some gaps when you start to consider the overall provisioning system that will emerge from SCIM's adoption.

A couple of scope items that the future IETF SCIM WG should be considering:

  • Routing or targeting – SCIM needs to have a way to handle updates through gateways and hub-to-hub relationships for supporting multi-service cloud providers.
  • Persons as distinct from Users – Currently SCIM combines these entities together in a simple form. The reality is that in the hub, persons hold multiple user accounts. Is a change needed to SCIM schema to support managing the relationships between persons and their user accounts? This may not need change, but wider discussion is needed.
  • Peer relationships – Cloud providers with hubs may need to be able to flow updates back to client hubs.
  • Reporting – attestation is a key component of provisioning. Not only will clients want to be able to reconcile what cloud providers are charging for, but clients also still have requirements driven by Sarbanes-Oxley. SPML's approach was burdensome. Could SCIM support the ability for a client "hub" to get the information it needs to accomplish this in a lightweight way in the spirit of SCIM?


In my next post, I'll have more details on how I think the routing issue could be addressed.

With thanks to Gary Cole and Mark Diodati for their wisdom and input.


Note well: the comments posted here are my personal comments and are intended as input to the IETF and are subject to the rules of RFC 5378 and RFC 3979 (updated by RFC 4879).



Monday, March 12, 2012

Simple Cloud Identity Management - Getting Started

Good news! The folks behind SCIM have decided to begin the process to formalize SCIM at the IETF. To kick things off, there will be a birds-of-a-feather session planned for the upcoming IETF meeting in Paris at the end of the month.
The above diagram shows the typical scenario that SCIM attempts to solve. The perspective of SCIM is to provide a common RESTful API for cloud SaaS providers that enterprises could use to provision accounts. Instead of an enterprise having to provision users to many cloud providers using many different APIs, SCIM proposes a simple provisioning API that all application service providers could support.

SCIM's deployment architecture is a simple hub-and-spoke model where the enterprise IDM system is at the "hub" and each cloud service provider is a spoke. The idea behind SCIM is that each spoke is enabled by a standardized 'connector' exposing the SCIM RESTful API. Without SCIM, the alternative is that enterprise provisioning systems have to support many different proprietary service APIs.
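
For a flavour of what that common connector API looks like, here is a minimal Python sketch of a SCIM 1.0-style user-creation call (the service URL and credential are assumptions); every spoke exposes the same /Users endpoint instead of a proprietary provisioning API:

  import requests

  SCIM_BASE = "https://saas.example.com/scim/v1"   # hypothetical provider endpoint
  HEADERS = {"Authorization": "Bearer <token>",    # placeholder credential
             "Content-Type": "application/json"}

  new_user = {
      "schemas": ["urn:scim:schemas:core:1.0"],
      "userName": "bjensen",
      "name": {"givenName": "Barbara", "familyName": "Jensen"},
      "emails": [{"value": "bjensen@example.com", "type": "work"}],
  }
  resp = requests.post(f"{SCIM_BASE}/Users", json=new_user, headers=HEADERS)
  created = resp.json()     # includes the provider-assigned "id"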

So far, I'm impressed with SCIM. It does the job it was designed for. But does it solve all the requirements for cloud provisioning? I'll get into that in my next blog post.

For more information, check out the SCIM mailing list at IETF.

Friday, October 7, 2011

IIW XII: Adding Identity Information to OAuth2

IIW XII is coming up October 18-20, so I thought I'd share with you a couple of discussions I'd like to open at IIW. The first one is how best to add user identity or authentication information to OAuth2.

The OAuth Identity Information Problem
As many of you know, OAuth2 enables a process whereby a client application, as authorized by a user, is issued a "valet key" (a token) for accessing protected resources controlled by the same user at some resource site/service provider.

One limitation of OAuth "valet keys" is that client applications do not receive information about the authorizing user. From the client application's perspective, the authorizing person is anonymous unless the underlying resource API (e.g. a social graph API) can reveal information about the authorizing user. To date, this has been the typical technique used with Twitter and Facebook. But what if, instead of a social graph API, the resource being accessed by a client application is a bank account, a set of medical test results, or a set of photos? Obviously not all resource service providers provide identity information. What if there was a standard way for client applications to obtain information about the authorizing user?

OpenID Connect's Proposal
Members of the OpenID Foundation have already been working on this. OpenID Connect has been touted by many as an important addition to OAuth2 because it would define a way for an artifact to be passed to the client application that the client could use to obtain more information--or, so I thought. Instead, the current "connect" specification suggests that an ID_token be given directly to the client in addition to the resource access token that shares basic profile information. Is this sharing of information going to be acceptable to all OpenID Providers? Shouldn't there be a trust mechanism between client applications and identity providers (OPs)?

Further, the "Connect" specification seems to assume that only OpenID Providers would be used by a resource provider and its OAuth authorization server. What happens if resource providers want to use multiple types of authenticating services such as local authentication via LDAP or federation through SAML? It seems there is something missing with the current Connect proposal. We need a way for OAuth to hand off identity information while not mandating a single protocol solution. We need a way for multiple federation protocols to work at the same time in an OAuth based service.

While I liked the original idea of OpenID Connect, I think the hand-off between OAuth2 and identity sources needs to be more flexible.

Proposing an OAuth Identity Draft Discussion
A new IETF OAuth Identity draft that lays the foundation for resource sites to be able to share artifacts that enable clients to obtain identity and personal information seems to be needed. It would enable an open solution whereby resource sites could freely choose from the many types of identity providers out there. A core OAuth Identity specification could leave room for differentiated service approaches (e.g. session management) between OpenID, SAML or any other protocol, while defining the key inter-op/hand-off points with OAuth2. Based on such an IETF draft, individual protocol profiles could then lay out the details of how particular protocols such as OpenID, or SAML, or other APIs/protocols should work in relation to OAuth2.

Comments? Thoughts?

If you are going to IIW XII, I'll have a quick ppt with some thoughts on the issues to get the discussion going. Let me know if you are interested in contributing.

ps. Don't forget to register for IIW.

Wednesday, March 9, 2011

Lightweight Web Services

There has been growing interest in a group of protocols and formats, namely HTTP, REST, OAuth, and JSON, and how they can support web services. REST and JSON have been around for a while, but one of the puzzling problems has been how to handle authentication in REST, especially for non-browser clients using HTTP. So far the only options have been BASIC authentication or SSL/TLS mutual authentication, neither of which has been adequate (but that's a whole other blog post). More recently, however, OAuth2 has emerged and offers some possibilities, especially for access to user-controlled resources.

Another reason for interest in these protocols has been the emergence of cloud services and smart phones. Instead of using traditional web services such as WS-*, cloud service providers are opting for lighter-weight, quicker-to-implement approaches that focus on basic HTTP. Smart phones, with their increasingly popular 'app stores' and obvious need to be lightweight, also figure heavily in this surge of interest in OAuth, REST, and JSON. It occurs to me that the common theme here is a drive towards something I'll call "lightweight web services".

Pragmatic cloud proponents argue that WS-* and other specifications like SAML, ID-WSF and so on have become bloated and unworkable. They are just too much for application developers to handle. Why not get 'lightweight' and use specifications like PortableContacts to transfer personal information? Traditionalists argue about security, privacy, and other important aspects. In contrast, lightweight web services rely on transport-layer security to do most of their work.

Are proponents trading off security, inter-operability, and flexibility for one-shot-at-a-time lightweight services? Let's take a look at the key technologies/standards that comprise lightweight web services so far and talk about some of the challenges/drivers going forwards...

HTTP is the foundation upon which Lightweight Web Services are built. The founding protocol on the web, HTTP is getting a new look as new application clients begin using HTTP rather than just browsers. Driven by social media and the emergence of smart-phone applications and cloud services, HTTP is now the foundation protocol upon which both browsers and application clients are accessing resources and services on the web.

REST has been around for a while. Early web systems used REST-like calls in the 90s, before the term REST even existed. REST creates simple, easy-to-document, URL-centric APIs that seem friendlier to developers. The emergence of social network APIs (e.g. the Facebook Graph API) is a good example. It seems that many developers would rather trade discovery-based code generation (e.g. facilitated by WSDLs) for simple-to-read web site documentation and manual code writing against a simple REST API.

JSON, or JavaScript Object Notation, is the new XML. It enables simple results from REST-based service calls to be returned to clients. From Wikipedia:
"It is derived from the JavaScript programming language for representing simple data structures and associative arrays, called objects. Despite its relationship to JavaScript, it is language-independent, with parsers available for most programming languages."
OAuth2, originally a method of delegating authorization to access web services (typically used in social media), is quickly becoming a badly needed authentication/authorization service for non-browser web application clients. While browser authentication long ago migrated from BASIC authentication (defined by RFC 2617) to forms-based authentication supported by cookies, OAuth gives the new browser-less client applications a needed method to authenticate to web services using the HTTP Authorization header.

Shaping Lightweight Web Services
Lightweight web services have come on so strong that proponents have generated "need it yesterday" demand for features that aren't yet defined or standardized. Some of these features are critical, while others are debatable. At present, there is still no standardized authentication token suitable for non-browser web service clients, no signing and/or encryption of content (other than TLS), no concept of message handling, and much more. Are we rushing to re-invent here because of a desire to have a single tool for all jobs? Or is this just a case of building out REST services to a supportable, secure level of some sort?

The lightweight web has so far been a loosely associated set of technologies with some interesting design patterns in common. As enterprises are quickly joining the community, it seems important that lightweight web services gain a more formal status with discussion in a new working group.

I invite everyone to help further define what are Lightweight Web Services and to help define a WG to help steer the development of relevant IETF and non-IETF standards that make up lightweight web services.