Streaming Trust

In a recent discussion, Marie Wallace shared a wonderful analogy for verifiable credentials that I think helps explain how credentials will be adopted. She compares traditional approaches to identity and newer, decentralized approaches to the move from music CDs to streaming. I'm old enough to remember the hubbub around file sharing. As this short video on Napster shows, the real winner in the fight against piracy was Apple and, ultimately, other streaming services.

Apple changed the business model for online music from one focused on sales of physical goods to one focused on licensing individual tracks. They launched the iPod and the iTunes music store and used their installed user base to bring the music companies to the table. They changed the business model and that ultimately gave birth to the music streaming services we use today.

So, what's this got to do with identity? Most of the online identity services we use today are based on a centralized "identity provider" model. In the consumer space, this is the Social Login model, where you use your account at Google, Facebook, or some other service to access some third-party online service. But the same thing is true on the workforce side, where each company creates great stores of customer, employee, and partner data that they can use to make authorization decisions or federate with others. These are the CDs of identity.

The analog to streaming is decentralized or self-sovereign identity (SSI). In SSI the source of identity information (e.g., a driver's license bureau, bank, university, etc.) is called the issuer. They issue credentials to a person, called the holder, who carries various digital credentials in a wallet on their phone or laptop. When someone, called the verifier, needs to know something about them, the holder can use one or more credentials to prove attributes about themselves. Instead of large, centralized collections of data that companies can tap at will, the data is streamed to the verifier when it's needed. And only the attributes that are germane to that exchange need to be shared. Cryptography ensures we can have confidence in the payload.

The three parties to credential exchange

Identity streaming has several benefits:

  • Confidence in the integrity of the data is increased because of the underlying cryptographic protocols.
  • Data privacy is increased because only what needs to be shared for a given interaction is transmitted.
  • Data security is increased because there are fewer large, comprehensive troves of data about people online for hackers to exploit.
  • The burden of regulatory compliance is reduced since companies need to keep less data around when they know they can get trustworthy information from the holder just in time.
  • The cost of maintaining, backing up, and managing large troves of identity data goes away.
  • Access to new data is easier because of the flexibility of just-in-time attribute delivery.

And yet, despite these benefits, moving from big stores of identity data to streaming identity when needed will take time. The big stores already exist. Companies have dedicated enormous resources to building and managing them. They have integrated all their systems with them and depend on them to make important business decisions. And it works. So why change it?

The analogy also identifies the primary driver of adoption: demand. Napster clearly showed that there was demand for online music. Apple fixed the business model problem. And thousands of businesses were born or died on the back of this change from CDs to streaming.

Digital credentials don't have the same end-user demand pull that music does. Music is emotional, and the music industry was extracting huge margins by making people buy an $18 CD to get the one song they liked. People will likely love the convenience that verifiable credentials offer, and they'll be more secure and private, but that's not driving demand in any appreciable way. I think Riley Hughes, CEO of Trinsic.id, is on to something with his ideas about digital trust ecosystems. Ecosystems that need increased trust and better security are likely to be the real drivers of this transition. That's demand too, but of a different sort: not demand for credentials themselves, but for better models of interaction. After all, people don't want a drill, they want a hole.

Verifiable data transfer is a powerful idea. But to make it valuable, you need a trust gap. Here's an example of a trust gap: early on, the veracity and security of websites was a big problem, and as a result, many people were scared to enter their credit card number into a web form. The trust gap was that there was no way to link a domain name to a public key. Transport Layer Security (TLS, also known as SSL) uses digital certificates, which link a domain name to a public key (and perhaps other data) in a trustworthy way, to plug the gap.

There are clearly ecosystems with trust gaps right now. For example, fraud is a big problem in online banking and ecommerce. Fraud is the symptom of a trust gap between the scam's target and the legitimate actor that they think they're interacting with. If you can close this gap, then the fraud is eliminated. Once Alice positively knows when she's interacting with her bank and when she's not, she'll be much harder to fool. Passkeys are one solution to this problem. Verifiable credentials are another—one that goes beyond authentication (knowing who you're talking to) to transferring data in a trustworthy way.

In the case of online music, the solution and the demand were both there, but the idea wasn't legitimate in the eyes of the music industry. Apple had the muscle to bring the music industry to the table and help them see the light. They provided much-needed legitimacy to the idea of online music purchases and, ultimately, streaming. They didn't invent online music; rather, they created a viable business model for it and made it valuable. They recognized demand and sought out a new model to fill the demand. Verifiable credentials close trust gaps. And the demand for better ways to prevent fraud and reduce friction is there. What's missing, I think, is that most of the companies looking for solutions don't yet recognize the legitimacy of verifiable credentials.


Internet Identity Workshop 36 Report

IIW 36 Attendee Pin Map

We recently completed the 36th Internet Identity Workshop. Almost 300 people from around the world called 160 sessions. The energy was high, and I enjoyed seeing so many people who are working on identity talking with each other and sharing their ideas. The topics were diverse, but I think it's fair to say that verifiable credentials were a hot topic. And while there were plenty of discussions about technical implementations, I think those were overshadowed by sessions discussing credential business models, use cases, and adoption. We should have the Book of Proceedings completed in about a month, and you'll be able to get the details of sessions there. You can view past Books of Proceedings here.

As I said, there were attendees from all over the world, as you can see by the pins in the map at the top of this post. Not surprisingly, most of the attendees were from the US (219), followed by Canada (20). Germany, the UK, and Switzerland rounded out the top five with 8, 7, and 6 attendees respectively. The next five, Australia (5), South Korea (3), Japan (3), Indonesia (3), and Colombia (3), showed the diversity of the gathering, with attendees from APAC and South America. Sadly, there were no attendees from Africa this time. Please remember we offer scholarships for people from underrepresented areas, so if you'd like to come to IIW37, please let us know.

In terms of states and provinces, California was, unsurprisingly, first with 101. New York (17), Washington (16), Utah (15), and British Columbia (11) rounded out the top five. Victoria was the source of BC's strong showing, coming in fifth among cities with 8 attendees, after San Jose (15), San Francisco (13), Seattle (12), and New York (10).

The week was fabulous. I can't begin to recount the many important, interesting, and timely conversations I had. I heard from many others that they had a similar experience. IIW 37 will be held Oct 10-12, 2023 at the Computer History Museum. We'll have tickets available soon. I hope you'll be able to join us.


OAuth and Fine-grained Access Control

Learning Digital Identity

Some of the following is excerpted from my new book Learning Digital Identity from O'Reilly Media.

OAuth was invented for a very specific purpose: to allow people to control access to resources associated with their accounts, without requiring that they share authentication factors. A primary use case for OAuth is accessing data in an account using an API. For example, the Receiptify service creates a list of your recent listening history on Spotify, Last.fm, or Apple Music that looks like a shopping receipt. Here's a sample receipt of some of my listening history.

Receiptify Listening History

Before OAuth, if Alice (the resource owner) wanted Receiptify (the client) to have access to her history on Spotify (the resource server), she'd give Receiptify her username and password on Spotify. Receiptify would store the username and password and impersonate Alice each time it needed to access Spotify on her behalf. By using Alice's username and password, Receiptify would demonstrate that it had Alice's implicit permission to access her data on Spotify. This is sometimes called the "password antipattern," and it has several serious drawbacks:

  • The resource server can't differentiate between the user and other servers accessing the API.
  • Storing passwords on other servers increases the risk of a security breach.
  • Granular permissions are hard to support since the resource server doesn't know who is logging in.
  • Passwords are difficult to revoke and must be updated in multiple places when changed.

OAuth was designed to fix these problems by making the resource server part of the flow that delegates access. This design lets the resource server ask the resource owner what permissions to grant and records them for the specific client requesting access. Moreover, the client can be given its own credential, apart from those that the resource owner has or those used by other clients the user might have authorized.

OAuth Scopes

The page that allows the owner to grant or deny permission might display what permissions the client is requesting. This isn't freeform text written by a UX designer; rather, it's controlled by scopes. A scope is a bundle of permissions that the client asks for when requesting the token, coded by the developer who wrote the client.

The following screenshots show the permissions screens that I, as the owner of my Twitter account, see for two different applications, Revue and Thread Reader.

OAuth Scopes Displayed in the Permission Screen

There are several things to note about these authorization screens. First, Twitter is presenting these screens, not the clients (note the URL in the browser window). Second, the two client applications are asking for quite different scopes. Revue wants permission to update my profile as well as post and delete tweets, while Thread Reader is only asking for read-only access to my account. Finally, Twitter is making it clear to me who is requesting access. At the bottom of the page, Twitter warns me to be careful and to check the permissions that the client is asking for.

Fine-Grained Permissions

Scopes were designed so that the service offering an API could define the relatively coarse-grained permissions needed to access it. So Twitter, for example, has scopes like tweet:read and tweet:write. As shown above, when a service wants to use the API for my Twitter account, it has to ask for specific scopes—if it only wants to read my tweets, it would ask for tweet:read. Once granted, the scopes can be used with the API endpoint to gain access.
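
To make the shape of a scoped request concrete, here's a minimal TypeScript sketch of the kind of authorization request a client might construct. The endpoint, client ID, and redirect URI are made-up placeholders; only the scope string follows the tweet:read naming shown above.

  // Hypothetical endpoint, client ID, and redirect URI for illustration only;
  // real values come from the API provider's developer documentation.
  const authorizationEndpoint = "https://api.example.com/oauth2/authorize";

  const params = new URLSearchParams({
    response_type: "code",                       // authorization code flow
    client_id: "my-client-id",                   // issued when the client registers
    redirect_uri: "https://client.example.com/callback",
    scope: "tweet:read",                         // only the permissions the client needs
    state: crypto.randomUUID(),                  // ties the response to this request
  });

  // The client sends the resource owner's browser here. The service (not the
  // client) renders the permission screen listing the requested scopes and,
  // if the owner approves, returns a code the client exchanges for a token
  // limited to exactly those scopes.
  const authorizationUrl = `${authorizationEndpoint}?${params.toString()}`;
  console.log(authorizationUrl);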

But OAuth-style scopes aren't the best tool for fine-grained permissioning in applications. To see why, imagine you are building a photo sharing app that allows a photographer like Alice to grant permission for her friends Betty and Charlie to view her vacation photos. Unlike the Twitter API, where the resources are fairly static (users and tweets, for example) and you're granting permissions to a fairly small number of applications on an infrequent basis, the photo sharing app has an indeterminate number of resources that change frequently. Consequently, the app would need to have many scopes and update them each time a user grants new permissions on a photo album to some other user. A large photo sharing service might have a million users and 50 million photo albums—that's a lot of potential scopes.

Policy-based Access Control (PBAC) systems, on the other hand, were built for this kind of permissioning. As I wrote in Not all PBAC is ABAC, your app design and how you choose to express policies make a big difference in the number of policies you need to write. But regardless, PBAC systems, with good policy stores, can manage policies and the information needed to build the authorization context much more easily than you'll usually find in a system for managing OAuth scopes.

So, if you're looking to allow client applications to access your API on behalf of your customers, then OAuth is the right tool. But if you're building an application that lets its users share the resources they create with each other, then you need the fine-grained permissioning that a policy-based access control system provides.


Minimal vs Fully Qualified Access Requests

Information Desk

In access control, one system, generically known as the policy enforcement point (PEP), makes a request to another service, generically known as a policy decision point (PDP), for an authorization decision. The PDP uses a policy store and an authorization context to determine whether access should be granted or not. How much of the authorization context does the PEP have to provide to the PDP?

Consider the following policy:1

Managers in marketing can approve transactions if they are not the owner and if the amount is less than their approval limit.

A PEP might make the following fully qualified request to get an authorization decision:

Can user username = Alice with jobtitle = manager and department = marketing do action = approve on resource of type = financial transaction with transaction id = 123 belonging to department = finance and with amount = 3452?

This request contains all of the authorization context needed for the policy to run. A fully qualified request requires that the PEP gather all of the information about the request from various attribute sources.

Alternatively, consider this minimal request:

Can user username = Alice do action = approve on resource with transaction id = 123?

The minimal request reduces the work that the PEP must do, placing the burden on the PDP to build an authorization context with sufficient information to make a decision. We can split this work out into a separate service called a policy information point (PIP). To build an authorization context for this request, the PIP must retrieve the following information:

  • The user’s job title, department, and approval limit.
  • The transaction’s owner and amount.

Building this context requires that the PIP have access to several attribute sources, including the HR system (where information about Alice is stored) and the finance system (where information about transactions is stored).
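
To make the difference concrete, here's a TypeScript sketch of what the two request shapes might look like on the wire. The field names and structure are illustrative assumptions rather than any particular PDP's API.

  // Illustrative request shapes only; real PDPs (XACML, OPA, Cedar, etc.)
  // each define their own request formats.
  interface AuthzRequest {
    subject: Record<string, unknown>;
    action: string;
    resource: Record<string, unknown>;
  }

  // Fully qualified: the PEP gathers every attribute the policy needs.
  const fullyQualified: AuthzRequest = {
    subject:  { username: "Alice", jobtitle: "manager", department: "marketing",
                approvalLimit: 5000 },                    // limit is a made-up value
    action:   "approve",
    resource: { type: "financial transaction", id: "123",
                department: "finance", amount: 3452,
                owner: "Bob" },                           // owner is a made-up value
  };

  // Minimal: the PEP sends identifiers; the PIP fills in the rest from the
  // HR and finance systems before the PDP evaluates the policy.
  const minimal: AuthzRequest = {
    subject:  { username: "Alice" },
    action:   "approve",
    resource: { id: "123" },
  };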

Attribute Value Providers

PIPs are not usually the source of attributes, but rather a place where attributes are brought together to create the authorization context. As you saw, the PIP might query the HR and finance systems, using attributes in the request to create a fully qualified authorization context. Other examples of attribute value providers (AVPs) include databases, LDAP directories, RESTful APIs, geolocation services, and any other data source that can be correlated with the attributes in the minimal request.

What’s needed depends on the policies you’re evaluating. The fully qualified context should only include needed attributes, not every piece of information the PIP can gather about the request. To that end, the PIP must be configurable to create the context needed without wasteful queries to unneeded AVPs.

PEP Attributes

There is context that the PEP has that the PDP might need. For example, consider this request:

Can user username = Alice do action = view on resource with transaction id = 123 from IP = 192.168.1.23 and browser = firefox?

The PIP typically would not have access to the IP address or browser type. The PEP must pass that information along for use in evaluating the policy.

Derived Attributes

The PIP can enhance information that the PDP passes in to derive new attributes. For example, consider a policy that requires the request come from specific ranges of IP addresses. The first instinct a developer might have is to embed the IP range directly in the policy. However, this is a nightmare to maintain if multiple policies have the same range check—especially when the range changes. The solution is to use derived attributes, where you can define a new attribute, is_ip_address_in_range, and have the PIP calculate it instead.

This might be generalized with a Session Attribute component in the PIP that can enhance or transform session attributes into others that are better for policies. This is just one example of how raw attributes might be transformed to provide richer, derived attributes. Other examples include deriving an age from a birthday, an overall total for several transactions, or a unique key from other attributes.
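
Here's a small TypeScript sketch of how a PIP might compute a derived attribute like is_ip_address_in_range. The attribute name comes from the example above; the CIDR ranges and helper functions are hypothetical.

  // Convert a dotted-quad IPv4 address to a 32-bit number.
  function ipToNumber(ip: string): number {
    return ip.split(".")
             .reduce((acc, octet) => (acc << 8) + parseInt(octet, 10), 0) >>> 0;
  }

  // True if `ip` falls inside the CIDR block, e.g. "192.168.1.0/24".
  function inCidr(ip: string, cidr: string): boolean {
    const [base, bits] = cidr.split("/");
    const mask = bits === "0" ? 0 : (~0 << (32 - parseInt(bits, 10))) >>> 0;
    return (ipToNumber(ip) & mask) === (ipToNumber(base) & mask);
  }

  // The PIP derives the attribute once; policies just test a boolean.
  const allowedRanges = ["192.168.1.0/24", "10.0.0.0/8"]; // hypothetical ranges
  const requestIp = "192.168.1.23";                        // passed in by the PEP
  const is_ip_address_in_range = allowedRanges.some(r => inCidr(requestIp, r));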

Benefits of a PIP

While the work to gather the attributes to build the authorization context may seem like it’s the same whether the PEP or PIP does it, there are several benefits to using a PIP:

  1. Simplified attribute administration—A PIP provides a single place to manage integration with attribute stores, ensuring that attributes are retrieved and interpreted consistently across multiple systems.
  2. Reduced Complexity—PIPs can reduce the complexity of the access control system by removing the need for individual access control components to have their own policy information. PEPs are typically designed to be lightweight so they can be used close to the point where the access is enforced. For example, you might have PEPs in multiple API gateways co-located with the services they intermediate or in smartphone apps. A PIP offloads work from the PEP to keep it lightweight.
  3. Separation of Concerns—PIPs separate policy information from policy decision-making, making it easier to update and modify policies and AVPs without impacting other parts of the access control system. For example, making fully qualified requests increases the coupling in the system because the PEP has to know more about policies to formulate fully qualified requests.
  4. Improved Scalability— PIPs can be deployed independently of other access control components, which means that they can be scaled independently as needed to accommodate changes in policy volume or access requests.
  5. Enhanced Security— PIPs can be configured to provide policy information only to authorized PDPs, which improves the security of the access control system and helps to prevent unauthorized access. In addition, a PIP builds consistent authorization contexts whereas different PEPs might make mistakes or differ in their output.

Tradeoffs

There are tradeoffs with using a PIP. For example, suppose that the PEP is embedded in the ERP system and has local access to both the HR and financial data. It might more easily and cheaply use those attributes to create a fully qualified request. Making a minimal request and requiring the PIP to make a call back to the ERP for the data to create the authorization context would be slower. But, as noted above, this solution increases the coupling between the PEP and PDP because the PEP now has to have more knowledge of the policies to make the fully qualified request. Developers need to use their judgement about request formulation to evaluate the tradeoffs.

Conclusion

By creating fully qualified authorization contexts from minimal requests, a PIP reduces the burden on developers building or integrating PEPs, allows for more flexible and reusable policies, and enriches authorization contexts to allow more precise access decisions.


Notes

  1. The information in this article was inspired by Should the Policy Enforcement Point Send All Attributes Needed to Evaluate a Request?

Photo Credit: Information Desk Charleston Airport from AutoRentals.com (CC BY 2.0, photo is cropped from original)


Passkeys: Using FIDO for Secure and Easy Authentication

Learning Digital Identity

This article is adapted from Chapter 12 of my new book Learning Digital Identity from O'Reilly Media.

I was at SLC DevOpsDays last week and attended a talk by Sharon Goldberg on MFA in 2023. She's a security expert and focused many of her remarks on the relative security of different multi-factor authentication (MFA) techniques, a topic I cover in my book as well. I liked how she described the security provisions of passkeys (also known as Fast ID Online or FIDO).

FIDO is a challenge-response protocol that uses public-key cryptography. Rather than using certificates, it manages keys automatically and beneath the covers, so it’s as user-friendly as possible. I’m going to discuss the latest FIDO specification, FIDO2, here, but the older FIDO U2F and UAF protocols are still in use as well.

FIDO uses an authenticator to create, store, and use authentication keys. Authenticators come in several types. Platform authenticators are devices that a person already owns, like a laptop or smartphone. Roaming authenticators take the form of a security key that connects to the laptop or smartphone using USB, NFC, or Bluetooth.

This is a good time for you to stop reading this and head over to Passkeys.io and try them for yourself. If you're using a relatively modern OS on your smartphone, tablet, or computer, you shouldn't have to download anything. Sign up using your email (it doesn't have to be a real email address), do whatever your device asks when you click "Save a Passkey" (on my iPhone it does Face ID, on my MacOS laptop, it does Touch ID). Then sign out.

Using Touch ID with Passkey

Now, click on "Sign in with a passkey". Your computer will let you pick an identifier (email address) that you've used on that site and then present you with a way to locally authenticate (i.e., on the device). It's that simple. In fact, my biggest fear with passkeys is that they're so slick people won't think anything has happened.

Here's what's going on behind the scenes: When Alice registers with an online service like Passkeys.io, her authenticator (software on her phone, for example) creates a new cryptographic key pair, securely storing the private key locally and registering the public key with the service. The online service may accept different authenticators, allowing Alice to select which one to use. Alice unlocks the authenticator using a PIN, fingerprint reader, or face ID.

When Alice authenticates, she uses a client such as a browser or app to access a service like a website (see figure below). The service presents a login challenge, including the chance to select an account identifier, which the client (e.g., browser) passes to the authenticator. The authenticator prompts Alice to unlock it and uses the account identifier in the challenge to select the correct private key and sign the challenge. Alice’s client sends the signed challenge to the service, which uses the public key it stored during registration to verify the signature and authenticate Alice.

Authenticating with Passkey

FIDO2 uses two standards. The Client to Authenticator Protocol (CTAP) describes how a browser or operating system establishes a connection to a FIDO authenticator. The WebAuthN protocol is built into browsers and provides an API that JavaScript from a Web service can use to register a FIDO key, send a challenge to the authenticator, and receive a response to the challenge.
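
To give a feel for the WebAuthN piece, here's a minimal TypeScript sketch of the browser call a website's JavaScript might make to authenticate with an already-registered passkey. The relying party ID is a placeholder, and server-side challenge generation and signature verification are omitted.

  // Runs in the browser. The challenge must come from the server and be
  // unpredictable; registration uses navigator.credentials.create() instead.
  async function signInWithPasskey(challengeFromServer: Uint8Array) {
    const assertion = await navigator.credentials.get({
      publicKey: {
        challenge: challengeFromServer,   // signed by the authenticator
        rpId: "example.com",              // placeholder; must match the site's domain
        userVerification: "required",     // e.g., Face ID or Touch ID
        timeout: 60_000,
      },
    }) as PublicKeyCredential;

    // The signed challenge goes back to the server, which verifies it with
    // the public key stored at registration.
    const response = assertion.response as AuthenticatorAssertionResponse;
    return {
      credentialId: assertion.id,
      signature: new Uint8Array(response.signature),
      clientDataJSON: new Uint8Array(response.clientDataJSON),
      authenticatorData: new Uint8Array(response.authenticatorData),
    };
  }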

One of the things I liked about Dr. Goldberg's talk is that she emphasized that the security of passkeys rests on three things:

  1. Transport Layer Security (TLS) to securely transport challenges and responses.
  2. The WebAuthN protocol that gives websites a way to invoke the local authentication machinery using a Javascript API.
  3. A secure, local connection between the client and authenticator using CTAP.

One of the weaknesses of how we use TLS today is that people don't usually check the lock icon in the browser and don't understand domain names well enough to tell if they're being phished. Passkeys do that checking for you. The browser unambiguously transfers the domain name to the authenticator, which knows whether it has an established relationship with that domain or not. Authenticating that you're on the right site is a key reason they're so much more secure than other MFA alternatives. Another is having a secure channel from authenticator to service, making phishing nearly impossible because there's no way to break into the authentication flow.

To see how this helps with phishing, imagine that a service uses QR codes to initiate the passkey registration process and that an attacker has managed to use cross-site scripting or some other weakness to switch out the QR code the service provided with one of their own. When the user scans the attacker's QR code, the URL won't match the URL of the page the user is on, and the authenticator won't even start the transaction to register a key. Passkeys' superpower is that they complete the last mile of TLS using WebAuthN and local authenticators.

Passkeys provide a secure and convenient way to authenticate users without resorting to passwords, SMS codes, or TOTP authenticator applications. Modern computers, smartphones, and most mainstream browsers understand FIDO protocols natively. While roaming authenticators (hardware keys) are available, for most use cases, platform authenticators (like the ones built into your smartphone or laptop) are sufficient. This makes FIDO an easy, inexpensive way for people to authenticate. As I said, the biggest impediment to its widespread use may be that people won't believe something so easy is secure.


Monitoring Temperatures in a Remote Pump House Using LoraWAN

Snow in Island Park

I've got a pumphouse in Island Park, ID that I'm responsible for. Winter temperatures are often below 0°F (-18°C) and occasionally get as cold as -35°F (-37°C). We have a small baseboard heater in the pumphouse to keep things from freezing. That works pretty well, but one night last December, the temperature was -35°F and the power went out for five hours. I was in the dark, unable to know if the pumphouse was getting too cold. I determined that I needed a temperature sensor in the pumphouse that I could monitor remotely.

The biggest problem is that the pumphouse is not close to any structures with internet service. Wifi signals just don't make it out there. Fortunately, I've got some experience using LoraWAN, a long-range (10km), low-power, wireless protocol. This use-case seemed perfect for LoraWAN. About a year ago, I wrote about how to use LoraWAN and a Dragino LHT65 temperature and humidity sensor along with picos to get temperature data over the Helium network.

I've installed a Helium hotspot near the pumphouse. The hotspot and internet router are both on battery backup. Helium provides a convenient console that allows you to register devices (like the LHT65) and configure flows to send the data from a device on the Helium network to some other system over HTTP. I created a pico to represent the pumphouse and routed the data from the LHT65 to a channel on that pico.

The pico does two things. First, it processes the heartbeat event that Helium sends to it, parsing out the parts I care about and raising another event so other rules can use the data. Processing the data is not simple because it's packed into a base64-encoded, 11-byte payload. I won't bore you with the details, but it involves base64 decoding the string and splitting it into 6 hex values. Some of those pack data into specific bits of the 16-bit word, so binary operations are required to separate it out. Those weren't built into the pico engine, so I added those libraries. If you're interested in the details of decoding, splitting, and unpacking the payload, check out the receive_heartbeat rule in this ruleset.
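
For a rough idea of what that unpacking involves, here's a TypeScript sketch. It assumes the commonly documented LHT65 payload layout (battery and status in the first two bytes, internal temperature ×100, humidity ×10, a sensor-type byte, and the probe temperature ×100); the receive_heartbeat rule linked above is the authoritative version.

  // Illustrative decoder assuming the published LHT65 payload layout; the
  // KRL receive_heartbeat rule is the authoritative implementation.
  function decodeLht65(base64Payload: string) {
    const bytes = Uint8Array.from(atob(base64Payload), c => c.charCodeAt(0));

    // Reassemble a signed 16-bit value from two bytes.
    const int16 = (hi: number, lo: number) => (((hi << 8) | lo) << 16) >> 16;

    return {
      batteryStatus: bytes[0] >> 6,                                   // top 2 bits
      batteryVoltage: (((bytes[0] << 8) | bytes[1]) & 0x3fff) / 1000, // volts
      internalTemp: int16(bytes[2], bytes[3]) / 100,                  // °C
      humidity: ((bytes[4] << 8) | bytes[5]) / 10,                    // %RH
      probeTemp: int16(bytes[7], bytes[8]) / 100,                     // °C, external probe
    };
  }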

Second, the receive_heartbeat rule raises the lht65:new_readings event in the pico adding all the relevant data from the LHT65 heartbeat. Any number of rules could react to that event depending on what needs to be done. For example, they could store the event, alarm on a threshold, or monitor the battery status. What I wanted to do is plot the temperature so I can watch it over time and let other members of the water group check it too. I found a nice service called IoTPlotter that provides a basic plotting service on any data you post to it. I created a feed for the pumphouse data and wrote a rule in my pumphouse pico to select on the lht65:new_readings event and POST the relevant data, in the right format, to IoTPlotter. Here's that rule:

  rule send_temperature_data_to_IoTPlotter {
    select when lht65 new_readings

    pre {
      feed_id = "367832564114515476";
      api_key = meta:rulesetConfig{["api_key"]}.klog("key"); 
      payload = {"data": {
                    "device_temperature": [
                      {"value": event:attrs{["readings", "internalTemp"]},
                       "epoch": event:attrs{["timestamp"]}}
                    ],
                    "probe_temperature": [
                      {"value": event:attrs{["readings", "probeTemp"]},
                       "epoch": event:attrs{["timestamp"]}}
                    ],
                    "humidity": [
                      {"value": event:attrs{["readings", "humidity"]},
                       "epoch": event:attrs{["timestamp"]}}
                    ],
                    "battery_voltage": [
                      {"value": event:attrs{["readings", "battery_voltage"]},
                       "epoch": event:attrs{["timestamp"]}}
                    ]}
                };
    }

    http:post("http://iotplotter.com/api/v2/feed/" + feed_id,
       headers = {"api-key": api_key},
       json = payload
    ) setting(resp);
 }

The rule, send_temperature_data_to_IoTPlotter, is not very complicated. You can see that most of the work is just reformatting the data from the event attributes into the right structure for IoTPlotter. The result is a set of plots that looks like this:

Swanlake Pumphouse Temperature Plot

Pretty slick. If you're interested in the data itself, you're seeing the internal temperature of the sensor (orange line) and temperature of an external probe (blue line). We have the temperature set pretty high as a buffer against power outages. Still, it's not using that much power because the structure is very small. Running the heater only adds about $5/month to the power bill. Pumping water is much more power intensive and is the bulk of the bill. The data is choppy because, by default, the LHT65 only transmits a payload once every 20 minutes. This can be changed, but at the expense of battery life.

This is a nice, evented system, albeit simple. The event flow looks like this:

Event Flow for Pumphouse Temperature Sensor

I'll probably make this a bit more complete by adding a rule for managing thresholds and sending a text if the temperature gets too low or too high. Similarly, I should be getting notifications if the battery voltage gets too low. The battery is supposed to last 10 years, but that's exactly the kind of situation you need an alarm on—I'm likely to forget about it all before the battery needs replacing. I'd like to experiment with sending data the other way to adjust the frequency of readings. There might be times (like -35°F nights when the power is out) when getting more frequent readings would reduce my anxiety.

This was a fun little project. I've got a bunch of these LHT65 temperature sensors, so I'll probably generalize this by turning the IoTPlotter ruleset into a module that other rulesets can use. I may eventually use a more sophisticated plotting package that can show me the data for all my devices on one feed. For example, I bought a LoraWAN soil moisture probe for my garden. I've also got a solar array at my house that I'd like to monitor myself and that will need a dashboard of some kind. If you've got a sensor that isn't within easy range of wifi, then LoraWAN is a good solution. And event-based rules in picos are a convenient way to process the data.


Not all PBAC is ABAC: Access Management Patterns

Learning Digital Identity

The primary ways of implementing access control in modern applications are (1) access control lists (ACLs), (2) role-based access control (RBAC), and (3) attribute-based access control (ABAC). In this post, I assume you're familiar with these terms. If you're not, there's a great explanation in chapter 12 of my new book, Learning Digital Identity.1

To explore access management patterns, let's classify applications requiring fine-grained access management into one of two types:

  • Structured—these applications can use the structure of the attribute information to simplify access management. For example, an HR application might express a policy as “all L9 managers with more than 10 reports can access compensation management functionality for their reports”. The structure allows attributes like level and number_of_reports to be used to manage access to the compensation tool with a single policy. A small set of policies can control access to the compensation tool. These applications are the sweet spot for ABAC.
  • Ad hoc—these applications allow users to manage access to resources they control based on identifiers for both principals and resources without any underlying structure. For example, Alice shares her vacation photo album with Betty and Charlie. The photo album, Betty, and Charlie have no attributes in common that can be used to write a single attribute-based policy defining access. These applications have a harder time making effective use of ABAC.

Ad hoc access management is more difficult than structured because of the combinatorial explosion of possible access relationships. When any principal can share any resource they control with any other principal and with any subset of possible actions, the number of combinations quickly becomes very large.

There are several approaches we can take to ad hoc access management:

  1. Policy-based—In this approach the application writes a new policy for every access case. In the example given above, when Alice shares her vacation photo album with Betty and Charlie, the application would create a policy that explicitly permits Betty and Charlie to access Alice’s vacation photo album. Every change in access would result in a new policy or the modification of an existing one. This is essentially using policies as ACLs.
  2. Group-based—In a group-based approach, we create a group for people who can access the vacation photo album and a policy that allows access to the vacation photo album if the user has a group attribute of canAccessVacationPhotos. The group name has to be unique to Alice's vacation photo album and includes the allowed action. When Alice shares the album with Betty and Charlie, we add them both to the canAccessVacationPhotos group by putting it in the groups attribute in their profile. Group-based policies look like "principal P can access vacationPhotosAlbum if P.groups contains canAccessVacationPhotos." This is essentially RBAC.
  3. Resource-based—In this approach, we add a sharedWith or canEdit attribute to Alice’s vacation photos album that records the principals who can access the resource. Now our policy uses the resource attribute to allow access to anyone in that list. Resource-based policies look like "principal P can edit resource R if P is in R.canEdit". Every resource of the same type has the same attributes. This approach is close to ABAC because it makes use of attributes on the resources to manage access, reducing the combinatorial explosion.
  4. Hybrid—We can combine group- and resource-based access management by creating groups of users and storing group names in the resource attributes instead of individual principals. For example, if Alice adds Betty and Charlie to her group friends, then she could add friends to the sharedWith attribute on her album. The advantage of the hybrid approach is that we reduce the length of the attribute lists.

Policy-Based Approach

The advantage of the policy-based approach is that it's the simplest thing that could possibly work. Given a policy store with sufficient management features (i.e., finding, filtering, creating, modifying, and deleting policies), this is straightforward. The chief downside is the explosion in the number of policies and the scaling that it requires of the policy store. Also, since the user's permissions are scattered among many different policies, knowing who can do what is difficult and relies on the policy store's filtering capabilities.

Group-Based Approach

The group-based approach results in a large number of groups for very specific purposes. This is a common problem with RBAC systems. But given an attribute store (like an IdM profile) that scales well, it splits the work between the attribute and policy stores by reducing the number of policies to one per share type (or combination). That is, for each resource we need a policy that allows viewing, one that allows editing, and so on.

Resource-Based Approach

The resource-based approach reduces the explosion of groups by attaching attributes to the resource, imposing structure. In the photo album sharing example, each album (and photo) would need an attribute for each sharing type (view, modify, delete). If Alice says Betty can view and modify an album, Betty’s identifier would be added to the view and modify attributes for that album. We need a policy for each unique resource type and action.
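
Here's a small TypeScript sketch of what a resource-based check might look like in application code. The sharedWith and canEdit attribute names come from the text; the types and helper functions are illustrative assumptions, not a particular product's API.

  interface Principal { id: string; groups: string[] }
  interface PhotoAlbum {
    id: string;
    owner: string;
    sharedWith: string[];  // principals (or group names) allowed to view
    canEdit: string[];     // principals (or group names) allowed to modify
  }

  // Resource-based policy: "principal P can edit resource R if P is in R.canEdit".
  function canEdit(p: Principal, album: PhotoAlbum): boolean {
    return album.owner === p.id || album.canEdit.includes(p.id);
  }

  // Hybrid variant: the resource attribute can also hold group names, so
  // membership in a listed group grants access and keeps the lists short.
  function canView(p: Principal, album: PhotoAlbum): boolean {
    return album.owner === p.id
      || album.sharedWith.includes(p.id)
      || album.sharedWith.some(entry => p.groups.includes(entry));
  }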

The downside of the resource-based approach is that the access management system has to be able to use resource attributes in the authorization context. Integrating the access management system with an IdP provides attributes about principals so that we can automatically make those attributes available in the authorization context. You could integrate using the attributes in an OIDC token or by syncing the authorization service with the IdP using SCIM.

But the ways that attributes can be attached to a resource are varied. For example, they might be stored in the application's database. They might be part of an inventory control system. And so on. So the access management system must allow developers to inject those attributes into the authorization context when the policy enforcement point is queried or have a sufficiently flexible policy information point to easily integrate it with different databases and APIs. Commercial ABAC systems will have solved this problem because it is core to how they function.

Conclusion

Every application, of course, will make architectural decisions about access management based on its specific needs. But if you understand the patterns that are available to you, then you can think through the ramifications of your design ahead of time. Sometimes this is lost in the myth that policy-based access management (PBAC) is all the same. All of the approaches I list above are PBAC, but they're not all ABAC.


Notes

  1. The material that follows is not in the book. But it should be. An errata, perhaps.


Learning Digital Identity Podcasts

Learning Digital Identity

I was recently the guest on a couple of podcasts to talk about my new book, Learning Digital Identity. The first was with Mathieu Glaude, CEO of Northern Block. The second was with Sam Curren, the Deputy CTO of Indicio. One of the fun things about these podcasts is how different they were despite being about the same book.

Mathieu focused on relationships, a topic I deal with quite a bit in the book since I believe we build identity systems to manage relationships, not identities. In addition we discussed the tradespace among privacy, authenticity, and confidentiality and how verifiable credentials augment and improve attribute-based access control (ABAC).

Sam and I discussed identity metasystems and why they're necessary for building identity systems that enable us to live effective online lives. We also talked about Kim Cameron's Laws of Identity and how they help analyze identity systems and their features. Other topics included the relationship between DIDComm and identity, how self-sovereign identity relates to IoT, and the relationship between trust, confidence, and governance.

These were both fun conversations and I'm grateful to Mathieu and Sam for lining them up. If you'd like me to talk about the book on a podcast you or your company hosts, or even for a private employee event, I'm happy to. Just send me a note and we'll line it up.


Why Doesn't This Exist Already?

Riley Hughes and I recently had a conversation about the question "why doesn't SSI exist already?" Sometimes the question is asked because people think it's such a natural idea that it's surprising that it's not the norm. Other times, the question is really a statement "if this were really a good idea, it would exist already!" Regardless of which way it's asked, the answer is interesting since it's more about the way technology develops and is adopted than the technology itself.

Riley calls identity products "extremely objectionable" meaning that there are plenty of reasons for people to object to them including vendor lock-in, privacy concerns, security concerns, and consumer sentiment. I think he's right. You're not asking people and companies to try a new app (that they can easily discard if it doesn't provide value). You're asking them to change the fundamental means that they use to form, manage, and use online relationships.

Learning Digital Identity

The last chapter of my new book, Learning Digital Identity, makes a case that there is an existing identity metasystem that I label the Social Login (SL) metasystem. The SL metasystem is supported by OpenID Connect and the various "log in with..." identity providers. The SL metasystem is widely used and has provided significant value to the online world.

There is also an emerging Self-Sovereign Identity (SSI) metasystem based on DIDs and verifiable credentials. I evaluate each in terms of Kim Cameron's Laws of Identity. In this evaluation, the SL metasystem comes out pretty well. I believe this accounts for much of its success. But it fails in some key areas, like not supporting directed (meaning not public) identifiers. As a result of these failings, the SL metasystem has not been universally adoptable. Banks, for example, aren't going to use Google Sign-In for a number of reasons.

The SSI metasystem, on the other hand, meets all of Cameron's Laws. Consequently, I think it will eventually be widely adopted and gradually replace the SL metasystem. The key word being eventually. The pace of technological change leads us to expect that change will happen very quickly. Some things (like the latest hot social media app) seem to happen overnight. But infrastructural change, especially when it requires throwing out old mental models about how things should work, is much slower. The fact is, we've been building toward the ideas in SSI (not necessarily the specific tech) for several decades. Work at IIW on user-centric identity led to the SL metasystem. But the predominant mental model of that metasystem didn't change much from the one-off centralized accounts people used before. You still get an account administered by the relying party; they've just outsourced the authentication to someone else (which means another party is intermediating the relationship). Overcoming that mental model, especially with entrenched interests, is a long slog.

In the 80s and 90s (pre-web) people were only online through the grace of their institution (university or company). So, I was windley@cs.ucdavis.edu and there was no reason to be anything else. When the web hit, I needed to be represented (have an account) in dozens or hundreds of places with whom I no longer had a long-term relationship (like employee or student). So, we moved the idea of an account from workstation operating systems to the online service. And became Sybil.

When Kim first introduced the Laws of Identity, I literally didn't understand what he was saying. I understood the words but not the ideas. Certainly not the ramifications. I don't think many did. He's the first person I know who understood the problems and set out a coherent set of principles to solve them. We used Infocards in our product at Kynetx and they worked pretty well. But because of how they were rolled out, people came to associate them strictly with Microsoft. The SL metasystem won out, offering the benefits of federation without requiring that people, developers, or companies change their mental model.

Changing metasystems isn't a matter of technology. It's a social phenomenon. Consequently it's slow and messy. Here's my answer to the question "why doesn't this exist yet?": The arc of development for digital identity systems has been bending toward user-controlled, decentralized digital identity for decades. That doesn't mean that SSI, as currently envisioned, is inevitable. Just that something like it, that better complies with Kim's laws than the current metasystem, is coming. Maybe a year from now. Maybe a decade. No one can say. But it's coming. Plan and work accordingly.


SSI Doesn't Mean Accounts Are Going Away

Creditor's Ledger, Holmes McDougall

I saw a tweet that said (paraphrasing): "In the future people won't have accounts. The person (and their wallet) will be the account." While I appreciate the sentiment, I think reality is much more nuanced than that because identity management is about relationships, not identities (whatever those are).

Supporting a relationship requires that we recognize, remember, and react to another party (person, business, or thing). In self-sovereign identity (SSI), the tools that support that are wallets and agents. For people, these will be personal. For a business or other organization they'll be enterprise wallets and agents. The primary difference between these is that enterprise wallets and agents will be integrated with the other systems that the business uses to support the relationships they have at scale.

Remembering and reacting to another entity requires that you keep information about them for the length of the relationship. Some relationships, like the one I form with the convenience store clerk when I buy a candy bar, are ephemeral, lasting only for the length of the transaction. I don't remember much while it's happening and forget it as soon as it's done. Others are long-lasting, and I remember a great deal in order for the relationship to have utility.

So, let's say that we're living in the future where SSI is ubiquitous and I have a DID-based relationship with Netflix. I have a wallet full of credentials. In order for my relationship to have utility, they will have to remember a lot about me, like what I've watched, what devices I used, and so on. They will likely still need to store a form of payment since it's a subscription. I call that an account. And for the service Netflix provides, it's likely not optional.

Let's consider a different use case: ecommerce. I go to a site, select what I want to buy, supply information about shipping and payment, and submit the order. I can still create a DID-based relationship, but the information needed from me beyond what I want to buy can all come from my credentials. And it's easy enough to provide that I don't mind supplying it every time. The ecommerce site doesn't need to store any of it. They may still offer to let me create an account, but it's optional. No more required than the loyalty program my local supermarket offers. The relationship I create to make the purchase can be ephemeral if that's what I want.

What will definitely go away is the use of accounts for social login. In social login, large identity providers have accounts that are then used by relying parties to authenticate people. Note that authentication is about recognizing. SSI wallets do away with that need by providing the means for different parties to easily create relationships directly and then use verifiable credentials to know things about the other with certainty. Both parties can mutually authenticate the other. But even here, social login is usually a secondary purpose for the account. I have an account with Google. Even if I never use it for logging in anywhere but Google, I'll still have an account for the primary reasons I use Google.

Another thing that goes away is logging in to your account. You'll still be authenticated, but that will fade into the background as the processes we use for recognizing people (FIDO and SSI) become less intrusive. We have a feel for this now with apps on our smartphones. We rarely authenticate because the app does that and then relies on the smartphone to protect the app from use by unauthorized people. FIDO and SSI let us provide similar experiences on the web as well. Because we won't be logging into them, the idea of accounts will fade from people's consciousness even if they still exist.

I don't think accounts are going away anytime soon simply because they are a necessary part of the relationship I have with many businesses. I want them to remember me and react to me in the context of the interactions we've had in the past. SSI offers new ways of supporting relationships, especially ephemeral ones, which means companies need to store less. But for long-term relationships, your wallet can't be the account. The other party needs their own means of remembering you, and they will do that using tools that look just like an account.


Photo Credit: Creditor's Ledger, Holmes McDougall from Edinburgh City of Print (CC BY 2.0)