Decentralizing Agendas and Decisions

Opening Circle at IIW 34

Last month was the 34th Internet Identity Workshop (IIW). After doing the last four virtually, it was spectacular to be back together with everyone at the Computer History Museum. You could almost feel the excitement in the air as people met with old friends and made new ones. Rich the barista was back, along with Burrito Wednesday. I loved watching people in small groups having intense conversations over meals, drinks, and snacks.

Also back was IIW's trademark open space organization. Open space conferences are workshops that don't have pre-built agendas. Open space is like an unconference with a formal facilitator trained in using open space technology. IIW is self-organizing, with participants setting the agenda every morning before we start. IIW has used open space for part or all of the workshop since the second workshop in 2006. Early on, Kaliya Young, one of my co-founders (along with Doc Searls), convinced me to try open space as a way of letting participants shape the agenda and direction. For an event this large (300-400 participants), you need professional facilitation. Heidi Saul has been doing that for us for years. The results speak for themselves. IIW has nurtured many of the ideas, protocols, and trends that make up modern identity systems and thinking.

Welcome to IIW 34!
mDL Discussion at IIW 34
Agenda Wall at IIW 34 (Day 1)

Last month was the first in-person CTO Breakfast since early 2020. CTO Breakfast is a monthly gathering of technologists in the Provo-Salt Lake City area that I've convened for almost 20 years. Like IIW, CTO Breakfast has no pre-planned agenda. The discussion is freewheeling and active. We have just two rules: (1) no politics and (2) one conversation at a time. Topics from the last meeting included LoRaWAN, Helium network, IoT, hiring entry-level software developers, Carrier-Grade NATs, and commercial real estate. The conversation goes where it goes, but is always interesting and worthwhile.

When we built the University API at BYU, we used decentralized decision making to make key architecture, governance, and implementation decisions. Rather than a few architects deciding everything, we had many meetings, with dozens of people in each, over the course of a year hammering out the design.

What all of these have in common is decentralized decision making by a group of people that results in learning, consensus, and, if all goes well, action. The conversation at IIW, CTO Breakfast, and BYU isn't the result of a few smart people deciding what the group needed to hear and then arranging meetings to push it at them. Instead, the group decides. Empowering the group to make decisions about the very nature and direction of the conversation requires trust, and trust always implies vulnerability. But I've become convinced that it's really the best way to achieve real consensus and make progress in heterogeneous groups. Thanks, Kaliya!


Is an Apple Watch Enough?

Apple iPhone and Watch

Last week, I conducted an experiment. My phone battery needed to be replaced and the Authorized Apple Service Center was required to keep it while they ordered the new battery from Apple (yeah, I think that's a stupid policy too). I was without my phone for 2 days and decided it was an excellent time to see if I could get by using my Apple Watch as my primary device. Here's how it went.

  • First things first: for this to be any kind of success, you need a cellular plan for your watch and a pair of AirPods or other Bluetooth earbuds.
  • The first thing I noticed is that the bathroom, the checkout line, and other idle moments are boring without the distraction of my phone to read news, play Wordle, or whatever.
  • Siri is your friend. I used Siri a lot more than normal due to the small screen.
  • I'd already set up Apple Pay and while I don't often use it from my watch under normal circumstances, it worked great here.
  • Answering the phone means keeping your AirPods in or fumbling for them every time there's a call. I found I rejected a lot of calls to avoid the hassle. (But never yours, Lynne!) Still, I was able to take and make calls just fine without a phone.
  • Voicemail access is a problem. You have to call the number and retrieve them just like it's 1990 or something. This messed with my usual strategy of not answering calls from numbers I don't recognize and letting them go to voicemail, then reading the transcript to see if I want to call them back.
  • Normal SMS texts don't work as far as I could tell, but Apple Messages do. I used voice transcription almost exclusively for sending messages, but read them on the watch.
  • Most crypto wallets are unusable without the phone.
  • For the most part, I just used the Web for banking as a substitute for mobile apps and that worked fine. The one exception was USAA.
  • The problem with USAA was 2FA. Watch apps for 2FA are "companion apps," meaning they're worthless without the phone. For TOTP 2FA, I'd mirrored my authenticator to my iPad, so that worked fine (see the TOTP sketch after this list for why mirroring works). For Duo, I had to use the preset tokens I'd gotten when I set it up. USAA uses Verisign's VIP, which can't be mirrored. What's more, USAA's account recovery relies on SMS, and without my phone that didn't work either. I was on the phone with USAA for an hour trying to figure this out. Eventually USAA decided it was hopeless and told me to conduct banking by voice. Ugh.
  • Listening to music on the watch worked fine.
  • I read books on my Kindle, so that wasn't a problem.
  • There are a number of things I fell back to my iPad for. I've already mentioned 2FA; another is maps. Maps don't work on the watch.
  • I didn't realize how many pictures I take in a day, sometimes just for utility. I used the iPad when I had to.
  • Almost none of my IoT services or devices did much with the watch beyond issuing a notification. None of the Apple HomeKit stuff worked that I could see. For example, I often use a HomeKit integration with my garage door opener. That no longer worked without a phone.
  • Battery life on the watch is more than adequate in normal situations. But hour-long phone calls and listening to music challenge battery life when it's your primary device.
  • I didn't realize how many things are tied just to my phone number.
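
The reason mirroring works for TOTP, but not for proprietary schemes like VIP, is that TOTP is an open algorithm (RFC 6238): any device holding the shared secret and a clock computes the same six-digit code. Here's a minimal sketch using only Python's standard library (the secret below is a made-up example):

    import base64, hmac, struct, time

    def totp(secret_b32: str, digits: int = 6, period: int = 30) -> str:
        """Derive the current TOTP code from a shared secret (RFC 6238)."""
        key = base64.b32decode(secret_b32.upper())
        counter = int(time.time()) // period          # 30-second time step
        mac = hmac.new(key, struct.pack(">Q", counter), "sha1").digest()
        offset = mac[-1] & 0x0F                       # dynamic truncation
        code = (struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF) % 10**digits
        return str(code).zfill(digits)

    print(totp("JBSWY3DPEHPK3PXP"))  # same secret, same code, on any device

Because every device with the secret produces identical codes, TOTP mirrors cleanly to an iPad, while schemes with device-bound secrets don't.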

Using just my Apple Watch with some help from my iPad was mostly doable, but there are still rough spots. The Watch is a capable tool for many tasks, but it's not complete. I can certainly see leaving my phone at home more often now since most things work great—especially when you know you can get back to your phone when you need to. Not having my phone with me feels less scary now.


Photo Credit: iPhone 13 Pro and Apple Watch from Simon Waldherr (CC BY-SA 4.0)


We Need a Self-Sovereign Model for IoT

Insteon isn't keeping the lights on!

Last week Insteon, a large provider of smart home devices, abruptly closed its doors. While their web site is still up and advertises "the most reliable and simplest way to turn your home into a smart home," the company seems to have shut down its cloud service without warning or providing a way for customers to continue using their products, which depend on Insteon's private cloud. High-ranking Insteon execs even removed their affiliation with Insteon from their LinkedIn profiles. Eek!

Fortunately, someone reverse-engineered the Insteon protocol a while back and there are some open-source solutions for people who are able to run their own servers or know someone who can do it for them. Home Assistant is one. OpenHAB is another.

Insteon isn't alone. Apparently iHome terminated its service on April 2, 2022. Other smart home companies or services that have gone out of business include Revolv, Insignia, Wink, and Staples Connect.

The problem with Insteon, and every other IoT and smart home company I'm aware of, is that their model looks like this:

Private cloud IoT model; grey box represents domain of control

In this model, you:

  1. Buy the device
  2. Download their app
  3. Create an account on the manufacturer's private cloud
  4. Register your device
  5. Control the device from the app

All the data and the device are inside the manufacturer's private cloud. They administer it all and control what you can do. Even though you paid for the device, you don't own it because it's worthless without the service the manufacturer provides. If they take your account away (or everyone's account, in the case of Insteon), you're out of luck. Want to use your motion detector to turn on the lights? Good luck unless they're from the same company1. I call this the CompuServe of Things.

The alternative is what I call the self-sovereign IoT (SSIoT) model:

Self-sovereign IoT model; grey box represents domain of control

Like the private-cloud model, in the SSIoT model, you also:

  1. Buy the device
  2. Download an app
  3. Establish a relationship with a compatible service provider
  4. Register the device
  5. Control the device using the app

The fact that the flows for these two models are the same is a feature. The difference lies elsewhere: in SSIoT, your device, the data about you, and the service are all under your control. You might have a relationship with the device manufacturer, but you and your devices are not under their administrative control. This might feel unworkable, but I've proven it's not. Ten years ago we built a connected-car platform called Fuse that used the SSIoT model. All the data was under the control of the person or persons who owned the fleet and could be moved to an alternate platform without loss of data or function. People used the Fuse service that we provided, but they didn't have to. If Fuse had gotten popular, other service providers could have offered the same or similar service based on the open model, and Fuse owners would have had a choice of service providers. Substitutability is an indispensable property for the internet of things.
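
In practice, substitutability means the service endpoint is a parameter the owner controls rather than a constant the manufacturer hardcodes. Here's a minimal sketch of the SSIoT registration step; the URL, API shape, and field names are all hypothetical:

    # Hypothetical SSIoT registration: SERVICE_URL is chosen by the owner,
    # not baked into the device by the manufacturer.
    import requests

    SERVICE_URL = "https://iot.example.com"  # any compatible provider, even self-hosted

    def register_device(device_id: str, owner_token: str) -> str:
        """Register a device with the owner's chosen service provider."""
        resp = requests.post(
            f"{SERVICE_URL}/devices",
            json={"device_id": device_id},
            headers={"Authorization": f"Bearer {owner_token}"},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()["channel"]  # endpoint the device reports to

Switching providers is just re-running registration against a different SERVICE_URL; the device and its data are never captive to a single vendor's cloud.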

All companies die. Some last a long time, but even then they frequently kill off products. Having to buy all your gear from a single vendor and use their private cloud puts your IoT project at risk of being stranded, like Insteon customers have been. Hopefully, the open-source solutions will provide the basis for some relief to them. But the ultimate answer is interoperability and self-sovereignty as the default. That's the only way we ditch the CompuServe of Things for a real internet of things.


Notes

  1. Apple HomeKit and Google Home try to solve this problem, but you're still dependent on the manufacturer to provide the basic service. And making the administrative domain bigger is nice, but doesn't result in self-sovereignty.


John Oliver on Surveillance Capitalism

Surveillance capitalism is a serious subject that can be hard to explain, let alone make interesting. I believe that it threatens our digital future. The question is "what to do about it?"

John Oliver's Last Week Tonight recently took on the task of explaining surveillance capitalism, how it works, and why it's a threat. I recommend watching it. Oliver does a great job of explaining something important, complex, and, frankly, a little boring in a way that is both funny and educational.

But he didn't just explain it. He took some steps to do something about it.

In researching this story, we realized that there is any number of perfectly legal bits of f—kery that we could engage in. We could, for example, use data brokers to go phishing for members of congress, by creating a demographic group consisting of men, age 45 and up, in a 5-mile radius of the U.S. Capitol, who had previously visited sites regarding or searched for terms including divorce, massage, hair loss and mid-life crisis.

The result is a collection of real data from their experiment that Oliver threatens to reveal if Congress doesn't act. The ads they ran were "Marriage shouldn't be a prison," "Can you vote twice?," and "Ted Cruz erotic fan fiction." I'm not sure it will actually light a fire under so moribund an institution as Congress, but it's worth a shot!


Easier IoT Deployments with LoRaWAN and Helium

Helium Discharge Tube

I've been interested in the internet of things (IoT) for years, even building and selling a connected car product called Fuse at one point. One of the hard parts of IoT is connectivity, getting the sensors on some network so they can send data back to wherever it's aggregated, analyzed, or used to take action. Picos are a good solution for the endpoint—where the data ends up—but the sensor still has to get connected to the internet.

Wifi, Bluetooth, and cellular are the traditional answers. Each has its limitations for IoT.

  • Wifi has limited range and, outside the home environment, usually needs a separate device-only network because of different authentication requirements. If you're doing a handful of devices it's fine, but it doesn't easily scale to thousands. Wifi is also power hungry, making it a poor choice for battery-powered applications.
  • Bluetooth's range is even more limited, requiring the installation of Bluetooth gateways, and Bluetooth is not very secure. It is relatively good with power, though. I've had a Bluetooth temperature sensor that ran for over a year on a CR2025 battery. Still, battery replacement can end up being a real maintenance headache.
  • Cellular is relatively ubiquitous, but it can be expensive and hard to manage. Batteries work for cell phones because people charge them every night. That's not reasonable for many IoT applications, so cellular-based sensors usually need to be powered.

Of course, there are other choices using specialized IoT protocols like ZWave, Zigbee, and Insteon. These all require specialized hubs that must be bought, managed, and maintained. To avoid single points of failure, multiple hubs are needed. For a large industrial deployment this might be worth the cost and effort. Bottom line: every large IoT project spends a lot of time and money designing and managing the connectivity infrastructure. This friction reduces the appeal of large-scale IoT deployments.

Enter LoRaWAN, a long-range (10 km), low-power wireless protocol for IoT. Scott Lemon told me about LoRaWAN recently and I've been playing with it a bit. Specifically, I've been playing with Helium, a decentralized LoRaWAN network.

Helium is a LoRaWAN network built from hotspots run by almost anyone. In one of the most interesting uses of crypto I've seen, Helium pays people Helium tokens for operating hotspots. They call the model "proof of coverage." You get paid two ways: (1) providing radio coverage for a given geographical area and (2) moving packets from the radio network to the internet. This model has produced amazing coverage, with over 700,000 hotspots deployed to date. And Helium expended very little capital to do it, compared with building out the infrastructure on its own.

I started with one of these Dragino LHT65 temperature sensors. The fact that I hadn't deployed my own hotspot was immaterial because there's plenty of coverage around me.

LHT65 Temperature Sensor

Unlike a Wifi network, you don't put the network's credentials in the device; you put the device's credentials (keys) in the network. Once I'd done that, the sensor started connecting to hotspots near my house and transmitting data. Today I've been driving around with it in my truck and it roams onto other hotspots as needed, still reporting temperatures.
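
Concretely, onboarding means registering the device's identifiers and root key with the network server. The identifiers below are made up, and the console or API details vary by network, but the shape is the same for any LoRaWAN deployment:

    # Hypothetical LoRaWAN device onboarding: the device's credentials go
    # into the network, not the other way around.
    device = {
        "dev_eui": "A84041FFFE000001",   # made-up unique device identifier
        "join_eui": "A840410000000101",  # made-up application/join identifier
        "app_key": "2B7E151628AED2A6ABF7158809CF4F3C",  # made-up root key
    }

    # During over-the-air activation (OTAA), the network server uses app_key
    # to derive session keys. Hotspots merely relay packets, which is why the
    # sensor can roam between any hotspots providing coverage.
    for field, value in device.items():
        print(f"{field}: {value}")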

Temperature Sensor Coverage on Helium

Transmitting data on the Helium network costs money. You pay for data with data credits (DC), which you buy with the Helium token (HNT). Each DC costs a fixed rate of $0.00001 and covers up to 24 bytes of payload. That's about $0.42/MB, which isn't dirt cheap compared to mobile data rates, but you only pay for the data you actually use. A sensor transmitting 3 packets per hour for a year costs about $0.26, so a 100-sensor deployment runs roughly $26 per year (the calculation below shows the arithmetic). If each of those sensors needed a SIM card and cellular account, the comparable price would be orders of magnitude higher. So the model fits IoT sensor deployments well. And the LHT65 has an expected battery life of 10 years (at 3 packets per hour), which is also great for large-scale sensor deployments.
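
Here's the back-of-the-envelope math, assuming each packet fits within a single 24-byte data credit:

    # Helium data-credit cost model, using the rates above.
    DC_PRICE = 0.00001   # dollars per data credit
    BYTES_PER_DC = 24    # one DC covers up to 24 bytes of payload

    def yearly_cost(sensors: int, packets_per_hour: int, dc_per_packet: int = 1) -> float:
        """Yearly cost in dollars if every packet consumes dc_per_packet credits."""
        packets = sensors * packets_per_hour * 24 * 365
        return packets * dc_per_packet * DC_PRICE

    print(yearly_cost(1, 3))                      # ~0.26  (one sensor)
    print(yearly_cost(100, 3))                    # ~26.28 (100-sensor fleet)
    print(1_000_000 / BYTES_PER_DC * DC_PRICE)    # ~0.42 dollars per MB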

Being able to deploy sensors without also having to build and manage the connection infrastructure is a big deal. I could put 100 sensors up around a campus, a city, a farm, or just about anywhere and begin collecting data from them without worrying about the infrastructure, the cost, or maintenance. My short-term goal is to start using these with Picos and build out some rulesets and the UI for using and managing LoRaWAN sensors. I also have one of these SenseCAP M1 LoRaWAN gateways that I'm going to deploy in Idaho later (there are already several hotspots near my home in Utah). I'll let you know how all this goes.


Photo Credit: Helium discharge tube from Heinrich Pniok (CC BY-NC-ND 3.0). Image was cropped vertically.


The Ukrainian War, PKI, and Censorship

Flag of Russia with HTTPS bar overlaid

Each semester I have students in my distributed systems class read Rainbow's End, a science fiction book by Vernor Vinge set in the near future. I think it helps them imagine a world with vastly distributed computing infrastructure that is not as decentralized as it could be, and to think about the problems that causes. One of the plot points involves using certificate authorities (CAs) for censorship.

To review briefly, certificate authorities are key players in public key infrastructure (PKI) and are an example of a core internet service that is distributed and hierarchical. Whether your browser trusts the certificate my web server returns depends on whether it trusts the certificate used to sign it, and so on up the certificate chain to the root certificate. Root certificates are held in browsers or operating systems. If the root certificate isn't known to the system, then it's not trusted. Each certificate might be controlled by a different organization (i.e., the organization holds the private key used to sign it), but they all depend on confidence in the root. Take out the root and the entire chain collapses.
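
The chain walk is simple enough to sketch. Here's a simplified validator using the Python cryptography package, assuming RSA certificates; real validators also check validity dates, revocation, and name constraints. The point is the last step: everything hinges on the root being in the trusted set.

    from cryptography import x509
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding

    def verify_chain(chain: list[x509.Certificate], trusted_roots: set[bytes]) -> None:
        """chain[0] is the leaf; each certificate must be signed by its successor."""
        for cert, issuer in zip(chain, chain[1:]):
            issuer.public_key().verify(   # raises InvalidSignature if a link is broken
                cert.signature,
                cert.tbs_certificate_bytes,
                padding.PKCS1v15(),
                cert.signature_hash_algorithm,
            )
        if chain[-1].fingerprint(hashes.SHA256()) not in trusted_roots:
            raise ValueError("chain does not terminate at a trusted root")

Revoke or distrust that one root fingerprint and every certificate below it fails, no matter how valid the rest of the chain is.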

Certificate validation path for windley.com

The war in Ukraine has made hypothetical worries about the robustness of the PKI all too real. Because of the sanctions imposed on Russia, web sites inside Russia can't pay foreign CAs to renew their certificates. Modern browsers don't just shrug this off, but issue warnings and sometimes even block access to sites with expired certificates. So, the sanctions threaten to cripple the Russian web.

In response, Russia has established its own root certificate authority (see also this from KeyFactor). This is not merely a homegrown CA, located in Russia, but a state-operated CA, subject to the whims and will of the Russian government (specifically the Ministry of Digital Development).

This is interesting from several perspectives. First, from a censorship perspective, it means that Russia can effectively turn off web sites by revoking their certificates, allowing the state to censor web sites for any reason they see fit. Hierarchical networks are especially vulnerable to censorship. And while we might view state-controlled CAs as a specific problem, any CA could be a point of censorship. Recall that while SWIFT is a private company, it is located in Belgium and subject to Belgian and European law. Once Belgium decided to sanction Russia, SWIFT had to go along. Similarly, a government could pass a law mandating the revocation of any certificate for a Russian company and CAs subject to their legal jurisdiction would go along.

From the perspective of users, it's also a problem. Only two browsers support the root certificate of the new Russian CA: the Russian-based Yandex and open-source Atom. I don't think it's likely that Chrome, Safari, Firefox, Brave, Edge, and others will be adding the new Russian root CA anytime soon. And while you can add certificates manually, most people will find that difficult.

Lastly, it's a problem for the Russian economy. The new Russian CA is a massive single point of failure, even if the Russian government doesn't use it to censor. Anonymous, state actors, and other groups can target the new CA and bring large swaths of the Russian internet down. So, state-controlled and -mandated CAs are a danger to the economy they serve. Russia's actions in response to the exigency of the war are understandable, but I suspect the country won't go back even after the war ends. Dependence on a single state-run CA is a problem for Russia and its citizens.

State-controlled CAs further balkanize the internet. They put web sites at risk of censorship. They make life difficult for users. They create centralized services that threaten economic stability and vibrancy. In general, hierarchies are not good architectures for building robust, trustworthy, and stable digital systems. PKI has allowed us to create a global trust framework for the web. But the war in Ukraine has shone a light on its weaknesses. We should heed this warning to engineer more decentralized infrastructures that give us confidence in our digital communications.



Are Transactional Relationships Enough?

A Happy Couple with Young Children

We don't build identity systems to manage identities. Rather we build identity systems to manage relationships.

Given this, we might rightly ask what kind of relationships are supported by the identity systems that we use. Put another way, what kind of online relationships do you have? If you're like me, the vast majority are transactional. Transactional relationships focus on commercial interactions, usually buying and selling, but not always that explicit. Transactional relationships look like business deals. They are based on reciprocity.

My relationships with Amazon and Netflix are transactional. That's appropriate and what I expect. But what about my relationships on Twitter? You might argue that they are between friends, colleagues, or even family members. But I also classify them as transactional.

My relationships on Twitter exist within Twitter's administrative control, and Twitter facilitates those relationships in order to monetize them. Even though you're not directly participating in the monetization and may even be unaware of it, it nevertheless colors the kind, frequency, and intimacy of the interactions you have. Twitter is building its platform and making product decisions to facilitate, and even promote, the kinds of interactions that provide the most profit to Twitter. Your attention and activity are the product in the transaction. What I can do in the relationship is wholly dependent on what Twitter allows.

Given this classification, the bulk of our online relationships are transactional, or at least commercial. Very few are what we might call interactional relationships. Email is one of the bright exceptions to this landscape of transactional, administrated online relationships. If Alice and Bob exchange emails, they both have administrative, transactional relationships with their respective email providers, but the interaction does not necessarily take place within the administrative realm of a single email provider. Let's explore what makes email different.

The most obvious difference between email and many other systems that support online relationships is that email is based on a protocol. As a result:

  • The user picks and controls the email server—With an email client, you have a choice of multiple email providers. You can even run your own email server if you like.
  • Data is stored on a server "in the cloud"—The mail client needn't store any user data beyond account information. While many email clients store email data locally for performance reasons, the real data is in the cloud.
  • Mail client behavior is the same regardless of what server it connects to—As long as the mail client is talking to a mail server that speaks the right protocol, it can provide the same functionality.
  • The client is fungible—I can pick my mail client on the basis of the features it provides without changing where I receive email.
  • I can use multiple clients at the same time—I can use one email client at home and a different email client at work and still see a single, consistent view of my email. I can even access my mail from a Web client if I don't have my computer handy.
  • I can send you email without knowing anything but your email address—none of the details about how you receive and process email are relevant to me. I simply send email to your address.
  • Mail servers can talk to each other across ownership boundaries—I can use Gmail, you can use Yahoo! mail and the mail still gets delivered.
  • I can change email providers easily or run my own server—I receive email at windley.org even though I use Gmail. I used to run my own server. If Gmail went away, I could start running my own server again. And no one else needs to know.

In short, email was designed with the architecture of the internet in mind. Email is decentralized and protocological. Email is open—not necessarily open-source—but open in that anyone can build clients and servers that speak its core protocols: IMAP and SMTP. As a result, email maximizes freedom of choice and minimizes the chance of disruption.
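
Because the protocols are open, a complete, interoperable mail sender fits in a few lines. Here's a minimal sketch using Python's standard library; the hostnames and credentials are placeholders:

    # Sending mail over SMTP: any server speaking the protocol works,
    # regardless of who operates it.
    import smtplib
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.org"      # a different provider entirely
    msg["Subject"] = "Protocols, not platforms"
    msg.set_content("This message crosses ownership boundaries because SMTP is open.")

    with smtplib.SMTP("smtp.example.com", 587) as server:
        server.starttls()              # encrypt the channel
        server.login("alice@example.com", "app-password")
        server.send_message(msg)

Nothing in this code cares which providers Alice and Bob use; that's exactly the property the list above describes.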

The features and benefits that email provides are the same ones we want for every online relationship. These properties allow us to use email to create interactional relationships. The important insight is that systems that support interactional relationships can easily support transactional ones as well, where necessary. But the converse is not true. Systems for building transactional relationships don't readily support interactional ones.

I believe that email's support for richer relationships is a primary reason it continues to be used despite the rise of social media and work platforms like Slack and Teams. I'm not saying email is the right platform for supporting modern, online interactional relationships—it's not. Email has obvious weaknesses, most prominently that it doesn't support mutual authentication of the parties to a relationship and therefore suffers from problems like SPAM and phishing attacks. Less often noted, but equally disqualifying, is that email doesn't easily lend itself to layering other protocols on top of it—creative uses of MIME notwithstanding.

I've been discussing appropriate support for authenticity and privacy in the last few posts, a key requirement for interactional relationships on the internet. A modern answer to interactional relationships has to support all kinds of relationships from pseudonymous, ephemeral ones to fully authenticated ones. DIDComm is a good candidate. DIDComm has the beneficial properties of email while also supporting mutual authentication for relationship integrity and layered protocols for flexible relationship utility. These properties provide an essential foundation for building rich online relationships that feel more life-like and authentic, provide better safety, and allow people to live effective online lives.


Photo Credit: A Happy Couple with Young Children from Arina Krasnikova (Pexels License)


Provisional Authenticity and Functional Privacy

Ancient strongbox with knights (Frederiksborg Museum)

Last week, I discussed the tradeoffs between privacy, authenticity, and confidentiality, concluding that the real tradeoff is usually between privacy and authenticity. That might seem to pit privacy against accountability, leaving us with a forced choice where good privacy is impossible if we want to prevent fraud. Fortunately, the tradeoff is informed by a number of factors, making the outcome not nearly as bleak as it might appear at first.

Authenticity is often driven by a need for accountability1. Understanding accountability helps navigate the spectrum of choices between privacy and authenticity. As I mentioned last week, Know Your Customer (KYC) and Anti-Money Laundering (AML) regulations require that banks be able to identify the parties to transactions. That's why, when you open a bank account, they ask for numerous identity documents. The purpose is to enable law enforcement to determine the actors behind transactions deemed illegal (hopefully with a warrant). Technically, this is a bias toward authenticity at the cost of privacy. But there are nuances. The bank collects this data but doesn't need to use it unless there's a question of fraud or money laundering2.

The point is that while in a technical sense, the non-repudiability of bank transactions makes them less private, there aren't a lot of people who are concerned about the privacy of their banking transactions. The authenticity associated with those transactions is provisional or latent3. Transactions are only revealed to outside parties when legally required and most people don't worry about that. From that perspective, transactions with provisional authenticity are private enough. We might call this functional privacy.

I've used movie tickets several times as an example of an ephemeral transaction that doesn't need authenticity to function and thus is private. But consider another example where an ephemeral, non-authenticated transaction is not good enough. A while back our family went to the ice-skating rink. We bought a ticket to get in, just like at the movies. But each of us also signed a liability waiver. That waiver, which the skating rink required to reduce their risk, meant that the transaction was much less private. Unlike the bank, where I feel confident my KYC data is not being shared, I don't know what the skating rink is doing with the data.

This is a situation where minimal disclosure doesn't help me. I've given away the data needed to hold me accountable in the case of an accident. No promise was made to me about what the rink might do with it. The only way to hold me accountable and protect my privacy is for the authenticity of the transaction to be made provisional through agreement. If the skating rink were to make strong promises that the data would only be used in the event that I had an accident and threatened to sue, then even though I'm identified to the rink, my privacy is protected except in clearly defined circumstances.

Online, we can make authenticity's provisionality even more trustworthy using cryptographic commitments and key escrow. The idea is that any data about me that's needed to enforce the waiver would be hidden from the rink, unchangeable by me, and only revealed if I threaten to sue. This adds a technical element and allows me to exchange my need to trust the rink for trust in the escrow agent. Trusting the escrow agent might be more manageable than trusting every business I interact with. Escrow services could be regulated as fiduciaries to increase trust.
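
A hash commitment is the simplest building block for this. The sketch below is illustrative, not a full escrow protocol: the rink stores only the digest, the escrow agent holds the data and nonce, and revealing them later proves the data wasn't changed.

    # Minimal hash commitment: hiding (the digest reveals nothing about the
    # data) and binding (different data can't match the same digest later).
    import hashlib, secrets

    def commit(data: bytes) -> tuple[str, bytes]:
        nonce = secrets.token_bytes(32)
        digest = hashlib.sha256(nonce + data).hexdigest()
        return digest, nonce   # publish digest; escrow (data, nonce)

    def verify(digest: str, data: bytes, nonce: bytes) -> bool:
        return hashlib.sha256(nonce + data).hexdigest() == digest

    digest, nonce = commit(b"name: Alice; waiver signed 2022-04-30")
    assert verify(digest, b"name: Alice; waiver signed 2022-04-30", nonce)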

Provisional authenticity works when the data is only needed in low-probability events. Often, however, data is actively used to provide utility in the relationship. In these cases, confidentiality agreements, essentially NDAs, are the answer to providing functional privacy while also providing the authenticity needed for accountability and utility. These agreements can't be the traditional contracts of adhesion where, rather than promising to protect confidentiality, companies force people to consent to surveillance. Agreements should be written to ensure that data is always shared with the same promise of confidentiality that existed in the root agreement.

Provisional authenticity and data NDAs provide good tools for protecting functional privacy without giving up accountability and relationship utility. Functional privacy and accountability are both necessary for creating digital systems that respect and protect people.


Notes

  1. Beyond accountability, a number of businesses make their living surveilling people online or are remunerated for aiding in the collection of data that informs that surveillance. I've written about the surveillance economy and ideas for dealing with it previously.
  2. Note that I said need. I'm aware that banks likely use it for more than this, often without disclosing how it's being used.
  3. I'm grateful to Sam Smith for discussions that helped me clarify my thinking about this.

Photo Credit: Ancient strongbox with knights (Frederiksborg Museum) from Thomas Quine (CC BY 2.0)


Privacy, Authenticity, and Confidentiality

Who's Who in New Zealand 228

At a recent Utah SSI Meetup, Sam Smith discussed the tradeoff between privacy, authenticity, and confidentiality. Authenticity allows parties to a conversation to know to whom they are talking. Confidentiality ensures that the content of the conversation is protected from others. These three create a tradespace because you can't achieve all three at the same time. Since confidentiality is easily achieved through encryption, we're almost always trading off privacy and authenticity. The following diagram illustrates these tradeoffs.

Privacy, authenticity, and confidentiality must be traded against each other.

Authenticity is difficult to achieve in concert with privacy because it affects the metadata of a conversation. Often it requires that others besides the parties to a conversation be able to know who is participating—that is, it requires non-repudiation. Specifically, if Alice and Bob are communicating, not only does Alice need to know she's talking to Bob, but she also needs the ability to prove to others that she and Bob were communicating.

As an example, modern banking laws include a provision known as Know Your Customer (KYC). KYC requires that banks be able to identify the parties to transactions. That's why, when you open a bank account, they ask for numerous identity documents. The purpose is to enable law enforcement to determine the actors behind transactions deemed illegal (hopefully with a warrant). So, banking transactions are strong on authenticity, but weak on privacy1.

Authenticity is another way of classifying digital relationships. Many of the relationships we have are (or could be) ephemeral, relying more on what you have than who you are. For example, a movie ticket doesn't identify who you are but does identify what you are: one of N people allowed to occupy a seat in a specific theater at a specific time. You establish an ephemeral relationship with the ticket taker, she determines that your ticket is valid, and you're admitted to the theater. This relationship, unless the ticket taker knows you, is strong on privacy, weak on authenticity, and doesn't need much confidentiality either.

A credit card transaction is another interesting case that shows the complicated nature of privacy and authenticity in many relationships. To the merchant, a credit card says something about you, what you are (i.e., someone with sufficient credit to make a purchase) rather than who you are—strong on privacy, relatively weak on authenticity. To be sure, the merchant does have a permanent identifier for you (the card number) but is unlikely to be asked to use it outside the transaction.

But, because of KYC, you are well known to your bank, and the rules of the credit card network ensure that you can be identified by transaction for things like chargebacks and requests from law enforcement. So, this relationship has strong authenticity but weaker privacy guarantees.

The tradeoff between privacy and authenticity is informed by the Law of Justifiable Parties (see Kim Cameron's Laws of Identity) that says disclosures should be made only to entities who have a need to know. Justifiable Parties doesn't say everything should be maximally private. But it does say that we need to carefully consider our justification for increasing authenticity at the expense of privacy. Too often, digital systems opt for knowing who when they could get by with what. In the language we're developing here, they create authenticated, permanent relationships, at the expense of privacy, when they could use pseudonymous, ephemeral relationships and preserve privacy.

Trust is often given as the reason for trading privacy for authenticity. This is the result of a mistaken understanding of trust. What many interactions really need is confidence in the data being exchanged. As an example, consider how we ensure a patron at a bar is old enough to drink. We could create a registry and have everyone who wants to drink register with the bar, providing several identifying documents, including birthdate. Every time you order, you'd authenticate, proving you're the person who established the account and allowing the bar to prove who ordered what when. This system relies on who you are.

But this isn't how we do it. Instead, your relationship with the bar is ephemeral. To prove you're old enough to drink you show the bartender or server an ID card that includes your birthday. They don't record any of the information on the ID card, but rather use the birthday to establish what you are: a person old enough to drink. This system favors privacy over authenticity.

The bar use case doesn't require trust—the willingness to rely on someone else to perform actions on the trustor's behalf—in the ID card holder2. But it does require confidence in the data. The bar needs to be confident that the person drinking is over the legal age. Both systems provide that confidence, but one protects privacy, and the other does not. Systems that need trust generally need more authenticity and thus have less privacy.

In general, authenticity is needed in a digital relationship when there is a need for legal recourse and accountability. Different applications will judge the risk inherent in a relationship differently and hence have different tradeoffs between privacy and authenticity. I say this with some reticence since I know in many organizations, the risk managers are incredibly influential with business leaders and will push for accountability where it might not really be needed, just to be safe. I hope identity professionals can provide cover for privacy and the arguments for why confidence is often all that is needed.


Notes

  1. Remember, this doesn't mean that banking transactions are public, but that others besides the participants can know who participated in the conversation.
  2. The bar does need to trust the issuer of the ID card. That is a different discussion.

Photo Credit: Who's Who in New Zealand 228 from Schwede66 (CC BY-SA 4.0)


What is Privacy?

Hide and seek (2)

Ask ten people what privacy is and you'll likely get twelve different answers. The reason for the disparity is that your feelings about privacy depend on context and your experience. Privacy is not a purely technical issue, but a human one. Long before computers existed, people cared about and debated privacy. Future U.S. Supreme Court Justice Louis Brandeis defined it as "the right to be let alone" in 1890.

Before the Web became ubiquitous, people primarily thought of privacy in terms of government intrusion. But the march of technological progress means that private companies probably have much more data about you than your government does. Ad networks, and the valuation of platforms based on how well they display ads to their users, have led to the widespread surveillance of people, usually without their knowing the extent or consequences.

The International Association of Privacy Professionals (IAPP) defines four classes of privacy:

Bodily Privacy—The protection of a person's physical being and any invasion thereof. This includes practices like genetic testing, drug testing, or body cavity searches.

Communications Privacy—The protection of the means of correspondence, including postal mail, telephone conversations, electronic mail, and other forms of communication.

Information Privacy—The claim of individuals, groups, or organizations to determine for themselves when, how, and to what extent information about them is communicated to others.

Territorial Privacy—Placing limitations on the ability of others to intrude into an individual's environment. Environment can be more than just the home, including workplaces, vehicles, and public spaces. Intrusions of territorial privacy can include video surveillance or ID checks.

While Bodily and Territorial Privacy can be issues online, Communications and Information Privacy are the ones we worry about the most and the ones most likely to have a digital identity component. To begin a discussion of online privacy, we first need to be specific about what we mean when we talk about online conversations.

Each online interaction consists of packets of data flowing between parties. For our purposes, consider that a conversation. Even a simple Internet Control Message Protocol (ICMP) echo request packet is a conversation as we're defining it—the message needn't be meaningful to humans.

Conversations have content and they have metadata—information about the conversation. In an ICMP echo, there's only metadata—the IP and ICMP headers1. The headers include information like the source and destination IP addresses, the TTL (time-to-live), message type, checksums, and so on. In a more complex protocol, say SMTP for email, there would also be content—the message—in addition to the metadata.

Communication Privacy is concerned with metadata. Confidentiality is concerned with content2. Put another way, for a conversation to be private, only the parties to the conversation should know who the other participants are. More generally, privacy concerns the control of any metadata about an online conversation so that only parties to the conversation know the metadata.

Defined in this way, online privacy may appear impossible. After all, the Internet works by passing packets from router to router, all of which can see the source IP address and must know the destination IP address. Consequently, at the packet level, there's no online privacy.

But consider the use of TLS (Transport Layer Security) to create an encrypted web channel between the browser and the server. At the packet level, the routers will know (and the operators of the routers can know) the IP addresses of the encrypted packets going back and forth. If a third party can correlate those IP addresses with the actual participants, then the conversation isn't absolutely private.

But other metadata—the headers—is private. Beyond the host name and information needed to set up the TLS connection, all the rest of the headers are encrypted. This includes cookies and the URL path. So, someone eavesdropping on the conversation will know the server name, but not the specific place on the site the browser connected to. For example, suppose Alice visits Utah Valley University's Title IX office (where sexual misconduct, discrimination, harassment, and retaliation are reported) by pointing her browser at uvu.edu/titleix. With TLS, an eavesdropper could know that Alice connected to Utah Valley University, but not that she connected to the web site for the Title IX office, because the path is encrypted.

Extending this example, we can easily see the difference between privacy and confidentiality. If the Title IX office were located at a subdomain of uvu.edu, say titleix.uvu.edu, then an eavesdropper would be able to tell that Alice had connected to the Title IX web site, even if the conversation were protected by a TLS connection. The content that was sent to Alice and that she sent back would be confidential, but the important metadata showing that Alice connected to the Title IX office would not be private.

This example introduces another important term to this discussion: authenticity. If Alice goes to uvu.edu instead of titleix.uvu.edu, then an eavesdropper cannot easily establish to whom Alice is speaking at UVU—there are too many possibilities. And unless Alice's IP address is easily correlated with her, an eavesdropper can't reliably authenticate Alice either. So, while Alice's conversation with the Title IX office through uvu.edu is not absolutely private, it is probably private enough because we can't easily authenticate the parties to the conversation from the metadata alone.

Information Privacy, on the other hand, is distinguished from Communications Privacy online because it usually concerns content, rather than metadata. When Alice connects with the Title IX office, to extend the example from the previous paragraph, she might transmit data to the office, possibly by filling out Web forms, or even just by authenticating, allowing the Web site to identify Alice and correlate other information with her. All of this is done inside the confidential channel provided by the TLS connection. But Alice will still be concerned about the privacy of the information she's communicated.

Information privacy quickly gets out of the technical realm and into policy. How will Alice's information be handled? Who will see it? Will it be shared? With whom and under what conditions? These are all policy questions that impact the privacy of information that Alice willingly shared. Information privacy is generally about who controls disclosure.

Communications Privacy often involves the involuntary collection of metadata—surveillance. Information Privacy usually involves policies and practices for handling data that has been voluntarily provided. Of course, there are places where these two overlap. Data created from metadata becomes personally identifying information (PII), subject to privacy concerns that might be addressed by policy. Still, the distinction between Communications and Information Privacy is useful.

The intersection of Communications and Information Privacy is sometimes called Transactional3 or Social Privacy. Transactional privacy is worth exploring as a separate category because it is always evaluated in a specific context. Thus, it speaks to people's real concerns and their willingness to trade off privacy for a perceived benefit in a specific transaction. Transactional privacy concerns can be more transient.

The modern Web is replete with transactions that involve both metadata and content data. The risks of this data being used in ways that erode individual privacy are great. And because the mechanisms are obscure—even to Web professionals—people can't make good privacy decisions about the transactions they engage in. Transactional privacy is consequently an important lens for evaluating the privacy rights of people and the ways technology, policy, regulation, and the law can protect them.

With privacy, we're almost never dealing with absolutes. Absolute digital privacy can be achieved by simply never using the Internet. But that also means being absolutely cut off from online interaction. Consequently, privacy is a spectrum, and we must choose where we should be on that spectrum, taking all factors into consideration. Since confidentiality is easily achieved through encryption, we're almost always trading off privacy and authenticity. More on that next week.


Notes

  1. ICMP packets can have data in the packet, but it's optional and almost never set.
  2. This distinction between privacy and confidentiality isn't often made in casual conversation where people often say they want privacy when they really mean confidentiality.
  3. I have seen the term "transactional privacy" used to describe the idea of people selling their own data outright. That is not the sense in which I'm using it here. I'm speaking more generally of the interactions that take place online.

Photo Credit: Hide and seek (2) from Ceescamel (CC BY-SA 4.0)