Hannah Hearth

UX for Configuration Schema

UX for developer tooling is the hot new thing for designers! We’re beginning to see great resources like these guidelines and principles for designing a good user experience in command line tools.

And I recently stumbled upon a Stack Overflow question that asks, “Are configuration files considered user interfaces?” My opinion: Yes.

I work on HashiCorp’s UX design team for Consul, a service networking solution, which is a configuration-heavy application. On the Consul team, designers and engineers partner to create experiences where users may need to write a configuration file and then run a command that processes that configuration into network infrastructure.

So, even before the user gets to the CLI half of that experience, how do we design a good UX for the configuration file itself?

UX Principles & Guidelines

To summarize, we should aim to design configuration experiences that:

  • Ease adoption
  • Reduce cognitive load
  • Reduce dependencies
  • Minimize error blast-radius
  • Deprecate gradually and gracefully

Ease adoption

Reduce any friction points in the user’s getting-started experience. We can do that first by setting reasonable defaults on every property, so that the user doesn’t have to manually configure much, if anything at all.

A great example of this is the Consul Helm Chart, a massive configuration file that has literally hundreds of options. Infinite customizations are great for advanced users, but imagine if you had to make a decision about every single configuration property just to get started. Instead, users don’t need to know how it works or what they want at all. They can get started without setting a single property:

$ helm install consul hashicorp/consul --create-namespace -n consul

As the user grows in their understanding of the product and their own needs, they can customize the values in their file and use more advanced properties.

Another nice feature you can provide is the ability for the user to reset their file to the original defaults. This might allow users to quickly spin up demos to try out functionality and then restart from scratch.
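
To make the defaults-first idea concrete, here is a minimal sketch in Python of overlaying a user's file on top of a full set of defaults. The property names in DEFAULTS and the with_defaults helper are hypothetical and chosen for illustration; they are not the real Consul Helm chart values or how Helm itself merges them.

```python
from copy import deepcopy

# Hypothetical defaults; these keys are illustrative, not real Helm chart values.
DEFAULTS = {
    "server": {"replicas": 1, "storage": "10Gi"},
    "ui": {"enabled": True},
}


def with_defaults(user_config, defaults=DEFAULTS):
    """Overlay the user's settings on the defaults, recursing into nested maps."""
    merged = deepcopy(defaults)
    for key, value in user_config.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections so that setting one property keeps its siblings' defaults.
            merged[key] = with_defaults(value, merged[key])
        else:
            merged[key] = deepcopy(value)
    return merged


# A brand-new user sets nothing and still gets a complete configuration.
getting_started = with_defaults({})

# A more advanced user overrides a single property and inherits the rest.
customized = with_defaults({"server": {"replicas": 3}})
```

Because the defaults are never mutated, "reset to defaults" in this sketch is just with_defaults({}) again.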

We can also ease adoption by providing common examples. You may want to document a “production-ready” example alongside a simpler example that gets them to a working demo faster.

Reduce cognitive load

In other words, don’t make the user think too hard. We can reduce cognitive load by choosing descriptive, concise names for properties, and by being consistent about it, two things that are easier said than done. To learn more about naming things well, I’ll urge you to check out Kate Gregory’s talk.

In general, my mini research process for naming includes:

  • Market research for what this thing is typically called in our industry across blogs, open source, or social media, depending on the project
  • What do our competitor products call it in their API, CLI, or UI?
  • How do our company’s other product lines refer to something similar?
  • What is the long-term vision for this feature, so that even if it looks one way today, it’s flexible for future growth?
  • What “word type” am I looking for? (descriptive, common noun? passive adjective? There are more options here than you might realize at first)
  • Sticky notes board with synonyms, with a pros and cons list for each

Reduce dependencies

Will the value of one property impact how the user can configure another property? Or, even worse, will it impact a completely separate set of properties in other configuration files?

Unlike CLIs, APIs, and UIs, configuration files don’t have a call and response option. There’s no way to inform a user immediately that their action (changing a property value) has had an unexpected effect on the validity of the rest of their file or application. So, this means we need to reduce dependencies between properties.
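
One way to see the cost of a dependency is to write down the rule it implies. The sketch below, in Python, uses a made-up pair of properties (tls.enabled and tls.cert_file) and a hypothetical check_dependencies helper. Every rule in the list is something the user has to know about without being told, and it can only be checked once the file is finally processed.

```python
# Hypothetical cross-property rules. Each entry pairs a human-readable description
# with a predicate that returns True when the configuration satisfies the rule.
RULES = [
    (
        "tls.cert_file is required when tls.enabled is true",
        lambda c: not c.get("tls", {}).get("enabled")
        or bool(c.get("tls", {}).get("cert_file")),
    ),
]


def check_dependencies(config, rules=RULES):
    """Return the description of every cross-property rule the config violates."""
    return [description for description, is_valid in rules if not is_valid(config)]


# The user flips one property and silently invalidates the file.
problems = check_dependencies({"tls": {"enabled": True}})
for problem in problems:
    print("invalid configuration:", problem)
```

The shorter the RULES list, the less the user has to hold in their head while editing.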

Minimize error blast-radius

Okay, so we want to reduce dependencies, but that’s not always avoidable. In cases where one property value would impact another property’s value options, we need to make sure the dependencies that exist are as clear as possible or even automate their changes.

For example, in Consul, when you add a config for an ingress gateway, it doesn’t simply work on its own. You also need to take the second step of adding to your intention configuration to ensure the ingress gateway can actually connect with the destination service.

To illustrate, here’s a file showing the ingress gateway config:

Kind = "ingress-gateway"
Name = "us-east-ingress"
Listeners = [
  {
    Port     = 3456
    Protocol = "tcp"
    Services = [
      {
        Name = "db"
      }
    ]
  }
]

And then separately, in your intention configuration file, here’s where you would add that ingress gateway:

Kind = "service-intentions"
Name = "db"
Sources = [
  {
    Name   = "us-east-ingress"
    Action = "allow"
  }
]

In user testing, we saw that people were likely to create the ingress gateway, but unlikely to know that they needed to update their intentions configuration as a result. At the same time, having intentions as a separately configurable entity provides a desirable level of security, so we want to keep that separation.

So, how can we minimize potential user error here? One idea we’ve brainstormed (but haven’t implemented yet) is a command line helper that fills in missing configuration. For example, suppose the CLI tool scans our configuration entries and notices an ingress gateway with a listener for db, but no intention allowing that gateway to connect to db. It could assume that was user error, automatically generate the missing config on the user’s behalf, stage it, and ask the user for confirmation. If that’s too risky, the CLI tool could stop short of automating and simply lint: find the gaps and report them so that users can fix them themselves.
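
As a rough illustration of that lint idea, here is a Python sketch that works on config entries already parsed into dictionaries shaped like the two files above. The find_missing_intentions and suggest_intention functions are hypothetical, not part of any Consul tooling, and the sketch ignores wildcard names and merging into an existing intention for the same destination.

```python
def find_missing_intentions(entries):
    """Return (gateway, service) pairs that an ingress gateway exposes
    but that no "allow" intention covers."""
    allowed = set()
    for entry in entries:
        if entry.get("Kind") != "service-intentions":
            continue
        for source in entry.get("Sources", []):
            if source.get("Action") == "allow":
                allowed.add((source["Name"], entry["Name"]))

    missing = []
    for entry in entries:
        if entry.get("Kind") != "ingress-gateway":
            continue
        for listener in entry.get("Listeners", []):
            for service in listener.get("Services", []):
                pair = (entry["Name"], service["Name"])
                if pair not in allowed and pair not in missing:
                    missing.append(pair)
    return missing


def suggest_intention(gateway, service):
    """Build the intention entry a helper could stage for the user to confirm."""
    return {
        "Kind": "service-intentions",
        "Name": service,
        "Sources": [{"Name": gateway, "Action": "allow"}],
    }


gateway = {
    "Kind": "ingress-gateway",
    "Name": "us-east-ingress",
    "Listeners": [{"Port": 3456, "Protocol": "tcp", "Services": [{"Name": "db"}]}],
}

# Lint mode: report the gap and leave the fix to the user.
for gw, svc in find_missing_intentions([gateway]):
    print(f"lint: ingress gateway '{gw}' exposes '{svc}' but no intention allows it")
```

The lint-only and auto-generate variants share the same detection step; they differ only in whether suggest_intention's output is printed as a hint or staged for confirmation.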

Deprecate gradually and gracefully

Ideally, we would always design schema to be flexible and extensible enough to grow with our needs. But sometimes we ship an MVP without knowing what the future holds, so even though we may strive to maintain backwards compatibility, we may need to deprecate individual properties or even an entire configuration schema.

When we need to deprecate a configuration property or type, our goal should be to do so gradually, transparently, and gracefully.

To deprecate gradually, we should give users at least one major release of warning before a property or type goes away. This might mean that we have to support two configuration properties or types that do roughly the same thing within the same release.
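
Supporting an old and a new property side by side might look something like the following Python sketch. The property names (timeout and request_timeout) and the normalize helper are invented for illustration; the point is only that the old name keeps working while the user is told what to change.

```python
# Hypothetical mapping from deprecated property names to their replacements.
DEPRECATED = {"timeout": "request_timeout"}


def normalize(config, deprecated=DEPRECATED):
    """Accept deprecated property names for now, but collect a warning for each one."""
    result = dict(config)
    warnings = []
    for old, new in deprecated.items():
        if old not in result:
            continue
        value = result.pop(old)
        if new in result:
            # When both are set, the new property wins and the old one is dropped.
            warnings.append(f"'{old}' is deprecated and ignored because '{new}' is also set")
        else:
            result[new] = value
            warnings.append(f"'{old}' is deprecated; use '{new}' instead")
    return result, warnings


config, warnings = normalize({"timeout": 5})
for message in warnings:
    print("warning:", message)
```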

For example, when Consul announced a massive improvement to the ACL system, we needed to move toward a new configuration schema for tokens. In the old ACL model, users could specify a rule directly within a token, like this:

"Name": "Example Token",
"Type": "client",
"Rules": 
  "node" { 
    policy = "write" 
  } 
  "service" { 
    policy = "read" 
  }

In the new ACL system, users can create token config that references separate config for roles and policies, where the rule specifics live. This separation is helpful so that people can reuse the same policy sets across many tokens, but only need to change them in one place.

For example, a new token might look like this:

"Name": "Example Token",
"Description": "Just an example",
"Policies": ["Example Policy"]

While the policy configuration for that token looks like this:

"Name": "Example Policy",
"Rules": {
  service_prefix "web" {
    policy = "write"
  }
}

But what do we do about the old token format? Supporting both for eternity would cause confusion and reduce our ability to fully support features with the new UX.

Because the ACL system is the backbone of security for Consul deployments, we chose to leave the old format available for multiple major releases, providing warnings that it would be deprecated in a future release. That covers our gradual and transparent goals, but what about our goal of doing it gracefully?

Many customers, for instance, had thousands of ACL tokens. How might we enable them to migrate to the new token format before deprecation without forcing them into a manual process rife with human error?

To gracefully deprecate a configuration feature, it’s always best to provide an automated migration option that users can choose to use, but aren’t required to.

For our ACL upgrade, we designed our automation option like this:

consul acl policy create -name "migrated-$id" -from-token $id
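
Conceptually, this kind of migration splits one old-style token into a reusable policy plus a token that references it. The Python sketch below shows that split using the field names from the examples above. It is an illustration only, not how Consul implements the command: migrate_token is a hypothetical helper, dropping the "Type" field is an assumption, and a real migration may also need to translate the rule syntax, which this sketch leaves untouched.

```python
def migrate_token(legacy_token, token_id):
    """Split a legacy token into (policy, new_token) without modifying the original."""
    policy = {
        "Name": f"migrated-{token_id}",
        "Rules": legacy_token["Rules"],  # rules are copied as-is, not translated
    }
    # Assumption: everything except the embedded rules and the legacy type carries over.
    new_token = {k: v for k, v in legacy_token.items() if k not in ("Rules", "Type")}
    new_token["Policies"] = [policy["Name"]]
    return policy, new_token


legacy = {
    "Name": "Example Token",
    "Type": "client",
    "Rules": 'service "" { policy = "read" }',
}

# "abc123" stands in for a real token ID.
policy, token = migrate_token(legacy, "abc123")
```

Running this across thousands of tokens is a loop rather than a manual rewrite, which is the whole appeal of offering an automated path.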

What’s next

Now that we understand the basics of UX for configuration schema, I’d love to share more about designing a UI based on a config-heavy application. Maybe that’ll be my next blog post.