Enterprise Application Development 0x00 - Configuration

2021-09-19

Ruffy


1. Introduction
2. Example Application
3. Enterprise Application Development 0x00
  3.1. Application Configuration
  3.2. Configuration Sanity Checks
    3.2.1. Sanity Check Checklist

Introduction

Welcome to the first article in the EAD (Enterprise Application Development) series. Applications designed and developed for enterprise environments are often complicated, hard to maintain, tricky to deploy, barely monitorable, and often suffer from a mixture of “why is this even a service” and “how the fuck can one service do so many things”. This series is meant to guide you through essential techniques and best practices that keep an application from becoming tech debt right from the beginning. Each article tackles a different aspect of what makes an application a charm for developers, operations, administrators, and users. This first article is about properly configuring services and applications.

Example Application

To have a sample application that guides us through this article series, let’s assume that you and I were asked to develop an application together. Our customer wants us to write a simple HTTP proxy for a given website (let’s say we want to proxy blog.rtrace.io). So we receive HTTP requests from our clients, forward them to the target site, wait for the site to respond, and finally send the response back to the client. In other words, we want to implement a simple proxy application for a single site.

Our task seems simple, so let’s add some requirements to spice things up a little bit:

The application listens on port 80 (HTTP)
The application listens on port 443 (HTTPS) if a certificate is configured
The proxied target website (e.g., blog.rtrace.io) shall be configurable
To prevent the application from being rate-limited by the target website (e.g., blog.rtrace.io):
- the app needs to (optionally) route its traffic through the Tor network
- the app needs to (optionally) route its traffic through any other proxy
To mitigate high latency and/or temporary upstream downtimes:
- the app needs to (optionally) cache the responses from the proxied website
- the caching duration needs to be configurable

Enterprise Application Development 0x00

Application Configuration

I have been a software developer for over 10 years now, and one thing that fascinates me to this day is the huge impact configuration has. It’s non-trivial to make an application configurable without it becoming an absolute nightmare for testers, developers, and users. Things get a lot more complicated if there is no GUI for changing the configuration and the only way to change the behavior of an application is to edit configuration files on the filesystem. Users might be confronted with JSON, YAML, or TOML for the first time, and they are not necessarily experts in the domain of our application either. This forces us as developers to build robust configuration formats and parsers that make things as simple as possible for our users. And when we talk about users, we’re also talking about the system administrators who install our application on a production web server. Don’t be fooled: system administrators sometimes struggle with these things, too.

To illustrate what I mean, I want to share a very special bug report with you. It was opened by the IT department of a company using my open source Prometheus Network Device Exporter (NDE). NDE is a very simple “service” written in Python that parses hosts files (such as /etc/hosts) and pings the listed hosts to create uptime metrics for these devices (pingable = online, not pingable = offline).

Hello unclearParadigm,
thank you for open sourcing NDE. We’re using NDE to help us lower our power consumption over night.
We identify PCs and notebooks in our offices that have not been shutdown after office-hours and track these.
Our company incentivizes employees to shutdown their work-equipment after work by paying additional 10$ monthly if they do.
We recently updated the config to use CustomNetworkDevices rather than reading the file from /etc/hosts.
This turned out to break NDE. Here’s the configuration of CustomNetworkDevices we’re trying to use.
CustomNetworkDevices = { “192.168.1.10”, “192.168.1.11”, “192.168.1.12” }
When starting the application an error message “Unexpected token in configuration.py” appears. Please advise.
Regards,

Well, I have to admit, I was kind of happy that someone actually uses NDE. It’s a really great use for such a simple Prometheus exporter. What the creator of this issue was not aware of is that the configuration file is not JSON. It’s just a Python file that I abused to configure NDE (see Source Code here). And yes, what the administrator tried seemed logical. So whose fault is it? Should I have made it more obvious? How should the admin know that this is Python? Do I, as the developer, expect users of NDE to know Python? Should I have documented it somewhere? Or should I provide an alternative configuration file that allows JSON configuration?

Enough about NDE. This section of the article explains good configuration practices for developers. But what do I consider good? Here’s an attempt to summarize how I’d want an application to be configurable if I were its user.

Good configuration …

is well documented
can be configured from multiple sources (if necessary)
can be configured in various configuration formats (depending on the user preferences)
does not break between incremental releases
gives users a good idea what exactly is improperly configured (human readable error messages)
sanity/plausibility-checks the configured parameters
has sane default configurations that can be overridden if necessary

Configuration Sources

An application often requires multiple layers and sources of configuration. Consider this example: you have an application where some settings should be configured by administrators or authorized personnel, and other (less important) settings by users. An easy way to achieve this is to separate the configuration files into locations with different file permissions. The default configuration, accessible to administrators, is located in /etc/myservice/config.json, and the user-specific configuration is located in the corresponding home directory at ~/.config/myservice/config.json (while we’re at it, please follow the XDG spec for storing per-user configuration). Another reason to separate configuration is environment variables. Think of the good ol’ HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables. On POSIX-compliant operating systems these variables have become de facto standards, so it makes sense to read them from the environment if they are set.
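As a minimal sketch of the XDG advice, the per-user path could be resolved like this: use XDG_CONFIG_HOME when it holds an absolute path, otherwise fall back to ~/.config. The class name ConfigPaths and its method names are made up for illustration, and "myservice" is just the example folder from the paragraph above.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of any framework).
public static class ConfigPaths
{
    // Pure function so the lookup logic can be tested without touching the environment.
    public static string UserConfigFile(string xdgConfigHome, string homeDirectory, string appName, string fileName)
    {
        // The XDG Base Directory spec treats relative paths in XDG_CONFIG_HOME as invalid.
        bool usable = !string.IsNullOrEmpty(xdgConfigHome) && Path.IsPathRooted(xdgConfigHome);
        string baseDir = usable ? xdgConfigHome : Path.Combine(homeDirectory, ".config");
        return Path.Combine(baseDir, appName, fileName);
    }

    // Wires in the real environment of the current user.
    public static string FromEnvironment(string appName, string fileName)
    {
        string xdg = Environment.GetEnvironmentVariable("XDG_CONFIG_HOME");
        string home = Environment.GetFolderPath(Environment.SpecialFolder.UserProfile);
        return UserConfigFile(xdg, home, appName, fileName);
    }
}
```

Calling ConfigPaths.FromEnvironment("myservice", "config.json") would then yield the file to load for the current user.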

Some settings are read from configuration files. Others may be read from the environment variables of your operating system. Another potential source is a service discovery broker like Consul. Maybe a service ships with a SQLite database in which the configuration is persisted, or maybe it connects to a remote database. There are many ways a service can get its configuration, and thinking in terms of configuration sources is a great way to structure them. So if you’re a developer and hear “multiple sources”, you might already see that you’ll need multiple configuration providers, one per source. If you’re an OOP programmer, each provider is most likely a separate class.
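To make the "one provider class per source" idea concrete, here is a rough, framework-free sketch. The names ISettingsProvider, InMemorySettingsProvider, EnvironmentSettingsProvider and SettingsMerger, as well as the MYPROXY_ prefix, are assumptions for illustration only. In this sketch, providers that come later in the list overwrite values from earlier ones.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical abstraction: each configuration source gets its own provider class.
public interface ISettingsProvider
{
    IReadOnlyDictionary<string, string> Load();
}

// Provider backed by fixed key/value pairs (e.g., compiled-in defaults).
public sealed class InMemorySettingsProvider : ISettingsProvider
{
    private readonly Dictionary<string, string> _values;

    public InMemorySettingsProvider(IDictionary<string, string> values) =>
        _values = new Dictionary<string, string>(values, StringComparer.OrdinalIgnoreCase);

    public IReadOnlyDictionary<string, string> Load() => _values;
}

// Provider that picks up environment variables starting with a prefix, e.g. MYPROXY_.
public sealed class EnvironmentSettingsProvider : ISettingsProvider
{
    private readonly string _prefix;

    public EnvironmentSettingsProvider(string prefix) => _prefix = prefix;

    public IReadOnlyDictionary<string, string> Load()
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (DictionaryEntry entry in Environment.GetEnvironmentVariables())
        {
            string key = (string)entry.Key;
            if (key.StartsWith(_prefix, StringComparison.OrdinalIgnoreCase))
                result[key.Substring(_prefix.Length)] = (string)entry.Value ?? "";
        }
        return result;
    }
}

public static class SettingsMerger
{
    // Providers are applied in order; later providers overwrite earlier ones.
    public static Dictionary<string, string> Merge(params ISettingsProvider[] providers)
    {
        var merged = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var provider in providers)
            foreach (var pair in provider.Load())
                merged[pair.Key] = pair.Value;
        return merged;
    }
}
```

A file-based or database-backed provider would implement the same interface, so the rest of the application never needs to know where a value came from.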

Create a typed model

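using System;

// Typed model: every configuration option of the proxy is declared in one place.
public class MyProxyConfiguration {
  public int HttpListenPort { get; set; }
  public string ProxiedResource { get; set; }
  // A TimeSpan already carries its unit, so the name needs no "InSeconds" suffix.
  public TimeSpan CachingDuration { get; set; }
}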

This simple configuration object assumes a “flat” structure. For larger and more complex configurations it might be useful to introduce configuration hierarchies and/or nested configuration objects to tame the size of such classes. The biggest benefit of such classes is probably that all available configuration options of an application are visible in one place. There is no need to go through every file in your codebase to see whether a specific configuration option is used or not. Unused properties of such classes can be detected by linters and IDEs long before run-time. Another nice side effect is that we have a well-known schema into which we can deserialize our configuration (e.g., from files). It is also the shared model that all our sources need to fill.
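A nested variant of the model could look roughly like the following sketch, deserialized here with System.Text.Json. All class and property names (ListenSettings, CacheSettings, NestedProxyConfiguration, NestedProxyConfigurationLoader) are hypothetical; the port defaults follow the requirements of the example application, and the 60-second cache default is an arbitrary assumption.

```csharp
using System;
using System.Text.Json;

// Hypothetical nested model; groups related options into small classes.
public class ListenSettings
{
    public int HttpPort { get; set; } = 80;
    public int HttpsPort { get; set; } = 443;
}

public class CacheSettings
{
    public bool Enabled { get; set; }
    public int DurationInSeconds { get; set; } = 60; // arbitrary default for this sketch
}

public class NestedProxyConfiguration
{
    public string ProxiedResource { get; set; } = "";
    public ListenSettings Listen { get; set; } = new ListenSettings();
    public CacheSettings Cache { get; set; } = new CacheSettings();
}

public static class NestedProxyConfigurationLoader
{
    public static NestedProxyConfiguration FromJson(string json)
    {
        // Accept "proxiedResource" as well as "ProxiedResource" in the file.
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

        // Deserialize returns null for the JSON literal null; treat that as an error.
        return JsonSerializer.Deserialize<NestedProxyConfiguration>(json, options)
               ?? throw new FormatException("Configuration document is empty.");
    }
}
```

Sections that are missing from the JSON document keep the defaults from the property initializers, so a user only has to write down what differs from the defaults.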

Combining multiple providers

Well, now we have different providers for different sources, each of them providing configuration options. Now we need to combine them so that together they fill all fields of our previously created configuration class.

In .NET you can combine such providers with the built-in ConfigurationBuilder. The order defines the priority: sources added later take precedence over sources added earlier.

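using System;
using System.IO;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Hosting;

// AddTomlFile/AddYamlFile come from third-party packages; arguments are (path, optional, reloadOnChange).
private static void ConfigureAppConfiguration(HostBuilderContext context, IConfigurationBuilder builder) {
  // .NET does not expand "~", so build the per-user directory explicitly.
  var userDir = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.UserProfile), ".config", "myproxy");
  builder
    .AddTomlFile("/etc/myproxy/systemdefaults.toml", true, false)
    .AddTomlFile("/etc/myproxy/administrator_overrides.toml", true, true)
    .AddJsonFile(Path.Combine(userDir, "usersettings.json"), true, true)
    .AddYamlFile(Path.Combine(userDir, "usersettings.yaml"), true, true)
    .AddUserSecrets<Startup>(true)
    .AddEnvironmentVariables(); // added last, so environment variables win
}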

If your application is supposed to run as a container, make sure to also allow configuration through environment variables.

Configuration hot-reloading

Hot reloading is a simple mechanism that allows the configuration to be edited while the application is running. As soon as the configuration source changes (e.g., a file is saved), the program automatically reloads the configuration and continues to run without any noticeable downtime. Many applications do not support hot reloading of their configuration. It’s not mandatory either, but it’s a nice bonus, especially for services where downtime is not tolerable or restarts take a lot of time. Some existing configuration frameworks have it as a built-in feature. Let’s take the Options pattern from .NET as an example.

First, configure the sources for your application within the ConfigurationBuilder with the “reloadOnChange” parameter set to true, like so:

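private static void ConfigureAppConfiguration(HostBuilderContext context, IConfigurationBuilder builder) {
  builder
    // optional: false makes the file mandatory; reloadOnChange: true enables hot reloading.
    // Parameter names assume the TOML extension mirrors AddJsonFile(path, optional, reloadOnChange).
    .AddTomlFile("/etc/myproxy/systemdefaults.toml", optional: false, reloadOnChange: true);
}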

Later, you use the dependency injection container of ASP.NET Core (Microsoft.Extensions.DependencyInjection) and tell it to provide our previously defined MyProxyConfiguration model, bound to the given Configuration property (the IConfiguration injected into the Startup class).

services.Configure<MyProxyConfiguration>(Configuration);

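Under the hood, the options infrastructure caches the bound MyProxyConfiguration and hands out the cached values. When one of the configuration sources registered with reloadOnChange signals a change, the cache is cleared and the values are read from the configuration source(s) again on the next access. Note that you only see reloaded values if you consume IOptionsMonitor<MyProxyConfiguration> (or IOptionsSnapshot<MyProxyConfiguration>, which is recomputed per scope); a plain IOptions<MyProxyConfiguration> is bound once and does not pick up changes.

.NET calls this the Options pattern. In my opinion that’s one of the most elegant ways to solve configuration, at least for applications written in C#.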
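The "cache until the source reports a change" idea can be sketched without any framework. To be clear, this is not the actual Microsoft.Extensions.Options implementation, just an illustration of the mechanism; ReloadableValue and its members are hypothetical names.

```csharp
using System;

// Hypothetical illustration of a self-clearing configuration cache.
public sealed class ReloadableValue<T> where T : class
{
    private readonly Func<T> _load;
    private readonly object _gate = new object();
    private T _cached;

    // load is whatever reads and binds the configuration (e.g., parses a file).
    public ReloadableValue(Func<T> load) => _load = load;

    // Returns the cached value; loads it on first access or after Invalidate().
    public T Current
    {
        get
        {
            lock (_gate)
            {
                if (_cached == null) _cached = _load();
                return _cached;
            }
        }
    }

    // Call this from the change notification of the source (e.g., a file watcher).
    public void Invalidate()
    {
        lock (_gate) { _cached = null; }
    }
}
```

Consumers keep a reference to the ReloadableValue and read Current whenever they need a setting, instead of copying values into fields at startup. Otherwise a reload would go unnoticed.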

Configuration Sanity Checks

Having multiple sources of configuration is convenient. However, it can be painful to debug when something is not configured correctly. That’s why I prefer to run configuration sanity checks at application startup and let the application exit immediately if something is missing or improperly configured. Also, make sure to print an error message (e.g., to stderr) that states exactly what is misconfigured or missing.

Sanity Check Checklist

Are DateTime and TimeSpan values entered as strings valid? (enforce ISO-8601 for timestamps, e.g., YYYY-MM-DDTHH:mm:ssZ)
Are connection strings parseable? (left empty? not set at all?)
Are listen ports in a valid range? (max. number? restricted range?)
Are all fields set, or are some fields missing?
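A few of these checks could be sketched as follows. RawProxySettings and ConfigurationSanityChecks are hypothetical names, the settings are assumed to arrive as raw strings, and the hh:mm:ss duration format is an assumption of this sketch (it only covers durations below 24 hours).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical raw settings as they might arrive from files or environment variables.
public class RawProxySettings
{
    public int HttpListenPort { get; set; }
    public string ProxiedResource { get; set; }
    public string CachingDuration { get; set; } // expected as hh:mm:ss, e.g. "00:05:00"
}

public static class ConfigurationSanityChecks
{
    // Returns one human-readable message per problem; an empty list means "looks sane".
    public static List<string> Validate(RawProxySettings s)
    {
        var errors = new List<string>();

        if (s.HttpListenPort < 1 || s.HttpListenPort > 65535)
            errors.Add($"HttpListenPort must be between 1 and 65535, but was {s.HttpListenPort}.");

        if (string.IsNullOrWhiteSpace(s.ProxiedResource))
            errors.Add("ProxiedResource is missing.");
        else if (!Uri.TryCreate(s.ProxiedResource, UriKind.Absolute, out var uri)
                 || (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps))
            errors.Add($"ProxiedResource '{s.ProxiedResource}' is not an absolute http(s) URL.");

        if (!TimeSpan.TryParseExact(s.CachingDuration, @"hh\:mm\:ss", CultureInfo.InvariantCulture, out _))
            errors.Add($"CachingDuration '{s.CachingDuration}' is not a valid duration (expected hh:mm:ss).");

        return errors;
    }

    // Strict check for timestamps of the form YYYY-MM-DDTHH:mm:ssZ.
    public static bool IsIso8601Utc(string value) =>
        DateTime.TryParseExact(value, "yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal, out _);
}
```

At startup, the application could write every returned message to Console.Error and exit with a non-zero exit code if the list is not empty.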