Configuration files are user interfaces

121 points by todsacerdoti 6 hours ago

comex 2 hours ago

Sounds interesting as a format, but the implementation is a big supply-chain attack risk if you're not already in the JVM ecosystem.

This is because the only implementation is written in Kotlin. There are Python and Rust packages, but they both just link against the Kotlin version.

How do you build the Kotlin version? Well, let's look at the Rust package's build.rs:

https://github.com/kson-org/kson/blob/main/lib-rust/kson-sys...

It defaults to simply downloading a precompiled library from GitHub, without any hash verification.

You can instead pass an environment variable to build libkson from source. However, this will run the ./gradlew script in the repo root, which… downloads a giant OpenJDK binary from GitHub and executes it. Later in the build process it does the same for pixi and GraalVM.

The build scripts also only support a small list of platforms (Windows/Linux/macOS on x86_64/arm64), and don't seem to handle cross-compilation.

The compiled library is 2MB for me, which is actually a lot less than I was expecting, so props for that. But that's fairly heavy by Rust standards.
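
On the missing hash verification: a minimal sketch of what pinning a digest for a downloaded artifact could look like. This is not the project's build script (that one is a Rust build.rs); the function names and the idea of a pinned `expected_sha256` string are assumptions for illustration only.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> None:
    """Raise ValueError if the file does not match the pinned digest."""
    actual = sha256_of(path)
    # expected_sha256 is a hypothetical value checked into the repo
    # next to the download URL.
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
```

The check only helps if the pinned digest lives somewhere the attacker cannot also change, for example in the published package rather than next to the binary on the release page.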

wofo an hour ago

Glad you liked the format. I hope we can close the implementation gaps as development advances, and I'd love to see native libraries sprout for all conceivable programming languages!
Edit: point taken about verifying checksums, just created an issue for it (https://github.com/kson-org/kson/issues/222)

theamk 5 hours ago

KSON proudly claims "no whitespace sensitivity", which means "misleading indentation" is back. And it's pretty light on syntax, so there are going to be plenty of mistakes made here.

Here is an example I made in a few minutes:

    ports:
       - 80
       - 8000 - 10000
       - 12000 -
       - 14000

Guess how it parses? answer:

    {"ports":[80,8000,10000,12000,[14000]]}

andrewla 4 hours ago

I actually prefer a syntax that is whitespace-sensitive, but should not give meaning to whitespace. That is, the whitespace should not perform a semantic duty, but should be required to be correct.
This is roughly equivalent to saying that a linter can transform the AST of the language into a canonical representation, and the syntax will be rejected unless it matches the canonical representation (modulo things like comments or whitespace-for-clarity).
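
A rough sketch of that "reject anything that is not already canonical" rule, using plain JSON as a stand-in because Python's standard library has no KSON parser. The two-space layout chosen here is an arbitrary assumption, and the "modulo comments" part is not handled at all.

```python
import json


def canonical(text: str) -> str:
    """Re-render a JSON document in one fixed layout."""
    return json.dumps(json.loads(text), indent=2, ensure_ascii=False) + "\n"


def is_canonical(text: str) -> bool:
    """True only if the source is exactly its own canonical rendering."""
    return text == canonical(text)
```

With this, whitespace carries no meaning for the parser, but a file whose layout disagrees with its structure is rejected instead of silently accepted.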
- wofo 3 hours ago
  
  This sounds like a stricter version of KSON's current warnings for misleading indentation... maybe KSON should have an opt-in feature for this. Thanks for the idea!
  
  rfl890 an hour ago
  
  What's KSON?
  
  satiric a few seconds ago
  
  The configuration language that is the subject of the article
- giveita 2 hours ago
  
  So, opposite of python?
wofo 4 hours ago
Hmmm that's interesting. KSON actually shows a warning when there's misleading indentation, exactly to prevent this sort of thing! It seems like the detection logic only considers indents after a new line, so it doesn't complain in this case. I just opened an issue to see if things can be tightened up a bit (https://github.com/kson-org/kson/issues/221).
To see misleading indentation warnings in action, you can try the following snippet in the playground (https://kson.org/playground/); it should trigger a warning:
```
    ports:
       - 80
       - 8000
         - 10000
       - 12000
       - 14000
```
Next to that, note that KSON already has an autoformatter, which also helps prevent misleading indentation in files.
SebastianKra 4 hours ago

Assuming this takes off, there would be a prettier-plugin that corrects any weird formatting.
When I think about it, any language should come with a strict, non-configurable built-in formatter anyways.
- stronglikedan 4 hours ago
  
  > any language should come with a strict, non-configurable built-in formatter
  Would that be on the language, or the IDEs that support it? Seems out of scope to the language itself, but maybe I'm misunderstanding.
kookamamie 2 hours ago

Yes, the syntax makes no intuitive sense, whatsoever.

taeric an hour ago

Meanwhile, I peek at my emacs config and continue to wonder why people don't just embrace a programming language.

Yes, there are bad consequences that can happen. No, you don't dodge having problems by picking a different data format. You just pick different problems. And take away tools from the users to deal with them.

skobes 22 minutes ago

Why is TOML allegedly "too minimal" for a "small configuration file"?

kccqzy 4 hours ago

Configuration files need to be powerful programming languages (in terms of expressiveness) while being restricted (in terms of network and I/O and non-determinism). We need to aim very high for configuration languages especially when we treat them like user interfaces. Look at Cue (https://cuelang.org/), Starlark or Dhall (https://dhall-lang.org/) for inspiration, not JSON, unless your configuration file is almost always written programmatically.

candiddevmike 4 hours ago

Or Jsonnet (https://jsonnet.org), if you do like JSON but want less quoting.
madeofpalk 3 hours ago

Any configuration language that doesn't support strict/user/explicit types is worthless (ahem jsonnet).
The idea of configuring something but not actually having any sort of assurances that what you're configuring is correct is maddening. Building software with nothing but hopes and dreams.
arccy 3 hours ago

expressiveness unfortunately usually means that while you can read the output value, you lose the ability to modify it programmatically...

maxbond 10 minutes ago

Is the K in KSON for Kotlin? Does the S stand for anything? I skimmed the docs but didn't find anything addressing the name.

Xss3 4 hours ago

JSON5 is good enough that it works for frontend devs, backend, qa, firmware, data science, chemists, optical engineers, and the hardware team, in my org at least. Interns pick up on it quickly.

The comment option gives enough space for devs to explain new options, flags, and objects to those familiar enough to be using it.

For customer facing configurations we build a UI.

stronglikedan 4 hours ago

In the kitchen sink example (https://json5.org/), they say:
> "backwardsCompatible": "with JSON",
But in that same example, they have a comment like this:
> // comments
Wouldn't that make it not compatible with JSON?
- crazygringo 3 hours ago
  
  It's confusing.
  From what I understand, it's "backwards-compatible" with JSON because valid JSON is also valid JSON5.
  But it's not "forwards-compatible" precisely because of comments etc.
- kiitos 35 minutes ago
  
  Backwards-compatible means the new thing can handle the old things. Here JSON5 is backwards-compatible with JSON.
  Forwards-compatible means the old thing can handle the new things. Here JSON is not forwards-compatible with JSON5.
- rapfaria 4 hours ago
  
  Your existing JSON < 5 will work with json5, not the other way around
- arvindh-manian 4 hours ago
  
  It’s a superset of JSON. I guess they mean it’s backwards compatible in terms of reading existing JSONs?
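
The "old parser meets new format" half of this can be checked with Python's standard `json` module, which is a strict JSON parser. This is only a sketch: the sample strings are modeled on the json5.org example quoted above, and the other direction (a JSON5 parser accepting plain JSON) would need a third-party library, so it is not shown.

```python
import json


def accepts_strict_json(text: str) -> bool:
    """True if Python's standard (strict) JSON parser accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


plain_json = '{"backwardsCompatible": "with JSON"}'
json5_text = '{\n  // comments\n  "backwardsCompatible": "with JSON"\n}'

print(accepts_strict_json(plain_json))  # old parser, old format
print(accepts_strict_json(json5_text))  # old parser, new format
```

This prints `True` and then `False`: the comment is enough to make the document invalid for a JSON-only parser, which is the forwards-compatibility gap described above.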

your_fin 39 minutes ago

I came across https://kdl.dev recently, and it has become my favored yaml-replacement when normal code doesn't work.

The data model is closer to XML than JSON, though, so unfortunately it's not a drop-in replacement for yaml.

Small sample:

  package {
    name my-pkg
    version "1.2.3"
    dependencies {
      // Nodes can have standalone values as well as
      // key/value pairs.
      lodash "^3.2.1" optional=#true alias=underscore
    }
  }

olejorgenb 34 minutes ago

I wish they had some sort of include/composable file mechanism though

ruuda 5 hours ago

From the application point of view, recently I'm converging on this: define data structures for your config. Ensure it can be deserialized from json and toml. (In Rust this is easy to do with Serde; in Python with Pydantic or dataclasses.) Users can start simple and write toml by hand. If you prefer KSON, sure, write KSON and render to json. If config is UI, I think the structure of the data, and names of fields and values, matter much more than the syntax. (E.g. `timeout = 300` is meaningless regardless of syntax; `timeout_ms = 300` or `timeout = "300 ms"` are self-documenting.)

When the configuration grows complex, and you feel the need to abstract and generate things, switch to a configuration language like Cue or RCL, and render to json. The application doesn't need to force a format onto the user!

jdwyah 3 hours ago

the duration one in particular bugs me. I work on a dynamic configuration system and i was super happy when we added proper duration support. we took the approach of storing in iso duration format as a string. so myconfig = `5s` then you get a duration object and can call myconfig.in_millis. so much better imo.
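
Roughly what that could look like, as a sketch: it accepts the short suffix style from the `5s` example rather than full ISO 8601 durations, and `parse_duration` / `in_millis` are hypothetical helper names, not the API of the system described above.

```python
import re
from datetime import timedelta

# Supported suffixes are an assumption for this sketch.
_UNITS = {"ms": "milliseconds", "s": "seconds", "m": "minutes", "h": "hours"}
_PATTERN = re.compile(r"(\d+)\s*(ms|s|m|h)")


def parse_duration(text: str) -> timedelta:
    """Turn strings like '5s' or '300 ms' into a timedelta; reject bare numbers."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def in_millis(duration: timedelta) -> int:
    """Whole milliseconds in the duration."""
    return duration // timedelta(milliseconds=1)
```

A bare `300` is rejected on purpose, which forces the unit into the config file instead of leaving it to the reader's guess.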
wofo 3 hours ago

I like this take! Have you used Cue or RCL yourself? How was the experience?
- ruuda 29 minutes ago
  
  I’ve toyed with Cue, but never in a production setting. I like the ideas behind it, it’s very elegant that the same mechanism enables constraining values and reducing boilerplate. It’s somewhat limited compared to Jsonnet, RCL, Dhall, etc., you don’t get user-defined functions, but the flip side of that is that when you see something being defined, you can be confident that it ends up in the output like that, that it’s not just an input to a series of intractable transformations. I haven’t used it in large enough settings to get a feeling for how much that matters. Also, I find the syntax a bit ugly.
  We did a prototype at work to try different configuration languages for our main IaC repository, and Cue was the one I got furthest with, but we ended up just using Python to configure things. Python is not that bad for this: the syntax is light, you get types, IDE/language server support, a full language. One downside is that it’s difficult to inspect a single piece of configuration, you run the entry point and it generates everything.
  As for RCL, I use it almost daily as a jq replacement with easier to remember syntax. I also use it in some repositories to generate GitHub Actions workflows, and to keep the version numbers in sync across Cargo.toml files in a repository. I’m very pleased with it, but of course I am biased :-)
- diarrhea an hour ago
  
  Not the OP but I’m a big fan of this pattern as well.
  At work we generate both k8s manifests as well as application config in YAML from a Cue source. Cue allows both deduplication, being as DRY as one can hope to be, as well as validation (like validating a value is a URL, or greater than 1, whatever).
  The best part is that we have unit tests that deserialize the application config, so entire classes of problems just disappear. The generated files are committed in VCS, and spell out the entire state verbatim - no hopeless Helm junk full of mystery interpolation whose values are unknown until it’s too late. No. The entire thing becomes part of the PR workflow. A hook in CI validates that the generated files correspond to the Cue source (run make target, check if git repo has changes afterwards).
  The source of truth are native structs in Go. These native Go types can be imported into Cue and used there. That means config is always up to date with the source of truth. It also means refactoring becomes relatively easy. You rename the thing on the Go side and adjust the Cue side. Very hard to mess up and most of it is automated via tooling.
  The application takes almost its entire config from the file, and not from CLI arguments or env vars (shudder…). That means most things are covered by this scheme.
  One downside is that the Cue tooling is rough around the edges and error messages can be useless. Other than that, I fully intend to never build applications differently anymore.

JohnMakin 2 hours ago

Having now worked with terraform for 8 years, I could not agree more. Now, also because of having worked with terraform for 8 years and seeing how that's played out, I've heard and become tired of the whole "superset of json, transcribable to YAML, whitespace is not significant (which has never been a gripe of mine ever, not sure why every product cares so much about that)" promise of a silver bullet, and you very much face the same exact problems, just in different form. Terraform (HCL, to be specific) in particular can become fantastically ugly and verbose and "difficult to modify."

Configuration is difficult, the tooling is rarely the problem (at least in my experience).

mholt 5 hours ago

This is why Caddy has config adapters: bring any config file language you like, and Caddy will run it. It's built-into the binary and just takes a command line flag to switch languages: https://caddyserver.com/docs/config-adapters

kiitos 7 minutes ago

those formats aren't bijective with each other, right? so there's no way for you to say that foo.cue can be equivalently transformed to any foo.json or any foo.nginx or whatever representation, because those transformations are necessarily lossy, no?
kevmo314 4 hours ago

This makes it difficult to configure Caddy in anything except the native Caddyfile language due to a lack of thorough documentation. It's an interesting idea, but configuring Caddy with a yaml config that someone prior deemed a great idea was quite painful.
Curiously, LLMs have made it a lot easier. One step away from an English adapter that routes through an LLM to generate the config.

epolanski an hour ago

Imho the best configuration file is code written in the language itself.

Configuring TypeScript applications with the `defineConfig` pattern that takes asynchronous callbacks allowing you to code over settings is very useful. And it's fully typed and programmable.

It's particularly useful because it also allows you to trivially use one single configuration file where you only set what's different between environments with some env === "local" then use this db dependency or turn on/off sentry, etc.

Zig is another language that shows that configuration and building should just be code in the language itself.
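
The same idea sketched in Python rather than TypeScript, so this is not the `defineConfig` API itself: one typed base config, plus only the differences per environment. `AppConfig`, the field names, and the URLs are all invented for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AppConfig:
    db_url: str
    sentry_enabled: bool
    log_level: str = "info"


# Shared defaults; values are placeholders.
BASE = AppConfig(db_url="postgres://db.internal/app", sentry_enabled=True)


def define_config(env: str) -> AppConfig:
    """Return the base config with only the per-environment differences applied."""
    if env == "local":
        return replace(
            BASE,
            db_url="sqlite:///local.db",
            sentry_enabled=False,
            log_level="debug",
        )
    return BASE
```

Because the config is an ordinary typed value, a misspelled field name fails immediately in `replace` (and in a type checker) instead of being silently ignored.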

eternityforest an hour ago

Programmatically manipulating it and validating it gets harder though.

VectorLock 4 hours ago

I like where their head is at here, especially the "superset of JSON" part. Some of the things I'm not _in love_ with like the %% ending blocks or how maybe a bit of significant whitespace might make things a bit less misleading with indentation as others have said, but overall I think I like this better than YAML.

i_s 3 hours ago

KSON looks interesting. Where I work we did a metadata type project in Pkl recently, which is somewhat similar. Unfortunately, developments on the tooling front for Pkl have taken an extremely long time. Not sure the tooling/LSPs are anywhere close to what the language offers yet.

I like the language embedding feature in KSON - we would use that. Have you thought about having functions and variables? That is something you get in Pkl and Dhall which are useful.

wofo 2 hours ago

Thanks for the kind words :)
This sounds like the kind of question for Daniel himself to chime in, since he has the best overview of the language's design and vision. He's not super active on HN, but I'll give him a heads up! Otherwise feel free to join our Zulip (https://kson-org.zulipchat.com) and we can chat over there.

matsemann an hour ago

Question: if whitespace isn't significant, how does it determine ambiguity? Like, if I have a Person that has a Dog, and both have a Name attribute. If I then add a Name after my dog definition, how does it know if it's the name of the dog or the person?

skydhash 5 hours ago

There are two kinds of configuration that I like:

- The key-value pair, maybe with some section markers (INI, ...). Easy to sed.

- The command kind, where the file contains the same commands that can be used in other places (vim, mg, sway). More suited to TUIs and bigger applications.

With these two, include statements are nice.

gricardo99 4 hours ago

With all the code syntax highlighting support as a feature, I feel it will become tempting to put code in configuration files (which some of their examples show). That just feels wrong. Code should go in code files/modules/libraries, not mixed with configuration files. If your configuration starts to become code, maybe you need to rethink your software architecture. Or perhaps KSON proves that principle to be too rigid and inferior, and leads to more intelligible, manageable software. I guess we'll have to see.

giveita 2 hours ago

I think it is telling no programming language has settled on YAML or JSON as a syntax. Because that would drive you nuts.

But we allow it for files that tend to make production changes usually without any unit tests!

I'd prefer something syntaxed like a programming language but without turing completeness.

knome 2 hours ago

Azure and GitHub build pipelines are written in yaml and have conditionals, variables, template expansions, etc
- ruuda an hour ago
  
  GitHub Actions also have function calls, it’s just that they can only occur in very specific places in the program, and to define a function you have to create a Git repository.
  And don’t forget Ansible playbooks!

rmah 2 hours ago

Has anyone used both kson and hjson? We use hjson as a more forgiving json-like config format and really like it. kson seems more feature rich, but I'm a bit concerned about things like embedding code in configs. So any thoughts vs hjson?

bedatadriven 2 hours ago

Came across hocon recently and prefer it over both yaml and kson. https://learnxinyminutes.com/hocon/

stevekrouse an hour ago

The people behind KSON are geniuses. No joke. I worked with Daniel. And he's an amazing human too. Congrats on launching!!!

bee_rider 5 hours ago

KSON looks neat.

I think the post is hurt by the desire to sort of… “have a theory” or take a new stance. The configuration file is obviously not a user interface, it is data. It is data that is typically edited with a text editor. The text editor is the UI. The post doesn’t really justify the idea of calling the configuration file, rather than the program used to edit it, the UI. Instead it focuses on a better standard for the data.

The advancement of standards that make the data easier to handle inside the text editor is great, though! Maybe the slightly confusing (I dare say confused) blog title will help spread the idea of kson, regardless.

Edit: another idea, one that is so obvious that nobody would write a blog post about it, is that configuring your program is part of the UX.

brap 4 hours ago

While I agree with the general idea, 2 minutes of looking at KSON was enough for me. If this is UI, it’s an ugly one.

I’ll stick to JSON. When JSON isn’t enough it usually means the schema needs an update.

kiitos 41 minutes ago

two huge thumbs down for KSON

dustingetz 4 hours ago

yes but to validate it you need dynamic runtime logic and therefore a live server with all the I/O glue code that entails. i.e., static types alone cannot render your tax forms

atoav 2 hours ago

My personal opinion is: If your config needs to be so complex you can't make do with TOML you should just use an interpreted programming language instead. It is totally acceptable to use a config.py for example. You get lists, dicts, classes, and all more or less known and well behaved and people can automate away the boring stuff.

But I'd strongly encourage everybody to think about whether that deep configurability is really needed.
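
For reference, a small sketch of loading such a config.py with the standard library's `runpy`. The loader name and the convention of skipping underscore-prefixed names are assumptions. Note that this executes the file, so it only makes sense for configs you trust as much as the application code.

```python
import runpy
from pathlib import Path


def load_python_config(path: Path) -> dict:
    """Execute a config file and return its public top-level names."""
    namespace = runpy.run_path(str(path))
    # Drop dunder entries such as __name__ and anything marked private.
    return {k: v for k, v in namespace.items() if not k.startswith("_")}
```

A hypothetical config.py containing `workers = [f"worker-{i}" for i in range(3)]` would come back as a dict with a `workers` key holding the expanded list, which is the "automate away the boring stuff" part.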

Spivak 6 hours ago

Or you just use YAML. It's a configuration language for your software, you control which parser you use which can be YAML 1.2, you can use the safe loader which can't run untrusted code, and you're parsing the values into your language's types so any type confusion will be instantly caught.

I agree that it's not perfect but worse is better and familiar is a massive win over making your users look up a new file format or set their editor up for it. If you truly hate YAML that's fine, there's plenty of other familiar formats: INI, toml, JSON.

chromalchemy 4 hours ago

YamlScript (YS) adds more programmatic expressiveness to YAML. It extends an executable Clojure runtime (non-jvm, like Babashka). Created by the YAML inventor/maintainer.
https://yamlscript.org/
NeutralForest 4 hours ago

I'm just so tired of writing bash scripts inside the YAML. I want to be able to lint and test whatever goes into my pipelines and actions. Not fight shitty DX on top of whatever Azure spits out when it fails.
- kiitos 39 minutes ago
  
  if you're writing bash inside of yaml then something has gone wrong well before yaml entered the picture, this is a problem with e.g. azure not with the yaml format
- ksenzee 4 hours ago
  
  This is entirely understandable, and entirely the fault of whoever thought bash scripts belong in configuration files. If you’re trying to stuff a tiger into a desk drawer, the natural consequences are hardly the fault of the desk drawer.
  
  NeutralForest 4 hours ago
  
  But it's the situation we're in now. It's what you see in the docs and what my colleagues write as well. I entirely agree that scripting into a markup language doesn't make sense yet the inertia is there and I wish there was some way out.
- hk1337 4 hours ago
  
  > I'm just so tired of writing bash scripts inside the YAML.
  Why would you be doing that?
  
  NeutralForest 4 hours ago
  
  Can't tell if asked honestly. Because that's how most platforms handle their pipelines. Terraform or Bicep let you use a declarative language for your platform. Everything else is calling cli commands or scripts from pipelines, written in YAML.
  
  hk1337 3 hours ago
  
  I suppose I am confused by what you mean with, "writing bash scripts".
  Like, are you _literally_ scripting in the value of an item or are you just equating that they are similar?
  Literal being:
      get_number_of_lines:
        command: >
          #!/bin/bash
          wc -l
  
  NeutralForest 3 hours ago
  
  Something like this: https://learn.microsoft.com/en-us/azure/devops/pipelines/tas...
  Invariably, ppl will write inline scripts instead of actual scripts in their own files. There are also some SDKs for most of these operations that would let you do it in code but they are not always 100% the same as the CLI, some options are different, etc.
  
  Spivak 2 hours ago
  
  Yes, this is how Gitlab pipelines work. It's actually easier to just inline the script most of the time than have a bucket of misc scripts lying around. Especially since you have hooks like before/after_script which would be really awkward to externalize.

righthand 3 hours ago

Yes all user interfaces are a key/value list.

That’s why 90% of each iOS update is just another menu or a reorganization of menus and why there are 3 different ways to access the same info/menus on iOS.

hk1337 4 hours ago

Anecdote: I am still of the opinion that in most (~99%)[1] of the situations people are in "YAML Hell" because they put themselves in "YAML Hell".

1: I pulled that out of my butt, there's no factual data to it.