CUE data validation and schema definition language

Has anyone here encountered or used CUE before?

@gbathree and @OctavioDuarte, I think you may find this very interesting as an alternative means of defining conventions in text format. It requires way less boilerplate and has a bunch of relevant features and built-in tooling that JSON Schema lacks, such as constraints and validation.

The Tour section of the docs is a great starting point, and I’m still digging into more of the details, but I think the examples and explanation from Types are Values, quoted in full below, get to the core benefits w/r/t farmOS conventions.

CUE merges the concepts of values and types. In CUE, types are values.

A field can be specified with:

  • a concrete value such as "foo", 42, or true - something that could be represented in JSON,
  • a type such as int or string,
  • or something in between the two such as >=500, or !="foo" - not concrete, but more specific than a basic type.

The following examples show a CUE schema; a typical CUE constraint that refines the schema; and some concrete values that satisfy both the constraint and, therefore, the schema.

// CUE Schema
municipality: {
	name:    string
	pop:     int
	capital: bool
}

// CUE Constraint
largeCapital: {
	name:    string
	pop:     >5M
	capital: true
}

// Concrete values
kinshasa: {
	name:    "Kinshasa"
	pop:     16.32M
	capital: true
}

With CUE, we generally start with a broad definition of a schema describing all possible instances and then progressively narrow down these definitions for a particular use case until a concrete data instance remains.
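As a rough sketch of that narrowing, the three snippets above can be chained together with the unification operator `&`. The definition names `#Municipality` and `#LargeCapital` are my own and not taken from the docs, so treat this as an illustration rather than the Tour's exact code:

```cue
// Broad schema: describes any municipality.
#Municipality: {
	name:    string
	pop:     int
	capital: bool
}

// Narrower: the schema unified with two extra constraints.
#LargeCapital: #Municipality & {
	pop:     >5M
	capital: true
}

// Concrete data: must satisfy every layer above it.
kinshasa: #LargeCapital & {
	name:    "Kinshasa"
	pop:     16.32M
	capital: true
}
```

If `pop` were changed to something like `2M`, the value would conflict with the `>5M` bound and evaluation should fail, which is the whole point of layering the definitions this way.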

Some other nifty features:

  • Complex expressions that include mathematical and boolean operators, conditionals, regex, queries and projections
  • Integrations with JSON, YAML, OpenAPI and Protocol Buffers
  • A handy CLI tool for running validations and other tasks
  • A module & packaging system for breaking up definitions between multiple files
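To give a flavor of the expression features in that list, here is a small sketch of what a convention might look like. The `#PlantingLog` definition and all of its field names are hypothetical, made up for illustration, and are not an existing farmOS convention:

```cue
// Hypothetical convention for a planting log; every name here is illustrative.
#PlantingLog: {
	// Regex constraint: an ISO-style date such as "2023-04-15".
	date: string & =~"^[0-9]{4}-[0-9]{2}-[0-9]{2}$"

	// Disjunction with a default value, marked by the asterisk.
	status: *"pending" | "done"

	// Numeric bounds combined with &.
	rows:         int & >0
	plantsPerRow: int & >0 & <=500

	// Arithmetic over other fields in the same struct.
	totalPlants: rows * plantsPerRow

	// Conditional field: only present once the log is marked done.
	if status == "done" {
		completedBy: string
	}
}

// A concrete instance; status falls back to its default.
example: #PlantingLog & {
	date:         "2023-04-15"
	rows:         4
	plantsPerRow: 25
}
```

Evaluating a file like this with the CLI (for example `cue eval`) should fill in `status` with its default and compute `totalPlants` from the two concrete fields, while a malformed date or a negative row count should be reported as an error.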

This is my new obsession, as perhaps you can tell. Am I just the last person to learn about this? Or am I simply overstating its potential?

It certainly looks interesting! I’m a NixOS user, so I am already on the “turn everything into a programming language” bandwagon.

I like projects that have a solid conceptual foundation, like this one. It also seems to have an adequate scope for the needs @gbathree and I have identified so far.

I’m not sure it has enough genuinely extra features to be a drastic improvement over JSON Schema for our use case. JSON Schema lacks one or two key features, which is causing issues for Greg and me, but it is surprisingly featureful, and I would even say it has most of the features CUE highlights. Since a JSON Schema is not merely like data but is actual data, you end up with similar behaviors when you want to achieve complex things. That said, the syntax for CUE is way more terse and meaningful.

I would like to feature freeze what @gbathree and I are coding soon (as we are reaching a state in which we provide most of the features we wanted) and take one or two days to compare possible implementations in other systems.
It is true that the JSON Schema guys do not provide an implementation, but there are several, some of which I’d even say are better than the specification (sadly, that makes their features non-standard). One feature that is a real advantage to us, and that I didn’t see in a first pass on CUE, is that you can compile stand-alone JS validation code, which can be distributed with zero dependencies for validation.

@jgaehring Have you looked into LinkML? It is another project that seems to offer better schema features, but it is very different from CUE. It is more batteries-included (like our own JSON Convention Builder, it will build and publish documentation for you), but since it relies on YAML and is not a programming language, it is less radical in the possibilities it offers.

That makes sense. I probably overstated the gains and it will inevitably depend on specific use cases. Like you, I appreciate the solid conceptual foundation. The influence of logic programming is palpable.

I have to say, the proliferation of non-standard implementations and extensions in JSON Schema is my number one issue with it, and another reason why CUE’s fundamental design seems so much more promising (plus their docs are so much nicer to navigate).

I think one option with CUE, though I have no idea how easily it would work, is the OpenAPI integration, which, if I’m not mistaken, still essentially uses JSON Schema as its core Schema Object. The implementation is not available from the CLI tool, unfortunately, and is a separate library written in Go. Still, it seems possible to create a pipeline that could support writing conventions in any number of formats, including JSON Schema, CUE, YAML, and Protocol Buffers. Maybe not super useful, but it could be a way to easily explore other formats w/o fear of incompatibility with other conventions.

There also seems to be some code generation written for Go, so it may not be immediately useful for JS projects, but it seems promising.

@paul121 shared LinkML with me very recently, but I had not looked too closely. The prospect of modelling with simple YAML that can then generate JSON Schema and JSON-LD is very tempting indeed!

I should clone your convention builder one of these days and play around with it locally to see what you’ve been up to. I’ve peeked at the GitLab occasionally but not extensively. Lots of exciting possibilities on the horizon!

@jgaehring Actually (please imagine the bothersome nerd meme), we are meeting with Mike, Greg, Paul and people from PASA to give a tour of our farmOS-leveraging tools.
If you want to join, that would be great. I’ll add you to the invitation.

Oh excellent, I sure will. See you then, and thank you!