The following use case builds on a conversation from yesterday with @paul121 and @OctavioDuarte, which was partly prompted by the CUE data validation and schema definition language and partly by a discussion around conventions in general and Our Sci’s mirror_farmos. We discussed convening some workshops for anyone else with a use case like the one below, preferably with their requirements defined ahead of time and a couple of concrete examples we could test out. We can work on documenting them and thinking through possible solutions, maybe try implementing some things. @OctavioDuarte also suggested we start with some JSON Schema and experiment with converting it to other notations, like CUE or SHACL, using some of the automation tools that are already available. I think that’s a great idea. We could also try modeling some entities by hand, either in accordance with the farmOS Data Model or as independent models that would maintain compatibility with it.
Testing Location Logic by Implementing the farmOS Data Model in Node.js
In farmOS.js, I replicated a fair amount of the farmOS Data Model, or at least the parts that can be reconstructed from JSON Schema alone. With farmOS.js as its primary syncing engine, Field Kit is able to sync its own records with any farmOS server, regardless of whatever additional modules the server may have installed and however that may impact the way entities are structured. Because farmOS.js can also be used in Node.js, not just the browser as with Field Kit, it could theoretically be used with its own database connection and an Express web server as a naive implementation of farmOS. Basically you could run a “headless” clone of farmOS, which is similar to what @OctavioDuarte has done with mirror_farmos.
However, there would still be some unresolved issues with data integrity if someone was relying exclusively on such an implementation, without a regular farmOS server somewhere in the loop to reconcile certain entity fields from time to time. The best example I know of to illustrate the kind of reconciliation that’s required but not represented by the JSON Schema is the location logic. Without that logic to update an asset’s present location and geometry in a consistent manner, you could end up with the same asset in two non-overlapping locations or other issues. The inventory logic and group membership logic would pose similar issues, as would, probably, other undocumented procedures for maintaining data integrity.
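To make that concrete, here is a rough sketch of the kind of derivation I mean. It assumes the rule is roughly "the most recent completed movement log wins"; the flattened record shape and the `currentLocation` function are hypothetical simplifications for illustration, not the actual farmOS.js API or the server's exact algorithm.

```javascript
// Hypothetical, flattened log records; real farmOS logs nest these
// under attributes and relationships.
const logs = [
  { id: 'log-1', status: 'done', is_movement: true, timestamp: 100, asset: ['cow-7'], location: ['field-a'] },
  { id: 'log-2', status: 'done', is_movement: true, timestamp: 200, asset: ['cow-7'], location: ['field-b'] },
  { id: 'log-3', status: 'pending', is_movement: true, timestamp: 300, asset: ['cow-7'], location: ['barn'] },
];

// Assumed rule: the latest completed movement log at or before `now` wins.
function currentLocation(logs, assetId, now) {
  const candidates = logs.filter(log =>
    log.is_movement &&
    log.status === 'done' &&
    log.timestamp <= now &&
    log.asset.includes(assetId));
  if (candidates.length === 0) return [];
  // Keep whichever candidate has the greatest timestamp.
  const latest = candidates.reduce((a, b) => (b.timestamp >= a.timestamp ? b : a));
  return latest.location;
}
```

None of this rule lives in the JSON Schema. A replica that only holds `log-1`, or that writes the asset's location directly instead of deriving it, can easily disagree with one that applies the rule to the full set of logs.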
So my main ask, which I think is representative of this general problem, is this: How can farmOS location logic be represented in serialized form, so that it could be replicated on another device, shared over the wire, and versioned in the event of future changes to the location logic’s main algorithm?
I see two general approaches for addressing this:
- Serialize the location logic itself, in the form of some kind of RPC or query syntax or even raw code.
- Make the concept of locations and/or movements into an independent data structure (call that an entity, an object, or what you will), so they can be referenced separately from both the asset they refer to and the log that records when and how the action took place.
I probably favor the second option at this point, but obviously that would be a breaking change, so it can’t really be expected that farmOS itself would adopt that model outside of a major version upgrade. That is fine for my purposes. I’m confident a Node.js server and database built this way would still be able to connect with a regular farmOS server to exchange data, since farmOS.js already does this pretty well, even with far less robust conflict resolution.
There are still other aspects of the logic that this may not address, but for the most part, farmOS comes pretty close to an event sourcing architecture, so I’m mainly looking for places where that could be fully realized. By separating certain fields like geometry and is_movement from standard logs and into more atomic events that could reference the asset and then, in turn, be referenced by a log, I think it could greatly simplify the logic required to materialize the state of a given asset’s location from those records, with easily reproducible logic that could be applied universally. Perhaps in some instances, approach #1 could be employed, but with simpler, more generic logic that was less stateful, and perhaps in combination with #2, something akin to AT Protocol’s Lexicon.
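As a minimal sketch of what I have in mind for approach #2, assume movements are stored as their own records, each pointing at an asset, and that a log would reference a movement by its id. The record shape, the `materializeLocations` name, and the tie-breaking rule (timestamp, then id) are all my own assumptions for illustration, not anything farmOS defines today.

```javascript
// Hypothetical movement events, stored separately from assets and logs.
const movements = [
  { id: 'mv-1', asset: 'cow-7', location: 'field-a', geometry: 'POINT(1 1)', timestamp: 100 },
  { id: 'mv-2', asset: 'cow-7', location: 'field-b', geometry: 'POINT(2 2)', timestamp: 200 },
  { id: 'mv-3', asset: 'tractor-2', location: 'barn', geometry: 'POINT(3 3)', timestamp: 150 },
];

// Pure fold over the events: the same set of events yields the same
// state, no matter what order they arrived in.
function materializeLocations(events) {
  // Sort a copy by timestamp, breaking ties by id, so replay is deterministic.
  const sorted = [...events].sort((a, b) =>
    a.timestamp - b.timestamp || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
  const state = new Map();
  for (const ev of sorted) {
    // Later events overwrite earlier ones for the same asset.
    state.set(ev.asset, { location: ev.location, geometry: ev.geometry, movement: ev.id });
  }
  return state;
}
```

Because the function depends only on the events themselves, two devices holding the same movements should arrive at the same locations, and the function could be versioned independently of the records it replays.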
Or maybe none of this is really necessary. Maybe it’s enough if the relevant fields can be adequately specified, isolated and protected against overriding configurations. And if it can be guaranteed that the core logic won’t be subject to arbitrary changes, like with the addition of a farmOS/Drupal module, I’d be pretty happy about that. I’m looking now at ActivityPub’s spec for server-to-server interactions and it occurs to me the current farmOS logic specifications aren’t all that different. I’m not committed to any one approach, but I think the best way to find out is to try implementing this, with special attention given to the location logic to see how it handles potential conflicts.
Documented decisions regarding locations
Some relevant links on the decision to make locations a type of asset entity in farmOS 2.x, which were previously represented as taxonomy terms in 1.x: