Example Timeline for Syncing Archived Assets

Timeline

This is a follow-up from this week’s dev call, where we discussed how syncing and merging work in farmOS.js and Field Kit. @Symbioquine proposed a hypothetical situation where there might be issues with the syncing logic.

7am

We assume 2 users have a copy of an animal asset, ASSET_1, stored on their phones in Field Kit, as well as on the farmOS server. They’re all in the same state, although there’s a little additional metadata on the Field Kit devices.

  • farmOS Server
    • Initial state of ASSET_1
      • {
          changed: 6am,
          status: active,
          archived: null,
        }
        
  • Field Kit A
    • Initial state of ASSET_1
      • {
          changed: 6am,
          status: { data: active, changed: 6am },
          archived: { data: null, changed: 6am },
        }
        
  • Field Kit B
    • Initial state of ASSET_1
      • {
          changed: 6am,
          status: { data: active, changed: 6am },
          archived: { data: null, changed: 6am },
        }
        

8am

User A takes the asset for slaughter, marks the change, but doesn’t sync to the server.

  • Field Kit A
    • User sets ASSET_1
      • {
          changed: 8am,
          status: { data: archived, changed: 8am },
          archived: { data: 8am, changed: 8am },
        }
        

10am

User B finds the asset missing, assumes it was taken to slaughter about an hour ago, marks the change, but doesn’t sync.

  • Field Kit B
    • User sets ASSET_1
      • {
          changed: 10am,
          status: { data: archived, changed: 10am },
          archived: { data: 9am, changed: 10am },
        }
        

12pm

User B goes home for lunch and syncs to the server.

  • Field Kit B
    • User initiates sync process
    • Requests server’s ASSET_1
  • farmOS Server
    • Sends data for ASSET_1 to Field Kit B
      • {
          changed: 6am,
          status: active,
          archived: null,
        }
        
  • Field Kit B
    • Receives server’s ASSET_1
    • Merges local and remote, Last-Write-Wins, so local remains unchanged
      • {
          changed: 10am,
          status: { data: archived, changed: 10am },
          archived: { data: 9am, changed: 10am },
        }
        
    • Sends data for ASSET_1 to server
      • {
          changed: 10am,
          status: archived,
          archived: 9am,
        }
        
  • farmOS Server
    • Receives Field Kit B’s ASSET_1
    • Updates ASSET_1
      • {
          changed: 10am*,
          status: archived,
          archived: 9am,
        }
        

* = This is the one important data point I’m not entirely sure about, but I’m assuming the server accepts the client’s changed value.

1pm

User A gets back from the slaughterhouse and syncs with the server.

  • Field Kit A
    • User initiates sync process
    • Requests server’s ASSET_1
  • farmOS Server
    • Sends data for ASSET_1 to Field Kit A
      • {
          changed: 10am,
          status: archived,
          archived: 9am,
        }
        
  • Field Kit A
    • Receives server’s ASSET_1
    • Merges local and remote, Last-Write-Wins, so local is overwritten
      • {
          changed: 10am,
          status: { data: archived, changed: 8am* },
          archived: { data: 9am, changed: 10am },
        }
        
    • No more recent changes, so no updates sent to server

* = If I recall correctly this is what happens, since the values haven’t changed. I should probably check the code again, but I’m not sure it matters either way.

Conclusions

The more accurate data gets overwritten in this case, but I still don’t know if there’s any way around this, programmatically at least. The underlying assumption is that newer data is more trustworthy, hence the Last-Write-Wins approach, but short of discarding that approach, which I still believe is correct, I don’t think there’s much we can do about this.
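The merge steps in the timeline can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual farmOS.js source: the function name `mergeLastWriteWins`, the `fields` wrapper, and the use of plain numbers for times (8 = 8am) are all made up for the example.

```javascript
// Hypothetical sketch of a per-field Last-Write-Wins merge; NOT the farmOS.js code.
// Times are plain numbers (hour of day) to mirror the timeline: 8 = 8am, 10 = 10am.
//
// local:  { changed, fields: { name: { data, changed } } }  (Field Kit shape)
// remote: { changed, fields: { name: value } }               (server shape)
function mergeLastWriteWins(local, remote) {
  const fields = {};
  for (const [name, localField] of Object.entries(local.fields)) {
    const hasRemote = name in remote.fields;
    const remoteIsNewer = remote.changed > localField.changed;
    // Assumption (per the 1pm footnote): when the values are identical the
    // local metadata is kept. This only affects which metadata survives,
    // never which value wins.
    const valueDiffers = hasRemote && remote.fields[name] !== localField.data;
    fields[name] = remoteIsNewer && valueDiffers
      ? { data: remote.fields[name], changed: remote.changed }
      : { ...localField };
  }
  return { changed: Math.max(local.changed, remote.changed), fields };
}

// Field Kit A at 1pm, receiving the server's copy written by Field Kit B.
const fieldKitA = {
  changed: 8,
  fields: {
    status: { data: 'archived', changed: 8 },
    archived: { data: 8, changed: 8 },
  },
};
const server = { changed: 10, fields: { status: 'archived', archived: 9 } };

console.log(JSON.stringify(mergeLastWriteWins(fieldKitA, server)));
```

With these inputs the `archived` value of 8am is replaced by 9am, which is the overwrite of the more accurate data described above.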

Thanks for writing this up @jgaehring!

Let me propose a slightly modified version of the story - without the technical details for the moment - so we can see whether we’re aligned on the expected behavior.

Let’s assume the farm wifi is down but everybody is happy doing hands-on stuff and they haven’t really noticed.

On Monday user A slaughters animal #1 and moves animal #2 into the newly vacated paddock. Both are recorded through Field Kit in the correct order, but not synced.

On Tuesday user B notices that animal #1 is no longer in the paddock and that a different animal is there now. User B decides to archive animal #1 because they knew it was going to be slaughtered and ask their colleagues which animal was moved into the paddock. User B’s archive action isn’t synced either.

On Wednesday user A and B are having their weekly meeting and confer on the dispositions of animals #1 & #2.

User C: “… You’ll be pleased to hear I fixed the wifi this morning folks… User B weren’t you wondering what had happened in animal #1’s old paddock?”

User B: “Yeah, some unknown animal seems to have replaced animal #1 and I couldn’t tell what had happened from farmOS.”

User A: “Yep that’s right, on Monday I did slaughter animal #1 and the unknown animal you saw in its paddock was animal #2.”

User B: “Did you remember to record it in farmOS?”

User A: “Yeah, see here in Field Kit, the animal in the paddock is animal #2 and the archive date for animal #1 is Monday.”

User B: “Huh, that’s not what I see in my Field Kit. On Tuesday I tried to use Field Kit to check what had happened and it still showed animal #1, even though I could see that some other animal was there. I marked animal #1 as archived since I knew it was supposed to be slaughtered.”

User A: “Oh, look. There’s a red badge at the top in my Field Kit that shows 2 unsynced changes. Let me hit this sync button…”

User B: “I’ll do that too…”

[regardless of the order those syncs go through]

User A and B should both now see that animal #1 was archived on Monday and never overlapped in its tenancy of the paddock with animal #2.

Awesome, thanks for laying out the original scenario you had in mind, @Symbioquine.

I think in essence the scenarios are the same, at least in terms of the syncing logic. Again, whichever value for any particular field was entered last, that will be the value that wins out when resolving any conflict.

An important aspect of all this is that the archived timestamp is wholly separate from the changed timestamp. So when this occurs,

there will have to be a separate decision about what value (timestamp) to assign to the archived field. The user or the application logic (e.g., a particular field module) can decide that, but regardless of what the archived value is, the changed value will always be the current timestamp, even if the archived value is a timestamp of the day before.

Getting back to your example scenario, if User B decides to record that the animal was archived on Tuesday, then the data will be wrong. But I believe that problem is outside the scope of what farmOS.js and Field Kit Core APIs are trying to provide. It will still be incumbent upon the developer of the particular module (let’s call it the Animal Harvest Module) to decide what value to assign to archived. They will need to decide whether to assign the current timestamp to archived automatically, or require the user to manually enter a time, or assign some other default, etc. Similarly, if it’s considered an invalid state for two animals to occupy the same paddock at the same time, that too should be the responsibility of the module developer and/or user.

farmOS.js and Field Kit Core APIs are only going to solve the problem of concurrent computing processes, smoothing over some of that complexity of storing multiple copies of the data on independent systems and providing a simpler interface for making changes that simulates a more unified system. Beyond that, the intention is to leave decisions about the actual domain model, e.g. what is considered an invalid state, up to the module developer/user, to give them the greatest freedom to implement their logic and behavior on top of that abstraction.

I can see the argument that when User B chose to also archive the animal they could perhaps have entered a more “correct” date.

However, let’s assume that Field Kit - like farmOS itself - doesn’t give you the option of specifying the archived date explicitly.

Instead, the user’s intent is to perform the state change from active to archived. Their expectation presumably is that the timestamp for when the asset was archived is the time they took that action in Field Kit - even if syncing happens much later.

So, Field Kit is implicitly capturing the date associated with the state transition so that it can fulfill the user’s expectation there.

Now, I would argue that multiple users syncing for that archive transition should be idempotent and produce the same results regardless of what order the syncing happens in.

To achieve that, I believe the merging state logic would need to honor the earlier of the archived timestamps.

It seems fairly irrelevant to compare the asset revision timestamp against when the user elected to make the state transition for this case.
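A minimal sketch of the "honor the earlier archived timestamp" rule, to show why it is order-independent. The function name `mergeArchived` is hypothetical, times are plain numbers taken from the timeline (8 = 8am), and the sketch deliberately ignores un-archiving, which would need further domain logic.

```javascript
// Hypothetical resolver for the `archived` field only: keep the earliest
// non-null timestamp, so the result does not depend on sync order.
function mergeArchived(a, b) {
  if (a === null) return b; // nothing recorded on one side: take the other
  if (b === null) return a;
  return Math.min(a, b); // both archived: the earlier transition wins
}

const serverInitial = null; // asset still active on the server
const userA = 8; // User A archived at 8am
const userB = 9; // User B recorded 9am

// Sync order B then A, and A then B.
const bThenA = mergeArchived(mergeArchived(serverInitial, userB), userA);
const aThenB = mergeArchived(mergeArchived(serverInitial, userA), userB);

console.log(bThenA, aThenB); // 8 8
```

Because taking the minimum is commutative, associative, and idempotent, repeated or reordered syncs converge on the same value.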

Hope I didn’t shut the conversation down with my last reply…! :grimacing:

I realize there are a couple points from your last posts that I didn’t fully respond to:

I’m not convinced the “changed” timestamp can be set via the API. Here’s how I tested:

$ OAUTH2_ACCESS_TOKEN=`curl --silent -X POST -d "grant_type=password&username=root&password=test&client_id=farm&scope=openid" http://localhost/oauth/token | grep -Po 'access_token":"\K[^"]+'`

$ API_RESPONSE=`curl --silent --header "Authorization: Bearer $OAUTH2_ACCESS_TOKEN" "http://localhost/api"`
$ echo $API_RESPONSE | jq '.meta.farm.version'
"2.x"

$ SHEEP_ANIMAL_TYPE_ID=`curl --silent --header "Authorization: Bearer $OAUTH2_ACCESS_TOKEN" "http://localhost/api/taxonomy_term/animal_type" -H "Content-Type: application/vnd.api+json" -H "Accept: application/vnd.api+json" -X POST -d '{"data": {"type": "taxonomy_term--animal_type", "attributes": {"name": "Sheep"}, "relationships": {}}}' | jq -r '.data.id'`

$ ANIMAL_DOLLY_ID=`curl --silent --header "Authorization: Bearer $OAUTH2_ACCESS_TOKEN" "http://localhost/api/asset/animal" -H "Content-Type: application/vnd.api+json" -H "Accept: application/vnd.api+json" -X POST -d '{"data": {"type": "asset--animal", "attributes": {"name": "Dolly"}, "relationships": {"animal_type": {"data": {"type": "taxonomy_term--animal_type", "id": "'$SHEEP_ANIMAL_TYPE_ID'"}}}}}' | jq -r '.data.id'`

curl --silent --header "Authorization: Bearer $OAUTH2_ACCESS_TOKEN" "http://localhost/api/asset/animal/$ANIMAL_DOLLY_ID" -H "Content-Type: application/vnd.api+json" -H "Accept: application/vnd.api+json" -X PATCH -d '{"data": {"type": "asset--animal", "id": "'$ANIMAL_DOLLY_ID'", "attributes": {"status": "archived", "archived": "2021-07-17T19:45:49+00:00", "changed": "2021-01-01T19:45:49+00:00"}}}' | jq

That results in:

{
  "jsonapi": {
    "version": "1.0",
    "meta": {
      "links": {
        "self": {
          "href": "http://jsonapi.org/format/1.0/"
        }
      }
    }
  },
  "errors": [
    {
      "title": "Forbidden",
      "status": "403",
      "detail": "The current user is not allowed to PATCH the selected field (changed).",
...

It could be a permission issue, but I suspect that it’s actually more that the “changed” field is only intended to be set in scenarios such as a migration. It would be nice if I could find more official documentation, but this thread implies that anyway. Maybe @mstenta knows - or knows where to look…?


I respect that.

I guess my point is that it’s going to be hard to abstract that field merging logic in a way that doesn’t hard-code more specific domain logic and/or severely curtail the freedom of module developers/users to bring their domain knowledge to bear.

I believe this (archived state/timestamp merging case) isn’t the only case where such inconsistencies will occur either. For example, updating the intrinsic geometry and changing the “is_fixed” field is another case where domain knowledge is needed about the interplay between different fields.

Great discussion @jgaehring and @Symbioquine!

Regarding created and changed timestamps specifically, these are managed by the server, and changed is set to the current timestamp whenever an asset is saved/updated.

I wasn’t sure if it was possible to change those via API, so thanks for testing that @Symbioquine.

It might be best to consider these “internal” properties - which represent the “server’s perspective” of when the asset was created/changed.

In that case, they will be set when records are synced, and client apps like Field Kit cannot override them.

I can see arguments for/against this, but in general I think I like that behavior. And it provides an extra piece of information to help understand the order in which overlapping syncs may have taken place.

On that note, we also have revisions enabled for assets in 2.x, so a revision user and timestamp will automatically be saved with each sync. This also provides some audit trail for detangling potential overlaps like those described above. It’s still a manual process, but I think that’s inescapable to some extent, as you both pointed out.
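Since the server manages created and changed itself, a client would presumably leave them out of anything it writes. A rough sketch of that, assuming the local Field Kit state wraps each field as `{ data, changed }` as in the timeline; `toPatchAttributes` and `SERVER_MANAGED` are hypothetical names, not part of farmOS.js, and `ASSET_1` stands in for a real asset id.

```javascript
// Hypothetical helper: turn Field Kit's local state into JSON:API
// attributes for a PATCH request, omitting server-managed properties.
const SERVER_MANAGED = new Set(['created', 'changed']);

function toPatchAttributes(localState) {
  const attributes = {};
  for (const [name, field] of Object.entries(localState)) {
    if (SERVER_MANAGED.has(name)) continue; // the server sets these on save
    attributes[name] = field.data; // drop the per-field `changed` metadata
  }
  return attributes;
}

const localAsset = {
  changed: '2021-07-17T19:45:49+00:00',
  status: { data: 'archived', changed: '2021-07-17T19:45:49+00:00' },
  archived: {
    data: '2021-07-17T19:45:49+00:00',
    changed: '2021-07-17T19:45:49+00:00',
  },
};

const patchBody = {
  data: {
    type: 'asset--animal',
    id: 'ASSET_1', // placeholder id
    attributes: toPatchAttributes(localAsset),
  },
};

console.log(JSON.stringify(patchBody, null, 2));
```

The resulting body carries only status and archived, so it should not trip the 403 on changed seen in the curl test.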

Also important to clarify, if this was ambiguous: I think @jgaehring’s examples include both the core changed timestamp on the asset as a whole AND a farmOS.js/Field Kit specific changed timestamp that is saved as meta information on EACH field of the asset - for tracking what was actually changed within Field Kit itself.

I was only referring to the core changed timestamp on the asset as a whole in my comment above, to be clear.

That field-level changed timestamp isn’t synced to the server - or between clients - right?

Hey everyone, thanks for the feedback! Just signing back in from a busy weekend of travel and intermittent data service.

This is correct, and worth noting. Thanks for pointing that out. I wanted to include all those values, b/c they are all used by FK to determine the value of archived when merging.

I agree that there should be some logic to handle transitions like this. I guess I just don’t see that as the purpose of the merge function (in farmOS.js) specifically. Really, the value of archived, or the value of any field for that matter, is not even a determining factor in the merge function’s logic. It really only considers the metadata so it can establish the chronology of events. Eventually, I would like this process to be non-destructive, so all those changes are preserved in their order, and can be played back or reconstructed at a later date, but that’s outside the scope of what I’m trying to achieve with this alpha release (and probably the general release as well).
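To make the non-destructive idea a bit more concrete, here is one possible shape for it: an append-only log of field edits that can be replayed up to any point in time. This is purely a sketch; nothing like `replay` or this log format exists in farmOS.js as far as this thread indicates, and times are plain numbers from the timeline (6 = 6am).

```javascript
// Hypothetical append-only change log: each entry records one field edit.
// Replaying the log in chronological order reconstructs the state at any time.
function replay(log, until = Infinity) {
  const state = {};
  // Copy before sorting so the log itself is never mutated.
  const ordered = [...log].sort((a, b) => a.at - b.at);
  for (const { field, value, at } of ordered) {
    if (at > until) break;
    state[field] = value; // later edits overwrite earlier ones in the view only
  }
  return state;
}

// Edits from the timeline, appended in sync order rather than time order.
const changeLog = [
  { at: 6, by: 'server', field: 'status', value: 'active' },
  { at: 6, by: 'server', field: 'archived', value: null },
  { at: 10, by: 'B', field: 'status', value: 'archived' },
  { at: 10, by: 'B', field: 'archived', value: 9 },
  { at: 8, by: 'A', field: 'status', value: 'archived' },
  { at: 8, by: 'A', field: 'archived', value: 8 },
];

console.log(replay(changeLog)); // { status: 'archived', archived: 9 }
console.log(replay(changeLog, 8)); // { status: 'archived', archived: 8 }
```

The final view is still Last-Write-Wins, but User A’s 8am entry stays in the log, so it could be recovered later by a person or by field-specific logic.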

I also want to point out that, at this stage, nowhere in Field Kit Core, nor in the Tasks module, is it possible to modify any asset whatsoever, and there aren’t even any references anywhere in the codebase to the archived field. So someone has to write some code to do that, at this point, which is why this is a great time to be discussing where and how that should occur!

Probably, there needs to be some codified logic specific to individual fields or types of fields, which only applies to that format of data and what it represents. And it needs to exist somewhere in between farmOS.js’s merge function and the module or other application code which actually triggers the change, that is, if we want such changes to be consistent across all modules and implementations. Right now, FK Core would probably be the most appropriate place for such logic. Eventually, however, I think we need to think about how such logic can be communicated between farmOS systems in-band. With 2.x, we are finally getting to a point where so much of the farmOS Data Model, at least as pertains to record types, can be communicated across the wire as JSON Schema. But to truly enable both interoperability and modularity, we will need to be able to communicate specific field logic in such a way as well. Maybe that means some form of RPC, or just more rigorous documentation. In any event, I don’t think it’s a blocker for our alpha and beta releases, and may be outside the scope of 2.0 entirely. Hopefully by farmOS 3.0 we can get to the point where we can achieve a truly distributed architecture, but… baby steps :baby:
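One way such a layer could look, sketched under heavy assumptions: a registry of field-specific resolvers that sits on top of a generic Last-Write-Wins default. The names `fieldResolvers`, `lastWriteWins`, and `mergeFields` are invented for this example, both sides are assumed to carry `{ data, changed }` per field, and the archived rule is the "earliest timestamp" proposal from earlier in the thread.

```javascript
// Generic default: Last-Write-Wins on the per-field metadata.
const lastWriteWins = (local, remote) =>
  (remote.changed > local.changed ? remote : local);

// Hypothetical registry of field-specific rules, e.g. living in FK Core.
const fieldResolvers = new Map([
  // For `archived`, prefer the earliest recorded transition.
  ['archived', (local, remote) => {
    if (local.data === null) return remote;
    if (remote.data === null) return local;
    return remote.data < local.data ? remote : local;
  }],
]);

function mergeFields(local, remote) {
  const merged = {};
  for (const name of Object.keys(local)) {
    // Fall back to the generic rule when no field-specific one is registered.
    const resolve = fieldResolvers.get(name) ?? lastWriteWins;
    merged[name] = name in remote
      ? resolve(local[name], remote[name])
      : local[name];
  }
  return merged;
}

// Field Kit A's local copy versus the values that originated on Field Kit B.
const kitA = {
  status: { data: 'archived', changed: 8 },
  archived: { data: 8, changed: 8 },
};
const kitB = {
  status: { data: 'archived', changed: 10 },
  archived: { data: 9, changed: 10 },
};

console.log(mergeFields(kitA, kitB));
```

Here status still resolves by Last-Write-Wins, while archived keeps the 8am value regardless of which side is treated as local.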
