March 14th meeting - geometry, schemas

Some outcomes

Topic 1: How do we store geometry data?

How is geometry calculated by farmOS, and is that a reasonable convention to pull out?

How should we address alternative formats?

Discussion about Location & Geometry - how are they stored?

  • What farmOS already has in terms of schema specification
  • Pulling in from other places in a standard way (proposal)
  • Location constraints in a useful way

Decisions:

  • Computed values… can we specify computed values?
  • Creating a location convention w/ detail on
    • What is required (geometry.value)
    • That it’s validated (validate the WKT)
    • How location ‘bubbles up’ and where (Location | farmOS)
    • What it means to have a separate geometry w/ a location
    • Validate using regex
    • Etc.

Location and time are currently central to the database for Laura Morton’s IRA-GHG project with USDA (linking conservation practices to soil carbon), which is trying to build ‘one database to rule them all’.

[farmosinstance]/api

For logs & assets, geometry is stored in attributes; it only needs geometry.value as a WKT string.

Link to OpenTEAM Slack convo on WKT vs GeoJSON - if you have access to the OT Slack you can read this; I can add it to the forum post later.

The WKT is built not via farmOS but via OpenLayers - you can push whatever WKT you want.

Supports multiple formats - not shapefiles, but KML, GeoJSON, etc.

WKT is ONLY simple geometry (primitive, so less flexibility for this field, for our purposes) - other formats include other information, & in this case that other information should be stored elsewhere.

Subsequent geometry fields are computed in farmOS from the information in value.

farmOS figures out the bounding box & centroid (-> lat/long) from the WKT information.
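
As a rough illustration of how a bounding box and centroid can be derived from a WKT value, here is a minimal Python sketch. It is not farmOS's actual implementation: the function names are made up for this example, it only handles a single-ring POLYGON, and it treats coordinates as planar (x as longitude, y as latitude) rather than doing any geodesic math.

```python
import re


def parse_polygon_ring(wkt):
    """Return the outer ring of a simple 'POLYGON ((x y, ...))' string.

    Assumption for this sketch: one ring, 2D coordinates, closed ring.
    """
    match = re.fullmatch(r"\s*POLYGON\s*\(\(([^()]*)\)\)\s*", wkt, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("only single-ring POLYGON WKT is handled in this sketch")
    points = []
    for pair in match.group(1).split(","):
        x, y = pair.split()
        points.append((float(x), float(y)))
    if len(points) < 4 or points[0] != points[-1]:
        raise ValueError("ring must be closed (first point repeated at the end)")
    return points


def bounding_box(ring):
    """Return (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def centroid(ring):
    """Area-weighted planar centroid via the shoelace formula."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    return cx / (3 * area2), cy / (3 * area2)


ring = parse_polygon_ring("POLYGON ((0 0, 4 0, 4 2, 0 2, 0 0))")
print(bounding_box(ring))  # (0.0, 0.0, 4.0, 2.0)
print(centroid(ring))      # (2.0, 1.0)
```

This is one of several ways to calculate a centroid, which is part of why the computed values are hard to pin down in a schema.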

In terms of validating data for WKT - considering using JSON Schema regular expressions to check for a keyword (POLYGON etc.) - can’t check the full format in JSON Schema currently. Other ideas?

Plan on doing this via helper functions in schema builder

Stack Overflow link - but it can’t be done inside the schema
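
To make the keyword-check idea concrete, here is a minimal Python sketch. The schema fragment and the helper name `looks_like_wkt` are hypothetical, not part of the schema builder; the regex only confirms that the string starts with a geometry keyword, and the extra parenthesis check stands in for the kind of validation that would have to live in a helper function outside the schema.

```python
import re

# Hypothetical JSON Schema fragment for geometry.value. A `pattern` can only
# confirm that the string starts with a known WKT geometry keyword.
GEOMETRY_VALUE_SCHEMA = {
    "type": "string",
    "pattern": r"^\s*(POINT|LINESTRING|POLYGON|MULTIPOINT|MULTILINESTRING"
               r"|MULTIPOLYGON|GEOMETRYCOLLECTION)\b",
}


def looks_like_wkt(value):
    """Keyword check (as the schema would do) plus a balanced-parentheses check.

    This is still not a full WKT parser; it only rejects obviously bad input.
    """
    if not isinstance(value, str):
        return False
    # JSON Schema `pattern` is an unanchored search, so re.search mirrors it.
    if re.search(GEOMETRY_VALUE_SCHEMA["pattern"], value) is None:
        return False
    depth = 0
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


print(looks_like_wkt("POLYGON ((0 0, 1 0, 1 1, 0 0))"))  # True
print(looks_like_wkt("POLYGON ((0 0, 1 0"))              # False
print(looks_like_wkt("CIRCLE (0 0)"))                    # False
```

Note the pattern is case-sensitive as written, so lowercase keywords would be rejected unless the pattern is extended.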

Additional information is useful but not required - should we keep these fields optional?

Use is application-specific, which points to keeping them optional - can run more efficiently if we don’t require them.

Also from a data compression standpoint - optional = data as compressed as possible

Concern is that the schema is too loose - should it be more restrictive? How is the WKT validated?

Value of the schema is via validation - how can we make this field validatable?

Currently you could externally validate e.g. what the centroid is, & there are many ways to calculate it

Limits now are ‘is this a number?’, ‘is this a string?’ & beyond that it’s application-specific

Distinctions -

Validation of data type vs value range, computed values

Assets have an intrinsic geometry field, but geometry field itself is computed

Implicit bubbling-up from a log to an asset to a field - a log can be connected to a planting but have its own unique location. Assets can pull from logs.

Can conceptualize assets as a tag for logs to show relationship

For a broader location convention, want to include that capability

Assets can have intrinsic geometry or be moveable - if moveable, derived via most recent movement log - built with this complexity for a reason, so it can handle this variability
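
A minimal Python sketch of that resolution rule, under simplifying assumptions: the dictionary keys used here (`is_fixed`, `intrinsic_geometry`, `is_movement`, `asset_ids`, `timestamp`, `geometry`) are illustrative stand-ins for this example, and the real farmOS logic takes more conditions into account than this.

```python
def current_geometry(asset, logs):
    """Return the WKT geometry for an asset, or None if it has no location.

    Fixed assets use their own intrinsic geometry; movable assets take the
    geometry of the most recent movement log that references them.
    """
    if asset.get("is_fixed"):
        return asset.get("intrinsic_geometry")
    movements = [
        log for log in logs
        if log.get("is_movement") and asset["id"] in log.get("asset_ids", [])
    ]
    if not movements:
        return None
    latest = max(movements, key=lambda log: log["timestamp"])
    return latest.get("geometry")


field = {"id": "field-1", "is_fixed": True,
         "intrinsic_geometry": "POLYGON ((0 0, 4 0, 4 2, 0 2, 0 0))"}
herd = {"id": "herd-1", "is_fixed": False}
logs = [
    {"is_movement": True, "asset_ids": ["herd-1"], "timestamp": 100,
     "geometry": "POINT (1 1)"},
    {"is_movement": True, "asset_ids": ["herd-1"], "timestamp": 200,
     "geometry": "POINT (3 1)"},
    {"is_movement": False, "asset_ids": ["herd-1"], "timestamp": 300,
     "geometry": "POINT (9 9)"},
]
print(current_geometry(field, logs))  # POLYGON ((0 0, 4 0, 4 2, 0 2, 0 0))
print(current_geometry(herd, logs))   # POINT (3 1)
```

The non-movement log at timestamp 300 has its own geometry but does not change where the herd is, which matches the idea that a log can have a unique location without relocating its assets.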

Topic 2: Any updates on animal ag from Kevin?

  • Initial draft received comments that will affect most categories. New date for public release is June (first version for internal testing)

Topic 3: some suggestions regarding Comet Farm cropland data schema

Inventory of suggestions: Airtable - Comet Farm [Public View]

Topic 4: Suggestion for translating data from multiple sources

If you’re bringing in information, you should

  1. retain information from that external source in case you need to regenerate the convention from that source

  2. do this using the file relationship

E.g. if you have a GeoJSON file with location and meta-information, all that info should get pulled into the relevant place in the convention, but the original should also be retained.
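
A minimal Python sketch of that import pattern, assuming a made-up record layout: the keys `metadata` and `source_file`, and the function names, are hypothetical and only stand in for wherever the convention and the file relationship would actually put this data. Only Point and Polygon geometries are converted here.

```python
import json


def _ring_to_text(ring):
    # Use only the first two ordinates of each position (x y).
    return "(" + ", ".join(f"{p[0]} {p[1]}" for p in ring) + ")"


def geojson_geometry_to_wkt(geometry):
    """Convert a GeoJSON Point or Polygon geometry to a WKT string."""
    kind = geometry["type"]
    coords = geometry["coordinates"]
    if kind == "Point":
        return f"POINT ({coords[0]} {coords[1]})"
    if kind == "Polygon":
        return "POLYGON (" + ", ".join(_ring_to_text(r) for r in coords) + ")"
    raise ValueError(f"geometry type {kind!r} is not handled in this sketch")


def import_feature(feature_text, filename):
    """Pull a GeoJSON Feature into a record while retaining the raw source."""
    feature = json.loads(feature_text)
    return {
        # Geometry goes where the convention expects it: geometry.value as WKT.
        "geometry": {"value": geojson_geometry_to_wkt(feature["geometry"])},
        # Extra properties are stored elsewhere, not inside the geometry.
        "metadata": dict(feature.get("properties") or {}),
        # Hypothetical stand-in for the file relationship: keep the original
        # so the record can be regenerated from the source later.
        "source_file": {"filename": filename, "content": feature_text},
    }


raw = ('{"type": "Feature", "properties": {"name": "North field"}, '
       '"geometry": {"type": "Polygon", '
       '"coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}}')
record = import_feature(raw, "north_field.geojson")
print(record["geometry"]["value"])  # POLYGON ((0 0, 1 0, 1 1, 0 0))
```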

PCSC workbook discussions - schemas/ontologies, COMET tie ins

Mapping, USDA x comet-planner buy-in for implementation

On the farmOS side, leaning toward asking the maximal amount of questions to meet all three requirements (since we already have to ask for the info necessary for the workbook & comet-planner for completion, we want to be able to use the same info for both: ask the superset now, hopefully collapse it in the future).

Key is if you’re coming from a detailed, log-based system you can make reasonable assumptions from…or not.

Maybe future topics

  • Can we have a bigger conversation about animal ag - how is data stored associated with animals and herds in the convention?

Could loop in with TerraGenesis grazing data services collab work at later point


GitLab post from Greg re: WKT vs GeoJSON mentioned in call today: