I would like to share some thoughts and start a discussion on integrating PostGIS for improved handling of spatial data.
It would probably also help farmOS connect with other software (such as QGIS) and make it easier to integrate further data and analysis functions.
I’m very interested in this too! We did some initial testing with PostGIS in farmOS 1.x, and it worked pretty seamlessly - although there are a couple of outstanding PostgreSQL bugs in 1.x so it’s not recommended. However, PostgreSQL will be the default recommended database for farmOS 2.x (in active development now). So it would be great to start exploring some ideas for the near future!
For what it’s worth, both farmOS 1.x and 2.x use the Geofield module to store geometry data alongside assets and logs. By default, this is stored as Well Known Binary, in a way that is cross-compatible with MySQL/MariaDB/PostgreSQL. However, the Geofield module supports swappable “storage backends”. I believe that Geofield does already support PostGIS storage in Drupal 7, but I’m not sure about Drupal 9 (which is what 2.x is built on). I see some open issues in their issue queue:
So I think the work here will be in developing a deeper understanding of what is/isn’t currently possible with Geofield, and where we go from there! In a perfect world, farmOS server admins could choose either “default” or “postgis” backend storage when they install farmOS for the first time, which would keep the same flexibility in backend we currently offer, while allowing for more PostGIS-specific possibilities for those who have it. This also means we need to be thinking about how these different storages get used throughout farmOS, wherever geometry data is retrieved and worked with in the code.
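For context on the Well Known Binary storage mentioned above, here is a minimal Python sketch of what a WKB-encoded 2D point looks like. It only illustrates the byte layout of the format; the function names are made up for this example, and it is not how Geofield itself reads or writes geometries.

```python
import struct

WKB_POINT = 1  # geometry type code for a 2D Point in WKB


def encode_wkb_point(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (hypothetical helper)."""
    # "<" = little-endian, "B" = byte-order flag (1 = little-endian),
    # "I" = geometry type, "dd" = the x and y coordinates as doubles.
    return struct.pack("<BIdd", 1, WKB_POINT, x, y)


def decode_wkb_point(data: bytes) -> tuple[float, float]:
    """Decode a 2D WKB point written in either byte order."""
    endian = "<" if data[0] == 1 else ">"
    geom_type, x, y = struct.unpack(endian + "Idd", data[1:21])
    if geom_type != WKB_POINT:
        raise ValueError(f"not a WKB point (type {geom_type})")
    return x, y


blob = encode_wkb_point(1.0, 2.0)
print(blob.hex())
print(decode_wkb_point(blob))
```

Because the bytes are plain WKB, any database can hold them in a binary column, which is what makes this storage portable across MySQL/MariaDB/PostgreSQL. A PostGIS backend would instead keep them in a native geometry column that the database can index and query spatially.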
@jolau, you’re the second person I’ve heard recently say that farmOS “should have PostGIS integration” without describing specifically why.
I can imagine a number of ways Geospatial features at the database layer might be helpful in conjunction with farmOS (e.g. from a performance or interoperability standpoint), but I’d have a hard time pitching it as a critical feature myself without more concrete profiling/use-case data.
For the record, perhaps it would be worth describing what problem(s) you’re hoping to solve with farmOS and PostGIS together that is harder to solve with farmOS or PostGIS alone.
Dunno if this is sufficiently detailed to answer your request for UseCase justification, @symbioquine -but i can tell you why i was so happy to hear from @mstenta that PostGIS integration will be made significantly easier in farmOS 2.x (with PostgreSQL as the standard db, that is a given).
That being the situation w/r/t my geospatial data architecture, there are many individual UseCases that would benefit from PostGIS integration, which i could further elaborate if need be, but perhaps you get the picture: for any farmer with a GIS (a large and growing % of farm managers) with PostGIS (closest thing to a standard in the world of GeoSpatial) as database/ integration layer, the benefits of having PostGIS hooks in farmOS would be great indeed.
When interoperability is really an edge case, one has to question the benefits… But i suspect if you were to survey the market of people managing farms of significant size (let’s say >100 acres- mine being at the very bottom of that range, FWIW), you would find the use of Geographic Information Systems to be quite common.
@walt, you explained it very well.
Since farming depends on spatial and spatio-temporal data, I see the usage of geospatial tools as an enabler for further, enhanced data processing steps.
PostGIS is one of the most popular and feature-rich databases for geospatial data. With its broad usage and its support for many standards and algorithms, using it would give farmOS more options to connect with further data sets, and would let other UI/tool stacks work from a common, shared data set without conversions etc.
QGIS would be one such tool (it has an excellent interface with PostGIS). But other tools, such as Python, R, or various data visualization/analytics tools, would also be able to consume data that is accessible via SFA (https://www.ogc.org/standards/sfa).
However, a clearly defined access path between the tools and the database is needed to avoid making a mess of your data. I am unfortunately not familiar with Drupal, but maybe mstenta can help us understand the data access layer, and we could discuss options for accessing farmOS data using further tools.
Depending on the use-case, read-only access would definitely do one part of the job, and could also be solved by setting up a GeoServer (geoserver.org) for external data access.
An integration using a CRUD interface would enable full interaction between the tools, but this brings further complexity.
I think this is a key point. These use-cases seem to be more about general GIS interoperability with farmOS than about PostGIS specifically.
I would argue that having farmOS support OGC standards APIs - either directly or through extensions/proxies - would have more value for most use-cases than enabling PostGIS storage at the DB layer.
Having farmOS expose OGC standards APIs would allow those APIs to provide extra value - such as creating new movement logs when asset geometries are changed or enforcing the enumeration of valid area types.
On the other hand, building interoperability using PostGIS at the DB layer means the interoperating software also has to understand the farmOS DB layer and, if modifying it, do so only in ways that are still valid in farmOS. It also couples both pieces of software to PostgreSQL.
As an example, exposing OGC standards APIs could be done as a separate WFS proxy like I have experimented with in farm-os-area-feature-proxy or as a farmOS/contrib module which exposes the WFS API directly.
In other words, this would mean the interoperating software - QGIS in Walt’s case - would probably need to understand the relationships of many of the farmOS DB tables, not just those which hold the geospatial data. I believe QGIS can do that fairly easily, but it would be a non-trivial amount of configuration when you take into account all the data fields associated with an asset like a planting.
Also if you wanted to make the interoperating software understand the location of an asset, you would need to re-implement the logic of finding the most recent movement log for that asset.
None of this is insurmountable, it’s just a fairly high cost to pay for every new piece of interoperating software. Especially when the result is likely fragility from high coupling to the low level data storage of farmOS - @mstenta can probably say better than I can for farmOS - but most platforms don’t guarantee that their internal data models will stay backwards compatible.
I agree: integrations should use the API if at all possible, not direct database connections. Database tables probably won’t change much (although worth noting that farmOS 1.x and 2.x are quite different) - but the API should remain relatively stable. It serves to abstract the lower-level database structures.
Also if you wanted to make the interoperating software understand the location of an asset, you would need to re-implement the logic of finding the most recent movement log for that asset.
Yes, the API already shows “current location” information for assets - which is calculated based on logs. So API consumers don’t need to replicate that logic. They can just ask “where is this asset right now?” (and maybe in the future: “where was this asset at this time in the past?” (future too)).
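To give a feel for the logic an API consumer gets to skip, here is a simplified Python sketch of "find the most recent movement log for an asset". The `MovementLog` class, its fields and `location_at` are hypothetical names for illustration only. This is not farmOS's data model or its actual implementation, and it assumes that only completed logs dated at or before the moment of interest count.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class MovementLog:
    """Hypothetical, flattened stand-in for a movement log record."""
    asset_id: str
    timestamp: datetime
    location: str
    done: bool = True


def location_at(asset_id: str, logs: list[MovementLog], at: datetime) -> Optional[str]:
    """Return the asset's location at time `at`, or None if it never moved."""
    candidates = [
        log for log in logs
        if log.asset_id == asset_id and log.done and log.timestamp <= at
    ]
    if not candidates:
        return None
    # The latest qualifying movement determines where the asset is.
    return max(candidates, key=lambda log: log.timestamp).location
```

Passing an earlier `at` value is all it takes to ask "where was this asset at this time in the past?". Software that reads the database tables directly would have to re-implement rules like these and keep them in step with farmOS.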
And to be clear, by “integration” in my last comment I mean: connections between farmOS and other systems (eg: ArcGIS/QGIS).
But there are also the lower-level “integration” ideas I was referring to earlier, which specifically referred to “writing code in farmOS to allow it to leverage some of the power of PostGIS for doing things IN farmOS” (not third-party integrations, but “PostGIS integration” is more what I meant).
Ah! so… If i’m understanding you right, @mstenta, then this is about much more than simple data sync between two apps that share a datatype (e.g. a water point, a fence line, a field polygon, etc.). Well then…
From my naive (i.e. non-technical) user perspective, i can only guess that if farmOS were using PostGIS for all it’s worth, then maybe we would not be limited to the hierarchical view of Areas that farmOS imposes (? anyway, that’s how i understood it, setting up my areas in farmOS). So maybe we could have different views on our areas that overlap and intersect in different ways, much as layers do in a GIS implementation -yes?
And if that were the case… Well, then geospatial objects in farmOS could inherit attributes that might be defined in a different “layer,” if you know what i mean. For example: in my GIS, every landplot is defined in terms of its use (e.g. pasture, woodland, annual cropping, etc.) in one layer… But then in another layer, geographic areas are defined in terms of soil type. Now if i am deploying a number of soil sensors around the farm, i might want to decide their placement in terms of both land use and soil type, which are defined in two different layers of my GIS. If those two attributes could be brought together auto-magically in the same view of sensor data… Is this the sort of thing that might be possible, with the deeper form of integration we’re talking about here?
My understanding in most GIS applications is that the data storage and the map layers can have different cardinality. For example there can be multiple layers using the same underlying data file/table (e.g. Shapefile, PostGIS table, etc).
In your case, wouldn’t there be one data source for both/all area layers? I would have assumed there would be one file/table holding the areas’ geometries/attributes. Multiple layers could reference that file/table with different filters/symbology/labeling-rules/etc. In that way there is one logical definition of each area, but potentially many ways to look at them.
There are multiple ways of organizing this sort of data - and supporting queries like that.
Ignoring the cardinality of the layers or underlying data storage for a moment, the land use and soil type could be attributes of the area as I believe you’re describing, or they could be in a separate data source. For example, my county provides shapefiles which show the geological survey soil types - several of these soil type geometries might intersect with a single area, so a spatial join is needed to find the soil type at any point in an area.
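As a rough illustration of that kind of spatial join, here is a small pure-Python sketch that looks up the attributes of every layer at a given point (for example a soil sensor's position). The layer names, polygons and function names are invented for the example, and the polygons are simple rings without holes. In practice QGIS or PostGIS would do this for you.

```python
Point = tuple[float, float]
Ring = list[Point]


def point_in_polygon(point: Point, ring: Ring) -> bool:
    """Ray-casting test: count how many edges a ray going right crosses."""
    x, y = point
    inside = False
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        if (y1 > y) != (y2 > y):  # this edge straddles the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def attributes_at(point: Point, layers: dict[str, list[tuple[str, Ring]]]) -> dict[str, str]:
    """Join a point against each layer and collect the matching attribute."""
    result = {}
    for layer_name, features in layers.items():
        for value, ring in features:
            if point_in_polygon(point, ring):
                result[layer_name] = value
                break
    return result


layers = {
    "land_use": [("pasture", [(0, 0), (20, 0), (20, 10), (0, 10)])],
    "soil_type": [
        ("loam", [(0, 0), (10, 0), (10, 10), (0, 10)]),
        ("clay", [(10, 0), (20, 0), (20, 10), (10, 10)]),
    ],
}
print(attributes_at((15, 5), layers))
```

The point here is that land use and soil type can live in separate data sources and still be brought together at query time, without either one being an attribute of the area record itself.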
Even attributes of the area could be stored in different ways. They could be part of the same file/table as the geometry or they could be referenced from another table - in QGIS this is accomplished via joins, or auxiliary storage.
I believe the ideas are actually sort of unrelated.
There is one narrow path where farmOS internally uses PostGIS and happens to cater well to the sort of data model and querying you’re describing, but I don’t think it would be what one would arrive at by minimally “integrating PostGIS and farmOS”.
Instead, I think you might be best served by an approach that lets you use farmOS as the “source of truth” (at least for the geometries for your areas) and allows you to work with those geometries in QGIS. Attributes like land-use and soil type could also be modeled in farmOS or could be associated - as I described above - using spatial joins, attributes joins, or auxiliary storage.
Very curious to understand these ideas more! Worth noting that farmOS 2.x is reworking the concept of “Areas” - making them into new types of Assets. They will still be able to be arranged hierarchically. But it will also be possible to “archive” them, so you can redraw areas from time to time if you want, while still saving the old ones, along with the records related to them. This introduces a new challenge (and new opportunity) for finding old records: instead of asking farmOS to “show me what happened in this Area”, we will want to ask “show me what happened in all areas/logs that have ever overlapped this geometry”.
Imagine a special map that lets you drop a point and see every record with a geometry that intersects it. This kind of query would be easy with PostGIS (as I understand it)! But it would also be possible to achieve the same thing in other databases (and/or with logic outside the db). So I like to think about these features as working in many possible contexts, with “progressive enhancement” (more efficient querying) available if you use PostGIS.
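Here is a rough Python sketch of what that “progressive enhancement” could look like in code. Everything in it is an assumption made for illustration: the `Record` class, the table and column names in the SQL, and the simplification of geometries to bounding boxes in the fallback path. Only the PostGIS functions named in the query (`ST_Intersects`, `ST_SetSRID`, `ST_MakePoint`) are real.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Record:
    """Hypothetical record whose geometry is reduced to a bounding box."""
    name: str
    bbox: tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)
    archived: bool = False


def postgis_sql(table: str = "hypothetical_geometry_table") -> str:
    """PostGIS path: let the database run the spatial test."""
    return (
        f"SELECT name FROM {table} "
        "WHERE ST_Intersects(geom, ST_SetSRID(ST_MakePoint(%s, %s), 4326))"
    )


def fallback_filter(point: tuple[float, float], records: list[Record]) -> list[str]:
    """Backend-agnostic path: test every record in application code."""
    x, y = point
    return [
        r.name for r in records
        if r.bbox[0] <= x <= r.bbox[2] and r.bbox[1] <= y <= r.bbox[3]
    ]


def find_records_at(
    point: tuple[float, float],
    records: list[Record],
    run_sql: Optional[Callable] = None,
) -> list[str]:
    """Use PostGIS when a query runner is supplied, else filter in Python."""
    if run_sql is not None:
        return run_sql(postgis_sql(), point)
    return fallback_filter(point, records)
```

Both paths answer the same question and include archived areas, so the feature works on any database. With PostGIS the work moves into the database, where a spatial index can avoid scanning every record.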
I don’t want to be mean or seem like I’m opposed to this idea. I actually do think there likely are features and performance improvements that could come from PostGIS integration.
That said, @jolau I think there’s something missing if you just list features of technology X when considering integrating it with technology Y.
To illustrate this, I could make a list like that about Git:
Thinking about integration [with farmOS] of source control systems and Merkle trees that Git already provides:
Has wide adoption
Distributed
Supports branching and merging
Provides data integrity through a chain of hashed commit information - Merkle tree
…
What’s missing is:
How - describe roughly how the technology and individual features are relevant to farmOS and what the vision is for how the integration would work
Why - describe what specific problem(s) you are hoping that manner of integration of the technologies would solve for your use-case
Ideally several folks also post similar descriptions about their use-cases and it turns out that there is a common integration strategy for the technology which makes many folks’ use-cases work better with farmOS.
It isn’t enough for the technology or idea of an integration between technologies to be cool - it should solve a concrete and important problem better than the alternatives. I guess I’m trying to advocate here for a level of rigor in showing the merit of the proposed integration in those terms.
Fair enough, @Symbioquine : it is incumbent on those of us who advocate for some particular line of development to express in reasonably clear terms the problems it will solve for users of that system… And this conversation is still a long way from that level of clarity!
Still, from my naive (non-technical) user perspective, there’s something i don’t yet understand about what “PostGIS Integration for farmOS” really means, and what that might imply for users of the software(s!) in my context -i.e. what software tool do we use for what job?
As i said earlier in this thread: i’m maintaining an instance of QGIS + PostGIS (along w/ some legacy data in Shapefiles) as our farm’s “System of Record” for all things geospatial -from which i have exported landplot data in .KML format, for import as Areas into farmOS. This was a minor PITA, but not too bad, as it was only a few dozen polygons that are unlikely to change in the next few years -well worth the effort, since it enables me to geolocate Assets and actions logged in farmOS within the appropriate landplot.
Still: that’s a very limited use of GIS data, when we consider the sort of use that PostGIS enables -i.e.:
Yes! That is indeed the sort of query that a PostGIS database enables -and if farmOS had affordances for such queries, i would be using the heck out of them, i am sure. What i’m not at all sure about is if any amount of geospatial capability in farmOS would ever persuade me to use farmOS as the canonical “Source of Truth” for my Area geometries, as @Symbioquine suggested i might want to do. QGIS is such a powerful tool for defining/ styling/ elaborating areas (i.e. map-making), i don’t see the farmOS project ever taking on such scope (nor should it!)
This sounds like the best of all possible worlds to me… But to be sure i’ve got the right take on this, i have to ask: If/when a farmOS 2.x instance is PostgreSQL at the back-end, with all geospatial data stored in PostGIS form, would it be possible for such data to be read directly by QGIS? And, looking at it the other way around: if i use QGIS for defining geospatial entities like landplots, would this farmOS 2.x instance be able to read it directly? Could it be possible to have PostGIS data entities shared by both farmOS and QGIS, with full CRUD access enabled on both sides? Of course both apps would have to respect the same data integrity constraints; i just wonder if this is even possible (if not, i need to impose some restrictions on my dream-space)
There’s a lot of things it could mean. Thanks for highlighting the ongoing ambiguity!
I think the strategy that we were batting around above - and that @mstenta was talking about when he mentioned “progressive enhancement” - was the idea that farmOS could support internally using PostGIS to store geometry and that could enable extra features or improve performance for features.
As I also described, this minimal strategy of integrating PostGIS and farmOS would probably allow you to directly query the database and have stuff show up in QGIS, but it likely wouldn’t be as easy as working with the PostGIS tables you already have and your maps might break (or allow edits that break farmOS) in the future if you update farmOS and it has changed its internal implementation details.
As an alternative to that approach, I proposed that most use-cases would be better served by farmOS acting as a geometry server (e.g. WFS). This is separate from whether farmOS internally uses PostGIS, but could allow working with the farmOS areas and asset geometries in a read/write capacity through QGIS.
I agree farmOS shouldn’t try to do everything QGIS does, nor even would I expect it to satisfy all farming-related mapping use-cases. That isn’t what I was referring to when I suggested that farmOS could be your “source of truth” for your area geometries.
What I meant is that, if farmOS acted as a geometry server, you could use farmOS to be the authoritative place your areas’ geometries are stored while still using them to power the area layer(s) of your (Q)GIS maps. Most of the geospatial responsibilities of QGIS still work regardless of whether the data is stored in a shapefile, PostGIS, or an arbitrary geometry server implementing standard protocols like WFS.
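To make the “geometry server” idea a little more concrete, here is a short Python sketch that builds a standard WFS `GetFeature` request URL, which is the kind of request a client such as QGIS sends to load a layer. The endpoint URL and the `areas` feature type name are placeholders I made up, not the actual interface of farmOS or of farm-os-area-feature-proxy.

```python
from urllib.parse import urlencode


def wfs_get_feature_url(base_url, type_name, version="2.0.0", max_features=None):
    """Build a WFS GetFeature URL from standard key-value parameters."""
    is_v2 = version.startswith("2.")
    params = {
        "service": "WFS",
        "version": version,
        "request": "GetFeature",
        # WFS 2.0 renamed "typeName" to "typeNames".
        "typeNames" if is_v2 else "typeName": type_name,
    }
    if max_features is not None:
        # WFS 2.0 renamed "maxFeatures" to "count".
        params["count" if is_v2 else "maxFeatures"] = max_features
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and feature type, for illustration only.
print(wfs_get_feature_url("https://farm.example.com/wfs", "areas", max_features=10))
```

The client only needs to know the protocol, not the farmOS tables behind it, so the storage underneath can change without breaking the map.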
Selected Q & A
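Q: “would it be possible for such data to be read directly by QGIS?”
A: Possible, but not likely to be easy or stable long term.
Q: “if i use QGIS for defining geospatial entities like landplots, would this farmOS 2.x instance be able to read it directly?”
A: Also possible, but not likely to be easy without significant restructuring of your existing tables to match what farmOS internally would expect.
Q: “Could it be possible to have PostGIS data entities shared by both farmOS and QGIS, with full CRUD access enabled on both sides?”
A: Again possible, but as I hope I’ve made clear there are actually multiple paths to having “full CRUD access enabled on both sides” and the one involving direct sharing of PostGIS tables probably isn’t the best.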
In the above gif, I show that areas already defined in farmOS can be displayed within QGIS maps. I then draw some new areas via QGIS and show that they got saved back into farmOS.
I’m not claiming that farm-os-area-feature-proxy is a final solution to these problems, but it may serve some narrow slice of these broader use-cases and it is a good demonstration of the power of a fully armed and operational farmOS server acting as a geometry server.
Cool! I had no idea that any sort of live connection between farmOS and QGIS was already possible; thanks for this, @Symbioquine!
Powerful demo indeed. Dunno if it will help w/ my own narrow use-case, since my Co-ordinate Reference System is EPSG:3763 (standard here in Portugal), while your solution specifies a different CRS. Is that a hard constraint, i wonder?
I was just trying to keep the development effort low to get the demo out there sooner and to unblock my own use-case - using farmOS as the source of truth for my record-keeping/mapping.
As far as I recall it would just take some extra work to make the proxy either use a configurable CRS or translate between CRSs as the case may be.