Early Draft Post: Thinking about Organizing Agricultural Data with farmOS

I started sketching out another post for the blog that I was hoping to get some early feedback on. The later parts are just an outline and I’m not totally sure where I’m going with it. All feedback welcome. Please feel free to tear it apart :slight_smile:

Thinking about Organizing Agricultural Data with farmOS

Any kind of record keeping should serve a purpose. Record data itself is useless unless it can provide some insight, shape future decisions, or satisfy external requirements.

In agriculture, some example drivers of record keeping are tracking the pedigree/lineage of animals/plants to help manage genetic diversity or breeding goals; tracking yields, inputs, and crop rotations to detect patterns in which practices are working; and gathering data required by regulatory bodies for permit or certification purposes.

Unfortunately, one can’t always predict what data will be needed for these things. Thus, a common strategy is to keep some sort of records even when the exact purposes to which they’ll be put aren’t yet known.

There are a lot of options when it comes to tools for agricultural record keeping. Among these options is farmOS. farmOS lands in a sweet spot on the gradient between free-form tools like paper notebooks or spreadsheets on one hand and purpose-built tools that target specific market/livestock/regulatory niches on the other.

farmOS lands in that sweet spot by providing a structure of record primitives while remaining largely unopinionated about how data is organized within that structure. On top of that, farmOS is highly open and extensible, which allows for more opinionated consistency rules to be added on a use-case by use-case basis. It is also possible to use the farmOS API to programmatically restructure large amounts of data as one’s record keeping requirements evolve.

Where to Start?

It is clearly critical to determine what granularity of data to capture. Even if one could easily track the movements and growth of every macroscopic organism in an area with technology, the data wouldn’t be very useful by itself. Yet more technology could probably be used to infer some types of meaning from all that data, but the complexity of such a system would be enormous and very costly from a sensing-hardware, storage, and computational standpoint.

Fortunately, by working backwards from some hypothetical goals for the data, we can arrive at a much more reasonable strategy. We don’t need to track every ant and native shrub to have useful records for a market garden or a goat farm. The market garden might want to keep track of what is planted where, when it was planted, and harvest dates/yields. The goat farm might want to keep track of age, sex, breeding events, and so on about the goats.

It is easy to get overwhelmed imagining all the things that one might want to keep track of. However, with farmOS it is easy to start small and - with a few reasonable assumptions - the patterns will scale to capturing more and more types of data as it is warranted.

Often it is helpful to look at one’s goals for the agricultural operation as a whole. If the goal is to raise animals, then perhaps tracking the individual animals or herds of animals is the core primitive to start with. Similarly if one’s goal is to grow vegetables perhaps the core primitive is plants or land areas (fields, or beds) of plants. In farmOS the first step would be to create assets for these core primitives.

Part of that process is determining how to connect between the physical reality and the records on the computer. Chances are there is already some sort of naming strategy and this tends to form the first layer of connection.

Asset Type and Granularity

  • When to use an existing asset type vs a custom one
  • Pros and cons of fine/coarse granularity assets, e.g. animals vs herds

Naming Assets

  • Ideas for designing naming patterns or adapting existing physical naming/tagging systems to farmOS.

Log Type, Purpose, and Frequency

  • When to use an existing log type vs a custom one
  • Tradeoffs with fine/coarse granularity logs, e.g. play-by-play activities/observations vs the minimum set of logs to capture only key changes

Naming Logs

  • Ideas for designing naming patterns or adapting existing task tracking/naming to log names

Next Steps

Allude to future posts about capturing data via quantities, inventory, sensors, and more

This is great @Symbioquine! I really like this description:

farmOS lands in that sweet spot by providing a structure of record primitives while remaining largely unopinionated about how data is organized within that structure. On top of that farmOS is highly open and extensible which allows for more opinionated consistency rules to be added on a use-case by use-case basis.

I also really like the structure you are sketching up for the asset type, log type, and naming stuff… these are some really good baseline things to explain and understand when you’re getting started, which we don’t really do a good job of covering elsewhere. This will be a very helpful blog post, I think.

Maybe worth mentioning strategies for taking quick notes in the field and then coming back to fill them in/add details afterwards? Or maybe that’s another post… :wink:

I think this is on target. farmOS provides a very open and flexible canvas for record keeping. As a new farmer, I did not have experience and a baseline for what information I needed to track. I made a few assumptions and dove in. I found myself going down a given path for structuring the data and then would run into problems later with some of my assignments and choices. I had to go back and restructure some of the data entries. Bit by bit I got to something that was working. As I move to Version 2, I will need to do further restructuring to get my data assignments cleaner and more consistent.

For new users, there needs to be a set of Use Case template level examples based on different farm operations, like “Organic broad acre row crops” or such. These would help get users off to a good start. These templates would also aid future development by helping to keep end use cases in focus. Did the new proposal make a given use case easier to manage or harder? Right now my goal heading into Ver 2 is to define the template for my operation answering the questions in your outline.

@graffte thanks for the kind words about my draft and the ideas! I know we’ve talked about having some curated example data in the past, but I like your idea of having a few separate ones for different kinds of farms/operations.

Just to tie this together, I’ve previously demonstrated how sample data can be captured in a farmOS module and installed. I guess the next step would be to make a number of stand-alone modules installable via composer/drush which represent the curated examples… @mstenta Do you think it might make sense for those to live in core (assuming they demonstrate/exercise core functionality)? That would be cool because then they could be leveraged as part of testing and demo instances more trivially too.

Bringing it back to this post, I think I’ll keep the scope smaller and just address the general ideas around strategies for naming and asset/log granularity. We can always have future posts which introduce a given set of sample data and walk through the rationale behind how the data is organized therein.

I’d probably start it in a separate module repo (eg: github.com/farmOS/farm_demo), perhaps with sub-modules for different use-cases. This would keep it isolated and allow additional dependencies to be pulled in if necessary (eg: https://www.drupal.org/project/default_content), without the considerations of maintaining that in core. If it proves to be well-maintained we can consider pulling it into core in the future.

I took another stab at this post with the idea that the first step is to walk folks through the high level ideas of figuring out what to record and the general thought processes around it.

A later post should probably talk about more specific details of type, naming, frequency of assets/logs/etc.

Hopefully, this strikes an okay balance of being beginner-oriented and not condescending. I know a lot of this is obvious, but I also don’t think it’s written down anywhere related to farmOS.

All feedback is welcome!


title: Organizing Farm Data with farmOS
date: 2023-08-03
author: Symbioquine
slug: 2023/organizing-farm-data

Organizing Farm Data with farmOS

Growing things is busy work that requires tracking many small (but critical) details and being responsive to unpredictable factors such as weather, pests, and market forces. A naive outsider may have strong notions about what data is required to keep things running smoothly and how it is recorded/stored. Similarly, farmers will have a pretty wide spread of actual and aspirational record keeping practices.

Here are some common mindsets when it comes to farm record keeping:

  1. “Obviously farmers should record everything. More data is better. Increasingly fine-grained data will lead to more optimal practices/production.”
  2. “Record keeping for my farm is easy. I just jot down free-form notes and organize/transcribe them as I go. The unimportant data gets lost naturally along the way.”
  3. “Record keeping for my farm is near impossible. I can’t predict what data I will need and trying to record or organize it is a waste of already scarce time.”

We can imagine all sorts of challenges and opportunities that might be implied by those mindsets but each of them holds an important lesson too.

Costs and Benefits

It may be obvious, but the first thing to consider is what you are trying to achieve and how that differs from where you are now.

It can be tempting to blindly pursue a comprehensive record system, record ad-hoc data without a plan, or even skip record keeping altogether. Chances are that none of those is a great fit for your farm. Instead, most farms probably need something in the middle of those extremes. You can’t record every minute detail since that would take too much time and have diminishing returns. Completely ad-hoc data entry probably will not serve any particular purpose well, and skipping record keeping entirely would mean giving up on a valuable mechanism to keep things running smoothly and improve operations over time.

What to do?

Note: The thought process outlined here is almost certainly not original, but is hopefully presented in a way that is tailored to be immediately useful to those entering data in farmOS.

Any time spent entering data has a cost, both in dollars - if paying someone - and in opportunity cost since that time could be spent on the primary work of actually taking care of your growing things. That means that we should have a clear understanding of the value proposition for each kind of data. Sometimes the work of entering a type of data will be an investment. Other times it will represent a regulatory requirement or a hedge against some possible negative outcome. In all of these cases, try to imagine the expected value for a given type of data as the probability of needing the data multiplied by the value of that data if it turns out to be useful. Those values might be expressed in dollars or time and will necessarily be best guesses, but they can provide a bit of a framework for determining if the opportunity cost of entering the data is worth it.

A simple example: if having a piece of data up-to-date and in a convenient place will save you 10 minutes per day but takes an hour a week to enter/update (about 70 minutes saved per week against 60 spent), it might only barely be worth it.

In a more complex example, if having historical seeding rates (perhaps in kg/acre) lets you avoid $100 in wasted seed in 50% of years, then the expected value of that data is $50/year, which needs to be weighed against the hour per year it takes to enter and use the data.
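The two examples above can be worked through as a small calculation. This is only a sketch of the arithmetic: the function name is made up for illustration, the daily example assumes the time is saved seven days a week, and the hourly rate used to put a price on data entry is a hypothetical figure you would replace with your own.

```python
def expected_net_value(probability_useful, value_if_useful, entry_cost=0.0):
    """Expected value of a type of data, minus the cost of entering it.

    All arguments must use the same unit (for example dollars or minutes).
    """
    return probability_useful * value_if_useful - entry_cost


# Daily-lookup example: 10 minutes saved per day (assumed 7 days a week)
# against 60 minutes of data entry per week.
weekly_minutes_saved = 10 * 7
weekly_net_minutes = expected_net_value(1.0, weekly_minutes_saved, entry_cost=60)

# Seeding-rate example: $100 of seed saved in 50% of years.
# hourly_rate is a hypothetical value for one hour of data entry.
hourly_rate = 20.0
seed_gross = expected_net_value(0.5, 100.0)
seed_net = expected_net_value(0.5, 100.0, entry_cost=1 * hourly_rate)

print(weekly_net_minutes)  # 10.0 minutes per week
print(seed_gross)          # 50.0 dollars per year
print(seed_net)            # 30.0 dollars per year
```

The inputs are best guesses, so the result is only useful for rough comparisons between candidate types of data.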

Working Backwards

Now that we have a framework for evaluating the opportunity represented by a type of data, we need to find some to evaluate.

Start by considering your existing operation and where your opportunities are. In other words, what do you want to change that seems like it could be improved with better data/analysis. Perhaps you already have good data for production and profits from a point-of-sale system, but have less visibility into the day-to-day operations that affect those production numbers. Alternatively, you might have great pre-production metrics from another tool or even a spreadsheet, but lack a good way to track day-to-day tasks assigned to workers or the movements/locations of animals/plants.

Whatever those opportunities are, write goals around them like this:

  • “I want to improve grazing rotation efficiency by being able to compare historical movements and animal weight gain.”
  • “I want to produce more consistent plant starts for sale by having data that will let me optimize the timing and quantity started year over year.”
  • “I want to reduce employee time lost looking for tools and materials by keeping track of where they are located.”

Next, take each goal statement and imagine what data it implies. For instance, the grazing rotation example might require knowing the timing of which (and how many) animals were in which pastures along with - at least - starting and ending weights for those animals. The plant starts example might require knowing the dates when new batches of starts were planted along with the quantity and sales records for the resulting starts.

It might also be important to consider the ways in which the data will need to be accessed to achieve the goal. Maybe the data gets used once a month as some sort of report or maybe it is accessed on a day-to-day basis. This can help inform the granularity of the data and the selection of data formats/tools.

Data Structure and Re-usability

This next part is arguably the hardest because it requires researching (and possibly experimenting with) the available tools and data model choices. It also involves weighing the trade-offs between coarser or finer grained data. Here we are discussing use-cases where farmOS is a good fit, but it is important to keep in mind that for other use-cases, accounting, GIS, or even spreadsheet/paper tools might be a better fit.

The farmOS data model provides for representing your growing things and related stuff using assets. It also provides logs for recording events. The basic ideas of how to use those primitives (and a few others) are described in the farmOS user guide. Armed with that information, we can start to decide what structure and granularity will help meet your goals.
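To make the asset/log split a little more concrete, here is a rough sketch of what an asset and a log that references it might look like as JSON:API documents, the format used by the farmOS 2.x API. The helper functions, names, and placeholder ID are hypothetical, and the exact fields are an assumption here: some asset and log types require additional attributes or relationships, so check the farmOS API documentation before sending anything like this to a real instance.

```python
import json
from datetime import datetime, timezone


def asset_payload(asset_type, name):
    """Build a JSON:API document for a new asset (hypothetical helper)."""
    return {
        "data": {
            "type": f"asset--{asset_type}",
            "attributes": {"name": name, "status": "active"},
        }
    }


def log_payload(log_type, name, when, asset_refs):
    """Build a JSON:API document for a log that references existing assets.

    asset_refs is a list of (asset_type, asset_id) tuples.
    """
    return {
        "data": {
            "type": f"log--{log_type}",
            "attributes": {
                "name": name,
                "timestamp": when.isoformat(),
                "status": "done",
            },
            "relationships": {
                "asset": {
                    "data": [
                        {"type": f"asset--{a_type}", "id": a_id}
                        for a_type, a_id in asset_refs
                    ]
                }
            },
        }
    }


# Placeholder ID for illustration only; a real ID is assigned by the server.
PLACEHOLDER_ID = "00000000-0000-0000-0000-000000000000"

herd = asset_payload("group", "Herd A")
weigh_in = log_payload(
    "observation",
    "Weigh Herd A",
    datetime(2023, 8, 3, 9, 0, tzinfo=timezone.utc),
    [("group", PLACEHOLDER_ID)],
)

print(json.dumps(herd, indent=2))
print(json.dumps(weigh_in, indent=2))
```

Nothing here talks to a server. The point is only that assets describe the things and logs describe events that point back at those things.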

In the grazing rotation example we need to decide whether to represent the herd/flock/etc. as individual animal assets or as a single group asset. Correspondingly, the weights might be recorded as an animal count, total weight, and average weight for the whole group or as individual weight logs for each animal. One thing that can help with that decision is determining whether there are other ways to re-use the data. For example, would having individual animal pasture locations and weight history be helpful in finding animals or making care decisions day-to-day?
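One way to think about that trade-off is that fine-grained records can always be rolled up into a group summary, while a group summary cannot be split back into individuals. A minimal sketch, using invented animal names and weights rather than anything from a real farmOS instance:

```python
from statistics import mean

# Hypothetical individual weight records: (animal name, weight in kg).
individual_weights = [
    ("Goat 101", 42.0),
    ("Goat 102", 38.5),
    ("Goat 103", 45.5),
]


def summarize_group(weights):
    """Roll individual weights up into the count/total/average a group record would hold."""
    values = [weight for _, weight in weights]
    return {
        "count": len(values),
        "total_kg": sum(values),
        "average_kg": mean(values),
    }


group_summary = summarize_group(individual_weights)
print(group_summary)
# {'count': 3, 'total_kg': 126.0, 'average_kg': 42.0}
```

If the group summary is all the goal needs, recording only the three summary numbers is much less work. If individual histories might be re-used later, the extra entry time buys that option.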

farmOS is very unopinionated about how one uses its data model. This can be helpful but also means we need to exercise some creativity when choosing the best fit for any given use-case. In the employee tool and material finding example above, it might be tempting to track the locations of each tool or material when employees move them around, but it might not be realistic to expect employees to update those locations reliably in real-time. Instead, you could choose to just model the normal storage location for the tool or material assets. That way employees know where to look first and where to put the materials back when they’re done with them. By also referencing the tools and materials on logs for tasks assigned to those employees, other employees could look up who to ask if the tool or material isn’t where it was expected to be.
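The tool-finding approach can be sketched in a few lines. The classes below are simplified stand-ins invented for this illustration, not farmOS data structures: each tool has only a normal storage location, and task logs list the tools they involve.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ToolAsset:
    name: str
    home_location: str  # where the tool normally lives, not where it is right now


@dataclass
class TaskLog:
    name: str
    day: date
    assigned_to: str
    tools: list = field(default_factory=list)  # names of tools used for the task


def who_to_ask(tool_name, logs):
    """Return the person on the most recent task log that references the tool."""
    matching = [log for log in logs if tool_name in log.tools]
    if not matching:
        return None
    return max(matching, key=lambda log: log.day).assigned_to


wheel_hoe = ToolAsset("Wheel hoe", "Tool shed, north wall")
logs = [
    TaskLog("Weed bed 3", date(2023, 7, 30), "Alex", ["Wheel hoe"]),
    TaskLog("Weed bed 7", date(2023, 8, 2), "Sam", ["Wheel hoe", "Hand trowel"]),
]

print(wheel_hoe.home_location)        # first place to look
print(who_to_ask("Wheel hoe", logs))  # Sam
print(who_to_ask("Broadfork", logs))  # None
```

Nobody has to record each movement of the tool. The only data entry is the one-time home location plus the task logs that were being written anyway.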

Developing Conventions

The patterns of what data gets entered and how it is recorded/used are called conventions. It is expected that farmOS will eventually have tools that can work directly with conventions, but for now conventions are a way to capture those patterns and possibly refine them over time. The audience for your convention(s) is yourself and others who will use farmOS with you.

Basically, a convention is just a written description of the purpose of the data being recorded and how to record it. This can be a plain English (or language of your choice) set of instructions for what data to enter, in sufficient detail that someone who is generally familiar with farmOS, but not with your use-case, can enter or validate the data.
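A written convention can stay as prose, but it can also help to jot down its key points in a structured form so records are easy to check against it. The layout below is purely an illustration made up for this post. It is not a farmOS feature or an official convention format, and the field names are assumptions.

```python
# Hypothetical convention for recording pasture moves.
GRAZING_MOVE_CONVENTION = {
    "purpose": "Compare historical pasture movements with animal weight gain.",
    "log_type": "activity",
    "required_fields": ["name", "timestamp", "asset", "location"],
    "name_pattern": "Move {group} to {pasture}",
}


def missing_fields(record, convention):
    """List the required fields that are absent or empty in a record."""
    return [
        field_name
        for field_name in convention["required_fields"]
        if not record.get(field_name)
    ]


# A record someone entered in a hurry, without a destination pasture.
record = {
    "name": "Move Herd A to North Pasture",
    "timestamp": "2023-08-03",
    "asset": ["Herd A"],
}

print(missing_fields(record, GRAZING_MOVE_CONVENTION))  # ['location']
```

Even if no code ever runs against it, writing the purpose and required fields down in one place gives collaborators something concrete to follow.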

Writing conventions - especially more detailed/formal ones - is optional, but is a good way to capture the work you have done thinking through the what, how, and why of the data you plan to record. It also could be a good way to share your plan with collaborators or employees.

Next Steps

Hopefully this has been a helpful overview of one strategy for choosing what data to record in farmOS. Ideally, it will inspire you to choose something strategic to improve through better record keeping - and start soon.

Remember not to get too wrapped up in making it perfect initially. It can help to set up a rhythm for improvements to your record keeping - possibly yearly/seasonal - so you can feel free to capture data in a way that feels good now and improve on it in the next cycle once you know more or have seen a return on your investment in the data already entered.

Also remember that you’re not alone. It is natural to have questions and struggle to fit the messy real world into the orderly software one. Feel free to ask for help and share your victories on the farmOS forum or chat. Chances are, you will inspire others to think differently about their own data or even lead to improvements in farmOS itself.

This is GREAT @Symbioquine! I say open a PR!

Done: https://github.com/farmOS/farmOS-community-blog/pull/20
