WO2024097621A1 - Machine learning approach to detecting and tracing aggregated ecosystem attributes - Google Patents



Publication number
WO2024097621A1
Authority
WO
WIPO (PCT)
Prior art keywords
crop
ecosystem
machine
geographic boundary
learned model
Application number
PCT/US2023/078118
Other languages
French (fr)
Inventor
Abigail SHERBURNE
Ryan Jones
Francis Kelly
Alice S. CHANG
Nathan BASCH
Joseph WEEKS
Jyoti SHANKAR
Misha Sidorsky
Peyton COLES
Original Assignee
Indigo Ag, Inc.
Application filed by Indigo Ag, Inc. filed Critical Indigo Ag, Inc.
Publication of WO2024097621A1


Abstract

An application identifies a geographic boundary associated with a crop. The application retrieves remote sensing data corresponding to the geographic boundary and a time, and applies a machine-learned model to the remote sensing data, the machine-learned model configured to produce ecosystem data comprising a measure of one or more ecosystem attributes associated with the geographic boundary and the time. The application modifies a crop database to include the ecosystem data in association with the crop.

Description

MACHINE LEARNING APPROACH TO DETECTING AND TRACING AGGREGATED
ECOSYSTEM ATTRIBUTES
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional Application No. 63/381,756, filed October 31, 2022, which is incorporated by reference in its entirety.
[0002] The following matters are hereby incorporated by reference herein in their entireties, and are together referred to as “Remote Sensing Applications”:
Title: “Imagery-Based Boundary Identification for Agricultural Fields”, U.S. Patent Application No. 17/681,126, filed February 25, 2022;
Title: “Remote Sensing Algorithms for Mapping Regenerative Agriculture”, U.S. National Phase Patent Application No. 18/156,814, ‘371 filing date January 19, 2023;
Title: “Field-Scale Harvest Date Estimation Based on Satellite Imagery”, WO Application No. PCT/US2023/029167, filed August 1, 2023;
Title: “Crop Yield Forecasting Models”, U.S. National Phase Patent Application No. 17/625,287, ‘371 filing date January 6, 2022;
Title: “Modeling Field Irrigation With Remote Sensing Imagery”, U.S. Patent Application No. 17/704,773, filed March 25, 2022, issued on September 12, 2023 as U.S. Patent No. 11,755,966; and
Title: “Uncertainty Prediction Models”, WO Application No. PCT/US2023/032930, filed September 15, 2023.
TECHNICAL FIELD
[0003] The disclosure generally relates to machine learning, and more particularly, to modeling data using machine learning to identify ecosystem attributes and attribute them to a source of crops.
BACKGROUND
[0004] Identifying ecosystem attributes for a product is extremely complex. For example, to determine the amount of carbon emissions that went into a grain product, field level data for each grain field that contributed to the product may be gathered and tracked from a growth cycle through aggregation and production. The complexity in tracking at this level of granularity results in inefficient and unscalable systems that require a large amount of computational resources for each component of each product that is tracked.
SUMMARY
[0005] According to embodiments of the present disclosure, methods and computer program products for tracing sustainable ecosystem attributes are provided. In an embodiment, an application identifies a geographic boundary associated with a crop. The application retrieves remote sensing data corresponding to the geographic boundary and a time, and applies a machine-learned model to the remote sensing data, the machine-learned model configured to produce ecosystem data comprising a measure of one or more ecosystem attributes associated with the geographic boundary and the time. The application modifies a crop database to include the ecosystem data in association with the crop. Machine learning approaches may be employed to identify the geographic boundary, the crop, the time, and any other data used to determine measures of ecosystem attributes. By relying on a machine learning approach to predict these features, computational complexity is reduced, efficiencies are increased, and scalability can be achieved. Moreover, the outputs of these machine learning approaches may act as input to downstream processing for maintaining accurate determinations of ecosystem attributes at each aggregation of product.
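For purposes of illustration only (and not as part of the claimed subject matter), the data flow summarized above can be sketched in Python. All names below (EcosystemData, CropDB, trace, and the toy model) are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class EcosystemData:
    attribute: str    # e.g., "carbon_intensity"
    value: float      # measure produced by the machine-learned model
    boundary_id: str  # geographic boundary associated with the crop
    time: str         # time the remote sensing data corresponds to

@dataclass
class CropDB:
    records: dict = field(default_factory=dict)

    def attach(self, crop_id: str, eco: EcosystemData) -> None:
        # Modify the crop database to include ecosystem data for the crop.
        self.records.setdefault(crop_id, []).append(eco)

def trace(crop_id, boundary_id, time, sensing_data, model, db):
    # Apply a machine-learned model to the retrieved remote sensing data
    # to produce a measure of an ecosystem attribute for boundary + time.
    value = model(sensing_data)
    db.attach(crop_id, EcosystemData("carbon_intensity", value, boundary_id, time))

db = CropDB()
trace("corn-lot-1", "field-42", "2023-06", [0.1, 0.3], lambda x: sum(x), db)
print(db.records["corn-lot-1"][0].value)
```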
BRIEF DESCRIPTION OF THE DRAWINGS
[0006] FIG. 1 illustrates one embodiment of a system environment for implementing an ecosystem attribute tracking application, in accordance with an embodiment.
[0007] FIG. 2 illustrates one embodiment of exemplary modules and databases used by the ecosystem attribute tracking application, in accordance with an embodiment.
[0008] FIG. 3 shows one embodiment of an exemplary user interface showing a map and ecosystem attributes, in accordance with an embodiment.
[0009] FIG. 4 shows one embodiment of an exemplary user interface showing blending from a plurality of sourcing regions, in accordance with an embodiment.
[0010] FIG. 5 illustrates an exemplary flowchart showing a process for implementing an ecosystem attribute tracking application, in accordance with an embodiment.
[0011] The figures depict various embodiments of the present invention for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention described herein.
DETAILED DESCRIPTION
[0012] Tracking ecosystem attributes for products that integrate crops from multiple producers is a complex task, particularly in scenarios where some producers opt for environmentally friendly growing practices and some do not. In such circumstances, there is currently no way to identify the source of a given blended agricultural product. An entity sourcing agricultural product is therefore unable to determine the environmental attributes of products, and there is no facility to track, update, and verify environmental attributes over time. Systems and methods are disclosed herein for tracking of agricultural products from the field level and at different points of aggregation, which can integrate data from a plurality of resources such as growers, distribution centers, sourcing entities, and remote sensing to trace ecosystem attributes throughout the supply chain, for example through one or more statistical or machine learning models that probabilistically predict the distribution of crops.
[0013] The figures and the following description relate to certain embodiments by way of illustration only. It should be noted that from the following discussion, alternative embodiments of the structures and methods disclosed herein will be readily recognized as viable alternatives that may be employed without departing from the principles of what is claimed.
[0014] Reference will now be made in detail to several embodiments, examples of which are illustrated in the accompanying figures. It is noted that wherever practicable similar or like reference numbers may be used in the figures and may indicate similar or like functionality. The figures depict embodiments of the disclosed system (or method) for purposes of illustration only. One skilled in the art will readily recognize from the following description that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles described herein.
[0015] As used herein, “ecosystem benefit” is used equivalently with “ecosystem attribute” or “environmental attribute”; each refers to an environmental characteristic (for example, as a result of agricultural production) that may be quantified. Examples of ecosystem benefits include without limitation reduced water use, reduced nitrogen use, increased soil carbon sequestration, greenhouse gas emission avoidance, etc. An example of a mandatory program requiring accounting of ecosystem attributes is California’s Low Carbon Fuel Standard (LCFS). Field-based agricultural management practices can be a means for reducing the carbon intensity of biofuels (e.g., biodiesel from soybeans).
[0016] An “ecosystem impact” is a change in an ecosystem attribute relative to a baseline. In various embodiments, baselines may reflect a set of regional standard practices or production (a comparative baseline), prior production practices and outcomes for a field or farming operation (a temporal baseline), or a counterfactual alternative scenario (a counterfactual baseline). For example, a temporal baseline for determination of an ecosystem impact may be the difference between a corn crop production period and the corn crop production period of the prior year. In some embodiments, an ecosystem impact can be generated from the difference between an ecosystem attribute for the latest crop production period and a baseline ecosystem attribute averaged over a number (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10) of prior production periods.
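As a minimal numeric sketch of the temporal-baseline computation just described (illustrative only; the function name and units are hypothetical):

```python
def ecosystem_impact(latest: float, prior_periods: list[float]) -> float:
    """Ecosystem impact under a temporal baseline: the latest period's
    attribute minus that attribute averaged over prior production periods."""
    baseline = sum(prior_periods) / len(prior_periods)
    return latest - baseline

# e.g., soil carbon sequestration (tCO2e/acre) vs. a 3-year baseline
print(round(ecosystem_impact(1.8, [1.0, 1.2, 1.1]), 2))  # 0.7
```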
[0017] More generally, a counterfactual scenario refers to what could have happened within the crop growing season in an area of land given alternative practices. In various embodiments, a counterfactual scenario is based on an approximation of sourcing region practices.
[0018] A “sustainability claim” is a set of one or more ecosystem benefits associated with an agricultural product (for example, including ecosystem benefits associated with production of an agricultural product). For example, a production entity may source raw agricultural products from producers reducing irrigation, in order to make a sustainability claim of supporting the reduction of water demand on the final processed agricultural product.
[0019] Emissions of greenhouse gases are often categorized as Scope 1, Scope 2, or Scope 3. Scope 1 emissions are direct greenhouse gas emissions that occur from sources that are controlled or owned by an organization. Scope 2 emissions are indirect greenhouse gas emissions associated with purchase of electricity, steam, heating, or cooling. Scope 3 emissions are the result of activities from assets not owned or controlled by the reporting organization, but that the organization indirectly impacts in its value chain. Scope 3 emissions represent all emissions associated with an organization’s value chain that are not included in that organization’s Scope 1 or Scope 2 emissions. Scope 3 emissions include activities upstream of the reporting organization or downstream of the reporting organization. Upstream activities include, for example, goods and services (e.g., agricultural production such as wheat, soybeans, or corn may be purchased inputs for production of animal feed), capital goods, upstream fuel and energy, upstream transportation and distribution (e.g., transportation of raw agricultural products such as grain from the field to a grain elevator), waste generated in upstream operations, business travel, employee commuting, or leased assets. Downstream activities include, for example, transportation and distribution other than with the vehicles of the reporting organization, processing of goods, use of goods, end of life treatment of goods, and so on.
[0020] As used herein, a “crop-growing season” may refer to a fundamental unit of grouping crop events by non-overlapping periods of time. In various embodiments, harvest events are used where possible.
[0021] A “product” is any item of agricultural production, including crops and other agricultural products, in their raw, as-produced state (e.g., wheat grains) or as processed (e.g., oils, flours, polymers, consumer goods (e.g., crackers, cakes, plant-based meats, animal-based meats (for example, beef from cattle fed a product such as corn grown from a particular field)), bioplastic containers, etc.). In addition to harvested physical products, a product may also include a benefit or service provided via use of the associated land, for example for recreational purposes (such as a golf course) or as pasture land for grazing wild or domesticated animals (where domesticated animals may be raised for food or recreation).
[0022] As used herein, “quality” or a “quality metric” may refer to any aspect of an agricultural product that adds value. In some embodiments, quality is a physical or chemical attribute of the crop product. For example, a quality may include, for a crop product type, one or more of: a variety; a genetic trait or lack thereof; genetic modification or lack thereof; genomic edit or lack thereof; epigenetic signature or lack thereof; moisture content; protein content; carbohydrate content; ash content; fiber content; fiber quality; fat content; oil content; color; whiteness; weight; transparency; hardness; percent chalky grains; proportion of corneous endosperm; presence of foreign matter; number or percentage of broken kernels; number or percentage of kernels with stress cracks; falling number; farinograph; adsorption of water; milling degree; immature grains; kernel size distribution; average grain length; average grain breadth; kernel volume; density; L/B ratio; wet gluten; sodium dodecyl sulfate sedimentation; toxin levels (for example, mycotoxin levels, including vomitoxin, fumonisin, ochratoxin, or aflatoxin levels); and damage levels (for example, mold, insect, heat, cold, frost, or other material damage).
[0023] In some embodiments, quality is an attribute of a production method or environment. For example, quality may include, for a crop product, one or more of: soil type; soil chemistry; climate; weather; magnitude or frequency of weather events; soil or air temperature; soil or air moisture; degree days; rain fed; irrigated or not; type of irrigation; tillage frequency; cover crop (present or historical); fallow seasons (present or historical); crop rotation; organic; shade grown; greenhouse; levels and types of fertilizer use; levels and types of chemical use; levels and types of herbicide use; pesticide-free; levels and types of pesticide use; no-till; use of organic manure and byproducts; minority produced; fair-wage; geography of production (e.g., country of origin, American Viticultural Area, mountain grown); pollution-free production; reduced pollution production; levels and types of greenhouse gas production; carbon neutral production; levels and duration of soil carbon sequestration; and others. In some embodiments, quality is affected by, or may be inferred from, the timing of one or more production practices. For example, food grade quality for crop products may be inferred from the variety of plant, damage levels, and one or more production practices used to grow the crop. In another example, one or more qualities may be inferred from the maturity or growth stage of an agricultural product such as a plant or animal. In some embodiments, a crop product is an agricultural product.
[0024] In some embodiments, quality is an attribute of a method of storing an agricultural good (e.g., the type of storage: bin, bag, pile, in-field, box, tank, or other containerization), the environmental conditions (e.g., temperature, light, moisture, relative humidity, presence of pests, CO2 levels) during storage of the crop product, method of preserving the crop product (e.g., freezing, drying, chemically treating), or a function of the length of time of storage. In some embodiments, quality may be calculated, derived, inferred, or subjectively classified based on one or more measured or observed physical or chemical attributes of a crop product, its production, or its storage method. In some embodiments, a quality metric is a grading or certification by an organization or agency. For example, grading by the USDA, organic certification, or non-GMO certification may be associated with a crop product. In some embodiments, a quality metric is inferred from one or more measurements made of plants during the growing season. For example, wheat grain protein content may be inferred from measurement of crop canopies using hyperspectral sensors and/or NIR or visible spectroscopy of whole wheat grains. In some embodiments, one or more quality metrics are collected, measured, or observed during harvest. For example, dry matter content of corn may be measured using near-infrared spectroscopy on a combine. In some embodiments, the observed or measured value of a quality metric is compared to a reference value for the metric. In some embodiments, a reference value for a metric (for example, a quality metric or a quantity metric) is an industry standard or grade value for a quality metric of a particular agricultural good (for example, U.S. No. 3 Yellow Corn, Flint), optionally as measured in a particular tissue (for example, grain) and optionally at a particular stage of development (for example, silking).
In some embodiments, a reference value is determined based on a supplier’s historical production record or the historical production record of present and/or prior marketplace participants.
[0025] A “field” is the area where agricultural production practices are being used (for example, to produce an agricultural product). As used herein, a “field boundary” may refer to a geospatial boundary of an individual field.
[0026] In various embodiments, a field is a unique object that has temporal and spatial dimensions. In various embodiments, the field is enrolled in one or more programs, where each program corresponds to a methodology. As used herein a “methodology” (equivalently “program eligibility requirements” or “program requirements”) is a set of requirements associated with a program, and may include, for example: eligibility requirements for the program (for example, eligible regions, permitted practices, eligible participants (for example, size of farms, types of product permitted, types of production facilities permitted, etc.), and/or environmental effects of activities of program participants); reporting or oversight requirements; and required characteristics of technologies (including modeling technologies, statistical methods, etc.) permitted to be used for prediction, quantification, or verification of results by program participants. Examples of methodologies include protocols administered by Climate Action Reserve (CAR) (climateactionreserve.org), such as the Soil Enrichment Protocol; methodologies administered by Verra (verra.org), such as the Methodology for Improved Agricultural Land Management; farming sustainability certifications; life cycle assessment; and other similar programs. In various embodiments, the field data object includes field metadata. “One or more methodologies” refers to a data structure comprising program eligibility requirements for a plurality of programs. More briefly, a methodology may be a set of rules set by a registry or other third party, while a program implements the rules set in the methodology.
[0027] In various embodiments, the field metadata includes a field identifier that identifies a farm (e.g., a business) and a farmer who manages the farm (e.g., a user). In various embodiments, the field metadata includes field boundaries that are a collection of one or more polygons describing geospatial boundaries of the field. In some embodiments, polygons representing fields or regions within fields (e.g., management event boundaries, etc.) may be detected from remote sensing data using computer vision methods (for example, edge detection, image segmentation, and combinations thereof) or machine learning algorithms (for example, maximum likelihood classification, random tree classification, support vector machine classification, ensemble learning algorithms, convolutional neural networks, etc.).
[0028] In various embodiments, the field metadata includes farming practices that are a set of farming practices on the field. In various embodiments, farming practices are a collection of practices across multiple years. For example, farming practices include crop types, tillage method, fertilizers and other inputs, etc. as well as temporal information related to each practice which is used to establish crop growing seasons and ultimately to attribute outcomes to practices. In various embodiments, the field metadata includes outcomes. In various embodiments, the outcomes include at least an effect size of the farming practices and an uncertainty of the outcome. In various embodiments, an outcome is a recorded result of a practice, notably: harvest yields, sequestration of greenhouse gases, and/or reduction of emissions of one or more greenhouse gases.
[0029] In various embodiments, the field metadata includes agronomic information, such as soil type, climate type, etc. In various embodiments, the field metadata includes evidence of practices and outcomes provided by the grower or other sources. For example, a scale ticket from a grain elevator, an invoice for cover crop seed from a distributor, farm machine data, remote sensing data, a time stamped image or recording, etc. In various embodiments, the field metadata includes product tracing information such as storage locations, intermediaries, final buyer, and tracking identifiers.
[0030] In various embodiments, the field object is populated by data entry from the growers directly. In various embodiments, the field object is populated using data from remote sensing (satellite, sensors, drones, etc.). In various embodiments, the field object is populated using data from agronomic data platforms such as John Deere and Granular, and/or data supplied by agronomists, and/or data generated by remote sensors (such as aerial imagery, satellite derived data, farm machine data, soil sensors, etc.). In various embodiments, at least some of the field metadata within the field object is hypothetical for simulating and estimating the potential effect of applying one or more practices (or changing one or more practices) to help growers make decisions as to which practices to implement for optimal economic benefit.
[0031] In various embodiments, the system may access one or more models capable of processing the field object, process the field object based on the one or more models, and return an output based on the metadata contained within the field object. In various embodiments, a collection of models can be applied to a field object to estimate, simulate, and/or quantify the outcome (e.g., the effect on the environment) of the practices implemented on a given field. In various embodiments, the models may include process-based biogeochemical models. In various embodiments, the models may include machine learning models. In various embodiments, the models may include rule-based models. In various embodiments, the models may include a combination of models (e.g., ensemble models). As used herein, “ecosystem attribute quantification methods” comprise one or more of: empirical models, process-based models, machine learning models, biogeochemical models, ecosystem service models, models based on remotely sensed data, life-cycle assessment and inventory models, ensemble models, food web models, population models, direct measurement and statistical sample designs, crop growth models, or combinations thereof.
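For purposes of illustration only, combining the outputs of such a collection of models (for example, a process-based estimate and a machine-learning estimate) could take the form of a weighted average, one possible ensemble choice. The field object contents, model functions, and conversion factors below are hypothetical, not part of any claimed method:

```python
def quantify(field_obj, models, weights=None):
    """Apply a collection of models to a field object and combine their
    outcome estimates with a weighted average (one possible ensemble)."""
    preds = [m(field_obj) for m in models]
    weights = weights or [1.0 / len(models)] * len(models)
    return sum(w * p for w, p in zip(weights, preds))

# Hypothetical field metadata and toy stand-in models
field_obj = {"soc_delta": 0.5, "n2o_reduction": 0.1}
process_based = lambda f: f["soc_delta"] * 3.67
ml_model = lambda f: f["soc_delta"] * 3.5 + f["n2o_reduction"]
print(quantify(field_obj, [process_based, ml_model]))
```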
[0032] As used herein, a “management event” may refer to a grouping of data about one or more farming practices (such as tillage, harvest, etc.) that occur within a field boundary or an enrolled field boundary. A “management event” contains information about the time when the event occurred, and has a geospatial boundary defining where within the field boundary the agronomic data about the event applies. Management events are used for modeling and credit quantification, and are designed to facilitate grower data entry and assessment of data requirements. Each management event may have a defined management event boundary that can be all or part of the field area defined by the field boundary. A “management event boundary” (equivalently a “farming practice boundary”) is the geospatial boundary of an area over which a farming practice action is taken or avoided. In some embodiments, if a farming practice action is an action taken or avoided at a single point, the management event boundary is a point location. As used herein, a farming practice and an agronomic practice are of equivalent meaning.
[0033] As used herein, a “management zone” may refer to an area within an individual field boundary defined by the combination of management event boundaries that describe the presence or absence of management events at any particular time or time window, as well as attributes of the management events (if any event occurred). A management zone may be a contiguous region or a non-contiguous region. A “management zone boundary” may refer to a geospatial boundary of a management zone. In some embodiments, a management zone is an area coextensive with a spatially and temporally unique set of one or more farming practices. In some embodiments, an initial management zone includes historic management events from one or more prior cultivation cycles (for example, at least 2, at least 3, at least 4, at least 5, or a number of prior cultivation cycles required by a methodology). In some embodiments, a management zone generated for the year following the year for which an initial management zone was created will be a combination of the initial management zone and one or more management event boundaries of the next year. A management zone can be a data-rich geospatial object created for each field using an algorithm that crawls through management events (e.g., all management events) and groups the management events into discrete zonal areas based on features associated with the management event(s) and/or features associated with the portion of the field in which the management event(s) occur. The creation of management zones enables the prorating of credit quantification for the area within the field boundary based on the geospatial boundaries of management events.
[0034] In some embodiments, a management zone is created by sequentially intersecting a geospatial boundary defining a region wherein management zones are being determined (for example, a field boundary) with each geospatial management event boundary occurring within that region at any particular time or time window, wherein each of the sequential intersection operations creates two branches: one with the intersection of the geometries and one with the difference. The new branches are then processed with the next management event boundary in the sequence, bifurcating whenever there is an area of intersection and an area of difference. This process is repeated for all management event boundaries that occurred in the geospatial boundary defining the region. The final set of leaf nodes in this branching process defines the geospatial extent of the set of management zones within the region, wherein each management zone is non-overlapping and each individual management zone contains a unique set of management events relative to any other management zone defined by this process.
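The branching procedure of paragraph [0034] can be sketched with set operations standing in for polygon geometry (a real implementation would use geospatial intersection and difference operations from a GIS library); all names below are hypothetical and for illustration only:

```python
def management_zones(region, event_boundaries):
    """Sequentially intersect a region with each management event boundary,
    branching into (intersection, difference) at every step. The leaves are
    non-overlapping zones, each with a unique set of management events."""
    leaves = [(frozenset(region), frozenset())]
    for i, event in enumerate(event_boundaries):
        ev = frozenset(event)
        nxt = []
        for area, events in leaves:
            inter, diff = area & ev, area - ev  # the two branches
            if inter:
                nxt.append((inter, events | {i}))
            if diff:
                nxt.append((diff, events))
        leaves = nxt
    return leaves

region = {1, 2, 3, 4}     # field area as grid cells (toy geometry)
tillage = {1, 2}          # event 0: tillage boundary
cover_crop = {2, 3}       # event 1: cover crop boundary
for area, events in management_zones(region, [tillage, cover_crop]):
    print(sorted(area), sorted(events))
```

With these toy boundaries the procedure yields four zones (cells {2} with both events, {1} with tillage only, {3} with cover crop only, {4} with neither), matching the non-overlap and uniqueness properties stated above.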
[0035] As used herein, a “zone-cycle” may refer to a single cultivation cycle on a single management zone within a single field, considered collectively as a pair that define a foundational unit (e.g., also referred to as an “atomic unit”) of quantification for a given field in a given reporting period.
[0036] As used herein, a “baseline simulation” may refer to a point-level simulation of constructed baselines for the duration of the reported project period, using initial soil sampling at that point (following SEP requirements for soil sampling and model initialization) and management zone-level grower data (that meets SEP data requirements).
[0037] As used herein, a “with-project simulation” may refer to a point-level simulation of adopted practice changes at the management zone level that meet SEP requirements for credit quantification.
[0038] As used herein, a “field-level project start date” may refer to the start of the earliest cultivation cycle where a practice change was detected and attested by a grower.
[0039] As used herein, a “cultivation cycle” (equivalently a “crop production period” or “production period”) may refer to the period between the first day after harvest or cutting of a prior crop on a field or the first day after the last grazing on a field, and the last day of harvest or cutting of the subsequent crop on a field or the last day of last grazing on a field. For example, a cultivation cycle may be: a period starting with the planting date of the current crop and ending with the harvest of the current crop, a period starting with the date of the last field prep event in the previous year and ending with the harvest of the current crop, a period starting with the last day of crop growth in the previous year and ending with the harvest or mowing of the current crop, a period starting the first day after the harvest in the prior year and ending with the last day of harvest of the current crop, etc. In some embodiments, cultivation cycles are approximately 365-day periods from the field-level project start date that contain completed crop growing seasons (planting to harvest/mowing, or growth start to growth stop). In some embodiments, cultivation cycles extend beyond a single 365-day period and are divided into one or more cultivation cycles of approximately 365 days, optionally where each division of time includes one planting event and one harvest or mowing event.
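The division of a long cultivation cycle into approximately 365-day periods described above might be sketched as follows (illustrative only; the function name is hypothetical, and real logic would also align divisions with planting and harvest events):

```python
from datetime import date, timedelta

def split_cultivation_cycle(start: date, end: date, days: int = 365):
    """Divide a cultivation cycle spanning more than ~365 days into
    sub-cycles of approximately 365 days each."""
    cycles, cur = [], start
    while cur < end:
        nxt = min(cur + timedelta(days=days), end)
        cycles.append((cur, nxt))
        cur = nxt
    return cycles

# A roughly two-year cycle splits into two ~365-day sub-cycles
for cycle in split_cultivation_cycle(date(2021, 4, 1), date(2023, 3, 15)):
    print(cycle)
```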
[0040] As used herein, “historic cultivation cycles” are defined in the same way as cultivation cycles, but for the period of time in the required historic baseline period.
[0041] As used herein, “historic segments” may refer to individual historic cultivation cycles, separated from each other in order to construct baseline simulations.
[0042] As used herein, “historic crop practices” may refer to crop events occurring within historic cultivation cycles.
[0043] As used herein, a “baseline thread” (one of a set of “parallel baseline threads”) is a repeating cycle of the required historic baseline period that begins at the management zone level project start date. The number of baseline threads equals the number of unique historic segments (e.g., one baseline thread per each year of the required historic baseline period). Each baseline thread begins with a unique historic segment and runs in parallel to all other baseline threads to generate baseline simulations for a with-project cultivation cycle.
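The parallel-thread construction can be illustrated as a rotation over historic segments (a sketch only; real threads would carry full cultivation-cycle data rather than year labels, and the names below are hypothetical):

```python
def baseline_threads(historic_segments, n_project_cycles):
    """Build one baseline thread per historic segment. Each thread repeats
    the historic baseline period, starting from a different segment, and
    supplies one baseline entry per with-project cultivation cycle."""
    k = len(historic_segments)
    return [
        [historic_segments[(start + i) % k] for i in range(n_project_cycles)]
        for start in range(k)
    ]

# Three historic segments -> three parallel threads over four project cycles
for thread in baseline_threads(["2019", "2020", "2021"], 4):
    print(thread)
```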
[0044] As used herein, an “overlap in practices” may refer to unrealistic agronomic combinations that arise at the start of baseline threads, when dates of agronomic events in the concluding cultivation cycle overlap with dates of agronomic events in the historic segment that is starting the baseline thread. In this case, logic based on planting dates and harvest dates makes adjustments according to the type of overlap that is occurring.
[0045] An “indication of a geographic region” is a latitude and longitude, an address or parcel id, a geopolitical region (for example, a city, county, or state), a region of similar environment (e.g., a similar soil type or similar weather), a supply shed, a boundary file, a shape drawn on a map presented within a GUI of a user device, an image of a region, an image of a region displayed on a map presented within a GUI of a user device, or a user id associated with one or more production locations (for example, one or more fields). Geographic regions may be of any scale, whether large (e.g., an entire country), small (e.g., a boundary of a single field), or anything in between (e.g., a sourcing region).
[0046] For example, polygons representing fields may be detected from remote sensing data using computer vision methods (for example, edge detection, image segmentation, and combinations thereof) or machine learning algorithms (for example, maximum likelihood classification, random tree classification, support vector machine classification, ensemble learning algorithms, convolutional neural networks, etc.).
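As a non-limiting illustration (and not a description of any particular embodiment), the field-detection step can be sketched as connected-component labeling over a binary crop mask; the function name and the pure-Python approach are hypothetical stand-ins for the computer vision methods named above:

```python
from collections import deque

def label_fields(mask):
    """4-connected component labeling of a 2D binary crop mask.

    Each contiguous region of crop pixels receives a distinct integer
    label; regions may later be converted to field polygons.
    Returns (label grid, number of regions found).
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    current = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not labels[r][c]:
                current += 1                       # start a new region
                queue = deque([(r, c)])
                labels[r][c] = current
                while queue:                       # breadth-first flood fill
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = current
                            queue.append((ny, nx))
    return labels, current
```

In practice the binary mask would itself come from a crop type data layer or a classifier applied to imagery; the labeling step above only illustrates how disjoint field regions can be separated.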
[0047] “Ecosystem observation data” are observed or measured data describing an ecosystem, for example weather data, soil data, remote sensing data, emissions data (for example, emissions data measured by an eddy covariance flux tower), populations of organisms, plant tissue data, and genetic data. In some embodiments, ecosystem observation data are used to connect agricultural activities with ecosystem variables. Ecosystem observation data may include survey data, such as soil survey data (e.g., SSURGO). In various embodiments, the system performs scenario exploration and model forecasting, using the modeling described herein. In various embodiments, the system proposes climate-smart crop fuel feedstock CI integration with an existing model, such as the Greenhouse gases, Regulated Emissions, and Energy use in Technologies Model (GREET), which can be found online at https://greet.es.anl.gov/ (the GREET models are incorporated by reference herein). [0048] A “crop type data layer” is a data layer containing a prediction of crop type, for example USDA Cropland Data Layer provides annual predictions of crop type, and a 30m resolution land cover map is available from MapBiomas (https://mapbiomas.org/en). A crop mask may also be built from satellite-based crop type determination methods, ground observations including survey data or data collected by farm equipment, or combinations of two or more of: an agency or commercially reported crop data layer (e.g., CDL), ground observations, and satellite-based crop type determination methods.
[0049] A “vegetative index” (“VI”) is a value related to vegetation as computed from one or more spectral bands or channels of remote sensing data. Examples include simple ratio vegetation index (“RVI”), perpendicular vegetation index (“PVI”), soil adjusted vegetation index (“SAVI”), atmospherically resistant vegetation index (“ARVI”), soil adjusted atmospherically resistant VI (“SARVI”), difference vegetation index (“DVI”), and normalized difference vegetation index (“NDVI”). NDVI is a measure of vegetation greenness which is particularly sensitive to minor increases in surface cover associated with cover crops.
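For instance, NDVI is conventionally computed as (NIR − Red) / (NIR + Red); a minimal sketch (the function name and the zero-reflectance handling are illustrative choices):

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); ranges from -1 to 1,
    with higher values indicating greener vegetation."""
    denom = nir + red
    if denom == 0:          # edge case: no reflectance in either band
        return 0.0
    return (nir - red) / denom
```

Per-pixel NDVI computed over a time series of imagery can reveal the minor increases in surface cover associated with cover crops.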
[0050] “SEP” stands for soil enrichment protocol. The SEP version 1.0 and supporting documents, including requirements and guidance, (incorporated by reference herein) can be found online at https://www.climateactionreserve.org/how/protocols/soil-enrichment/. As is known in the art, SEP is an example of a carbon registry methodology, but it will be appreciated that other registries having other registry methodologies (e.g., carbon, water usage, etc.) may be used, such as the Verified Carbon Standard VM0042 Methodology for Improved Agricultural Land Management, v1.0 (incorporated by reference herein), which can be found online at https://verra.org/methodology/vm0042-methodology-for-improved-agricultural-land-management-v1-0/. The Verified Carbon Standard methodology quantifies the greenhouse gas (GHG) emission reductions and soil organic carbon (SOC) removals resulting from the adoption of improved agricultural land management (ALM) practices.
Such practices include, but are not limited to, reductions in fertilizer application and tillage, and improvements in water management, residue management, cash crop and cover crop planting and harvest, and grazing practices.
[0051] “LRR” refers to a Land Resource Region, which is a geographical area made up of an aggregation of Major Land Resource Areas (MLRAs) with similar characteristics.
[0052] DayCent is a daily time series biogeochemical model that simulates fluxes of carbon and nitrogen between the atmosphere, vegetation, and soil. It is a daily version of the CENTURY biogeochemical model. Model inputs include daily maximum/minimum air temperature and precipitation, surface soil texture class, and land cover/use data. Model outputs include daily fluxes of various N-gas species (e.g., N2O, NOx, N2); daily CO2 flux from heterotrophic soil respiration; soil organic C and N; net primary productivity; daily water and nitrate (NO3) leaching, and other ecosystem parameters.
[0053] “LCIA” refers to Life Cycle Impact Assessment.
[0054] “LCI” refers to Life Cycle Inventories.
[0055] “LCA” refers to Life Cycle Assessment.
[0056] “CF” refers to Characterization Factors.
[0057] “ASD” refers to Agricultural Statistical Districts.
[0058] “Chain of Custody” refers to the custodial sequence that occurs as ownership or control of the material supply is transferred from one custodian to another in the supply chain. There are a number of "chain of custody models" with different levels of rigor and adherence to Scope 3 reporting. This is important as a product may begin the journey through the supply chain under one Chain of Custody Model, and transfer to another as it moves through the supply chain.
[0059] “Chain of Custody Model” describes the approach taken to demonstrate the link (physical or administrative) between the verified unit of production and the claim about the final product. There are 5 different chain of custody models outlined by GHGP.
[0060] “Traceability” refers to the ability to verify the history, location, or application of an item by means of documented recorded identification. The ISEAL guidance (available at https://www.isealalliance.org/sites/default/files/resource/2017-11/ISEAL_Chain_of_Custody_Models_Guidance_September_2016.pdf) further discusses traceability and chain of custody, and is hereby incorporated by reference.
[0061] FIG. 1 illustrates one embodiment of a system environment for implementing an ecosystem attribute tracking application, in accordance with an embodiment. As depicted in FIG. 1, environment 100 includes client device 110 (with application 111 installed thereon), network 120, ecosystem attribute tracking application 130, and remote sensor 140. While only one instance of each item is depicted, this is for illustrative convenience, and references in the singular to each item are meant to cover instances where plural items exist.
[0062] Client device 110 is a device with which a user may interface with ecosystem attribute tracking application 130. Client device 110 may be any device having a user interface and capable of communication with ecosystem attribute tracking application 130. For example, client device 110 may be a personal computer, laptop, tablet, wearable device, kiosk, smart phone, or any other device having components capable of performing the functionality disclosed herein.
[0063] Optionally, client device 110 may have application 111 installed thereon. Application 111 may provide an interface between client device 110 and ecosystem attribute tracking application 130. Application 111 may be a stand-alone application installed on client device 110 that is communicatively coupled with ecosystem attribute tracking application 130 to perform at least some of the activity described with respect to ecosystem attribute tracking application 130 on client device 110, or may be accessed by way of a secondary application, such as a browser application. Any activity described herein with respect to ecosystem attribute tracking application 130 may be performed wholly or in part (e.g., by distributed processing) by application 111. That is, while activity is primarily described as performed in the cloud by ecosystem attribute tracking application 130, this is merely for convenience, and all of the same activity may be performed wholly or partially locally to client device 110 by application 111. Exemplary activity of application 111 may include providing a user interface to a user that shows a map interface and accepts input of parameters or any other input described herein by the user (e.g., with respect to determining ecosystem attributes).
[0064] Network 120 facilitates transmission of data between client device 110, ecosystem attribute tracking application 130, and remote sensor 140, as well as any other entity with which any entity of environment 100 communicates. Network 120 may be any data conduit, including the Internet, short-range communications, a local area network, wireless communication, cell tower-based communications, or any other communications.
[0065] Ecosystem attribute tracking application 130 receives inputs from one or more users of client device 110 and processes those inputs along with other data (e.g., from remote sensor 140) to determine ecosystem attributes. Remote sensor 140 may be used by ecosystem attribute tracking application 130 to determine factors that contribute to ecosystem attributes of imaged areas, such as supply sheds and/or fields. Ecosystem attribute tracking application 130 may have its functionality distributed across any number of servers, and may have some or all functionality performed local to client devices using application 111. Further details about ecosystem attribute tracking application 130 are disclosed below with respect to FIGS. 2-5.
[0066] FIG. 2 illustrates one embodiment of exemplary modules and databases used by the ecosystem attribute tracking application, in accordance with an embodiment. As depicted in FIG. 2, ecosystem attribute tracking application 130 includes secure document analysis module 202, geographic boundary module 204, auxiliary information module 206, map manipulation module 208, remote sensing data processing module 210, blending module 212, machine learning models database 252, and index database 254. These modules and this database are exemplary; fewer or more modules and/or databases may be used to achieve the functionality disclosed herein.
[0067] In some embodiments, secure document analysis module 202 identifies a geographic boundary associated with a crop. Various embodiments for using secure document analysis module 202 to perform this function will now be described, though it should be noted that other manners of identifying the geographic boundary are also disclosed (e.g., user input of geographical data) that can be used in combination with secure document analysis or on its own to identify a geographic boundary. As used herein, the term secure electronic document (sometimes referred to as “secure document” or “document”) may refer to an electronic representation of an agreement between parties with respect to usage of a crop. For example, a secure electronic document may be a document indicating attributes of the document including one or more of entities, what the crop is, a time associated with the production of the crop, one or more representations of quality or quality specifications, one or more geographic locations, and so on. The term “secure” refers to an attribute of the document as being verifiable as an accurate representation of a volume of agricultural crop (e.g., a crop such as corn, soybeans, wheat, barley, sorghum, beets, and the like). For example, a document may be secure based on features enabling auditing to show the document has not been tampered with since the document was generated. As another example, the document may be enabled with features that prevent editing after the document was generated.
[0068] Secure document analysis module 202 may determine one or more of the attributes of the document using a rules-based approach and/or a machine learning approach, such as a supervised learning approach and/or a generative machine learning approach (e.g., using one or more large language models). On a rules-based approach, secure document analysis module 202 may follow rules to identify the attributes. For example, a rule may state that where one or more given keyword(s) are found, they are indicative of an attribute. An example of such a rule would be the term “source field”, which may be indicative of a geographical location, or “provider” which may be indicative of a party. The rule may indicate how to resolve the attribute based on finding those keywords. Rather than finding particular keywords, synonyms or other fuzzy logic may be used to identify whether terms identify a match.
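By way of a non-limiting sketch, the rules-based approach described above might map keyword patterns to document attributes; all patterns, attribute names, and the sample text below are hypothetical:

```python
import re

# Hypothetical rule set: each attribute is resolved by a keyword pattern.
# "source field" indicates a geographic location; "provider" indicates a
# party; a crop name anywhere in the text indicates the crop type.
RULES = {
    "geographic_location": re.compile(
        r"source field[:\s]+([\w\s]+?)(?:[.;]|$)", re.I),
    "party": re.compile(r"provider[:\s]+([\w\s]+?)(?:[.;]|$)", re.I),
    "crop": re.compile(r"\b(corn|soybeans|wheat|barley|sorghum|beets)\b", re.I),
}

def extract_attributes(text):
    """Apply each keyword rule to the document text; return the
    attributes that could be resolved."""
    found = {}
    for attr, pattern in RULES.items():
        m = pattern.search(text)
        if m:
            found[attr] = m.group(1).strip()
    return found
```

A production system would likely extend this with synonym lists or fuzzy matching, as the paragraph above notes.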
[0069] Where secure document analysis module 202 deploys a machine learning approach, one or more supervised machine learning models may be trained to take as input a portion or an entirety of a secure electronic document and to output one or more attributes. Training examples may include examples of text labeled with one or more attributes. For example, an example may include a phrase, “12 bushels of corn for production year 2023”, and this phrase (and/or the entire sentence, paragraph, or any other amount of the secure electronic document up to and including the entirety of the document) may be labeled to indicate a crop type of “corn” and a time of “2023”. Separate models may be trained to detect different attributes (e.g., a separate model to detect time, a separate model to detect crop type, a separate model to detect parties, and so on). Alternatively, models may be trained to output predictions for more than one (and potentially all) attributes.
[0070] Where secure document analysis module 202 deploys a generative machine learning approach, secure document analysis module 202 may prepare one or more prompts for a large language model (LLM) that include instructions. The prompts may, for example, instruct the LLM with a skill level it is to assume and what it is to look for within the document, and perhaps how to identify those features. Secure document analysis module 202 may receive from the LLM an indication of values for attributes and/or an indication of which attributes cannot be resolved from within the document. Secure document analysis module 202 may iteratively prompt the LLM based on its output indicating ambiguities in attributes that cannot be resolved to ask narrower and narrower questions regarding identifying those attributes.
[0071] Some examples of usage of secure document analysis module 202 follow. A user of client device 110 may be requesting ecosystem attribute information with respect to an input crop such as corn. Secure document analysis module 202 may determine a time to bound its analysis of the ecosystem attribute based on analysis of a secure electronic document associated with the crop (e.g., a cultivation cycle’s start and end time). For example, the user may input (or provide a link to) one or more secure electronic documents into a user interface of ecosystem attribute tracking application 130, and secure document analysis module 202 may determine therefrom the time using any rules-based or machine learning approach. The time may be a time when the crop was produced, such as a harvest year or a range of months within a year.
[0072] Geographic boundary module 204 determines a geographic boundary within which ecosystem attribute tracking application 130 is to determine ecosystem attribute qualities. In some embodiments, identifying the geographic boundary associated with the crop comprises receiving an input of the boundary based on an interaction by a user into a map displayed on a user interface. For example, a user may select individual plots (e.g., fields) on a map interface. Geographic boundary module 204 may draw the boundary around the individual plots. The plots may be contiguous, or may be disjoint, where the geographic boundary comprises a plurality of boundaries around different plots that together form the geographic boundary.
[0073] In some embodiments, secure document analysis module 202 may have determined one or more locations that inform the geographic boundary (e.g., using machine learning and/or rules approaches). For example, various fields, plots, aggregators (e.g., grain elevators), and so on may have been identified. Geographic boundary module 204 may draw a boundary that includes these locations. Alternatively or additionally, geographic boundary module 204 may output to a user a recommendation of areas identified by secure document analysis module 202. The recommendation may include areas highlighted on the map. The user may select the areas and accept or decline each recommended area. Moreover, when presenting the map to the user (e.g., for user input of locations), geographic boundary module 204 may localize the map to a position that is localized around one or more of the areas identified by secure document analysis module 202. For example, the map may be centered around a centroid based on the areas identified by secure document analysis module 202. [0074] In some embodiments, geographic boundary module 204 detects uncertainty in a location determination and draws a geographic boundary based on the detected uncertainty. In some embodiments, uncertainty may be input by a user. For example, where a user is unsure of exact plots from which a crop is sourced, the user may indicate a region (e.g., a radius around a center point or any other geometric shape), and geographic boundary module 204 may determine the boundary to be or include that shape.
[0075] In some embodiments, geographic boundary module 204 detects uncertainty based on output from a machine learning model. For example, the supervised machine learning model and/or LLM may output a confidence value with its prediction of location from its analysis of a secure electronic document. In some embodiments, geographic boundary module 204 may compare the confidence value with a threshold minimum confidence value, and may determine uncertainty where the confidence value does not meet the threshold minimum confidence value. Geographic boundary module 204 may responsively draw a larger area (e.g., sized in correlation with how far below the threshold the confidence value is) around the identified location. In some embodiments, rather than use the threshold value, geographic boundary module 204 may widen the boundary around the determined location by an amount that is correlated with how far below a 100% confidence the prediction is.
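One plausible reading of the confidence-based widening is sketched below; the base radius, threshold default, and linear scaling are illustrative assumptions, as the text leaves the exact relationship open:

```python
def widen_boundary_radius(base_radius_m, confidence, threshold=0.8):
    """Return a boundary radius in meters: the radius is left alone
    when model confidence meets the threshold, and grows with the
    confidence shortfall otherwise (hypothetical linear scaling)."""
    if confidence >= threshold:
        return base_radius_m
    shortfall = threshold - confidence        # how far below the threshold
    return base_radius_m * (1.0 + shortfall / threshold)
```

For example, a prediction at half the default threshold confidence would roughly one-and-a-half the search radius, widening the drawn boundary around the predicted location.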
[0076] In some embodiments, geographic boundary module 204 may determine whether or not to include an area within a geographic boundary based on a clustering property of an uncertain geographic location with respect to certain geographic locations. For example, where an uncertain area is within a threshold distance of (e.g., adjacent to) or otherwise clustered with (e.g., owned by a same property owner as a certain plot) another plot having the same crop or same crop type, then geographic boundary module 204 may responsively determine to include that uncertain area within the geographic boundary. Geographic boundary module 204 may highlight uncertain areas to a user with a flag that those areas are uncertain, and request the user confirm those areas for inclusion by way of the user interface. [0077] As mentioned in the foregoing, where a machine learning approach is used with respect to secure electronic documents, one or more machine learning models may be trained using a set of historical secure electronic documents as labeled by known field locations corresponding to each of the historical secure electronic documents. In such scenarios, the machine learning model(s) may be trained to infer missing geographical locations based on labels of missing geographical locations on examples of the set of historical secure electronic documents as applied based on user input of what those examples are missing. In order to train the machine learning models, geographic boundary module 204 may present to users predicted geographical locations (e.g., as predicted from a secure electronic document using a rules and/or machine learning approach). Users may indicate additional geographic locations that were missing from the analysis of the secure electronic document. Geographic boundary module 204 may save training examples including the secure electronic document and/or predicted geographical locations as labeled by the indicated additional geographic locations. 
With such training, the machine learning models may be used by geographic boundary module 204 to predict missing geographical locations from predictions made by secure document analysis module 202 and to include those missing geographical locations in a geographic boundary.
[0078] In some embodiments, geographic boundary module 204 may, rather than use an exact boundary around identified areas, elect to approximate the geographic boundary (e.g., by having a wider boundary that also encompasses other areas). Geographic boundary module 204 may perform this approximation in response to determining that a confidence score (e.g., output by a machine learning model in connection with activity of secure document analysis module 202) is below a threshold confidence score. Moreover, in scenarios where missing locations are predicted by a machine learning model, confidence scores may also be used to draw a wider boundary around those missing locations.
[0079] In some embodiments, geographic boundary module 204 may apply a machine-learned model to secure electronic documents (e.g., input by a user), and may receive as output from the model a geographic boundary representing the highest probability region for the crop production. This highest probability region may be based on any combination of the aforementioned processes including analysis of the secure electronic documents for contents, mapping contents to attributes and/or locations, approximating boundary locations, and so on, where the machine learning model may determine a plurality of candidate geographic boundaries and probabilities of their being an accurate representation, and may select the candidate with the highest probability of being an accurate representation.
[0080] Geographic boundary module 204 may apply either a same or a different machine-learned model to the secure electronic documents, and may receive as output from the model a total volume of crop produced within the geographic boundary. Alternatively, after determining the geographic boundary, geographic boundary module 204 may apply the machine learning model to the geographic boundary directly (e.g., rather than, or in addition to, the secure electronic documents). The machine learning model may be configured to output the volume based on training examples that label historical geographic boundaries (e.g., possibly along with other auxiliary information) with volumes for a given time range (e.g., cultivation cycle).
[0081] The same or a different machine learning model may additionally be configured to determine one or more aggregation facilities within the geographic boundary (e.g., using a training set that shows historical aggregation facilities within that geographic boundary). For each of the one or more aggregation facilities, the same or a different machine learning model may be configured to determine a volume of crop associated with each of the one or more aggregation facilities (e.g., using training examples having labels for the aggregation facilities of volume of crop for a given cultivation cycle).
[0082] Auxiliary information module 206 may determine auxiliary information relating to mapped features. Auxiliary information module 206 may determine this information by using index database 254. Index database 254 may map different attributes corresponding to geographic locations. For example, a geographic location may be mapped to one or more parties, one or more crop types, and so on. An aggregation location (e.g., grain silo) may be mapped to different plots that contribute to an aggregation, crop type(s) that are aggregated, parties that contribute, and so on. By leveraging index database 254, auxiliary information module 206 may determine auxiliary information relating to data that is mapped or otherwise determined by secure document analysis module 202 and may augment that information. Geographic boundary module 204 may, based on this augmented information, provide augmented information about mapped features. Moreover, geographic boundary module 204 may determine areas that are part of a geographic boundary based on the augmented information (e.g., where a party is identified from a secure electronic document, one or more corresponding locations may be determined by auxiliary information module 206 for inclusion in the geographic boundary).
[0083] Map manipulation module 208 may manipulate a map shown on a user interface based on user input and/or processing performed by ecosystem attribute tracking application 130. Map manipulation module 208 may perform any above-discussed initialization activity (e.g., centering the map on determined locations). Map manipulation module 208 may determine locations from a secure electronic document (e.g., production locations, growth locations, aggregation locations, etc., based on output from secure document analysis module 202) and may configure a map to show those locations. For example, map manipulation module 208 may receive user input of a command (e.g., show associated aggregation locations or production locations for this set of secure electronic documents), and may responsively configure the map to show those locations.
[0084] Remote sensing data processing module 210 retrieves remote sensing data corresponding to the geographic boundary and a time (e.g., a year in which a crop was grown, a growing cycle, or some other time boundary around a particular crop). The remote sensing data may include satellite imagery, drone imagery, aircraft imagery, and derivatives thereof. In some embodiments, the remote sensing data is stored in index database 254 that maps the time and one or more geographic locations within the geographic boundary to remote sensing data (for example, images from satellite, drone, or plane mounted remote sensors). That is, as data is captured by a remote sensor 140, it is indexed by its capture time and location. The data from these data sources can be combined, and a standard feature set can be extracted from the combined data, enabling crop prediction models to be generated across different temporal systems, different spatial coordinate systems, and measurement systems. For example, sensor data streams can be a time series of scalar values linked to a specific latitude/longitude coordinate. Likewise, LiDAR data can be an array of scalar elevation values on a 10m rectangular coordinate system, and satellite imagery can be spatial aggregates of bands of wavelengths within specific geographic boundaries. After aggregating and standardizing data from these data streams (for instance to a universal coordinate system, such as a Military Grid Reference System), feature sets can be extracted and combined (such as a soil wetness index from raw elevation data, or cumulative growing degree days from crop types and planting dates). Remote sensor data, such as images (or portions thereof) of different plots or other locations, are stored in association with those locations. Remote sensor data processing module 210 retrieves this data using the index.
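As one example of a derived feature named above, cumulative growing degree days can be computed from daily temperature extremes (the base temperature is an assumed, crop-dependent parameter):

```python
def cumulative_gdd(daily_temps, base_temp=10.0):
    """Cumulative growing degree days from (tmax, tmin) pairs in Celsius.

    Each day contributes max(0, (tmax + tmin) / 2 - base_temp), so days
    whose mean temperature falls below the base contribute nothing.
    """
    total = 0.0
    for tmax, tmin in daily_temps:
        total += max(0.0, (tmax + tmin) / 2.0 - base_temp)
    return total
```

Features such as this can be computed once per location and season and stored alongside the remote sensing data in the index.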
[0085] Remote sensor data processing module 210 applies a machine-learned model to the remote sensing data, the machine-learned model configured to produce ecosystem data comprising a measure of one or more ecosystem attributes associated with the geographic boundary and the time. Manners of retrieving remote sensing data and using machine learning to determine ecosystem data are described in detail in the Remote Sensing Applications. Anything disclosed in the Remote Sensing Applications may be used by remote sensing data processing module 210 to produce ecosystem data.
[0086] Remote sensing data processing module 210, after determining the ecosystem data, modifies a crop database (e.g., index database 254) to include the ecosystem data in association with the crop. This enables ecosystem data to be tracked as the crop is used in downstream production (e.g., aggregated with other crops or used to create an end product or other blend). In some embodiments, the ecosystem data stored in the crop database is leveraged, rather than re-using the machine learning model, where a same crop, time, and location within the geographic location are later queried. That is, index database 254 may be referenced prior to retrieving satellite data, where remote sensing data processing module 210 leverages existing data for areas within the geographic boundary that previously had ecosystem data predicted, and limits its processing to determining ecosystem data for a subset of the area within the geographic boundary where this determination was not previously made. In this manner, computational efficiency is achieved in limiting the predictions required to be made by the machine learning model used to process remote sensor data. Moreover, network efficiencies are achieved by avoiding retrieving data from index database 254 to perform those computations.
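The caching behavior described in this paragraph can be sketched as follows, with index database 254 stood in by an in-memory mapping (class and attribute names are hypothetical):

```python
class EcosystemCache:
    """Consult the crop database before invoking the (expensive)
    machine-learned model; re-use stored ecosystem data when the same
    crop, time, and location are queried again."""

    def __init__(self, model):
        self.model = model        # callable: (crop, time, location) -> data
        self.db = {}              # in-memory stand-in for index database 254
        self.model_calls = 0      # counts actual model invocations

    def get(self, crop, time, location):
        key = (crop, time, location)
        if key not in self.db:
            self.model_calls += 1
            self.db[key] = self.model(crop, time, location)
        return self.db[key]
```

The computational saving described in the paragraph corresponds to `model_calls` staying below the number of queries whenever areas repeat.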
[0087] In some embodiments, the machine-learned model used by remote sensing data processing module 210 predicts a crop type associated with the geographic boundary and the time. In such embodiments, remote sensing data processing module 210 modifies the crop database to include the crop type (e.g., in association with any other auxiliary data related to the crop type within index 254).
[0088] In some embodiments, the machine-learned model used by remote sensing data processing module 210 predicts a total volume of crop production associated with the geographic boundary and the time. Remote sensing data processing module 210 may additionally obtain, based on processing of the secure electronic documents by secure document analysis module 202, a predicted total volume of crop produced within the geographic region and the time. Remote sensing data processing module 210 may determine whether the predicted volumes are within a threshold tolerance of one another. Responsive to determining that they are not within the threshold tolerance (e.g., that the amount indicated within the remote sensing data does not match what was expected based on the secure electronic documents), remote sensing data processing module 210 may output an alert of the deviation on the user interface. The determination of the threshold tolerance may be based on whether a difference between the total volume of crop predicted by the machine-learned model applied to remote sensing data and the total volume of crop predicted by the machine-learned model applied to one or more secure electronic documents is greater than a threshold amount. [0089] In some embodiments, remote sensing data processing module 210 may predict the presence within the geographic boundary of one or more crop processing facilities (for example, an oilseed crushing facility, a biofuel production facility, an animal feeding operation, a mill, and the like), on-farm storage of a crop, transfer of a crop between aggregation facilities, and/or carry-over stocks of a crop, based on a difference between the total volume of crop predicted by the machine-learned model applied to remote sensing data and the total volume of crop predicted by the machine-learned model applied to one or more secure electronic documents. The carry-over stocks of a crop are at an aggregation facility or on-farm storage.
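The tolerance check described above may be sketched as a relative-difference comparison (the 10% default tolerance is an illustrative assumption; the embodiment may equally use an absolute threshold amount):

```python
def volume_deviation_alert(volume_remote, volume_documents, tolerance=0.1):
    """Return True (raise an alert) when the volume predicted from remote
    sensing data and the volume predicted from secure electronic documents
    disagree by more than the threshold tolerance (a relative fraction)."""
    if volume_documents == 0:
        return volume_remote != 0        # any non-zero estimate deviates
    relative_diff = abs(volume_remote - volume_documents) / volume_documents
    return relative_diff > tolerance
```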
[0090] In some embodiments, the total volume of crop predicted by the machine-learned model applied to one or more secure electronic documents may be smaller than the total volume of crop predicted by the machine-learned model applied to remote sensing image data. Responsive to this condition, remote sensing data processing module 210 may generate a plurality of sets of field boundaries within the geographic boundary, where the predicted combined crop production of each set of fields is equivalent to the total volume of crop predicted by the machine-learned model applied to the one or more secure electronic documents. Remote sensing data processing module 210 may then apply an ecosystem attribute quantification method to remote sensing data for each field boundary within the set of field boundaries, where the ecosystem attribute quantification method is configured to produce ecosystem data comprising a measure of one or more ecosystem attributes associated with the field boundaries, and may generate an uncertainty value for the ecosystem attributes associated with the geographic boundary from the distribution of one or more ecosystem attributes estimated for the sets of field boundaries. The uncertainty value may be used to predict confidence values in any manner described herein.
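The uncertainty value derived from the distribution over candidate field-boundary sets might be sketched as a mean and population standard deviation (one of several plausible summary statistics; the choice is an assumption here):

```python
import statistics

def boundary_set_uncertainty(attribute_by_set):
    """Summarize an ecosystem attribute over alternative sets of field
    boundaries: each element is the attribute value computed for one
    candidate set. Returns (central estimate, uncertainty)."""
    mean = statistics.mean(attribute_by_set)
    spread = statistics.pstdev(attribute_by_set)   # population std. dev.
    return mean, spread
```

A wide spread across candidate sets signals that the geographic-boundary-level attribute is sensitive to which fields actually produced the documented volume.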
[0091] In some embodiments, the machine-learned model used by remote sensing data processing module 210 predicts one or more intermodal crop transfer facilities within the geographic boundary.
[0092] Following determination of ecosystem data for a given geographic boundary, map manipulation module 208 may receive, from user input into a user interface, a requested volume of a crop and a target profile of the measure of the one or more ecosystem attributes. For example, the user may indicate that, for 100 bushels of corn, the aggregate should have no more than a maximum indicated threshold of carbon emissions associated with the corn. Map manipulation module 208 may determine combinations of crop within areas of the map that would achieve this target, and may configure the map on the user interface to indicate one or more locations of the crop whose volume together matches the target profile. As an example, the target profile may include one or more of a specific value, a threshold minimum value, and a threshold maximum value for the measure of the one or more ecosystem attributes. Moreover, the user may select a subset of the geographic boundary in which to perform the search to match the target profile, and the search may be limited to just that subset.
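One simple way map manipulation module 208 could assemble crop locations to meet a requested volume under a threshold-maximum target profile is a greedy selection, taking lowest-attribute lots first. This sketch is an assumption, since the text does not specify the search strategy, and the location names are illustrative.

```python
def select_locations(locations, requested_volume, max_attribute):
    """Greedy sketch: take lowest-attribute lots first until the
    requested volume is covered, then verify the volume-weighted
    aggregate stays under the maximum.
    `locations` is a list of (name, available_volume, attribute)."""
    chosen, covered, weighted = [], 0.0, 0.0
    for name, volume, attribute in sorted(locations, key=lambda l: l[2]):
        take = min(volume, requested_volume - covered)
        if take <= 0:
            break
        chosen.append((name, take))
        covered += take
        weighted += take * attribute
    if covered < requested_volume:
        return None  # not enough supply in the searched area
    aggregate = weighted / covered
    return (chosen, aggregate) if aggregate <= max_attribute else None
```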
[0093] Blending module 212 determines ecosystem attributes for a crop and/or product at various stages in production. For example, grain from plots that each have their own ecosystem attributes may be aggregated at a grain silo. Blending module 212 tracks the proportions of the grain that have each given characteristic, thereby determining aggregate ecosystem attributes by performing a statistical operation (e.g., average). As a particular example, using blending module 212, an average emissions output may be determined for the entire aggregation of grain within the silo. This extends to transitioning from high-quality tracking earlier in the supply chain to aggregate tracking later in the supply chain. [0094] In some embodiments, blending module 212 may determine, based on output from secure document analysis module 202, a proportion of a crop within an aggregation from each plot from which the crop was sourced. In some embodiments, user input may be made to input a source and amount within an aggregation. In some embodiments, blending module 212 may validate these determinations and/or inputs using Global Positioning System (GPS) data (e.g., using GPS of transportation instruments, such as trailers carrying crop, to validate that the crop in fact came from a given plot to the point of aggregation). As a further example, the coordinates at which a truck was filled may be read in order to increase confidence that the agricultural product (e.g., grain) that was delivered in fact came from the field indicated.
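The volume-weighted aggregation performed by blending module 212 in paragraph [0093] amounts to the following; the attribute key names are illustrative, not taken from the specification.

```python
def blended_attributes(deliveries):
    """Volume-weighted blend of per-plot ecosystem attributes
    aggregated at a single silo. `deliveries` is a list of
    (volume, attributes_dict) pairs; attribute keys such as
    'emissions' are illustrative."""
    total = sum(volume for volume, _ in deliveries)
    blended = {}
    for volume, attributes in deliveries:
        for key, value in attributes.items():
            blended[key] = blended.get(key, 0.0) + value * volume / total
    return blended
```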
[0095] Blending module 212 may track a chain of custody in which materials or products with a set of specified characteristics are mixed according to certain criteria with materials or products without that set of characteristics, resulting in a known proportion of the specified characteristics in the final output. This is a similar approach to batch-level mass balance. In batch-level mass balance, segregation is maintained until the final point of blending or mixing for a specific batch of a product, while mixing with non-certified product is controlled and recorded, so the proportion of certified content in each final product is known. Examples of such models are available in section 2.3.1 of the ISEAL guidance (available at https://www.isealalliance.org/sites/default/files/resource/2017-11/ISEAL_Chain_of_Custody_Models_Guidance_September_2016.pdf, and hereby incorporated by reference herein in its entirety). In additional embodiments, blending module 212 may apply alternative statistical or machine learning models to determine the blend ratio and/or the blended characteristics of the aggregated physical product.
[0096] In some embodiments, blending module 212 may be used to control blending at the sourcing region in order to achieve a target ecosystem attribute or to boost downstream traceability for enrolled growers. For example, a specific output product profile can be achieved by scheduling deliveries within specified windows based on the predicted qualities of the product to be delivered and a desired blended product profile. In exemplary controlled blending / batch-level mass balance models, several claims may be made. For example, until the point of blending or mixing certified with non-certified product, ‘identity preserved’ or ‘segregation’ models are followed for the certified component; a proportion of certified and non-certified components is recorded; the percentage of certified content actually contained in the final product is known; only the percentage of content that is certified may be labeled as certified.
[0097] There are two claim options available for batch-level mass balance, e.g., for a mix with one-third certified and two-thirds non-certified product: 33% of the output can carry a claim of ‘fully certified’, or 100% of the output can carry a claim of ‘contains 33% certified content’ (and various other combinations to achieve a true claim).
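The arithmetic behind the two claim options in paragraph [0097] is simple proportion-taking. This sketch assumes volumes expressed in a common unit; the function name and dictionary keys are hypothetical.

```python
def mass_balance_claims(certified_volume, total_volume):
    """Two claim options for batch-level mass balance: a slice equal to
    the certified volume may be labeled 'fully certified', or the whole
    batch may be labeled with its certified-content percentage."""
    pct = 100.0 * certified_volume / total_volume
    return {
        "fully_certified_volume": certified_volume,
        "content_claim": f"contains {pct:.0f}% certified content",
    }
```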
[0098] As noted above, in some embodiments, the blend ratio of crops from multiple sources arriving at a delivery location (e.g., a sourcing region, equivalently referred to as a supply shed) is determined through the application of a machine learning model. Blending module 212 uses this blending ratio to determine the aggregate ecosystem attributes of the blended agricultural products. In various embodiments, each blending model includes two or more sub-models that are configured to work together, for example as an ensemble of models.
[0099] In various embodiments, the machine learning models described herein may be any suitable model as is known in the art (e.g., artificial neural networks (ANN), random forests (RF), decision trees, network flow optimization models, etc.). More generally, it will be appreciated that a variety of machine learning techniques are suitable for estimating and predicting the relationship between input features such as the field data object and a dependent variable such as the blend ratio. In various embodiments, the sub-models may utilize the same or different machine learning models.
[0100] Blending module 212 may deploy a ledger to track blending information at each hop from sourcing to final product. That is, delivery from a first sourcing region to a next location (e.g., next sourcing region and/or processing facility) may be tracked in a same manner. Blending module 212 may apply a distribution model to outbound transport data to determine the destination of blended agricultural product. In this way, ecosystem attributes may be attributed to a final delivery even where the chain of custody from an interim delivery location to a final delivery location is unavailable. In some embodiments, the distribution model is a statistical or machine learning model based on prior delivery information, and the particular location of the interim delivery. This provides a probabilistic fallback where more accurate tracking cannot be employed.
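The per-hop ledger described in paragraph [0100] can be sketched as an append-only record of source proportions at each transfer. The class and method names here are hypothetical, and a production ledger would add persistence and integrity guarantees the sketch omits.

```python
class BlendLedger:
    """Append-only sketch of per-hop blend records from sourcing to
    final product; each hop stores source proportions so ecosystem
    attributes can be attributed downstream even where full chain of
    custody is unavailable."""

    def __init__(self):
        self.hops = []

    def record_hop(self, origin, destination, proportions):
        """`proportions` maps source id -> fraction of outbound volume;
        fractions must sum to 1."""
        assert abs(sum(proportions.values()) - 1.0) < 1e-9
        self.hops.append({"origin": origin,
                          "destination": destination,
                          "proportions": dict(proportions)})

    def trace(self, destination):
        """Return all recorded hops terminating at `destination`."""
        return [h for h in self.hops if h["destination"] == destination]
```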
[0101] In various embodiments, transport data comprising information regarding transport of crops from the sourcing region to a final destination are read. In some embodiments, these data are not readily available. In such embodiments, GPS tracking data of the transport vehicles and vehicle characteristics such as payload capacity can be collected at the sourcing region via automated vehicle identification and classification. In some embodiments, data may be recorded at the sourcing region, identifying the companies to which the sourcing region is supplying the grains.
[0102] In various embodiments, the distribution model described herein may be any suitable statistical model or machine learning model known in the art (e.g., artificial neural networks (ANN), random forests (RF), decision trees, etc.). In various embodiments, the distribution model can further include two or more sub-models that are configured to work together, for example as an ensemble. In various embodiments, the sub-models may utilize the same or different machine learning models.
[0103] While the examples provided herein refer to a single data structure, it will be appreciated that in various embodiments, the data structure may be distributed over multiple linked objects. For example, the data structure may comprise a plurality of database entries related by one or more key values. In another example, the data structure may comprise a plurality of objects related by association or inheritance.
[0104] Various embodiments described herein use artificial neural networks. Suitable artificial neural networks include but are not limited to a feedforward neural network, a radial basis function network, a self-organizing map, learning vector quantization, a recurrent neural network, a Hopfield network, a Boltzmann machine, an echo state network, long short term memory, a bi-directional recurrent neural network, a hierarchical recurrent neural network, a stochastic neural network, a modular neural network, an associative neural network, a deep neural network, a deep belief network, a convolutional neural network, a convolutional deep belief network, a large memory storage and retrieval neural network, a deep Boltzmann machine, a deep stacking network, a tensor deep stacking network, a spike and slab restricted Boltzmann machine, a compound hierarchical-deep model, a deep coding network, a multilayer kernel machine, or a deep Q-network.
[0105] In various embodiments, the learning system employs Extreme Gradient Boosting (XGBoost) for predicting county scale yields. This algorithm is employed in various embodiments because: 1) its tree-based structure can handle the non-linear relationships between predictors and outcomes and 2) it automatically captures interactions among features well, so they do not need to be pre-computed. Additionally, XGBoost is computationally efficient relative to similar machine learning methods.
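XGBoost itself is a dedicated library, but the residual-fitting idea at the heart of tree-based boosting, which is what lets it capture the non-linear relationships noted in paragraph [0105], can be shown in miniature with depth-1 trees (stumps) on a one-dimensional input. This is an illustrative sketch only, not the county-scale yield model, and all names and parameters are assumptions.

```python
def fit_stump(xs, residuals):
    """Best single split (depth-1 tree) minimizing squared error on a
    1-D input; returns (threshold, left_mean, right_mean)."""
    best = None
    for threshold in xs:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, threshold, lm, rm)
    return best[1:]


def boost(xs, ys, rounds=50, lr=0.3):
    """Gradient boosting for squared loss: start from the mean
    prediction and fit each successive stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, lm, rm = fit_stump(xs, residuals)
        stumps.append((threshold, lm, rm))
        preds = [p + lr * (lm if x <= threshold else rm)
                 for x, p in zip(xs, preds)]
    return base, lr, stumps


def predict(model, x):
    base, lr, stumps = model
    return base + sum(lr * (lm if x <= t else rm) for t, lm, rm in stumps)
```

Because each round models only what the previous rounds missed, interactions and non-linearities are captured without pre-computing features, which is the property paragraph [0105] highlights.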
[0106] FIG. 3 shows an exemplary user interface showing a map and ecosystem attributes, in accordance with an embodiment. As shown in FIG. 3, user interface 300 includes a map with which a user may interact. The user may be associated with a prospective entity that is sourcing one or more agricultural products. The user may have user interface 300 draw one or more agricultural boundaries (e.g., the three depicted agricultural boundaries) through any manner discussed above with respect to FIG. 2. For example, the user may input a set of secure electronic documents, and the boundaries may be drawn automatically. As shown, the user may select portions within a geographic boundary, or an entire area within a geographic boundary, to obtain auxiliary information (e.g., information about a particular aggregator within a boundary). In some embodiments, the user may select points of interest, such as a given aggregator (e.g., grain silo), and user interface 300 may highlight fields that contribute to that aggregator where known, or highlight approximated fields, through any manner discussed above with respect to FIG. 2.
[0107] FIG. 4 shows one embodiment of an exemplary user interface showing blending from a plurality of supply sheds, in accordance with an embodiment. As depicted in FIG. 4, user interface 400 includes ecosystem attribute selector 410 and sourcing region indicators 420. Ecosystem attribute tracking application 130 may receive user input of an indication of required maximum levels for one or more ecosystem attributes. Responsive to receiving this user input, ecosystem attribute tracking application 130 may output sourcing region indicators 420 that, where agricultural product is sourced in at least partial volumes from each supply shed, would together in the aggregate satisfy the required maximum levels for the ecosystem attributes.
[0108] In the particular example depicted, a user inputs a desired target emissions level for carbon of 0.270 per kilogram for a total of 150,000 bushels of corn. Sourcing region indicators 420 are responsively output that show the volume available from various supply sheds along with their associated estimated carbon emissions levels. In some embodiments, ecosystem attribute tracking application 130 selects the sourcing region indicators 420 using a machine learning model, where inputs may include the selected attributes, and where historical and/or current training data is used to recommend sourcing region indicators 420. [0109] In some embodiments, a user profile for the sourcing entity is stored based on past activity of either the user alone, or an entity that the user represents. For example, where an entity having many users representing it exists, their activity in the aggregate may yield preferences of the entity. That is, users may avoid facilities that use ethanol, or may avoid sourcing regions that transport agricultural product using truck rather than rail. Ecosystem attribute tracking application 130 may train a machine learning model to predict preferences of the user(s), and may generate a recommendation of sourcing region indicators 420 based on the predicted preferences. Moreover, where users select given one(s) of the recommended sourcing regions and/or reject the recommended sourcing regions, ecosystem attribute tracking application 130 may retrain the model, such that a different ordering and/or set of sourcing regions would be shown the next time a user from that group of users requests the same or similar blending requirements.
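For two supply sheds whose per-unit emissions bracket the target, the volumes that make the blend average exactly the target follow the lever rule. This sketch is illustrative of the arithmetic behind the FIG. 4 example only; the shed names, volumes, and feasibility checks are assumptions.

```python
def blend_to_target(shed_low, shed_high, target, total_volume):
    """Lever-rule sketch: volumes to draw from two supply sheds whose
    per-unit emissions bracket the target so the blend averages exactly
    `target`. Each shed is (name, available_volume, emissions_per_unit).
    Returns None when the target is unreachable or supply is short."""
    (n1, v1, e1), (n2, v2, e2) = shed_low, shed_high
    if not (e1 <= target <= e2) or e1 == e2:
        return None
    frac_low = (e2 - target) / (e2 - e1)  # fraction from low-emissions shed
    take1 = frac_low * total_volume
    take2 = total_volume - take1
    if take1 > v1 or take2 > v2:
        return None  # insufficient volume in one shed
    return {n1: take1, n2: take2}
```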
[0110] FIG. 5 illustrates an exemplary flowchart showing a process for implementing an ecosystem attribute tracking application, in accordance with an embodiment. Process 500 may occur by one or more processors of ecosystem attribute tracking application 130 executing instructions stored on a non-transitory computer-readable medium to perform the operations of one or more of the modules of ecosystem attribute tracking application 130. Process 500 begins with ecosystem attribute tracking application 130 identifying 510 a geographic boundary associated with a crop (e.g., using geographic boundary module 204). Ecosystem attribute tracking application 130 then retrieves 520 remote sensing data corresponding to the geographic boundary and a time (e.g., using remote sensing data processing module 210), and applies 530 a machine-learned model to the remote sensing data, the machine-learned model configured to produce ecosystem data comprising a measure of one or more ecosystem attributes associated with the geographic boundary and the time. Ecosystem attribute tracking application 130 modifies a crop database to include the ecosystem data in association with the crop (e.g., for use by blending module 212).

Claims

WHAT IS CLAIMED IS:
1. A method comprising: identifying a geographic boundary associated with crop production; retrieving remote sensing data corresponding to the geographic boundary and a time; applying an ecosystem attribute quantification method to the remote sensing data, the ecosystem attribute quantification method configured to produce ecosystem data comprising a measure of one or more ecosystem attributes associated with the geographic boundary and the time; and modifying a crop database to include the ecosystem data in association with crop production of a geographic region within the geographic boundary.
2. The method of claim 1, further comprising: determining the time based on one or more secure electronic documents associated with crop production, the time being a time when the crop was produced.
3. The method of claim 2, wherein determining the time based on the one or more secure electronic documents associated with crop production comprises: applying a machine-learned model to the secure electronic document, the machine- learned model configured to produce a determination of the time when the crop was produced.
4. The method of claim 1, wherein determining the geographic boundary comprises: applying a machine-learned model to one or more secure electronic documents, the machine-learned model configured to generate a geographic boundary representing a highest probability region for the crop production.
5. The method of claim 1, further comprising applying a machine-learned model to one or more secure electronic documents, the machine-learned model configured to determine a total volume of crop produced within the geographic boundary.
6. The method of claim 1, further comprising applying a machine-learned model to one or more secure electronic documents, the machine-learned model configured to determine one or more aggregation facilities within the geographic boundary.
7. The method of claim 6, wherein the machine-learned model is additionally configured to determine a volume of crop associated with each of the one or more aggregation facilities.
8. The method of claim 1, wherein identifying the geographic boundary associated with crop production comprises receiving an input of the boundary based on an interaction by a user into a map displayed on a user interface.
9. The method of claim 8, wherein the map, prior to receiving the input, is centered on a location determined by: applying a machine-learned model to a secure electronic document, the machine- learned model configured to provide one or more geographic locations; and determining a centroid of the one or more geographic locations to be where the map is to be centered.
10. The method of claim 1, wherein identifying the geographic boundary associated with the crop comprises: applying a machine-learned model to one or more secure electronic documents, the machine-learned model configured to provide one or more geographic locations; and determining the geographic boundary based on the one or more geographic locations.
11. The method of claim 10, wherein the geographic boundary is determined based on clustering properties of the one or more geographic locations with respect to one another.
12. The method of claim 10, wherein the machine-learned model is trained using a set of historical secure electronic documents as labeled by known field locations corresponding to each of the historical secure electronic documents.
13. The method of claim 12, wherein the machine-learned model is trained to infer missing geographical locations based on labels of missing geographical locations on examples of the set of historical secure electronic documents as applied based on user input of what those examples are missing.
14. The method of claim 13, wherein the geographic boundary is approximated where a confidence score output by the machine-learned model for an inference is below a threshold confidence score.
15. The method of claim 10, wherein the machine-learned model is additionally configured to output one or more of the time and one or more parties corresponding to the geographic boundary.
16. The method of claim 1, wherein the method further comprises: retrieving a plurality of secure electronic documents; determining one or more geographic locations corresponding to a crop production location based on content of the secure electronic documents; and configuring a map on a user interface to show the one or more geographic locations, the configuring occurring without user input.
17. The method of claim 1, wherein the geographic boundary is determined based on locations of one or more aggregation facilities that hold quantities of the crop after the crop has been harvested.
18. The method of claim 1, wherein the remote sensing data is stored in a database and retrieved using an index that maps the time and one or more geographic locations within the geographic boundary to the remote sensing data.
19. The method of claim 1, wherein the ecosystem data stored in the crop database is leveraged, rather than re-using a machine learning model, where a same crop, time, and location within the geographic boundary are later queried.
20. The method of claim 1, further comprising: receiving, from user input into a user interface, a requested volume of a crop and a target profile of the measure of the one or more ecosystem attributes; and configuring a map on the user interface to indicate one or more locations of the crop whose volume together matches the target profile.
21. The method of claim 20, wherein the target profile includes one or more of a specific value, a threshold minimum value, and a threshold maximum value for the measure of the one or more ecosystem attributes.
22. The method of claim 20, wherein the target profile is selected in connection with a given geographic region within the geographic boundary.
23. The method of claim 1, further comprising applying a machine-learned model to the remote sensing data, the machine-learned model configured to predict a crop type associated with the geographic boundary and the time.
24. The method of claim 23, further comprising modifying the crop database to include the crop type.
25. The method of claim 1, further comprising applying a machine-learned model to the remote sensing data, the machine-learned model configured to predict a total volume of crop production associated with the geographic boundary and the time.
26. The method of claim 1, further comprising: applying a machine-learned model to the remote sensing data, the machine-learned model configured to predict a total volume of crop production associated with the geographic boundary and the time, and applying a machine-learned model to one or more secure electronic documents, the machine-learned model configured to predict a total volume of crop produced within the geographic region and the time.
27. The method of claim 26, comprising: predicting a presence within the geographic boundary of one or more crop processing facilities, on-farm storage of a crop, transfer of a crop between aggregation facilities, or carry-over stocks of a crop, based on a difference between the total volume of crop predicted by the machine-learned model applied to remote sensing data and the total volume of crop predicted by the machine-learned model applied to one or more secure electronic documents.
28. The method of claim 27, wherein the carry-over stocks of a crop are at an aggregation facility or on-farm storage.
29. The method of claim 26, wherein the total volume of crop predicted by the machine-learned model applied to one or more secure electronic documents is smaller than the total volume of crop predicted by the machine-learned model applied to the remote sensing data, and further comprising: generating a plurality of sets of field boundaries within the geographic boundary, wherein a predicted combined crop production of each set of fields is equivalent to the total volume of crop predicted by the machine-learned model applied to the one or more secure electronic documents, applying an ecosystem attribute quantification method to remote sensing data for each field boundary within the set of field boundaries wherein the ecosystem attribute quantification method is configured to produce ecosystem data comprising a measure of one or more ecosystem attributes associated with the field boundaries, and generating an uncertainty value for the ecosystem attributes associated with the geographic boundary from a distribution of one or more ecosystem attributes estimated for the sets of field boundaries.
30. The method of claim 1, further comprising applying a machine-learned model to the remote sensing data, the machine-learned model configured to identify one or more intermodal crop transfer facilities within the geographic boundary.
31. The method of claim 1, further comprising: generating a plurality of field boundaries within the geographic boundary, wherein retrieving remote sensing data corresponding to the geographic boundary and a time comprises retrieving remote sensing data for each of the plurality of field boundaries; and wherein applying an ecosystem attribute quantification method to the remote sensing data comprises applying an ecosystem attribute quantification method to field-level remote sensing data, the ecosystem attribute quantification method configured to produce ecosystem data comprising a measure of one or more ecosystem attributes associated with each of the field boundaries within the geographic boundary and the time.
32. A non-transitory computer-readable medium comprising memory with instructions encoded thereon, the instructions, when executed by one or more processors, causing the one or more processors to perform operations, the instructions comprising instructions to: identify a geographic boundary associated with crop production; retrieve remote sensing data corresponding to the geographic boundary and a time; apply an ecosystem attribute quantification method to the remote sensing data, the ecosystem attribute quantification method configured to produce ecosystem data comprising a measure of one or more ecosystem attributes associated with the geographic boundary and the time; and modify a crop database to include the ecosystem data in association with crop production of a geographic region within the geographic boundary.
33. A system comprising: memory with instructions encoded thereon; and one or more processors that, when executing the instructions, are caused to perform operations comprising: identifying a geographic boundary associated with crop production; retrieving crop production data corresponding to the geographic boundary and a time based on remote sensing data; applying an ecosystem attribute quantification method to the remote sensing data, the ecosystem attribute quantification method configured to produce ecosystem data comprising a measure of one or more ecosystem attributes associated with the geographic boundary and the time; and modifying a crop database to include the ecosystem data in association with crop production of a geographic region within the geographic boundary.
Applications Claiming Priority

US 63/381,756, filed 2022-10-31
PCT/US2023/078118, filed 2023-10-27: Machine learning approach to detecting and tracing aggregated ecosystem attributes

Publication

WO2024097621A1, published 2024-05-10

Family

ID=90931445