CN117874498A

CN117874498A - Intelligent forestry big data system, method, equipment and medium based on data lake

Info

Publication number: CN117874498A
Application number: CN202410280842.0A
Authority: CN
Inventors: 李晓林; 曾维朝; 李凡
Original assignee: Aerospace Guangtong Technology Shenzhen Co ltd
Current assignee: Aerospace Guangtong Technology Shenzhen Co ltd
Priority date: 2024-03-12
Filing date: 2024-03-12
Publication date: 2024-04-12
Anticipated expiration: 2044-03-12
Also published as: CN117874498B

Abstract

The invention relates to the technical field of big data and discloses an intelligent forestry big data system, method, equipment and medium based on a data lake. The intelligent forestry big data system comprises a data integration module, a data lake storage module, a grid mapping module, a resource analysis module and a plan decision module, wherein: the data integration module is used for performing data examination and data cleaning on historical forestry data to obtain integrated forestry data; the data lake storage module is used for performing data pooling and data lake storage on the integrated forestry data to obtain a forestry data lake; the resource analysis module is used for performing spatial time-series forestry resource feature extraction on the forestry data lake according to the forestry grid model to obtain a forestry resource feature sequence, and training the forestry grid model into a resource analysis model by using the forestry resource feature sequence; and the plan decision module is used for performing decision analysis and plan adjustment on a forestry production plan by using the resource analysis model to obtain a standard forestry plan. The invention can improve the efficiency of forestry data analysis.

Description

Intelligent forestry big data system, method, equipment and medium based on data lake

Technical Field

The invention relates to the technical field of big data, in particular to an intelligent forestry big data system, method, equipment and medium based on a data lake.

Background

The intelligent forestry big data technology is applied to the forestry field, and the effective management, protection and utilization of forest resources are realized by collecting, storing, processing and analyzing large-scale forestry data.

Existing intelligent forestry big data systems are based on simple database technology, and store and analyze forestry data by general database methods. In practical applications, the quality of forestry data can be affected by various factors during data collection or processing, which in turn affects the accuracy and reliability of data analysis and decision making. Technical challenges such as data islands and low data processing speed also exist, so the efficiency of forestry data analysis is low.

Disclosure of Invention

The invention provides an intelligent forestry big data system, method, equipment and medium based on a data lake, and mainly aims to solve the problem of low efficiency in forestry data analysis.

In order to achieve the above purpose, the invention provides a data lake-based intelligent forestry big data system, which comprises a data integration module, a data lake storage module, a grid mapping module, a resource analysis module and a plan decision module, wherein:

The data integration module is used for acquiring historical forestry data of a target forestry area, and performing data examination and data cleaning on the historical forestry data to obtain integrated forestry data;

the data lake storage module is used for carrying out data pooling and data lake storage on the integrated forestry data to obtain a forestry data lake;

the grid mapping module is configured to perform remote sensing geographical partitioning and space grid mapping operation on the target forestry area by using the forestry data lake to obtain a forestry grid model, where the grid mapping module is specifically configured to, when performing remote sensing geographical partitioning and space grid mapping operation on the target forestry area by using the forestry data lake to obtain the forestry grid model: extracting a forestry remote sensing map sequence from the forestry data lake, and performing geographic coordinate transformation on the forestry remote sensing map sequence to obtain a geographic remote sensing map sequence; respectively carrying out image denoising and image size alignment operation on the geographic remote sensing image sequence to obtain a standard remote sensing image sequence; sampling the standard remote sensing image sequence by using the following remote sensing supersampling algorithm to obtain a sampled remote sensing image sequence:

Wherein (the formula and its symbols are not reproduced in this text) the quantities defined are: the gray value of the pixel at a given pixel coordinate in a given sampled remote sensing map of the sampled remote sensing map sequence; the gray value of the pixel at a given pixel coordinate in a given standard remote sensing map of the standard remote sensing map sequence; the horizontal pixel interpolation coefficient corresponding to the pixel at that coordinate; the longitudinal pixel interpolation coefficient corresponding to the pixel at that coordinate; and a preset interpolation weight; performing preliminary block partitioning on the sampled remote sensing map sequence to obtain a remote sensing block group sequence; performing image feature convolution and feature cluster fusion on the remote sensing block group sequence to obtain a standard remote sensing block group sequence; extracting a partition structure from the standard remote sensing block group sequence, and establishing a forestry grid model according to the partition structure;

the resource analysis module is used for performing spatial time-series forestry resource feature extraction on the forestry data lake according to the forestry grid model to obtain a forestry resource feature sequence, and training the forestry grid model into a resource analysis model by utilizing the forestry resource feature sequence;

the plan decision module is used for acquiring a preset forestry production plan, and carrying out decision analysis and plan adjustment on the forestry production plan by utilizing the resource analysis model to obtain a standard forestry plan.

Optionally, the data integration module is specifically configured to, when performing data inspection and data cleaning on the historical forestry data to obtain integrated forestry data:

performing structural examination on the historical forestry data to obtain a structured forestry data set, a semi-structured forestry data set and an unstructured forestry data set;

performing data standardization and data deduplication operation on the structured forestry data set to obtain a primary structured forestry data set;

updating the primary structured forestry data set into a standard structured forestry data set by using a preset outlier forestry data detection algorithm;

performing data analysis, field extraction and de-duplication and de-noising operations on the semi-structured forestry data set to obtain a standard semi-structured forestry data set;

performing data verification and de-duplication de-noising operation on the unstructured forestry data set to obtain a standard unstructured forestry data set;

and collecting the standard structured forestry data set, the standard semi-structured forestry data set and the standard unstructured forestry data set into integrated forestry data.

Optionally, the data integration module is specifically configured to, when updating the primary structured forestry data set to a standard structured forestry data set:

Extracting features of the primary structured forestry data set to obtain a primary forestry data feature set;

calculating a forestry data outlier set corresponding to the primary structured forestry data set by using an outlier forestry data detection algorithm and the primary forestry data feature set as follows:

wherein (the formula and its symbols are not reproduced in this text) the quantities defined are: an individual forestry data outlier in the forestry data outlier set; two preset indices; two preset countermeasure coefficients; two separately indexed items of primary structured forestry data in the primary structured forestry data set; the total number of data in the primary structured forestry data set; the two correspondingly indexed primary forestry data features in the primary forestry data feature set; the transpose symbol; and the covariance function symbol;

performing outlier threshold screening on the forestry data outlier group to obtain a standard data outlier group;

collecting primary structured forestry data corresponding to the standard data outlier group in the primary structured forestry data set into an outlier forestry data set;

And carrying out data smoothing filling operation on the outlier forestry data group in the primary structured forestry data to obtain a standard structured forestry data set.

Optionally, the data lake storage module is specifically configured to, when performing data pooling and data lake storage on the integrated forestry data to obtain the forestry data lake:

extracting a data structure set and a data attribute set from the integrated forestry data;

splicing the data structure set and the data attribute set into a data area key set;

clustering and pooling the integrated forestry data according to the data area key set to obtain a forestry data pool group;

performing format matching on the forestry data pool group according to the data area key set to obtain a forestry storage format group;

and carrying out engine matching and engine storage on the forestry data pool group according to the forestry storage format group to obtain a forestry data lake.

Optionally, the resource analysis module is specifically configured to, when performing spatial time sequence forestry resource feature extraction on the forestry data lake according to the forestry grid model to obtain a forestry resource feature sequence:

extracting a meteorological data sequence, a soil data sequence and a resource data sequence from the forestry data lake in time-series order;

Performing feature extraction and feature fusion operation on the meteorological data sequence and the soil data sequence to obtain a meteorological soil feature sequence;

extracting a partition structure from the forestry grid model, and splitting the meteorological soil characteristic sequence into a meteorological soil characteristic group sequence according to the partition structure;

splitting the resource data sequence into a resource data group sequence according to the partition structure;

extracting an environmental time sequence feature group sequence from the meteorological soil feature group sequence;

extracting a resource time sequence feature group sequence from the resource data group sequence, and collecting the environment time sequence feature group sequence and the resource time sequence feature group sequence into a forestry resource feature sequence.

Optionally, the resource analysis module is specifically configured to, when training the forestry grid model into a resource analysis model by using the forestry resource feature sequence:

performing grid mapping on the forestry grid model by utilizing an environmental time sequence feature group sequence in the forestry resource feature sequence to obtain a mapped grid model;

performing time sequence convolution and embedding activation operation on the mapping grid model to obtain an analysis resource feature group sequence;

Calculating a grid loss value between the sequence of resource time sequence feature groups and the sequence of analysis resource feature groups by using the following grid loss value algorithm:

wherein (the formula and its symbols are not reproduced in this text) the quantities defined are: the grid loss value; a serial number used as an index; the total number of features of each resource time sequence feature group in the resource time sequence feature group sequence; the sequence length of the resource time sequence feature group sequence; an individual resource time sequence feature, indexed by its group and by its position within that group; the correspondingly indexed analysis resource feature in the analysis resource feature group sequence; a preset constant; a preset loss weight; the Laplacian symbol; the dot product symbol; the cross product symbol; and the absolute value symbol;

and carrying out iterative updating of model parameters on the mapping grid model according to the grid loss value to obtain a resource analysis model.

Optionally, the plan decision module is specifically configured to, when performing decision analysis and plan adjustment on the forestry production plan by using the resource analysis model to obtain a standard forestry plan:

extracting a planned production period and a corresponding planned resource yield group from the forestry production plan;

Calculating an analysis resource characteristic group corresponding to the planned production period by using the resource analysis model;

performing data mapping on the analysis resource characteristic set to obtain an analysis forestry resource data set;

carrying out statistical classification on the analysis forestry resource data set to obtain an analysis resource yield set;

performing matching decision on the planned resource yield group according to the analysis resource yield group to obtain a resource decision result;

and carrying out plan adjustment on the forestry production plan by utilizing the resource decision result to obtain a standard forestry plan.
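
As a non-limiting illustration, the matching decision and plan adjustment steps above might be sketched as follows. The function name `adjust_plan`, the `tolerance` parameter, the decision labels and the rule of capping a planned yield at the analyzed yield are all assumptions made for this sketch; the source does not specify the decision rule.

```python
def adjust_plan(planned, predicted, tolerance=0.1):
    """Compare planned and model-analyzed yields per resource and cap over-ambitious targets.

    planned / predicted: dicts mapping a resource name to a yield value.
    tolerance: hypothetical relative margin by which a plan may exceed the analysis.
    """
    adjusted, decisions = {}, {}
    for resource, target in planned.items():
        estimate = predicted.get(resource)
        if estimate is None:
            # No analysis result for this resource: leave the plan unchanged.
            decisions[resource] = "no_estimate"
            adjusted[resource] = target
        elif target > estimate * (1 + tolerance):
            # Planned yield exceeds the analyzed yield by more than the margin.
            decisions[resource] = "reduce"
            adjusted[resource] = estimate
        else:
            decisions[resource] = "keep"
            adjusted[resource] = target
    return adjusted, decisions
```

Here the returned `decisions` dictionary plays the role of the resource decision result, and `adjusted` plays the role of the standard forestry plan.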

In order to solve the problems, the invention also provides a data lake-based intelligent forestry big data method, which comprises the following steps:

acquiring historical forestry data of a target forestry region, and performing data examination and data cleaning on the historical forestry data to obtain integrated forestry data;

carrying out data pooling and data lake storage on the integrated forestry data to obtain a forestry data lake;

performing remote sensing geographical partitioning and space grid mapping operation on the target forestry region by utilizing the forestry data lake to obtain a forestry grid model, wherein the performing remote sensing geographical partitioning and space grid mapping operation on the target forestry region by utilizing the forestry data lake to obtain the forestry grid model comprises the following steps: extracting a forestry remote sensing map sequence from the forestry data lake, and performing geographic coordinate transformation on the forestry remote sensing map sequence to obtain a geographic remote sensing map sequence; respectively carrying out image denoising and image size alignment operation on the geographic remote sensing image sequence to obtain a standard remote sensing image sequence; sampling the standard remote sensing image sequence by using the following remote sensing supersampling algorithm to obtain a sampled remote sensing image sequence:

carrying out space time sequence forestry resource feature extraction on the forestry data lake according to the forestry grid model to obtain a forestry resource feature sequence, and training the forestry grid model into a resource analysis model by utilizing the forestry resource feature sequence;

and acquiring a preset forestry production plan, and performing decision analysis and plan adjustment on the forestry production plan by utilizing the resource analysis model to obtain a standard forestry plan.

In order to solve the above-mentioned problems, the present invention also provides an electronic apparatus including:

at least one processor; and,

a memory communicatively coupled to the at least one processor; wherein,

the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the data lake-based intelligent forestry big data method described above.

In order to solve the above-mentioned problems, the present invention also provides a computer-readable storage medium having stored therein at least one computer program that is executed by a processor in an electronic device to implement the above-mentioned data lake-based smart forestry big data method.

According to the invention, integrated forestry data are obtained by performing data examination and data cleaning on the historical forestry data, which brings the historical forestry data under integrated, unified management and facilitates subsequent data lake storage. A forestry data lake is obtained by performing data pooling and data lake storage on the integrated forestry data; the pooling technique helps ensure data privacy and enables structured storage and on-demand access, while reducing the data processing workload and improving the compression rate and query efficiency of the data. A forestry grid model is obtained by performing remote sensing geographical partitioning and spatial grid mapping on the target forestry area by using the forestry data lake, so that the forestry area can be partitioned into grids and a grid model established from the remote sensing data of the target forestry area. The relationships and topological structure of the geographic space are thereby better incorporated, the interactions and dependencies between nodes are extracted, and the accuracy of forestry resource data prediction is improved.

By extracting the forestry resource feature sequence, features of the influencing factors and recorded data of forestry resources can be extracted along both the spatial and the temporal dimension, which improves the accuracy of forestry resource analysis. By performing decision analysis and plan adjustment on the forestry production plan according to the resource analysis model to obtain a standard forestry plan, the production conditions of forestry resources can be analyzed in combination with the space-time grid model and the production plan adjusted accordingly, which improves the utilization efficiency of forestry big data. Therefore, the intelligent forestry big data system, method, equipment and medium based on the data lake can solve the problem of low efficiency in forestry data analysis.

Drawings

FIG. 1 is a functional block diagram of a data lake-based intelligent forestry big data system according to an embodiment of the present invention;

FIG. 2 is a flow chart of extracting integrated forestry data provided by an embodiment of the present invention;

FIG. 3 is a flow chart of generating a standard structured forestry data set provided by an embodiment of the present invention;

FIG. 4 is a flow chart of a data lake-based intelligent forestry big data method according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of an electronic device for implementing a data lake-based smart forestry big data method according to an embodiment of the present invention;

The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.

Detailed Description

It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

The embodiment of the application provides an intelligent forestry big data system based on a data lake. The execution subject of the system includes, but is not limited to, at least one of the electronic devices, such as a server side or a terminal, that can be configured to execute the system provided by the embodiment of the application. In other words, the data lake-based intelligent forestry big data system can be executed by software or hardware installed on a terminal device or a server device. The server side includes, but is not limited to, a single server, a server cluster, a cloud server or a cloud server cluster. The server may be an independent server, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and big data and artificial intelligence platforms.

FIG. 1 is a functional block diagram of a data lake-based intelligent forestry big data system according to an embodiment of the present invention.

The intelligent forestry big data system 100 based on the data lake can be installed in electronic equipment. Depending on the functions implemented, the data lake-based intelligent forestry big data system 100 can include a data integration module 101, a data lake storage module 102, a grid mapping module 103, a resource analysis module 104, and a plan decision module 105. A module of the invention, which may also be referred to as a unit, is a series of computer program segments that are stored in the memory of the electronic device, can be executed by the processor of the electronic device, and perform a fixed function.

In the present embodiment, the functions concerning the respective modules/units are as follows:

the data integration module 101 is configured to obtain historical forestry data of a target forestry area, and perform data inspection and data cleaning on the historical forestry data to obtain integrated forestry data.

In the embodiment of the invention, to achieve unified management of forestry data, the data integration module 101 performs data examination and data cleaning on the acquired historical forestry data, so that the structure type of each piece of data is known and accurate data values are obtained, which facilitates the subsequent data pooling and storage step.

In detail, the target forestry region refers to a forestry region in which a forestry big data system needs to be constructed, the historical forestry data refers to historical data obtained by carrying out forestry monitoring on the target forestry region in the past, and the historical forestry data comprises forest resource investigation data, remote sensing image data, meteorological monitoring data, soil analysis data and the like of the target forestry region.

In the embodiment of the present invention, referring to fig. 2, when the data integration module 101 performs data inspection and data cleaning on the historical forestry data to obtain integrated forestry data, the data integration module is specifically configured to:

S21, performing structure examination on the historical forestry data to obtain a structured forestry data set, a semi-structured forestry data set and an unstructured forestry data set;

S22, performing data standardization and data deduplication operations on the structured forestry data set to obtain a primary structured forestry data set;

S23, updating the primary structured forestry data set into a standard structured forestry data set by using a preset outlier forestry data detection algorithm;

S24, performing data parsing, field extraction, and de-duplication and de-noising operations on the semi-structured forestry data set to obtain a standard semi-structured forestry data set;

S25, performing data verification and de-duplication and de-noising operations on the unstructured forestry data set to obtain a standard unstructured forestry data set;

S26, collecting the standard structured forestry data set, the standard semi-structured forestry data set and the standard unstructured forestry data set into integrated forestry data.

Specifically, the structured forestry data set, the semi-structured forestry data set and the unstructured forestry data set are data sets formed by collecting, respectively, the structured data, the semi-structured data and the unstructured data in the historical forestry data. An item of the structured forestry data set may be a primary soil analysis data table or a primary meteorological data record table for a certain location; an item of the unstructured forestry data set may be remote sensing image data; and an item of the semi-structured forestry data set may be survey records of the type, size and the like of the forest crops at a certain location. Structure examination refers to checking and classifying the data structure of each piece of forestry data in the historical forestry data. The data structure is structured, semi-structured or unstructured, where structured data may have a table structure, semi-structured data may be in XML (Extensible Markup Language), HTML (HyperText Markup Language) or a similar format, and unstructured data may be picture data.
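
As a non-limiting illustration, structure examination might be approximated by classifying each data file by its extension. The extension lists and the function name `examine_structure` below are assumptions made for this sketch; the source only names tables, XML/HTML and pictures as examples of the three categories.

```python
from pathlib import Path

# Hypothetical extension lists chosen for illustration only.
STRUCTURED = {".csv", ".xlsx"}
SEMI_STRUCTURED = {".xml", ".html", ".json"}


def examine_structure(paths):
    """Split file paths into structured, semi-structured and unstructured groups."""
    groups = {"structured": [], "semi_structured": [], "unstructured": []}
    for p in paths:
        ext = Path(p).suffix.lower()
        if ext in STRUCTURED:
            groups["structured"].append(p)
        elif ext in SEMI_STRUCTURED:
            groups["semi_structured"].append(p)
        else:
            # Anything else (for example remote sensing images) is treated as unstructured.
            groups["unstructured"].append(p)
    return groups
```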

In detail, data standardization refers to converting the formats and units of the data into a unified form, for example converting the formats of dates and times. De-duplication may be performed by a hash de-duplication method, and de-noising may be performed by median filtering or Gaussian filtering.
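
As a non-limiting illustration, date standardization, hash de-duplication and median filtering might be sketched as follows. The accepted date formats, the use of SHA-256, the window size and all function names are assumptions made for this sketch.

```python
import hashlib
from datetime import datetime
from statistics import median


def standardize_date(text):
    """Convert a few assumed date formats to ISO 8601 (YYYY-MM-DD)."""
    for fmt in ("%Y-%m-%d", "%Y/%m/%d", "%d.%m.%Y"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def deduplicate(records):
    """Drop exact duplicate records (dicts) by hashing a canonical text form."""
    seen, unique = set(), []
    for rec in records:
        canonical = repr(sorted(rec.items())).encode("utf-8")
        digest = hashlib.sha256(canonical).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(rec)
    return unique


def median_filter(values, window=3):
    """Denoise a numeric series with a sliding median; the window shrinks at the edges."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1]) for i in range(len(values))]
```

A single spike in an otherwise flat series, such as the 100 in `[1, 1, 100, 1, 1]`, is removed by the three-point median filter.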

Specifically, the data parsing may use xml.etree.ElementTree, the json module, or the like, and the field extraction may use regular expressions.
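
As a non-limiting illustration, parsing and field extraction with the Python standard library might look as follows. The record layout, the tag names and the `DBH: <number> cm` pattern are hypothetical examples that do not come from the source.

```python
import json
import re
import xml.etree.ElementTree as ET


def parse_xml_record(xml_text):
    """Return a dict of child tag -> text for a flat XML record."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


def parse_json_record(json_text):
    """Parse a JSON record into Python objects."""
    return json.loads(json_text)


# Hypothetical field pattern: a trunk diameter written as "DBH: 23.5 cm".
DBH_PATTERN = re.compile(r"DBH:\s*(\d+(?:\.\d+)?)\s*cm")


def extract_dbh(text):
    """Extract the diameter value from free text, or None if it is absent."""
    match = DBH_PATTERN.search(text)
    return float(match.group(1)) if match else None
```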

Specifically, referring to fig. 3, when the data integration module 101 updates the primary structured forestry data set to a standard structured forestry data set by using a preset outlier forestry data detection algorithm, the data integration module is specifically configured to:

S31, extracting features of the primary structured forestry data set to obtain a primary forestry data feature set;

S32, calculating a forestry data outlier set corresponding to the primary structured forestry data set according to the primary forestry data feature set;

S33, performing outlier threshold screening on the forestry data outlier set to obtain a standard data outlier set;

S34, collecting the primary structured forestry data corresponding to the standard data outlier set in the primary structured forestry data set into an outlier forestry data set;

S35, performing a data smoothing and filling operation on the outlier forestry data set in the primary structured forestry data set to obtain a standard structured forestry data set.

In detail, the outlier forestry data detection algorithm measures the degree to which each piece of data is an outlier according to the distribution characteristics of the data and the differences between data features, which makes it easier to identify outlier data in the forestry data. The primary forestry data feature set consists of the features obtained by applying simple feature coding to each piece of primary structured forestry data in the primary structured forestry data set; the feature coding may be normalization coding, softmax, or the like.

In detail, the forestry data outlier set corresponding to the primary structured forestry data set is calculated using an outlier forestry data detection algorithm as follows:

wherein (the formula and its symbols are not reproduced in this text) the quantities defined are: an individual forestry data outlier in the forestry data outlier set; the preset indices; two preset countermeasure coefficients; two separately indexed items of primary structured forestry data in the primary structured forestry data set; the total number of data in the primary structured forestry data set; the two correspondingly indexed primary forestry data features in the primary forestry data feature set; the transpose symbol; and the covariance function symbol.
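
As a non-limiting illustration, steps S32 to S35 might be sketched as follows. The detection formula itself is not reproduced in this text, so the score below is a stand-in and not the claimed algorithm: it is the mean squared distance from one feature vector to all others, scaled per dimension by the variance (a diagonal-covariance simplification of a covariance-weighted distance). The function names, the threshold value and the neighbour-mean filling rule are likewise assumptions.

```python
from statistics import pvariance


def outlier_scores(features):
    """Mean variance-scaled squared distance from each feature vector to all others."""
    n = len(features)
    dims = range(len(features[0]))
    # Diagonal covariance only: per-dimension variance, floored to avoid division by zero.
    var = [max(pvariance([f[d] for f in features]), 1e-12) for d in dims]
    scores = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i != j:
                total += sum((features[i][d] - features[j][d]) ** 2 / var[d] for d in dims)
        scores.append(total / (n - 1))
    return scores


def smooth_fill(values, scores, threshold):
    """Replace values scoring above the threshold with the mean of the nearest inliers."""
    inlier = [s <= threshold for s in scores]
    filled = list(values)
    for i, ok in enumerate(inlier):
        if ok:
            continue
        left = next((values[k] for k in range(i - 1, -1, -1) if inlier[k]), None)
        right = next((values[k] for k in range(i + 1, len(values)) if inlier[k]), None)
        neighbours = [v for v in (left, right) if v is not None]
        if neighbours:
            filled[i] = sum(neighbours) / len(neighbours)
    return filled
```

For a series such as `[10.0, 11.0, 10.5, 50.0, 10.2]`, the fourth value receives by far the highest score and is replaced by the mean of its two inlier neighbours.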

In the embodiment of the invention, the data integration module 101 is utilized to perform data inspection and data cleaning on the historical forestry data to obtain the integrated forestry data, so that the integration and unified management of the historical forestry data can be realized, and the subsequent data lake storage is convenient.

The data lake storage module 102 is configured to perform data pooling and data lake storage on the integrated forestry data to obtain a forestry data lake.

In the embodiment of the invention, the integrated forestry data are of many kinds and large in volume, so processing them is time-consuming and inefficient. The integrated forestry data therefore need to be stored in a data lake by the data lake storage module 102, and the on-demand access and efficient access logic of the data lake are used to improve the efficiency with which the integrated forestry data are used.

In detail, data pooling refers to splitting the integrated forestry data into a plurality of data pools. A data pool is a subset or logical organization unit of a data lake that stores data of a specific type or for a specific purpose; data pools are generally created according to business requirements or data management purposes, and can contain related data sets so that the data are easier to manage, access and analyze. The forestry data lake is the data lake that stores the integrated forestry data, and a data lake is a storage method that stores and manages big data by using a distributed storage system.

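In the embodiment of the present invention, when the data lake storage module 102 performs data pooling and data lake storage on the integrated forestry data to obtain a forestry data lake, the data lake storage module is specifically configured to:

extracting a data structure set and a data attribute set from the integrated forestry data;

splicing the data structure set and the data attribute set into a data area key set;

clustering and pooling the integrated forestry data according to the data area key set to obtain a forestry data pool group;

performing format matching on the forestry data pool group according to the data area key set to obtain a forestry storage format group;

and carrying out engine matching and engine storage on the forestry data pool group according to the forestry storage format group to obtain a forestry data lake.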

In detail, the data attribute set refers to the set of attribute feature words obtained by keyword matching on the introduction keywords and title keywords of each piece of data. Each data area key in the data area key set is a partition key feature of the data. The clustering and pooling may be performed by using a K-nearest-neighbor clustering algorithm.
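
As a non-limiting illustration, splicing a data area key and pooling by that key might be sketched as follows. This sketch groups records by exact key match, which is a deliberate simplification and not the K-nearest-neighbor clustering mentioned above; the record fields `structure` and `attribute`, the `structure:attribute` key layout and the function names are assumptions.

```python
from collections import defaultdict


def area_key(record):
    """Splice a record's structure type and attribute word into one data area key."""
    return f"{record['structure']}:{record['attribute']}"


def pool_by_key(records):
    """Group records into data pools indexed by their data area key."""
    pools = defaultdict(list)
    for rec in records:
        pools[area_key(rec)].append(rec)
    return dict(pools)
```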

Specifically, the forestry storage format group consists of the preset storage formats for the different data area keys, including Parquet, ORC, Avro and the like, which offer advantages such as a high compression rate and good query performance. Format matching refers to selecting the forestry storage format corresponding to a data area key, for example Parquet for structured soil data. Engine matching refers to storing forestry data pools of different forestry storage formats with different storage engines, for example using the Amazon S3 engine for the ORC format. The storage engines include Amazon S3, Azure Data Lake Storage, Hadoop HDFS and the like.
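
As a non-limiting illustration, format matching and engine matching might be expressed as two lookup tables. Only the pairing of Parquet with structured soil data and of Amazon S3 with ORC comes from the text; every other pairing, the `structure:attribute` key layout, the default format and the function name `match_storage` are hypothetical placeholders.

```python
# Parquet for structured soil data and Amazon S3 for ORC follow the description;
# all other entries are hypothetical placeholders for illustration.
FORMAT_BY_KEY = {
    "structured:soil": "Parquet",
    "structured:weather": "ORC",
    "semi_structured:survey": "Avro",
}
ENGINE_BY_FORMAT = {
    "ORC": "Amazon S3",
    "Parquet": "Hadoop HDFS",
    "Avro": "Azure Data Lake Storage",
}


def match_storage(key, default_format="Avro"):
    """Return the (storage format, storage engine) pair chosen for a data area key."""
    fmt = FORMAT_BY_KEY.get(key, default_format)
    return fmt, ENGINE_BY_FORMAT[fmt]
```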

In the embodiment of the invention, the data lake storage module 102 performs data pooling and data lake storage on the integrated forestry data to obtain the forestry data lake. The pooling technique helps ensure data privacy and enables structured storage and on-demand access, while reducing the data processing workload and improving the compression rate and query efficiency of the data.

The grid mapping module 103 is configured to perform remote sensing geographical partition and space grid mapping operation on the target forestry area by using the forestry data lake, so as to obtain a forestry grid model.

In the embodiment of the invention, forestry resource data is strongly influenced by weather, soil and similar factors, and a simple statistical model cannot analyze forestry resources by combining temporal and spatial characteristics; the characteristics of the forestry resources therefore need to be analyzed using a neural network model.

In detail, the forestry grid model may be a graph neural network model or a spatio-temporal convolutional neural network model.

In the embodiment of the present invention, when the grid mapping module 103 performs remote sensing geographical partition and spatial grid mapping operation on the target forestry area by using the forestry data lake, the grid mapping module is specifically configured to:

Extracting a forestry remote sensing map sequence from the forestry data lake, and performing geographic coordinate transformation on the forestry remote sensing map sequence to obtain a geographic remote sensing map sequence;

respectively carrying out image denoising and image size alignment operation on the geographic remote sensing image sequence to obtain a standard remote sensing image sequence;

sampling the standard remote sensing image sequence to obtain a sampled remote sensing image sequence;

performing primary block partitioning on the sampled remote sensing image sequence to obtain a remote sensing block group sequence;

carrying out image feature convolution and feature cluster fusion on the remote sensing block group sequence to obtain a standard remote sensing block group sequence;

and extracting a partition structure from the standard remote sensing block group sequence, and establishing a forestry grid model according to the partition structure.

Specifically, the forestry remote sensing image sequence is the sequence of images, recorded in the forestry data lake, that were acquired by remote sensing of the target forestry region over a past time period. The geographic coordinate conversion matches and converts the image pixels of each forestry remote sensing image in the sequence to actual ground coordinates.

In detail, the image denoising may be implemented using a median filtering algorithm, and the size alignment refers to cropping each forestry remote sensing image to a uniform size. Because the remote sensing supersampling algorithm takes the local pixels of the image into account, it better preserves fine image detail and improves the antialiasing effect of the image.
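The denoising and size alignment steps can be sketched as follows. This is an illustration under stated assumptions rather than the embodiment's implementation: images are plain 2D lists of gray values, the window is 3x3 by default, border pixels use only the in-bounds part of the window, and alignment is a centered crop. The function names `median_filter` and `center_crop` are hypothetical.

```python
from statistics import median


def median_filter(image, radius=1):
    """Median filter on a 2D list of gray values.

    Border pixels use only the part of the window inside the image.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            out[y][x] = median(window)
    return out


def center_crop(image, height, width):
    """Size alignment by cropping the image to (height, width) around its center."""
    src_h, src_w = len(image), len(image[0])
    if height > src_h or width > src_w:
        raise ValueError("target size exceeds image size")
    top = (src_h - height) // 2
    left = (src_w - width) // 2
    return [row[left:left + width] for row in image[top:top + height]]
```

A median filter suits isolated outlier pixels (salt-and-pepper noise) because a single extreme value never becomes the median of its neighborhood.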

In detail, the standard remote sensing image sequence is sampled using the following remote sensing supersampling algorithm:

[Equation: remote sensing supersampling formula; the formula and its symbols are not recoverable from this text]

where the terms of the formula denote, respectively: the gray value of the pixel at the given pixel coordinates in an indexed sampled remote sensing picture of the sampled remote sensing image sequence; the gray value of the pixel at the given pixel coordinates in an indexed standard remote sensing picture of the standard remote sensing image sequence; the horizontal pixel interpolation coefficient of the pixel at those coordinates; the longitudinal pixel interpolation coefficient of the pixel at those coordinates; and a preset interpolation weight.

Specifically, the primary block partitioning splits each sampled remote sensing picture in the sampled remote sensing image sequence into a plurality of remote sensing blocks of a preset size and collects all of those blocks into a remote sensing block group. The feature cluster fusion computes the similarity between adjacent image features obtained by the image feature convolution, and then applies threshold clustering to the average similarity between block positions across the remote sensing block group sequence.
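The block partitioning and threshold clustering can be illustrated with a small sketch. Several simplifications are assumed here and are not taken from the source: the mean gray value of a block stands in for the convolutional image feature, similarity is defined as one minus the normalized absolute difference, and merging uses a union-find structure. All function names are hypothetical.

```python
def split_blocks(image, size):
    """Split a 2D gray image into a grid of size x size blocks (ragged edges dropped)."""
    rows, cols = len(image) // size, len(image[0]) // size
    return [
        [[row[c * size:(c + 1) * size] for row in image[r * size:(r + 1) * size]]
         for c in range(cols)]
        for r in range(rows)
    ]


def block_feature(block):
    """Stand-in feature: mean gray value (the source uses convolutional features)."""
    values = [v for row in block for v in row]
    return sum(values) / len(values)


def cluster_blocks(images, size, threshold=0.9):
    """Merge adjacent block positions whose average similarity over the
    image sequence reaches the threshold; returns a grid of cluster labels."""
    grids = [split_blocks(img, size) for img in images]
    rows, cols = len(grids[0]), len(grids[0][0])
    feats = [[[block_feature(g[r][c]) for c in range(cols)] for r in range(rows)]
             for g in grids]
    parent = list(range(rows * cols))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def average_similarity(a, b):
        sims = [1 - abs(f[a[0]][a[1]] - f[b[0]][b[1]]) / 255 for f in feats]
        return sum(sims) / len(sims)

    for r in range(rows):
        for c in range(cols):
            for nr, nc in ((r, c + 1), (r + 1, c)):  # right and lower neighbors
                if nr < rows and nc < cols and \
                        average_similarity((r, c), (nr, nc)) >= threshold:
                    parent[find(nr * cols + nc)] = find(r * cols + c)
    return [[find(r * cols + c) for c in range(cols)] for r in range(rows)]
```

Averaging the similarity over the whole sequence before thresholding means two block positions are merged only if they look alike consistently over time, not just in one picture.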

Specifically, the partition structure refers to the split structure of the blocks, that is, the shape of the regional partitions formed by the blocks after clustering and fusion. Establishing the forestry grid model according to the partition structure means building the grid node structure of the forestry grid model from the partition structure and adding node edges between adjacent nodes to obtain the forestry grid model.
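A minimal sketch of deriving grid nodes and edges from a partition structure is given below. It assumes the partition is supplied as a 2D grid of region labels (one label per block position) and treats two regions as adjacent when they share a block boundary; both choices, and the name `build_grid_graph`, are assumptions for illustration.

```python
def build_grid_graph(labels):
    """Return (nodes, edges) for a 2D grid of region labels.

    Each distinct label becomes a node; an undirected edge joins two
    regions that touch horizontally or vertically.
    """
    rows, cols = len(labels), len(labels[0])
    nodes = sorted({label for row in labels for label in row})
    edges = set()
    for r in range(rows):
        for c in range(cols):
            for nr, nc in ((r, c + 1), (r + 1, c)):  # right and lower neighbors
                if nr < rows and nc < cols and labels[r][c] != labels[nr][nc]:
                    a, b = labels[r][c], labels[nr][nc]
                    edges.add((min(a, b), max(a, b)))
    return nodes, sorted(edges)
```

The resulting node and edge lists are the kind of input a graph neural network library would typically expect as its graph topology.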

In the embodiment of the invention, performing remote sensing geographical partition and space grid mapping on the target forestry region using the forestry data lake yields the forestry grid model. The grid partition and grid model of the forestry region are established from the remote sensing data of the target forestry region, so that geospatial relationships and topological structure are better incorporated, the interactions and dependencies between nodes are extracted, and the accuracy of forestry resource data prediction is improved.

The resource analysis module 104 is configured to extract spatial time sequence forestry resource features of the forestry data lake according to the forestry grid model, obtain a forestry resource feature sequence, and train the forestry grid model into a resource analysis model by using the forestry resource feature sequence.

In the embodiment of the invention, analyzing the forestry resources in the target forestry area requires a multi-faceted resource analysis of the recorded forestry resource data, combined with the time sequence change patterns of the remote sensing images, the forestry weather records and the forestry soil analysis data.

In the embodiment of the present invention, when the resource analysis module 104 performs spatial time sequence forestry resource feature extraction on the forestry data lake according to the forestry grid model to obtain a forestry resource feature sequence, the resource analysis module is specifically configured to:

performing data feature mapping and feature fusion operation on the meteorological data sequence and the soil data sequence to obtain a meteorological soil feature sequence;

Specifically, the meteorological data sequence is a meteorological monitoring data sequence of the target forestry area arranged according to time sequence, the soil data sequence is a soil detection data sequence of the target forestry area arranged according to time sequence, and the resource data sequence is a data sequence recorded according to time sequence and used for detecting growth, types and the like of forest crops in each region of the target forestry area.

In detail, the feature fusion refers to concatenating the meteorological feature sequence corresponding to the meteorological data sequence with the soil feature sequence corresponding to the soil data sequence, aligned by time order and by corresponding geographic position. The environmental time sequence feature group sequence and the resource time sequence feature group sequence may be extracted using the gating structure of the time sequence neural network.
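The concatenation-based feature fusion can be sketched as follows. The representation is assumed for illustration: each feature sequence is a dictionary keyed by `(time_step, region_id)`, and only keys present in both sequences are fused. `fuse_features` is a hypothetical name.

```python
def fuse_features(weather, soil):
    """Concatenate weather and soil feature vectors that share the same
    (time_step, region_id) key; output is ordered by time, then region."""
    shared_keys = sorted(set(weather) & set(soil))
    return [(key, list(weather[key]) + list(soil[key])) for key in shared_keys]
```

For example, a weather vector `[1.0, 2.0]` and a soil vector `[0.7]` recorded for the same time step and region fuse into `[1.0, 2.0, 0.7]`.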

In detail, the resource analysis module 104 is specifically configured to, when training the forestry grid model into a resource analysis model using the forestry resource feature sequence:

calculating a grid loss value between a resource time sequence feature group sequence in the forestry resource feature sequence and the analysis resource feature group sequence;

Specifically, the grid mapping refers to mapping each environmental time sequence feature in the environmental time sequence feature group sequence to the corresponding grid of the forestry grid model according to grid structure position and time sequence relation. The grid loss value algorithm measures the difference between the resource time sequence feature group sequence and the analysis resource feature group sequence globally, through both the numerical difference and the distribution difference of the features, which improves model training efficiency.

In detail, the time sequence convolution refers to using the time sequence neural network in the mapping grid model to derive the analysis time sequence feature group sequence corresponding to the environment time sequence feature group sequence. The analysis time sequence feature group sequence is the sequence formed by the next-moment environment time sequence feature groups obtained by predictive analysis of each environment time sequence feature group in the environment time sequence feature group sequence. The embedding activation refers to performing resource feature analysis on the analysis time sequence feature group sequence to obtain the analysis resource feature group sequence.

Specifically, the input of the resource analysis model is an environmental time sequence feature set composed of forestry soil data and forestry meteorological data of each block in the target forestry region, and the output is an analysis resource feature set of a future time period.

Specifically, the grid loss value is calculated using the grid loss value algorithm as follows:

[Equation: grid loss value formula; the formula and its symbols are not recoverable from this text]

where the terms of the formula denote, respectively: the grid loss value; the serial numbers (index variables); the total number of features of each resource time sequence feature group in the resource time sequence feature group sequence; the sequence length of the resource time sequence feature group sequence; an indexed resource time sequence feature within an indexed resource time sequence feature group; an indexed analysis resource feature within an indexed analysis resource feature group; a preset constant; a preset loss weight; the Laplacian symbol; the dot product symbol; the cross product symbol; and the absolute value symbol.

In the embodiment of the invention, the characteristic sequence of the forestry resources is extracted by utilizing the resource analysis module 104, so that the characteristic extraction can be carried out on the influence factors and the record data of the forestry resources from two dimensions of space and time, and the accuracy of the analysis of the forestry resources is further improved.

The plan decision module 105 is configured to obtain a preset forestry production plan, and perform decision analysis and plan adjustment on the forestry production plan by using the resource analysis model to obtain a standard forestry plan.

In the embodiment of the invention, in order to realize effective utilization of forestry big data, the forestry resource distribution state of a future target forestry area can be analyzed by utilizing a resource analysis model, so that decision adjustment is carried out on a forestry production plan.

In detail, the forestry production plan refers to an economic utilization plan of forestry resources, i.e., the number of various forestry resources produced and the expected economic yield.

In the embodiment of the present invention, when the plan decision module 105 performs decision analysis and plan adjustment on the forestry production plan by using the resource analysis model to obtain a standard forestry plan, the plan decision module is specifically configured to:

In detail, the planned production cycle refers to the production cycle covered by the forestry production plan, and the resource yield group refers to the yield of various forestry resources required in the forestry production plan, for example, 1000 birches having a diameter of 50 cm are required.

Specifically, calculating the analysis resource feature set corresponding to the planned production period means predicting the environmental time sequence feature set at the next moment from the environmental time sequence feature set sequence in the historical forestry data, iterating step by step until an environmental time sequence feature set sequence covering the production period is obtained, and then predicting the analysis resource feature set from that sequence.
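The step-by-step iteration described above is an autoregressive rollout, which can be sketched as follows. The predictor is passed in as a function; `mean_of_last_two` is only a stand-in so the sketch runs, and is not the time sequence neural network of the embodiment. Keeping a fixed-length sliding window is likewise an assumption.

```python
def rollout(history, predict_next, steps):
    """Predict `steps` future feature groups one at a time, feeding each
    prediction back into a sliding window of the same length as `history`."""
    window = list(history)
    future = []
    for _ in range(steps):
        next_group = predict_next(window)
        future.append(next_group)
        window = window[1:] + [next_group]
    return future


def mean_of_last_two(window):
    """Stand-in predictor (assumption): element-wise mean of the last two groups."""
    previous, latest = window[-2], window[-1]
    return [(p + q) / 2 for p, q in zip(previous, latest)]
```

Because each prediction becomes input for the next step, prediction errors can accumulate over long production periods, which is a general property of this kind of rollout.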

In detail, the data mapping step is the reverse of the step in the resource analysis module 104 that extracts the resource time sequence feature group sequence from the resource data group sequence. The statistical classification refers to aggregating statistics by forestry resource type, and the analysis resource yield group is the predicted future yield data for each type of forestry resource, expressed in the same classification format as the planning resource yield group.
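The statistical classification can be illustrated as a sum of per-partition predicted yields grouped by resource type. The input shape (a list of `{resource_type: amount}` dictionaries, one per partition) and the name `classify_yields` are assumptions for this sketch.

```python
from collections import defaultdict


def classify_yields(partition_predictions):
    """Sum predicted yields over all partitions, grouped by resource type."""
    totals = defaultdict(float)
    for partition in partition_predictions:
        for resource_type, amount in partition.items():
            totals[resource_type] += amount
    return dict(totals)
```

The result has one entry per resource type, which lets it be compared directly against a plan expressed in the same per-type format.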

Specifically, the analysis forestry resource data set refers to the predicted forestry resource data of each partition in the target forestry area, including the yields of the various forestry resources in each partition. The matching decision refers to judging whether the analysis resource yield group can meet the numerical requirements of the planning resource yield group: if so, the resource decision result is a pass; if not, the resource decision result is the difference for each type of resource. The plan adjustment refers to increasing or decreasing the corresponding types of forestry resources according to that difference.
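A minimal sketch of the matching decision and plan adjustment follows. The source states only that resources are increased or decreased according to the difference; lowering each planned amount by its shortfall, as done here, is one assumed interpretation. The names `matching_decision` and `adjust_plan` are hypothetical.

```python
def matching_decision(planned, analyzed):
    """Compare planned yields with predicted yields per resource type.

    Returns ("pass", {}) when every planned amount is met, otherwise
    ("adjust", shortfall) with the missing amount per resource type.
    """
    shortfall = {
        resource: need - analyzed.get(resource, 0)
        for resource, need in planned.items()
        if analyzed.get(resource, 0) < need
    }
    if not shortfall:
        return "pass", {}
    return "adjust", shortfall


def adjust_plan(planned, shortfall):
    """Assumed adjustment rule: reduce each planned amount by its shortfall."""
    return {
        resource: amount - shortfall.get(resource, 0)
        for resource, amount in planned.items()
    }
```

For instance, a plan of 1000 birches against a predicted yield of 800 produces a shortfall of 200, and under the assumed rule the adjusted plan asks for 800.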

In the embodiment of the invention, the plan decision module 105 and the resource analysis model are utilized to carry out decision analysis and plan adjustment on the forestry production plan to obtain the standard forestry plan, and the space-time grid model can be combined to analyze the production condition of forestry resources, so that the production plan is correspondingly adjusted, and the utilization efficiency of forestry big data is improved.

Extracting the forestry resource feature sequence allows features of the influencing factors and recorded data of forestry resources to be extracted along both the spatial and the temporal dimension, which improves the accuracy of forestry resource analysis. Performing decision analysis and plan adjustment on the forestry production plan according to the resource analysis model yields a standard forestry plan; the production condition of forestry resources can be analyzed with the space-time grid model and the production plan adjusted accordingly, which improves the utilization efficiency of forestry big data. Therefore, the intelligent forestry big data system based on the data lake can solve the problem of low efficiency in forestry data analysis.

Referring to fig. 4, a flow chart of a data lake-based intelligent forestry big data method according to an embodiment of the present invention is shown. In this embodiment, the data lake-based intelligent forestry big data method includes:

S1, acquiring historical forestry data of a target forestry area, and performing data examination and data cleaning on the historical forestry data to obtain integrated forestry data.

S2, carrying out data pooling and data lake storage on the integrated forestry data to obtain a forestry data lake.

S3, performing remote sensing geographical partitioning and space grid mapping operation on the target forestry region by utilizing the forestry data lake to obtain a forestry grid model, wherein the performing remote sensing geographical partitioning and space grid mapping operation on the target forestry region by utilizing the forestry data lake to obtain the forestry grid model comprises the following steps: extracting a forestry remote sensing map sequence from the forestry data lake, and performing geographic coordinate transformation on the forestry remote sensing map sequence to obtain a geographic remote sensing map sequence; respectively carrying out image denoising and image size alignment operation on the geographic remote sensing image sequence to obtain a standard remote sensing image sequence; sampling the standard remote sensing image sequence by using the following remote sensing supersampling algorithm to obtain a sampled remote sensing image sequence:

[Equation: remote sensing supersampling formula; the formula and its symbols are not recoverable from this text]

where the terms of the formula denote, respectively: the gray value of the pixel at the given pixel coordinates in an indexed sampled remote sensing picture of the sampled remote sensing image sequence; the gray value of the pixel at the given pixel coordinates in an indexed standard remote sensing picture of the standard remote sensing image sequence; the horizontal pixel interpolation coefficient of the pixel at those coordinates; the longitudinal pixel interpolation coefficient of the pixel at those coordinates; and a preset interpolation weight; performing primary block partitioning on the sampled remote sensing image sequence to obtain a remote sensing block group sequence; carrying out image feature convolution and feature cluster fusion on the remote sensing block group sequence to obtain a standard remote sensing block group sequence; and extracting a partition structure from the standard remote sensing block group sequence, and establishing a forestry grid model according to the partition structure.

S4, carrying out space time sequence forestry resource feature extraction on the forestry data lake according to the forestry grid model to obtain a forestry resource feature sequence, and training the forestry grid model into a resource analysis model by utilizing the forestry resource feature sequence.

S5, acquiring a preset forestry production plan, and performing decision analysis and plan adjustment on the forestry production plan by using the resource analysis model to obtain a standard forestry plan.

In detail, the data lake-based intelligent forestry big data method in the embodiment of the present invention adopts the same technical means as the data lake-based intelligent forestry big data system shown in Fig. 1 and can produce the same technical effects, so the details are not repeated here.

Fig. 5 is a schematic structural diagram of an electronic device for implementing the data lake-based intelligent forestry big data method according to an embodiment of the present invention.

The electronic device 1 may comprise a processor 10, a memory 11, a communication bus 12 and a communication interface 13, and may further comprise a computer program stored in the memory 11 and executable on the processor 10, such as a smart forestry big data program based on a data lake.

In some embodiments the processor 10 may be formed by integrated circuits, for example a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (Central Processing Unit, CPU), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and so on. The processor 10 is the control unit (Control Unit) of the electronic device: it connects the components of the entire electronic device through various interfaces and lines, and executes the functions of the electronic device and processes data by running or executing programs or modules stored in the memory 11 (for example, the smart forestry big data program based on the data lake) and calling data stored in the memory 11.

The memory 11 includes at least one type of readable storage medium including flash memory, a removable hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device, such as a mobile hard disk of the electronic device. The memory 11 may in other embodiments also be an external storage device of the electronic device, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device. The memory 11 may be used not only for storing application software installed in an electronic device and various types of data, such as codes of smart forestry big data programs based on data lakes, etc., but also for temporarily storing data that has been output or is to be output.

The communication bus 12 may be a peripheral component interconnect standard (peripheral component interconnect, PCI) bus, or an extended industry standard architecture (extended industry standard architecture, EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. The bus is arranged to enable a connection communication between the memory 11 and at least one processor 10 etc.

The communication interface 13 is used for communication between the electronic device and other devices, and includes a network interface and a user interface. Optionally, the network interface may include a wired interface and/or a wireless interface (e.g., a Wi-Fi interface, a Bluetooth interface, etc.), typically used to establish a communication connection between the electronic device and other electronic devices. The user interface may be a display (Display) or an input unit such as a keyboard (Keyboard); optionally, it may also be a standard wired interface or a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or display unit, and is used for displaying information processed in the electronic device and for displaying a visual user interface.

The figure shows only an electronic device with certain components. Those skilled in the art will understand that the structure shown in the figure does not limit the electronic device, which may include fewer or more components than shown, combine certain components, or use a different arrangement of components.

For example, although not shown, the electronic device may further include a power source (such as a battery) for supplying power to the respective components. Preferably, the power source is logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management and power consumption management are implemented through the power management device. The power source may also include one or more direct current or alternating current power supplies, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and the like. The electronic device may further include various sensors, Bluetooth modules, Wi-Fi modules, etc., which are not described herein.

It should be understood that the embodiments described are for illustrative purposes only, and the scope of the patent application is not limited to this configuration.

The smart forestry big data program based on data lakes stored by the memory 11 in the electronic device 1 is a combination of instructions that, when executed in the processor 10, can implement:

[Equation: remote sensing supersampling formula; the formula and its symbols are not recoverable from this text]

where the terms of the formula denote, respectively: the gray value of the pixel at the given pixel coordinates in an indexed sampled remote sensing picture of the sampled remote sensing image sequence; the gray value of the pixel at the given pixel coordinates in an indexed standard remote sensing picture of the standard remote sensing image sequence; the horizontal pixel interpolation coefficient of the pixel at those coordinates; the longitudinal pixel interpolation coefficient of the pixel at those coordinates; and a preset interpolation weight; performing primary block partitioning on the sampled remote sensing image sequence to obtain a remote sensing block group sequence; carrying out image feature convolution and feature cluster fusion on the remote sensing block group sequence to obtain a standard remote sensing block group sequence; extracting a partition structure from the standard remote sensing block group sequence, and establishing a forestry grid model according to the partition structure;

In particular, the specific implementation method of the above instructions by the processor 10 may refer to the description of the relevant steps in the corresponding embodiment of the drawings, which is not repeated herein.

Further, the modules/units integrated in the electronic device 1 may be stored in a computer readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. The computer readable storage medium may be volatile or nonvolatile. For example, the computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a read-only memory (Read-Only Memory, ROM).

The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor of an electronic device, can implement:

where the terms of the remote sensing supersampling formula denote, respectively: the gray value of the pixel at the given pixel coordinates in an indexed sampled remote sensing picture of the sampled remote sensing image sequence; the gray value of the pixel at the given pixel coordinates in an indexed standard remote sensing picture of the standard remote sensing image sequence; the horizontal pixel interpolation coefficient of the pixel at those coordinates; the longitudinal pixel interpolation coefficient of the pixel at those coordinates; and a preset interpolation weight; performing primary block partitioning on the sampled remote sensing image sequence to obtain a remote sensing block group sequence; carrying out image feature convolution and feature cluster fusion on the remote sensing block group sequence to obtain a standard remote sensing block group sequence; extracting a partition structure from the standard remote sensing block group sequence, and establishing a forestry grid model according to the partition structure;

In the several embodiments provided by the present invention, it should be understood that the disclosed device, system, and method may be implemented in other manners. For example, the system embodiments described above are merely illustrative: the division into modules is only a division by logical function, and other divisions may be used in an actual implementation.

The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional modules in the embodiments of the present invention may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in the form of hardware, or in the form of hardware plus software functional modules.

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.

The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.

The embodiment of the application can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique, and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend, and expand human intelligence, sense the environment, acquire knowledge, and use knowledge to obtain optimal results.

Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. Multiple units or systems set forth in the system embodiments may also be implemented by one unit or system in software or hardware. The terms first, second, etc. are used to denote a name, but not any particular order.

Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims

1. The intelligent forestry big data system based on the data lake is characterized by comprising a data integration module, a data lake storage module, a grid mapping module, a resource analysis module and a planning decision module, wherein:

where the terms of the remote sensing supersampling formula denote, respectively: the gray value of the pixel at the given pixel coordinates in an indexed sampled remote sensing picture of the sampled remote sensing image sequence; the gray value of the pixel at the given pixel coordinates in an indexed standard remote sensing picture of the standard remote sensing image sequence; the horizontal pixel interpolation coefficient of the pixel at those coordinates; the longitudinal pixel interpolation coefficient of the pixel at those coordinates; and a preset interpolation weight; performing primary block partitioning on the sampled remote sensing image sequence to obtain a remote sensing block group sequence; carrying out image feature convolution and feature cluster fusion on the remote sensing block group sequence to obtain a standard remote sensing block group sequence; extracting a partition structure from the standard remote sensing block group sequence, and establishing a forestry grid model according to the partition structure;

2. The data lake-based intelligent forestry big data system of claim 1, wherein the data integration module is configured to, when performing data review and data cleaning on the historical forestry data to obtain integrated forestry data:

3. The data lake-based intelligent forestry big data system of claim 2, wherein the data integration module, when updating the primary structured forestry data set to a standard structured forestry data set, is specifically configured to:

where the terms of the formula denote, respectively: an indexed forestry data outlier in the forestry data outlier set; two preset indices; two preset countermeasure coefficients; a first indexed primary structured forestry data item in the primary structured forestry data set; a second indexed primary structured forestry data item in the primary structured forestry data set; the total number of data items in the primary structured forestry data set; a first indexed primary forestry data characteristic; a second indexed primary forestry data characteristic; the transpose symbol; and the covariance function symbol;

4. The data lake-based intelligent forestry big data system of claim 1, wherein the data lake storage module, when performing data pooling and data lake storage on the integrated forestry data to obtain a forestry data lake, is specifically configured to:

5. The data lake-based intelligent forestry big data system of claim 1, wherein the resource analysis module, when performing spatial time-series forestry resource feature extraction on the forestry data lake according to the forestry grid model to obtain a forestry resource feature sequence, is specifically configured to:

6. The data lake-based intelligent forestry big data system of claim 5, wherein the resource analysis module, when training the forestry grid model into a resource analysis model using the forestry resource feature sequence, is specifically configured to:

wherein the symbols of the formula denote, in order of appearance: the grid loss value; the serial numbers used as indices; the total number of features in each resource time-series feature group of the resource time-series feature group sequence; the sequence length of the resource time-series feature group sequence; the indexed resource time-series feature in the indexed resource time-series feature group; the indexed analysis resource feature in the indexed analysis resource feature group; a preset constant; a preset loss weight; the Laplacian symbol; the dot product symbol; the cross product symbol; and the absolute value symbol;

7. The data lake-based intelligent forestry big data system of claim 1, wherein the plan decision module, when using the resource analysis model to perform decision analysis and plan adjustment on the forestry production plan to obtain a standard forestry plan, is specifically configured to:

8. A data lake-based intelligent forestry big data method, the method comprising:

wherein the symbols of the formula denote, in order of appearance: the gray value of the pixel at the given pixel coordinates in the indexed sampled remote sensing picture of the sampled remote sensing map sequence; the gray value of the pixel at the given pixel coordinates in the indexed standard remote sensing picture of the standard remote sensing map sequence; the horizontal pixel interpolation coefficient corresponding to the pixel at those coordinates; the longitudinal pixel interpolation coefficient corresponding to the pixel at those coordinates; and the preset interpolation weight; performing primary block partitioning on the sampled remote sensing map sequence to obtain a remote sensing block group sequence; performing image feature convolution and feature cluster fusion on the remote sensing block group sequence to obtain a standard remote sensing block group sequence; extracting a partition structure from the standard remote sensing block group sequence, and establishing a forestry grid model according to the partition structure;

9. An electronic device, the electronic device comprising:

at least one processor; and,

a memory communicatively coupled to the at least one processor; wherein,

the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the data lake-based intelligent forestry big data method as recited in claim 8.

10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the data lake-based intelligent forestry big data method as claimed in claim 8.