CN117828075A - Agricultural condition data classification method, agricultural condition data classification device and storage medium

Publication number
CN117828075A
Authority
CN
China
Prior art keywords
agricultural
text
data
features
condition data
Prior art date
Legal status
Pending
Application number
CN202311722817.5A
Other languages
Chinese (zh)
Inventor
顾静秋
赵春江
吴华瑞
朱华吉
郭威
缪祎晟
Current Assignee
Research Center of Information Technology of Beijing Academy of Agriculture and Forestry Sciences
Original Assignee
Research Center of Information Technology of Beijing Academy of Agriculture and Forestry Sciences
Priority date
Filing date
Publication date
Application filed by Research Center of Information Technology of Beijing Academy of Agriculture and Forestry Sciences
Priority to CN202311722817.5A
Publication of CN117828075A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/02 Agriculture; Fishing; Forestry; Mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Business, Economics & Management (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Tourism & Hospitality (AREA)
  • Marine Sciences & Fisheries (AREA)
  • Primary Health Care (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Mining & Mineral Resources (AREA)
  • Marketing (AREA)
  • Agronomy & Crop Science (AREA)
  • Economics (AREA)
  • Animal Husbandry (AREA)
  • General Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present application provide a method, a device and a storage medium for classifying agricultural condition data. The method comprises: classifying agricultural text data based on dimension labels in the agricultural text data to obtain agricultural condition data; acquiring text semantic features and text topic features of the agricultural condition data, and determining agricultural condition text features; and classifying the agricultural condition data based on the agricultural condition text features, the dimension labels and attention weights to determine classification results. With this method, device and storage medium, the agricultural text data is first coarsely classified according to the dimension labels and then finely classified according to the agricultural condition text features and the attention weights, which reduces interference from noisy data, saves computing resources and time, improves processing efficiency, and in turn improves the accuracy of text classification.

Description

Agricultural condition data classification method, agricultural condition data classification device and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a method and apparatus for classifying agricultural condition data, and a storage medium.
Background
Currently, the generation and dissemination of agricultural condition data are driven jointly by the Internet and self-media. The volume of textual agricultural condition data keeps growing; it includes weather reports, geological and pest disaster conditions, soil quality analyses, farmer observation reports, market price data and the like, generated by platforms such as social media, agricultural blogs and applications. Analyzing and mining textual agricultural condition data is important for improving agricultural production efficiency, agricultural product quality, risk management and decision making, and helps promote the sustainable development of rural communities and innovation in agriculture. Traditional acquisition of agricultural condition information relies on extensive manual investigation and expert analysis, and suffers from high cost, poor timeliness and strong subjectivity.
Existing methods for classifying agricultural text information and issuing early warnings rely on deep learning for feature extraction and modeling over large-scale agricultural text, combined with topic analysis techniques. However, techniques aimed at a specific field or a specific data modality, and existing topic feature fusion techniques, cannot make full use of the massive agricultural condition text data now available or compile and fuse its original features, so the data mining utilization rate is low and the value of the data is difficult to realize.
Disclosure of Invention
In view of the above technical problems, embodiments of the present application provide a method, a device and a storage medium for classifying agricultural condition data.
In a first aspect, an embodiment of the present application provides a method for classifying agricultural condition data, including:
classifying the agricultural text data based on dimension labels in the agricultural text data to obtain agricultural condition data;
acquiring text semantic features and text topic features of the agricultural condition data, and determining agricultural condition text features based on the text semantic features and the text topic features;
and classifying the agricultural condition data based on the agricultural condition text features, the dimension labels and the attention weights to determine classification results, wherein the attention weights are determined based on the importance degree of each sub-word in the agricultural text.
In some embodiments, classifying the agricultural text data based on the dimension tag in the agricultural text data to obtain agricultural condition data includes:
performing word segmentation processing on the agricultural text data to obtain sub words of the agricultural text data;
marking the subwords of the agricultural text data according to the time, the place and the theme to obtain the subwords containing dimension labels;
inputting the subword containing the dimension label into a pre-training model, and obtaining agricultural condition data output by the pre-training model.
In some embodiments, the inputting the subword including the dimension tag into a pre-training model, and obtaining the agricultural condition data output by the pre-training model includes:
converting the sub-words containing the dimension labels into word embedding information;
acquiring a pre-training feature vector based on the word embedding information;
inputting the pre-training feature vector into a bidirectional long short-term memory (BiLSTM) model, and obtaining a hidden state sequence output by the BiLSTM model;
and classifying the agricultural text data based on the hidden state sequence to determine agricultural condition data.
In some embodiments, the acquiring text semantic features and text topic features of the agricultural context data and determining agricultural context text features based on the text semantic features and the text topic features includes:
based on a pre-training model and dimension labels, acquiring text semantic features of the agricultural condition data, and acquiring text topic features of the agricultural condition data based on a topic model;
and splicing the text semantic features and the text topic features to obtain the agricultural condition text features.
In some embodiments, the classifying the agricultural context data based on the agricultural context text feature, the dimension tag, and the attention weight to determine a classification result includes:
determining the attention weight based on term frequency and inverse document frequency (TF-IDF);
weighting and fusing the agricultural condition text features with the attention weight to obtain a fused feature vector;
and classifying the agricultural condition data based on the dimension labels and the fused feature vector to determine classification results.
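As a rough illustration of deriving attention weights from term frequency and inverse document frequency, the following sketch scores each sub-word of one document and normalizes the scores so they sum to 1. The smoothed IDF formula, the function name `tfidf_weights` and the toy corpus are assumptions made for this example; the application does not specify the exact formula.

```python
import math
from collections import Counter

def tfidf_weights(doc_tokens, corpus):
    """Return one normalized TF-IDF weight per token position of a non-empty document."""
    n_docs = len(corpus)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(doc))
    term_counts = Counter(doc_tokens)
    scores = []
    for tok in doc_tokens:
        term_freq = term_counts[tok] / len(doc_tokens)
        # Smoothed IDF (an assumption): stays positive even for terms in every document.
        inv_doc_freq = math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0
        scores.append(term_freq * inv_doc_freq)
    total = sum(scores)
    return [s / total for s in scores]

# Hypothetical toy corpus of already segmented texts.
corpus = [
    ["wheat", "drought", "henan"],
    ["wheat", "price", "market"],
    ["rice", "price", "market"],
]
weights = tfidf_weights(corpus[0], corpus)
```

In this toy corpus "drought" occurs in fewer documents than "wheat", so it receives the larger weight.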
In some embodiments, the method further comprises:
determining an agricultural condition early warning value based on the classification discrete value of the classification result;
and comparing the agricultural condition early warning value with a preset threshold value to trigger agricultural condition early warning.
In some embodiments, the dimension labels include one or more of the following:
a time dimension label;
a geographic scope dimension label;
a topic dimension label;
a degree dimension label.
In a second aspect, an embodiment of the present application further provides a classification device for agricultural condition data, including:
The first acquisition module is used for classifying the agricultural text data based on dimension labels in the agricultural text data to acquire agricultural condition data;
the second acquisition module is used for acquiring text semantic features and text theme features of the agricultural condition data and determining agricultural condition text features based on the text semantic features and the text theme features;
the first determining module is used for classifying the agricultural condition data based on the agricultural condition text characteristics, the dimension labels and the attention weight, and determining classification results, wherein the attention weight is determined based on the importance degree of each subword in the agricultural text.
In some embodiments, the first acquisition module comprises:
the first processing sub-module is used for carrying out word segmentation processing on the agricultural text data to obtain sub-words of the agricultural text data;
the second processing sub-module is used for marking the sub-words of the agricultural text data according to time, place and theme to obtain sub-words containing dimension labels;
the first acquisition sub-module is used for inputting the sub-words containing the dimension labels into a pre-training model and acquiring agricultural condition data output by the pre-training model.
In some embodiments, the first acquisition submodule includes:
The first processing unit is used for converting the subwords containing the dimension labels into word embedding information;
a first acquisition unit configured to acquire a pre-training feature vector based on the word embedding information;
the second acquisition unit is used for inputting the pre-training feature vector into a bidirectional long short-term memory (BiLSTM) model and acquiring a hidden state sequence output by the BiLSTM model;
and the first determining unit is used for classifying the agricultural text data based on the hidden state sequence to determine agricultural condition data.
In some embodiments, the second acquisition module comprises:
the second acquisition sub-module is used for acquiring text semantic features of the agricultural condition data based on the pre-training model and the dimension labels and acquiring text topic features of the agricultural condition data based on the topic model;
and the third acquisition sub-module is used for splicing the text semantic features and the text topic features to acquire the agricultural condition text features.
In some embodiments, the first determining module comprises:
the first determining sub-module is used for determining the attention weight based on term frequency and inverse document frequency (TF-IDF);
the first fusion sub-module is used for weighting and fusing the agricultural condition text features with the attention weight to obtain a fused feature vector;
and the second determining sub-module is used for classifying the agricultural condition data based on the dimension labels and the fused feature vector to determine classification results.
In some embodiments, the classification device of agricultural condition data further includes:
the second determining module is used for determining an agricultural condition early warning value based on the classification discrete value of the classification result;
the first comparison module is used for comparing the agricultural condition early warning value with a preset threshold value and triggering agricultural condition early warning.
In some embodiments, the dimension labels include one or more of the following:
a time dimension label;
a geographic scope dimension label;
a topic dimension label;
a degree dimension label.
In a third aspect, an embodiment of the present application further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements a method for classifying agricultural condition data according to any one of the foregoing methods when the processor executes the program.
In a fourth aspect, embodiments of the present application also provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method of classifying agricultural condition data as described in any of the above.
In a fifth aspect, embodiments of the present application further provide a computer program product comprising a computer program which, when executed by a processor, implements a method of classifying agricultural condition data as described in any one of the above.
According to the agricultural text data classification method, device and storage medium, the agricultural text data can be initially classified according to the dimension labels, and then refined classification is performed according to the agricultural text characteristics and the attention weight, so that interference of noise data can be reduced, computing resources and time are saved, processing efficiency is improved, and further accuracy of text classification is improved.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the related art more clearly, the drawings required for describing the embodiments or the related art are briefly introduced below. The drawings described below show only some embodiments of the present application; a person skilled in the art may obtain other drawings from them without inventive effort.
Fig. 1 is a flow chart of a classification method of agricultural condition data according to an embodiment of the present application;
Fig. 2 is a schematic flow chart of a secondary classification model according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a classification device for agricultural condition data according to an embodiment of the present application;
Fig. 4 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the present disclosure without inventive effort fall within the scope of protection of the present application.
Fig. 1 is a flow chart of a classification method of agricultural condition data provided in an embodiment of the present application. As shown in Fig. 1, the method includes:
Step 101: classifying the agricultural text data based on dimension labels in the agricultural text data to obtain agricultural condition data.
Specifically, agricultural condition text data has its own characteristics. It carries domain-specific information about crops, cultivation, weather, market prices, epidemics and the like, and contains domain-specific terms and knowledge. It may cover a wide range of topics, from weather forecasts and agricultural technology to market conditions and disasters, so its content can be very diverse. It is usually tied to agricultural activities in a particular region or rural area, so geographic location and regional characteristics can be very important. It is often highly time-sensitive, since it may involve weather predictions, seasonal agricultural activities and the like that call for timely information and decision support. It may also carry emotions and attitudes related to agriculture, such as concern about weather conditions or the impact of market fluctuations. Because information sources are diverse, agricultural text data, especially short text data, may be noisy, containing inaccurate data, subjective views and inconsistent information.
To address the above problems, the classification method provided by the embodiments of the present application may first collect Internet agricultural text data, analyze the characteristics of agricultural condition data in the text data sets of professional agricultural platforms, and preprocess the data. The agricultural text data that meets the requirements is then annotated with dimension labels along the time, space and topic dimensions. Finally, the agricultural text data containing the dimension labels is input into a pre-training model, and the first classification result output by the model is obtained.
The first classification classifies the agricultural text data to pick out the agricultural condition data in it. The first classification result may be either agricultural condition data or non-agricultural condition data.
Further, the pre-training model may be a BERT (Bidirectional Encoder Representations from Transformers) model. The agricultural text data containing dimension labels may be vectorized and input into the BERT model to obtain the BERT embeddings of the text, which are pooled into a fixed-length BERT feature vector. The BERT feature vector is fed into a bidirectional long short-term memory (BiLSTM) model, which serves as the sequence model. A fully connected layer is added after the BiLSTM to perform binary classification of agricultural condition data.
Step 102, acquiring text semantic features and text topic features of the agricultural condition data, and determining agricultural condition text features based on the text semantic features and the text topic features.
Specifically, based on the first classification result, the agricultural condition data classified as positive examples is selected from the original data. Text semantic features of the agricultural condition data are obtained through a pre-training model, and text topic features of the agricultural condition data are obtained through a topic model. The text semantic features and the text topic features are then concatenated to obtain the agricultural condition text features.
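The vector splicing step can be pictured as plain concatenation. The sketch below is a minimal illustration; the function name `splice_features` and the short stand-in vectors are assumptions, and real semantic features would be far longer.

```python
def splice_features(semantic, topic):
    """Concatenate the semantic vector and the topic vector into one feature vector."""
    return list(semantic) + list(topic)

# Hypothetical stand-ins: a 4-dimensional semantic feature and a 3-topic distribution.
semantic_vec = [0.12, -0.40, 0.33, 0.08]
topic_vec = [0.70, 0.20, 0.10]
agri_text_feature = splice_features(semantic_vec, topic_vec)
```

The spliced vector has length equal to the sum of the two input lengths, so any downstream layer must be sized accordingly.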
Further, the agricultural condition data selected by the first classification may be annotated with dimension labels for agricultural topic and degree of influence. The dimension label sequence is input into the BERT model, self-attention features are extracted from the output of the BERT model, and fixed-length text semantic features are generated from the multi-head self-attention features.
Further, the agricultural condition data selected by the first classification may be converted into a document-term frequency matrix, which is input into an LDA (Latent Dirichlet Allocation) model; the resulting topic distribution is extracted as the text topic features.
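The document-term frequency matrix that feeds the topic model can be built as in the following sketch. Only the matrix construction is shown; LDA fitting itself is not implemented here. The function name `document_term_matrix` and the toy documents are assumptions for illustration.

```python
from collections import Counter

def document_term_matrix(docs):
    """Build a vocabulary and a matrix whose entry [i][j] counts term j in document i."""
    vocab = sorted({tok for doc in docs for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    matrix = []
    for doc in docs:
        row = [0] * len(vocab)
        for tok, count in Counter(doc).items():
            row[index[tok]] = count
        matrix.append(row)
    return vocab, matrix

# Hypothetical segmented documents.
docs = [["wheat", "drought", "drought"], ["rice", "flood"], ["wheat", "price"]]
vocab, dtm = document_term_matrix(docs)
```

A matrix of this shape (documents by terms) is the usual input of LDA implementations, for example scikit-learn's `LatentDirichletAllocation`, whose per-document topic distribution could then serve as the text topic feature.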
Step 103: classifying the agricultural condition data based on the agricultural condition text features, the dimension labels and the attention weights to determine classification results, wherein the attention weights are determined based on the importance degree of each sub-word in the agricultural text.
Specifically, in the prior art, the agricultural condition text features are input into the BiLSTM as a representation of the entire text. The BiLSTM concatenates the output vectors of its forward and backward memory networks at each time step, and the state vector at the last time step aggregates the semantic features of the entire text; this vector is generally used directly for classification.
However, this approach treats every word in the text as playing the same role and contributing equally to the overall semantics. In real agricultural text, different agricultural words contribute differently to the classification of agricultural condition text. The classification method provided by the embodiments of the present application therefore introduces an attention layer after the BiLSTM layer, which assigns a weight to the role of each word in the text. The hidden states are fused by a weighted sum using these weights instead of a plain average, and the fused feature vector is output. Finally, a second classification is performed based on the dimension labels and the fused feature vector to achieve fine-grained classification of the agricultural condition data.
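The difference between averaging and attention-weighted fusion can be sketched as follows. This is a minimal illustration, not the application's exact layer: the dot-product scoring, the `query` vector (a stand-in for learned attention parameters) and the tiny hidden states are all assumptions.

```python
import math

def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Weight each time step by its softmax-normalized dot product with the query."""
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    fused = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, fused

# Hypothetical BiLSTM hidden states for three time steps, two dimensions each.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]  # hypothetical learned attention vector
weights, fused = attention_pool(hidden_states, query)
```

Time steps whose hidden state aligns with the query receive larger weights, so they dominate the fused vector, whereas a plain average would give every step a weight of 1/3.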
According to the agricultural text data classifying method, the agricultural text data can be initially classified according to the dimension labels, and then refined classification is performed according to the agricultural text characteristics and the attention weight, so that interference of noise data can be reduced, computing resources and time are saved, processing efficiency is improved, and further accuracy of text classification is improved.
In some embodiments, classifying the agricultural text data based on the dimension tag in the agricultural text data to obtain agricultural condition data includes:
performing word segmentation processing on the agricultural text data to obtain sub words of the agricultural text data;
marking the subwords of the agricultural text data according to the time, the place and the theme to obtain the subwords containing dimension labels;
inputting the subword containing the dimension label into a pre-training model, and obtaining agricultural condition data output by the pre-training model.
Specifically, data acquisition may be performed first. Text data is collected from the agricultural condition sections of agricultural technology extension platforms, mainstream Internet agricultural media, and agricultural information on microblogs and other self-media. Short texts of 30 to 300 words are collected according to the agricultural condition analysis requirements.
The agricultural text data is then preprocessed. Special characters, punctuation marks, HTML tags and the like are deleted from the text. Common stop words such as "have" and "yes" are removed. The text is divided into words or sub-words, and each word or sub-word is assigned a unique token identifier. Entity recognition tools are used to mark time, place and agricultural topic entities in the text. These entities are marked with special tags so that the model can identify them.
For example, times may be marked as [t] and places as [l]. According to early warning requirements, agricultural topics may be divided into four categories, such as soil moisture conditions, disaster conditions and seedling conditions, which are marked as [m], [d] and the like. [CLS] is added at the beginning of the text as a classification tag, and [SEP] is added between sentences as a separator tag.
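The preprocessing and tagging steps above can be sketched roughly as follows. The stop-word list, the entity lexicon, whitespace tokenization (Chinese text would need a word segmenter) and the choice to place each tag before its entity are all assumptions for illustration; a real pipeline would use an entity recognition tool rather than a lookup table.

```python
import re

STOP_WORDS = {"the", "is", "in", "by"}  # hypothetical stop-word list
ENTITY_TAGS = {                          # hypothetical entity lexicon
    "july": "[t]",      # time entity
    "henan": "[l]",     # place entity
    "drought": "[d]",   # assumed here to count as a disaster-condition topic
}

def preprocess(text):
    """Strip HTML tags and punctuation, lowercase, split on whitespace, drop stop words."""
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"[^\w\s]", " ", text)
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]

def tag_entities(tokens):
    """Insert a special tag before each known entity and wrap the text in [CLS] / [SEP]."""
    tagged = ["[CLS]"]
    for tok in tokens:
        if tok in ENTITY_TAGS:
            tagged.append(ENTITY_TAGS[tok])
        tagged.append(tok)
    tagged.append("[SEP]")
    return tagged

tagged = tag_entities(preprocess("<p>In July, the wheat in Henan is affected by drought!</p>"))
```

For this single-sentence input the sequence ends with one [SEP]; with several sentences a [SEP] would separate them.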
Finally, the preprocessed and tagged agricultural text data is input into the BERT model as features, and the agricultural condition data output by the model is obtained.
Text that contains time, place and agricultural topic entities may be labeled as the "agricultural" category, and text that does not contain these elements may be labeled as the "non-agricultural" category.
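The labeling rule for the first classification can be expressed as a simple check on which special tags are present. This is a sketch under assumptions: the tag sets and the function name `first_pass_label` are hypothetical, and only the two topic tags named in the text are listed.

```python
REQUIRED_TAGS = {"[t]", "[l]"}  # time and place tags
TOPIC_TAGS = {"[m]", "[d]"}     # topic tags named in the text; further topic tags are assumed to exist

def first_pass_label(tagged_tokens):
    """Label a text 'agricultural' only if it carries time, place and topic tags."""
    present = set(tagged_tokens)
    has_time_and_place = REQUIRED_TAGS <= present
    has_topic = bool(TOPIC_TAGS & present)
    return "agricultural" if has_time_and_place and has_topic else "non-agricultural"

label_a = first_pass_label(["[CLS]", "[t]", "july", "[l]", "henan", "[d]", "drought", "[SEP]"])
label_b = first_pass_label(["[CLS]", "[l]", "henan", "travel", "guide", "[SEP]"])
```

Labels produced this way could serve as training targets for the binary classifier; the second example lacks time and topic tags and is therefore labeled "non-agricultural".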
With the classification method provided by the embodiments of the present application, the first classification quickly screens out non-agricultural condition data and reduces noise in agricultural condition data classification, which speeds up the whole classification process and addresses the data imbalance caused by non-agricultural condition data forming the majority.
In some embodiments, the inputting the subword including the dimension tag into a pre-training model, and obtaining the agricultural condition data output by the pre-training model includes:
converting the sub-words containing the dimension labels into word embedding information;
acquiring a pre-training feature vector based on the word embedding information;
inputting the pre-training feature vector into a bidirectional long short-term memory (BiLSTM) model, and obtaining a hidden state sequence output by the BiLSTM model;
and classifying the agricultural text data based on the hidden state sequence to determine agricultural condition data.
Specifically, the preprocessed agricultural text data is input into the BERT model as features. Building on the pre-trained BERT model, BERT converts each tagged word into a context-dependent word embedding composed of a word vector, a position vector and a segment vector, and outputs the corresponding word embedding matrix, that is, the word embedding information, in which each row is the embedding of one word (one time step) and each column corresponds to one embedding dimension. The word embedding information is input into the BiLSTM, which captures the sequential dependencies between words from the context information to extract further features, and outputs them as the BiLSTM hidden state sequence, in which the hidden state at each time step captures the context information of that time step. An output layer for binary classification is then added on top. The output layer may be a binary classifier that determines whether the text is agricultural condition data.
It is worth mentioning that the BERT version may be BERT base. The epoch count is the number of passes the model makes over all training samples; after each pass over the training data, the current result is evaluated on a validation set, and training stops when the best result is reached. In the present application it is set to 10. The batch size (batch_size), the number of samples per batch, is 100 during training. The learning rate affects the convergence speed of the model and is set to 1e-5 during training. Dropout makes neurons stop working with a certain probability p, which helps prevent overfitting, and may be set to 0.5. The BERT hidden layer dimension is the feature dimension of the final word vector; it is 768, matching the dimension of the BiLSTM input layer. The BiLSTM hidden layer dimension is the number of neurons in the hidden layer of the network and is set to 384, with the number of layers set to 2. out_size is the number of labels.
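The hyperparameters listed above can be gathered into one configuration object, as in the sketch below. The class name `FirstStageConfig` and the field names are hypothetical, and `out_size = 2` is an assumption based on the first classification being binary; the numeric values are those stated in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FirstStageConfig:
    epochs: int = 10              # passes over the training set
    batch_size: int = 100         # samples per batch
    learning_rate: float = 1e-5
    dropout: float = 0.5          # probability p of dropping a neuron
    bert_hidden_dim: int = 768    # BERT base word-vector dimension, also the BiLSTM input size
    bilstm_hidden_dim: int = 384  # neurons per direction
    bilstm_layers: int = 2
    out_size: int = 2             # assumption: two labels for the binary first pass

    @property
    def bilstm_output_dim(self) -> int:
        # A bidirectional LSTM concatenates forward and backward hidden states.
        return 2 * self.bilstm_hidden_dim

cfg = FirstStageConfig()
```

With 384 units per direction, the concatenated BiLSTM output per time step has 768 dimensions, the same width as the BERT word vectors.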
According to the agricultural condition data classification method, the agricultural text data can be classified and filtered by the BERT model and the BiLSTM model, so that non-agricultural condition data is screened out quickly. This reduces noise in the agricultural condition data classification, accelerates the whole classification flow, and addresses the data imbalance problem that arises when non-agricultural condition data makes up the majority of the data.
In some embodiments, the acquiring text semantic features and text topic features of the agricultural condition data and determining agricultural condition text features based on the text semantic features and the text topic features includes:
Based on a pre-training model and dimension labels, acquiring text semantic features of the agricultural condition data, and acquiring text topic features of the agricultural condition data based on a topic model;
and splicing the text semantic features and the text theme features to obtain the agricultural text features.
Specifically, data preprocessing may be performed first. According to the first classification result, the agricultural condition data classified as positive examples is screened out of the original data and used for the second classification. According to the agricultural condition classification and early warning requirements, the data tag can be defined as a 5-dimensional tag comprising a time dimension, a geographic scope dimension, a topic dimension (including primary and secondary topics), and a degree dimension. The agricultural condition data is labeled with the 5-dimensional tags.
Feature extraction is then performed. The preprocessed agricultural condition data is encoded with a pre-trained BERT model to generate the BERT word embeddings. For the multiple labels of the agricultural condition data, a binary vector is created for each label over all possible values of that label, so that the length of the vector equals the number of possible values. The position corresponding to each value that applies is set to 1 and the remaining positions are set to 0, so that the values of each label are encoded as a binary multi-hot vector, for example time [1,0,0,0,0,0] and secondary topic [0,0,1,0,0,0,0,0,0,0]. The input is encoded by a multi-layer Transformer, and the output of the [CLS] token is used as the text semantic feature representation of the entire input text.
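The multi-hot label encoding described above can be sketched as follows. The value lists in LABEL_VALUES and the function names are hypothetical and serve only to illustrate the encoding; the actual value sets come from the labeling scheme.

```python
from typing import Dict, List, Sequence

# Hypothetical label vocabularies used only for this illustration.
LABEL_VALUES: Dict[str, List[str]] = {
    "time": ["1 day", "3 days", "1 week", "1 month", "3 months", "6 months"],
    "scope": ["county", "city", "province", "national"],
}


def multi_hot(values: Sequence[str], vocabulary: Sequence[str]) -> List[int]:
    """Return a 0/1 vector with a 1 at the position of every given value."""
    chosen = set(values)
    unknown = chosen - set(vocabulary)
    if unknown:
        raise ValueError(f"unknown label values: {sorted(unknown)}")
    return [1 if v in chosen else 0 for v in vocabulary]


def encode_labels(labels: Dict[str, Sequence[str]]) -> Dict[str, List[int]]:
    """Encode each dimension separately, one vector per dimension."""
    return {dim: multi_hot(vals, LABEL_VALUES[dim]) for dim, vals in labels.items()}


encoded = encode_labels({"time": ["1 day"], "scope": ["city", "province"]})
print(encoded["time"])   # [1, 0, 0, 0, 0, 0]
print(encoded["scope"])  # [0, 1, 1, 0]
```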
The preprocessed agricultural condition data is represented as a document-word matrix, where each row represents a document and each column represents a vocabulary item. The elements of the matrix are the TF-IDF (term frequency-inverse document frequency) values of each vocabulary item in the text. The number of topics to be extracted is set to 18 according to the division of the agricultural condition secondary topics. Topic modeling is performed on the text data using an LDA model. For each document, the trained LDA model is used to extract the topic distribution as the text topic feature of the text. The text topic feature is an 18-dimensional vector containing a probability for each topic, which represents the weight of that topic in the text.
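As an illustration of the document-word matrix, the sketch below builds TF-IDF values with the textbook definitions (term count divided by document length, times the logarithm of the number of documents over the document frequency). The function name and the toy documents are assumptions, and the LDA step itself is not shown; the resulting matrix would be handed to a topic model configured for 18 topics.

```python
import math
from collections import Counter
from typing import List, Tuple


def tfidf_matrix(docs: List[List[str]]) -> Tuple[List[str], List[List[float]]]:
    """Build a document-word TF-IDF matrix from tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log(n_docs / df[tok]) for tok in vocab}
    matrix = []
    for doc in docs:
        counts = Counter(doc)
        total = max(len(doc), 1)  # guard against empty documents
        matrix.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, matrix


docs = [["wheat", "drought", "wheat"], ["wheat", "price"], ["drought", "warning"]]
vocab, matrix = tfidf_matrix(docs)
print(vocab)                        # ['drought', 'price', 'warning', 'wheat']
print(len(matrix), len(matrix[0]))  # 3 4
```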
Finally, feature splicing is performed. The 768-dimensional text semantic feature vector produced by BERT (i.e., the output of the [CLS] token) is concatenated with the 18-dimensional LDA topic feature vector. The spliced agricultural condition text feature is a 786-dimensional vector, which is input into the BiLSTM as a representation of the entire text.
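The splicing step amounts to a plain concatenation, as the following sketch shows. The placeholder vectors and the function name are assumptions; real inputs would be the BERT output for the [CLS] position and the LDA topic distribution.

```python
from typing import List

SEMANTIC_DIM = 768  # size of the BERT semantic feature stated in the text
TOPIC_DIM = 18      # number of LDA topics stated in the text


def splice_features(semantic: List[float], topic: List[float]) -> List[float]:
    """Concatenate the semantic vector and the topic vector."""
    if len(semantic) != SEMANTIC_DIM or len(topic) != TOPIC_DIM:
        raise ValueError("unexpected feature dimensions")
    return list(semantic) + list(topic)


cls_vector = [0.0] * SEMANTIC_DIM             # placeholder semantic feature
topic_vector = [1.0 / TOPIC_DIM] * TOPIC_DIM  # placeholder uniform topic distribution
features = splice_features(cls_vector, topic_vector)
print(len(features))  # 786
```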
According to the agricultural condition data classification method, text semantics are understood through the BERT model and text topic features are extracted through the topic model, so that the text content is better understood by combining the two. Through the BiLSTM and the attention mechanism, complex relations and dependencies in the text can be learned, which enhances the expressive power of the text features.
In some embodiments, the classifying the agricultural condition data based on the agricultural condition text feature, the dimension tag, and the attention weight to determine a classification result includes:
determining the attention weight based on word frequency and inverse text frequency index;
weighting and fusing the agricultural condition text features and the attention weight to obtain fused feature vectors;
and classifying the agricultural condition data based on the dimension labels and the fused feature vectors to determine classification results.
Specifically, the attention layer is introduced after the BiLSTM layer, assigning a weight to each word's contribution in the text.
TF-IDF can be used to evaluate how important a word is to a document. Based on this idea, the degree of contribution of each word to the document sets of the different classifications is calculated, and this contribution degree is used as the prior attention weight of the word in attention weight learning. The improved TF and IDF calculation formulas are as follows:
where $n_{i,j}$ represents the frequency of occurrence of word $i$ in document $j$, $\sum_k n_{k,j}$ represents the sum of the occurrence frequencies of all words in document $j$, $l$ represents the classification, and $m$ represents the number of documents under classification $l$; $|D|$ represents the number of documents in the document set, and $|\{j : t_i \in d_j\}|$ represents the number of documents that contain word $i$.
The TF-IDF weights are then normalized by a softmax function.
$W_{i,l} = \mathrm{softmax}(TF_{i,l} \cdot IDF_i)$
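The softmax normalization of the TF-IDF products can be sketched as below. The TF and IDF numbers are made up, and taking the softmax over the words of a single classification is an assumption about the normalization axis.

```python
import math
from typing import Dict


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Numerically stable softmax over the values of a dictionary."""
    peak = max(scores.values())
    exps = {key: math.exp(value - peak) for key, value in scores.items()}
    total = sum(exps.values())
    return {key: e / total for key, e in exps.items()}


def prior_weights(tf: Dict[str, float], idf: Dict[str, float]) -> Dict[str, float]:
    """Softmax of TF * IDF over the words of one classification."""
    return softmax({word: tf[word] * idf[word] for word in tf})


tf_class = {"drought": 0.30, "wheat": 0.20, "the": 0.50}  # made-up TF values
idf = {"drought": 2.0, "wheat": 1.0, "the": 0.0}          # made-up IDF values
weights = prior_weights(tf_class, idf)
print(weights)
```

With these toy values the word with the largest TF-IDF product receives the largest prior weight, and the weights sum to 1.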
An attention scoring function is defined as follows, where $h_t$ is the feature vector output at time $t$, $W$ is a learnable weight, and $b$ is a bias term.
$S_t = \tanh(W h_t + b)$
Softmax normalization is then applied so that the weights sum to 1 and each weight lies in [0,1]. The formula is as follows, where $S_t^{\top}$ is the transpose of $S_t$, and $u_w$ is a randomly initialized vector that can be learned in the feed-forward network.
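The scoring and normalization steps can be sketched in plain Python. This is an illustration under stated assumptions: the name alpha for the normalized weights, the toy inputs, and the identity matrix standing in for the learned W are not taken from the application.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(W: Matrix, h: Vector) -> Vector:
    """Multiply matrix W by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def attention_weights(H: List[Vector], W: Matrix, b: Vector, u_w: Vector) -> Vector:
    """Compute S_t = tanh(W h_t + b), then a softmax over t of S_t^T u_w."""
    scores = []
    for h_t in H:
        s_t = [math.tanh(z + b_i) for z, b_i in zip(matvec(W, h_t), b)]
        scores.append(sum(s * u for s, u in zip(s_t, u_w)))
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy BiLSTM outputs for 3 time steps
W = [[1.0, 0.0], [0.0, 1.0]]              # identity stands in for learned weights
b = [0.0, 0.0]
u_w = [1.0, 1.0]
alpha = attention_weights(H, W, b, u_w)
print(alpha)
```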
The TF-IDF prior knowledge is added to the attention weight, and the feature vectors output by the BiLSTM at each time step are fused by weighting. The calculation formula is as follows:
After the fused feature vectors are obtained, the text can be classified according to the 5-dimensional multi-labels, and a Sigmoid function can be used as the activation function, where $y$ is the actual text category, $W_y$ is the classification weight matrix, and $b_y$ is the classification bias.
The cross entropy loss function is then used to drive back propagation, as follows, where $D$ is the number of training set samples, $l$ is the number of classification labels, $y$ is the actual class, $\hat{y}$ is the predicted class, and $\lambda\|\theta\|^2$ is the L2 regularization term.
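A multi-label loss of the kind described above can be sketched as follows. The exact form is an assumption: binary cross entropy summed over the labels, averaged over the samples, plus an L2 penalty weighted by lam. The function names and toy inputs are hypothetical.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def multilabel_loss(
    y_true: List[List[int]],
    logits: List[List[float]],
    theta: List[float],
    lam: float,
) -> float:
    """Mean binary cross entropy over samples plus lam * ||theta||^2."""
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for y_row, z_row in zip(y_true, logits):
        for y, z in zip(y_row, z_row):
            p = sigmoid(z)
            total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    data_term = total / len(y_true)
    l2_term = lam * sum(t * t for t in theta)
    return data_term + l2_term


loss = multilabel_loss([[1, 0]], [[0.0, 0.0]], theta=[1.0, 2.0], lam=0.1)
print(loss)
```

In the toy call each label contributes log 2 to the data term, and the penalty adds 0.1 times 5.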
According to the agricultural condition data classification method, the agricultural condition data is classified a second time. The TF-IDF word weight calculation is improved by calculating the degree of contribution of each word to the document sets of the different classifications, and the result is used as the prior attention weight of the word in attention weight learning. The learned attention weights over the feature vectors output by the BiLSTM at each time step are fused by weighting with the TF-IDF prior weights to obtain enhanced text features, and a multi-label classification layer is adopted, so that the text can be classified more accurately.
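One possible way to fuse the learned attention weights with the TF-IDF prior is sketched below. The fusion rule used here (element-wise product followed by renormalization, then a weighted sum of the BiLSTM outputs) is purely an assumption chosen for illustration; it should not be read as the formula of the application.

```python
from typing import List, Tuple

Vector = List[float]


def fuse(H: List[Vector], alpha: Vector, prior: Vector) -> Tuple[Vector, Vector]:
    """Combine learned and prior weights, then pool the time-step vectors."""
    combined = [a * p for a, p in zip(alpha, prior)]  # assumed fusion rule
    total = sum(combined)
    if total == 0:
        raise ValueError("combined weights sum to zero")
    weights = [c / total for c in combined]
    dim = len(H[0])
    fused = [sum(w * h[k] for w, h in zip(weights, H)) for k in range(dim)]
    return weights, fused


H = [[1.0, 0.0], [0.0, 1.0]]  # toy BiLSTM outputs for 2 time steps
alpha = [0.5, 0.5]            # learned attention weights (toy values)
prior = [0.75, 0.25]          # TF-IDF prior weights (toy values)
weights, fused = fuse(H, alpha, prior)
print(weights, fused)
```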
In some embodiments, the method further comprises:
determining an agricultural condition early warning value based on the classification discrete value of the classification result;
and comparing the agricultural condition early warning value with a preset threshold value to trigger agricultural condition early warning.
Specifically, fig. 2 is a schematic flow chart of a secondary classification model provided in the embodiment of the present application. As shown in fig. 2, the constructed deep neural network model may be trained in two stages on an agricultural short text data set to learn the agricultural condition semantic features of the data set, so as to obtain an accurate agricultural condition text classification model. Based on this model, it can be determined whether a text is an agricultural condition text, and its detailed classification can then be determined. When a new text appears, it can be input into the model automatically. The model outputs whether the information is agricultural condition information and, if so, goes on to output the detailed classification. The agricultural condition early warning active pushing service is then realized according to the classification characteristics of the agricultural condition data, the early warning rules and the triggering mechanism.
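The two-stage flow can be sketched as a small function that calls a binary filter first and a detailed classifier second. The keyword-based stubs below merely stand in for the trained models, and all names are hypothetical.

```python
from typing import Callable, Dict, Optional

Labels = Dict[str, str]


def two_stage_classify(
    text: str,
    is_agricultural: Callable[[str], bool],
    detail_classifier: Callable[[str], Labels],
) -> Optional[Labels]:
    """Run the binary filter first; only positives reach the detailed model."""
    if not is_agricultural(text):
        return None
    return detail_classifier(text)


# Keyword stubs stand in for the trained first-stage and second-stage models.
def stub_filter(text: str) -> bool:
    return any(word in text for word in ("drought", "seedling", "pest"))


def stub_detail(text: str) -> Labels:
    return {"topic": "disaster" if "drought" in text else "other"}


print(two_stage_classify("severe drought in the county", stub_filter, stub_detail))
print(two_stage_classify("football results", stub_filter, stub_detail))
```

The first call returns a label dictionary and the second returns None, because the filter rejects the text before the detailed classifier runs.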
Early warning rules and early warning trigger mechanisms can be constructed first. Based on historical data and domain knowledge, weight coefficients are defined for four dimensions (the time dimension, the affected area dimension, the agricultural theme dimension and the degree of influence dimension) according to the different agricultural condition classifications, and a weight matrix is generated.
Quantitative scores such as time scores, area scores, topic type scores and degree scores are set according to the classification discrete values of the different dimensions of the agricultural condition. Early warning rules can be freely combined from the products of the dimension weights and the scores, and corresponding trigger thresholds can be set.
For example: pesticide early warning value = w1 × time score + w2 × area score + w3 × topic type score + w4 × degree score.
Then, for the agricultural text information collected over a period of time, the detailed agricultural condition classification is obtained using the secondary classification model of the present application. Based on the obtained agricultural condition data classification information, the weight of each dimension for the corresponding classification is obtained, and the discrete value of each dimension of the classification information is mapped to a continuous value. The agricultural condition early warning score is calculated according to the classification early warning rule and compared with the threshold to trigger agricultural condition early warning. Different early warning levels can be defined according to the score.
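The weighted early warning rule can be sketched as below. Every number here is hypothetical: the weights, the mapping from discrete tag values to continuous scores, and the threshold are placeholders, and triggering on "greater than or equal to" is an assumption.

```python
from typing import Dict

# Hypothetical weights w1..w4 for the four dimensions.
WEIGHTS: Dict[str, float] = {"time": 0.2, "area": 0.3, "topic": 0.2, "degree": 0.3}

# Hypothetical mapping from discrete tag values to continuous scores.
SCORE_MAPS: Dict[str, Dict[str, float]] = {
    "time": {"1 day": 1.0, "1 week": 0.6, "1 month": 0.3},
    "area": {"county": 0.25, "city": 0.5, "province": 0.75, "national": 1.0},
    "topic": {"disaster": 1.0, "market": 0.5},
    "degree": {"particularly severe": 1.0, "general": 0.5, "slight": 0.25},
}

THRESHOLD = 0.7  # hypothetical trigger threshold


def warning_value(labels: Dict[str, str]) -> float:
    """Sum of weight times score over the four dimensions."""
    return sum(WEIGHTS[dim] * SCORE_MAPS[dim][labels[dim]] for dim in WEIGHTS)


def should_warn(labels: Dict[str, str], threshold: float = THRESHOLD) -> bool:
    return warning_value(labels) >= threshold


severe = {"time": "1 day", "area": "province", "topic": "disaster",
          "degree": "particularly severe"}
mild = {"time": "1 month", "area": "county", "topic": "market", "degree": "slight"}
print(warning_value(severe), should_warn(severe))
print(warning_value(mild), should_warn(mild))
```

Because the weights and the threshold are plain data, they can be adjusted without changing the scoring function.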
Finally, based on the acquired agriculture information text and classification information, summarizing analysis can be performed, a space-time agriculture information analysis thematic map is generated, and agriculture information classification inquiry is performed.
According to the classification method of the agricultural condition data, provided by the embodiment of the application, the agricultural condition early warning quantification rule is defined, the rule parameter weight and the threshold value can be dynamically adjusted, automatic early warning can be judged and realized according to the classification result and the early warning rule, and the flexibility and the adaptability of early warning service are improved.
In some embodiments, the dimension tag includes one or more of the following:
a time dimension tag;
a geographic scope dimension tag;
a theme dimension tag;
a degree dimension tag.
Specifically, according to the agricultural condition classification and early warning requirements, the data tag can be defined as a 5-dimensional tag. The time dimension includes tag values such as about 1 day, about 3 days, 1 week, 1 month, 3 months, 6 months and 1 year. The geographic scope dimension includes the tag values county, city, province and national. The theme dimension comprises the primary themes of soil moisture, disaster, seedling condition and market, to which secondary themes are added; for example, the disaster theme includes ecological disasters, meteorological disasters, geological disasters, biological disasters and the like. The degree dimension includes the values particularly severe, general, slight and unknown. The agricultural condition data is labeled with the 5-dimensional tags.
According to the classification method of the agricultural condition data provided by the embodiment of the application, the data is labeled along dimensions such as time, space and theme in accordance with the characteristics of agricultural condition data, and models are trained on data adapted to the different secondary classifications, which can improve the accuracy of text classification.
The method in the above embodiment will be further described below with specific examples.
The experimental data of the embodiment of the application uses agricultural condition data, agricultural information network data and microblog data issued by an agricultural technology popularization platform, totaling 1 million pieces of short text experimental data. The data covers a variety of agricultural text types, topics, time ranges, and geographic areas. The classification method of the agricultural condition data provided by the embodiment of the application is compared with K-nearest neighbor (K-Nearest Neighbor, KNN), a traditional statistics-based text classification algorithm; TextCNN, an algorithm based on a convolutional neural network; BiLSTM+Attention, an algorithm combining a bidirectional long short-term memory network with an attention mechanism; and the deep pyramid convolutional neural network (Deep Pyramid Convolutional Neural Networks, DPCNN). The results are shown in Table 1. The classification method of the agricultural condition data provided by the embodiment of the application achieves a precision of 90.78% and an F1 value of 90.14%, which is superior to the other models.
TABLE 1 Comparative experiments (in %)
Model                            Precision   Recall   F1 value
KNN                              81.81       79.63    80.71
TextCNN                          86.32       83.81    85.05
BiLSTM+Attention                 89.55       86.34    87.92
DPCNN                            88.45       87.08    87.76
The method of the application    90.78       89.5     90.14
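The F1 values in Table 1 can be checked against the other two columns. The sketch below assumes that the first numeric column is precision and that F1 is the harmonic mean of precision and recall; under that assumption every reported F1 value is reproduced to within rounding.

```python
from typing import Dict, Tuple

# (first column, recall, reported F1) copied from Table 1, in percent.
ROWS: Dict[str, Tuple[float, float, float]] = {
    "KNN": (81.81, 79.63, 80.71),
    "TextCNN": (86.32, 83.81, 85.05),
    "BiLSTM+Attention": (89.55, 86.34, 87.92),
    "DPCNN": (88.45, 87.08, 87.76),
    "The method of the application": (90.78, 89.5, 90.14),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


for model, (precision, recall, reported) in ROWS.items():
    computed = f1_score(precision, recall)
    print(f"{model}: computed {computed:.2f}, reported {reported:.2f}")
```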
Compared with the prior art, the agricultural condition data classification method provided by the embodiment of the application comprehensively considers the temporal characteristics, the spatial region characteristics and the theme characteristics of the agricultural condition text, so that the model can better understand the agricultural condition data. A two-stage classification method is adopted, in which a preliminary classification is performed first and the classification is then refined; this reduces the interference of noise data, saves computing resources and time, improves processing efficiency and improves the accuracy of text classification. BERT, LDA, BiLSTM and the Attention mechanism are integrated so that the text is analyzed from different angles, which allows the model to better capture the context information of the text and improves the extraction of text features, thereby improving classification accuracy. The use of multi-label classification suits various types of agricultural condition data and can provide more detailed information.
It is worth mentioning that the classification method of the agricultural condition data provided by the embodiment of the application not only can be used for agricultural condition text classification, but also can be expanded to other business scenes. The accurate agriculture condition text classification and early warning provided can provide timely farmland information, help farmers to take timely agricultural measures such as irrigation, fertilization, pest control and the like, and accordingly improve crop yield and quality. Timely agricultural condition early warning can help farmers to cope with natural disasters, climate changes and other risk factors, and agricultural losses are reduced, including harvest losses and livestock losses. By accurate information, resources can be managed more efficiently, reducing waste of resources, e.g., unnecessary water, fertilizer, and pesticide use, and production costs. Accurate agricultural text classification and early warning also helps farmers and agricultural practitioners to better understand market trends and make more intelligent agricultural decisions, including sales and price prediction of agricultural products.
Fig. 3 is a schematic structural diagram of a classification device for agricultural condition data according to an embodiment of the present application, as shown in fig. 3, where the classification device for agricultural condition data according to an embodiment of the present application includes a first obtaining module 301, a second obtaining module 302, and a first determining module 303, where:
the first obtaining module 301 is configured to classify the agricultural text data based on a dimension tag in the agricultural text data to obtain agricultural condition data;
a second obtaining module 302, configured to obtain text semantic features and text topic features of the agricultural condition data, and determine agricultural condition text features based on the text semantic features and the text topic features;
a first determining module 303, configured to determine a classification result by classifying the agricultural text data based on the agricultural text feature, the dimension tag, and an attention weight, where the attention weight is determined based on an importance degree of each subword in the agricultural text.
In some embodiments, the first acquisition module comprises:
the first processing sub-module is used for carrying out word segmentation processing on the agricultural text data to obtain sub-words of the agricultural text data;
the second processing sub-module is used for marking the sub-words of the agricultural text data according to time, place and theme to obtain sub-words containing dimension labels;
The first acquisition sub-module is used for inputting the sub-words containing the dimension labels into a pre-training model and acquiring agricultural condition data output by the pre-training model.
In some embodiments, the first acquisition submodule includes:
the first processing unit is used for converting the subwords containing the dimension labels into word embedding information;
a first acquisition unit configured to acquire a pre-training feature vector based on the word embedding information;
the second acquisition unit is used for inputting the pre-training feature vector into a two-way long-short-term memory model and acquiring a hidden state sequence output by the two-way long-short-term memory model;
and the first determining unit is used for classifying the agricultural text data based on the hidden state sequence to determine agricultural condition data.
In some embodiments, the second acquisition module comprises:
the second acquisition sub-module is used for acquiring text semantic features of the agricultural condition data based on the pre-training model and the dimension labels and acquiring text topic features of the agricultural condition data based on the topic model;
and the third acquisition sub-module is used for splicing the text semantic features and the text theme features to acquire the agricultural text features.
In some embodiments, the first determining module comprises:
a first determination sub-module for determining the attention weight based on word frequency and inverse text frequency index;
the first fusion sub-module is used for weighting and fusing the agricultural condition text features and the attention weight to obtain fused feature vectors;
and the second determining sub-module is used for classifying the agricultural condition data based on the dimension labels and the fused feature vectors to determine classification results.
In some embodiments, the classification device of agricultural condition data further includes:
the second determining module is used for determining an agricultural condition early warning value based on the classification discrete value of the classification result;
the first comparison module is used for comparing the agricultural condition early warning value with a preset threshold value and triggering agricultural condition early warning.
In some embodiments, the dimension tag includes one or more of the following:
a time dimension tag;
a geographic scope dimension tag;
a theme dimension tag;
a degree dimension tag.
Specifically, the classification device for agricultural condition data provided in the embodiment of the present application can implement all the method steps implemented in the embodiment of the classification method for agricultural condition data, and can achieve the same technical effects, and detailed descriptions of the same parts and beneficial effects as those in the embodiment of the method are omitted herein.
Fig. 4 is a schematic physical structure of an electronic device provided in an embodiment of the present application, as shown in fig. 4, the electronic device may include: processor 410, communication interface (Communications Interface) 420, memory 430 and communication bus 440, wherein processor 410, communication interface 420 and memory 430 communicate with each other via communication bus 440. Processor 410 may invoke logic instructions in memory 430 to perform a method of classifying agricultural data, the method comprising:
classifying the agricultural text data based on dimension labels in the agricultural text data to obtain agricultural condition data;
acquiring text semantic features and text topic features of the agricultural condition data, and determining agricultural condition text features based on the text semantic features and the text topic features;
and classifying the agricultural data based on the agricultural text features, the dimension labels and the attention weights to determine classification results, wherein the attention weights are determined based on the importance degree of each sub-word in the agricultural text.
Further, the logic instructions in the memory 430 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
In some embodiments, classifying the agricultural text data based on the dimension tag in the agricultural text data to obtain agricultural condition data includes:
performing word segmentation processing on the agricultural text data to obtain sub words of the agricultural text data;
marking the subwords of the agricultural text data according to the time, the place and the theme to obtain the subwords containing dimension labels;
inputting the subword containing the dimension label into a pre-training model, and obtaining agricultural condition data output by the pre-training model.
In some embodiments, the inputting the subword including the dimension tag into a pre-training model, and obtaining the agricultural condition data output by the pre-training model includes:
converting the sub-words containing the dimension labels into word embedding information;
acquiring a pre-training feature vector based on the word embedding information;
inputting the pre-training feature vector into a two-way long-short-term memory model, and obtaining a hidden state sequence output by the two-way long-short-term memory model;
and classifying the agricultural text data based on the hidden state sequence to determine agricultural condition data.
In some embodiments, the acquiring text semantic features and text topic features of the agricultural condition data and determining agricultural condition text features based on the text semantic features and the text topic features includes:
Based on a pre-training model and dimension labels, acquiring text semantic features of the agricultural condition data, and acquiring text topic features of the agricultural condition data based on a topic model;
and splicing the text semantic features and the text theme features to obtain the agricultural text features.
In some embodiments, the classifying the agricultural condition data based on the agricultural condition text feature, the dimension tag, and the attention weight to determine a classification result includes:
determining the attention weight based on word frequency and inverse text frequency index;
weighting and fusing the agricultural condition text features and the attention weight to obtain fused feature vectors;
and classifying the agricultural condition data based on the dimension labels and the fused feature vectors to determine classification results.
In some embodiments, the method further comprises:
determining an agricultural condition early warning value based on the classification discrete value of the classification result;
and comparing the agricultural condition early warning value with a preset threshold value to trigger agricultural condition early warning.
In some embodiments, the dimension tag includes one or more of the following:
a time dimension tag;
a geographic scope dimension tag;
a theme dimension tag;
a degree dimension tag.
Specifically, the electronic device provided in the embodiment of the present application can implement all the method steps implemented by the method embodiment in which the execution subject is the electronic device, and can achieve the same technical effects, and detailed descriptions of the same parts and beneficial effects as those of the method embodiment in the embodiment are omitted herein.
In another aspect, the present invention also provides a computer program product, the computer program product comprising a computer program, the computer program being storable on a non-transitory computer readable storage medium, the computer program, when executed by a processor, being capable of executing a method of classifying agricultural condition data provided by the above methods, the method comprising:
classifying the agricultural text data based on dimension labels in the agricultural text data to obtain agricultural condition data;
acquiring text semantic features and text topic features of the agricultural condition data, and determining agricultural condition text features based on the text semantic features and the text topic features;
and classifying the agricultural data based on the agricultural text features, the dimension labels and the attention weights to determine classification results, wherein the attention weights are determined based on the importance degree of each sub-word in the agricultural text.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform a method of classifying agricultural condition data provided by the above methods, the method comprising:
Classifying the agricultural text data based on dimension labels in the agricultural text data to obtain agricultural condition data;
acquiring text semantic features and text topic features of the agricultural condition data, and determining agricultural condition text features based on the text semantic features and the text topic features;
and classifying the agricultural data based on the agricultural text features, the dimension labels and the attention weights to determine classification results, wherein the attention weights are determined based on the importance degree of each sub-word in the agricultural text.
The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
In addition, it should be noted that: the terms "first," "second," and the like in the embodiments of the present application are used for distinguishing between similar objects and not for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments of the application are capable of operation in sequences other than those illustrated or otherwise described herein, and that the terms "first" and "second" are generally intended to be used in a generic sense and not to limit the number of objects, for example, the first object may be one or more.
In the embodiment of the application, the term "and/or" describes the association relationship of associated objects and means that three relationships may exist; for example, A and/or B may represent: A exists alone, A and B exist together, and B exists alone. The character "/" generally indicates that the objects before and after it are in an "or" relationship.
The term "plurality" in the embodiments of the present application means two or more, and other adjectives are similar thereto.
The term "determining B based on A" in the present application means that A is a factor to be considered in determining B. It is not limited to "B can be determined based on A alone", and also includes: "B is determined based on A and C", "B is determined based on A, C and E", "C is determined based on A, and B is further determined based on C", etc. Additionally, A may serve as a condition for determining B, for example, "when A satisfies a first condition, B is determined using a first method"; for another example, "when A satisfies a second condition, B is determined", etc.; for another example, "when A satisfies a third condition, B is determined based on a first parameter", and the like. Of course, A may also serve as a condition for a factor used in determining B, for example, "when A satisfies the first condition, C is determined using the first method, and B is further determined based on C", or the like.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for classifying agricultural condition data, comprising:
classifying the agricultural text data based on dimension labels in the agricultural text data to obtain agricultural condition data;
acquiring text semantic features and text topic features of the agricultural condition data, and determining agricultural condition text features based on the text semantic features and the text topic features;
and classifying the agricultural condition data based on the agricultural condition text features, the dimension labels and attention weights to determine a classification result, wherein the attention weights are determined based on the importance of each subword in the agricultural text.
2. The method for classifying agricultural condition data according to claim 1, wherein classifying the agricultural text data based on dimension labels in the agricultural text data to obtain agricultural condition data comprises:
performing word segmentation on the agricultural text data to obtain subwords of the agricultural text data;
labeling the subwords of the agricultural text data according to time, place and topic to obtain subwords containing dimension labels;
and inputting the subwords containing dimension labels into a pre-trained model, and obtaining the agricultural condition data output by the pre-trained model.
3. The method for classifying agricultural condition data according to claim 2, wherein inputting the subwords containing dimension labels into the pre-trained model and obtaining the agricultural condition data output by the pre-trained model comprises:
converting the subwords containing dimension labels into word embedding information;
acquiring a pre-trained feature vector based on the word embedding information;
inputting the pre-trained feature vector into a bidirectional long short-term memory (BiLSTM) model, and obtaining a hidden state sequence output by the BiLSTM model;
and classifying the agricultural text data based on the hidden state sequence to determine the agricultural condition data.
4. The method for classifying agricultural condition data according to claim 1, wherein acquiring text semantic features and text topic features of the agricultural condition data and determining agricultural condition text features based on the text semantic features and the text topic features comprises:
acquiring the text semantic features of the agricultural condition data based on a pre-trained model and the dimension labels, and acquiring the text topic features of the agricultural condition data based on a topic model;
and concatenating the text semantic features and the text topic features to obtain the agricultural condition text features.
5. The method for classifying agricultural condition data according to claim 1, wherein classifying the agricultural condition data based on the agricultural condition text features, the dimension labels and the attention weights to determine a classification result comprises:
determining the attention weights based on term frequency and inverse document frequency (TF-IDF);
performing weighted fusion of the agricultural condition text features with the attention weights to obtain a fused feature vector;
and classifying the agricultural condition data based on the dimension labels and the fused feature vector to determine the classification result.
6. The method for classifying agricultural condition data according to claim 1, further comprising:
determining an agricultural condition early warning value based on a classification discrete value of the classification result;
and comparing the agricultural condition early warning value with a preset threshold to trigger an agricultural condition early warning.
7. The method for classifying agricultural condition data according to any one of claims 1 to 6, wherein the dimension labels include one or more of:
a time dimension label;
a geographic scope dimension label;
a topic dimension label;
a degree dimension label.
8. A classification device for agricultural condition data, comprising:
a first acquisition module, configured to classify agricultural text data based on dimension labels in the agricultural text data to obtain agricultural condition data;
a second acquisition module, configured to acquire text semantic features and text topic features of the agricultural condition data, and to determine agricultural condition text features based on the text semantic features and the text topic features;
and a first determining module, configured to classify the agricultural condition data based on the agricultural condition text features, the dimension labels and attention weights to determine a classification result, wherein the attention weights are determined based on the importance of each subword in the agricultural text.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and running on the processor, wherein the processor implements the method of classifying agricultural condition data according to any one of claims 1 to 7 when executing the program.
10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the method of classifying agricultural condition data according to any one of claims 1 to 7.
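
The feature steps recited in claims 4 and 5 (concatenating semantic and topic features, deriving attention weights from term frequency and inverse document frequency, and weighted fusion) can be illustrated with a minimal Python sketch. This is not the applicant's implementation. The function names `tfidf_weights`, `splice` and `fuse`, the smoothed IDF formula, the normalization of the weights to sum to 1, and the choice to attach the document-level topic vector to every subword vector before fusion are all assumptions made for illustration; the claims do not fix these details. The per-subword semantic vectors would come from a pre-trained model and the topic vector from a topic model, both of which are replaced here by plain lists.

```python
import math
from collections import Counter


def tfidf_weights(doc_tokens, corpus):
    """Return one TF-IDF attention weight per subword position, normalized to sum to 1.

    Assumption: a smoothed IDF, log((1 + N) / (1 + df)) + 1, so every weight is positive.
    """
    tf = Counter(doc_tokens)
    n_docs = len(corpus)
    raw = []
    for tok in doc_tokens:
        df = sum(1 for doc in corpus if tok in doc)  # number of documents containing tok
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        raw.append(tf[tok] / len(doc_tokens) * idf)
    total = sum(raw)
    return [w / total for w in raw]


def splice(semantic_vec, topic_vec):
    """Concatenate a semantic feature vector with a topic feature vector."""
    return list(semantic_vec) + list(topic_vec)


def fuse(token_features, weights):
    """Weighted sum of per-subword feature vectors, giving one fused feature vector."""
    dim = len(token_features[0])
    return [sum(w * feat[i] for w, feat in zip(weights, token_features))
            for i in range(dim)]


# Hypothetical toy inputs: three tokenized texts and made-up feature values.
corpus = [["wheat", "rust", "outbreak"],
          ["wheat", "price", "stable"],
          ["rice", "price", "rising"]]
doc = corpus[0]
semantic = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one vector per subword
topic = [0.2, 0.8]                                # one vector per document

weights = tfidf_weights(doc, corpus)
text_features = [splice(vec, topic) for vec in semantic]
fused = fuse(text_features, weights)
```

In this toy corpus "wheat" appears in two documents while "rust" and "outbreak" appear in one, so the rarer subwords receive larger weights. The fused vector would then be passed, together with the dimension labels, to whatever classifier is used; that step is omitted here because the claims do not specify it.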
CN202311722817.5A 2023-12-14 2023-12-14 Agricultural condition data classification method, agricultural condition data classification device and storage medium Pending CN117828075A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311722817.5A CN117828075A (en) 2023-12-14 2023-12-14 Agricultural condition data classification method, agricultural condition data classification device and storage medium

Publications (1)

Publication Number Publication Date
CN117828075A true CN117828075A (en) 2024-04-05

Family

ID=90510640

Country Status (1)

Country Link
CN (1) CN117828075A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109614538A (en) * 2018-12-17 2019-04-12 广东工业大学 A kind of extracting method, device and the equipment of agricultural product price data
CN113901813A (en) * 2021-10-09 2022-01-07 东南大学 Event extraction method based on topic features and implicit sentence structure
US20220398374A1 (en) * 2021-06-15 2022-12-15 Siemens Healthcare Gmbh Method and apparatus for segmenting a medical text report into sections
CN116563006A (en) * 2023-04-03 2023-08-08 深圳智能思创科技有限公司 Service risk early warning method, device, storage medium and device
CN116756320A (en) * 2023-06-27 2023-09-15 河南农业大学 Agricultural question classification method based on pre-training language model and theme enhancement
WO2023178903A1 (en) * 2022-03-24 2023-09-28 上海帜讯信息技术股份有限公司 Industry professional text automatic labeling method and apparatus, terminal, and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
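朱国杰: "Research and Implementation of an Objectionable Web Page Detection System Based on Text Features", China Master's Theses Full-text Database, Information Science and Technology, 15 February 2022 (2022-02-15), pages 138 - 1241 *
郝婷 et al.: "Research on Chinese Short Text Classification Fusing BERT and BiLSTM", Software Engineering (软件工程), vol. 26, no. 3, 5 March 2023 (2023-03-05), pages 58 - 62 *
香慧敏: "Research on Multi-Label Classification and Automatic Indexing Applications for Agricultural Text", China Master's Theses Full-text Database, Agricultural Science and Technology; Information Science and Technology, no. 3, 15 March 2023 (2023-03-15) *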

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination