CN113205161A - Traditional Chinese medicine producing area distinguishing system and method based on soil parameters - Google Patents

Traditional Chinese medicine producing area distinguishing system and method based on soil parameters

Info

Publication number
CN113205161A
Authority
CN
China
Prior art keywords
traditional chinese
chinese medicine
soil
result
producing area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110754881.6A
Other languages
Chinese (zh)
Other versions
CN113205161B (en)
Inventor
牟松波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Microchip Blockchain And Edge Computing Research Institute
Original Assignee
Beijing Microchip Blockchain And Edge Computing Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Microchip Blockchain And Edge Computing Research Institute
Priority to CN202110754881.6A
Publication of CN113205161A
Application granted
Publication of CN113205161B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/24 Earth materials
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/018 Certifying business or products

Abstract

The invention discloses a system and a method for distinguishing the producing area of traditional Chinese medicines based on soil parameters. The system comprises an input measuring module, an identification and judgment module, and a feedback module. The input measuring module acquires the traditional Chinese medicine type information input by a user and measures soil scraped from the traditional Chinese medicine to obtain a soil measurement result. The identification and judgment module acquires the type information and the soil measurement result, performs preprocessing and feature selection on the soil measurement result, and performs discriminant analysis on the resulting data through a pre-constructed neural network model, a pre-constructed linear discriminant model and a pre-constructed XGBoost model to obtain the producing area judgment result for the corresponding type of traditional Chinese medicine. The feedback module outputs the producing area judgment result. The system judges the producing area by measuring the parameters of the soil adhering to the traditional Chinese medicine, and uses the neural network model, the linear discriminant model and the XGBoost model to relate soil parameters to producing areas, so that producing area judgment is more convenient and efficient and the judgment result is more accurate and reliable.

Description

Traditional Chinese medicine producing area distinguishing system and method based on soil parameters
Technical Field
The invention relates to the technical field of traditional Chinese medicine producing area tracing, in particular to a traditional Chinese medicine producing area distinguishing system and method based on soil parameters.
Background
At present, misrepresentation of producing areas increasingly makes the quality of traditional Chinese medicines difficult to guarantee. Methods that identify the producing area of genuine medicinal materials by detecting biological activity have therefore emerged, but such detection is complicated and is difficult to adopt across the traditional Chinese medicine industry in the short term. Specifically, the biological activity detection method has the following defects:
1. Sample pretreatment is tedious
Taking rhizoma polygonati as an example, detecting bioactive substances such as polysaccharides, total saponins, total phenols and total flavonoids in a sample requires collecting fresh rhizomes, cleaning and slicing them, and then drying, grinding and sieving the samples. This series of steps is long and time-consuming, which makes identifying the producing area more difficult.
2. High cost and difficulty in identifying production area
The prior art requires the use of a large number of instruments such as chromatographs, rotary evaporators, spectrophotometers, etc. In addition, a plurality of solvents are required to be prepared for extracting each bioactive substance. Because the important bioactive substances of different Chinese medicine varieties are often different, a unified rapid detection method for the bioactive substances is difficult to construct. Further identification of the source can only be performed after data acquisition by operations performed in a particular laboratory. This makes the cost of identifying the producing area of polygonatum relatively high and the identification difficulty relatively high.
3. Poor practicability
At present, there is no clear conclusion or standard on the main bioactive substances of most traditional Chinese medicines or on how to determine them. Existing data contain few samples for producing area identification and lack detailed, substantive descriptions of how the producing area of most traditional Chinese medicines is actually identified, so the approach has low generality, has not been widely applied, and is of poor practicability.
Meanwhile, a quality tracing system for traditional Chinese medicine decoction pieces based on blockchain technology has also been proposed. By building and integrating the full supply chain, it collects information on the whole process from planting to sale under industry supervision, but it is very difficult to implement and has low short-term feasibility.
It is not difficult to see that existing methods for distinguishing the producing area of traditional Chinese medicines fall short of actual tracing requirements because they are insufficiently convenient and difficult to implement.
Therefore, how to provide a convenient, efficient and reliable system for judging the origin of traditional Chinese medicine is a problem that needs to be solved urgently by those skilled in the art.
Disclosure of Invention
In view of the above, the invention provides a traditional Chinese medicine producing area distinguishing system and method based on soil parameters, and effectively solves the problems that the existing traditional Chinese medicine producing area distinguishing method is not convenient and fast enough, and has low feasibility and the like.
In order to achieve the purpose, the invention adopts the following technical scheme:
in one aspect, the invention provides a system for distinguishing producing areas of traditional Chinese medicine based on soil parameters, which comprises:
the input measuring module acquires traditional Chinese medicine type information input by a user, measures soil scraped from traditional Chinese medicines and obtains a soil measuring result;
the identification and judgment module is used for acquiring the traditional Chinese medicine type information and the soil measurement result, preprocessing and feature selection are carried out on the soil measurement result, and the data after preprocessing and feature selection are subjected to discriminant analysis through a pre-constructed neural network model, a linear discriminant model and an XGboost model to obtain the production place discrimination result of the corresponding type of traditional Chinese medicine; and
the feedback module is used for outputting the origin judgment result.
Rhizome-type traditional Chinese medicines have many root nodes, so a certain amount of soil remains even after normal cleaning, and many wild traditional Chinese medicines are not cleaned at all and carry a large amount of soil. The soil parameters of a given place generally do not change greatly, and once collected they can easily be added to a database, so they can serve as the data basis for judging the producing area of traditional Chinese medicines.
Further, the input measuring module comprises a traditional Chinese medicine type input unit and a soil measuring unit;
the traditional Chinese medicine type input unit is used for acquiring traditional Chinese medicine type information input by a user, and the soil determination unit is used for measuring soil scraped from traditional Chinese medicines to obtain a soil determination result.
Furthermore, the input measuring module further comprises a rationality judging unit, the rationality judging unit is used for judging whether the soil measuring result is reasonable or not, and if the soil measuring result is reasonable, the soil measuring result is transmitted to the identification judging module; and if not, prompting to re-measure.
For the input measuring module, the specific type of traditional Chinese medicine is entered first; soil is then scraped from the medicine and measured directly by contact with an instrument to obtain the soil measurement result. To ensure that the measurement result has high reference value, the invention also provides a rationality judging unit that judges whether the soil measurement result is reasonable. If it is, the data are passed to the next module; if not, the user is prompted to re-measure.
Further, the identification judging module comprises:
the data acquisition unit is used for acquiring the traditional Chinese medicine type information and the soil measurement result;
the data preprocessing unit is used for preprocessing the soil measuring result and selecting characteristics;
the neural network identification unit is used for inputting all the preprocessed data into the neural network model to obtain a first prediction probability of the traditional Chinese medicine producing area;
the linear identification unit is used for inputting the preprocessed data with the selected characteristics into a linear discrimination model to obtain a second prediction probability of the traditional Chinese medicine producing area;
the XGboost identification unit is used for inputting the preprocessed data with the selected characteristics into an XGboost model to obtain a traditional Chinese medicine producing area prediction result; and
and the analysis and judgment unit is used for equally weighting the first prediction probability of the traditional Chinese medicine producing area and the second prediction probability of the traditional Chinese medicine producing area, comparing the equally weighted result with a preset threshold value and obtaining a producing area judgment result by combining the traditional Chinese medicine producing area prediction result.
For the identification and judgment module, the data from the input measuring module, namely the soil measurement result and the traditional Chinese medicine type, are obtained first. The pre-constructed models in the system are matched against a database built for known traditional Chinese medicine types; if the type is unknown, a national soil database is used instead, at the cost of reduced accuracy. After the data reach the identification and judgment module, they are standardized and then subjected to discriminant analysis by the neural network model, the linear discriminant model and the XGBoost model. The probability values obtained by the neural network model and the linear discriminant model are then equally weighted. If the result is judged to be of high or medium-high credibility, the judged producing area is returned and marked as high or medium-high credibility; otherwise, the possible results are given and marked as non-high credibility.
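The fusion step described above can be sketched as follows. This is only an illustration under stated assumptions: the two threshold values, the function name `fuse_predictions`, and the rule that the averaged result must agree with the XGBoost prediction to be marked credible are all hypothetical, since the text specifies only equal weighting, a preset threshold, and combination with the XGBoost prediction.

```python
from typing import Dict, Tuple

# Hypothetical thresholds; the source only mentions "a preset threshold".
HIGH_THRESHOLD = 0.8
MEDIUM_HIGH_THRESHOLD = 0.6


def fuse_predictions(
    nn_probs: Dict[str, float],
    lda_probs: Dict[str, float],
    xgb_label: str,
) -> Tuple[str, str, float]:
    """Equal-weight two probability vectors and attach a credibility label."""
    # Sorting keeps the result deterministic when two origins tie.
    origins = sorted(set(nn_probs) | set(lda_probs))
    averaged = {
        o: 0.5 * nn_probs.get(o, 0.0) + 0.5 * lda_probs.get(o, 0.0)
        for o in origins
    }
    best = max(origins, key=lambda o: averaged[o])
    score = averaged[best]
    # Assumed rule: the XGBoost label must agree with the averaged result.
    agrees = best == xgb_label
    if agrees and score >= HIGH_THRESHOLD:
        level = "high"
    elif agrees and score >= MEDIUM_HIGH_THRESHOLD:
        level = "medium-high"
    else:
        level = "non-high"
    return best, level, score
```

For example, `fuse_predictions({"A": 0.9, "B": 0.1}, {"A": 0.8, "B": 0.2}, "A")` averages to about 0.85 for origin "A" and, under the assumed thresholds, labels it "high"; if the XGBoost label were "B", the same input would be labelled "non-high".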
Furthermore, the system for distinguishing the producing area of the traditional Chinese medicine based on the soil parameters further comprises an upgrading module, wherein the upgrading module is used for updating the data of the identification judging module.
For the upgrading module, the user can upgrade the system periodically to obtain an updated version of the identification and judgment module. Meanwhile, users upload each analysis result to the cloud in desensitized (anonymized) form to support upgrading of the models in the identification and judgment module. New models are trained on the newly stored soil data together with the soil data already in the database, using supervised and semi-supervised learning. If a new model passes the robustness test and is confirmed for release, an iterative upgrade is published and awaits installation by the user.
On the other hand, the invention also provides a method for distinguishing the producing area of the traditional Chinese medicine based on the soil parameters, which comprises the following steps:
data acquisition: acquiring traditional Chinese medicine type information input by a user, and measuring soil scraped from traditional Chinese medicines to obtain a soil measurement result;
data processing: preprocessing and feature selection are carried out on the soil determination result, and data after preprocessing and feature selection are subjected to discriminant analysis through a pre-constructed neural network model, a linear discriminant model and an XGboost model, so that the production place discrimination result of the corresponding type of traditional Chinese medicine is obtained; and
result output: outputting the origin judgment result.
Further, the data processing process specifically includes:
acquiring the traditional Chinese medicine type information and the soil measurement result, and performing pretreatment and feature selection on the soil measurement result;
inputting all the preprocessed data into a neural network model to obtain a first prediction probability of the producing area of the traditional Chinese medicine;
inputting the preprocessed data with the selected characteristics into a linear discrimination model to obtain a second prediction probability of the producing area of the traditional Chinese medicine;
inputting the preprocessed data with the selected characteristics into an XGboost model to obtain a prediction result of the producing area of the traditional Chinese medicine;
and equally weighting the first prediction probability of the traditional Chinese medicine producing area and the second prediction probability of the traditional Chinese medicine producing area, comparing the equally weighted result with a preset threshold value, and obtaining a producing area judgment result by combining the traditional Chinese medicine producing area prediction result.
Further, the method for distinguishing the producing area of the traditional Chinese medicine based on the soil parameters further comprises the following steps:
data upgrading: updating the data of the identification and judgment module.
In the result output process, the method mainly displays the analysis result and generates a text record of it. In the system, the text record can be downloaded locally or uploaded to the cloud.
According to the technical scheme, compared with the prior art, the traditional Chinese medicine producing area judging system and method based on the soil parameters are provided, the producing area of the traditional Chinese medicine is judged by measuring the soil parameters attached to the traditional Chinese medicine, the relation between the soil parameters and the producing area of the traditional Chinese medicine is established by a combined judging method of a neural network model, a linear judging model and an XGboost model, the traditional Chinese medicine producing area judging process is more convenient and efficient, and the judging result is more accurate and reliable.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a schematic structural diagram of a system for distinguishing the origin of a Chinese medicine based on soil parameters according to the present invention;
FIG. 2 is a schematic structural diagram of an input measurement module;
FIG. 3 is a schematic structural diagram of an identification module;
FIG. 4 is a schematic diagram of the internal working principle of the upgrade module;
FIG. 5 is a schematic diagram of the internal working principle of the feedback module;
FIG. 6 is a schematic flow chart of the implementation of the method for distinguishing the producing area of traditional Chinese medicine based on soil parameters.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
On one hand, referring to the attached fig. 1, the embodiment of the invention discloses a traditional Chinese medicine producing area distinguishing system based on soil parameters, which comprises:
the input measuring module 1 is used for acquiring traditional Chinese medicine type information input by a user and measuring soil scraped from traditional Chinese medicines to obtain a soil measuring result;
the identification and judgment module 2 is used for acquiring traditional Chinese medicine type information and a soil measurement result, preprocessing and feature selection are carried out on the soil measurement result, and data after preprocessing and feature selection are subjected to discriminant analysis through a pre-constructed neural network model, a linear discriminant model and an XGboost model to obtain a place of origin discrimination result of a corresponding type of traditional Chinese medicine; and
the feedback module 3 is used for outputting a production place judgment result.
Specifically, referring to fig. 2, the input determination module 1 includes a Chinese medicine kind input unit 101 and a soil determination unit 102;
the Chinese medicine type input unit 101 is used for acquiring Chinese medicine type information input by a user, and the soil determination unit 102 is used for measuring soil scraped from Chinese medicines to obtain a soil determination result.
Preferably, the input determination module 1 further comprises a rationality judgment unit 103, wherein the rationality judgment unit 103 is used for judging whether the soil determination result is reasonable, and if so, the soil determination result is transmitted to the identification judgment module 2; and if not, prompting to re-measure.
For the measured values of soil composition, each element should fall within reasonable upper and lower limits, which are generally determined by multiplying the maximum value of the collected data in the database by a specific coefficient. The rationality judgment unit is provided in this embodiment to prevent instrument measurement errors from causing the classification to fail and undermining the reliability of the system. Whether to upload the measured values as they are, or to replace abnormal values with the median or truncate them, may be set according to the actual application requirements and is not limited here.
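A minimal sketch of such a rationality check is given below. The coefficient value, the lower limit of zero, and all function names are assumptions made for illustration; the text states only that the limits derive from the database maximum multiplied by a specific coefficient.

```python
from typing import Dict, List

# Hypothetical coefficient; the source only says "a specific coefficient".
UPPER_COEFFICIENT = 1.5


def build_upper_limits(
    database_max: Dict[str, float], coefficient: float = UPPER_COEFFICIENT
) -> Dict[str, float]:
    """Upper limit per parameter = database maximum * coefficient."""
    return {name: value * coefficient for name, value in database_max.items()}


def unreasonable_parameters(
    measurement: Dict[str, float],
    upper_limits: Dict[str, float],
    lower_limit: float = 0.0,  # assumed: no negative readings
) -> List[str]:
    """Return the names of parameters that fall outside their limits."""
    flagged = []
    for name, value in measurement.items():
        upper = upper_limits.get(name)
        if upper is None:
            continue  # no reference data for this parameter
        if value < lower_limit or value > upper:
            flagged.append(name)
    return flagged


def truncate(
    measurement: Dict[str, float],
    upper_limits: Dict[str, float],
    lower_limit: float = 0.0,
) -> Dict[str, float]:
    """Clip each value into its limits instead of rejecting the measurement."""
    return {
        name: min(max(value, lower_limit), upper_limits.get(name, value))
        for name, value in measurement.items()
    }
```

An empty list from `unreasonable_parameters` corresponds to passing the measurement on to the identification and judgment module; a non-empty list corresponds to prompting a re-measurement, or to calling `truncate` if the application prefers truncation.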
Specifically, referring to fig. 3, the identification judging module 2 includes:
a data acquisition unit 201, configured to acquire traditional Chinese medicine type information and soil measurement results;
the data preprocessing unit 202 is used for preprocessing the soil measurement result and selecting characteristics;
the neural network identification unit 203 is used for inputting all the preprocessed data into the neural network model to obtain a first prediction probability of the traditional Chinese medicine producing area;
the linear recognition unit 204 is used for inputting the preprocessed data with the selected characteristics into a linear discrimination model to obtain a second prediction probability of the traditional Chinese medicine producing area;
the XGboost identification unit 205 is used for inputting the preprocessed data with the selected characteristics into an XGboost model to obtain a traditional Chinese medicine producing area prediction result; and
the analysis and judgment unit 206 is configured to equally weight the first prediction probability of the traditional Chinese medicine producing area and the second prediction probability of the traditional Chinese medicine producing area, compare the equally weighted result with a preset threshold, and obtain a producing area judgment result by combining the traditional Chinese medicine producing area prediction result.
Of course, a rationality judgment component may also be arranged inside the identification and judgment module 2 to check the reasonableness of the producing area judgment result obtained by the analysis and judgment unit 206, ensuring that the result is authentic and reliable.
To establish the models, a reasonable parameter set must first be determined, including pH and common heavy metal and non-metal elements, such as: pH, Li, Be, B, Na, Mg, Al, Si, P, S, K, Ca, Sc, Ti, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, Ga, Ge, As, Se, Rb, Sr, Y, Nb, Mo, Cd, Sb, Cs, Ba, La, Ce, Pr, Nd, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, Lu, Tl, Pb, Th. For each kind of traditional Chinese medicine, parameter importance is assessed, i.e., feature selection is performed, using the SelectKBest method or the Lasso method; 6 to 10 parameters are selected in this embodiment. Intra-class augmentation of the sample set and standardization are then performed. All parameters are fed into the neural network model, while the selected important parameters are fed into the linear discriminant model and the XGBoost model, and the resulting model parameters for common traditional Chinese medicine types are stored in a database.
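The univariate selection idea behind a SelectKBest-style step can be sketched in plain Python as follows. The scoring function used here, a one-way ANOVA F statistic, is an assumption (the text does not name the score), and the function names and toy data are hypothetical.

```python
from collections import defaultdict
from statistics import fmean
from typing import Dict, List, Sequence


def anova_f(values: Sequence[float], labels: Sequence[str]) -> float:
    """One-way ANOVA F statistic of one soil parameter across origins."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    n, k = len(values), len(groups)
    grand_mean = fmean(values)
    between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values())
    within = sum((v - fmean(g)) ** 2 for g in groups.values() for v in g)
    if within == 0.0:
        return float("inf") if between > 0.0 else 0.0
    # Requires at least two classes and more samples than classes.
    return (between / (k - 1)) / (within / (n - k))


def select_k_best(
    columns: Dict[str, Sequence[float]], labels: Sequence[str], k: int
) -> List[str]:
    """Return the names of the k parameters with the highest F score."""
    scores = {name: anova_f(col, labels) for name, col in columns.items()}
    return sorted(scores, key=lambda name: scores[name], reverse=True)[:k]


# Toy data: "Pb" separates the two origins, "Zn" barely, "pH" not at all.
toy_columns = {
    "Pb": [1.0, 1.1, 5.0, 5.2],
    "Zn": [3.0, 4.0, 3.1, 4.1],
    "pH": [6.0, 6.1, 6.0, 6.1],
}
toy_labels = ["A", "A", "B", "B"]
selected = select_k_best(toy_columns, toy_labels, 2)
```

In the embodiment, `k` would be between 6 and 10, applied per kind of traditional Chinese medicine.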
Taking the linear discriminant model as an example: LDA (Linear Discriminant Analysis), also called the Fisher linear discriminant, is a supervised dimensionality reduction technique, i.e., every sample in its data set carries a class label. The linear discriminant model is established as follows:
Let the data set be $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$, where each sample $x_i$ is a $q$-dimensional vector and $y_i \in \{C_1, C_2, \ldots, C_k\}$, i.e., there are $k$ classes of samples in total. For $j = 1, 2, \ldots, k$, let $N_j$ be the number of samples of the $j$-th class, $X_j$ the set of samples of the $j$-th class, $\mu_j$ the mean vector of the samples of the $j$-th class, and $\Sigma_j$ the covariance matrix of the samples of the $j$-th class.
(1) Preprocess the sample data. Preprocessing consists of filtering invalid values, filling missing values, and standardization. Each missing value is filled with the median of the same column over the remaining samples from the same producing area. For standardization, the samples are first divided into training samples and test samples, and each value in the training samples is then standardized:
$$x'_{il} = \frac{x_{il} - \bar{x}_l}{s_l}$$
where $x_{il}$ is the value in the $l$-th column of the $i$-th sample of the data set $D$, $\bar{x}_l$ is the mean of the $l$-th column of $D$, and $s_l$ is the standard deviation of the $l$-th column of $D$.
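The filling and standardization steps of (1) can be sketched as below. Two points are assumptions rather than statements of the source: the population standard deviation is used, and the statistics fitted on the training samples are reused for any other samples. Function names are hypothetical.

```python
from statistics import fmean, median, pstdev
from typing import List, Optional, Sequence, Tuple


def fill_missing_with_origin_median(
    rows: Sequence[Sequence[Optional[float]]], origins: Sequence[str]
) -> List[List[float]]:
    """Fill each None with the column median over other samples of that origin."""
    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for col, value in enumerate(row):
            if value is None:
                peers = [
                    r[col]
                    for r, origin in zip(rows, origins)
                    if origin == origins[i] and r[col] is not None
                ]
                # median() raises StatisticsError if the origin has no valid peers.
                filled[i][col] = median(peers)
    return filled


def fit_standardizer(
    train: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Column means and standard deviations computed on the training samples."""
    columns = list(zip(*train))
    means = [fmean(c) for c in columns]
    stds = [pstdev(c) or 1.0 for c in columns]  # guard against constant columns
    return means, stds


def standardize(
    rows: Sequence[Sequence[float]], means: Sequence[float], stds: Sequence[float]
) -> List[List[float]]:
    """Apply (value - column mean) / column standard deviation."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]
```

Fitting on the training samples and then calling `standardize` on both splits keeps the test samples from influencing the statistics.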
(2) Assume that the projected low-dimensional space has dimension $d$, with corresponding basis vectors $(w_1, w_2, \ldots, w_d)$. These basis vectors form the matrix $W$, which is a $q \times d$ matrix.
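Define the inter-class divergence matrix as:
$$S_b = \sum_{j=1}^{k} N_j (\mu_j - \mu)(\mu_j - \mu)^T$$
where $\mu$ is the mean vector of all samples, and $N_j$ and $\mu_j$ are the number of samples and the mean vector of the $j$-th class.
Define the intra-class divergence matrix as:
$$S_w = \sum_{j=1}^{k} \sum_{x \in X_j} (x - \mu_j)(x - \mu_j)^T$$
where $X_j$ is the set of samples of the $j$-th class.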
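As a small worked illustration of the inter-class and intra-class divergence matrices, the following plain-Python sketch computes both for labelled samples. The function names and the toy data are hypothetical, and the formulas follow the standard LDA definitions (class-size-weighted outer products of mean differences, and summed outer products of deviations from the class mean).

```python
from collections import defaultdict
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def mean_vector(rows: Sequence[Sequence[float]]) -> List[float]:
    """Column-wise mean of a list of equally long vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def add_outer(target: Matrix, v: Sequence[float], scale: float = 1.0) -> None:
    """Add scale * v v^T to target in place."""
    for a, va in enumerate(v):
        for b, vb in enumerate(v):
            target[a][b] += scale * va * vb


def scatter_matrices(
    samples: Sequence[Sequence[float]], labels: Sequence[str]
) -> Tuple[Matrix, Matrix]:
    """Return (inter-class, intra-class) divergence matrices."""
    q = len(samples[0])
    groups = defaultdict(list)
    for x, y in zip(samples, labels):
        groups[y].append(x)
    overall = mean_vector(samples)
    s_b = [[0.0] * q for _ in range(q)]
    s_w = [[0.0] * q for _ in range(q)]
    for rows in groups.values():
        mu = mean_vector(rows)
        add_outer(s_b, [m - o for m, o in zip(mu, overall)], scale=len(rows))
        for x in rows:
            add_outer(s_w, [xi - m for xi, m in zip(x, mu)])
    return s_b, s_w


# Toy data: two origins with two 2-dimensional samples each.
toy_samples = [[0.0, 0.0], [2.0, 0.0], [4.0, 4.0], [6.0, 4.0]]
toy_classes = ["A", "A", "B", "B"]
toy_s_b, toy_s_w = scatter_matrices(toy_samples, toy_classes)
```

In the toy data the two classes differ along both axes but vary internally only along the first, so the intra-class matrix has a single non-zero entry.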
(3) The covariance of the projected points of samples of the same class should be as small as possible, and the projected points of samples of different classes should be as far apart as possible, i.e., the distances between class centers should be as large as possible. With $S_b$ and $S_w$ denoting the inter-class and intra-class divergence matrices, the optimization objective can therefore be written as:
$$J(W) = \frac{W^T S_b W}{W^T S_w W}$$
Since this function is difficult to solve, this embodiment uses the following alternative optimization objective:
$$\underset{W}{\arg\max}\; J(W) = \frac{\prod_{\mathrm{diag}} W^T S_b W}{\prod_{\mathrm{diag}} W^T S_w W}$$
where $\prod_{\mathrm{diag}} A$ denotes the product of the main diagonal elements of $A$, so that:
$$J(W) = \prod_{i=1}^{d} \frac{w_i^T S_b w_i}{w_i^T S_w w_i}$$
The closed-form solution for $W$ is the matrix composed of the eigenvectors corresponding to the $d$ largest non-zero generalized eigenvalues of $S_w^{-1} S_b$, which satisfies:
$$d \le k - 1$$
(4) LDA (the linear discriminant model) assumes that the sample data of each class follow a Gaussian distribution. Using the projected data, the mean and variance of each class can be computed by maximum likelihood estimation, which gives the Gaussian probability density function of that class.
(5) When a new sample arrives, this embodiment projects it with the already obtained $W$, substitutes the projected features into the Gaussian probability density function of each class, and computes the probability that the new sample belongs to each class, namely:
$$P(y = j \mid x) = \frac{P(x \mid y = j)\, P(y = j)}{\sum_{l=1}^{k} P(x \mid y = l)\, P(y = l)}$$
For linear discriminant analysis, $P(x \mid y = j)$ can be regarded as a multivariate Gaussian distribution with the following density:
$$P(x \mid y = j) = \frac{1}{(2\pi)^{d/2} |\Sigma_j|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu_j)^T \Sigma_j^{-1} (x - \mu_j)\right)$$
where $x$ here denotes the projected feature vector of dimension $d$, and $\mu_j$ and $\Sigma_j$ are the mean vector and covariance matrix of the projected data of class $j$.
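Step (5) can be illustrated for the simplest case, in which the data are projected onto a single dimension so that each class is described by a scalar mean and variance. The sketch below applies Bayes' rule to the class-conditional Gaussian densities; the function names, the class parameters and the equal priors are hypothetical.

```python
import math
from typing import Dict, Tuple


def gaussian_pdf(z: float, mean: float, variance: float) -> float:
    """Density of a one-dimensional Gaussian at z."""
    return math.exp(-((z - mean) ** 2) / (2.0 * variance)) / math.sqrt(
        2.0 * math.pi * variance
    )


def class_posteriors(
    z: float,
    class_params: Dict[str, Tuple[float, float]],  # origin -> (mean, variance)
    priors: Dict[str, float],
) -> Dict[str, float]:
    """Posterior probability of each origin for a projected sample z."""
    weighted = {
        origin: gaussian_pdf(z, mean, var) * priors[origin]
        for origin, (mean, var) in class_params.items()
    }
    total = sum(weighted.values())  # zero only if z is extremely far from all classes
    return {origin: w / total for origin, w in weighted.items()}


# Hypothetical projected class statistics for two origins.
toy_params = {"A": (0.0, 1.0), "B": (2.0, 1.0)}
toy_priors = {"A": 0.5, "B": 0.5}
midpoint = class_posteriors(1.0, toy_params, toy_priors)
near_a = class_posteriors(0.0, toy_params, toy_priors)
```

A sample projected exactly halfway between the two class means receives equal posteriors, while a sample at the mean of "A" is assigned to "A" with the larger probability.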
Preferably, referring to fig. 1, the above system for distinguishing the producing area of traditional Chinese medicine based on soil parameters further includes an upgrading module 4, and the upgrading module 4 is configured to update data of the identification and judgment module 2.
Referring to fig. 4, the upgrade module 4 allows the system to be upgraded periodically to obtain an updated version of the identification and judgment module. Meanwhile, users upload each analysis result to the cloud in desensitized (anonymized) form to support upgrading of the models in the identification and judgment module. New models are trained on the newly stored soil data together with the soil data already in the database, using supervised and semi-supervised learning. If a new model passes the robustness test and is confirmed for release, an iterative upgrade is published and awaits installation by the user.
Referring to fig. 5, the feedback module 3 mainly displays the analysis result and generates a text record of it. The text record can be downloaded locally or uploaded to the cloud.
On the other hand, referring to fig. 6, the embodiment of the invention also discloses a method for distinguishing the producing area of the traditional Chinese medicine based on the soil parameters, which comprises the following steps:
S1: data acquisition: acquiring traditional Chinese medicine type information input by a user, and measuring soil scraped from the traditional Chinese medicine to obtain a soil measurement result;
S2: data processing: performing preprocessing and feature selection on the soil measurement result, and performing discriminant analysis on the resulting data through a pre-constructed neural network model, a linear discriminant model and an XGBoost model to obtain the production place discrimination result of the corresponding type of traditional Chinese medicine; and
S3: result output: outputting the production place judgment result.
The data processing step specifically includes:
acquiring the traditional Chinese medicine type information and the soil measurement result, and preprocessing the soil measurement result and performing feature selection;
inputting all the preprocessed data into the neural network model to obtain a first prediction probability of the producing area of the traditional Chinese medicine;
inputting the preprocessed, feature-selected data into the linear discriminant model to obtain a second prediction probability of the producing area of the traditional Chinese medicine;
inputting the preprocessed, feature-selected data into the XGBoost model to obtain a prediction result of the producing area of the traditional Chinese medicine;
equally weighting the first prediction probability and the second prediction probability, comparing the equally weighted result with a preset threshold value, and obtaining the producing-area discrimination result in combination with the XGBoost prediction result.
Preferably, the method for distinguishing the producing area of the traditional Chinese medicine based on the soil parameters further comprises the following steps:
S4: Data upgrading: updating the data of the identification and judgment module.
The construction of the neural network model, the linear discriminant model and the XGBoost model in the above method is explained in detail below, taking sealwort (rhizoma polygonati) as an example:
A total of 144 rhizoma polygonati soil samples from nine producing areas (Lu'an, Anhui; Chizhou, Anhui; Huaihua, Hunan; Hanzhong, Shaanxi; Shangluo, Shaanxi; Changzhi, Shanxi; Fushun, Liaoning; Benxi, Liaoning; and Anshan, Liaoning) were randomly divided into a training set and a test set at a ratio of 3:1, giving 108 training samples and 36 test samples.
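The 3:1 random split can be illustrated with a short sketch. It uses only the Python standard library; the function name `split_indices` and the fixed seed are hypothetical, and the patent does not say whether the split was stratified by producing area, so a plain shuffle is assumed here.

```python
import random


def split_indices(n_samples, train_parts=3, test_parts=1, seed=0):
    """Randomly split sample indices into train/test sets by the given ratio.

    Assumption: a plain (non-stratified) shuffle; the source only states
    that the 144 samples were divided randomly at a 3:1 ratio.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = n_samples * train_parts // (train_parts + test_parts)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(144)
print(len(train_idx), len(test_idx))  # 108 36
```

With 144 samples the split yields the 108 training and 36 test samples stated above.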
First, outlier exclusion and normalization were performed on the 108 training samples. The 7 most relevant parameters were then selected by the SelectKBest method; in this case they are the elements Be, Mg, P, S, Ti, Na and Sr in the soil.
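A minimal sketch of the normalization and feature-selection step, assuming scikit-learn. The source names only SelectKBest; the use of `StandardScaler` for normalization, the `f_classif` score function and the synthetic data are assumptions made for illustration, and outlier exclusion is not shown.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the training set: 108 samples x 52 soil parameters,
# with integer labels 0..8 for the nine producing areas (hypothetical data).
X_train = rng.normal(size=(108, 52))
y_train = np.arange(108) % 9

# Normalization (assumed here to be zero-mean, unit-variance scaling).
X_scaled = StandardScaler().fit_transform(X_train)

# Keep the 7 highest-scoring parameters (score function is an assumption).
selector = SelectKBest(score_func=f_classif, k=7).fit(X_scaled, y_train)
X_selected = selector.transform(X_scaled)
selected_columns = selector.get_support(indices=True)

print(X_selected.shape)  # (108, 7)
```

On real measurements, `selected_columns` would index the retained soil parameters, which in the embodiment are Be, Mg, P, S, Ti, Na and Sr.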
All parameters are fed into the neural network model, whose input is a 52-dimensional floating-point vector. The optimization algorithm is RMSProp, the loss function is CategoricalCrossentropy, the number of training epochs is 500, and the batch size is 8. The remaining parameters are shown in Table 1 below:
Table 1. Neural network model setup parameters
[Table image not reproduced in the source text]
The 7 selected parameters are fed into the linear discriminant model, whose input is a 7-dimensional floating-point vector; the linear discriminant model is obtained using default settings.
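The linear discriminant step could look like the following sketch, which assumes scikit-learn's `LinearDiscriminantAnalysis` with default settings as a stand-in for the model described above. The synthetic 7-feature data and the integer area labels are hypothetical.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
n_areas, n_features = 9, 7

# Hypothetical training data: one Gaussian cluster per producing area.
y_train = np.arange(108) % n_areas
centers = rng.normal(scale=3.0, size=(n_areas, n_features))
X_train = centers[y_train] + rng.normal(size=(108, n_features))

# Default settings, as stated in the description.
lda = LinearDiscriminantAnalysis().fit(X_train, y_train)

# Class probabilities for one sample; the largest one plays the role of
# the "second prediction probability" in the description.
proba = lda.predict_proba(X_train[:1])[0]
predicted_area = int(lda.classes_[np.argmax(proba)])
second_probability = float(proba.max())
print(predicted_area, round(second_probability, 3))
```

`predict_proba` returns one probability per producing area, so the same vector can later be combined with the neural network output.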
Similarly, the 7 selected parameters are fed into the XGBoost classification model, whose input is a 7-dimensional floating-point vector. The parameters of the XGBoost classification model are shown in Table 2 below:
Table 2. XGBoost classification model setup parameters
[Table image not reproduced in the source text]
In this embodiment, the weighted voting and rationality judgment is carried out as follows. The equally weighted value is the sum of the first and second prediction probabilities, so it ranges from 0 to 2. If the equally weighted value is greater than or equal to 1.6, the predicted producing area is returned as a highly credible result. If the value is less than 1.6 but greater than 1.3 and the predicted producing area matches the prediction of the XGBoost model, the predicted producing area is returned as a highly credible result. If the value is between 1 and 1.3 and all three models predict the same producing area, the predicted producing area is returned as a highly credible result. If the probability given by a single model is greater than or equal to 0.9 and its prediction matches that of the XGBoost model, the predicted producing area is returned as a medium-to-high credibility result. In all other cases, the producing area with the higher probability is returned together with a warning of possible misjudgment. Finally, the prediction result is stored for later use.
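The voting rules above can be sketched in plain Python. This is an illustrative reading, not the patented implementation: the function name `judge_origin`, the dictionary inputs, the per-area summation of the two probability vectors, the inclusive bounds of the 1 to 1.3 band and the "uncertain" label for the fallback case are all assumptions.

```python
def judge_origin(nn_probs, lda_probs, xgb_area):
    """Combine three model outputs into (area, credibility, warn).

    nn_probs / lda_probs: dict mapping producing area -> probability.
    xgb_area: producing area predicted by the XGBoost classifier.
    """
    nn_area = max(nn_probs, key=nn_probs.get)
    lda_area = max(lda_probs, key=lda_probs.get)

    # Assumption: "equal weighting" is the per-area sum of both probabilities.
    areas = list(dict.fromkeys([*nn_probs, *lda_probs]))
    combined = {a: nn_probs.get(a, 0.0) + lda_probs.get(a, 0.0) for a in areas}
    area = max(combined, key=combined.get)
    value = combined[area]

    if value >= 1.6:
        return area, "high", False
    if 1.3 < value < 1.6 and area == xgb_area:
        return area, "high", False
    if 1.0 <= value <= 1.3 and nn_area == lda_area == xgb_area:
        return area, "high", False

    # A single confident model that agrees with XGBoost.
    for model_area, probs in ((nn_area, nn_probs), (lda_area, lda_probs)):
        if probs[model_area] >= 0.9 and model_area == xgb_area:
            return model_area, "medium-high", False

    # Fallback: return the area with the higher single-model probability
    # and flag a possible misjudgment.
    best = nn_area if nn_probs[nn_area] >= lda_probs[lda_area] else lda_area
    return best, "uncertain", True


# Two confident, agreeing models outweigh a dissenting XGBoost prediction.
result = judge_origin(
    {"Fushun": 0.91, "Other": 0.09},
    {"Fushun": 0.94, "Other": 0.06},
    "Other",
)
print(result)  # ('Fushun', 'high', False)
```

Returning a warning flag alongside the label keeps the "possible misjudgment" reminder separate from the credibility grade, which is one way to let a feedback module decide how to display it.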
Of the 36 test samples, samples 1, 7 and 35 are taken as examples; their parameter data are shown in Table 3 below:
Table 3. Parameter data of test samples 1, 7 and 35
Sample No.    Producing area       pH      Li        ……
1             Shangluo, Shaanxi    7.57    128.77    ……
7             Benxi, Liaoning      7.26    47.48     ……
35            Fushun, Liaoning     7.32    60.90     ……
For sample 1, the neural network model gives a 98% probability that it comes from Shangluo, Shaanxi, the linear discriminant model gives a 99% probability of the same producing area, and the XGBoost classification model also predicts Shangluo, Shaanxi. The producing area is therefore judged to be Shangluo, Shaanxi, and the result is highly credible.
For sample 7, the neural network model gives a 99% probability that it comes from Benxi, Liaoning, the linear discriminant model gives a 99% probability of the same producing area, and the XGBoost classification model also predicts Benxi, Liaoning. The producing area is therefore judged to be Benxi, Liaoning, and the result is highly credible.
For sample 35, the neural network model gives a 91% probability that it comes from Fushun, Liaoning, and the linear discriminant model gives a 94% probability of the same producing area, so the equally weighted value is 0.91 + 0.94 = 1.85, which is at least 1.6. Therefore, although the XGBoost classification model predicts that the sample comes from Shanxi Ching, the producing area is judged to be Fushun, Liaoning, and the result is highly credible.
The constructed neural network model, linear discriminant model and XGBoost model can then be used to determine the producing area from soil measurement results obtained in actual measurement.
In summary, the system and method for distinguishing the producing area of traditional Chinese medicine based on soil parameters disclosed in the embodiments of the present invention have the following advantages compared with the prior art:
1. A method for identifying the producing area of traditional Chinese medicine from the parameter composition of the soil attached to it is provided, which expands the available identification methods, increases the speed of producing-area identification and reduces its cost;
2. The system can determine the producing area of traditional Chinese medicine by measuring the parameters of the soil attached to it;
3. Based on the database of traditional Chinese medicine producing areas, and taking both discrimination accuracy and discrimination convenience into account, the relationship between soil parameters and producing areas is established by a combined discrimination method using a neural network model, a linear discriminant model and an XGBoost model, which provides theoretical support for discriminating the producing area of traditional Chinese medicine from attached soil.
The embodiments in this description are described in a progressive manner; each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be referred to one another. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant points can be found in the description of the method.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. A traditional Chinese medicine producing area distinguishing system based on soil parameters, characterized by comprising:
an input determination module, configured to acquire traditional Chinese medicine type information input by a user, and to measure soil scraped from the traditional Chinese medicine to obtain a soil measurement result;
an identification and judgment module, configured to acquire the traditional Chinese medicine type information and the soil measurement result, to preprocess the soil measurement result and perform feature selection, and to perform discriminant analysis on the preprocessed and feature-selected data using a pre-constructed neural network model, linear discriminant model and XGBoost model, to obtain the producing-area discrimination result for the corresponding type of traditional Chinese medicine; and
a feedback module, configured to output the producing-area discrimination result.
2. The system for distinguishing the producing area of the traditional Chinese medicine based on the soil parameters as claimed in claim 1, wherein the input determination module comprises a traditional Chinese medicine type input unit and a soil determination unit;
the traditional Chinese medicine type input unit is configured to acquire the traditional Chinese medicine type information input by a user, and the soil determination unit is configured to measure soil scraped from the traditional Chinese medicine to obtain the soil measurement result.
3. The system for distinguishing the producing area of the traditional Chinese medicine based on the soil parameters, characterized in that the input determination module further comprises a rationality judgment unit, which is configured to judge whether the soil measurement result is reasonable; if it is reasonable, the soil measurement result is transmitted to the identification and judgment module; if not, re-measurement is prompted.
4. The system for distinguishing the producing area of the traditional Chinese medicine based on the soil parameters as claimed in claim 1, wherein the identification and judgment module comprises:
a data acquisition unit, configured to acquire the traditional Chinese medicine type information and the soil measurement result;
a data preprocessing unit, configured to preprocess the soil measurement result and perform feature selection;
a neural network identification unit, configured to input all the preprocessed data into the neural network model to obtain a first prediction probability of the producing area of the traditional Chinese medicine;
a linear identification unit, configured to input the preprocessed, feature-selected data into the linear discriminant model to obtain a second prediction probability of the producing area of the traditional Chinese medicine;
an XGBoost identification unit, configured to input the preprocessed, feature-selected data into the XGBoost model to obtain a prediction result of the producing area of the traditional Chinese medicine; and
an analysis and judgment unit, configured to equally weight the first prediction probability and the second prediction probability, compare the equally weighted result with a preset threshold value, and obtain the producing-area discrimination result in combination with the XGBoost prediction result.
5. The system for distinguishing the producing area of the traditional Chinese medicine based on the soil parameters as claimed in claim 1, further comprising an upgrading module, wherein the upgrading module is used for updating the data of the identification and judgment module.
6. A traditional Chinese medicine producing area distinguishing method based on soil parameters is characterized by comprising the following steps:
data acquisition: acquiring traditional Chinese medicine type information input by a user, and measuring soil scraped from the traditional Chinese medicine to obtain a soil measurement result;
data processing: preprocessing the soil measurement result and performing feature selection, then performing discriminant analysis on the preprocessed and feature-selected data using a pre-constructed neural network model, linear discriminant model and XGBoost model, to obtain the producing-area discrimination result for the corresponding type of traditional Chinese medicine; and
result output: outputting the producing-area discrimination result.
7. The method for distinguishing the producing area of the traditional Chinese medicine based on the soil parameters as claimed in claim 6, wherein the data processing process specifically comprises:
acquiring the traditional Chinese medicine type information and the soil measurement result, and preprocessing the soil measurement result and performing feature selection;
inputting all the preprocessed data into the neural network model to obtain a first prediction probability of the producing area of the traditional Chinese medicine;
inputting the preprocessed, feature-selected data into the linear discriminant model to obtain a second prediction probability of the producing area of the traditional Chinese medicine;
inputting the preprocessed, feature-selected data into the XGBoost model to obtain a prediction result of the producing area of the traditional Chinese medicine;
and equally weighting the first prediction probability and the second prediction probability, comparing the equally weighted result with a preset threshold value, and obtaining the producing-area discrimination result in combination with the XGBoost prediction result.
8. The method for distinguishing the producing area of the traditional Chinese medicine based on the soil parameters as claimed in claim 6, further comprising:
data upgrading: updating the data of the identification and judgment module.
CN202110754881.6A 2021-07-05 2021-07-05 Traditional Chinese medicine producing area distinguishing system and method based on soil parameters Active CN113205161B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110754881.6A CN113205161B (en) 2021-07-05 2021-07-05 Traditional Chinese medicine producing area distinguishing system and method based on soil parameters

Publications (2)

Publication Number Publication Date
CN113205161A true CN113205161A (en) 2021-08-03
CN113205161B CN113205161B (en) 2021-12-03

Family

ID=77022695

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110754881.6A Active CN113205161B (en) 2021-07-05 2021-07-05 Traditional Chinese medicine producing area distinguishing system and method based on soil parameters

Country Status (1)

Country Link
CN (1) CN113205161B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116071091A (en) * 2023-03-16 2023-05-05 山东丰茂源认证服务有限公司 Pharmacy raw material traceability system based on Internet of things

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050044788A1 (en) * 2003-04-09 2005-03-03 Chung-Shih Tang Floating plant cultivation platform and method for growing terrestrial plants in saline water of various salinities for multiple purposes
CN105474963A (en) * 2015-12-15 2016-04-13 安徽省康君食品有限公司 Radix Salviae Miltiorrhizae planting method
CN106770617A (en) * 2017-04-10 2017-05-31 山东省分析测试中心 It is a kind of that the method that the place of production is traced to the source is carried out to the red sage root using trace element and rare earth element assay combination multi-variate statistical analysis
CN107677647A (en) * 2017-09-25 2018-02-09 重庆邮电大学 Chinese medicine place of production discrimination method based on principal component analysis and BP neural network
CN109523751A (en) * 2018-09-30 2019-03-26 康美中药材数据信息服务有限公司 A kind of production of crude drugs soil environment method for early warning, electronic equipment and storage medium
CN111667889A (en) * 2020-07-20 2020-09-15 山东中医药大学 Method for predicting content of quality marker in salvia miltiorrhiza

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
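ZHANGLIN NI et al.: "Identification of Geographical Origin of Honeysuckle (Lonicera Japonica Thunb) by Discriminant Analysis Using Rare Earth Elements", ANALYTICAL LETTERS *
张文丽 et al.: "Study on producing-area identification of Panax japonicus (Zhujieshen) based on stable isotope technology", Chinese Traditional and Herbal Drugs (《中草药》) *
王游游 et al.: "Study on producing-area characteristics and traceability discrimination of Cassia seed based on stable isotopes and mineral elements", Journal of Nuclear Agricultural Sciences (《核农学报》) *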

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116071091A (en) * 2023-03-16 2023-05-05 山东丰茂源认证服务有限公司 Pharmacy raw material traceability system based on Internet of things
CN116071091B (en) * 2023-03-16 2023-06-20 山东丰茂源认证服务有限公司 Pharmacy raw material traceability system based on Internet of things

Also Published As

Publication number Publication date
CN113205161B (en) 2021-12-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Xiao Linjie

Inventor after: Jiao Peng

Inventor before: Mou Songbo

GR01 Patent grant