CN113269196A

CN113269196A - Method for realizing hyperspectral medical component analysis of graph convolution neural network

Info

Publication number: CN113269196A
Application number: CN202110811547.XA
Authority: CN
Inventors: 王耀南; 尹阿婷; 毛建旭; 曾凯; 张辉; 朱青; 周显恩; 李亚萍; 赵禀睿; 陈煜嵘; 苏学叁
Original assignee: Hunan University
Current assignee: Hunan University
Priority date: 2021-07-19
Filing date: 2021-07-19
Publication date: 2021-08-17
Anticipated expiration: 2041-07-19
Also published as: CN113269196B; WO2023000653A1

Abstract

The invention discloses a method for implementing hyperspectral medical component analysis with a graph convolutional neural network. On the one hand, the medical hyperspectral image data is processed into graph data, which greatly reduces the number of pixels and effectively reduces the amount of data. On the other hand, a graph convolutional neural network model extracts the feature information of the drug, effectively learns the spatial relationship between the visual features in the medical hyperspectral image and the drug components, improves the representation ability of the classification features of the drug components, and improves the accuracy of the predicted composition and attributes of the tested drugs, enabling non-destructive and rapid detection and analysis of pharmaceutical composition and quality.

Description

A Realization Method of Hyperspectral Pharmaceutical Composition Analysis by Graph Convolutional Neural Network

Technical Field

The invention relates to the field of intelligent hyperspectral detection and analysis for high-end pharmaceuticals, and in particular to a method for implementing hyperspectral medical component analysis with a graph convolutional neural network. The method introduces graph convolutional neural network technology and can be used for non-destructive analysis of pharmaceutical composition and quality from hyperspectral data.

Background

Pharmaceutical safety bears on public health and economic development. It has become a livelihood and public safety issue of constant concern, and ensuring the quality and safety of medicines is of great significance to national and social stability. Existing quality testing methods for pharmaceutical ingredients, such as chemical testing and spectrophotometry, are only suitable for sampling inspection and are destructive, so they cannot meet the requirements of non-destructive testing of pharmaceutical quality. In recent years, near-infrared spectral detection has been widely used in pharmaceutical analysis; its spectral information is a robust "fingerprint"-like feature that can be used to quantify and classify different pharmaceutical ingredients. Spectral detection, as a safeguard for inspecting the quality of medicines, was included in the 2015 edition of the Chinese Pharmacopoeia, but it can only provide quantitative information on the components of the sample at the point illuminated by the light source and cannot analyze the overall composition of the drug. Therefore, there is an urgent need for a new, general and reliable spectral detection and analysis method for the quality of pharmaceutical ingredients.

Hyperspectral imaging can simultaneously acquire spectral and spatial information of the tested drug. The acquired data is rich in information, accurately reflects the overall properties of the tested drug, and meets the current need for non-destructive testing and analysis of the overall composition of drugs. Hyperspectral imaging combined with chemometric algorithms has been applied in the pharmaceutical field to the identification of medicinal materials and tablets, the detection of the distribution uniformity of active ingredients and excipients in solid tablets, and the monitoring of the composition and distribution of drug-loaded films, showing that hyperspectral technology can serve as an efficient non-destructive quality inspection tool. However, because medicines are diverse, their components are complex, and the volume of hyperspectral data is very large, chemometric methods have difficulty extracting effective feature information, and the prediction accuracy for the components and attributes of the tested drugs is low. Deep learning is good at discovering complex relationships in multi-dimensional data and is one of the best current methods for processing and analyzing massive data. Graph neural networks are a class of neural networks for processing graph-domain information. Because they offer strong interpretability for biomolecular structures and the functional relationships between molecules, they have received wide attention in fields such as brain science, medical diagnosis, and drug discovery and research. Graph neural networks learn the spatial features of topological data structures well, but they are difficult to apply directly to component analysis of medical hyperspectral images. Therefore, to address the wide variety of medicines and the difficulty of analyzing complex drug compositions, it is urgently necessary to explore the visual information in medical hyperspectral images in depth and combine it with the spatial features of the drugs under test to improve the accuracy of drug composition analysis.

Summary of the Invention

In view of this, the present invention proposes a method for implementing hyperspectral medical component analysis with a graph convolutional neural network. By learning the spectral features of drugs and the spatial distribution features of their active ingredients in medical hyperspectral images, the method achieves non-destructive drug component analysis and rapid quality detection.

In one aspect, the present invention provides a method for implementing hyperspectral medical component analysis with a graph convolutional neural network, comprising the following steps:

Step 1. Acquire medical hyperspectral images and construct a medical hyperspectral dataset, which includes a training set and a test set;

Step 2. Segment the medical hyperspectral images in the training set with a superpixel segmentation algorithm to obtain non-overlapping superpixels, which constitute the medical hyperspectral superpixel set;

Step 3. For each superpixel, compute the pixel mean, centroid pixel position, perimeter, area and region azimuth, together with the distances from the centroid pixel to the boundary of the superpixel region, as feature parameters, and construct the feature matrix of the graph data;

Step 4. Take each superpixel as a graph node and its nearest-neighbor superpixels as edges to construct a region adjacency graph, and obtain the adjacency weight matrix of the graph data;

Step 5. Input the feature matrix, the adjacency weight matrix and the medical hyperspectral component labels corresponding to the medical hyperspectral images in the training set into the graph convolutional neural network for training, and obtain the model parameters of the graph convolutional neural network;

Step 6. Repeat Steps 2 to 4 for the medical hyperspectral images in the test set to obtain the region adjacency graph to be analyzed for drug components, together with its feature matrix and adjacency weight matrix. Input the feature matrix and adjacency weight matrix obtained from the test set into the graph convolutional neural network model initialized with the model parameters trained in Step 5 to obtain the drug component analysis result.

Further, Step 1 specifically includes the following process:

Step 1.1. Prepare drug samples of seven kinds: cefprozil tablets, oxytetracycline tablets, chlorpheniramine maleate tablets, furosemide tablets, aspirin enteric-coated tablets, erythromycin ethylsuccinate tablets, and Callicarpa nudiflora dispersible tablets;

Step 1.2. Acquire medical hyperspectral images and construct the medical hyperspectral dataset: a hyperspectral sorter is used to acquire medical hyperspectral images of the drug samples, reflectance correction is applied to the acquired images, and the corrected images are used as the samples of the medical hyperspectral dataset;

Step 1.3. Randomly divide the medical hyperspectral dataset into a training set and a test set. Each sample of the dataset, of the training set and of the test set consists of the image of the i-th sample and the drug component label corresponding to that sample; d denotes the total number of samples in the medical hyperspectral dataset, s the total number of samples in the training set, and m the total number of samples in the test set.
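The reflectance correction mentioned in Step 1.2 is not spelled out in the text. The following is a minimal sketch, assuming the common white/dark reference calibration R = (raw - dark) / (white - dark) applied per band; the function name `reflectance_correct` and the clipping to [0, 1] are illustrative assumptions, not part of the described method.

```python
def reflectance_correct(raw, white, dark, eps=1e-12):
    """Assumed white/dark reference calibration for one pixel spectrum.

    raw, white, dark: equal-length lists of band intensities.
    Returns reflectance values clipped to the range [0, 1].
    """
    if not (len(raw) == len(white) == len(dark)):
        raise ValueError("all spectra must have the same number of bands")
    corrected = []
    for r, w, d in zip(raw, white, dark):
        denom = max(w - d, eps)  # guard against division by zero
        value = (r - d) / denom
        corrected.append(min(1.0, max(0.0, value)))
    return corrected


# Three bands of a single pixel
print(reflectance_correct([60.0, 110.0, 10.0],
                          [110.0, 210.0, 110.0],
                          [10.0, 10.0, 10.0]))  # [0.5, 0.5, 0.0]
```

In practice the same correction would be applied to every pixel of the hyperspectral cube before the images are added to the dataset.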

Further, K-fold cross-validation is used to divide the medical hyperspectral dataset of Step 1.3 into the training set and the test set.
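The K-fold partition can be sketched as follows. This is an illustrative, standard-library-only version; the function name `k_fold_splits` and the fixed random seed are assumptions made for the example, and the sample indices stand in for the image/label pairs of the dataset.

```python
import random


def k_fold_splits(num_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    if not 2 <= k <= num_samples:
        raise ValueError("k must be between 2 and num_samples")
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    # Distribute the shuffled samples as evenly as possible over the folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


for train, test in k_fold_splits(num_samples=10, k=5):
    print(len(train), len(test))  # prints "8 2" for each of the 5 folds
```

Each sample appears in exactly one test fold, so every image is used for testing once and for training K - 1 times.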

Further, Step 2 is specifically as follows: the SLIC algorithm is used to segment the medical hyperspectral images in the training set. By computing the spatial distance and the spectral distance between pixels, the superpixel cluster centers and boundary ranges are updated iteratively, and the iteration stops when the error between the new and old cluster centers falls below a preset threshold. This yields non-overlapping superpixels, which constitute the medical hyperspectral superpixel set V; the i-th element of V is the i-th superpixel, and N is the number of non-overlapping superpixels.
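The two ingredients of the SLIC iteration described above, the combined distance and the stopping test, can be sketched as follows. The combined measure follows the usual SLIC form, sqrt(d_spec^2 + (d_xy / S)^2 * m^2), with grid step S and compactness m. Using the summed center movement as the "error" between old and new centers is an assumption, and all function and parameter names are illustrative.

```python
import math


def slic_distance(spec_a, pos_a, spec_b, pos_b, grid_step, compactness):
    """Combined spectral + spatial distance between a pixel and a cluster center."""
    d_spec = math.sqrt(sum((a - b) ** 2 for a, b in zip(spec_a, spec_b)))
    d_xy = math.hypot(pos_a[0] - pos_b[0], pos_a[1] - pos_b[1])
    return math.sqrt(d_spec ** 2 + (d_xy / grid_step) ** 2 * compactness ** 2)


def centers_converged(old_centers, new_centers, threshold):
    """Assumed stop criterion: total movement of the centers is below threshold."""
    shift = sum(math.hypot(o[0] - n[0], o[1] - n[1])
                for o, n in zip(old_centers, new_centers))
    return shift < threshold


# Identical spectra, 5 pixels apart, grid step 5, compactness 10
print(slic_distance([0.2, 0.4], (0, 0), [0.2, 0.4], (3, 4), 5, 10))  # 10.0
print(centers_converged([(0.0, 0.0)], [(0.1, 0.0)], threshold=0.5))  # True
```

A larger compactness value weights spatial proximity more heavily and yields more regular superpixels; a smaller value lets superpixels follow spectral boundaries more closely.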

Further, Step 3 is specifically as follows: for each superpixel obtained in Step 2, obtain its pixel mean, centroid pixel position, perimeter, area and region azimuth, as well as the distances from the centroid pixel to the boundary of the superpixel region in eight directions (east, south, west, north, southeast, northeast, southwest and northwest). These parameters form the feature matrix $X \in \mathbb{R}^{N \times M}$, where N is the number of superpixels, M is the feature dimension, and $\mathbb{R}$ denotes the set of real numbers.
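A minimal sketch of how one row of the feature matrix could be assembled from a label map is shown below. It covers only the pixel mean, centroid pixel position, area and the eight directional distances; perimeter and region azimuth are omitted for brevity. Counting distances in pixel steps (a diagonal step counts as 1) and rounding the centroid to the nearest pixel are assumptions, as are all names.

```python
DIRECTIONS = {  # (row step, column step); rows grow southward
    "E": (0, 1), "S": (1, 0), "W": (0, -1), "N": (-1, 0),
    "SE": (1, 1), "NE": (-1, 1), "SW": (1, -1), "NW": (-1, -1),
}


def superpixel_features(labels, image, target):
    """Return [band means..., centroid_row, centroid_col, area, 8 distances]."""
    pixels = [(r, c) for r, row in enumerate(labels)
              for c, lab in enumerate(row) if lab == target]
    if not pixels:
        raise ValueError("label not present in the label map")
    area = len(pixels)
    bands = len(image[0][0])
    mean = [sum(image[r][c][b] for r, c in pixels) / area for b in range(bands)]
    cr = round(sum(r for r, _ in pixels) / area)  # centroid pixel row
    cc = round(sum(c for _, c in pixels) / area)  # centroid pixel column
    members = set(pixels)
    dists = []
    for dr, dc in DIRECTIONS.values():
        steps, r, c = 0, cr, cc
        # Walk from the centroid until the next pixel leaves the superpixel.
        while (r + dr, c + dc) in members:
            r, c, steps = r + dr, c + dc, steps + 1
        dists.append(steps)
    return mean + [cr, cc, area] + dists


labels = [[1, 1, 1, 0],
          [1, 1, 1, 0],
          [1, 1, 1, 0]]
image = [[[2.0] if lab == 1 else [0.0] for lab in row] for row in labels]
print(superpixel_features(labels, image, target=1))
# [2.0, 1, 1, 9, 1, 1, 1, 1, 1, 1, 1, 1]
```

Stacking one such row per superpixel gives an N x M matrix, with M equal to the number of bands plus the geometric features used.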

Further, the adjacency weight matrix in Step 4 is obtained through the following steps:

Step 4.1. Based on the medical hyperspectral superpixel set V obtained in Step 2, take each superpixel in the set as a graph node, and use the K-nearest-neighbor algorithm to select the K superpixels nearest to each superpixel to construct edges, thereby forming the region adjacency graph G;

Step 4.2. For each superpixel region in the set V obtained in Step 2, collect its adjacent superpixels to obtain the set of adjacent superpixels;

Step 4.3. Using the pixel mean of each superpixel obtained in Step 3, compute the pixel mean distance between superpixels;

Step 4.4. Using the centroid pixel position of each superpixel obtained in Step 3, compute the centroid coordinate distance between superpixels;

Step 4.5. From the pixel mean distance obtained in Step 4.3 and the centroid coordinate distance obtained in Step 4.4, compute the adjacency weight matrix A.
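The K-nearest-neighbor edge construction of Step 4.1 can be sketched as follows. The text does not state which distance the neighbor search uses; this example assumes the Euclidean distance between superpixel centroids and treats edges as undirected. The function name `knn_edges` is illustrative.

```python
import math


def knn_edges(centroids, k):
    """Connect every node to its k nearest nodes (assumed: centroid distance)."""
    n = len(centroids)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of nodes")
    edges = set()
    for i, (xi, yi) in enumerate(centroids):
        # Sort the other nodes by distance; ties are broken by node index.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (math.hypot(xi - centroids[j][0],
                                      yi - centroids[j][1]), j),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))  # store each undirected edge once
    return sorted(edges)


print(knn_edges([(0, 0), (1, 0), (5, 0), (6, 0)], k=1))  # [(0, 1), (2, 3)]
```

Because an edge is added whenever either endpoint selects the other, a node can end up with more than K neighbors in the resulting graph.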

Further, Step 5 is implemented through the following steps:

Step 5.1. Initialize the model parameters of the graph convolutional neural network model with the Xavier method;

Step 5.2. From the region adjacency graph G constructed in Step 4, compute the degree matrix D of the graph nodes;

Step 5.3. The feature H of each graph convolutional (GCN) layer in the graph convolutional neural network model is calculated by Equation (4), in which W is a learnable weight parameter matrix and the nonlinearity is an activation function; for l = 0, $H^{(0)} = X$, where X is the feature matrix;

Step 5.4. In the training phase, W is adjusted through graph convolution and differentiable pooling operations to continuously reduce the error and thereby optimize the output. The loss function L is calculated by Equation (5), which involves the true label of each training sample, with s the number of training samples;

Step 5.5. The model parameters of the whole graph convolutional neural network model are adjusted by backpropagation according to the gradient of the loss function L and are used as the network initialization parameters of Step 5.1. Steps 5.1 to 5.5 are iterated until the accuracy of the model's drug component analysis stabilizes.
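Equation (4) itself did not survive in the text. The sketch below therefore assumes the widely used GCN propagation rule H' = ReLU(D^-1/2 (A + I) D^-1/2 H W), with D the degree matrix of A + I, together with Xavier (Glorot) uniform initialization for W. The self-loops, the ReLU activation and all function names are assumptions made for illustration.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    """Xavier/Glorot uniform initialization of a fan_in x fan_out matrix."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def normalized_adjacency(adj):
    """Assumed normalization: D^-1/2 (A + I) D^-1/2, D = degrees of A + I."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def gcn_layer(adj, h, w):
    """One propagation step: ReLU(A_norm @ H @ W)."""
    z = matmul(matmul(normalized_adjacency(adj), h), w)
    return [[max(0.0, v) for v in row] for row in z]


# Two connected nodes, one input feature, identity weight
print(gcn_layer([[0.0, 1.0], [1.0, 0.0]], [[1.0], [3.0]], [[1.0]]))
# [[2.0], [2.0]]

w0 = xavier_uniform(4, 2, random.Random(0))  # 4 input features, 2 hidden units
print(len(w0), len(w0[0]))  # 4 2
```

With self-loops added, each node's new feature is a degree-normalized average over itself and its neighbors, which is why both nodes in the example end up at 2.0.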

Further, the pixel mean distance between superpixels in Step 4.3 is calculated by Equation (1), whose two arguments are the pixel mean of the i-th superpixel and the pixel mean of the j-th superpixel.

Further, the centroid coordinate distance between superpixels in Step 4.4 is calculated by Equation (2), which is expressed in terms of the centroids of the i-th and j-th superpixels, that is, the abscissa and ordinate of the i-th superpixel centroid and the abscissa and ordinate of the j-th superpixel centroid.
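Equation (2) is not legible in the text. Given that it is defined over the abscissa and ordinate of two centroids, a natural reading is the Euclidean distance between the centroid coordinates. The symbols below are assumed for illustration, with $(x_i, y_i)$ and $(x_j, y_j)$ the centroid coordinates of the i-th and j-th superpixels:

```latex
d_c(i, j) = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}
```

For example, centroids at (0, 0) and (3, 4) would give $d_c = \sqrt{9 + 16} = 5$.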

Further, the adjacency weight matrix A in Step 4.5 is calculated by Equation (3).
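Equations (1) and (3) are not legible in the text, so the exact way the two distances are combined is unknown. The sketch below is only one plausible instantiation: it assumes Euclidean distances for both the pixel means and the centroids, and a Gaussian-kernel weight exp(-(d_p / sigma_p)^2 - (d_c / sigma_c)^2) on each edge. The kernel form, the scale parameters `sigma_p` and `sigma_c`, and the function names are all hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def adjacency_weights(means, centroids, edges, sigma_p=1.0, sigma_c=1.0):
    """Build a symmetric weight matrix over the given undirected edges."""
    n = len(means)
    adj = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        d_p = euclidean(means[i], means[j])          # pixel mean distance
        d_c = euclidean(centroids[i], centroids[j])  # centroid distance
        # ASSUMED combination: Gaussian kernel on both distances.
        weight = math.exp(-(d_p / sigma_p) ** 2 - (d_c / sigma_c) ** 2)
        adj[i][j] = adj[j][i] = weight
    return adj


means = [[0.5, 0.5], [0.5, 0.5], [0.9, 0.1]]
centroids = [(0.0, 0.0), (0.0, 0.0), (4.0, 3.0)]
A = adjacency_weights(means, centroids, edges=[(0, 1), (1, 2)], sigma_c=5.0)
print(A[0][1])             # 1.0 (identical mean and centroid)
print(A[0][2])             # 0.0 (no edge between nodes 0 and 2)
print(A[1][2] == A[2][1])  # True
```

Under this assumed form, spectrally similar and spatially close superpixels receive weights near 1, and node pairs without an edge keep weight 0.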

Therefore, in the method for implementing hyperspectral medical component analysis with a graph convolutional neural network provided by the present invention, first, medical hyperspectral images are acquired and a medical hyperspectral dataset including a training set and a test set is constructed. Second, a superpixel segmentation algorithm is used to segment the medical hyperspectral images in the training set into non-overlapping superpixels. Then, the pixel mean, centroid pixel position, perimeter, area and region azimuth of each superpixel, together with the distances from the centroid pixel to the boundary of the superpixel region, are computed as feature parameters to construct the feature matrix of the graph data. Next, each superpixel is taken as a graph node and its nearest-neighbor superpixels as edges to construct a region adjacency graph and obtain the adjacency weight matrix of the graph data. After that, the feature matrix, the adjacency weight matrix and the medical hyperspectral component labels corresponding to the medical hyperspectral images in the training set are input into the graph convolutional neural network for training to obtain its model parameters. Finally, Steps 2 to 4 are repeated for the medical hyperspectral images in the test set to obtain the region adjacency graph to be analyzed for drug components, together with its feature matrix and adjacency weight matrix, which are input into the graph convolutional neural network model initialized with the model parameters trained in Step 5 to obtain the drug component analysis result. Compared with the prior art, on the one hand, the present invention processes the medical hyperspectral image data into graph data, which greatly reduces the number of pixels and effectively reduces the amount of data. On the other hand, it extracts the feature information of the drug with a graph convolutional neural network model, effectively learns the spatial relationship between the visual features in the medical hyperspectral image and the drug components, improves the representation ability of the classification features of the drug components, and improves the accuracy of the predicted composition and attributes of the tested drugs. It thereby addresses the difficulties posed by the variety of medicines, their complex compositions and their differing physical properties, and achieves non-destructive drug component analysis and rapid quality detection.

Description of the Drawings

The accompanying drawings, which form a part of the present invention, are provided for further understanding of the invention. The exemplary embodiments of the invention and their descriptions serve to explain the invention and do not constitute an improper limitation of it. In the drawings:

FIG. 1 is a flowchart of a method for implementing hyperspectral medical component analysis with a graph convolutional neural network according to Embodiment 1 of the present invention;

FIG. 2 is a flowchart of a method for implementing hyperspectral medical component analysis with a graph convolutional neural network according to Embodiment 2 of the present invention;

FIG. 3 is a flowchart of the process of obtaining the adjacency weight matrix in an embodiment of the present invention;

FIG. 4 is a schematic diagram of the structural framework of the graph convolutional neural network model according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of some samples of the hyperspectral medical component analysis dataset according to an embodiment of the present invention.

Detailed Description

It should be noted that, where no conflict arises, the embodiments of the present invention and the features of the embodiments may be combined with each other. The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.

FIG. 1 is a flowchart of a method for implementing hyperspectral medical component analysis with a graph convolutional neural network according to Embodiment 1 of the present invention. As shown in FIG. 1, the method is implemented through the following steps:

Step 1. Acquire medical hyperspectral images and construct a medical hyperspectral dataset, which includes a training set and a test set;

Step 2. Segment the medical hyperspectral images in the training set with a superpixel segmentation algorithm to obtain non-overlapping superpixels, which constitute the medical hyperspectral superpixel set;

In the present invention, first, medical hyperspectral images are acquired and a medical hyperspectral dataset including a training set and a test set is constructed. Second, a superpixel segmentation algorithm is used to segment the medical hyperspectral images in the training set into non-overlapping superpixels. Then, the pixel mean, centroid pixel position, perimeter, area and region azimuth of each superpixel, together with the distances from the centroid pixel to the boundary of the superpixel region, are computed as feature parameters to construct the feature matrix of the graph data. Next, each superpixel is taken as a graph node and its nearest-neighbor superpixels as edges to construct a region adjacency graph and obtain the adjacency weight matrix of the graph data. After that, the feature matrix, the adjacency weight matrix and the medical hyperspectral component labels corresponding to the medical hyperspectral images in the training set are input into the graph convolutional neural network for training to obtain its model parameters. Finally, Steps 2 to 4 are repeated for the medical hyperspectral images in the test set to obtain the region adjacency graph to be analyzed for drug components, together with its feature matrix and adjacency weight matrix, which are input into the graph convolutional neural network model initialized with the model parameters trained in Step 5 to obtain the drug component analysis result. Compared with the prior art, the present invention can accurately analyze the different components of drug samples in medical hyperspectral images, addresses the difficulties posed by the variety of medicines, their complex compositions and their differing physical properties, and achieves non-destructive drug component analysis and rapid quality detection.

Referring to FIGS. 2 to 4, FIG. 2 is a flowchart of a method for implementing hyperspectral medical component analysis with a graph convolutional neural network according to Embodiment 2 of the present invention; FIG. 3 is a flowchart of the process of obtaining the adjacency weight matrix in an embodiment of the present invention; FIG. 4 is a schematic diagram of the structural framework of the graph convolutional neural network model according to an embodiment of the present invention.

A method for implementing hyperspectral medical component analysis with a graph convolutional neural network comprises the following steps:

Step 1.1. Prepare a variety of different drug samples;

It should be noted that in this embodiment, seven drug samples are used for the experiments: cefprozil tablets, oxytetracycline tablets, chlorpheniramine maleate tablets, furosemide tablets, aspirin enteric-coated tablets, erythromycin ethylsuccinate tablets, and Callicarpa nudiflora dispersible tablets. The number and types of drugs are not limited to these. FIG. 5 shows some samples from the hyperspectral medical component analysis dataset for cefprozil tablets, chlorpheniramine maleate tablets and Callicarpa nudiflora dispersible tablets: (a) shows a sample of Callicarpa nudiflora dispersible tablets, (b) a sample of cefprozil tablets, and (c) a sample of chlorpheniramine maleate tablets.

Step 1.2. Acquire medical hyperspectral images and construct the medical hyperspectral dataset;

It should be noted that the hyperspectral sorter used in the above process is preferably the Sichuan Shuanglihe Spectrum hyperspectral sorter (V10E, N25E-SWIR), with spectral ranges of 400-1000 nm and 1000-2500 nm, respectively;

Step 1.3. Randomly divide the medical hyperspectral dataset into a training set and a test set. Each sample of the dataset, of the training set and of the test set consists of the image of the i-th sample and the drug component label corresponding to that sample; d denotes the total number of samples in the medical hyperspectral dataset, s the total number of samples in the training set, and m the total number of samples in the test set;

Preferably, Step 2 is specifically as follows: the SLIC (Simple Linear Iterative Clustering) algorithm is used to segment the medical hyperspectral images in the training set. By computing the spatial distance and the spectral distance between pixels, the superpixel cluster centers and boundary ranges are updated iteratively, and the iteration stops when the error between the new and old cluster centers falls below a preset threshold. This yields non-overlapping superpixels, which constitute the medical hyperspectral superpixel set V; the i-th element of V is the i-th superpixel, and N is the number of non-overlapping superpixels;

Step 3. For each superpixel, compute the pixel mean, centroid pixel position, perimeter, area and region azimuth, together with the distances from the centroid pixel to the boundary of the superpixel region, as feature parameters, and construct the feature matrix of the graph data;

Specifically, for each superpixel obtained in Step 2, obtain its pixel mean, centroid pixel position, perimeter, area and region azimuth, as well as the distances from the centroid pixel to the boundary of the superpixel region. These parameters form the feature matrix $X \in \mathbb{R}^{N \times M}$, where N is the number of superpixels, M is the feature dimension, and $\mathbb{R}$ denotes the set of real numbers;

Step 4. Take each superpixel as a graph node and its nearest-neighbor superpixels as edges to construct a region adjacency graph, and obtain the adjacency weight matrix of the graph data. Specifically, referring to FIG. 3, this step is decomposed into the following process:

Step 4.1. Based on the medical hyperspectral superpixel set V obtained in Step 2, take each superpixel in the set as a graph node, and use the K-nearest-neighbor algorithm to select the K superpixels nearest to each superpixel to construct edges, thereby forming the region adjacency graph G; here, K is 8;

Step 4.2. For each superpixel region in the set V obtained in Step 2, collect its adjacent superpixels to obtain the set of adjacent superpixels;

Step 4.3. Using the pixel mean of each superpixel obtained in Step 3, compute the pixel mean distance between superpixels. This distance is calculated by Equation (1), whose two arguments are the pixel mean of the i-th superpixel and the pixel mean of the j-th superpixel;

Step 4.4. Using the centroid pixel position of each superpixel obtained in Step 3, compute the centroid coordinate distance between superpixels. This distance is calculated by Equation (2), which is expressed in terms of the centroids of the i-th and j-th superpixels, that is, the abscissa and ordinate of the i-th superpixel centroid and the abscissa and ordinate of the j-th superpixel centroid;

Step 4.5. From the pixel mean distance between superpixels obtained in step 4.3 and the centroid coordinate distance between superpixels obtained in step 4.4, calculate the adjacency weight matrix A, which is given by the following formula:

[formula image not reproduced] (3)
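Formula (3) cannot be recovered from this text, so the sketch below does not claim to reproduce it. It shows one common way to turn two distances into an edge weight: a Gaussian kernel over both distances, applied only to graph edges. The kernel form, the bandwidths `sigma_pixel` and `sigma_centroid`, and the symmetrization are all assumptions made for illustration.

```python
import math

def adjacency_weights(d_pixel, d_centroid, neighbours,
                      sigma_pixel=1.0, sigma_centroid=1.0):
    """Build a symmetric weighted adjacency matrix A (assumed Gaussian form).

    d_pixel, d_centroid: N x N distance matrices (lists of lists).
    neighbours: neighbours[i] lists the superpixels joined to i by an edge.
    """
    n = len(neighbours)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbours[i]:
            w = math.exp(-(d_pixel[i][j] / sigma_pixel) ** 2
                         - (d_centroid[i][j] / sigma_centroid) ** 2)
            # Write both entries so A stays symmetric even though the
            # K-nearest-neighbor relation itself is not.
            a[i][j] = w
            a[j][i] = w
    return a

# Toy example: 3 superpixels on a chain 0 - 1 - 2.
toy_d_pixel = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
toy_d_centroid = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
toy_edges = [[1], [0], [1]]
A = adjacency_weights(toy_d_pixel, toy_d_centroid, toy_edges)
```

Under this assumed form, weights lie in (0, 1] and shrink as two superpixels become spectrally or spatially more distant; pairs with no edge keep weight 0.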

Step 5. Input the feature matrix, the adjacency weight matrix and the medical hyperspectral component labels corresponding to the medical hyperspectral images in the training set into the graph convolutional neural network for training, and obtain the model parameters of the graph convolutional neural network. Fig. 4 is a schematic diagram of the structural framework of the graph convolutional neural network model according to an embodiment of the invention.

As a preferred embodiment of the invention, K-fold cross-validation is used to divide the medical hyperspectral data set of step 1.3 into a training set and a test set, with K = 10.
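A K-fold split with K = 10 can be sketched as below. This is an illustration only: the function name `k_fold_splits`, the fixed shuffle seed and the sample count are assumptions, and the patent does not say whether the folds are stratified by label.

```python
import random

def k_fold_splits(num_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    # Fold f takes every k-th shuffled index starting at offset f,
    # so fold sizes differ by at most one.
    folds = [indices[f::k] for f in range(k)]
    for f in range(k):
        test = folds[f]
        train = [idx for g, fold in enumerate(folds) if g != f
                 for idx in fold]
        yield train, test

# Toy example: 25 samples split into 10 folds.
splits = list(k_fold_splits(num_samples=25, k=10))
```

Each sample appears in exactly one test fold, so every image is used for testing once and for training in the other nine rounds.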

In a further technical solution, step 5, in which the feature matrix, the adjacency weight matrix and the medical hyperspectral component labels corresponding to the medical hyperspectral images in the training set are input into the graph convolutional neural network for training to obtain its model parameters, is implemented through the following steps:

Step 5.1. Initialize the model parameters of the graph convolutional neural network model with the Xavier method. The Xavier method is an effective way to initialize neural network parameters; its main aim is to keep the variance of each layer's output as nearly equal as possible.

Step 5.2. From the region adjacency graph G constructed in step 4, calculate the degree matrix D of the graph nodes.
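The Xavier initialization named in step 5.1 can be sketched as follows. The patent does not say whether the uniform or the normal variant is used; this sketch assumes the uniform variant, which draws weights from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)). The layer sizes are hypothetical.

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng=None):
    """Return a fan_in x fan_out weight matrix drawn from U(-a, a),
    where a = sqrt(6 / (fan_in + fan_out))."""
    rng = rng or random.Random(0)
    bound = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-bound, bound) for _ in range(fan_out)]
            for _ in range(fan_in)]

# Hypothetical sizes: 13 input features per node, 16 hidden units.
W0 = xavier_uniform(fan_in=13, fan_out=16)
```

Scaling the range by both fan-in and fan-out is what keeps the variance of activations and of gradients roughly constant from layer to layer.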

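Step 5.3. Calculate the feature H of each layer of the graph convolutional neural network (GCN) in the model according to the following formula:

[formula image not reproduced] (4)

In the formula, W is the learnable weight parameter matrix, the nonlinearity is an activation function, l is the layer index, and the l = 0 case is defined from the feature matrix X.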
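Formula (4) is not legible here, but the quantities it names (a degree matrix D, a learnable W, an activation function, layer features H, and X at l = 0) match the widely used GCN propagation rule H(l+1) = act(D^-1/2 (A + I) D^-1/2 H(l) W(l)). The sketch below implements that rule as an assumption; the added self-loops, the symmetric normalization and the ReLU activation are not confirmed by the text.

```python
import math

def matmul(p, q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p[i][t] * q[t][j] for t in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def normalized_adjacency(a):
    """Return D^-1/2 (A + I) D^-1/2, with D the degree matrix of A + I."""
    n = len(a)
    a_tilde = [[a[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_tilde]
    return [[inv_sqrt_deg[i] * a_tilde[i][j] * inv_sqrt_deg[j]
             for j in range(n)] for i in range(n)]

def gcn_layer(a, h, w):
    """One propagation step: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(normalized_adjacency(a), h), w)
    return [[max(0.0, v) for v in row] for row in z]

# Toy example: 2 nodes joined by one edge, 2 features per node.
A = [[0.0, 1.0], [1.0, 0.0]]
X = [[1.0, 0.0], [0.0, 1.0]]   # H at layer 0 is the feature matrix X
W = [[1.0, -1.0], [1.0, 1.0]]
H1 = gcn_layer(A, X, W)
```

In the toy example each node averages its own features with its neighbour's before the linear map, which is how the layer mixes information along graph edges.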

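Step 5.4. In the training phase, W is adjusted through graph convolution and micro-pooling so that the error is continually reduced and the output optimized. The loss function is calculated by:

[formula image not reproduced] (5)

Formula (5) is expressed in terms of the true label of each training sample, where s is the number of training samples and L is the loss function. L is the cross-entropy loss, calculated as:

[formula image not reproduced] (6)

Formula (6) is expressed in terms of the true component and the predicted component of each training sample, where s is the number of samples.

Step 5.5. According to the gradient of the loss function L, adjust the model parameters of the whole graph convolutional neural network model through back propagation, take the adjusted parameters as the network initialization parameters of step 5.1, and iterate steps 5.1 to 5.5 until the model's analysis accuracy for drug components stabilizes.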
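The cross-entropy loss over s training samples can be sketched as follows. Because formulas (5) and (6) are not legible in this text, the exact form is an assumption: the sketch uses the standard mean of the negative log-probability assigned to each sample's true class, with a softmax producing the predicted class distribution.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one list of class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, labels):
    """Mean cross-entropy over s samples.

    probs: s predicted class distributions.
    labels: s true class indices.
    """
    s = len(labels)
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / s

# Toy example: 2 samples, 2 component classes.
toy_probs = [softmax([2.0, 0.0]), softmax([0.0, 0.0])]
toy_loss = cross_entropy(toy_probs, [0, 1])
```

The loss is zero only when every sample's true class receives probability 1, and it grows as the prediction moves away from the true label.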

The graph convolutional neural network model in Fig. 4 comprises a graph convolution layer, a graph pooling layer and an output layer.
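How a pooling layer and an output layer can turn per-node features into a single prediction for the whole image is sketched below. The patent does not specify the pooling operator or the output layer; this sketch assumes mean pooling over nodes followed by a linear map and softmax, and every name in it is hypothetical.

```python
import math

def mean_pool(h):
    """Average node features over all N nodes into one graph-level vector."""
    n = len(h)
    return [sum(row[c] for row in h) / n for c in range(len(h[0]))]

def output_layer(g, w_out):
    """Linear map from the graph vector to class scores, then softmax."""
    scores = [sum(g[t] * w_out[t][c] for t in range(len(g)))
              for c in range(len(w_out[0]))]
    m = max(scores)
    exps = [math.exp(v - m) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: 2 nodes with 2 features each, 2 component classes.
H = [[1.0, 0.0], [3.0, 2.0]]       # node features from the last GCN layer
W_out = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical output weights
graph_vector = mean_pool(H)
class_probs = output_layer(graph_vector, W_out)
```

Pooling makes the prediction independent of the number of superpixels N, which varies from image to image after segmentation.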

Compared with the prior art, the invention converts medical hyperspectral image data into graph data, which greatly reduces the number of pixels and effectively reduces the data volume. It extracts drug feature information with a graph convolutional neural network, effectively learning the spatial relationship between the visual features in the drug hyperspectral image and the drug components. This strengthens the representation of the classification features of drug components, improves the accuracy of the measured composition and attributes of the tested drug, and enables non-destructive, rapid detection and analysis of drug composition and quality.

The above are only preferred embodiments of the invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims

1. A method for realizing hyperspectral medical component analysis with a graph convolutional neural network, characterized by comprising the following steps:

step 1, acquiring medical hyperspectral images and constructing a medical hyperspectral data set, wherein the medical hyperspectral data set comprises a training set and a test set;

step 2, segmenting the medical hyperspectral images in the training set with a superpixel segmentation algorithm to obtain mutually non-overlapping superpixels, which form a medical hyperspectral superpixel set;

step 3, computing, for each superpixel, the feature parameters of pixel mean, centroid pixel position, perimeter, area, region azimuth angle, and distance from the centroid pixel to the superpixel region boundary, and constructing the feature matrix of the graph data;

step 4, constructing a region adjacency graph by taking each superpixel as a graph node and the nearest-neighbor superpixels as edges, and obtaining the adjacency weight matrix of the graph data;

step 5, inputting the feature matrix, the adjacency weight matrix and the medical hyperspectral component labels corresponding to the medical hyperspectral images in the training set into a graph convolutional neural network for training, to obtain the model parameters of the graph convolutional neural network;

step 6, repeating steps 2 to 4 on the medical hyperspectral images in the test set to obtain the region adjacency graph on which drug component analysis is to be performed, together with its feature matrix and adjacency weight matrix, and inputting the feature matrix and adjacency weight matrix obtained from the test set into the graph convolutional neural network model initialized with the model parameters trained in step 5, to obtain the drug component analysis result.

2. The method for realizing hyperspectral medical component analysis with a graph convolutional neural network according to claim 1, wherein step 1 specifically comprises the following steps:

step 1.1, preparing a plurality of different drug samples;

step 1.2, acquiring medical hyperspectral images and constructing a medical hyperspectral data set: acquiring a medical hyperspectral image of each drug sample with a hyperspectral sorter, performing reflectance correction on the acquired image, and taking the corrected image as a sample of the medical hyperspectral data set;

step 1.3, randomly dividing the medical hyperspectral data set into a training set and a test set, wherein each of the data set, the training set and the test set consists of paired entries, the i-th entry being the image of the i-th sample together with the drug component label corresponding to that sample; d denotes the total number of samples in the medical hyperspectral data set, s denotes the total number of samples in the training set, and m denotes the total number of samples in the test set.

3. The method for realizing hyperspectral medical component analysis with a graph convolutional neural network according to claim 2, wherein a K-fold cross-validation method is used to divide the medical hyperspectral data set of step 1.3 into the training set and the test set.

4. The method for realizing hyperspectral medical component analysis with a graph convolutional neural network according to claim 3, wherein step 2 is specifically: segmenting the medical hyperspectral images in the training set with the SLIC algorithm, calculating the spatial distance and the spectral distance between pixels, iteratively updating the superpixel cluster centers and boundary ranges, and stopping the iteration when the error between the new cluster centers and the old cluster centers is smaller than a preset threshold, thereby obtaining mutually non-overlapping superpixels, which form the medical hyperspectral superpixel set V, wherein each element of V is the i-th superpixel for its index i, and N is the number of mutually non-overlapping superpixels.

5. The method for realizing hyperspectral medical component analysis with a graph convolutional neural network according to claim 4, wherein step 3 is specifically: for each superpixel obtained in step 2, obtaining its pixel mean, the position of its centroid pixel, its perimeter, its area, its region azimuth angle, and the distances from its centroid pixel to the superpixel region boundary in 8 directions (east, south, west, north and the four diagonal directions), thereby obtaining the feature matrix X, an N × M real matrix, wherein N is the number of superpixels and M is the feature dimension.

6. The method for realizing hyperspectral medical component analysis with a graph convolutional neural network according to claim 5, wherein obtaining the adjacency weight matrix in step 4 comprises the following steps:

step 4.1, according to the medical hyperspectral superpixel set V obtained in step 2, taking each superpixel in the set as a graph node, and using the K-nearest-neighbor algorithm to select the K superpixels nearest to each superpixel to construct edges, thereby forming a region adjacency graph G;

step 4.2, for each superpixel region in the medical hyperspectral superpixel set V obtained in step 2, collecting the adjacent superpixels of that region to obtain an adjacent-superpixel set;

step 4.3, from the pixel mean of each superpixel obtained in step 3, calculating the pixel mean distance between superpixels;

step 4.4, from the centroid pixel position of each superpixel obtained in step 3, calculating the centroid coordinate distance between superpixels;

step 4.5, calculating the adjacency weight matrix A from the pixel mean distance between superpixels obtained in step 4.3 and the centroid coordinate distance between superpixels obtained in step 4.4.

7. The method for realizing hyperspectral medical component analysis with a graph convolutional neural network according to claim 6, wherein step 5 is implemented through the following steps:

step 5.1, initializing the model parameters of the graph convolutional neural network model with the Xavier method;

step 5.2, calculating the degree matrix D of the graph nodes from the region adjacency graph G constructed in step 4;

step 5.3, calculating the feature H of each layer of the graph convolutional neural network (GCN) in the graph convolutional neural network model according to the following formula:

[formula image not reproduced] (4)

wherein W is the learnable weight parameter matrix, the nonlinearity is an activation function, l is the layer index, and the l = 0 case is defined from the feature matrix X;

step 5.4, in the training phase, adjusting W through graph convolution and micro-pooling so that the error is continually reduced and the output optimized, the loss function being calculated by:

[formula image not reproduced] (5)

wherein the formula is expressed in terms of the true label of each training sample, s is the number of training samples, and L is the loss function;

step 5.5, according to the gradient of the loss function L, adjusting the model parameters of the whole graph convolutional neural network model through back propagation, taking the adjusted parameters as the network initialization parameters of step 5.1, and iterating steps 5.1 to 5.5 until the analysis accuracy of the graph convolutional neural network model for drug components stabilizes.

8. The method for realizing hyperspectral medical component analysis with a graph convolutional neural network according to claim 6, wherein the pixel mean distance between superpixels in step 4.3 is calculated by the following formula:

[formula image not reproduced] (1)

wherein the formula is expressed in terms of the pixel mean of the i-th superpixel and the pixel mean of the j-th superpixel.

9. The method for realizing hyperspectral medical component analysis with a graph convolutional neural network according to claim 8, wherein the centroid coordinate distance between superpixels in step 4.4 is calculated by the following formula:

[formula image not reproduced] (2)

wherein the formula is expressed in terms of the centroids of the i-th and j-th superpixels, using the abscissa and ordinate of the i-th superpixel centroid and the abscissa and ordinate of the j-th superpixel centroid.

10. The method for realizing hyperspectral medical component analysis with a graph convolutional neural network according to claim 9, wherein the adjacency weight matrix A in step 4.5 is calculated by the following formula:

[formula image not reproduced] (3).