WO2023000653A1 - 一种图卷积神经网络高光谱医药成分分析的实现方法 - Google Patents

一种图卷积神经网络高光谱医药成分分析的实现方法

Info

Publication number
WO2023000653A1
Authority
WO
WIPO (PCT)
Prior art keywords
superpixel
hyperspectral
medical
graph
neural network
Prior art date
Application number
PCT/CN2022/076023
Other languages
English (en)
French (fr)
Inventor
王耀南
尹阿婷
毛建旭
曾凯
张辉
朱青
周显恩
李亚萍
赵禀睿
陈煜嵘
苏学叁
Original Assignee
湖南大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 湖南大学 filed Critical 湖南大学
Publication of WO2023000653A1 publication Critical patent/WO2023000653A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01JMEASUREMENT OF INTENSITY, VELOCITY, SPECTRAL CONTENT, POLARISATION, PHASE OR PULSE CHARACTERISTICS OF INFRARED, VISIBLE OR ULTRAVIOLET LIGHT; COLORIMETRY; RADIATION PYROMETRY
    • G01J3/00Spectrometry; Spectrophotometry; Monochromators; Measuring colours
    • G01J3/28Investigating the spectrum
    • G01J3/2823Imaging spectrometer
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes
    • G06T7/62Analysis of geometric attributes of area, perimeter, diameter or volume
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes
    • G06T7/66Analysis of geometric attributes of image moments or centre of gravity
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16CCOMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/20Identification of molecular entities, parts thereof or of chemical compositions

Definitions

  • The present invention relates to the field of intelligent hyperspectral detection and analysis of high-end medicine, and in particular to a method for implementing graph convolutional neural network hyperspectral medical component analysis; the method introduces graph convolutional neural network technology and can be used for non-destructive analysis of hyperspectral medical components and quality.
  • Spectroscopic detection has been included in the 2015 edition of the Chinese Pharmacopoeia as a safeguard for testing the quality of medicines. However, it can only detect quantitative information about the components of the tested sample at the point illuminated by the light source and cannot analyse the overall composition of the drug. Therefore, there is an urgent need for new, general and reliable spectral detection and analysis methods for the quality of pharmaceutical components.
  • Hyperspectral imaging technology can simultaneously acquire the spectral and spatial information of the tested drug, and the data obtained are very rich, so it can accurately reflect the overall properties of the tested drug and satisfies the current need for non-destructive detection and analysis of the overall composition of drugs.
  • Hyperspectral imaging combined with chemometric algorithms has already been applied in the pharmaceutical field, for example in the identification of medicinal materials and tablets, the detection of the uniform distribution of active ingredients and excipients in solid tablets, and the monitoring of the composition and distribution of drug-loaded films, indicating that hyperspectral technology can serve as a high-efficiency, non-destructive quality detection method in the pharmaceutical field.
  • A graph neural network is a type of neural network used to process graph-domain information. Owing to its strong ability to explain the structure of biomolecules and the functional relationships between molecules, it has attracted widespread attention in medical fields such as brain science, medical diagnosis, and drug discovery and research. Graph neural networks learn the spatial characteristics of topological data structures well, but they are difficult to apply directly to the component analysis of medical hyperspectral images. Therefore, to address the wide variety of medicines and the difficult problem of complex drug composition analysis, there is an urgent need to explore the visual information of medical hyperspectral images in depth and to combine it with the spatial characteristics of the drug under test so as to improve the accuracy of drug component analysis.
  • In view of this, the present invention proposes a method for implementing graph convolutional neural network hyperspectral medical component analysis. By learning the spectral features of the drug and the spatial distribution of its active ingredients in the hyperspectral drug image, non-destructive drug component analysis and rapid quality detection can be effectively realized.
  • the present invention provides a method for implementing graph convolutional neural network hyperspectral medical component analysis, comprising the following steps:
  • Step 1, acquiring medical hyperspectral images and constructing a medical hyperspectral data set, the medical hyperspectral data set comprising a training set and a test set;
  • Step 2 using a superpixel segmentation algorithm to segment the medical hyperspectral images in the training set to obtain non-overlapping superpixels, and the non-overlapping superpixels constitute a medical hyperspectral superpixel set;
  • Step 3 respectively counting the pixel mean value of each superpixel, centroid pixel position, perimeter, area, area azimuth, and feature parameters of the distance from the centroid pixel to each superpixel area boundary, and constructing the feature matrix of the graph data;
  • Step 4 Construct a region adjacency graph with each superpixel as a graph node and the nearest neighbor superpixel as an edge, and obtain the adjacency weight matrix of the graph data;
  • Step 5 input the feature matrix, the adjacency weight matrix and the medical hyperspectral component labels corresponding to the medical hyperspectral images in the training set into the graph convolutional neural network for training to obtain the model parameters of the graph convolutional neural network;
  • Step 6, repeating steps 2 to 4 for the medical hyperspectral images in the test set to obtain the region adjacency graph on which drug component analysis is to be performed, obtaining the feature matrix and adjacency weight matrix of that region adjacency graph, and inputting the feature matrix and adjacency weight matrix obtained from the test set into the graph convolutional neural network model initialized with the model parameters trained in step 5 to obtain the drug component analysis result.
  • step 1 specifically includes the following process:
  • Step 1.1 Prepare drug samples: seven kinds of drug samples, namely cefprozil tablets, oxytetracycline tablets, chlorpheniramine maleate tablets, furosemide tablets, enteric-coated aspirin tablets, perethylerythromycin tablets and Callicarpa nudiflora dispersible tablets;
  • Step 1.2 Acquire medical hyperspectral images and construct the medical hyperspectral data set D_S: use a hyperspectral sorter to acquire medical hyperspectral images of the drug samples, perform reflectance correction on the collected images, and use the corrected images as samples of the medical hyperspectral data set;
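  • The reflectance correction mentioned in step 1.2 is not detailed in this text; a minimal sketch of one common scheme, using a white reference and a dark current frame (an assumption, not necessarily the patented procedure), is:

```python
import numpy as np

def reflectance_correction(raw_cube, white_ref, dark_ref, eps=1e-8):
    """Convert a raw hyperspectral cube (H, W, bands) to relative reflectance.

    raw_cube  : raw intensity cube of the drug sample
    white_ref : cube (or per-band frame) of a standard white reference
    dark_ref  : cube (or per-band frame) recorded with the shutter closed
    """
    raw = raw_cube.astype(np.float64)
    reflectance = (raw - dark_ref) / (white_ref - dark_ref + eps)
    return np.clip(reflectance, 0.0, 1.0)
```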
  • In step 1.3, the K-fold cross-validation method is used to divide the medical hyperspectral data set D_S into the training set D_train and the test set D_test.
  • Step 3 is embodied as follows: for each superpixel V_i obtained in step 2, obtain the pixel mean of V_i, the position (C_ix, C_iy) of its centroid pixel C_i, its perimeter l_i, area S_i and region azimuth θ_i, and the distances R_i from the centroid pixel C_i to the boundary of the superpixel region in the eight directions east, south, west, north, southeast, northeast, southwest and northwest, so as to obtain the feature matrix X ∈ ℝ^{N×M}, where N is the number of superpixels, M is the feature dimension, and ℝ denotes the set of real numbers.
  • Step 4 includes the following steps:
  • Step 4.1 According to the medical hyperspectral superpixel set V obtained in step 2, take the superpixels V_i in the set as graph nodes and use the K-nearest-neighbour algorithm to select the K superpixels closest to each superpixel V_i to construct edges, thereby forming a region adjacency graph G;
  • Step 4.2 For each superpixel region in the medical hyperspectral superpixel set V obtained in step 2, count its adjacent superpixels to obtain the adjacent superpixel set V';
  • Step 4.3 Using the pixel mean of each superpixel V_i obtained in step 3, calculate the pixel mean distance m_dist between superpixels;
  • Step 4.4 Using the centroid pixel position (C_ix, C_iy) of each superpixel obtained in step 3, calculate the centroid coordinate distance f_dist between superpixels;
  • Step 4.5 Calculate the adjacency weight matrix A from the pixel mean distance m_dist obtained in step 4.3 and the centroid coordinate distance f_dist obtained in step 4.4.
  • step 5 includes the following steps:
  • Step 5.1 Use the Xavier method to initialize the model parameters α_gcn of the graph convolutional neural network model;
  • Step 5.2 According to the region adjacency graph G constructed in step 4, calculate the degree matrix D of the graph nodes;
  • Step 5.3 The feature H of each layer of the graph convolutional network (GCN) in the model is calculated as H^(l+1) = σ(L_G H^(l) W^(l)), where W is a learnable weight parameter matrix, σ is an activation function, and H^(0) = X when l = 0;
  • Step 5.4 In the training phase, adjust W through graph convolution and differentiable pooling operations to continuously reduce errors, thereby optimizing the output.
  • the loss function is computed over the training set, where Y_i is the true label of training sample I_i, s is the number of training samples, and L is the loss function;
  • Step 5.5 According to the gradient of the loss function L, adjust the model parameters α_gcn of the entire graph convolutional neural network model through backpropagation, use them as the network initialization parameters in step 5.1, and iterate steps 5.1 to 5.5 until the drug-component analysis accuracy of the graph convolutional neural network model stabilizes.
  • The pixel mean distance m_dist between superpixels in step 4.3 is calculated from the pixel mean of the i-th superpixel and the pixel mean of the j-th superpixel; the centroid coordinate distance f_dist between superpixels in step 4.4 is calculated from their centroid positions, where C_i denotes the centroid of the i-th superpixel, C_j the centroid of the j-th superpixel, C_ix and C_iy the abscissa and ordinate of the centroid of the i-th superpixel, and C_jx and C_jy the abscissa and ordinate of the centroid of the j-th superpixel.
  • The adjacency weight matrix A in step 4.5 is calculated from the pixel mean distance m_dist and the centroid coordinate distance f_dist.
  • In the implementation method of graph convolutional neural network hyperspectral medical component analysis provided by the present invention, medical hyperspectral images are first acquired and a medical hyperspectral data set including a training set and a test set is constructed; secondly, a superpixel segmentation algorithm is used to segment the medical hyperspectral images in the training set into non-overlapping superpixels; then, the pixel mean, centroid pixel position, perimeter, area, region azimuth and the distances from the centroid pixel to the boundary of each superpixel region are counted for each superpixel to construct the feature matrix of the graph data; next, a region adjacency graph is constructed with each superpixel as a graph node and the nearest-neighbour superpixels as edges, and the adjacency weight matrix of the graph data is obtained; then, the feature matrix, the adjacency weight matrix and the medical hyperspectral component labels corresponding to the medical hyperspectral images in the training set are input into the graph convolutional neural network for training to obtain its model parameters; finally, steps 2 to 4 are repeated for the medical hyperspectral images in the test set to obtain the region adjacency graph on which drug component analysis is to be performed, its feature matrix and adjacency weight matrix are extracted and input into the graph convolutional neural network model initialized with the model parameters trained in step 5, and the drug component analysis result is obtained.
  • Compared with the prior art, on the one hand, the present invention processes medical hyperspectral image data into graph data, which greatly reduces the number of pixels and effectively reduces the amount of data; on the other hand, it extracts the drug feature information with the graph convolutional neural network model, effectively learns the spatial relationship between the visual features in the medical hyperspectral image and the drug components, improves the representation of drug-component classification features and the accuracy of the measured components and attributes, overcomes the difficulties posed by the wide variety of medicines, their complex compositions and their differing physical properties, and realizes non-destructive drug component analysis and rapid quality detection.
  • FIG. 1 is a flow chart of a method for realizing hyperspectral medical component analysis of a graph convolutional neural network provided by Embodiment 1 of the present invention
  • FIG. 2 is a flow chart of a method for realizing hyperspectral medical component analysis of a graph convolutional neural network provided by Embodiment 2 of the present invention
  • Fig. 3 is a flow chart of the process of obtaining the adjacency weight matrix in an embodiment of the present invention.
  • Fig. 4 is a schematic structural framework diagram of a graph convolutional neural network model according to an embodiment of the present invention.
  • Fig. 5 is a schematic diagram of some samples of the hyperspectral medical component analysis data set according to the embodiment of the present invention.
  • Fig. 1 is a flow chart of a method for implementing a graph convolutional neural network hyperspectral medical component analysis according to Embodiment 1 of the present invention. As shown in Figure 1, a method for realizing the hyperspectral medical composition analysis of a graph convolutional neural network of the present invention is realized through the following steps:
  • Step 1 Acquire medical hyperspectral images and construct a medical hyperspectral data set, which includes a training set and a test set;
  • Step 2 Using the superpixel segmentation algorithm, the medical hyperspectral image in the above training set is segmented to obtain non-overlapping superpixels, which constitute a medical hyperspectral superpixel set;
  • Step 3 respectively counting the pixel mean value of each superpixel, centroid pixel position, perimeter, area, area azimuth, and feature parameters of the distance from the centroid pixel to each superpixel area boundary, and constructing the feature matrix of the graph data;
  • Step 4 Construct a region adjacency graph with each superpixel as a graph node and the nearest neighbor superpixel as an edge, and obtain the adjacency weight matrix of the graph data;
  • Step 5 input the feature matrix, the adjacency weight matrix and the medical hyperspectral component labels corresponding to the medical hyperspectral images in the training set into the graph convolutional neural network for training to obtain the model parameters of the graph convolutional neural network;
  • Step 6 Repeat steps 2 to 4 for the medical hyperspectral images in the test set to obtain the region adjacency graph on which drug component analysis is to be performed, and obtain the feature matrix and adjacency weight matrix of that region adjacency graph;
  • the feature matrix and adjacency weight matrix obtained from the test set are input into the graph convolutional neural network model initialized with the model parameters trained in step 5, and the drug component analysis results are obtained.
  • The present invention firstly acquires medical hyperspectral images and constructs a medical hyperspectral data set including a training set and a test set; secondly, a superpixel segmentation algorithm is used to segment the medical hyperspectral images in the training set into non-overlapping superpixels; then, the pixel mean, centroid pixel position, perimeter, area, region azimuth and the distances from the centroid pixel to the boundary of each superpixel region are counted for each superpixel to construct the feature matrix of the graph data; next, a region adjacency graph is constructed with each superpixel as a graph node and the nearest-neighbour superpixels as edges, and the adjacency weight matrix of the graph data is obtained; again, the feature matrix, the adjacency weight matrix and the medical hyperspectral component labels corresponding to the medical hyperspectral images in the training set are input into the graph convolutional neural network for training to obtain its model parameters; finally, steps 2 to 4 are repeated for the medical hyperspectral images in the test set to obtain the region adjacency graph on which drug component analysis is to be performed, its feature matrix and adjacency weight matrix are extracted and input into the graph convolutional neural network model initialized with the trained model parameters, and the drug component analysis result is obtained. Compared with the prior art, the present invention can accurately analyse the different components of drug samples in medical hyperspectral images, overcomes the difficulties posed by the wide variety of medicines, their complex compositions and their differing physical properties, and realizes non-destructive drug component analysis and rapid quality detection.
  • FIG. 2 is a flow chart of a method for implementing graph convolutional neural network hyperspectral medical component analysis provided by Embodiment 2 of the present invention; Fig. 3 is a flow chart of the process of obtaining the adjacency weight matrix in an embodiment of the present invention; Fig. 4 is a schematic structural framework diagram of the graph convolutional neural network model according to an embodiment of the present invention.
  • a method for implementing graph convolutional neural network hyperspectral medical component analysis comprising the following steps:
  • Step 1.1 Prepare a plurality of different drug samples;
  • It should be noted that, in this embodiment, experiments were carried out on seven kinds of drug samples, namely cefprozil tablets, oxytetracycline tablets, chlorpheniramine maleate tablets, furosemide tablets, enteric-coated aspirin tablets, perethylerythromycin tablets and Callicarpa nudiflora dispersible tablets, but the number and types of drugs are not limited thereto. FIG. 5 shows some samples from the hyperspectral medical component analysis data set for cefprozil tablets, chlorpheniramine maleate tablets and Callicarpa nudiflora dispersible tablets; specifically, in Fig. 5, (a) represents a sample image of a Callicarpa nudiflora dispersible tablet, (b) a sample image of a cefprozil tablet, and (c) a sample image of a chlorpheniramine maleate tablet.
  • Step 1.2 Acquire medical hyperspectral images and construct the medical hyperspectral data set D_S: use a hyperspectral sorter to acquire medical hyperspectral images of the drug samples, perform reflectance correction on the collected images, and use the corrected images as samples of the medical hyperspectral data set;
  • the hyperspectral sorting instrument in the above process is preferably Sichuan Shuangli Hespect hyperspectral sorting instrument (V10E, N25E-SWIR), and the spectral ranges are 400-1000nm and 1000-2500nm respectively;
  • Step 2 using a superpixel segmentation algorithm to segment the medical hyperspectral images in the training set to obtain non-overlapping superpixels, and the non-overlapping superpixels constitute a medical hyperspectral superpixel set;
  • This step is embodied as follows: the SLIC algorithm (Simple Linear Iterative Clustering) is used to segment the medical hyperspectral images in the training set; by calculating the spatial distance and spectral distance between pixels, the superpixel cluster centres and boundary ranges are iteratively updated, and the iteration stops when the error between the new and old cluster centres is less than a preset threshold, so as to obtain non-overlapping superpixels, which constitute the medical hyperspectral superpixel set V = {V_1, V_2, ..., V_i, ..., V_N}, where V_i is the i-th superpixel and N is the number of non-overlapping superpixels;
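  • As a rough illustration of step 2, the sketch below segments a hyperspectral cube with scikit-image's SLIC after compressing the bands with PCA; the patent iterates its own joint spatial/spectral distance, so this library-based version, with its assumed three PCA components, segment count and compactness, is only an approximation:

```python
import numpy as np
from sklearn.decomposition import PCA
from skimage.segmentation import slic

def superpixel_segmentation(reflectance_cube, n_segments=200, compactness=10.0):
    """Split a corrected hyperspectral cube (H, W, bands) into non-overlapping superpixels."""
    h, w, bands = reflectance_cube.shape
    # Compress the spectral dimension so SLIC's colour distance stays meaningful.
    pca = PCA(n_components=3)
    reduced = pca.fit_transform(reflectance_cube.reshape(-1, bands)).reshape(h, w, 3)
    # Each label value marks one superpixel region V_i.
    labels = slic(reduced, n_segments=n_segments, compactness=compactness,
                  start_label=0, channel_axis=-1)
    return labels
```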
  • Step 3 counting the pixel mean value of each superpixel, centroid pixel position, perimeter, area, area azimuth, and characteristic parameters of the distance from the centroid pixel to the boundary of each superpixel region, and constructing the feature matrix of the graph data;
  • Specifically, this step is embodied as follows: for each superpixel V_i obtained in step 2, obtain the pixel mean of V_i, the position (C_ix, C_iy) of its centroid pixel C_i, its perimeter l_i, area S_i and region azimuth θ_i, and the distances R_i from the centroid pixel C_i to the boundary of the superpixel region in the eight directions east, south, west, north, southeast, northeast, southwest and northwest, so as to obtain the feature matrix X ∈ ℝ^{N×M}, where N is the number of superpixels, M is the feature dimension, and ℝ denotes the set of real numbers;
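  • One possible way to assemble the feature matrix X of step 3 with numpy and skimage.measure.regionprops is sketched below; the eight boundary distances R_i are obtained here by a simple ray walk from the centroid, which is only one plausible reading of the description, and the column layout of X is an assumption:

```python
import numpy as np
from skimage.measure import regionprops

# Eight compass directions: east, south, west, north, southeast, northeast, southwest, northwest.
DIRECTIONS = [(0, 1), (1, 0), (0, -1), (-1, 0), (1, 1), (-1, 1), (1, -1), (-1, -1)]

def boundary_distances(mask, cy, cx):
    """Walk from the centroid along each direction until leaving the superpixel region."""
    dists = []
    for dy, dx in DIRECTIONS:
        y, x, steps = int(round(cy)), int(round(cx)), 0
        while 0 <= y < mask.shape[0] and 0 <= x < mask.shape[1] and mask[y, x]:
            y, x, steps = y + dy, x + dx, steps + 1
        dists.append(float(steps))
    return dists

def build_feature_matrix(reflectance_cube, labels):
    """Return X (one row per superpixel): per-band pixel mean, centroid, perimeter, area, azimuth, R_i."""
    rows = []
    for region in regionprops(labels + 1):            # regionprops skips label 0, hence the +1
        mask = labels == (region.label - 1)
        mean_spectrum = reflectance_cube[mask].mean(axis=0)     # pixel mean of superpixel V_i
        cy, cx = region.centroid                                # centroid pixel position
        row = mean_spectrum.tolist()
        row += [cx, cy, region.perimeter, float(region.area), region.orientation]  # orientation as azimuth
        row += boundary_distances(mask, cy, cx)                 # eight-direction distances R_i
        rows.append(row)
    return np.asarray(rows)                            # X with shape (N, M)
```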
  • Step 4 Construct a region adjacency graph with each superpixel as a graph node and the nearest neighbor superpixel as an edge, and obtain the adjacency weight matrix of the graph data; specifically, see Figure 3, this step is decomposed into the following process:
  • Step 4.1 According to the medical hyperspectral superpixel set V obtained in step 2, take the superpixels V_i in the set as graph nodes and use the K-nearest-neighbour algorithm to select the K superpixels closest to each superpixel V_i to construct edges, thereby forming a region adjacency graph G; here, the value of K is 8;
  • Step 4.2 According to each superpixel area in the medical hyperspectral superpixel set V obtained in step 2, count the adjacent superpixels of each superpixel area to obtain the adjacent superpixel set V';
  • Step 4.3 Using the pixel mean of each superpixel V_i obtained in step 3, calculate the pixel mean distance m_dist between superpixels, where m_dist is computed from the pixel mean of the i-th superpixel and the pixel mean of the j-th superpixel;
  • Step 4.4 Using the centroid pixel position (C_ix, C_iy) of each superpixel obtained in step 3, calculate the centroid coordinate distance f_dist between superpixels, where C_i denotes the centroid of the i-th superpixel, C_j the centroid of the j-th superpixel, C_ix and C_iy the abscissa and ordinate of the centroid of the i-th superpixel, and C_jx and C_jy the abscissa and ordinate of the centroid of the j-th superpixel;
  • Step 4.5 Calculate the adjacency weight matrix A from the pixel mean distance m_dist between superpixels obtained in step 4.3 and the centroid coordinate distance f_dist between superpixels obtained in step 4.4.
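  • The exact weighting formula of step 4.5 appears only as an image in the publication, so the combination below is an assumption: each superpixel is linked to its K = 8 nearest neighbours and the edge weight is a Gaussian kernel over the pixel-mean distance m_dist and the centroid distance f_dist:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def adjacency_weight_matrix(mean_spectra, centroids, k=8, sigma_m=1.0, sigma_f=1.0):
    """Build the adjacency weight matrix A of the region adjacency graph G.

    mean_spectra : (N, bands) pixel means of the superpixels
    centroids    : (N, 2) centroid pixel positions
    """
    n = len(centroids)
    # Edges: each node is connected to its K nearest superpixels, measured on centroid positions.
    nn = NearestNeighbors(n_neighbors=min(k + 1, n)).fit(centroids)
    _, idx = nn.kneighbors(centroids)

    A = np.zeros((n, n))
    for i in range(n):
        for j in idx[i][1:]:                                              # skip the node itself
            m_dist = np.linalg.norm(mean_spectra[i] - mean_spectra[j])    # pixel mean distance
            f_dist = np.linalg.norm(centroids[i] - centroids[j])          # centroid coordinate distance
            # Assumed Gaussian combination; the patent's exact formula may differ.
            A[i, j] = A[j, i] = np.exp(-(m_dist ** 2) / sigma_m ** 2 - (f_dist ** 2) / sigma_f ** 2)
    return A
```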
  • Step 5 input the feature matrix, the adjacency weight matrix and the medical hyperspectral component labels corresponding to the medical hyperspectral images in the training set into the graph convolutional neural network for training to obtain the model parameters of the graph convolutional neural network;
  • FIG. 4 is a schematic diagram of a structural framework of a graph convolutional neural network model according to an embodiment of the present invention.
  • Step 6 Repeat steps 2 to 4 for the medical hyperspectral images in the test set to obtain the region adjacency graph on which drug component analysis is to be performed, obtain the feature matrix and adjacency weight matrix of that region adjacency graph, and input the feature matrix and adjacency weight matrix obtained from the test set into the graph convolutional neural network model initialized with the model parameters trained in step 5 to obtain the drug component analysis results.
  • As a preferred embodiment, the K-fold cross-validation method is used to divide the medical hyperspectral data set D_S in step 1.3 into the training set D_train and the test set D_test, where K is 10.
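  • A minimal sketch of this 10-fold split with scikit-learn (sample loading and label handling are placeholders) could look like:

```python
from sklearn.model_selection import KFold

def kfold_splits(samples, labels, k=10, seed=0):
    """Yield (D_train, D_test) partitions of the medical hyperspectral data set D_S."""
    kf = KFold(n_splits=k, shuffle=True, random_state=seed)
    for train_idx, test_idx in kf.split(samples):
        D_train = [(samples[i], labels[i]) for i in train_idx]
        D_test = [(samples[i], labels[i]) for i in test_idx]
        yield D_train, D_test
```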
  • In step 5, the feature matrix, the adjacency weight matrix and the medical hyperspectral component labels corresponding to the medical hyperspectral images in the training set are input into the graph convolutional neural network for training to obtain the model parameters of the graph convolutional neural network; the specific implementation includes the following steps:
  • Step 5.1 Use the Xavier method to initialize the model parameters α_gcn of the graph convolutional neural network model. It should be noted that the Xavier method is a very effective neural network parameter initialization method whose purpose is to make the output variance of each layer of the neural network as equal as possible;
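  • For reference, the Glorot/Xavier uniform rule draws each weight from U(-√(6/(fan_in+fan_out)), +√(6/(fan_in+fan_out))), which keeps the per-layer output variances roughly equal; a small numpy version:

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, seed=0):
    """Sample a (fan_in, fan_out) weight matrix with Xavier/Glorot uniform initialisation."""
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))
```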
  • Step 5.2 According to the region adjacency graph G constructed in step 4, calculate the degree matrix D of the graph nodes;
  • Step 5.3 The feature H of each layer of the graph convolutional network (GCN) in the model is calculated as H^(l+1) = σ(L_G H^(l) W^(l)), where W is a learnable weight parameter matrix, σ is an activation function, and when l = 0, H^(0) = X, X being the feature matrix;
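  • The propagation operator L_G of formula (4) is shown only as a formula image; assuming the widely used symmetric normalisation with self-loops (an assumption, not confirmed by this text), the degree matrix of step 5.2 and one graph convolution layer can be written as:

```python
import numpy as np

def normalized_operator(A):
    """Assumed form of L_G: D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of step 5.2."""
    A_hat = A + np.eye(A.shape[0])
    D = np.diag(A_hat.sum(axis=1))                    # degree matrix of the region adjacency graph
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(D)))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gcn_layer(L_G, H, W, activation=lambda z: np.maximum(z, 0.0)):
    """H^{(l+1)} = sigma(L_G H^{(l)} W^{(l)}), with ReLU as the assumed activation sigma."""
    return activation(L_G @ H @ W)
```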
  • Step 5.4 In the training phase, adjust W through graph convolution and differentiable pooling operations to continuously reduce errors, thereby optimizing the output.
  • the loss function is computed as follows: Y_i is the true label of training sample I_i, s is the number of training samples, and L is the loss function; specifically, L is a cross-entropy loss in which Y(I_i) is the true component of training sample I_i, Y_out(I_i) is the component predicted for training sample I_i, and s is the number of samples.
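  • Steps 5.4 and 5.5 amount to a standard supervised training loop; a hedged PyTorch-style sketch is given below, in which the model architecture, optimiser, learning rate and stopping tolerance are assumptions rather than the patent's exact choices:

```python
import torch
import torch.nn.functional as F

def train_gcn(model, L_G, X, y, epochs=200, lr=1e-3, tol=1e-4):
    """Fit the GCN by minimising a cross-entropy loss between predicted and true drug components.

    model : a graph convolutional network taking (L_G, X) and returning class scores
    L_G   : propagation operator as a float tensor; X : feature matrix as a float tensor
    y     : integer component labels for the s training samples
    """
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    prev_loss = float("inf")
    for _ in range(epochs):
        opt.zero_grad()
        logits = model(L_G, X)
        loss = F.cross_entropy(logits, y)        # L, averaged over the s training samples
        loss.backward()                          # gradient of L drives the update of alpha_gcn (step 5.5)
        opt.step()
        if abs(prev_loss - loss.item()) < tol:   # stop once the loss/analysis accuracy has stabilised
            break
        prev_loss = loss.item()
    return model
```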
  • the graph convolutional neural network model in Figure 4 includes a graph convolution layer, a graph pooling layer, and an output layer.
  • Step 5.5 According to the gradient of the loss function L, adjust the model parameters α_gcn of the entire graph convolutional neural network model through backpropagation, use them as the network initialization parameters in step 5.1, and iterate steps 5.1 to 5.5 until the drug-component analysis accuracy of the graph convolutional neural network model stabilizes.
  • Compared with the prior art, the present invention processes medical hyperspectral image data into graph data, which greatly reduces the number of pixels and effectively reduces the amount of data; by extracting the drug feature information with the graph convolutional neural network, it effectively learns the spatial relationship between the visual features in the drug hyperspectral image and the drug components, improves the representation of drug-component classification features and the accuracy of the measured components and attributes, and enables non-destructive, rapid detection and analysis of drug components and quality.
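  • Putting the pieces together, step 6 runs the trained model on graphs built from the test images; an illustrative end-to-end sketch reusing the helper functions sketched above (all names and pipeline details are assumptions, not the patent's exact implementation):

```python
import numpy as np
import torch
from skimage.measure import regionprops

def analyse_components(model, test_cube, n_segments=200):
    """Predict drug-component scores for one corrected test hyperspectral cube (steps 2-4, then step 6)."""
    labels = superpixel_segmentation(test_cube, n_segments=n_segments)        # step 2: SLIC superpixels
    X = build_feature_matrix(test_cube, labels)                               # step 3: feature matrix
    regions = regionprops(labels + 1)
    centroids = np.array([r.centroid for r in regions])
    mean_spectra = np.array([test_cube[labels == r.label - 1].mean(axis=0) for r in regions])
    A = adjacency_weight_matrix(mean_spectra, centroids, k=8)                 # step 4: adjacency weights
    L_G = normalized_operator(A)                                              # assumed propagation operator
    with torch.no_grad():
        scores = model(torch.as_tensor(L_G, dtype=torch.float32),
                       torch.as_tensor(X, dtype=torch.float32))               # trained GCN from step 5
    return scores
```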

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Geometry (AREA)
  • Health & Medical Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses a method for implementing graph convolutional neural network hyperspectral medical component analysis. On the one hand, medical hyperspectral image data are processed into graph data, which greatly reduces the number of pixels and effectively reduces the amount of data; on the other hand, the feature information of the drug is extracted with a graph convolutional neural network model, which effectively learns the spatial relationship between the visual features in the drug hyperspectral image and the drug components, improves the representation of drug-component classification features and the accuracy of the measured components and attributes, and enables non-destructive, rapid detection and analysis of drug components and quality.

Description

一种图卷积神经网络高光谱医药成分分析的实现方法
本申请要求于2021年07月19日提交中国专利局的中国专利申请的优先权,其中国专利申请的申请号为202110811547.X,发明名称为“一种图卷积神经网络高光谱医药成分分析的实现方法”,其全部内容通过引用结合在本申请中。
技术领域
本发明涉及高端医药高光谱智能检测分析领域,特别是涉及一种图卷积神经网络高光谱医药成分分析的实现方法,该方法引入了图卷积神经网络技术,可用于高光谱医药成分和质量的无损分析。
背景技术
医药安全是关系人民群众身体健康和经济发展的大事,已经成为了人们时刻关注的民生与公共安全问题,保障医药质量安全对维护国家安定和社会和谐稳定具有重大意义。现有药物成分质量检测方法,如化学检测方法、分光光度法等,只能适应于抽样检测,且具有破坏性,无法满足医药质量无损检测的要求。近年来,近红外光谱检测技术在药物分析领域中应用十分广泛,其光谱信息是一种鲁棒性很强的类“指纹”特征,可以用来将不同药品成分计量分类。光谱检测法作为检验医药品质、质量的保障,已被2015版《中国药典》收录,但其仅能检测光源照射点被测试样成分的定量信息,无法对药物的整体成分进行分析。因此,亟需研究新型、通用、可靠的医药成分质量光谱检测分析方法。
高光谱成像技术可以同时获取被测药物的光谱信息和空间信息,且获取的数据信息量十分丰富,能准确地反映被检医药的整体性质,很好地满足了当前医药整体成分的无损检测分析需求。目前高光谱成像技术结合化学计量学相关算法,在制药领域开展了药材和片剂的鉴别、固态片剂中有效成分及辅料的均匀性分布检测、载药薄膜的组成及分布情况监测等相关研究,表明了高光谱技术能作为制药领域的高效能无损质量检测手段。但由于医药种类多样、成分复 杂,同时高光谱数据量非常庞大,化学计量学方法难以提取药物的有效特征信息,被测药物的成分和属性预测精度不高。深度学习擅于发掘多维数据中的复杂关系,是目前海量数据处理与分析最好的方法之一。其中图神经网络是一类用于处理图域信息的神经网络,由于对生物分子结构、分子之间的功能关系具有强解释性,目前已在脑科学、医学诊断、药物发现和研究等医药领域受到广泛关注。图神经网络对拓扑数据结构的空间特征具有较好的学习能力,但很难直接用于医药高光谱图像的成分分析中。因此急需针对多种多样的医药种类与复杂的药物成分分析难题,深度探索医药高光谱图像的视觉信息,结合对待测药品的空间特征,提高药物成分分析的精度。
发明内容
有鉴于此,本发明提出一种图卷积神经网络高光谱医药成分分析的实现方法,通过学习高光谱医药图像中药物的光谱信息特征和有效成分空间分布特征,有效实现无损药物成分分析与质量的快速检测。
一方面,本发明提供了一种图卷积神经网络高光谱医药成分分析的实现方法,包括以下步骤:
步骤1、获取医药高光谱图像,构建医药高光谱数据集,所述医药高光谱数据集包括训练集和测试集;
步骤2、利用超像素分割算法,将所述训练集中的医药高光谱图像进行分割,得到互不重叠的超像素,所述互不重叠的超像素构成医药高光谱超像素集合;
步骤3、分别统计每个超像素的像素均值、质心像素位置、周长、面积、区域方位角,以及质心像素到每个超像素区域边界距离的特征参数,构造图数据的特征矩阵;
步骤4、以每个超像素为图节点,最近邻超像素为边,构建区域邻接图,并获得图数据的邻接权值矩阵;
步骤5、将所述特征矩阵、所述邻接权值矩阵以及训练集中医药高光谱图像对应的医药高光谱成分标签输入到图卷积神经网络中进行训练,得到图卷积神经网络的模型参数;
步骤6、将测试集中的医药高光谱图像重复步骤2至4,得到需进行药物成分分析的区域邻接图,并获取需进行药物成分分析的区域邻接图的特征矩阵与邻接权值矩阵,将测试集中获取的特征矩阵与邻接权值矩阵输入到由步骤5训练好的模型参数所初始化的图卷积神经网络模型中,得到药物成分分析结果。
进一步地,所述步骤1具体包括以下过程:
步骤1.1、准备药物样品:头孢丙烯片、土霉素片、马来酸氯苯那敏片、呋塞米片、阿司匹林肠溶片、珀乙红霉素片、裸花紫珠分散片七种药品样本;
步骤1.2、获取医药高光谱图像,构建医药高光谱数据集D S:采用高光谱分选仪获取药物样品的医药高光谱图像,并对采集的医药高光谱图像进行反射率校正,将校正后的图像作为医药高光谱数据集的样本;
步骤1.3、将医药高光谱数据集D S随机划分为训练集D train和测试集D test,D S={(I 1,Y 1),(I 2,Y 2),…,(I i,Y i),…,(I d,Y d)},D train={(I 1',Y 1'),(I 2',Y 2'),…,(I i',Y i'),…,(I s,Y s)},D test={(I 1”,Y 1”),(I 2”,Y 2”),…,(I i”,Y i”),…,(I m,Y m)},I iDS中第i个样本的图像,Y i为D S中第i个样本对应的药物成分标签,I i'为训练集D train中第i个样本的图像,Y i'为训练集D train中第i个样本对应的药物成分标签,I i”为测试集D test中第i个样本的图像,Y i”为测试集D test中第i个样本对应的药物成分标签,d表示医药高光谱数据集D S中的样本总数,s表示训练集D train中的样本总数,m表示测试集D test中的样本总数。
进一步地,采用K折交叉验证法对步骤1.3中医药高光谱数据集D S进行训 练集D train与测试集D test的划分。
进一步地,所述步骤2具体表现为:采用SLIC算法对所述训练集中的医药高光谱图像进行分割,通过计算像素点之间的空间距离和光谱距离,迭代式更新超像素聚类中心和边界范围,在新的聚类中心和旧的聚类中心之间误差小于预设阈值时停止迭代,从而得到互不重叠的超像素,所述互不重叠的超像素构成医药高光谱超像素集合V={V 1,V 2,…,V i,…,V N},V i为第i个超像素,N为互不重叠的超像素个数。
进一步地,所述步骤3具体表现为:将步骤2中得到的每个超像素V i,获取每个超像素V i的像素均值
Figure PCTCN2022076023-appb-000001
质心像素C i位置(C ix,C iy)、周长l i、面积S i、区域方位角θ i以及质心像素C i到每个超像素区域边界取东、南、西、北、东南、东北、西南、西北8个方向的距离R i,从而获得特征矩阵X,
Figure PCTCN2022076023-appb-000002
其中,N为超像素个数,M为特征维数,
Figure PCTCN2022076023-appb-000003
表示实数集。
进一步地,步骤4中邻接权值矩阵的具体实现包括以下步骤:
步骤4.1、根据步骤2得到的医药高光谱超像素集合V,将医药高光谱超像素集合中的超像素V i构成一个个图节点,采用K最近邻算法选取离超像素V i最近的K个超像素点构建边,从而构成区域邻接图G;
步骤4.2、根据步骤2获取的医药高光谱超像素集合V中的每个超像素区域,统计每个超像素区域的相邻超像素,得到相邻超像素集合V';
步骤4.3、根据步骤3获取超像素V i的像素均值
Figure PCTCN2022076023-appb-000004
计算每个超像素间的像素均值距离m dist
步骤4.4、根据步骤3获取超像素
Figure PCTCN2022076023-appb-000005
的质心像素C i位置(C ix,C iy),计算每个 超像素间质心坐标距离f dist
步骤4.5、根据步骤4.3和步骤4.4获取的超像素间的像素均值距离m dist、质心坐标距离f dist进行计算,得到邻接权值矩阵A,
Figure PCTCN2022076023-appb-000006
进一步地,所述步骤5的具体实现包括以下步骤:
步骤5.1、采用Xavier方法初始化图卷积神经网络模型的模型参数α gcn
步骤5.2、根据步骤4构建的区域邻接图G,计算各图节点的度矩阵D,
Figure PCTCN2022076023-appb-000007
步骤5.3、图卷积神经网络模型中每层图卷积神经网络GCN的特征H由下式计算:
H (l+1)=σ(L GH (l)W (l))     (4)
其中,
Figure PCTCN2022076023-appb-000008
W为可学习的权值参数矩阵,σ为激活函数,且l=0时,H (0)=X;
步骤5.4、在训练阶段,通过图卷积、可微池化操作来调整W以持续性的减少误差,从而优化输出,损失函数由下式计算:
Figure PCTCN2022076023-appb-000009
其中,Y i是训练样本l i的真实标签,s为训练样本数量,L为损失函数;
步骤5.5、根据损失函数L的梯度经反向传播调整整个图卷积神经网络模型的模型参数α gcn,以此作为步骤5.1中的网络初始化参数,不断迭代步骤5.1到步骤5.5直到图卷积神经网络模型对药物成分分析精度趋于稳定。
进一步地,步骤4.3中每个超像素间的像素均值距离m dist由下式计算:
Figure PCTCN2022076023-appb-000010
式中,
Figure PCTCN2022076023-appb-000011
表示第i个超像素的像素均值,
Figure PCTCN2022076023-appb-000012
表示第j个超像素的像素均值。
进一步地,步骤4.4中每个超像素间质心坐标距离f dist由下式计算:
Figure PCTCN2022076023-appb-000013
式中,C i表示第i个超像素质心,C j表示第j个超像素质心,C ix表示第i个超像素质心的横坐标,C iy表示第i个超像素质心的纵坐标,C jx表示第j个超像素质心的横坐标,C jy表示第j个超像素质心的纵坐标。
进一步地,步骤4.5中邻接权值矩阵A由下式计算:
Figure PCTCN2022076023-appb-000014
故此,本发明提供的图卷积神经网络高光谱医药成分分析的实现方法,首先,获取医药高光谱图像,构建包括训练集和测试集的医药高光谱数据集;其次,利用超像素分割算法,将所述训练集中的医药高光谱图像进行分割,得到互不重叠的超像素;然后,分别统计每个超像素的像素均值、质心像素位置、周长、面积、区域方位角,以及质心像素到每个超像素区域边界距离的特征参数,构造图数据的特征矩阵;接着,以每个超像素为图节点,最近邻超像素为边,构建区域邻接图,并获得图数据的得到邻接权值矩阵;再次,将所述特征矩阵、所述邻接权值矩阵以及训练集中医药高光谱图像对应的医药高光谱成分标签输入到图卷积神经网络中进行训练,得到图卷积神经网络的模型参数;最后,步骤2至4,得到需进行药物成分分析的区域邻接图,并获取需进行药物成分分析的区域邻接图的特征矩阵与邻接权值矩阵,将测试集中获取的特征矩阵与邻接权值矩阵输入到由步骤5训练好的模型参数所初始化的图卷积神经网络模型中,得到药物成分分析结果。与现有技术相比,本发明一方面,将医药高光谱图像数据处理成图数据,大幅度降低了像素数量,有效减少了数据量;另 一方面,以图卷积神经网络模型提取药物的特征信息,有效地学习了医药高光谱图像中的视觉特征与药物成分间的空间关系,提升了药物成分分类特征的表示能力,提高了被测药物的成分和属性精度,解决了医药种类多样、组成成分复杂、物理特性各异等难题,实现了无损药物成分分析与质量的快速检测。
附图说明
构成本发明的一部分的附图用来提供对本发明的进一步理解,本发明的示意性实施例及其说明用于解释本发明,并不构成对本发明的不当限定。在附图中:
图1为本发明实施例一提供的一种图卷积神经网络高光谱医药成分分析的实现方法的流程图;
图2为本发明实施例二提供的一种图卷积神经网络高光谱医药成分分析的实现方法的流程图;
图3为本发明实施例中邻接权值矩阵获取过程的流程图;
图4为本发明实施例的图卷积神经网络模型的结构框架示意图;
图5为本发明实施例的高光谱医药成分分析数据集部分样本示意图。
具体实施方式
需要说明的是,在不冲突的情况下,本发明中的实施例及实施例中的特征可以相互组合。下面将参考附图并结合实施例来详细说明本发明。
图1是根据本发明实施例一提供的一种图卷积神经网络高光谱医药成分分析的实现方法的流程图。如图1所示,本发明的一种图卷积神经网络高光谱医药成分分析的实现方法通过以下步骤实现:
步骤1、获取医药高光谱图像,构建医药高光谱数据集,该医药高光谱数据集包括训练集和测试集;
步骤2、利用超像素分割算法,将上述训练集中的医药高光谱图像进行分 割,得到互不重叠的超像素,该互不重叠的超像素构成医药高光谱超像素集合;
步骤3、分别统计每个超像素的像素均值、质心像素位置、周长、面积、区域方位角,以及质心像素到每个超像素区域边界距离的特征参数,构造图数据的特征矩阵;
步骤4、以每个超像素为图节点,最近邻超像素为边,构建区域邻接图,并获得图数据的邻接权值矩阵;
步骤5、将所述特征矩阵、所述邻接权值矩阵以及训练集中医药高光谱图像对应的医药高光谱成分标签输入到图卷积神经网络中进行训练,得到图卷积神经网络的模型参数;
步骤6、将测试集中的医药高光谱图像重复步骤2至4,得到需进行药物成分分析的区域邻接图,并获取需进行药物成分分析的区域邻接图的特征矩阵与邻接权值矩阵,将测试集中获取的特征矩阵与邻接权值矩阵输入到由步骤5训练好的模型参数所初始化的图卷积神经网络模型中,得到药物成分分析结果。
本发明首先,获取医药高光谱图像,构建包括训练集和测试集的医药高光谱数据集;其次,利用超像素分割算法,将所述训练集中的医药高光谱图像进行分割,得到互不重叠的超像素;然后,分别统计每个超像素的像素均值、质心像素位置、周长、面积、区域方位角,以及质心像素到每个超像素区域边界距离的特征参数,构造图数据的特征矩阵;接着,以每个超像素为图节点,最近邻超像素为边,构建区域邻接图,并获得图数据的邻接权值矩阵;再次,将所述特征矩阵、所述邻接权值矩阵以及训练集中医药高光谱图像对应的医药高光谱成分标签输入到图卷积神经网络中进行训练,得到图卷积神经网络的模型参数;最后,将测试集中的医药高光谱图像重复步骤2至4,得到需进行药物成分分析的区域邻接图,并获取需进行药物成分分析的区域邻接图的特征矩阵与邻接权值矩阵,将测试集中获取的特征矩阵与邻接权值矩阵输入到由步骤5训练好的模型参数所初始化的图卷积神经网络模型中,得到药物成分分析结果。 与现有技术相比,本发明可对医药高光谱图像中药品样本的不同成分进行精确分析,解决了医药种类多样、组成成分复杂、物理特性各异等难题,实现了无损药物成分分析与质量的快速检测。
参见图2至图4,图2为本发明实施例二提供的一种图卷积神经网络高光谱医药成分分析的实现方法的流程图;图3为本发明实施例中邻接权值矩阵获取过程的流程图;图4为本发明实施例的图卷积神经网络模型的结构框架示意图。
一种图卷积神经网络高光谱医药成分分析的实现方法,该方法包括以下步骤:
步骤1.1、准备多种不同的药物样品;
需要说明的是,该实施例中以头孢丙烯片、土霉素片、马来酸氯苯那敏片、呋塞米片、阿司匹林肠溶片、珀乙红霉素片、裸花紫珠分散片七种药物样品进行实验,但药物的数量和种类并不局限于此。图5即为头孢丙烯片、马来酸氯苯那敏片、裸花紫珠分散片的高光谱医药成分分析数据集部分样本图,具体地,图5中(a)表示裸花紫珠分散片的样本图,(b)表示头孢丙烯片的样本图,(c)表示马来酸氯苯那敏片的样本图。
步骤1.2、获取医药高光谱图像,构建医药高光谱数据集D S:采用高光谱分选仪获取药物样品的医药高光谱图像,并对采集的医药高光谱图像进行反射率校正,将校正后的图像作为医药高光谱数据集的样本;
需要说明的是,上述过程中高光谱分选仪优选采用四川双利合谱高光谱分选仪(V10E、N25E-SWIR),光谱范围分别为400-1000nm,1000-2500nm;
步骤1.3、将医药高光谱数据集D S随机划分为训练集D train和测试集D test,D S={(I 1,Y 1),(I 2,Y 2),…,(I i,Y i),…,(I d,Y d)},D train={(I 1',Y 1'),(I 2',Y 2'),…,(I i',Y i'),…,(I s,Y s)},D test={(I 1”,Y 1”),(I 2”,Y 2”),…,(I i”,Y i”),…,(I m,Y m)},I i为D S中第i个样本的图像,Y i为 D S中第i个样本对应的药物成分标签,I i'为训练集D train中第i个样本的图像,Y i'为训练集D train中第i个样本对应的药物成分标签,I i”为测试集D test中第i个样本的图像,Y i”为测试集D test中第i个样本对应的药物成分标签,d表示医药高光谱数据集D S中的样本总数,s表示训练集D train中的样本总数,m表示测试集D test中的样本总数;
步骤2、利用超像素分割算法,将所述训练集中的医药高光谱图像进行分割,得到互不重叠的超像素,所述互不重叠的超像素构成医药高光谱超像素集合;
优选地,该步骤具体表现为:采用SLIC算法(Simple Linear Iiterative Clustering,简单线性迭代聚类)对所述训练集中的医药高光谱图像进行分割,通过计算像素点之间的空间距离和光谱距离,迭代式更新超像素聚类中心和边界范围,在新的聚类中心和旧的聚类中心之间误差小于预设阈值时停止迭代,从而得到互不重叠的超像素,所述互不重叠的超像素构成医药高光谱超像素集合V={V 1,V 2,…,V i,…,V N},V i为第i个超像素,N为互不重叠的超像素个数;
步骤3,分别统计每个超像素的像素均值、质心像素位置、周长、面积、区域方位角,以及质心像素到每个超像素区域边界距离的特征参数,构造图数据的特征矩阵;
具体地,该步骤表现为:将步骤2中得到的每个超像素V i,获取每个超像素V i的像素均值
Figure PCTCN2022076023-appb-000015
质心像素C i位置(C ix,C iy)、周长l i、面积S i、区域方位角θ i以及质心像素C i到每个超像素区域边界取东、南、西、北、东南、东北、西南、西北8个方向的距离R i,从而获得特征矩阵X,
Figure PCTCN2022076023-appb-000016
其中,N为超像素个数,M为特征维数,
Figure PCTCN2022076023-appb-000017
表示实数集;
步骤4、以每个超像素为图节点,最近邻超像素为边,构建区域邻接图,并获得图数据的邻接权值矩阵;具体地,参见图3,该步骤分解为以下过程:
步骤4.1、根据步骤2得到的医药高光谱超像素集合V,将医药高光谱超像素集合中的超像素V i构成一个个图节点,采用K最近邻算法选取离超像素V i最近的K个超像素点构建边,从而构成区域邻接图G,此处,K的取值为8;
步骤4.2、根据步骤2获取的医药高光谱超像素集合V中的每个超像素区域,统计每个超像素区域的相邻超像素,得到相邻超像素集合V';
步骤4.3、根据步骤3获取超像素V i的像素均值
Figure PCTCN2022076023-appb-000018
计算每个超像素间的像素均值距离m dist,每个超像素间的像素均值距离m dist由下式计算:
Figure PCTCN2022076023-appb-000019
式中,
Figure PCTCN2022076023-appb-000020
表示第i个超像素的像素均值,
Figure PCTCN2022076023-appb-000021
表示第j个超像素的像素均值;
步骤4.4、根据步骤3获取超像素
Figure PCTCN2022076023-appb-000022
的质心像素C i位置(C ix,C iy),计算每个超像素间质心坐标距离f dist,每个超像素间质心坐标距离f dist由下式计算:
Figure PCTCN2022076023-appb-000023
式中,C i表示第i个超像素质心,C j表示第j个超像素质心,C ix表示第i个超像素质心的横坐标,C iy表示第i个超像素质心的纵坐标,C jx表示第j个超像素质心的横坐标,C jy表示第j个超像素质心的纵坐标;
步骤4.5、根据步骤4.3获取的超像素间的像素均值距离m dist和步骤4.4获取的超像素间质心坐标距离f dist进行计算,得到邻接权值矩阵A,
Figure PCTCN2022076023-appb-000024
邻 接权值矩阵A由下式计算:
Figure PCTCN2022076023-appb-000025
步骤5、将所述特征矩阵、所述邻接权值矩阵以及训练集中医药高光谱图像对应的医药高光谱成分标签输入到图卷积神经网络中进行训练,得到图卷积神经网络的模型参数;图4即为本发明实施例的图卷积神经网络模型的结构框架示意图;
步骤6、将测试集中的医药高光谱图像重复步骤2至4,得到需进行药物成分分析的区域邻接图,并获取需进行药物成分分析的区域邻接图的特征矩阵与邻接权值矩阵,将测试集中获取的特征矩阵与邻接权值矩阵输入到由步骤5训练好的模型参数所初始化的图卷积神经网络模型中,得到药物成分分析结果。
作为本发明的优选实施例,采用K折交叉验证法对步骤1.3中医药高光谱数据集D S进行训练集D train与测试集D test的划分,其中,K取10。
同时,在进一步的技术方案中,步骤5将所述特征矩阵、所述邻接权值矩阵以及训练集中医药高光谱图像对应的医药高光谱成分标签输入到图卷积神经网络中进行训练,得到图卷积神经网络的模型参数的具体实现包括以下步骤:
步骤5.1、采用Xavier方法初始化图卷积神经网络模型的模型参数α gcn,需要说明的是,Xavier方法就是一种很有效的神经网络参数初始化方法,其目的主要是使得神经网络每一层输出的方差应该尽量相等;
步骤5.2、根据步骤4构建的区域邻接图G,计算各图节点的度矩阵D,
Figure PCTCN2022076023-appb-000026
步骤5.3、图卷积神经网络模型中每层图卷积神经网络GCN的特征H由下式计算:
H (l+1)=σ(L GH (l)W (l))     (4)
其中,
Figure PCTCN2022076023-appb-000027
W为可学习的权值参数矩阵,σ为激活函数,且l=0时,H (0)=X,X为特征矩阵;
步骤5.4、在训练阶段,通过图卷积、可微池化操作来调整W以持续性的减少误差,从而优化输出,损失函数由下式计算:
Figure PCTCN2022076023-appb-000028
其中,Y i是训练样本l i的真实标签,s为训练样本数量,L为损失函数;其中,L采用交叉熵损失函数的下式计算:
Figure PCTCN2022076023-appb-000029
式中,Y(I i)为训练样本I i的真实成分,Y out(I i)为训练样本I i预测的成分,s为样本数量。
图4中的图卷积神经网络模型即包括图卷积层、图池化层和输出层。
步骤5.5、根据损失函数L的梯度经反向传播调整整个图卷积神经网络模型的模型参数α gcn,以此作为步骤5.1中的网络初始化参数,不断迭代步骤5.1到步骤5.5直到图卷积神经网络模型对药物成分分析精度趋于稳定。
相比现有技术,本发明将医药高光谱图像数据处理成图数据,大幅度降低了像素数量,有效减少了数据量;以图卷积神经网络提取药物的特征信息,有效地学习了药物高光谱图像中的视觉特征与药物成分间的空间关系,提升了药物成分分类特征的表示能力,提高了被测药物的成分和属性精度,可实现对药物成分与质量的无损、快速检测分析。
以上所述仅为本发明的较佳实施例而已,并不用以限制本发明,凡在本发明的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本发明的保护范围之内。

Claims (10)

  1. 一种图卷积神经网络高光谱医药成分分析的实现方法,其特征在于,包括以下步骤:
    步骤1、获取医药高光谱图像,构建医药高光谱数据集,所述医药高光谱数据集包括训练集和测试集;
    步骤2、利用超像素分割算法,将所述训练集中的医药高光谱图像进行分割,得到互不重叠的超像素,所述互不重叠的超像素构成医药高光谱超像素集合;
    步骤3、分别统计每个超像素的像素均值、质心像素位置、周长、面积、区域方位角,以及质心像素到每个超像素区域边界距离的特征参数,构造图数据的特征矩阵;
    步骤4、以每个超像素为图节点,最近邻超像素为边,构建区域邻接图,并获得图数据的邻接权值矩阵;
    步骤5、将所述特征矩阵、所述邻接权值矩阵以及训练集中医药高光谱图像对应的医药高光谱成分标签输入到图卷积神经网络中进行训练,得到图卷积神经网络的模型参数;
    步骤6、将测试集中的医药高光谱图像重复步骤2至4,得到需进行药物成分分析的区域邻接图,并获取需进行药物成分分析的区域邻接图的特征矩阵与邻接权值矩阵,将测试集中获取的特征矩阵与邻接权值矩阵输入到由步骤5训练好的模型参数所初始化的图卷积神经网络模型中,得到药物成分分析结果。
  2. 根据权利要求1所述的图卷积神经网络高光谱医药成分分析的实现方法,其特征在于,所述步骤1具体包括以下过程:
    步骤1.1、准备多种不同的药物样品;
    步骤1.2、获取医药高光谱图像,构建医药高光谱数据集D S:采用高光谱分选仪获取药物样品的医药高光谱图像,并对采集的医药高光谱图像进行反射率校正,将校正后的图像作为医药高光谱数据集的样本;
    步骤1.3、将医药高光谱数据集D S随机划分为训练集D train和测试集D test,D S={(I 1,Y 1),(I 2,Y 2),…,(I i,Y i),…,(I d,Y d)},D train={(I 1',Y 1'),(I 2',Y 2'),…,(I i',Y i'),…,(I s,Y s)},D test={(I 1”,Y 1”),(I 2”,Y 2”),…,(I i”,Y i”),…,(I m,Y m)},I i为D S中第i个样本的图像,Y i为D S中第i个样本对应的药物成分标签,I i'为训练集D train中第i个样本的图像,Y i'为训练集D train中第i个样本对应的药物成分标签,I i”为测试集D test中第i个样本的图像,Y i”为测试集D test中第i个样本对应的药物成分标签,d表示医药高光谱数据集D S中的样本总数,s表示训练集D train中的样本总数,m表示测试集D test中的样本总数。
  3. 根据权利要求2所述的图卷积神经网络高光谱医药成分分析的实现方法,其特征在于,采用K折交叉验证法对步骤1.3中医药高光谱数据集D S进行训练集D train与测试集D test的划分。
  4. 根据权利要求3所述的图卷积神经网络高光谱医药成分分析的实现方法,其特征在于,所述步骤2具体表现为:采用SLIC算法对所述训练集中的医药高光谱图像进行分割,通过计算像素点之间的空间距离和光谱距离,迭代式更新超像素聚类中心和边界范围,在新的聚类中心和旧的聚类中心之间误差小于预设阈值时停止迭代,从而得到互不重叠的超像素,所述互不重叠的超像素 构成医药高光谱超像素集合V={V 1,V 2,…,V i,…,V N},V i为第i个超像素,N为互不重叠的超像素个数。
  5. 根据权利要求4所述的图卷积神经网络高光谱医药成分分析的实现方法,其特征在于,所述步骤3具体表现为:将步骤2中得到的每个超像素V i,获取每个超像素V i的像素均值
    Figure PCTCN2022076023-appb-100001
    质心像素C i位置(C ix,C iy)、周长l i、面积S i、区域方位角θ i以及质心像素C i到每个超像素区域边界取东、南、西、北、东南、东北、西南、西北8个方向的距离R i,从而获得特征矩阵X,
    Figure PCTCN2022076023-appb-100002
    其中,N为超像素个数,M为特征维数,
    Figure PCTCN2022076023-appb-100003
    表示实数集。
  6. 根据权利要求5所述的图卷积神经网络高光谱医药成分分析的实现方法,其特征在于,步骤4中邻接权值矩阵的具体实现包括以下步骤:
    步骤4.1、根据步骤2得到的医药高光谱超像素集合V,将医药高光谱超像素集合中的超像素V i构成一个个图节点,采用K最近邻算法选取离超像素V i最近的K个超像素点构建边,从而构成区域邻接图G;
    步骤4.2、根据步骤2获取的医药高光谱超像素集合V中的每个超像素区域,统计每个超像素区域的相邻超像素,得到相邻超像素集合V';
    步骤4.3、根据步骤3获取超像素V i的像素均值
    Figure PCTCN2022076023-appb-100004
    计算每个超像素间的像素均值距离m dist
    步骤4.4、根据步骤3获取超像素
    Figure PCTCN2022076023-appb-100005
    的质心像素C i位置(C ix,C iy),计算每个超像素间质心坐标距离f dist
    步骤4.5、根据步骤4.3获取的超像素间的像素均值距离m dist和步骤4.4获取的超像素间质心坐标距离f dist进行计算,得到邻接权值矩阵A,
    Figure PCTCN2022076023-appb-100006
  7. 根据权利要求6所述的图卷积神经网络高光谱医药成分分析的实现方法,其特征在于,所述步骤5的具体实现包括以下步骤:
    步骤5.1、采用Xavier方法初始化图卷积神经网络模型的模型参数α gcn
    步骤5.2、根据步骤4构建的区域邻接图G,计算各图节点的度矩阵D,
    Figure PCTCN2022076023-appb-100007
    步骤5.3、图卷积神经网络模型中每层图卷积神经网络GCN的特征H由下式计算:
    H (l+1)=σ(L GH (l)W (l))  (4)
    其中,
    Figure PCTCN2022076023-appb-100008
    W为可学习的权值参数矩阵,σ为激活函数,且l=0时,H (0)=X,X为特征矩阵;
    步骤5.4、在训练阶段,通过图卷积、可微池化操作来调整W以持续性的减少误差,从而优化输出,损失函数由下式计算:
    Figure PCTCN2022076023-appb-100009
    其中,Y i是训练样本l i的真实标签,s为训练样本数量,L为损失函数;
    步骤5.5、根据损失函数L的梯度经反向传播调整整个图卷积神经网络模型的模型参数α gcn,以此作为步骤5.1中的网络初始化参数,不断迭代步骤5.1到步骤5.5直到图卷积神经网络模型对药物成分分析精度趋于稳定。
  8. 根据权利要求6所述的图卷积神经网络高光谱医药成分分析的实现方 法,其特征在于,步骤4.3中每个超像素间的像素均值距离m dist由下式计算:
    Figure PCTCN2022076023-appb-100010
    式中,
    Figure PCTCN2022076023-appb-100011
    表示第i个超像素的像素均值,
    Figure PCTCN2022076023-appb-100012
    表示第j个超像素的像素均值。
  9. 根据权利要求8所述的图卷积神经网络高光谱医药成分分析的实现方法,其特征在于,步骤4.4中每个超像素间质心坐标距离f dist由下式计算:
    Figure PCTCN2022076023-appb-100013
    式中,C i表示第i个超像素质心,C j表示第j个超像素质心,C ix表示第i个超像素质心的横坐标,C iy表示第i个超像素质心的纵坐标,C jx表示第j个超像素质心的横坐标,C jy表示第j个超像素质心的纵坐标。
  10. 根据权利要求9所述的图卷积神经网络高光谱医药成分分析的实现方法,其特征在于,步骤4.5中邻接权值矩阵A由下式计算:
    Figure PCTCN2022076023-appb-100014
PCT/CN2022/076023 2021-07-19 2022-02-11 一种图卷积神经网络高光谱医药成分分析的实现方法 WO2023000653A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110811547.X 2021-07-19
CN202110811547.XA CN113269196B (zh) 2021-07-19 2021-07-19 一种图卷积神经网络高光谱医药成分分析的实现方法

Publications (1)

Publication Number Publication Date
WO2023000653A1 true WO2023000653A1 (zh) 2023-01-26

Family

ID=77236799

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/076023 WO2023000653A1 (zh) 2021-07-19 2022-02-11 一种图卷积神经网络高光谱医药成分分析的实现方法

Country Status (2)

Country Link
CN (1) CN113269196B (zh)
WO (1) WO2023000653A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113269196B (zh) * 2021-07-19 2021-09-28 湖南大学 一种图卷积神经网络高光谱医药成分分析的实现方法
CN113989525B (zh) * 2021-12-24 2022-03-29 湖南大学 自适应随机块卷积核网络的高光谱中药材鉴别方法
CN115979973B (zh) * 2023-03-20 2023-06-16 湖南大学 一种基于双通道压缩注意力网络的高光谱中药材鉴别方法

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111681249B (zh) * 2020-05-14 2023-11-28 中山艾尚智同信息科技有限公司 基于Grabcut的砂石颗粒的改进分割算法研究
CN112446417B (zh) * 2020-10-16 2022-04-12 山东大学 基于多层超像素分割的纺锤形果实图像分割方法及系统
CN113095305B (zh) * 2021-06-08 2021-09-07 湖南大学 一种医药异物高光谱分类检测方法

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106022353A (zh) * 2016-05-05 2016-10-12 浙江大学 一种基于超像素分割的图像语义标注方法
US20190156154A1 (en) * 2017-11-21 2019-05-23 Nvidia Corporation Training a neural network to predict superpixels using segmentation-aware affinity loss
WO2020165913A1 (en) * 2019-02-12 2020-08-20 Tata Consultancy Services Limited Automated unsupervised localization of context sensitive events in crops and computing extent thereof
CN111695636A (zh) * 2020-06-15 2020-09-22 北京师范大学 一种基于图神经网络的高光谱图像分类方法
CN112381813A (zh) * 2020-11-25 2021-02-19 华南理工大学 一种基于图卷积神经网络的全景图视觉显著性检测方法
CN113269196A (zh) * 2021-07-19 2021-08-17 湖南大学 一种图卷积神经网络高光谱医药成分分析的实现方法

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115825316A (zh) * 2023-02-15 2023-03-21 武汉宏韧生物医药股份有限公司 基于超临界色谱法的药物有效成分分析方法及装置
CN116563711A (zh) * 2023-05-17 2023-08-08 大连民族大学 基于动量更新的二分类编码器网络的高光谱目标检测方法
CN116563711B (zh) * 2023-05-17 2024-02-09 大连民族大学 基于动量更新的二分类编码器网络的高光谱目标检测方法
CN116429710B (zh) * 2023-06-15 2023-09-26 武汉大学人民医院(湖北省人民医院) 一种药物组分检测方法、装置、设备及可读存储介质
CN116429710A (zh) * 2023-06-15 2023-07-14 武汉大学人民医院(湖北省人民医院) 一种药物组分检测方法、装置、设备及可读存储介质
CN116612333A (zh) * 2023-07-17 2023-08-18 山东大学 一种基于快速全卷积网络的医学高光谱影像分类方法
CN116612333B (zh) * 2023-07-17 2023-09-29 山东大学 一种基于快速全卷积网络的医学高光谱影像分类方法
CN116662593B (zh) * 2023-07-21 2023-10-27 湖南大学 一种基于fpga的全流水线医药高光谱图像神经网络分类方法
CN116662593A (zh) * 2023-07-21 2023-08-29 湖南大学 一种基于fpga的全流水线医药高光谱图像神经网络分类方法
CN117333486A (zh) * 2023-11-30 2024-01-02 清远欧派集成家居有限公司 Uv面漆性能检测数据分析方法、装置及存储介质
CN117333486B (zh) * 2023-11-30 2024-03-22 清远欧派集成家居有限公司 Uv面漆性能检测数据分析方法、装置及存储介质
CN118459405A (zh) * 2024-04-22 2024-08-09 陕西科弘健康产业有限公司 一种从千层塔中提取石杉碱甲的工艺
CN118096536A (zh) * 2024-04-29 2024-05-28 中国科学院长春光学精密机械与物理研究所 基于超图神经网络的遥感高光谱图像超分辨率重构方法
CN118469400A (zh) * 2024-07-09 2024-08-09 中科信息产业(山东)有限公司 基于中药识别数据的中医人才分析系统

Also Published As

Publication number Publication date
CN113269196B (zh) 2021-09-28
CN113269196A (zh) 2021-08-17

Similar Documents

Publication Publication Date Title
WO2023000653A1 (zh) 一种图卷积神经网络高光谱医药成分分析的实现方法
Mei et al. Automatic fabric defect detection with a multi-scale convolutional denoising autoencoder network model
Yang et al. Rapid detection and counting of wheat ears in the field using YOLOv4 with attention module
Wei et al. A small UAV based multi-temporal image registration for dynamic agricultural terrace monitoring
Hong et al. Prediction of soil organic matter by VIS–NIR spectroscopy using normalized soil moisture index as a proxy of soil moisture
Xiao et al. Wheat fusarium head blight detection using UAV-based spectral and texture features in optimal window size
Huang et al. Diagnosis of the severity of Fusarium head blight of wheat ears on the basis of image and spectral feature fusion
Luo et al. Grape berry detection and size measurement based on edge image processing and geometric morphology
Li et al. A generalized framework of feature learning enhanced convolutional neural network for pathology-image-oriented cancer diagnosis
Yin et al. Maize small leaf spot classification based on improved deep convolutional neural networks with a multi-scale attention mechanism
Gong et al. Patch matching and dense CRF-based co-refinement for building change detection from bi-temporal aerial images
Chen et al. Improving building change detection in VHR remote sensing imagery by combining coarse location and co-segmentation
Wang et al. A novel method of aircraft detection based on high-resolution panchromatic optical remote sensing images
Cai et al. An adaptive image segmentation method with automatic selection of optimal scale for extracting cropland parcels in smallholder farming systems
Zhang et al. A comprehensive evaluation of approaches for built-up area extraction from landsat oli images using massive samples
Liang et al. An adaptive hierarchical detection method for ship targets in high-resolution SAR images
Liu et al. Incorporating deep features into GEOBIA paradigm for remote sensing imagery classification: A patch-based approach
Zhang et al. Urban area extraction by regional and line segment feature fusion and urban morphology analysis
Xue et al. Object-oriented crop classification using time series sentinel images from google earth engine
Wang et al. Supervised classification high-resolution remote-sensing image based on interval type-2 fuzzy membership function
Hua et al. Real-time object detection in remote sensing images based on visual perception and memory reasoning
Jin et al. Extraction of arecanut planting distribution based on the feature space optimization of PlanetScope imagery
Amin et al. FabNet: A features agglomeration-based convolutional neural network for multiscale breast cancer histopathology images classification
Ni et al. TongueCaps: An improved capsule network model for multi-classification of tongue color
Herrera et al. A stereovision matching strategy for images captured with fish-eye lenses in forest environments

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22844827

Country of ref document: EP

Kind code of ref document: A1