CN113989563A - Multi-scale multi-label fusion Chinese medicine tongue picture classification method - Google Patents

Info

Publication number
CN113989563A
Authority
CN
China
Prior art keywords
tongue
classification
label
scale
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111273511.7A
Other languages
Chinese (zh)
Inventor
张明川
赵凌昊
徐文萱
王琳
郑瑞娟
冀治航
宋建强
朱军龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan University of Science and Technology
Original Assignee
Henan University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan University of Science and Technology filed Critical Henan University of Science and Technology
Priority to CN202111273511.7A priority Critical patent/CN113989563A/en
Publication of CN113989563A publication Critical patent/CN113989563A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G06F18/24137 Distances to cluster centroïds
    • G06F18/2414 Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/90 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to alternative medicines, e.g. homeopathy or oriental medicines

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Alternative & Traditional Medicine (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Image Analysis (AREA)

Abstract

A multi-scale multi-label fusion method for classifying traditional Chinese medicine tongue images, in the technical field of computer vision. The method applies deep learning to tongue feature classification: a feature pyramid network fuses high-level semantic information with low-level detail features to form a higher-resolution feature map, the processed tongue features are labeled, and the correlations among the labels are extracted to obtain the classification result. The beneficial effects of the invention are as follows: extracting and fusing multi-scale features in the feature pyramid network increases the resolution and diversity of the features, and performing multi-label classification on them improves the accuracy and robustness of tongue image classification.

Description

Multi-scale multi-label fusion Chinese medicine tongue picture classification method
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a multi-scale multi-label fusion Chinese medicine tongue image classification method.
Background
Automatic classification of tongue features in traditional Chinese medicine is the core of objectifying tongue diagnosis, and the accuracy of the classification result determines both the reliability of subsequent processing and how readily practitioners of traditional Chinese medicine accept it. According to the traditional Chinese medicine principle of diagnosing the interior from the exterior, changes in tongue features reflect functional and pathological changes in the internal organs and show the abundance or insufficiency of qi and blood on the tongue. Automatic tongue classification has therefore become a research hotspot in the objectification of tongue diagnosis.
When a computer analyzes tongue features, it extracts from tongue images the features related to the body's physiological functions and pathological changes. Tongue colors differ only slightly and resemble one another, so tongue image classification demands increasingly high precision. However, most existing studies frame tongue classification as a single-label multi-class (or binary) problem, and the few studies that use multi-label learning achieve only modest results because they use a small number of labels and no deep learning techniques. From a practical point of view, classification problems in medicine are multi-output classification problems, and multi-label classification is one kind of multi-output classification.
Most previous work on tongue classification classifies each label independently and ignores the potential dependencies among labels, even though exploiting these dependencies can improve multi-label image classification to some extent. The few studies that do use multiple labels either do not use deep learning or do not fully mine the dependencies among the labels, which limits the accuracy of tongue classification.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a multi-scale multi-label fusion method for classifying traditional Chinese medicine tongue images that addresses, among other problems, the low tongue image classification accuracy of the prior art.
The technical scheme adopted by the invention to solve this problem is a multi-scale multi-label fusion traditional Chinese medicine tongue image classification method comprising the following steps:
step 1, acquiring tongue images under standard lighting with fixed professional photographic equipment, preprocessing the acquired tongue images, and constructing an original tongue image data set;
step 2, constructing a multi-scale feature fusion network: extracting features of the input tongue image with a feature pyramid network structure, fusing the features of corresponding layers in the pyramid model by a superposition operation, constructing the final output features, and integrating the fused features;
step 3, labeling the tongue image features extracted in step 2 using a semi-automatic labeling method;
step 4, dividing the features labeled in step 3 into different sub-label sets according to tongue region, and then integrating all the sub-label sets into a multi-label data set; for example, the sub-label set for the tongue tip region, once analyzed, yields results concerning the heart and lungs, while cracks are mainly distributed at the tongue root and the middle of the tongue, so a cracked tongue only needs to be analyzed in two regions: the tongue root and the middle of the tongue;
step 5, training a classification model with a multi-label classification method, automatically mining the correlations among labels during training, and applying those correlations to the classification model so that tongue image classification is more comprehensive;
step 6, inputting a tongue photo to be tested and judging its validity, and proceeding to the next step if the requirement is met;
step 7, feeding the tongue photo to be tested into the trained model for prediction, and outputting the classification result.
The semi-automatic labeling method used in step 3 employs Labelme, an open-source image annotation tool written in Python. An annotator first uses the software to manually divide the general structure of the tongue image into regions and label them, then manually fixes the labeling method and range; the remaining images are labeled automatically with the software, and the results are finally checked manually.
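As a rough illustration of how the manually produced annotations might be consumed, the sketch below reads one Labelme annotation record and groups its shape labels by tongue region. Labelme stores annotations as JSON with a "shapes" list whose entries carry a "label" and "points"; the "<region>/<feature>" naming scheme, the file name, and the function name region_labels are hypothetical and are not specified in this document.

```python
import json


def region_labels(labelme_json):
    """Group the shape labels of one Labelme record by tongue region.

    Assumption: annotators name each shape "<region>/<feature>",
    for example "tip/red". This convention is hypothetical.
    """
    record = json.loads(labelme_json)
    grouped = {}
    for shape in record.get("shapes", []):
        region, _, feature = shape["label"].partition("/")
        grouped.setdefault(region, []).append(feature)
    return grouped


# A hand-written record that mimics the structure of a Labelme JSON file.
example = json.dumps({
    "imagePath": "tongue_0001.jpg",  # hypothetical file name
    "shapes": [
        {"label": "tip/red", "shape_type": "polygon",
         "points": [[10, 10], [30, 10], [20, 30]]},
        {"label": "root/crack", "shape_type": "polygon",
         "points": [[5, 60], [40, 60], [20, 80]]},
        {"label": "middle/crack", "shape_type": "polygon",
         "points": [[5, 40], [40, 40], [20, 55]]},
    ],
})

print(region_labels(example))
```

The grouped result can then be reviewed by hand, which corresponds to the final manual check in the workflow above.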
The specific process of training the classification model with the multi-label classification method in step 5 is as follows: after a tongue image enters the convolutional neural network, the network automatically analyzes the sub-label sets and mines the correlations among small targets in order to analyze the various features of the tongue image; the parameters are then adjusted continuously during training to optimize the whole network model, and the optimal weights are saved.
The validity of the tongue photo to be tested in step 6 is judged by checking whether the proportion of the whole photograph occupied by the tongue body meets the requirement.
The threshold for this proportion is 80%: the requirement is met when the tongue body occupies 80% or more of the whole tongue photo, and is not met when it occupies less than 80%.
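The 80% validity rule can be sketched as a simple pixel count over a binary tongue mask. How the mask is produced is not described in this document; the sketch assumes a separate segmentation step has already marked tongue pixels with 1 and background pixels with 0, and the function names are illustrative.

```python
def tongue_ratio(mask):
    """Fraction of pixels marked as tongue (1) in a binary mask given as a list of rows."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("empty mask")
    tongue = sum(1 for row in mask for px in row if px)
    return tongue / total


def is_valid_photo(mask, threshold=0.80):
    """Apply the rule from the text: valid when the tongue covers at least 80% of the photo."""
    return tongue_ratio(mask) >= threshold


# Toy 4x5 mask: 17 of 20 pixels belong to the tongue.
mask = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
]

print(tongue_ratio(mask), is_valid_photo(mask))
```

A photo that fails the check would be rejected before prediction, as step 6 requires.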
The beneficial effects of the invention are as follows: the method applies deep learning to tongue feature classification, increases the resolution and diversity of the features by extracting and fusing multi-scale features in the feature pyramid network, and performs multi-label classification, thereby improving the accuracy and robustness of tongue image classification.
Drawings
FIG. 1 is a schematic diagram of tongue image classification structure according to the present invention;
FIG. 2 is a schematic diagram of a feature extraction network according to an embodiment of the present invention;
FIG. 3 is a schematic illustration of tongue region positioning and characterization according to an embodiment of the present invention;
FIG. 4 is a schematic overall flowchart of the tongue image classification method according to the present invention.
Detailed Description
The invention provides a multi-scale multi-label fusion method for classifying traditional Chinese medicine tongue images. For tongue classification, the resolution of the acquired tongue image is first improved and the recognizability of small targets is enhanced; the processed tongue image then undergoes feature extraction, fusion and labeling, followed by correlation analysis, to achieve more comprehensive multi-label tongue classification.
The training part of the invention consists of two parts (as shown in FIG. 1): construction of the feature pyramid network and construction of the depth model. Together they complete the whole multi-scale multi-label fusion tongue classification process. The feature pyramid network serves two main purposes: constructing the tongue image data set and extracting features (the main process comprises extracting, fusing and re-extracting the tongue image features). Depth model construction mainly comprises two processes: tongue feature labeling and tongue classification.
1. Feature pyramid network construction stage
First, a deep network is constructed following the deep convolutional feature paradigm, and a pyramid strategy is used to fuse multi-scale features into a deep abstract representation of the input tongue image.
The feature pyramid network improves the accuracy of the high-level tongue features and forms a feature map of larger resolution, which prevents small targets from being lost to the repeated downsampling of a deep convolutional network. (Low-level features are general, easily expressed information and contain more position and detail information; high-level features are complex semantic and global features that are hard to interpret, have low resolution, and perceive detail poorly.)
The whole process of constructing the feature pyramid network is as follows:
(1) Construct a tongue image data set. Tongue images are collected under standard lighting so that the photographs are sharp, with the tongue body as relaxed as possible, the tongue surface displayed flat, and the tongue body fully exposed. The acquired tongue images are then preprocessed: the deflection of the tongue body is corrected, the redundant part at the tongue root is removed, and the tongue body structure is extracted according to the division of the tongue into the tip and the two sides, to construct the tongue image data set.
(2) Extract the feature information of the tongue image with a feature pyramid network structure. As an image passes through successive convolutions, its feature map shrinks and its semantic information grows richer, but the resolution of the feature map falls, which makes small targets hard to detect. A tongue image contains many small targets that need to be detected and analyzed, so a feature pyramid network is adopted. A series of convolutions yields a sequence of feature maps running from large to small, from low level to high level, and from high resolution to low resolution. The high-level feature maps can be restored step by step by upsampling, which enlarges the feature map while largely preserving the high-level semantic information. Small targets are then detected on the enlarged map, which solves the problem that they are hard to detect; in effect, this raises the resolution of the high-level feature map.
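To make the resolution argument in item (2) concrete, the sketch below computes how large each pyramid level is and how many feature-map cells a small target covers at each level. The 512-pixel input, the strides of 4, 8 and 16 for p2 to p4 (a common choice in feature pyramid designs, not stated in this document) and the 12-pixel target are all assumptions chosen for illustration.

```python
def pyramid_sizes(input_size, strides):
    """Side length of each pyramid level, named p2, p3, ... in order of increasing stride."""
    return {f"p{i + 2}": input_size // s for i, s in enumerate(strides)}


def target_extent(target_px, stride):
    """Number of feature-map cells that a target of the given pixel width spans."""
    return target_px / stride


STRIDES = [4, 8, 16]   # assumed strides for p2, p3, p4
sizes = pyramid_sizes(512, STRIDES)
extents = {f"p{i + 2}": target_extent(12, s) for i, s in enumerate(STRIDES)}

print(sizes)
print(extents)
```

Under these assumptions a 12-pixel crack spans three cells on p2 but less than one cell on p4, which is why the deeper, lower-resolution maps alone tend to miss small targets.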
(3) Build the basic feature fusion network. A multi-scale feature fusion network is constructed by stacking several convolution layers and sampling layers. The feature extraction network within the feature pyramid network is shown in FIG. 2. A large-scale feature layer (such as p2) is low-level, has high resolution, and is rich in color features; it is downsampled to the same size as p3. A small-scale feature layer (such as p4) is high-level and has low resolution; it is upsampled to the same size as p3. The feature points at corresponding positions of these layers are then added to generate a feature map of the same size as p3. In this way the tongue feature maps are fused by superposition: the features of corresponding layers in the pyramid model are fused by the superposition operation to construct the final output features, and sampling lets feature maps of different sizes be superposed on one another to form a feature map of larger resolution.
To address the tendency of small targets to be lost in multi-label image classification, the basic feature fusion network uses the feature pyramid network as its basic feature extraction network and uses the fused p3 layer as the final feature layer.
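The fusion at the p3 scale can be sketched on single-channel feature maps stored as nested lists. This is a minimal illustration, not the network of FIG. 2: 2x2 average pooling and nearest-neighbour upsampling are assumed as the sampling operations, the p3 map itself is assumed to take part in the sum, and real feature maps would carry many channels and usually pass through convolutions before being added.

```python
def downsample2(fmap):
    """2x2 average pooling: halves the height and width of a feature map."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            (fmap[r][c] + fmap[r][c + 1] + fmap[r + 1][c] + fmap[r + 1][c + 1]) / 4
            for c in range(0, w - 1, 2)
        ]
        for r in range(0, h - 1, 2)
    ]


def upsample2(fmap):
    """Nearest-neighbour upsampling: doubles the height and width of a feature map."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse_to_p3(p2, p3, p4):
    """Bring p2 and p4 to the size of p3 and add the three maps element by element."""
    down = downsample2(p2)
    up = upsample2(p4)
    return [
        [d + m + u for d, m, u in zip(d_row, m_row, u_row)]
        for d_row, m_row, u_row in zip(down, p3, up)
    ]


p2 = [[1.0] * 4 for _ in range(4)]   # 4x4 low-level, high-resolution map
p3 = [[2.0] * 2 for _ in range(2)]   # 2x2 middle map
p4 = [[3.0]]                         # 1x1 high-level, low-resolution map

fused = fuse_to_p3(p2, p3, p4)
print(fused)
```

The fused map has the same size as p3, matching the choice of the fused p3 layer as the final feature layer.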
2. Depth model construction stage
The depth model consists mainly of a tongue feature labeling architecture and a multi-label tongue feature classification model. A semi-automatic labeling method is applied to the fused tongue feature maps to obtain a multi-label data set, and a multi-label classification network is then used to train the classification model, achieving multi-label tongue classification.
(1) Tongue image labeling architecture
On the fused tongue feature maps, medical personnel individually annotate each label for several groups of sample tongue images, drawing on professional medical knowledge mainly to integrate the distinctive pathological information represented by each fixed region of the tongue. Each label corresponds to the physiological and pathological information of the corresponding tongue region, the number of samples for each label is kept as balanced as possible, and the annotation is image-level weakly supervised annotation. A simple automatic tongue image labeling tool (Labelme, an open-source image annotation tool written in Python) is then used to label automatically, following the method and range of the medical personnel's earlier annotations. The feature labels represented by each tongue region of interest are combined into one large label (for example, in FIG. 3, all labels in the heart and lung region are combined into one large label) to obtain a multi-label data set, which medical professionals finally review. The positioning and features of the tongue regions are shown in FIG. 3.
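One common way to represent such a multi-label data set is a multi-hot vector per image, with one position for every region and feature pair. The sketch below assumes a small, hypothetical label vocabulary; the region names, feature names and the encode function are illustrative and do not come from this document.

```python
# Hypothetical vocabulary: the features that may be annotated in each tongue region.
REGION_FEATURES = {
    "tip": ["red", "pale"],
    "middle": ["crack", "white_coating"],
    "root": ["crack", "thick_coating"],
}

# Flatten the per-region sub-label sets into one ordered multi-label vocabulary.
LABELS = [
    f"{region}/{feature}"
    for region, features in REGION_FEATURES.items()
    for feature in features
]


def encode(sample_labels):
    """Turn the labels attached to one image into a multi-hot vector over LABELS."""
    present = set(sample_labels)
    unknown = present - set(LABELS)
    if unknown:
        raise KeyError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in present else 0 for name in LABELS]


vector = encode(["tip/red", "middle/crack", "root/crack"])
print(LABELS)
print(vector)
```

Keeping the region in the label name preserves the link between a feature and the tongue area where it was observed, so a crack at the root and a crack in the middle remain separate labels.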
(2) Multi-label tongue feature classification model
Unlike traditional classification, multi-label classification is a more complex task in which each sample can belong to one or more classes at the same time. Solving the problem with multi-label learning allows tongue images to be classified more accurately and more comprehensively. Traditional tongue classification usually outputs a single attribute with a single label, whereas the method of the invention can detect and output several features at once. For example, given a tongue image to be predicted, the output prediction might be: a red tongue with white coating, moist and smooth body fluid, and no cracks, with the tongue regions indicating features such as exuberant heart fire and deficiency of the spleen and stomach.
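The difference from single-label classification can be sketched with independent sigmoid outputs, one per label, so that several labels may be active for the same image. The document does not specify the output layer or loss function; independent sigmoids with a 0.5 threshold and a binary cross-entropy loss are assumed here as a standard multi-label setup, and the label names and logits are invented for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(logits, names, threshold=0.5):
    """Return every label whose sigmoid probability reaches the threshold."""
    return [name for name, z in zip(names, logits) if sigmoid(z) >= threshold]


def bce_loss(logits, targets):
    """Mean binary cross-entropy over the labels of one sample."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total += -(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
    return total / len(logits)


names = ["red_tongue", "white_coating", "cracks"]   # hypothetical labels
logits = [2.0, 0.5, -3.0]                           # hypothetical network outputs

print(predict(logits, names))
print(bce_loss(logits, [1, 1, 0]))
```

A softmax output would force exactly one label per image, whereas the per-label sigmoid lets "red tongue" and "white coating" be reported together while "cracks" stays off.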
A convolutional neural network model for multi-label learning is used to train a model of the feature information contained in the labels of each tongue region, and the labels of the remaining samples are inferred from it. The classification model is thus trained with a multi-label classification method; the correlations among labels are mined automatically and applied to the classification model, making tongue classification more accurate.
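The document states that label correlations are mined automatically during training but does not describe the mechanism. As a separate, purely illustrative statistic, the sketch below estimates label co-occurrence from multi-hot annotations; it shows one simple way to quantify dependency between labels and is not presented as the method of the invention. The toy data and function names are assumptions.

```python
def cooccurrence(rows):
    """Count how often each pair of labels is active in the same sample."""
    n = len(rows[0])
    counts = [[0] * n for _ in range(n)]
    for row in rows:
        for i in range(n):
            if row[i]:
                for j in range(n):
                    if row[j]:
                        counts[i][j] += 1
    return counts


def conditional(counts, i, j):
    """Estimate P(label j present | label i present) from the co-occurrence counts."""
    return counts[i][j] / counts[i][i] if counts[i][i] else 0.0


# Toy multi-hot annotations for three hypothetical labels across four images.
rows = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 1],
]

counts = cooccurrence(rows)
print(counts)
print(conditional(counts, 1, 0))
```

In this toy data label 1 never appears without label 0, so the estimated conditional probability is 1.0; a dependency of this kind is what classifying each label independently would ignore.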

Claims (5)

1. A multi-scale multi-label fusion traditional Chinese medicine tongue image classification method, characterized by comprising the following steps:
step 1, acquiring tongue images under standard lighting with fixed professional photographic equipment, preprocessing the acquired tongue images, and constructing an original tongue image data set;
step 2, constructing a multi-scale feature fusion network: extracting features of the input tongue image with a feature pyramid network structure, fusing the features of corresponding layers in the pyramid model by a superposition operation, constructing the final output features, and integrating the fused features;
step 3, labeling the tongue image features extracted in step 2 using a semi-automatic labeling method;
step 4, dividing the features labeled in step 3 into different sub-label sets according to tongue region, and then integrating all the sub-label sets into a multi-label data set;
step 5, training a classification model with a multi-label classification method, automatically mining the correlations among labels during training, and applying those correlations to the classification model so that tongue image classification is more comprehensive;
step 6, inputting a tongue photo to be tested and judging its validity, and proceeding to the next step if the requirement is met;
step 7, feeding the tongue photo to be tested into the trained model for prediction, and outputting the classification result.
2. The multi-scale multi-label fusion traditional Chinese medicine tongue image classification method according to claim 1, wherein the semi-automatic labeling method used in step 3 employs Labelme, an open-source image annotation tool written in Python: the general structure of the tongue image is first manually divided into regions and labeled with the software, the labeling method and range are then manually determined, the remaining images are labeled automatically with the software, and the results are finally checked manually.
3. The multi-scale multi-label fusion traditional Chinese medicine tongue image classification method according to claim 1, wherein the specific process of training the classification model with the multi-label classification method in step 5 is as follows: after a tongue image enters the convolutional neural network, the network automatically analyzes the sub-label sets and mines the correlations among small targets in order to analyze the various features of the tongue image; the parameters are then adjusted continuously during training to optimize the whole network model, and the optimal weights are saved.
4. The multi-scale multi-label fusion traditional Chinese medicine tongue image classification method according to claim 1, wherein the validity of the tongue photo to be tested in step 6 is judged by checking whether the proportion of the whole photograph occupied by the tongue body meets the requirement.
5. The multi-scale multi-label fusion traditional Chinese medicine tongue image classification method according to claim 4, wherein the threshold for the proportion of the whole tongue photo occupied by the tongue body is 80%: the requirement is met if the proportion is greater than or equal to 80%, and is not met if the proportion is less than 80%.
CN202111273511.7A 2021-10-29 2021-10-29 Multi-scale multi-label fusion Chinese medicine tongue picture classification method Pending CN113989563A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111273511.7A CN113989563A (en) 2021-10-29 2021-10-29 Multi-scale multi-label fusion Chinese medicine tongue picture classification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111273511.7A CN113989563A (en) 2021-10-29 2021-10-29 Multi-scale multi-label fusion Chinese medicine tongue picture classification method

Publications (1)

Publication Number Publication Date
CN113989563A (en) 2022-01-28

Family

ID=79744527

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111273511.7A Pending CN113989563A (en) 2021-10-29 2021-10-29 Multi-scale multi-label fusion Chinese medicine tongue picture classification method

Country Status (1)

Country Link
CN (1) CN113989563A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117392138A (en) * 2023-12-13 2024-01-12 四川大学 Tongue picture image processing method, storage medium and electronic equipment
CN117853345A (en) * 2024-03-07 2024-04-09 吉林大学 Image optimization method and system for traditional Chinese medicine tongue diagnosis and tongue image imaging
CN117853345B (en) * 2024-03-07 2024-05-24 吉林大学 Image optimization method and system for traditional Chinese medicine tongue diagnosis and tongue image imaging

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117392138A (en) * 2023-12-13 2024-01-12 四川大学 Tongue picture image processing method, storage medium and electronic equipment
CN117392138B (en) * 2023-12-13 2024-02-13 四川大学 Tongue picture image processing method, storage medium and electronic equipment
CN117853345A (en) * 2024-03-07 2024-04-09 吉林大学 Image optimization method and system for traditional Chinese medicine tongue diagnosis and tongue image imaging
CN117853345B (en) * 2024-03-07 2024-05-24 吉林大学 Image optimization method and system for traditional Chinese medicine tongue diagnosis and tongue image imaging

Similar Documents

Publication Publication Date Title
CN110321923B (en) Target detection method, system and medium for fusion of different-scale receptive field characteristic layers
US20210407076A1 (en) Multi-sample Whole Slide Image Processing in Digital Pathology via Multi-resolution Registration and Machine Learning
CN107977671B (en) Tongue picture classification method based on multitask convolutional neural network
CN109344701B (en) Kinect-based dynamic gesture recognition method
CN109857889B (en) Image retrieval method, device and equipment and readable storage medium
CN108596102B (en) RGB-D-based indoor scene object segmentation classifier construction method
CN113989662B (en) Remote sensing image fine-grained target identification method based on self-supervision mechanism
CN111062441A (en) Scene classification method and device based on self-supervision mechanism and regional suggestion network
CN114998220B (en) Tongue image detection and positioning method based on improved Tiny-YOLO v4 natural environment
CN110929746A (en) Electronic file title positioning, extracting and classifying method based on deep neural network
CN113963240A (en) Comprehensive detection method for multi-source remote sensing image fusion target
CN114821014A (en) Multi-mode and counterstudy-based multi-task target detection and identification method and device
CN111563550A (en) Sperm morphology detection method and device based on image technology
Ge et al. Coarse-to-fine foraminifera image segmentation through 3D and deep features
CN113989563A (en) Multi-scale multi-label fusion Chinese medicine tongue picture classification method
US20220304617A1 (en) System and method for diagnosing small bowel cleanliness
CN116468690B (en) Subtype analysis system of invasive non-mucous lung adenocarcinoma based on deep learning
CN117333948A (en) End-to-end multi-target broiler behavior identification method integrating space-time attention mechanism
CN114947751A (en) Mobile terminal intelligent tongue diagnosis method based on deep learning
Xiang et al. Application of convolutional neural network algorithm in diagnosis of chronic cough and tongue in children with traditional Chinese medicine
CN113269195A (en) Reading table image character recognition method and device and readable storage medium
CN112598056A (en) Software identification method based on screen monitoring
CN113096079A (en) Image analysis system and construction method thereof
CN115690092B (en) Method and device for identifying and counting amoeba cysts in corneal confocal image
CN113627255B (en) Method, device and equipment for quantitatively analyzing mouse behaviors and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination