WO2020207377A1 - Image recognition model training and image recognition method, apparatus and system - Google Patents

Image recognition model training and image recognition method, apparatus and system

Info

Publication number
WO2020207377A1
WO2020207377A1 (PCT/CN2020/083489)
Authority
WO
WIPO (PCT)
Prior art keywords
image
label
strong
lesion
information
Prior art date
Application number
PCT/CN2020/083489
Other languages
English (en)
French (fr)
Inventor
郑瀚
孙钟前
尚鸿
付星辉
杨巍
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Co., Ltd.
Publication of WO2020207377A1
Priority to US 17/321,219 (granted as US11967414B2)

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 30/00 ICT specially adapted for the handling or processing of medical images
    • G16H 30/40 ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H 50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H 50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10068 Endoscopic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30028 Colon; Small intestine
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30092 Stomach; Gastric
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30096 Tumor; Lesion

Definitions

  • This application relates to the field of computer technology, and in particular to an image recognition model training and image recognition method, device and system.
  • The diagnosis of gastrointestinal diseases is usually based on diagnostic tools such as endoscopy. After internal images of the body are obtained, relevant medical personnel judge whether a disease exists, and its type, through visual observation, so recognition efficiency is low. In some current recognition methods, a large number of endoscopic images are acquired, relevant medical staff mark each image with a lesion category, and the marked images are used as samples for model training, so that lesion recognition can be performed on other medical images based on the trained model to determine whether a lesion has occurred and automatically give a diagnosis result.
  • The embodiments of the present application provide an image recognition model training method, an image recognition method, and a corresponding device and system to improve the accuracy of lesion prediction.
  • An embodiment of the application provides an image recognition model training method, including:
  • acquiring a training image sample set, where the training image sample set includes at least strong-label training image samples; a strong-label training image sample is an image sample with strong label information, and the strong label information includes at least the label information of the lesion category and the lesion location;
  • The strong supervision objective function is a loss function between the identified lesion category and the lesion category in the strong label information.
  • An embodiment of the present application also provides an image recognition method, including:
  • based on a preset image recognition model, taking the image feature information of the image to be recognized as an input parameter to obtain a recognition result of the lesion category of the image to be recognized; the image recognition model is trained using a training image sample set that includes at least strong-label training image samples to determine the lesion category recognition result; a strong-label training image sample is an image sample with strong label information, and the strong label information includes at least the label information of the lesion category and the lesion location.
  • An embodiment of the application provides an image recognition model training device, including:
  • an acquiring module, configured to acquire a training image sample set, where the training image sample set includes at least strong-label training image samples; a strong-label training image sample is an image sample with strong label information, and the strong label information includes at least the label information of the lesion category and the lesion location;
  • an extraction module, configured to extract image feature information of the image samples in the training image sample set; and
  • a training module, configured to mark the image feature information belonging to each preset lesion category based on the image feature information of the image samples and the corresponding strong label information, and to train the image recognition model according to the marking result until the strong supervision objective function of the image recognition model converges, obtaining a trained image recognition model.
  • The strong supervision objective function is a loss function between the identified lesion category and the lesion category in the strong label information.
  • An embodiment of the present application also provides an image recognition device, including:
  • an acquisition module, configured to acquire the image to be recognized;
  • an extraction module, configured to extract image feature information of the image to be recognized; and
  • a recognition module, configured to obtain the recognition result of the lesion category of the image to be recognized based on a preset image recognition model, taking the image feature information of the image to be recognized as an input parameter; the image recognition model is trained using a training image sample set that includes at least strong-label training image samples to determine the lesion category recognition result; a strong-label training image sample is an image sample with strong label information, and the strong label information includes at least the label information of the lesion category and the lesion location.
  • An embodiment of the present application also provides an image recognition system, which includes at least an image acquisition device, an image processing device, and an output device, wherein:
  • the image acquisition device is configured to acquire the image to be recognized;
  • the image processing device is configured to extract the image feature information of the image to be recognized and, based on a preset image recognition model, take the image feature information of the image to be recognized as an input parameter to obtain the recognition result of the lesion category of the image to be recognized;
  • the image recognition model is trained using a training image sample set that includes at least strong-label training image samples to determine the lesion category recognition result; a strong-label training image sample is an image sample with strong label information, and the strong label information includes at least the label information of the lesion category and the lesion location; and
  • the output device is configured to output the recognition result of the lesion category of the image to be recognized.
  • An embodiment of the present application also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the program, the processor implements the steps of any of the above image recognition model training methods or image recognition methods.
  • The embodiments of the present application also provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of any of the above image recognition model training methods or image recognition methods are realized.
  • The embodiments of the present application rely not only on the label information of the lesion category but can also use the lesion location to locate the image feature information of a given lesion category more precisely, so that image feature information belonging to the lesion category in the strong label can be distinguished more accurately from image feature information that does not belong to that category. This reduces the noise of the training samples, improves the reliability of training, and makes the predictions of the image recognition model more accurate.
  • FIG. 1 is a schematic diagram of the application architecture of the image recognition model training and image recognition methods in an embodiment of the application;
  • FIG. 2 is a flowchart of an image recognition model training method in an embodiment of the application;
  • FIG. 3 is a schematic diagram of strong-label training image samples and weak-label training image samples in an embodiment of the application;
  • FIG. 4 is a schematic diagram of a feature map and a label map of a strong-label training image sample in an embodiment of the application;
  • FIG. 5 is a schematic block diagram of an image recognition model training method in an embodiment of the application;
  • FIG. 6 is a schematic diagram of the implementation logic of the supervision separation layer in image recognition model training in an embodiment of the application;
  • FIG. 7 is a flowchart of an image recognition method in an embodiment of the application;
  • FIG. 8 is a schematic block diagram of an image recognition method in an embodiment of the application;
  • FIG. 9 is a schematic structural diagram of an image recognition system in an embodiment of the application;
  • FIG. 10 is a schematic structural diagram of an image recognition model training device in an embodiment of the application;
  • FIG. 11 is a schematic structural diagram of an image recognition device in an embodiment of the application.
  • Weak label information: label information that includes only the information required for a single task; in the embodiments of the present application, it includes only the label information of the lesion category.
  • Strong label information: labeling information that includes other related information in addition to the information required by the task; in the embodiments of the present application, the labeling information includes at least the lesion category and the lesion location.
  • Lesion category: the classification of the properties of various digestive tract lesions, such as benign or malignant.
  • Lesion location: the location of the lesion area that gives rise to a certain lesion category.
  • Deeply Supervised Object Detector (DSOD): a detection algorithm that does not require pre-training.
  • Intersection-over-Union (IoU): the ratio between the intersection of two regions and their union. It can also be understood as the overlap rate between the candidate box generated by the detection result and the originally marked box, i.e. the ratio of their intersection to their union, and can be used to evaluate detection accuracy.
  • Endoscopes are usually used as diagnostic tools to collect images of the stomach, esophagus and other parts.
  • Endoscopes such as gastroscopes enter the patient's esophagus and stomach through the patient's mouth for examination.
  • A colonoscope enters the patient's colon and rectum through the patient's anus for examination.
  • The collected images can be archived to facilitate subsequent analysis by relevant medical personnel.
  • Relevant medical personnel can only judge whether lesions exist, and their types, through visual observation, so recognition efficiency and accuracy are relatively low.
  • Artificial Intelligence (AI) uses digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • Artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a way similar to human intelligence.
  • Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines have the functions of perception, reasoning and decision-making.
  • Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technology.
  • Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics.
  • Artificial intelligence software technology mainly includes computer vision, speech processing, natural language processing, and machine learning/deep learning.
  • Computer Vision (CV) is a science that studies how to make machines "see": it uses cameras and computers instead of human eyes to identify, track and measure targets, and performs further graphics processing so that the processed images become more suitable for human observation or for transmission to instruments for detection.
  • Computer vision research tries to establish artificial intelligence systems that can obtain information from images or multi-dimensional data.
  • Computer vision technology usually includes image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, and also includes common biometric recognition technologies such as facial recognition and fingerprint recognition.
  • Machine Learning (ML) is an interdisciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other subjects. It studies how computers simulate or realize human learning behaviors in order to acquire new knowledge or skills, and how to reorganize existing knowledge structures to continuously improve performance.
  • Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; its applications cover all fields of artificial intelligence.
  • Machine learning and deep learning usually include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from teaching.
  • Artificial intelligence technology has been researched and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, autonomous driving, drones, robotics, intelligent medical care and intelligent customer service. With the development of technology, artificial intelligence will be applied in more fields and deliver more and more value.
  • In some recognition methods, relevant medical staff mark each image with the type of disease and use the marked images as samples for model training, so that the trained model can identify lesions in other medical images, determine whether a lesion has occurred, and automatically give a diagnosis result.
  • However, the label of a training image sample is usually consistent with the target task: only a single label at the same level as the task. For example, if the target task is to determine the nature of gastric lesions, the label is the lesion category of each image, which leads to low model accuracy.
  • For this reason, an image recognition model training method in the embodiments of the present application uses strong-label training image samples with more label information.
  • The strong label information includes at least the label information of the lesion category and the lesion location, and the image feature information of the image samples in the training image sample set is extracted.
  • Based on the image feature information of the image samples and the corresponding strong label information, the image feature information belonging to each preset lesion category is marked.
  • The image recognition model is trained until its strong supervision objective function converges, and a trained image recognition model is obtained. Lesion recognition can then be performed on an image to be recognized based on the trained image recognition model.
  • The lesion location can further assist the identification of the lesion category, so better results can be achieved with the same amount of data. This provides a new training method for gastrointestinal endoscopic medical diagnosis, making the image recognition model more accurate and improving the accuracy of lesion recognition and prediction.
  • Strong-label training image samples and weak-label training image samples can also be combined to train the image recognition model; compared with training only on weak-label training image samples, this also improves the prediction accuracy of the image recognition model to a certain extent.
  • FIG. 1 is a schematic diagram of the application architecture of image recognition model training and image recognition methods in an embodiment of the application, including a server 100 and a terminal device 200.
  • the terminal device 200 may be a medical device.
  • the user can view the image lesion recognition result based on the terminal device 200.
  • the terminal device 200 and the server 100 may be connected via the Internet to realize mutual communication.
  • The aforementioned network uses standard communication technologies and/or protocols.
  • The network is usually the Internet but can also be any other network, including but not limited to a Local Area Network (LAN), a Metropolitan Area Network (MAN), a Wide Area Network (WAN), or any combination of mobile, wired or wireless networks, private networks, or virtual private networks.
  • Technologies and/or formats including HyperText Markup Language (HTML) and Extensible Markup Language (XML) are used to represent the data exchanged over the network.
  • In addition, conventional encryption technologies such as Secure Sockets Layer (SSL), Transport Layer Security (TLS), Virtual Private Network (VPN) and Internet Protocol Security (IPsec) can be used to encrypt all or some of the links.
  • the server 100 can provide various network services for the terminal device 200.
  • the server 100 may be a server, a server cluster composed of several servers, or a cloud computing center.
  • The server 100 may include a processor 110 (Central Processing Unit, CPU), a memory 120, an input device 130, an output device 140, and so on.
  • the input device 130 may include a keyboard, a mouse, a touch screen, etc.
  • The output device 140 may include a display device such as a Liquid Crystal Display (LCD) or a Cathode Ray Tube (CRT) display.
  • The memory 120 may include read-only memory (ROM) and random access memory (RAM), and provides the processor 110 with program instructions and data stored in the memory 120.
  • The memory 120 may be used to store a program for the image recognition model training method or the image recognition method in the embodiments of the present application.
  • The processor 110 calls the program instructions stored in the memory 120 and is configured to execute, according to the obtained program instructions, the steps of any image recognition model training method or image recognition method in the embodiments of the present application.
  • The image recognition model training method or the image recognition method may be executed by the server 100.
  • The terminal device 200 may send collected images of body parts such as the digestive tract to the server 100; the server 100 performs lesion recognition on the image and may return the lesion recognition result to the terminal device 200.
  • The application architecture shown in FIG. 1 is described by taking execution on the server 100 side as an example.
  • The image recognition method in the embodiments of the present application may also be executed by the terminal device 200.
  • For example, the terminal device 200 may obtain a trained image recognition model from the server 100 side and perform lesion recognition on images based on that model to obtain lesion recognition results. This is not limited in the embodiments of this application.
  • The application architecture diagrams in the embodiments of the present application are intended to illustrate the technical solutions more clearly and do not constitute a limitation on the technical solutions provided in the embodiments of the present application.
  • The technical solutions provided by the embodiments of the present application are not limited to digestive tract disease diagnosis applications and are equally applicable to similar problems in other application architectures and business scenarios.
  • Each embodiment of the present application is schematically illustrated by taking the application architecture diagram shown in FIG. 1 as an example.
  • FIG. 2 is a flowchart of an image recognition model training method in an embodiment of this application.
  • The method may be executed by a computing device, such as the server 100 or the terminal device 200.
  • The method includes the following steps.
  • Step 200: Obtain a training image sample set.
  • The training image sample set includes at least strong-label training image samples.
  • A strong-label training image sample is an image sample with strong label information.
  • The strong label information includes at least the label information of the lesion category and the lesion location.
  • Step 210: Extract image feature information of the image samples in the training image sample set.
  • Step 220: Based on the image feature information of the image samples and the corresponding strong label information, mark the image feature information belonging to each preset lesion category, and train the image recognition model according to the labeling result until the strong supervision objective function of the image recognition model converges, obtaining a trained image recognition model.
  • The strong supervision objective function is the loss function between the identified lesion category and the lesion category in the strong label information.
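The application does not spell out a concrete form for the strong supervision objective function beyond its being a loss between the identified and labeled lesion categories. As a hedged illustration only, a per-block binary cross-entropy over the P*P*C probability map is one common choice consistent with the description; the function and variable names below are hypothetical, not from the application:

```python
import numpy as np

def strong_supervision_loss(pred_probs, block_labels, eps=1e-7):
    """Binary cross-entropy between the predicted per-block lesion-category
    probabilities (shape P*P*C) and the 0/1 block labels derived from the
    lesion location in the strong label information."""
    p = np.clip(pred_probs, eps, 1 - eps)  # avoid log(0)
    return float(-np.mean(block_labels * np.log(p)
                          + (1 - block_labels) * np.log(1 - p)))

# One 5*5*10 label map with a single positive block (category 3).
labels = np.zeros((5, 5, 10))
labels[2, 2, 3] = 1

# Confident correct predictions give a small loss; confident wrong ones a large loss.
loss_good = strong_supervision_loss(np.where(labels == 1, 0.99, 0.01), labels)
loss_bad = strong_supervision_loss(np.where(labels == 1, 0.01, 0.99), labels)
```

Minimizing such a loss until it converges matches the training stopping criterion described in step 220.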
  • In the embodiments of the present application, a large number of endoscopic images of the digestive tract can be obtained in advance, and relevant medical staff mark both the lesion category and the lesion location, so as to obtain a large number of labeled strong-label training image samples.
  • These strong-label training image samples and the method of the embodiments of the present application are then used to improve the accuracy of lesion recognition.
  • The training image sample set may also include weak-label training image samples.
  • A weak-label training image sample is an image sample with weak label information.
  • The weak label information includes only the label information of the lesion category. For example, when relevant medical personnel label an image, only the lesion category is marked and the lesion location is not; such a sample is a weak-label training image sample.
  • The image recognition model can then be jointly trained by combining these two kinds of training image samples.
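As an illustrative sketch only (the field names are not from the application), the two kinds of samples can be modeled so that a missing lesion location marks a sample as weak-label:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class TrainingSample:
    """One endoscopic training image sample. A weak-label sample carries only
    the lesion category; a strong-label sample additionally carries the lesion
    location as a bounding box (x1, y1, x2, y2)."""
    image_path: str
    lesion_category: int
    lesion_box: Optional[Tuple[int, int, int, int]] = None  # None => weak label

    @property
    def is_strong(self) -> bool:
        return self.lesion_box is not None

# A strong-label sample (category + location) and a weak-label sample (category only).
strong = TrainingSample("img_001.png", lesion_category=3, lesion_box=(40, 60, 120, 140))
weak = TrainingSample("img_002.png", lesion_category=3)
```

Joint training would then draw batches from both kinds of samples, applying the strong supervision objective only where `is_strong` holds.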
  • FIG. 3 is a schematic diagram of strong-label training image samples and weak-label training image samples in an embodiment of the application.
  • The left and right images are two images of the same type of lesion.
  • In the left image, the boxed area indicates the region where a lesion belonging to a certain lesion category occurs.
  • The right image has no positioning box and includes only the lesion category label information. That is, the left image is a strong-label training image sample and the right image is a weak-label training image sample.
  • This application aims to improve the accuracy of lesion prediction by the image recognition model through label information beyond the lesion category. Strong label information is therefore not limited to the lesion category and lesion location; it may also include the lesion type and other label information, which is not limited in the embodiments of the present application.
  • Step 210 may include the following steps.
  • A neural network structure is used for feature extraction.
  • The neural network is, for example, a DSOD network.
  • Other deep neural networks with the same representation ability can also be used, which is not limited in the embodiments of the present application.
  • The image feature information has dimensions P*P*C, where P is a set value (any natural number) and C is the preset number of lesion categories: P*P represents the P*P image blocks obtained by dividing the image sample equally in the horizontal and vertical directions. The dimension of the image feature information output after feature extraction is thus P*P*C.
  • For example, if the image is equally divided into 5*5 = 25 image blocks and the preset number of lesion categories is 10, the finally extracted image feature information is 5*5*10-dimensional data.
  • Each data point corresponds to an image block, and the value of each data point represents the probability that the corresponding image block belongs to a certain lesion category.
  • Then, the image feature information may be processed by an activation function to map the image feature data to a certain value range.
  • For example, the sigmoid function is used to map it to (0, 1).
  • The image feature information processed by the activation function can then be used to train the image recognition model.
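The feature-map and activation steps above can be sketched as follows. The random scores merely stand in for the output of a DSOD-style backbone, which this sketch does not implement:

```python
import numpy as np

def sigmoid(x):
    """Map raw scores element-wise into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

# Suppose the backbone emits raw scores of shape P*P*C: here P=5 blocks per
# side and C=10 preset lesion categories, matching the 5*5*10 example above.
P, C = 5, 10
raw_scores = np.random.randn(P, P, C)  # placeholder for backbone output

# After the activation, each value can be read as the probability that the
# corresponding image block belongs to the corresponding lesion category.
probs = sigmoid(raw_scores)
```

Each `probs[row, col, k]` entry is then the per-block probability used by the training objective.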
  • When the image recognition model is trained in step 220, the following implementations are provided according to the labeling conditions of the samples in the training image sample set.
  • First implementation of step 220: the training image sample set includes only strong-label training image samples. Based on the image feature information of the image samples and the corresponding strong label information, the image feature information belonging to each preset lesion category is marked, and the image recognition model is trained according to the marking results until the strong supervision objective function of the image recognition model converges, obtaining a trained image recognition model.
  • This implementation may include the following steps.
  • In this case, the input of the image recognition model is the strong-label training image samples, that is, image samples with strong label information;
  • the output is the identified lesion category;
  • and the objective function is the strong supervision objective function.
  • The strong supervision objective function is continuously optimized until it is minimized and converges, at which point the image recognition model training is determined to be complete.
  • In this way, the image recognition model can be trained on strong-label training image samples alone, whose label information is richer. According to the lesion location, the image feature information belonging to a certain lesion category can be identified more accurately, making the training information more reliable and less noisy, so the trained image recognition model is more reliable and its accuracy is improved.
  • the method of determining the strong supervision objective function in step S1 above may include the following steps.
  • S1.1: Mark whether the image feature information of each strong label training image sample belongs to the lesion category in the corresponding strong label information.
  • for each strong label training image sample, the overlap rate between each image block in its image feature information and the lesion location is determined according to the location of the lesion in the corresponding strong label information. If the overlap rate is not less than a threshold, the corresponding image block is marked as 1; otherwise it is marked as 0. This yields the label information indicating whether each image block of the strong label training image sample belongs to the lesion category in the corresponding strong label information.
  • the IOU value can represent the overlap rate. If the IOU is not less than a certain threshold, the small image block is more likely to belong to the lesion category and is marked as 1; otherwise it is marked as 0. In this way, label information is obtained indicating whether each image block of the strong label training image sample belongs to the lesion category in the strong label information.
  • for each image block, the overlap ratio between the image block and the location of the lesion is determined.
  • other calculation methods can also be used, which are not limited in the embodiment of this application.
  • alternatively, the proportion of each image block that falls within the location of the lesion, that is, within the positioning frame, is calculated separately;
  • if that proportion is not less than a certain ratio, the block is marked as 1 and considered more likely to belong to the lesion category; otherwise it is marked as 0.
  • the image feature information obtained after feature extraction of the training image sample is referred to as a feature map
  • the label information indicating whether the strong label training image sample belongs to each lesion category is referred to as a label map;
  • the label map corresponds to P*P*C-dimensional data.
  • Figure 4 is a schematic diagram of the feature map and label map of the strong label training image sample in the embodiment of this application.
  • Figure 4 (A) is the input strong label training image sample.
  • the lesion category corresponding to the positioning-frame area in the image sample is A; the image sample is divided into 4 image blocks, numbered 1, 2, 3, and 4 respectively.
  • Figure 4 (B) is the corresponding feature map; each point in the feature map corresponds to an image block, and the overlap ratio between each image block and the positioning frame is calculated.
  • the IOU of image blocks 1 and 2 with the positioning frame exceeds the threshold, while the IOU of image blocks 3 and 4 with the positioning frame is less than the threshold. Image blocks 1 and 2 are therefore marked as 1, and image blocks 3 and 4 are marked as 0, yielding the label map shown in Figure 4 (C), which indicates whether the image sample belongs to lesion category A.
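The block-marking procedure of Figure 4 can be sketched as below; the box coordinates, the 2×2 grid, and the 0.3 threshold are illustrative assumptions, not values from the patent:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

def label_map(image_size, p, lesion_box, threshold):
    """Mark each of the P*P image blocks 1 if its IOU with the lesion
    positioning frame reaches the threshold, otherwise 0."""
    block = image_size // p
    labels = []
    for row in range(p):
        labels.append([])
        for col in range(p):
            b = (col * block, row * block, (col + 1) * block, (row + 1) * block)
            labels[-1].append(1 if iou(b, lesion_box) >= threshold else 0)
    return labels

# 100x100 image, 2x2 blocks, lesion box covering the top half.
print(label_map(100, 2, (0, 0, 100, 50), 0.3))  # → [[1, 1], [0, 0]]
```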
  • the image feature information that does not belong to the lesion category in the corresponding strong label information is marked as well; that is, label information is obtained indicating whether the image feature information of the image sample belongs to preset lesion categories other than the lesion category in the strong label information.
  • for those other categories, the label information is 0.
  • for example, if the lesion category in the strong label information of a strong label training image sample is A, then the label information of that sample for lesion categories B and C is 0.
  • S1.2: Determine the strong supervision objective function according to the label information indicating whether each strong label training image sample belongs to each lesion category, together with the image feature information.
  • the loss function between the label information and the image feature information is used as the strong supervision objective function.
  • the strong supervision objective function is:
  • X strong represents a strong label training image sample
  • a represents any variable.
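The equation itself is not reproduced in this text. One standard form consistent with the surrounding description — a per-block, per-category binary cross-entropy between the label map $Y$ and the sigmoid-activated feature map $S$ — would be the following; this is a reconstruction under stated assumptions, not necessarily the patent's exact formula:

```latex
L_{\text{strong}} = -\sum_{X_{\text{strong}}}\sum_{c=1}^{C}\sum_{i=1}^{P}\sum_{j=1}^{P}
\left[\,Y_{ijc}\log S_{ijc} + \left(1 - Y_{ijc}\right)\log\left(1 - S_{ijc}\right)\right]
```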
  • Step 220, second implementation of training the image recognition model: the training image sample set includes both strong label training image samples and weak label training image samples. According to the image feature information of the image samples and the corresponding strong label information or weak label information, the image feature information belonging to each preset lesion category is marked, and the image recognition model is trained according to the marking result until the total objective function of the image recognition model converges, obtaining the trained image recognition model.
  • the total objective function is the total loss function of the strong supervision objective function and the weak supervision objective function
  • the weak supervision objective function is the loss function between the identified lesion category and the lesion category in the weak label information.
  • the second implementation can be applied to situations where the obtained training image samples have different labeling levels. For example, there may be training image samples that label only the lesion category, as well as training image samples that label both the lesion category and the location of the lesion. Joint training without distinguishing between the two kinds of training image samples can also, to a certain extent, enrich the number of training image samples.
  • the embodiment of the present application provides a joint training method based on training image samples of different labeling levels, which may include the following steps.
  • the method of determining the strong supervision objective function in some embodiments is the same as that of the above-mentioned first embodiment, and will not be repeated here.
  • the training image sample is a weak-label training image sample
  • the input of the image recognition model is a weak-label training image sample, that is, an image sample with weak-label information
  • the output is the identified lesion category
  • the objective function is the weakly supervised objective function.
  • the embodiment of the present application provides a method for determining the weakly supervised objective function, including:
  • for each weak-label training image sample, the probability of belonging to each preset lesion category can be determined, yielding what is called a category feature map.
  • Each category feature map represents the probability that P*P image blocks in the image sample are the lesion category.
  • suppose there are two preset lesion categories, namely lesion category A and lesion category B.
  • a certain weak-label training image sample is divided into 4 image blocks. If the weak label information is lesion category A, the probabilities that the 4 image blocks belong to lesion category A are determined, assumed to be 0.5, 0.8, 0.2, and 0.3 respectively. Since the weak label is lesion category A, the probability that each of the 4 image blocks belongs to lesion category B is 0. Then, for each lesion category, the maximum probability is selected as the probability that the whole weak-label training image sample belongs to that lesion category.
  • the probability that the weak-label training image sample belongs to the lesion category A is 0.8, and the probability that it belongs to the lesion category B is 0.
  • for each lesion category, the loss function between the maximum of the probabilities that the image blocks belong to that preset lesion category and the lesion category in the weak label information is calculated, and this loss function is used as the weak supervision objective function.
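A hedged sketch of this max-pooling step; the function name and the probabilities are ours, matching the worked example above:

```python
def weak_category_probs(block_probs):
    """Given per-block probabilities for each lesion category
    (shape: categories x blocks), take the max over blocks as the
    whole-image probability for each category."""
    return [max(probs) for probs in block_probs]

# 4 image blocks, 2 categories (A, B); the weak label says category A,
# so the per-block probabilities for B are all zero.
probs = weak_category_probs([[0.5, 0.8, 0.2, 0.3],   # category A
                             [0.0, 0.0, 0.0, 0.0]])  # category B
print(probs)  # → [0.8, 0.0]
```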
  • the weakly supervised objective function is:
  • X weak represents the weak label training image sample
  • the overall objective function is:
  • a preset weight is used to balance the proportions of the loss functions of the strong label training image samples and the weak label training image samples in the total loss function.
  • convergence of the total objective function requires that both the strong supervision objective function and the weak supervision objective function converge; the training process is completed when both converge.
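Neither the weak-supervision formula nor the total-objective formula survived extraction. Under the description above — a cross-entropy between the per-category maximum block probability and the weak label, and a weighted sum of the two losses with a preset weight $\lambda$ — consistent forms would be (a reconstruction, not necessarily the patent's exact formulas):

```latex
L_{\text{weak}} = -\sum_{X_{\text{weak}}}\sum_{c=1}^{C}
\left[\,y_{c}\log \hat{p}_{c} + \left(1 - y_{c}\right)\log\left(1 - \hat{p}_{c}\right)\right],
\qquad \hat{p}_{c} = \max_{i,j} S_{ijc}

L_{\text{total}} = L_{\text{strong}} + \lambda\, L_{\text{weak}}
```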
  • strong label training image samples and weak label training image samples may be combined to jointly train the image recognition model.
  • the existence of weak-label training image samples can be allowed to a certain extent, and all the label information in the training image samples can be fully utilized to improve the accuracy of the image recognition model.
  • the image recognition model obtained by training can be used not only to identify the lesion category but also to identify the location of the lesion.
  • Fig. 5 is a schematic block diagram of an image recognition model training method in an embodiment of the application.
  • the overall process of the image recognition model training method can be divided into two parts.
  • Part 1: feature extraction.
  • the feature extraction part in Figure 5 is the basic structure of the DSOD model.
  • the image samples in the training image sample set undergo operations such as convolution, dense block, convolution, pooling, and dense block respectively.
  • the last layer outputs image feature information of a set dimension, P*P*C, which represents the feature map of the image sample.
  • the input feature map of the supervision separation layer is obtained.
  • the input feature map of the supervision separation layer is:
  • X strong represents a strong label training image sample
  • X weak represents a weak label training image sample
  • the image feature information output by the last layer of feature extraction is the feature map.
  • Part 2: supervision separation layer.
  • the supervision separation layer mainly includes a strong supervision branch and a weak supervision branch.
  • training image samples with different label information are trained through different branches. It is determined whether the label information of a training image sample is strong label information: if so, the strong label training image sample enters the strong supervision branch for training; if not, the weak label training image sample enters the weak supervision branch for training.
  • suppose there are two preset lesion categories, namely lesion category A and lesion category B.
  • the strong supervision branch is mainly based on the lesion category and the location of the lesion in the strong label information, and the lesion category is predicted through training and the strong supervision objective function is determined.
  • the weak supervision branch is mainly based on the lesion category in the weak label information; the lesion category is predicted through training and the weak supervision objective function is determined.
  • in this way, continuous optimization of the strong supervision objective function and the weak supervision objective function until convergence completes the training process and achieves the purpose of training an image recognition model using strong label training image samples and weak label training image samples simultaneously.
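The routing performed by the supervision separation layer can be sketched as follows; the sample representation (a dict with an optional `lesion_box` field) and the field names are illustrative assumptions, not the patent's data format:

```python
def supervision_separation(batch):
    """Route each training sample to the branch matching its label level:
    samples carrying a lesion location (strong labels) go to the strong
    supervision branch, the rest (weak labels) to the weak branch."""
    strong, weak = [], []
    for sample in batch:
        (strong if sample.get("lesion_box") is not None else weak).append(sample)
    return strong, weak

batch = [{"image": "a.png", "category": "A", "lesion_box": (10, 10, 40, 40)},
         {"image": "b.png", "category": "B", "lesion_box": None}]
strong, weak = supervision_separation(batch)
print(len(strong), len(weak))  # → 1 1
```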
  • FIG. 6 is a schematic diagram of the implementation logic of the supervision separation layer in the training of the image recognition model in the embodiment of the application.
  • the image feature information of the image samples in the training image sample set passes through the separator and enters the strong supervision branch and the weak supervision branch respectively.
  • Strong supervision branch: for the input feature map of the supervision separation layer, each image block in the input feature map is estimated according to the location of the lesion in the strong label information, and a label indicating whether each image block belongs to each preset lesion category is obtained, giving the corresponding label map. That is, each lesion category has a corresponding label map, and the loss function between the label map and the input feature map is used as the strong supervision objective function.
  • Weak supervision branch: for the input feature map of the supervision separation layer, because weak label training image samples carry only lesion-category label information and no lesion location, only an overall estimate can be made for the input feature map. Whether the input feature map as a whole belongs to each preset lesion category is judged, and a total estimated probability is obtained for each preset lesion category as the mark of the input feature map. That is, each lesion category has a corresponding probability, and the loss function between the total estimated probability and the lesion category in the weak label information is used as the weak supervision objective function.
  • because a weak label training image sample contains only lesion-category label information, during training it can only be known whether all the image feature information in the overall input feature map belongs to a certain lesion category. In practice, the image feature information of only a few small image blocks may match that of the corresponding lesion category, so weak label training image samples introduce some noisy image feature information during training.
  • the image recognition model is obtained by training based on strong label training image samples, or the combination of strong label training image samples and weak label training image samples, that is, at least strong label training image samples are required for training.
  • since the location of the lesion is marked in the strong label training image sample, not only the lesion-category information but also the location information judged to belong to that category can be used during training, and the location of the lesion can be determined more accurately.
  • the image feature information represented by the occurrence of the lesion is extracted, and the noise is reduced.
  • the image recognition model training can be more accurate and reliable.
  • FIG. 7 is a flowchart of an image recognition method in an embodiment of the application.
  • the method may be executed by a computing device, such as the server 100 or the terminal device 200.
  • the method may include the following steps.
  • Step 700 Obtain an image to be recognized.
  • the image recognition model can be used to identify the disease category of gastrointestinal diseases, and the acquired image to be recognized is the collected digestive tract image.
  • Step 710 Extract image feature information of the image to be recognized.
  • Step 720 Based on the preset image recognition model, the image feature information of the image to be recognized is used as an input parameter to obtain the recognition result of the lesion category of the image to be recognized.
  • the image recognition model is trained by using a training image sample set that includes at least strong label training image samples to determine the result of lesion category recognition.
  • the strong label training image sample indicates the image sample with strong label information, and the strong label information includes at least the label information of the lesion category and the location of the lesion.
  • the image recognition model may use the relationship, determined from the strong label training image samples and the strong label information, between the image-block feature information at the lesion location and the lesion category to determine whether each image block in the image feature information of the image to be recognized belongs to the lesion category; then, according to whether each image block belongs to the lesion category, it is determined whether the image to be recognized belongs to the lesion category, as the lesion-category recognition result of the image to be recognized.
  • in addition to the relationship between the image-block feature information at the lesion location and the lesion category determined from the strong label training image samples and the strong label information, the image recognition model can also use the relationship between the overall image feature information and the lesion category determined from the weak label training image samples and the weak label information to determine whether the image feature information of the image to be recognized belongs to the lesion category.
  • the weak label information only includes label information of the lesion type.
  • the image recognition model here is a model obtained based on the image recognition model training method in the foregoing embodiment.
  • the image recognition model may be applied, for example, to an endoscopic assisted diagnosis system for identifying the type of lesion. Since the image recognition model is trained mainly on strong label training image samples, it is more reliable and accurate, so lesion-category prediction based on the trained image recognition model is more accurate.
  • FIG. 8 is a schematic block diagram of the image recognition method in the embodiment of the application.
  • Part 1: feature extraction.
  • the image to be recognized is subjected to operations such as convolution, dense block, convolution, pooling, and dense block, and the last layer outputs image feature information of a set dimension, P*P*C, which represents the feature map of the image to be recognized.
  • the feature map may be passed through the sigmoid function to obtain the input feature map of the supervised separation layer.
  • Part 2: supervision separation layer.
  • the image feature information obtained through the feature extraction part is input into the image recognition model.
  • the image recognition model judges the image feature information and determines which lesion category it belongs to. For example, with two preset lesion categories, lesion category A and lesion category B, it is judged separately whether the image belongs to lesion category A and whether it belongs to lesion category B, and the final lesion-category recognition result is obtained.
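A minimal sketch of this final per-category judgment, assuming whole-image scores per category and an illustrative 0.5 decision threshold (neither the names nor the threshold come from the patent):

```python
def recognize(category_scores, threshold=0.5):
    """Return the lesion categories whose whole-image score reaches the
    threshold; category names and threshold are illustrative."""
    return [name for name, score in category_scores.items() if score >= threshold]

print(recognize({"lesion A": 0.91, "lesion B": 0.12}))  # → ['lesion A']
```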
  • FIG. 9 is a schematic structural diagram of an image recognition system in an embodiment of this application.
  • the image recognition system at least includes an image acquisition device 90, a processing device 91 and a display device 92.
  • the image acquisition device 90, the processing device 91, and the display device 92 are related medical devices; they can be integrated in the same medical device, or divided among multiple devices that are connected and communicate with each other to form a medical system for use.
  • the image acquisition device 90 may be an endoscope
  • the processing device 91 and the display device 92 may be computer devices that communicate with the endoscope.
  • the image acquisition device 90 is used to acquire the image to be recognized.
  • the processing device 91 is configured to extract image feature information of the image to be recognized, and based on a preset image recognition model, use the image feature information of the image to be recognized as an input parameter to obtain a recognition result of the lesion category of the image to be recognized.
  • the image recognition model is trained by using a training image sample set that includes at least strong label training image samples to determine the result of lesion category recognition.
  • the strong label training image sample indicates the image sample with strong label information, and the strong label information includes at least the label information of the lesion category and the location of the lesion.
  • the display device 92 is used to output the recognition result of the lesion category of the image to be recognized.
  • FIG. 10 is an image recognition model training device in an embodiment of this application, which includes the following modules.
  • the obtaining module 1000 is used to obtain a training image sample set.
  • the training image sample set includes at least strong label training image samples.
  • the strong label training image sample represents an image sample with strong label information, and the strong label information includes at least the label information of the lesion category and the location of the lesion.
  • the extraction module 1010 is used to extract image feature information of image samples in the training image sample set.
  • the training module 1020 is used to mark, based on the image feature information of the image samples and the corresponding strong label information, the image feature information belonging to each preset lesion category, and to train the image recognition model according to the marking result until the strong supervision objective function of the image recognition model converges, obtaining the trained image recognition model.
  • the strong supervision objective function is a loss function between the identified lesion category and the lesion category in the strong label information.
  • the training image sample set further includes weak label training image samples.
  • Weak label training image samples represent image samples with weak label information.
  • the weak label information includes label information of the lesion category.
  • the training module 1020 is further configured to mark the image feature information belonging to each preset lesion category according to the image feature information of the image samples and the corresponding strong label information or weak label information, and to train the image recognition model according to the marking result until the total objective function of the image recognition model converges, obtaining a trained image recognition model.
  • the total objective function is a total loss function of a strong supervision objective function and a weak supervision objective function
  • the weak supervision objective function is a loss function between the identified lesion category and the lesion category in the weak label information.
  • when extracting image feature information of the image samples in the training image sample set, the extraction module 1010 is specifically configured to:
  • the image feature information is P*P*C dimension
  • P is a set value
  • P*P represents P*P image blocks that divide the image sample horizontally and vertically
  • C is the preset number of lesion categories.
  • the training module 1020 is specifically used to:
  • the strong supervision objective function is optimized until it converges, and the training is determined to be complete.
  • the training module 1020 is specifically used to:
  • the overall objective function is optimized until the overall objective function converges, and the training is determined to be completed.
  • the training module 1020 is specifically configured to:
  • for each strong label training image sample, determine, according to the location of the lesion in the corresponding strong label information, the overlap rate between each image block in the sample's image feature information and the lesion location; if the overlap rate is not less than a threshold, mark the corresponding image block as 1, otherwise mark it as 0, obtaining the label information of whether the strong label training image sample belongs to the lesion category in the corresponding strong label information;
  • the label information is 0;
  • the strong supervision objective function is determined according to the label information and image feature information of whether each strong label training image sample belongs to each lesion category.
  • the training module 1020 is specifically configured to:
  • determine, according to the lesion category in the weak label information corresponding to each weak-label training image sample, the probability that each image block in the sample's image feature information belongs to each preset lesion category;
  • the weak-supervised objective function is determined according to the maximum value of the probability that each image block of each weak-label training image sample belongs to each preset lesion category and the corresponding lesion category in the weak-label information.
  • the image recognition device in the embodiment of the present application specifically includes:
  • the obtaining module 1100 is used to obtain the image to be recognized
  • the extraction module 1110 is used to extract the image feature information of the image to be recognized
  • the recognition module 1120 is configured to obtain the recognition result of the lesion category of the image to be recognized based on a preset image recognition model and using the image feature information of the image to be recognized as input parameters; wherein, the image recognition model adopts at least The training image sample set including strong label training image samples is trained to determine the recognition result of the lesion category; the strong label training image sample represents the image sample with strong label information, and the strong label information includes at least the label information of the lesion category and the location of the lesion .
  • the embodiment of the present application also provides an electronic device of another exemplary implementation manner.
  • the electronic device in the embodiments of the present application may include a memory, a processor, and a computer program stored in the memory and running on the processor. Wherein, when the processor executes the program, the steps of the image recognition model training method or the image recognition method in the foregoing embodiments can be implemented.
  • the processor in the electronic device may be the processor 110 in the server 100, and the memory in the electronic device may be the memory 120 in the server 100.
  • the embodiments of the present application also provide a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the image recognition model training method or the image recognition method in any of the foregoing method embodiments.
  • each implementation manner can be implemented by means of software plus a general hardware platform, and of course, it can also be implemented by hardware.
  • the computer software product can be stored in a computer-readable storage medium, such as a ROM/RAM, magnetic disk, or optical disc, and includes a number of instructions to make a control device (which may be a personal computer, a server, or a network device, etc.) execute the methods described in each embodiment or in some parts of an embodiment.


Abstract

This application relates to the field of computer technology, and in particular to an image recognition model training and image recognition method, apparatus, and system. The recognition method obtains an image to be recognized; extracts image feature information of the image to be recognized; and, based on a preset image recognition model, takes the image feature information of the image to be recognized as an input parameter to obtain a lesion-category recognition result for the image to be recognized. The image recognition model is trained using a training image sample set that includes at least strong label training image samples to determine the lesion-category recognition result; a strong label training image sample is an image sample with strong label information, and the strong label information includes at least annotation information of the lesion category and the location of the lesion. According to the location of the lesion, the image feature information of a given lesion category can be located more accurately, reducing noise and improving reliability and accuracy.

Description

Image recognition model training and image recognition method, apparatus, and system
This application claims priority to Chinese patent application No. 201910284918.6, entitled "Image recognition model training and image recognition method, apparatus, and system", filed with the Chinese Patent Office on April 10, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of computer technology, and in particular to an image recognition model training and image recognition method, apparatus, and system.
Background
In various medical image diagnosis and analysis scenarios, such as the diagnosis of digestive tract diseases, internal images of the body are usually obtained with diagnostic tools such as endoscopes, and medical personnel then judge by eye whether a lesion exists and what category it belongs to, so recognition efficiency is low. In some current recognition methods, a large number of endoscopic images are acquired and medical personnel annotate the lesion category of each image; the annotated images are used as samples for model training, so that lesion recognition can be performed on other medical images based on the trained model, judging whether a lesion has occurred and automatically giving a diagnosis result.
Summary
Embodiments of this application provide an image recognition model training and image recognition method, apparatus, and system to improve the accuracy of lesion prediction.
An embodiment of this application provides an image recognition model training method, including:
obtaining a training image sample set, wherein the training image sample set includes at least strong label training image samples; a strong label training image sample is an image sample with strong label information, and the strong label information includes at least annotation information of the lesion category and the location of the lesion;
extracting image feature information of the image samples in the training image sample set;
based on the image feature information of the image samples and the corresponding strong label information, marking the image feature information belonging to each preset lesion category, and training the image recognition model according to the marking result until the strong supervision objective function of the image recognition model converges, obtaining a trained image recognition model, wherein the strong supervision objective function is a loss function between the identified lesion category and the lesion category in the strong label information.
An embodiment of this application also provides an image recognition method, including:
obtaining an image to be recognized;
extracting image feature information of the image to be recognized;
based on a preset image recognition model, taking the image feature information of the image to be recognized as an input parameter to obtain a lesion-category recognition result for the image to be recognized; wherein the image recognition model is trained using a training image sample set that includes at least strong label training image samples to determine the lesion-category recognition result; a strong label training image sample is an image sample with strong label information, and the strong label information includes at least annotation information of the lesion category and the location of the lesion.
An embodiment of this application provides an image recognition model training apparatus, including:
an obtaining module, used to obtain a training image sample set, wherein the training image sample set includes at least strong label training image samples; a strong label training image sample is an image sample with strong label information, and the strong label information includes at least annotation information of the lesion category and the location of the lesion;
an extraction module, used to extract image feature information of the image samples in the training image sample set;
a training module, used to mark, based on the image feature information of the image samples and the corresponding strong label information, the image feature information belonging to each preset lesion category, and to train the image recognition model according to the marking result until the strong supervision objective function of the image recognition model converges, obtaining a trained image recognition model, the strong supervision objective function being a loss function between the identified lesion category and the lesion category in the strong label information.
An embodiment of this application also provides an image recognition apparatus, including:
an obtaining module, used to obtain an image to be recognized;
an extraction module, used to extract image feature information of the image to be recognized;
a recognition module, used to obtain, based on a preset image recognition model and taking the image feature information of the image to be recognized as an input parameter, a lesion-category recognition result for the image to be recognized; wherein the image recognition model is trained using a training image sample set that includes at least strong label training image samples to determine the lesion-category recognition result; a strong label training image sample is an image sample with strong label information, and the strong label information includes at least annotation information of the lesion category and the location of the lesion.
An embodiment of this application also provides an image recognition system, including at least an image acquisition device, an image processing device, and an output device, wherein:
the image acquisition device is used to obtain an image to be recognized;
the processing device is used to extract image feature information of the image to be recognized and, based on a preset image recognition model and taking the image feature information of the image to be recognized as an input parameter, obtain a lesion-category recognition result for the image to be recognized; wherein the image recognition model is trained using a training image sample set that includes at least strong label training image samples to determine the lesion-category recognition result; a strong label training image sample is an image sample with strong label information, and the strong label information includes at least annotation information of the lesion category and the location of the lesion;
the display device is used to output the lesion-category recognition result of the image to be recognized.
An embodiment of this application also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the steps of any of the above image recognition model training methods or image recognition methods are implemented.
An embodiment of this application also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of any of the above image recognition model training methods or image recognition methods.
Embodiments of this application not only rely on lesion-category annotation information, but can also use the lesion location to locate the image feature information of a given lesion category more accurately, so that image feature information belonging to the lesion category in the strong label can be more accurately distinguished from image feature information not belonging to that category, reducing sample noise during training, improving training reliability, and making the image recognition model's predictions more accurate.
Brief Description of the Drawings
Figure 1 is a schematic diagram of the application architecture of the image recognition model training and image recognition methods in embodiments of this application;
Figure 2 is a flowchart of the image recognition model training method in an embodiment of this application;
Figure 3 is a schematic diagram of strong label training image samples and weak label training image samples in an embodiment of this application;
Figure 4 is a schematic diagram of the feature map and label map of a strong label training image sample in an embodiment of this application;
Figure 5 is a schematic block diagram of the image recognition model training method in an embodiment of this application;
Figure 6 is a schematic diagram of the implementation logic of the supervision separation layer in image recognition model training in an embodiment of this application;
Figure 7 is a flowchart of the image recognition method in an embodiment of this application;
Figure 8 is a schematic block diagram of the image recognition method in an embodiment of this application;
Figure 9 is a schematic structural diagram of the image recognition system in an embodiment of this application;
Figure 10 is a schematic structural diagram of the image recognition model training apparatus in an embodiment of this application;
Figure 11 is a schematic structural diagram of the image recognition apparatus in an embodiment of this application.
Detailed Description
The technical solutions in the embodiments of this application will be described clearly and completely below with reference to the drawings in the embodiments of this application. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. Based on the embodiments in this application, all other embodiments obtained by persons of ordinary skill in the art without creative work fall within the protection scope of this application.
To facilitate understanding of the embodiments of this application, several concepts are briefly introduced first:
Weak label information: annotation information that includes only the information required by a single task; in the embodiments of this application, it refers to annotation information that includes only the lesion category.
Strong label information: annotation information that includes other related information in addition to the information required by the task; in the embodiments of this application, it refers to annotation information that includes at least the lesion category and the location of the lesion. The lesion category may represent the classification of the nature of various digestive tract lesions, for example benign or malignant, and the location of the lesion represents the position of the lesion area that leads to the judgment of a certain lesion category.
Deeply Supervised Object Detector (DSOD) algorithm: a detection algorithm that does not require pre-training.
Intersection-over-Union (IOU): the ratio between the intersection of two regions and their union; it can also be understood as the overlap rate between the candidate box produced by a detection result and the original labeled box, i.e., the ratio of their intersection to their union, and can be used to evaluate detection accuracy.
目前,消化道疾病的发生越来越频繁,发病率也居高不下,即使暂时治愈了也有很大的复发可能,然而,若能尽早发现病变并进行预防,则可以大大提高完全治愈率。针对消化道疾病的诊断分析,通常是采用内镜作为诊断工具,采集胃部、食道等部位的影像,例如,常见的内镜如胃镜,由患者的口腔进入患者的食道、胃部、十二指肠等,又如肠镜则由患者的肛门进入患者的结直肠进行检查,在检查过程中,可以进行影像存档,方便相关医疗人员后续分析,但是相关医疗人员仅通过人眼观察判断是否存在病变以及存在病变的类别,识别效率和准确性都比较低。
人工智能(Artificial Intelligence,AI)是利用数字计算机或者数字计算机控制的机器模拟、延伸和扩展人的智能,感知环境、获取知识并使用知识获得最佳结果的理论、方法、技术及应用系统。换句话说,人工智能是计算机科学的一个综合技术,它企图了解智能的实质,并生产出一种新的能以人类智能相似的方式做出反应的智能机器。人工智能也就是研究各种智能机器的设计原理与实现方法,使机器具有感知、推理与决策的功能。
人工智能技术是一门综合学科,涉及领域广泛,既有硬件层面的技术也有软件层面的技术。人工智能基础技术一般包括如传感器、专用人工智能芯片、云计算、分布式存储、大数据处理技术、操作/交互系统、机电一体化等技术。人工智能软件技术主要包括计算机视觉技术、语音处理技术、自然语言处理技术以及机器学习/深度学习等几大方向。
计算机视觉技术(Computer Vision, CV)是一门研究如何使机器“看”的科学，更进一步地说，就是指用摄影机和电脑代替人眼对目标进行识别、跟踪和测量等机器视觉，并进一步做图形处理，使电脑处理成为更适合人眼观察或传送给仪器检测的图像。作为一个科学学科，计算机视觉研究相关的理论和技术，试图建立能够从图像或者多维数据中获取信息的人工智能系统。计算机视觉技术通常包括图像处理、图像识别、图像语义理解、图像检索、OCR、视频处理、视频语义理解、视频内容/行为识别、三维物体重建、3D技术、虚拟现实、增强现实、同步定位与地图构建等技术，还包括常见的人脸识别、指纹识别等生物特征识别技术。
机器学习(Machine Learning, ML)是一门多领域交叉学科，涉及概率论、统计学、逼近论、凸分析、算法复杂度理论等多门学科，专门研究计算机怎样模拟或实现人类的学习行为，以获取新的知识或技能，重新组织已有的知识结构使之不断改善自身的性能。机器学习是人工智能的核心，是使计算机具有智能的根本途径，其应用遍及人工智能的各个领域。机器学习和深度学习通常包括人工神经网络、置信网络、强化学习、迁移学习、归纳学习、示教学习等技术。
随着人工智能技术研究和进步,人工智能技术在多个领域展开研究和应用,例如常见的智能家居、智能穿戴设备、虚拟助理、智能音箱、智能营销、无人驾驶、自动驾驶、无人机、机器人、智能医疗、智能客服等,相信随着技术的发展,人工智能技术将在更多的领域得到应用,并发挥越来越重要的价值。
本申请实施例提供的方案涉及人工智能的计算机视觉、机器学习等技术,具体通过如下实施例进行说明。
目前,一种利用人工智能协助诊断消化道疾病的方式,通过获取大量内镜影像,由相关医疗人员对每张图像进行病变类别的标注,将标注后的图像作为样本进行模型训练,从而可以基于训练的模型,对其它医疗图像进行病变识别,判断是否发生病变,自动给出诊断结果。其中,训练图像样本的标注通常是与目标任务一致,仅是与任务同等级的单一标注,例如,目标任务是判断胃部病变性质类别,则标注的就是每张图像的病变类别,从而会导致模型准确性较低。
因此,本申请实施例中提供了一种图像识别模型的训练方法,利用标注信息更多的强标签训练图像样本,强标签信息至少包括病变类别和病变位置的标注信息,提取训练图像样本集中图像样本的图像特征信息。根据图像样本的图像特征信息和对应的强标签信息,标记属于各预设病变类别的图像特征信息。根据标记结果,训练图像识别模型,直至图像识别模型的强监督目标函数收敛,获得训练完成的图像识别模型。进而可以基于训练完成的图像识别模型,对待识别图像进行病变识别。这样,由于标注信息更丰富,病变位置可以进一步辅助对病变类别的识别。因此,可以达到在同等数据量下更佳的效果,为消化道内镜医疗诊断方法提供一种新的训练方法,从而使得图像识别模型更加准确,提高病变识别预测准确性。
并且,本申请实施例中,还可以同时联合强标签训练图像样本和弱标签训练图像样本,来训练图像识别模型,相比于仅采用弱标签训练图像样本进行训练,也可以一定程度上提高图像识别模型的预测准确性。
图1为本申请实施例中图像识别模型训练及图像识别方法的应用架构示意图,包括服务器100、终端设备200。
终端设备200可以是医疗设备。例如,用户可以基于终端设备200查看图像病变识别结果。
终端设备200与服务器100之间可以通过互联网相连,实现相互之间的通信。一些实施例中,上述的互联网使用标准通信技术和/或协议。互联网通常为因特网、但也可以是任何网络,包括但不限于局域网(Local Area Network,LAN)、城域网(Metropolitan Area Network,MAN)、广域网(Wide Area Network,WAN)、移动、有线或者无线网络、专用网络或者虚拟专用网络的任何组合。在一些实施例中,使用包括超文本标记语言(Hyper Text Mark-up Language,HTML)、可扩展标记语言(Extensible Markup Language,XML)等的技术和/或格式来代表通过网络交换的数据。此外还可以使用诸如安全套接字层(Secure Socket Layer,SSL)、传输层安全(Transport Layer Security,TLS)、虚拟专用网络(Virtual Private Network,VPN)、网际协议安全(Internet Protocol Security,IPsec)等常规加密技术来加密所有或者一些链路。在另一些实施例中,还可以使用定制和/或专用数据通信技术取代或者补充上述数据通信技术。
服务器100可以为终端设备200提供各种网络服务。服务器100可以是一台服务器、若干台服务器组成的服务器集群或云计算中心。
一些实施例中，服务器100可以包括处理器110(Central Processing Unit, CPU)、存储器120、输入设备130和输出设备140等，输入设备130可以包括键盘、鼠标、触摸屏等，输出设备140可以包括显示设备，如液晶显示器(Liquid Crystal Display, LCD)、阴极射线管(Cathode Ray Tube, CRT)等。
存储器120可以包括只读存储器(ROM)和随机存取存储器(RAM),并向处理器110提供存储器120中存储的程序指令和数据。在本申请实施例中,存储器120可以用于存储本申请实施例中图像识别模型训练方法或图像识别方法的程序。
处理器110通过调用存储器120存储的程序指令，按照获得的程序指令执行本申请实施例中任一种图像识别模型训练方法或图像识别方法的步骤。
一些实施例中，图像识别模型训练方法或图像识别方法可以由服务器100执行。例如，针对图像识别方法，终端设备200可以将采集到的消化道等机体部位的图像，发送给服务器100，由服务器100对图像进行病变识别，并可以将病变识别结果返回给终端设备200。如图1所示的应用架构，是以应用于服务器100侧为例进行说明的。一些实施例中，本申请实施例中图像识别方法也可以由终端设备200执行。例如终端设备200可以从服务器100侧获得训练好的图像识别模型，从而基于该图像识别模型，对图像进行病变识别，获得病变识别结果。对此本申请实施例中并不进行限制。
本申请实施例中的应用架构图是为了更加清楚地说明本申请实施例中的技术方案,并不构成对本申请实施例提供的技术方案的限制。当然,本申请实施例提供的技术方案也并不仅限于消化道疾病诊断业务应用,对于其它的应用架构和业务应用中的类似的问题,同样适用。
本申请各个实施例以应用于图1所示的应用架构图为例进行示意性说明。
基于上述实施例,图2为本申请实施例中图像识别模型训练方法流程图。该方法可以由一计算设备执行,例如服务器100或终端设备200。该方法包括以下步骤。
步骤200:获取训练图像样本集。
其中,训练图像样本集中至少包括强标签训练图像样本。其中,强标签训练图像样本表示有强标签信息的图像样本。强标签信息至少包括病变类别和病变位置的标注信息。
步骤210:提取训练图像样本集中图像样本的图像特征信息。
步骤220:基于图像样本的图像特征信息和对应的强标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,训练图像识别模型,直至图像识别模型的强监督目标函数收敛,获得训练完成的图像识别模型。
其中,强监督目标函数为识别出的病变类别与强标签信息中病变类别之间的损失函数。
本申请实施例在训练图像识别模型时,可以预先获取大量消化道内镜的图像,由相关医疗人员同时标注病变类别和发生病变的病变位置,从而可以获得大量已标注的强标签训练图像样本,从而利用强标签训练图像样本和本申请实施例的方法来提高病变识别的准确性。
一些实施例中,训练图像样本集中还可以包括弱标签训练图像样本。其中,弱标签训练图像样本表示有弱标签信息的图像样本。弱标签信息包括病变类别的标注信息。例如,可能相关医疗人员在标注时,只标注了病变类别,未标注病变位置,这时的样本即是弱标签训练图像样本。
这样,若训练图像样本集中包括两种标注等级的样本,即强标签训练图像样本和弱标签训练图像样本,则可以结合这两种训练图像样本联合训练图像识别模型。
图3为本申请实施例中强标签训练图像样本和弱标签训练图像样本示意图。如图3所示，左图和右图为针对同种病变的两张图。其中左图中标注有一个方框，也可以称为定位框，表示病变位置。方框内区域即表示发生属于某病变类别的病变的区域。而右图中没有定位框，仅包括病变类别标注信息。即，左图为强标签训练图像样本，右图为弱标签训练图像样本。
另外需要说明的是,本申请旨在利用除病变类别之外的其它更多标注信息,来提高图像识别模型的病变预测的准确性,因此,强标签信息并不仅限于包括病变类别和病变位置,还可以是包括病变类别和其它标注信息,本申请实施例中并不进行限制。
一些实施例中,步骤210可以包括以下步骤。
1)将所述训练图像样本集中图像样本输入到神经网络。
考虑到病变类别的识别本身是一个比较复杂的问题,因此采用神经网络结构进行特征提取。神经网络例如为DSOD,当然还可以采用其它具有相同表征能力的深度神经网络,本申请实施例中并不进行限制。
2)获得基于神经网络对图像样本进行特征提取后输出的设定维度的图像特征信息。
其中,图像特征信息为P*P*C维度,P为设定值,P*P表示将图像样本横纵向等分的P*P个图像块,C为预设病变类别数目。例如,P为设定的任意自然数。
这样,经过特征提取后输出的图像特征信息维度为P*P*C。例如,将图像等分为5*5的25个图像块,预设病变类别数目为10,则最后提取的图像特征信息为5*5*10维度的数据。每个数据点可以对应一个图像块,每个数据点的取值代表了对应图像块是否属于某一病变类别的概率。
一些实施例中,为便于计算,提高训练效率,还可以将图像特征信息经过激活函数处理,将图像特征信息数据映射到一定取值范围内。例如,采用sigmoid函数,映射到(0,1)之间。进而可以采用经过激活函数处理后的图像特征信息,训练图像识别模型。
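上述sigmoid映射可用如下最小代码草图示意（假设使用numpy，raw_features为随机生成的虚构数据，维度按正文取5*5*10，仅用于说明，并非本申请的实际实现）：

```python
import numpy as np

def sigmoid(a):
    # 将任意实数a映射到(0, 1)区间
    return 1.0 / (1.0 + np.exp(-a))

P, C = 5, 10                              # 5*5个图像块，10个预设病变类别
raw_features = np.random.randn(P, P, C)   # 假设的特征提取网络输出（虚构数据）
probs = sigmoid(raw_features)             # 每个数据点：对应图像块属于某病变类别的概率
assert probs.shape == (P, P, C)
```

经过该映射后，各数据点均落在(0, 1)内，便于作为概率参与后续损失计算。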
本申请实施例中,在步骤220训练图像识别模型时,根据训练图像样本集中样本的标注情况,相应地提供了以下几种实施方式。
步骤220训练图像识别模型的第一种实施方式:训练图像样本集中仅包括强标签训练图像样本,则基于图像样本的图像特征信息和对应的强标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,训练图像识别模型,直至图像识别模型的强监督目标函数收敛,获得训练完成的图像识别模型。
例如,该方式可以包括以下步骤。
S1、根据图像样本的图像特征信息和对应的强标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,确定强监督目标函数。
本申请实施例中,在基于强标签训练图像样本进行训练时,图像识别模型的输入为强标签训练图像样本,即有强标签信息的图像样本,输出为识别出的病变类别,目标函数即是强监督目标函数。
S2、优化强监督目标函数，直至强监督目标函数收敛时，确定训练完成。
即在训练过程中不断优化强监督目标函数,使其最小化并收敛,即确定图像识别模型训练完成。
也就是说,本申请实施例中可以仅基于强标签训练图像样本,训练图像识别模型,标记信息更加丰富,可以根据病变位置,识别出属于某病变类别的更准确的图像特征信息,使得训练信息更加可靠,减少噪声,从而使得训练得到的图像识别模型更加可靠,提高了准确性。
一些实施例中,上面步骤S1中确定强监督目标函数的方式可以包括以下步骤。
S1.1、1)标记属于对应强标签中病变类别的图像特征信息,一些实施例中:
分别针对每个强标签训练图像样本,根据强标签训练图像样本对应的强标签信息中病变位置,分别确定强标签训练图像样本的图像特征信息中每个图像块与病变位置的重叠率,若重叠率不小于阈值,则将对应图像块标记为1,否则标记为0,获得强标签训练图像样本是否属于对应的强标签信息中病变类别的标记信息。
其中,确定每个图像块与病变位置的重叠率,分别计算图像特征信息中每个图像块与病变位置的IOU值,IOU值即可以表示重叠率,若不小于一定阈值,说明该小图像块属于该病变类别的病变可能性较大,标记为1,否则标记为0,从而得到该强标签训练图像样本的各个图像块属于该强标签信息中病变类别的标记信息。
另外,确定每个图像块与病变位置的重叠率,也可以采用其它计算方式,本申请实施例中并不进行限制,例如分别计算每个图像块对于病变位置的占比,即占了定位框的比例大小,当不小于一定比例时,标记为1,认为属于该病变类别的可能性较大,否则标记为0。
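上述按重叠率(IOU)对各图像块打标记的步骤可示意如下（最小示意：假设图像为边长img_size的正方形，定位框以(x1, y1, x2, y2)表示，阈值0.1仅为示例取值，并非本申请限定的实现）：

```python
def iou(box_a, box_b):
    # 计算两个矩形框(x1, y1, x2, y2)的交并比
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def block_label_map(img_size, P, lesion_box, threshold=0.1):
    # 将图像横纵向等分为P*P个图像块，重叠率不小于阈值的标记为1，否则为0
    h = img_size / P
    labels = []
    for i in range(P):
        row = []
        for j in range(P):
            block = (j * h, i * h, (j + 1) * h, (i + 1) * h)
            row.append(1 if iou(block, lesion_box) >= threshold else 0)
        labels.append(row)
    return labels

# 虚构例子：4*4图像分为2*2块，定位框覆盖图像左半部分
assert block_label_map(4, 2, (0, 0, 2, 4)) == [[1, 0], [1, 0]]
```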
需要说明的是,本申请实施例中将训练图像样本经过特征提取后得到的图像特征信息对应的称为特征图,将强标签训练图像样本是否属于各病变类别的标记信息对应的称为标签图,标签图对应的也是一个P*P*C维度的数据。
例如，参阅图4所示，为本申请实施例中强标签训练图像样本的特征图和标签图的示意图，图4中(A)图为输入的强标签训练图像样本，该图像样本中定位框区域对应的病变类别为A，将该图像样本划分为4个图像块，划分的图像块序号分别为1、2、3、4，图4中(B)图为对应的特征图，特征图中各点分别对应一个图像块，计算各图像块与定位框的重叠率，例如，1和2图像块与定位框的IOU超过了阈值，3和4图像块与定位框的IOU小于阈值，则1和2图像块标记为1，3和4图像块标记为0，即得到如图4中(C)图所示的标签图，表示该图像样本属于病变类别A的标签图。
2)一些实施例中,还可以确定出不属于对应强标签信息中病变类别的图像特征信息,即获得图像样本的图像特征信息是否属于除强标签信息中病变类别的其它预设病变类别的标记信息。一些实施例中,可以获得强标签训练图像样本是否属于除强标签信息中病变类别的其它预设病变类别的标记信息为0。
也就是说,对于不属于强标签信息的其它病变类别,说明该图像样本中不存在属于其它病变类别的病变区域,则该图像样本对于其它病变类别的标记信息为0,即得到对应的标签图中各个图像块对应的标记为0。
例如,针对同种病变,预设的病变类别有三种,分别为A、B和C,某强标签训练图像样本的强标签信息中病变类别为A,则该强标签训练图像样本属于病变类别B和C的标记信息为0。
S1.2、分别根据每个强标签训练图像样本是否属于各病变类别的标记信息和图像特征信息,确定强监督目标函数。
一些实施例中,将标记信息与图像特征信息的损失函数作为强监督目标函数。
例如，强监督目标函数为：

$$L_{strong} = \mathrm{CE}\big(Y^{strong},\ \sigma(f(X_{strong}))\big)$$

其中，$Y^{strong}$表示根据强标签信息得到的标签图，$X_{strong}$表示强标签训练图像样本，$f(X_{strong})$表示经过特征提取获得的图像特征信息，$\mathrm{CE}$表示交叉熵形式的损失函数。

其中，$\sigma(a)=1/(1+e^{-a})$，a表示任意变量。
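标签图与概率化特征图之间的损失可示意性地按逐点二元交叉熵计算如下（最小示意：标签图与概率值均为虚构数据，交叉熵为常见实现形式之一，并非对本申请损失函数的限定）：

```python
import numpy as np

def bce_loss(label_map, prob_map, eps=1e-7):
    # 标签图与概率化特征图的逐点二元交叉熵，对所有图像块与类别取均值
    p = np.clip(prob_map, eps, 1 - eps)
    return float(-np.mean(label_map * np.log(p) + (1 - label_map) * np.log(1 - p)))

# 虚构例子：2*2个图像块、单一病变类别
label_map = np.array([[1.0, 1.0], [0.0, 0.0]])   # 类似图4的标签图
prob_map = np.array([[0.9, 0.8], [0.2, 0.1]])    # sigmoid后的特征图
loss_strong = bce_loss(label_map, prob_map)
```

训练过程即不断优化该损失，使其最小化并收敛。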
步骤220训练图像识别模型的第二种实施方式:训练图像样本集中包括强标签训练图像样本和弱标签训练图像样本,则根据图像样本的图像特征信息,以及对应的强标签信息或弱标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,训练图像识别模型,直至图像识别模型的总目标函数收敛,获得训练完成的图像识别模型。其中,总目标函数为强监督目标函数和弱监督目标函数的总损失函数,弱监督目标函数为识别出的病变类别与弱标签信息中病变类别之间的损失函数。
第二种实施方式可以应用于可能获得的训练图像样本标注等级不同的情况，例如，可能有只标注了病变类别的训练图像样本，也可能有既标注了病变类别又标注了病变位置的训练图像样本，可以不用区分这两种训练图像样本，联合进行训练，一定程度上也丰富了训练图像样本数量。这时，本申请实施例中提供了一种基于不同标注等级训练图像样本联合训练的方式，可以包括以下步骤。
S1、若为强标签训练图像样本,根据图像样本的图像特征信息和对应的强标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,确定强监督目标函数。
一些实施例中确定强监督目标函数的方式和上述第一种实施方式相同,这里就不再进行赘述了。
S2、若为弱标签训练图像样本,根据图像样本的图像特征信息和对应的弱标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,确定弱监督目标函数。
本申请实施例中,若训练图像样本为弱标签训练图像样本时,图像识别模型的输入为弱标签训练图像样本,即有弱标签信息的图像样本,输出为识别出的病变类别,目标函数即是弱监督目标函数。
一些实施例中,本申请实施例中给出了一种确定弱监督目标函数的方式,包括:
S2.1、分别针对每个弱标签训练图像样本,根据弱标签训练图像样本对应的弱标签信息中病变类别,分别确定弱标签训练图像样本的图像特征信息中每个图像块属于各预设病变类别的概率。
这样,每个弱标签训练图像样本都可以分别针对每个预设病变类别,确定其属于该病变类别的概率,称为类别特征图。每个类别特征图表示图像样本中P*P个图像块为该病变类别的概率。
S2.2、确定弱标签训练图像样本的每个图像块属于各预设病变类别的概率中的最大值。
例如，预设病变类别有2种，分别为病变类别A和病变类别B。某个弱标签训练图像样本，被划分为4个图像块。其弱标签信息为病变类别A，则确定该弱标签训练图像样本的4个图像块分别属于病变类别A的概率，假设分别为0.5、0.8、0.2、0.3。由于弱标签为病变类别A，则该弱标签训练图像样本的4个图像块分别属于病变类别B的概率均为0。然后，针对每种病变类别，选择其中概率的最大值，作为该弱标签训练图像样本整体属于各病变类别的概率。即确定该弱标签训练图像样本的4个图像块分别属于病变类别A的概率中最大值为0.8，分别属于病变类别B的概率中最大值为0。即认为该弱标签训练图像样本属于病变类别A的概率为0.8，属于病变类别B的概率为0。这样，可以获得每个弱标签训练图像样本分别针对每种病变类别的概率。
S2.3、分别根据每个弱标签训练图像样本的每个图像块属于各预设病变类别的概率中的最大值，以及对应的弱标签信息中病变类别，确定弱监督目标函数。
一些实施例中,计算每个图像块属于各预设病变类别的概率中的最大值与弱标签信息中病变类别的损失函数,将该损失函数作为弱监督目标函数。
例如，弱监督目标函数为：

$$L_{weak} = \mathrm{CE}\Big(y^{weak},\ \max_{i,j}\,\sigma\big(f(X_{weak})\big)_{i,j}\Big)$$

其中，$y^{weak}$表示弱标签信息，$X_{weak}$表示弱标签训练图像样本，$f(X_{weak})$表示经过特征提取获得的图像特征信息，$\max_{i,j}$表示对P*P个图像块逐类别取最大值。
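上述步骤S2.1至S2.3中对P*P个图像块逐类别取最大值的操作，可用正文中的数值例子示意如下（假设使用numpy，数值取自上文虚构例子，并非本申请的实际实现）：

```python
import numpy as np

def weak_image_probs(block_probs):
    # 对P*P个图像块逐类别取最大值，得到整图属于各预设病变类别的概率
    return block_probs.reshape(-1, block_probs.shape[-1]).max(axis=0)

# 正文例子：4个图像块、2个预设病变类别(A、B)，弱标签为病变类别A
block_probs = np.array([[[0.5, 0.0], [0.8, 0.0]],
                        [[0.2, 0.0], [0.3, 0.0]]])
image_probs = weak_image_probs(block_probs)
assert np.allclose(image_probs, [0.8, 0.0])   # 属于A的概率0.8，属于B的概率0
```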
S3、根据强监督目标函数和弱监督目标函数,确定总目标函数。
例如,总目标函数为:
$$L_{total} = \lambda L_{strong} + (1-\lambda)L_{weak}$$
其中,λ为预设权值,用于权衡强标签训练图像样本和弱标签训练图像样本的损失函数在总损失函数中的占比。
S4、优化总目标函数,直至总目标函数收敛时,确定训练完成。
这时总目标函数收敛,即需要强监督目标函数和弱监督目标函数均收敛,当两者均收敛时完成训练过程。
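总目标函数的加权组合可示意如下（λ为预设权值，示例中的数值仅为虚构取值）：

```python
def total_loss(l_strong, l_weak, lam=0.5):
    # 按预设权值λ对强监督损失与弱监督损失加权求和
    return lam * l_strong + (1 - lam) * l_weak

# 虚构例子：λ=0.25时，总损失 = 0.25*1.0 + 0.75*3.0 = 2.5
assert total_loss(1.0, 3.0, lam=0.25) == 2.5
```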
本申请实施例中,可以结合强标签训练图像样本和弱标签训练图像样本,来联合训练图像识别模型。这样,可以在一定程度上允许弱标签训练图像样本存在,又可以充分利用训练图像样本中的所有标注信息,提高图像识别模型准确性。
一些实施例中,由于本申请实施例中主要是基于至少有病变类别和病变位置的强标签训练图像样本进行训练的,因此,训练得到的图像识别模型不仅可以用于识别病变类别,还可以用于识别病变位置。
基于上述实施例,下面采用一个具体应用场景进行说明,以训练图像样本集中同时包括强标签训练图像样本和弱标签训练图像样本为例进行说明。图5为本申请实施例中图像识别模型训练方法原理框图。
如图5所示,图像识别模型训练方法的整体流程可以分为两部分。
第一部分:特征提取部分。
以基于DSOD模型进行特征提取为例,如图5所示,图5中特征提取部分为DSOD模型基本结构。训练图像样本集中图像样本分别经过卷积、密块(dense block)、卷积、池化、dense block等操作。在最后一层输出设定维度的图像特征信息,维度为P*P*C,表示图像样本的特征图。并且将特征图通过sigmoid函数后,得到监督分离层的输入特征图。
例如，监督分离层的输入特征图为：

$$F = \sigma\big(f(X)\big),\quad X\in\{X_{strong},\ X_{weak}\}$$

其中，$X_{strong}$表示强标签训练图像样本，$X_{weak}$表示弱标签训练图像样本，$Y^{strong}$表示强标签信息，$y^{weak}$表示弱标签信息，经过特征提取最后一层输出的图像特征信息为$f(X)$，维度为P*P*C。
第二部分:监督分离层。
本申请实施例中，监督分离层主要包括强监督分支和弱监督分支。将基于不同标签信息的训练图像样本分别经过不同分支进行训练，判断训练图像样本的标签信息是否为强标签信息。若是强标签信息，则将该强标签训练图像样本送入强监督分支进行训练。若不是强标签信息，则将该弱标签训练图像样本送入弱监督分支进行训练。
如图5所示,预设病变类别有两种,分别为病变类别A和病变类别B。强监督分支主要是基于强标签信息中病变类别和病变位置,通过训练预测病变类别并确定强监督目标函数,弱监督分支主要是基于弱标签信息中病变类别,通过训练预测病变类别并确定弱监督目标函数。这样,不断优化强监督目标函数和弱监督目标函数,均达到收敛,完成训练过程,达到同步利用强标签训练图像样本和弱标签训练图像样本训练图像识别模型的目的。
一些实施例中,图6为本申请实施例中图像识别模型训练中监督分离层的实现逻辑示意图。如图6所示,训练图像样本集中图像样本的图像特征信息,根据标注等级不同,经过分离器,分别进入强监督分支和弱监督分支。
强监督分支:监督分离层的输入特征图,根据强标签信息中病变位置,对输入特征图中每个图像块分别进行估计,分别获得每个图像块是否属于每个预设病变类别的标记,获得对应的标签图。即针对一个病变类别就有对应的一个标签图,标签图与输入特征图的损失函数作为强监督目标函数。
可知,由于强标签训练图像样本中有病变位置的标注信息。因此,可以分别针对每个图像块进行估计判别,分别判断每个图像块是否属于某个病变类别,能确定出更准确的表示病变的图像特征信息,从而可以实现基于病变位置,提高病变类别识别的准确性。
弱监督分支:监督分离层的输入特征图,由于弱标签训练图像样本中只有病变类别的标注信息,没有病变位置。因此,只能针对输入特征图进行整体估计,分别判断该输入特征图整体是否属于每个预设病变类别,分别针对每个预设病变类别获得一个相应的总估计的概率,得到该输入特征图的标签。即针对一个病变类别就有对应的一个概率,总估计的概率与弱标签信息中病变类别的损失函数作为弱监督目标函数。
可知，弱标签训练图像样本中仅有病变类别的标注信息，因此训练时，只能获知整体输入特征图中所有图像特征信息是否属于某个病变类别。而实际中，可能只有其中几个小图像块的图像特征信息才符合相应病变类别的图像特征信息。这样，弱标签训练图像样本在训练时，会引入一些噪声图像特征信息。
因此,本申请实施例中,基于强标签训练图像样本,或者强标签训练图像样本和弱标签训练图像样本联合,训练得到图像识别模型,即至少需要使用强标签训练图像样本进行训练。这样,由于强标签训练图像样本中标注有病变位置,使得在训练时不仅仅可以利用本身的病变类别标注信息,还可以利用判断为该病变类别的位置信息,根据病变位置,可以更加准确地确定出发生病变所代表的图像特征信息,减少噪声,相比于仅使用弱标签训练图像样本进行训练,可以使得图像识别模型训练更加准确和可靠。
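按标注等级将训练图像样本分流到强监督分支或弱监督分支的逻辑，可示意如下（最小示意：样本以字典表示，字段名lesion_class、lesion_box为说明用的假设命名，并非本申请限定的数据结构）：

```python
def route_sample(sample):
    # 带有病变位置(定位框)标注的为强标签样本，进入强监督分支；否则进入弱监督分支
    if sample.get("lesion_box") is not None:
        return "strong"
    return "weak"

assert route_sample({"lesion_class": "A", "lesion_box": (1, 2, 3, 4)}) == "strong"
assert route_sample({"lesion_class": "A", "lesion_box": None}) == "weak"
```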
基于上述实施例中图像识别模型训练方法,本申请实施例中还相应提供了一种图像识别方法。图7为本申请实施例中图像识别方法流程图。该方法可以由一计算设备执行,例如服务器100或终端设备200。该方法可以包括以下步骤。
步骤700:获取待识别图像。
例如,若训练的图像识别模型是针对消化道疾病,则图像识别模型可以用于消化道疾病的病变类别识别,获取的待识别图像即是采集到的消化道图像。
步骤710:提取待识别图像的图像特征信息。
步骤720:基于预设的图像识别模型,以待识别图像的图像特征信息为输入参数,获得待识别图像的病变类别识别结果。其中,图像识别模型为采用至少包括强标签训练图像样本的训练图像样本集进行训练,以确定病变类别识别结果。强标签训练图像样本表示有强标签信息的图像样本,强标签信息至少包括病变类别和病变位置的标注信息。
一些例子中,步骤720处,可以利用所述图像识别模型从所述强标签训练图像样本及所述强标签信息确定的病变位置的图像块特征信息与病变类别的关系,判断所述待识别图像的图像特征信息中每个图像块是否属于所述病变类别,然后根据所述每个图像块是否属于所述病变类别,确定所述待识别图像是否属于所述病变类别,作为待识别图像的病变类别识别结果。
一些例子中,步骤720处,除了利用从所述强标签训练图像样本及所述强标签信息确定的病变位置的图像块特征信息与病变类别的关系外,还可以利用所述图像识别模型从弱标签训练图像样本及弱标签信息确定的整体图像特征信息与病变类别的关系,判断所述待识别图像的图像特征信息是否属于所述病变类别。其中,所述弱标签信息仅包括病变类别的标注信息。确定待识别图像的病变类别识别结果时,可以根据所述每个图像块是否属于所述病变类别,以及所述待识别图像的图像特征信息是否属于所述病变类别,确定所述待识别图像是否属于所述病变类 别。
即这里的图像识别模型为基于上述实施例中图像识别模型训练方法获得的模型。本申请实施例中,可以将图像识别模型例如应用到内镜辅助诊断系统中,用于识别病变类别。由于主要是基于强标签训练图像样本训练得到图像识别模型,更加可靠准确,因此,基于训练得到的图像识别模型,进行病变类别预测,也更加准确。
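识别阶段由各图像块的判定结果汇总整图病变类别的过程，可示意如下（最小示意：采用“任一图像块属于该类别即整图属于该类别”的汇总规则，该规则为说明用的假设，并非本申请限定的判定方式）：

```python
def image_level_decision(block_flags):
    # block_flags为P*P的0/1判定结果：任一图像块属于该病变类别，则整图属于该类别
    return any(any(row) for row in block_flags)

assert image_level_decision([[0, 0], [1, 0]]) is True
assert image_level_decision([[0, 0], [0, 0]]) is False
```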
基于上述实施例,下面采用一个具体应用场景进行说明,参阅图8所示,为本申请实施例中图像识别方法原理框图。
本申请实施例中图像识别方法原理和图像识别模型训练方法原理是类似的,同样也可以分为两部分:
第一部分:特征提取部分。
如图8所示,以采用DSOD为例,待识别图像分别经过卷积、dense block、卷积、池化、dense block等操作,在最后一层输出设定维度的图像特征信息,维度为P*P*C,表示待识别图像的特征图。
一些实施例中,可以将特征图通过sigmoid函数后,得到监督分离层的输入特征图。
第二部分:监督分离层。
将经过特征提取部分获得的图像特征信息,输入图像识别模型中,图像识别模型对图像特征信息进行判断,判断是属于哪种病变类别,例如预设病变类别有两种,分别为病变类别A和病变类别B,则分别判断是否属于病变类别A和是否属于病变类别B,得到最终的病变类别识别结果。
基于上述实施例,图9为本申请实施例中一种图像识别系统的结构示意图。
该图像识别系统至少包括图像采集设备90、处理设备91和显示设备92。本申请实施例中,图像采集设备90、处理设备91和显示设备92为相关的医疗器械,可以集成在同一医疗器械中,也可以分为多个设备,相互连接通信,组成一个医疗系统来使用等,例如针对消化道疾病诊断,图像采集设备90可以为内镜,处理设备91和显示设备92可以为与内镜相通信的计算机设备等。
一些实施例中,图像采集设备90,用于获取待识别图像。
处理设备91,用于提取待识别图像的图像特征信息,并基于预设的图像识别模型,以待识别图像的图像特征信息为输入参数,获得待识别图像的病变类别识别结果。其中,图像识别模型为采用至少包括强标签训练图像样本的训练图像样本集进行训练,以确定病变类别识别结果。强标签训练图像样本表示有强标签信息的图像样本,强标签信息至少包括病变类别和病变位置的标注信息。
显示设备92,用于输出待识别图像的病变类别识别结果。
基于上述实施例,图10为本申请实施例中图像识别模型训练装置,包括以下模块。
获取模块1000,用于获取训练图像样本集。其中,所述训练图像样本集中至少包括强标签训练图像样本。其中,强标签训练图像样本表示有强标签信息的图像样本,所述强标签信息至少包括病变类别和病变位置的标注信息。
提取模块1010,用于提取所述训练图像样本集中图像样本的图像特征信息。
训练模块1020,用于基于所述图像样本的图像特征信息和对应的强标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,训练图像识别模型,直至所述图像识别模型的强监督目标函数收敛,获得训练完成的图像识别模型。所述强监督目标函数为识别出的病变类别与强标签信息中病变类别之间的损失函数。
一些实施例中,所述训练图像样本集中还包括弱标签训练图像样本。弱标签训练图像样本表示有弱标签信息的图像样本。所述弱标签信息包括病变类别的标注信息。训练模块1020进一步用于:根据所述图像样本的图像特征信息,以及对应的强标签信息或弱标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,训练图像识别模型,直至所述图像识别模型的总目标函数收敛,获得训练完成的图像识别模型。其中,所述总目标函数为强监督目标函数和弱监督目标函数的总损失函数,所述弱监督目标函数为识别出的病变类别与弱标签信息中病变类别之间的损失函数。
一些实施例中,提取所述训练图像样本集中图像样本的图像特征信息时,提取模块1010具体用于:
将所述训练图像样本集中图像样本输入到神经网络;获得基于所述神经网络对图像样本进行特征提取后输出的设定维度的图像特征信息。
一些实施例中,所述图像特征信息为P*P*C维度,P为设定值,P*P表示将图像样本横纵向等分的P*P个图像块,C为预设病变类别数目。
一些实施例中,基于所述图像样本的图像特征信息和对应的强标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,训练图像识别模型,直至所述图像识别模型的强监督目标函数收敛时,训练模块1020具体用于:
根据所述图像样本的图像特征信息和对应的强标签信息,标记属于各预设病变类别的图像特征信息;
并根据标记结果,确定强监督目标函数;
优化所述强监督目标函数，直至所述强监督目标函数收敛时，确定训练完成。
一些实施例中,根据所述图像样本的图像特征信息,以及对应的强标签信息或弱标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,训练图像识别模型,直至所述图像识别模型的总目标函数收敛时,训练模块1020具体用于:
若为强标签训练图像样本,根据所述图像样本的图像特征信息和对应的强标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,确定强监督目标函数;
若为弱标签训练图像样本,根据所述图像样本的图像特征信息和对应的弱标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,确定弱监督目标函数;
根据所述强监督目标函数和所述弱监督目标函数,确定总目标函数;
优化所述总目标函数,直至所述总目标函数收敛时,确定训练完成。
一些实施例中,根据所述图像样本的图像特征信息和对应的强标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,确定强监督目标函数时,训练模块1020具体用于:
分别针对每个强标签训练图像样本,根据强标签训练图像样本对应的强标签信息中病变位置,分别确定强标签训练图像样本的图像特征信息中每个图像块与病变位置的重叠率,若重叠率不小于阈值,则将对应图像块标记为1,否则标记为0,获得强标签训练图像样本是否属于对应的强标签信息中病变类别的标记信息;
并获得强标签训练图像样本是否属于除强标签信息中病变类别的其它预设病变类别的标记信息为0;
分别根据每个强标签训练图像样本是否属于各病变类别的标记信息和图像特征信息,确定强监督目标函数。
一些实施例中,根据所述图像样本的图像特征信息和对应的弱标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,确定弱监督目标函数时,训练模块1020具体用于:
分别针对每个弱标签训练图像样本,根据弱标签训练图像样本对应的弱标签信息中病变类别,分别确定弱标签训练图像样本的图像特征信息中每个图像块属于各预设病变类别的概率;
确定弱标签训练图像样本的每个图像块属于各预设病变类别的概率中的最大值;
分别根据每个弱标签训练图像样本的每个图像块属于各预设病变类别的概率中的最大值,以及对应的弱标签信息中病变类别,确定弱监督目标函数。
基于上述实施例，参阅图11所示，本申请实施例中图像识别装置，具体包括：
获取模块1100,用于获取待识别图像;
提取模块1110,用于提取所述待识别图像的图像特征信息;
识别模块1120,用于基于预设的图像识别模型,以所述待识别图像的图像特征信息为输入参数,获得所述待识别图像的病变类别识别结果;其中,所述图像识别模型为采用至少包括强标签训练图像样本的训练图像样本集进行训练,以确定病变类别识别结果;强标签训练图像样本表示有强标签信息的图像样本,所述强标签信息至少包括病变类别和病变位置的标注信息。
基于上述实施例,本申请实施例中还提供了另一示例性实施方式的电子设备。在一些实施方式中,本申请实施例中电子设备可以包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序。其中,处理器执行程序时可以实现上述实施例中图像识别模型训练方法或图像识别方法的步骤。
例如,以电子设备为本申请图1中的服务器100为例进行说明,则该电子设备中的处理器即为服务器100中的处理器110,该电子设备中的存储器即为服务器100中的存储器120。
基于上述实施例,本申请实施例中,提供了一种计算机可读存储介质,其上存储有计算机程序,所述计算机程序被处理器执行时实现上述任意方法实施例中的图像识别模型训练方法或图像识别方法。
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到各实施方式可借助软件加通用硬件平台的方式来实现,当然也可以通过硬件。基于这样的理解,上述技术方案本质上或者说对相关技术做出贡献的部分可以软件产品的形式体现出来,该计算机软件产品可以存储在计算机可读存储介质中,如ROM/RAM、磁碟、光盘等,包括若干指令用以使得一台控制设备(可以是个人计算机,服务器,或者网络设备等)执行各个实施例或者实施例的某些部分所述的方法。
最后应说明的是:以上实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围。

Claims (20)

  1. 一种图像识别方法,由计算设备执行,包括:
    获取待识别图像;
    提取所述待识别图像的图像特征信息;
    基于预设的图像识别模型,以所述待识别图像的图像特征信息为输入参数,获得所述待识别图像的病变类别识别结果;其中,所述图像识别模型为采用至少包括强标签训练图像样本的训练图像样本集进行训练,以确定病变类别识别结果;强标签训练图像样本表示有强标签信息的图像样本,所述强标签信息至少包括病变类别和病变位置的标注信息。
  2. 如权利要求1所述的方法,其中,获得所述待识别图像的病变类别识别结果包括:
    利用所述图像识别模型从所述强标签训练图像样本及所述强标签信息确定的病变位置的图像块特征信息与病变类别的关系,判断所述待识别图像的图像特征信息中每个图像块是否属于所述病变类别;
    根据所述每个图像块是否属于所述病变类别,确定所述待识别图像是否属于所述病变类别。
  3. 如权利要求2所述的方法,其中,获得所述待识别图像的病变类别识别结果进一步包括:
    利用所述图像识别模型从弱标签训练图像样本及弱标签信息确定的整体图像特征信息与病变类别的关系,判断所述待识别图像的图像特征信息是否属于所述病变类别;其中,所述弱标签信息仅包括病变类别的标注信息;
    根据所述每个图像块是否属于所述病变类别,以及所述待识别图像的图像特征信息是否属于所述病变类别,确定所述待识别图像是否属于所述病变类别。
  4. 一种图像识别模型训练方法,由计算设备执行,包括:
    获取训练图像样本集,其中,所述训练图像样本集中至少包括强标签训练图像样本;其中,强标签训练图像样本表示有强标签信息的图像样本,所述强标签信息至少包括病变类别和病变位置的标注信息;
    提取所述训练图像样本集中图像样本的图像特征信息;
    基于所述图像样本的图像特征信息和对应的强标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,训练图像识别模型,直至所述图像识别模型的强监督目标函数收敛,获得训练完成的图像识别模型,其中,所述强监督目标函数为识别出的病变类别与强标签信息中病变类别之间的损失函数。
  5. 如权利要求4所述的方法，其中，所述训练图像样本集中还包括弱标签训练图像样本；弱标签训练图像样本表示有弱标签信息的图像样本，所述弱标签信息包括病变类别的标注信息；则进一步包括：
    根据所述图像样本的图像特征信息,以及对应的强标签信息或弱标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,训练图像识别模型,直至所述图像识别模型的总目标函数收敛,获得训练完成的图像识别模型;其中,所述总目标函数为强监督目标函数和弱监督目标函数的总损失函数,所述弱监督目标函数为识别出的病变类别与弱标签信息中病变类别之间的损失函数。
  6. 如权利要求5所述的方法,其中,提取所述训练图像样本集中图像样本的图像特征信息,具体包括:
    将所述训练图像样本集中图像样本输入到神经网络;
    获得基于所述神经网络对图像样本进行特征提取后输出的设定维度的图像特征信息。
  7. 如权利要求6所述的方法,其中,所述图像特征信息为P*P*C维度,P为设定值,P*P表示将图像样本横纵向等分的P*P个图像块,C为预设病变类别数目。
  8. 如权利要求7所述的方法,其中,基于所述图像样本的图像特征信息和对应的强标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,训练图像识别模型,直至所述图像识别模型的强监督目标函数收敛,具体包括:
    根据所述图像样本的图像特征信息和对应的强标签信息,标记属于各预设病变类别的图像特征信息;
    并根据标记结果,确定强监督目标函数;
    优化所述强监督目标函数，直至所述强监督目标函数收敛时，确定训练完成。
  9. 如权利要求7所述的方法,其中,根据所述图像样本的图像特征信息,以及对应的强标签信息或弱标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,训练图像识别模型,直至所述图像识别模型的总目标函数收敛,具体包括:
    若为强标签训练图像样本,根据所述图像样本的图像特征信息和对应的强标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,确定强监督目标函数;
    若为弱标签训练图像样本,根据所述图像样本的图像特征信息和对应的弱标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,确定弱监督目标函数;
    根据所述强监督目标函数和所述弱监督目标函数,确定总目标函数;
    优化所述总目标函数,直至所述总目标函数收敛时,确定训练完成。
  10. 如权利要求8或9所述的方法，其特征在于，根据所述图像样本的图像特征信息和对应的强标签信息，标记属于各预设病变类别的图像特征信息，并根据标记结果，确定强监督目标函数，具体包括：
    分别针对每个强标签训练图像样本,根据强标签训练图像样本对应的强标签信息中病变位置,分别确定强标签训练图像样本的图像特征信息中每个图像块与病变位置的重叠率,若重叠率不小于阈值,则将对应图像块标记为1,否则标记为0,获得强标签训练图像样本是否属于对应的强标签信息中病变类别的标记信息;
    并获得强标签训练图像样本是否属于除强标签信息中病变类别的其它预设病变类别的标记信息为0;
    分别根据每个强标签训练图像样本是否属于各病变类别的标记信息和图像特征信息,确定强监督目标函数。
  11. 如权利要求9所述的方法,其特征在于,根据所述图像样本的图像特征信息和对应的弱标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,确定弱监督目标函数,具体包括:
    分别针对每个弱标签训练图像样本,根据弱标签训练图像样本对应的弱标签信息中病变类别,分别确定弱标签训练图像样本的图像特征信息中每个图像块属于各预设病变类别的概率;
    确定弱标签训练图像样本的每个图像块属于各预设病变类别的概率中的最大值;
    分别根据每个弱标签训练图像样本的每个图像块属于各预设病变类别的概率中的最大值,以及对应的弱标签信息中病变类别,确定弱监督目标函数。
  12. 一种图像识别模型训练装置,包括:
    获取模块,用于获取训练图像样本集,其中,所述训练图像样本集中至少包括强标签训练图像样本;其中,强标签训练图像样本表示有强标签信息的图像样本,所述强标签信息至少包括病变类别和病变位置的标注信息;
    提取模块,用于提取所述训练图像样本集中图像样本的图像特征信息;
    训练模块,用于基于所述图像样本的图像特征信息和对应的强标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,训练图像识别模型,直至所述图像识别模型的强监督目标函数收敛,获得训练完成的图像识别模型,所述强监督目标函数为识别出的病变类别与强标签信息中病变类别之间的损失函数。
  13. 如权利要求12所述的装置,其中,所述训练模块用于:
    所述训练图像样本集中还包括弱标签训练图像样本时，根据所述图像样本的图像特征信息，以及对应的强标签信息或弱标签信息，标记属于各预设病变类别的图像特征信息，并根据标记结果，训练图像识别模型，直至所述图像识别模型的总目标函数收敛，获得训练完成的图像识别模型；其中，所述总目标函数为强监督目标函数和弱监督目标函数的总损失函数，所述弱监督目标函数为识别出的病变类别与弱标签信息中病变类别之间的损失函数；所述弱标签训练图像样本表示有弱标签信息的图像样本，所述弱标签信息包括病变类别的标注信息。
  14. 如权利要求13所述的装置,其中,所述训练模块用于:
    若为强标签训练图像样本,根据所述图像样本的图像特征信息和对应的强标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,确定强监督目标函数;
    若为弱标签训练图像样本,根据所述图像样本的图像特征信息和对应的弱标签信息,标记属于各预设病变类别的图像特征信息,并根据标记结果,确定弱监督目标函数;
    根据所述强监督目标函数和所述弱监督目标函数,确定总目标函数;
    优化所述总目标函数,直至所述总目标函数收敛时,确定训练完成。
  15. 一种图像识别装置,包括:
    获取模块,用于获取待识别图像;
    提取模块,用于提取所述待识别图像的图像特征信息;
    识别模块,用于基于预设的图像识别模型,以所述待识别图像的图像特征信息为输入参数,获得所述待识别图像的病变类别识别结果;其中,所述图像识别模型为采用至少包括强标签训练图像样本的训练图像样本集进行训练,以确定病变类别识别结果;强标签训练图像样本表示有强标签信息的图像样本,所述强标签信息至少包括病变类别和病变位置的标注信息。
  16. 如权利要求15所述的装置,其中,所述识别模块用于:
    利用所述图像识别模型从所述强标签训练图像样本及所述强标签信息确定的病变位置的图像块特征信息与病变类别的关系,判断所述待识别图像的图像特征信息中每个图像块是否属于所述病变类别;
    根据所述每个图像块是否属于所述病变类别,确定所述待识别图像是否属于所述病变类别。
  17. 如权利要求16所述的装置，其中，所述识别模块进一步用于：
    利用所述图像识别模型从弱标签训练图像样本及弱标签信息确定的整体图像特征信息与病变类别的关系,判断所述待识别图像的图像特征信息是否属于所述病变类别;其中,所述弱标签信息仅包括病变类别的标注信息;
    根据所述每个图像块是否属于所述病变类别，以及所述待识别图像的图像特征信息是否属于所述病变类别，确定所述待识别图像是否属于所述病变类别。
  18. 一种图像识别系统,包括:图像采集设备、图像处理设备和输出设备,具体地:
    图像采集设备,用于获取待识别图像;
    处理设备,用于提取所述待识别图像的图像特征信息,并基于预设的图像识别模型,以所述待识别图像的图像特征信息为输入参数,获得所述待识别图像的病变类别识别结果;其中,所述图像识别模型为采用至少包括强标签训练图像样本的训练图像样本集进行训练,以确定病变类别识别结果;强标签训练图像样本表示有强标签信息的图像样本,所述强标签信息至少包括病变类别和病变位置的标注信息;
    显示设备,用于输出所述待识别图像的病变类别识别结果。
  19. 一种电子设备,包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序,所述处理器执行所述程序时实现权利要求1-11中任一项所述方法。
  20. 一种计算机可读存储介质,其上存储有计算机程序,所述计算机程序被处理器执行时实现权利要求1-11中任一项所述方法。
PCT/CN2020/083489 2019-04-10 2020-04-07 一种图像识别模型训练及图像识别方法、装置及系统 WO2020207377A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/321,219 US11967414B2 (en) 2019-04-10 2021-05-14 Image recognition model training method and apparatus, and image recognition method, apparatus, and system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910284918.6 2019-04-10
CN201910284918.6A CN110009623B (zh) 2019-04-10 2019-04-10 一种图像识别模型训练及图像识别方法、装置及系统

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/321,219 Continuation US11967414B2 (en) 2019-04-10 2021-05-14 Image recognition model training method and apparatus, and image recognition method, apparatus, and system

Publications (1)

Publication Number Publication Date
WO2020207377A1 true WO2020207377A1 (zh) 2020-10-15

Family

ID=67170805

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/083489 WO2020207377A1 (zh) 2019-04-10 2020-04-07 一种图像识别模型训练及图像识别方法、装置及系统

Country Status (3)

Country Link
US (1) US11967414B2 (zh)
CN (2) CN110009623B (zh)
WO (1) WO2020207377A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112633276A (zh) * 2020-12-25 2021-04-09 北京百度网讯科技有限公司 训练方法、识别方法、装置、设备、介质
CN114841970A (zh) * 2022-05-09 2022-08-02 北京字节跳动网络技术有限公司 检查图像的识别方法、装置、可读介质和电子设备

Families Citing this family (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110009623B (zh) * 2019-04-10 2021-05-11 腾讯医疗健康(深圳)有限公司 一种图像识别模型训练及图像识别方法、装置及系统
CN110533086B (zh) * 2019-08-13 2021-01-26 天津大学 图像数据半自动标注方法
CN110660055B (zh) * 2019-09-25 2022-11-29 北京青燕祥云科技有限公司 疾病数据预测方法、装置、可读存储介质及电子设备
CN110738263B (zh) 2019-10-17 2020-12-29 腾讯科技(深圳)有限公司 一种图像识别模型训练的方法、图像识别的方法及装置
CN111161848B (zh) * 2019-10-31 2023-08-29 杭州深睿博联科技有限公司 Ct图像的病灶标注方法及装置、存储介质
CN110909780B (zh) * 2019-11-14 2020-11-03 腾讯科技(深圳)有限公司 一种图像识别模型训练和图像识别方法、装置及系统
CN110909803B (zh) * 2019-11-26 2023-04-18 腾讯科技(深圳)有限公司 图像识别模型训练方法、装置和计算机可读存储介质
CN112926612A (zh) * 2019-12-06 2021-06-08 中移(成都)信息通信科技有限公司 病理图像分类模型训练方法、病理图像分类方法及装置
WO2021133954A1 (en) * 2019-12-23 2021-07-01 DeepHealth, Inc. Systems and methods for analyzing two-dimensional and three-dimensional image data
CN111091562B (zh) * 2019-12-23 2020-12-01 山东大学齐鲁医院 一种消化道病灶大小测量方法及系统
CN111126509B (zh) * 2019-12-31 2024-03-15 深圳开立生物医疗科技股份有限公司 一种图像处理系统模型构建方法和装置
CN111275080B (zh) * 2020-01-14 2021-01-08 腾讯科技(深圳)有限公司 基于人工智能的图像分类模型训练方法、分类方法及装置
CN111539443B (zh) * 2020-01-22 2024-02-09 北京小米松果电子有限公司 一种图像识别模型训练方法及装置、存储介质
CN111353392B (zh) * 2020-02-18 2022-09-30 腾讯科技(深圳)有限公司 换脸检测方法、装置、设备及存储介质
CN111353542B (zh) * 2020-03-03 2023-09-19 腾讯科技(深圳)有限公司 图像分类模型的训练方法、装置、计算机设备和存储介质
CN111358431B (zh) * 2020-03-06 2023-03-24 重庆金山医疗技术研究院有限公司 一种食道压力云图的标识识别方法及设备
CN111414946B (zh) * 2020-03-12 2022-09-23 腾讯科技(深圳)有限公司 基于人工智能的医疗影像的噪声数据识别方法和相关装置
CN111428072A (zh) * 2020-03-31 2020-07-17 南方科技大学 眼科多模态影像的检索方法、装置、服务器及存储介质
CN111626102B (zh) * 2020-04-13 2022-04-26 上海交通大学 基于视频弱标记的双模态迭代去噪异常检测方法及终端
TWI734449B (zh) * 2020-04-21 2021-07-21 財團法人工業技術研究院 用於影像辨識的特徵標註方法及其裝置
US11599742B2 (en) * 2020-04-22 2023-03-07 Dell Products L.P. Dynamic image recognition and training using data center resources and data
CN111524124A (zh) * 2020-04-27 2020-08-11 中国人民解放军陆军特色医学中心 炎症性肠病消化内镜影像人工智能辅助系统
CN111739007B (zh) * 2020-06-22 2024-01-26 中南民族大学 内窥镜图像识别方法、设备、存储介质及装置
CN111738197B (zh) * 2020-06-30 2023-09-05 中国联合网络通信集团有限公司 一种训练图像信息处理的方法和装置
CN111783889B (zh) * 2020-07-03 2022-03-01 北京字节跳动网络技术有限公司 图像识别方法、装置、电子设备和计算机可读介质
CN113516614A (zh) * 2020-07-06 2021-10-19 阿里巴巴集团控股有限公司 脊柱影像的处理方法、模型训练方法、装置及存储介质
CN111932547B (zh) * 2020-09-24 2021-06-11 平安科技(深圳)有限公司 图像中目标物的分割方法、装置、电子设备及存储介质
CN112419251A (zh) * 2020-11-13 2021-02-26 浙江核睿医疗科技有限公司 上消化道内镜图像生成方法、装置、电子设备和存储介质
CN112434178A (zh) * 2020-11-23 2021-03-02 北京达佳互联信息技术有限公司 图像分类方法、装置、电子设备和存储介质
CN112712515A (zh) * 2021-01-06 2021-04-27 重庆金山医疗器械有限公司 一种内镜图像处理方法、装置、电子设备及存储介质
CN112735554A (zh) * 2021-01-06 2021-04-30 重庆金山医疗器械有限公司 一种内镜报告生成装置、方法、电子设备及可读存储介质
CN113034113A (zh) * 2021-04-01 2021-06-25 苏州惟信易量智能科技有限公司 一种基于可穿戴设备的流程控制系统及方法
CN113034114A (zh) * 2021-04-01 2021-06-25 苏州惟信易量智能科技有限公司 一种基于可穿戴设备的流程控制系统及方法
CN112990362B (zh) * 2021-04-20 2021-08-20 长沙树根互联技术有限公司 矿车驾驶等级识别模型训练方法、装置和终端设备
CN113159193B (zh) * 2021-04-26 2024-05-21 京东科技信息技术有限公司 模型训练方法、图像识别方法、存储介质及程序产品
CN113534849A (zh) * 2021-09-16 2021-10-22 中国商用飞机有限责任公司 集成机器视觉的飞行组合导引系统、方法和介质
CN113792807B (zh) * 2021-09-16 2023-06-27 平安科技(深圳)有限公司 皮肤病分类模型训练方法、系统、介质和电子设备
CN113962951B (zh) * 2021-10-15 2022-05-17 杭州研极微电子有限公司 检测分割模型的训练方法及装置、目标检测方法及装置
CN113947701B (zh) * 2021-10-18 2024-02-23 北京百度网讯科技有限公司 训练方法、对象识别方法、装置、电子设备以及存储介质
CN113688248B (zh) 2021-10-26 2022-02-22 之江实验室 一种小样本弱标注条件下的医疗事件识别方法及系统
CN114972725B (zh) * 2021-12-30 2023-05-23 华为技术有限公司 模型训练方法、可读介质和电子设备
CN114310954B (zh) * 2021-12-31 2024-04-16 北京理工大学 一种护理机器人自适应升降控制方法和系统
CN114445406B (zh) * 2022-04-07 2022-08-09 武汉大学 肠镜图像分析方法、装置和医学图像处理设备
CN114565611B (zh) * 2022-04-28 2022-07-19 武汉大学 医学信息获取方法及相关设备
CN115578394B (zh) * 2022-12-09 2023-04-07 湖南省中医药研究院 一种基于非对称网络的肺炎图像处理方法
CN116051486A (zh) * 2022-12-29 2023-05-02 抖音视界有限公司 内窥镜图像识别模型的训练方法、图像识别方法及装置
CN116596927B (zh) * 2023-07-17 2023-09-26 浙江核睿医疗科技有限公司 一种内镜视频处理方法、系统及装置
CN116740475B (zh) * 2023-08-15 2023-10-17 苏州凌影云诺医疗科技有限公司 一种基于状态分类的消化道图像识别方法和系统

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140233826A1 (en) * 2011-09-27 2014-08-21 Board Of Regents Of The University Of Texas System Systems and methods for automated screening and prognosis of cancer from whole-slide biopsy images
CN107368859A (zh) * 2017-07-18 2017-11-21 北京华信佳音医疗科技发展有限责任公司 病变识别模型的训练方法、验证方法和病变图像识别装置
CN107680684A (zh) * 2017-10-12 2018-02-09 百度在线网络技术(北京)有限公司 用于获取信息的方法及装置
CN109447149A (zh) * 2018-10-25 2019-03-08 腾讯科技(深圳)有限公司 一种检测模型的训练方法、装置及终端设备
CN109584218A (zh) * 2018-11-15 2019-04-05 首都医科大学附属北京友谊医院 一种胃癌图像识别模型的构建方法及其应用
CN110009623A (zh) * 2019-04-10 2019-07-12 腾讯科技(深圳)有限公司 一种图像识别模型训练及图像识别方法、装置及系统
CN110013264A (zh) * 2019-04-29 2019-07-16 北京青燕祥云科技有限公司 X光图像识别方法、装置、电子设备及存储介质

Family Cites Families (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100472556C (zh) * 2005-10-09 2009-03-25 欧姆龙株式会社 特定被摄体检测装置及方法
US8150113B2 (en) * 2008-01-23 2012-04-03 Carestream Health, Inc. Method for lung lesion location identification
WO2010071896A2 (en) * 2008-12-19 2010-06-24 Piedmont Healthcare, Inc. System and method for lesion-specific coronary artery calcium quantification
US9117259B2 (en) * 2010-09-22 2015-08-25 Siemens Aktiengesellschaft Method and system for liver lesion detection
KR20120072961A (ko) * 2010-12-24 2012-07-04 삼성전자주식회사 의료 영상을 이용한 영상진단을 보조하는 방법 및 장치, 이를 수행하는 영상진단 시스템
US9324140B2 (en) * 2013-08-29 2016-04-26 General Electric Company Methods and systems for evaluating bone lesions
US10585940B2 (en) * 2014-05-12 2020-03-10 Koninklijke Philips N.V. Method and system for computer-aided patient stratification based on case difficulty
US20190033315A1 (en) * 2014-12-19 2019-01-31 Uti Limited Partnership Metabolomics for diagnosing pancreatic cancer
US11094058B2 (en) * 2015-08-14 2021-08-17 Elucid Bioimaging Inc. Systems and method for computer-aided phenotyping (CAP) using radiologic images
US9760807B2 (en) * 2016-01-08 2017-09-12 Siemens Healthcare Gmbh Deep image-to-image network learning for medical image analysis
CN106157294B (zh) * 2016-04-28 2019-02-19 中国人民解放军第一七五医院 一种用于消化道肿瘤内镜图像识别的方法及应用
US10499882B2 (en) * 2016-07-01 2019-12-10 yoR Labs, Inc. Methods and systems for ultrasound imaging
US9965863B2 (en) * 2016-08-26 2018-05-08 Elekta, Inc. System and methods for image segmentation using convolutional neural network
KR101879207B1 (ko) * 2016-11-22 2018-07-17 주식회사 루닛 약한 지도 학습 방식의 객체 인식 방법 및 장치
CN106845374B (zh) * 2017-01-06 2020-03-27 清华大学 基于深度学习的行人检测方法及检测装置
US10575774B2 (en) * 2017-02-27 2020-03-03 Case Western Reserve University Predicting immunotherapy response in non-small cell lung cancer with serial radiomics
EP3629898A4 (en) * 2017-05-30 2021-01-20 Arterys Inc. AUTOMATED LESION DETECTION, SEGMENTATION AND LONGITUDINAL IDENTIFICATION
EP3432263B1 (en) * 2017-07-17 2020-09-16 Siemens Healthcare GmbH Semantic segmentation for cancer detection in digital breast tomosynthesis
US10650286B2 (en) * 2017-09-07 2020-05-12 International Business Machines Corporation Classifying medical images using deep convolution neural network (CNN) architecture
US10496884B1 (en) * 2017-09-19 2019-12-03 Deepradiology Inc. Transformation of textbook information
WO2019103912A2 (en) * 2017-11-22 2019-05-31 Arterys Inc. Content based image retrieval for lesion analysis
CN108230322B (zh) * 2018-01-28 2021-11-09 浙江大学 一种基于弱样本标记的眼底特征检测装置
US11132792B2 (en) * 2018-02-22 2021-09-28 Siemens Healthcare Gmbh Cross domain medical image segmentation
US10878569B2 (en) * 2018-03-28 2020-12-29 International Business Machines Corporation Systems and methods for automatic detection of an indication of abnormality in an anatomical image
CN108734195A (zh) * 2018-04-13 2018-11-02 王延峰 基于协同学习的弱监督检测模型训练方法及系统
CN109034208B (zh) * 2018-07-03 2020-10-23 怀光智能科技(武汉)有限公司 一种高低分辨率组合的宫颈细胞切片图像分类系统
CN108961274B (zh) * 2018-07-05 2021-03-02 四川大学 一种mri图像中自动头颈肿瘤分割方法
US20200051257A1 (en) * 2018-08-08 2020-02-13 Siemens Medical Solutions Usa, Inc. Scan alignment based on patient-based surface in medical diagnostic ultrasound imaging
CN108985268B (zh) * 2018-08-16 2021-10-29 厦门大学 基于深度迁移学习的归纳式雷达高分辨距离像识别方法
CN109214386B (zh) * 2018-09-14 2020-11-24 京东数字科技控股有限公司 用于生成图像识别模型的方法和装置
CN109359569B (zh) * 2018-09-30 2022-05-13 桂林优利特医疗电子有限公司 一种基于cnn的红细胞图像亚分类方法
CN109359684B (zh) * 2018-10-17 2021-10-29 苏州大学 基于弱监督定位和子类别相似性度量的细粒度车型识别方法
US10430946B1 (en) * 2019-03-14 2019-10-01 Inception Institute of Artificial Intelligence, Ltd. Medical image segmentation and severity grading using neural network architectures with semi-supervised learning techniques

Also Published As

Publication number Publication date
CN110473192B (zh) 2021-05-14
CN110009623B (zh) 2021-05-11
CN110009623A (zh) 2019-07-12
CN110473192A (zh) 2019-11-19
US11967414B2 (en) 2024-04-23
US20210272681A1 (en) 2021-09-02

Similar Documents

Publication Publication Date Title
WO2020207377A1 (zh) 一种图像识别模型训练及图像识别方法、装置及系统
Min et al. Overview of deep learning in gastrointestinal endoscopy
WO2020151536A1 (zh) 一种脑部图像分割方法、装置、网络设备和存储介质
CN110909780B (zh) 一种图像识别模型训练和图像识别方法、装置及系统
US20220051405A1 (en) Image processing method and apparatus, server, medical image processing device and storage medium
US11967069B2 (en) Pathological section image processing method and apparatus, system, and storage medium
Yap et al. Analysis towards classification of infection and ischaemia of diabetic foot ulcers
CN110689025B (zh) 图像识别方法、装置、系统及内窥镜图像识别方法、装置
Xue et al. Modality alignment contrastive learning for severity assessment of COVID-19 from lung ultrasound and clinical information
US20220180520A1 (en) Target positioning method, apparatus and system
EP3482346A1 (en) System and method for automatic detection, localization, and semantic segmentation of anatomical objects
CN113011485A (zh) 多模态多病种长尾分布眼科疾病分类模型训练方法和装置
CN111798425B (zh) 基于深度学习的胃肠道间质瘤中核分裂象智能检测方法
CN115082747B (zh) 基于组块对抗的零样本胃溃疡分类系统
WO2019098415A1 (ko) 자궁경부암에 대한 피검체의 발병 여부를 판정하는 방법 및 이를 이용한 장치
CN112966792B (zh) 血管图像分类处理方法、装置、设备及存储介质
Lei et al. Automated detection of retinopathy of prematurity by deep attention network
CN117237351B (zh) 一种超声图像分析方法以及相关装置
Wang et al. Explainable multitask Shapley explanation networks for real-time polyp diagnosis in videos
CN115937129B (zh) 基于多模态磁共振影像的左右半脑关系的处理方法及装置
CN116029968A (zh) 猴痘感染皮肤图像检测方法和装置、电子设备及存储介质
CN113706449B (zh) 基于病理图像的细胞分析方法、装置、设备及存储介质
Lin et al. A meta-fusion RCNN network for endoscopic visual bladder lesions intelligent detection
Zhang et al. Unsupervised Domain Adaptation Based Automatic COVID-19 CT Segmentation
Hayat et al. Chest X-Ray Image Analysis to Augment the Decision Making in Diagnosing Pneumonia using Convolutional Neural Networks Algorithm

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20788299

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20788299

Country of ref document: EP

Kind code of ref document: A1