WO2021147429A1 - Endoscopic image display method and apparatus, computer device, and storage medium - Google Patents

Endoscopic image display method and apparatus, computer device, and storage medium (内窥镜图像展示方法、装置、计算机设备及存储介质)

Info

Publication number
WO2021147429A1
WO2021147429A1 (PCT/CN2020/124483)
Authority
WO
WIPO (PCT)
Prior art keywords
image
endoscopic
target area
matching
network
Prior art date
Application number
PCT/CN2020/124483
Other languages
English (en)
French (fr)
Other versions
WO2021147429A9 (zh)
Inventor
邱俊文
孙钟前
付星辉
尚鸿
郑瀚
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Company Limited)
Priority date
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Company Limited)
Publication of WO2021147429A1 publication Critical patent/WO2021147429A1/zh
Publication of WO2021147429A9 publication Critical patent/WO2021147429A9/zh
Priority to US17/674,126 priority Critical patent/US20220172828A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00ICT specially adapted for the handling or processing of medical images
    • G16H30/40ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • G06T7/0014Biomedical image inspection using an image reference approach
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/94Hardware or software architectures specially adapted for image or video understanding
    • G06V10/945User interactive design; Environments; Toolboxes
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10068Endoscopic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03Recognition of patterns in medical or anatomical images

Definitions

  • the embodiments of the present application relate to the field of machine learning technology, and in particular to an endoscopic image display method, device, computer equipment, and storage medium.
  • Endoscopes, such as gastroscopes and colonoscopes, have gradually become an important means of assisting doctors in diagnosing gastrointestinal diseases.
  • The operator of the endoscope equipment (such as a doctor or nurse) guides the lens of the endoscope into the patient's digestive tract, where the lens captures images of the digestive tract inside the patient's body in real time. These digestive tract images are displayed on the display interface of an external monitor, from which the doctor makes a preliminary diagnosis of the patient's digestive tract diseases.
  • However, assisted diagnosis through endoscopy places high demands on the doctor's experience. Many doctors lack the ability to accurately diagnose gastrointestinal diseases with the help of an endoscope, resulting in low accuracy of endoscope-assisted diagnosis.
  • the embodiments of the present application provide an endoscopic image display method, device, computer equipment, and storage medium, which can improve the accuracy of assisted diagnosis with the aid of an endoscope.
  • the technical solutions are as follows:
  • an endoscopic image display method is provided, the method is executed by a computer device, and the method includes:
  • acquiring an endoscopic image collected by an endoscope; locating a target area image from the endoscopic image, the target area image being a partial image of the endoscopic image that includes the target area; inputting the target area image into a coding network to obtain the semantic features of the target area image output by the coding network, where the coding network is the part of an image classification network used to extract image features, and the image classification network is a machine learning network trained on a first training image and the image category of the first training image; matching the semantic features of the target area image with the semantic features of each image sample to obtain a matching result, the matching result indicating a target image sample, among the image samples, that matches the target area image; and displaying the endoscopic image and the matching result on the endoscopic image display interface.
  • an endoscopic image display method is provided, the method is executed by a computer device, and the method includes:
  • displaying a first endoscopic image in the endoscopic image display interface, the first endoscopic image being an image collected by the endoscope in the white light mode; in response to the shooting mode of the endoscope being switched to the endoscopic narrow-band imaging (NBI) mode, displaying a second endoscopic image in the endoscopic image display interface, the second endoscopic image being an image collected by the endoscope in the NBI mode; and displaying a matching result corresponding to the second endoscopic image, the matching result indicating a target image sample that matches the target region image in the second endoscopic image.
  • an endoscopic image display device is provided; the device is used in computer equipment, and the device includes:
  • an endoscope image acquisition module, configured to acquire the endoscopic image collected by the endoscope;
  • An area image positioning module configured to locate a target area image from the endoscopic image, where the target area image is a partial image of the endoscope image that includes the target area;
  • a semantic feature extraction module, configured to input the target area image into a coding network to obtain the semantic features of the target area image output by the coding network; the coding network is the part of an image classification network used to extract image features; the image classification network is a machine learning network trained on the first training image and the image category of the first training image;
  • a matching module, configured to match the semantic features of the target area image with the semantic features of each image sample to obtain a matching result; the matching result is used to indicate a target image sample, among the image samples, that matches the target area image; and
  • the display module is used for displaying the endoscopic image and the matching result on the endoscopic image display interface.
  • the matching module includes:
  • a matching score obtaining sub-module, configured to input the semantic features of the target area image and the semantic features of each image sample into a matching network, and obtain the matching scores between the target area image and the respective image samples output by the matching network; the matching network is obtained by training on semantic feature pairs marked with matching labels, each semantic feature pair containing the semantic features of two images, and the matching label being used to indicate whether the corresponding semantic feature pair matches;
  • an image sample determination sub-module, configured to determine the target image sample based on the matching scores between the target area image and the respective image samples; and
  • a matching result obtaining sub-module, configured to obtain the matching result based on the target image sample.
  • Optionally, the image sample determination sub-module is configured to: sort the image samples by matching score from high to low and take the top n image samples as the target image samples, where n ≥ 1 and n is an integer; or take image samples whose matching score is higher than a matching score threshold as the target image samples; or, among the top n image samples sorted by matching score from high to low, take those whose matching score is higher than the matching score threshold as the target image samples.
  • Optionally, the matching result includes at least one of the following: the target image sample, the image category of the target image sample, and the matching degree between the target image sample and the target area image.
  • the device further includes:
  • a second display module, configured to display an area mark corresponding to the endoscopic image in the endoscopic image display interface, where the area mark is used to indicate the target area in the endoscopic image.
  • the regional image positioning module includes:
  • an area coordinate acquisition sub-module, configured to input the endoscopic image into a target area positioning network to obtain the area coordinates output by the target area positioning network; the target area positioning network is a machine learning network obtained by training on a second training image; the second training image is marked with a target area; and
  • a first area image acquisition sub-module, configured to acquire the image corresponding to the area coordinates in the endoscopic image as the target area image.
  • the regional image positioning module includes:
  • a user operation receiving sub-module, configured to receive a frame selection operation performed by the user in the endoscopic image; and
  • the second region image acquisition sub-module is configured to acquire the image of the region corresponding to the frame selection operation in the endoscopic image as the target region image.
  • Optionally, the region image positioning module is configured to perform the step of locating a target region image from the endoscopic image in response to the image mode of the endoscopic image being the endoscopic narrow-band imaging (NBI) mode.
  • the device further includes:
  • an image mode information acquisition module, configured to input the endoscopic image into an image mode classification network to obtain the image mode information output by the image mode classification network; the image mode classification network is a machine learning network trained on a third training image, and the third training image is annotated with an image mode; the image mode information indicates whether the image mode of the endoscopic image is the NBI mode.
  • the device further includes:
  • a working state acquisition module, configured to acquire the working state of the endoscope when the endoscopic image is collected, and, in response to the working state being the NBI working state, determine that the image mode of the endoscopic image is the NBI mode.
  • the device further includes:
  • an image quality information acquisition module, configured to acquire image quality information of the endoscopic image, where the image quality information includes at least one of blur, exposure and hue abnormality, and effective resolution;
  • the region image positioning module is configured to perform the step of locating a target region image from the endoscopic image in response to the image quality information meeting an image quality threshold.
  • an endoscopic image display device is provided, the device is used in computer equipment, and the device includes:
  • a first display module, configured to display a first endoscopic image in the endoscopic image display interface, where the first endoscopic image is an image collected by the endoscope in the white light mode;
  • a second display module, configured to, in response to the shooting mode of the endoscope being switched to the endoscopic narrow-band imaging (NBI) mode, display a second endoscopic image in the endoscopic image display interface, where the second endoscopic image is an image collected by the endoscope in the NBI mode; and
  • a third display module, configured to display a matching result corresponding to the second endoscopic image in the endoscopic image display interface, where the matching result is used to indicate a target image sample that matches the target region image in the second endoscopic image.
  • In a fifth aspect, a computer device is provided, including a processor and a memory; the memory stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the above endoscopic image display method.
  • In another aspect, a computer-readable storage medium is provided; the storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the above endoscopic image display method.
  • Fig. 1 is a frame diagram of image recognition and image display according to an exemplary embodiment
  • Fig. 2 shows a flowchart of an endoscopic image display method shown in an exemplary embodiment of the present application
  • Fig. 3 shows a flowchart of an endoscopic image display method shown in an exemplary embodiment of the present application
  • Fig. 4 shows a schematic diagram of an effective pixel area shown in an exemplary embodiment of the present application
  • Fig. 5 shows a schematic structural diagram of a coding network provided by an exemplary embodiment of the present application
  • Fig. 6 shows a schematic diagram of training a matching network provided by an exemplary embodiment of the present application
  • Fig. 7 shows an interface diagram of an endoscope image display interface shown in an exemplary embodiment of the present application
  • Fig. 8 shows a schematic diagram of an endoscope image recognition process shown in an exemplary embodiment of the present application
  • Fig. 9 shows a flowchart of an endoscopic image display method according to an exemplary embodiment of the present application.
  • Fig. 10 shows a schematic diagram of an endoscopic image retrieval system according to an exemplary embodiment of the present application
  • Fig. 11 shows a flowchart of an endoscopic image display method according to an exemplary embodiment of the present application
  • Fig. 12 is a structural block diagram of an endoscopic image display device according to an exemplary embodiment
  • Fig. 13 is a structural block diagram of an endoscopic image display device according to an exemplary embodiment
  • Fig. 14 is a structural block diagram showing a computer device according to an exemplary embodiment.
  • The embodiments of the present application propose an efficient and highly accurate endoscopic assisted diagnosis solution, which helps users and their doctors quickly identify possible digestive tract diseases (such as early gastric cancer) by locating the lesion area image in the endoscopic image and matching it against existing image samples.
  • In the embodiments of the present application, an endoscope refers to a commonly used medical device composed of a bendable part, a light source, and a lens. The endoscope can enter the human body through a natural orifice or through a small incision made by surgery, and can be introduced into the organ to be examined to directly observe changes in the relevant part. Endoscopes include the gastroscope, the colonoscope, and so on.
  • A gastroscope is an endoscope whose slender, soft tube extends through the pharynx into the stomach, so that the lens at the tube head can capture images of the digestive tract inside the patient's body in real time, and the doctor can directly observe lesions of the esophagus, stomach, and duodenum on the screen of an external monitor. Through gastroscopy, doctors can directly observe the true condition of the inspected parts and can further confirm a diagnosis through biopsy and cytology of suspicious lesions, making gastroscopy the first choice for examining upper gastrointestinal lesions.
  • The focus (lesion) area of a digestive tract disease is the area where the digestive tract organ produces the disease. For example, the gastric mucosal area corresponding to protruding tissue is the focal area of gastric polyps; for another example, when the duodenal mucosa is digested under the invasion of high gastric acid and pepsin and a local inflammatory defect forms in the mucosa, the area corresponding to the digested duodenal mucosal epithelium is the focal area of a duodenal ulcer.
  • Narrow Band Imaging (NBI), also known as narrow-band imaging endoscopy, is an emerging endoscopy technology that uses filters to filter out the broadband spectrum from the red, green, and blue light emitted by the endoscope light source, leaving only a narrow-band spectrum for diagnosing various diseases of the digestive tract. Images captured with the NBI mode turned on can not only accurately show the morphology of the digestive tract mucosal epithelium, such as the structure of the epithelial glands, but also the morphology of the epithelial vascular network. This technology helps endoscopists better distinguish gastrointestinal epithelium, changes in vascular morphology in gastrointestinal inflammation, and irregular changes in early gastrointestinal tumors, thereby improving the accuracy of endoscopic diagnosis.
  • The loss function, or cost function, is a function that maps the value of a random event or its related random variable to a non-negative real number to express the "risk" or "loss" of the random event. The loss function is usually associated with an optimization problem as a learning criterion; that is, the model is solved and evaluated by minimizing the loss function.
  • Artificial intelligence is a theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • artificial intelligence is a comprehensive technology of computer science, which attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a similar way to human intelligence.
  • Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines have the functions of perception, reasoning, and decision-making.
  • Artificial intelligence technology is a comprehensive discipline, covering a wide range of fields, including both hardware-level technology and software-level technology.
  • Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, and mechatronics.
  • Artificial intelligence software technology mainly includes computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
  • the solutions provided by the embodiments of the present application mainly relate to technologies such as machine learning/deep learning in artificial intelligence.
  • Machine learning is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other subjects. It specializes in studying how computers can simulate or realize human learning behaviors in order to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve their own performance.
  • Machine learning is the core of artificial intelligence, the fundamental way to make computers intelligent, and its applications cover all fields of artificial intelligence.
  • Machine learning and deep learning usually include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstrations.
  • The AI box is a set of hardware equipment and services deployed in the hospital. An AI engine service and a video capture card are integrated in the AI box. The AI box obtains the real-time video stream of the endoscope and inputs it to the AI engine service, which locates the lesion in the real-time endoscopic image through AI and analyzes in real time whether it is an early cancer and with what probability.
  • Fig. 1 is a frame diagram of image recognition and image display according to an exemplary embodiment.
  • The image recognition device 110 performs real-time lesion recognition based on the endoscopic image input by the endoscopic device 120. The image recognition device 110 may be an AI box and may include a video acquisition module, an image recognition module, and an image output module. The video acquisition module is used to acquire the images captured by the endoscope in real time and input them into the image recognition module; it can be implemented as the video capture card shown in FIG. 1. The image recognition module is used to perform recognition processing on the images input into the image recognition device to obtain a recognition result; it can be implemented as the AI engine and CPU server shown in FIG. 1. In one possible implementation, the AI engine and the CPU server can be integrated in the AI box, or the AI engine and the CPU server can interact with the AI box in a wired or wireless manner. The image output module is used to input the recognition result obtained by the image recognition module into the image display device 130 for display, where the image display device 130 may be a built-in display module of the image recognition device 110 or an image display device external to the image recognition device 110.
  • the image display device 130 displays the endoscopic image and the recognition result of the image recognition device 110 on the image display interface.
  • The aforementioned image recognition device 110 may be a computer device with machine learning capabilities. The computer device may be a stationary computer device such as a personal computer, a server, or a stationary medical device (such as the aforementioned AI box), or the computer device may be a mobile computer device such as a tablet computer, an e-book reader, or a portable medical device.
  • the image recognition device 110 and the image display device 130 described above may be the same device, or the image recognition device 110 and the image display device 130 may also be different devices.
  • the image recognition device 110 and the image display device 130 may be the same type of device, for example, the image recognition device 110 and the image display device 130 may both be personal computers;
  • the image recognition device 110 and the image display device 130 may also be different types of devices.
  • For example, the image recognition device 110 may be an AI box, and the image display device 130 may be a fixed or portable medical device; for example, the image display device may be a doctor report workstation of the Picture Archiving & Communication System (PACS) shown in FIG. 1. In the subsequent embodiments of this application, the description takes the case where the image display device is a device external to the image recognition device (for example, an AI box) as an example.
  • FIG. 2 shows a flowchart of an endoscopic image display method according to an exemplary embodiment of the present application.
  • The endoscopic image display method may be used in a computer device; for example, the computer device may be the image recognition device shown in FIG. 1 above.
  • the endoscopic image display method may include the following steps:
  • Step 210 Obtain an endoscopic image collected by an endoscope.
  • The endoscopic image collected by the endoscope can be a white light image or an NBI image. An NBI image is an image collected after a filter is used, during endoscopic image collection, to filter out the broadband spectrum from the red, green, and blue light emitted by the endoscope light source. Whether the endoscope collects a white light image or an NBI image can be adjusted by the medical worker by changing the working mode of the endoscope: when the working mode of the endoscope is the white light mode, the image collected by the endoscope is a white light image; when the working mode of the endoscope is the NBI mode, the image collected is an NBI image.
  • Step 220 Locate a target area image from the endoscopic image, where the target area image is a partial image that includes the target area in the endoscopic image.
  • In the embodiment of the present application, the computer device can take the image within a certain range around the suspected lesion area in the endoscopic image as the target area image, and locate the position of the target area image in the endoscopic image.
  • Step 230 Input the target region image into the coding network to obtain the semantic features of the target region image output by the coding network; the coding network is the part of the image classification network used to extract image features; the image classification network is a machine learning network trained on the first training image and the image category of the first training image.
  • The coding network may be a convolutional network (such as a fully convolutional network), and the coding network is used to extract image features, so as to obtain the semantic features of the target region image input into the coding network.
  • the coding network may be a part of the image classification network, and the image classification network may be composed of the coding network and the classification network.
  • In the training phase, the first training image and the image category of the first training image can be input into the model training device to train the image classification network. During training, a loss function can be calculated based on the output result of the image classification network and the image category of the first training image; the loss function is used to characterize the gap between the output result of the image classification network and the image category of the first training image, and the parameters in the image classification network are adjusted according to the loss function so that the output result of the trained image classification network is as close as possible to the image category of the first training image.
  • In a possible implementation, the first training image may be a marked area image from an endoscopic image, and the image category of the first training image may be the image type corresponding to that area image, that is, whether it is the target area, for example, whether it is a lesion area. A minimal training sketch follows.
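  • The following is a minimal sketch, assuming PyTorch/torchvision, of the relationship described above between the image classification network and the coding network: the classification network is trained on (first training image, image category) pairs, and the coding network is its feature-extraction part with the classification head removed. The ResNet-18 backbone, two-class labels, and optimizer settings are illustrative stand-ins, not the patent's specified architecture.

```python
import torch
import torch.nn as nn
from torchvision import models

# Image classification network: backbone (feature extractor) + classifier head.
classification_net = models.resnet18(num_classes=2)  # e.g. lesion area vs. not
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(classification_net.parameters(), lr=1e-4)

def train_step(images, labels):
    """One step: push the network output toward the labeled image category."""
    optimizer.zero_grad()
    loss = loss_fn(classification_net(images), labels)  # gap between output and label
    loss.backward()                                     # adjust parameters via the loss
    optimizer.step()
    return loss.item()

# After training, drop the final fully connected layer; the remaining part is
# the coding network used to extract semantic features of the target area image.
coding_network = nn.Sequential(*list(classification_net.children())[:-1])
```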
  • Step 240 Match the semantic feature of the target area image with the semantic feature of each image sample to obtain a matching result; the matching result is used to indicate a target image sample matching the target area image in each image sample.
  • The semantic feature of each image sample refers to the semantic feature corresponding to each image sample, obtained in advance through the same coding network used for the target region image.
  • each image sample has a corresponding lesion attribute
  • the matching result of the semantic feature of the target area image and the semantic feature of each image sample can indicate the lesion attribute that the target area image may correspond to.
  • step 250 the endoscopic image and the matching result are displayed on the endoscopic image display interface.
  • The endoscope image display interface may be the display screen of an external image display device connected to the computer device. The endoscope display interface can display images collected by the endoscope in real time, or endoscope images processed by the computer device, for example, an image marked with the location of the target image area. In a possible implementation, the matching result may also be displayed on the endoscope display interface; the matching result may include the lesion attribute that the target image area may correspond to, obtained through the matching network, and/or the image of the image sample corresponding to the matching result.
  • In summary, according to the endoscopic image display method provided by the embodiments of the present application, the endoscopic image collected by the endoscope is acquired; the target area image is located from the endoscopic image and input into the coding network to obtain the semantic features of the target area image output by the coding network; the semantic features of the target area image are matched with the semantic features of each image sample to obtain the matching result; and the endoscopic image and the matching result are displayed on the endoscopic image display interface.
  • FIG. 3 shows a flowchart of an endoscopic image display method according to an exemplary embodiment of the present application.
  • Step 310 Obtain an endoscopic image collected by an endoscope.
  • The above process can be expressed as follows: the AI box obtains the real-time video stream of the endoscope, or the real-time video stream frames of the endoscope are input into an AI engine server connected to or integrated in the AI box; accordingly, the AI engine server obtains the endoscopic image collected by the endoscope.
  • Step 320 Obtain an image mode of the endoscopic image collected by the endoscope.
  • The image mode of the endoscopic image can be switched manually by the medical staff. For example, when the medical staff finds a suspected lesion while observing the endoscopic image in the white light mode, they can switch the white light mode to the NBI mode; compared with the white light mode, the NBI-mode picture shows information such as the course of blood vessels and the openings of ducts more clearly. In the NBI mode, more details of the suspected lesion can be observed in the endoscopic image, so that the medical staff can judge the suspected lesion.
  • In a possible implementation, the process of acquiring the image mode of the endoscopic image collected by the endoscope can be implemented as: acquiring the working state of the endoscope when the endoscopic image is collected; and, in response to the working state being the NBI working state, determining that the image mode of the endoscopic image is the NBI mode. That is, the computer device can acquire the working state of the endoscope at collection time based on the mode-selection operation performed by the medical staff: when the endoscope works in the white light mode, the image mode of the endoscopic image collected in this mode is the white light mode; when the endoscope works in the NBI mode, the image mode of the endoscopic image collected by the endoscope is the NBI mode.
  • In another possible implementation, the computer device may input the endoscopic image into an image mode classification network to obtain the image mode information output by the image mode classification network, where the image mode classification network is a machine learning network obtained by training on a third training image, the third training image being annotated with an image mode; the image mode information indicates whether the image mode of the endoscopic image is the NBI mode.
  • Optionally, the above image mode classification network may be a dense convolutional network (DenseNet) that classifies and recognizes endoscopic images, and the image mode classification network can be obtained in advance through machine learning model training.
  • In the training phase, endoscopic image samples and their corresponding image modes can be input into the model training device to construct the image mode classification network; a loss function is calculated based on the output result of the image mode classification network and the corresponding image mode, and the parameters in the image mode classification network are adjusted according to the loss function, so that the output of the trained image mode classification network indicates, as closely as possible, the image mode corresponding to the endoscopic image sample. In the application phase, the computer device can input the endoscopic image into the image mode classification network, which outputs the image mode corresponding to the endoscopic image.
  • In a possible implementation, the input endoscopic image is scaled so that the scaled image size meets the requirements of the image mode classification network; for example, if the network requires an input size of 224*224, the input endoscopic image is first scaled to 224*224 before the image mode is judged.
  • this application provides an image pattern classification network, the model structure of which is shown in Table 1:
  • Because the image mode classification task relies more on lower-level feature combinations, such as blood vessel color, a wider and shallower configuration can be used when setting the combination of depth and width of the dense convolutional network structure. The network structure finally used can be the above-mentioned DenseNet-40 (a 40-layer dense convolutional network structure), with the network parameters tuned accordingly, for example, the growth rate set to 48 and the transition-layer compression ratio set to 0.5; a minimal configuration sketch follows.
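  • The following is a minimal sketch, assuming PyTorch/torchvision, of a DenseNet-40-style image mode classifier as configured above. The growth rate of 48, the roughly 40-layer depth (three dense blocks of 12 layers), the 224*224 input size, and the two-class (NBI vs. white light) head follow the text; the block layout, num_init_features, and the omission of training are illustrative assumptions. torchvision's transition layers already compress channels by a ratio of 0.5.

```python
import torch
from torchvision import models, transforms

# "Wider and shallower": three dense blocks of 12 layers each (about 40 layers
# in total), growth rate 48.
mode_classifier = models.DenseNet(
    growth_rate=48,
    block_config=(12, 12, 12),
    num_init_features=96,
    num_classes=2,            # 0 = white light, 1 = NBI
)

# Scale every input frame to the 224x224 size the network expects.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def predict_image_mode(pil_frame):
    """Return the predicted image mode for one endoscopic frame (PIL image)."""
    x = preprocess(pil_frame).unsqueeze(0)        # [1, 3, 224, 224]
    with torch.no_grad():
        logits = mode_classifier(x)
    return "NBI" if logits.argmax(dim=1).item() == 1 else "white light"
```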
  • Optionally, before the computer device locates the target area image from the endoscopic image, the method further includes: acquiring image quality information of the endoscopic image, where the image quality information includes at least one of blur, exposure and hue abnormality, and effective resolution; and, in response to the image quality information meeting an image quality threshold, performing the step of locating the target area image from the endoscopic image.
  • The acquired endoscopic images may contain blurred pictures caused by blurred shooting or by undigested food residue in the digestive tract. These blurred pictures would cause serious errors in subsequent analysis and judgment, so the low-quality pictures among the images collected by the endoscope need to be filtered out. Low-quality pictures include, but are not limited to, the following three cases: blurred pictures; pictures with abnormal hue, overexposure, or underexposure; and low-resolution pictures.
  • For low-resolution pictures, the computer device can detect them by calculating the effective pixel area of the picture, where the effective pixel area refers to the area remaining after the black borders on the top, bottom, left, and right of the picture are cropped. FIG. 4 shows a schematic diagram of the effective pixel area according to an exemplary embodiment of this application; the area 410 outside the upper, lower, left, and right black borders in the interface of FIG. 4 is the effective pixel area. The black-border cropping can be implemented by the computer device by calculating the gray value distribution of each row or column of the picture: when the proportion of pixels in a row or column whose gray value is below a preset threshold reaches a certain ratio, it is determined that the row or column should be cut off. After cutting off the black borders, the computer device evaluates the effective pixel area; if the effective pixel area is less than a certain threshold, the picture is judged to be a low-resolution picture. The thresholds in the above description can be set based on actual application requirements; a minimal sketch follows.
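  • The following is a minimal sketch, assuming a grayscale numpy array as input, of the black-border cropping and low-resolution check described above. The two thresholds (the dark gray level and the dark-pixel ratio that marks a border row or column) and the minimum-area value are illustrative.

```python
import numpy as np

def effective_pixel_area(gray, dark_level=20, dark_ratio=0.95):
    """Crop near-black borders and return the remaining (effective) region."""
    dark = gray < dark_level                             # per-pixel darkness mask
    rows = np.where(dark.mean(axis=1) < dark_ratio)[0]   # rows that are NOT border
    cols = np.where(dark.mean(axis=0) < dark_ratio)[0]   # columns that are NOT border
    if rows.size == 0 or cols.size == 0:
        return None                                      # whole frame is black
    return gray[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

def is_low_resolution(gray, min_area=200 * 200):
    """A frame whose effective pixel area is below the threshold is low quality."""
    roi = effective_pixel_area(gray)
    return roi is None or roi.size < min_area
```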
  • For blurred pictures, the embodiment of the present application provides an exemplary detection algorithm. Gaussian filtering is first performed on the endoscopic image to eliminate the moiré generated when the endoscope samples the scene, where moiré refers to high-frequency interference fringes appearing on the photosensitive element, a kind of high-frequency irregular fringe that makes the picture appear colored. The endoscopic image after Gaussian filtering is defined as R, and the image obtained by further applying median filtering to R is defined as P; the median filter may be a 3*3 median filter. Edge maps G_P and G_R are then extracted from P and R using an edge detection operator, which may be the Sobel operator; the similarity between G_P and G_R is calculated, and whether the endoscopic image is a blurred picture is determined based on the similarity: the higher the similarity between G_P and G_R, the more blurred the endoscopic image; the lower the similarity, the clearer the endoscopic image. A minimal sketch follows.
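  • The following is a minimal sketch of the blur detector described above, assuming OpenCV and a grayscale frame. The cosine-style similarity of the two edge maps and the 0.85 decision threshold are illustrative choices; the text only requires some similarity measure over G_P and G_R.

```python
import cv2
import numpy as np

def is_blurred(gray, threshold=0.85):
    # R: Gaussian-filtered frame (suppresses sampling moire); P: median-filtered R.
    R = cv2.GaussianBlur(gray, (5, 5), 0)
    P = cv2.medianBlur(R, 3)                 # 3x3 median filter, as in the text

    def sobel_edges(img):
        # Sobel gradient magnitude as the edge map of the image.
        gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)
        return cv2.magnitude(gx, gy)

    g_r, g_p = sobel_edges(R), sobel_edges(P)

    # Near 1 => the edges survive the median filter unchanged, i.e. the frame
    # has few fine details and is therefore blurred.
    sim = float((g_r * g_p).sum() /
                (np.linalg.norm(g_r) * np.linalg.norm(g_p) + 1e-8))
    return sim > threshold
```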
  • For pictures with abnormal hue, overexposure, or underexposure, this application provides an exemplary detection algorithm. Because hue abnormality and over/underexposure cover many distinct phenomena, a standard library of images with qualified hue and normal exposure needs to be built first. When detecting an endoscopic image, the image is first divided evenly into n image blocks, from which m image blocks are randomly selected, m and n both being positive integers with m ≤ n. If the similarity between each of the m image blocks and the standard images reaches a similarity threshold, the endoscopic image is judged to be a matched picture, that is, its hue is normal and it is neither overexposed nor underexposed; otherwise, if the similarity between a certain number of the m image blocks and the standard images does not reach the similarity threshold, the endoscopic image is judged to be a match-failure picture, that is, its hue is abnormal and/or it is overexposed or underexposed. For example, the endoscopic image can be divided into 7*7 image blocks, from which 9 image blocks are randomly taken out to calculate H, S, and V; with H and S as features, the 9 image blocks are each compared against the standard images. The similarity threshold and the block-count threshold for judging the matching of image blocks can be set and adjusted based on actual application requirements, which is not limited in the embodiments of the present application. A minimal sketch follows.
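  • The following is a minimal sketch of the hue/exposure check described above, assuming OpenCV, a BGR frame, and a pre-built list of H/S histograms computed from the standard library. The 7*7 grid and 9 sampled blocks follow the text; the histogram bin counts, correlation measure, and thresholds are illustrative.

```python
import cv2
import numpy as np

def block_hs_hist(bgr_block):
    """H/S histogram of one image block, used as its hue/exposure feature."""
    hsv = cv2.cvtColor(bgr_block, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
    return cv2.normalize(hist, hist).flatten()

def hue_exposure_ok(frame, standard_hists, sim_threshold=0.6, max_bad_blocks=2):
    h, w = frame.shape[:2]
    bh, bw = h // 7, w // 7                               # 7x7 grid of blocks
    idx = np.random.choice(49, size=9, replace=False)     # sample 9 blocks
    bad = 0
    for i in idx:
        r, c = divmod(int(i), 7)
        block = frame[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
        hist = block_hs_hist(block)
        # Best correlation of this block against any standard-library histogram.
        best = max(cv2.compareHist(hist, s, cv2.HISTCMP_CORREL)
                   for s in standard_hists)
        if best < sim_threshold:
            bad += 1
    return bad <= max_bad_blocks          # too many failing blocks => abnormal
```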
  • Optionally, the acquisition and judgment of image quality information can be performed before acquiring the working state of the endoscope when the endoscopic image is collected, to ensure that the endoscopic images input into the image mode classification network are all high-quality pictures; this facilitates the judgment of the image mode classification network and improves its recognition accuracy. That is, in response to the image quality information meeting the image quality threshold, the step of acquiring the working state of the endoscope when the endoscopic image is collected is performed.
  • Step 330 In response to the image mode of the endoscopic image being the NBI mode, the endoscopic image is input into the target area positioning network to obtain the area coordinates output by the target area positioning network; the target area positioning network is a machine learning network obtained by training on the second training image; the second training image is marked with the target area.
  • If the image mode of the endoscopic image acquired in step 320 indicates that the image is in the NBI mode, the endoscopic image is input into the target area positioning network; accordingly, if the image mode acquired in step 320 indicates that the image is in the white light mode, the step of inputting the endoscopic image into the target area positioning network is not performed.
  • the target area positioning network is used to locate the target area in the input NBI mode endoscopic image.
  • the target area may be a suspected lesion area.
  • Through the target area positioning network, the computer device can obtain and output the location coordinates of the suspected lesion area in the endoscopic image.
  • the target area positioning network may be an end-to-end real-time target detection and recognition network, and the target positioning network may be obtained through training of a machine learning model in advance.
  • In the training phase, the second training image marked with the target area can be input into the model training device to construct the target area positioning network; a loss function is calculated based on the output result of the target area positioning network and the coordinates of the corresponding marked target area, and the parameters in the target area positioning network are adjusted according to the loss function, so that the output of the trained target area positioning network indicates, as closely as possible, the target area marked in the second training image.
  • the YOLO v2 algorithm can be used to locate and detect the target area.
  • YOLO v2 uses a single neural network to convert the target detection problem into a regression problem of extracting bounding boxes and category probabilities in the image.
  • YOLO v2 adopts a multi-scale training method and borrows the anchor box idea from Faster RCNN, which can improve the detection precision and generalization ability of the model while ensuring the detection speed. In a possible implementation, the sizes of the anchor boxes can be obtained by clustering over the training data, as sketched below.
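  • The following is a minimal sketch of YOLO v2-style anchor-box clustering: k-means over the ground-truth box sizes with 1 - IoU as the distance. The number of anchors (5) and the (width, height) input format are assumptions.

```python
import numpy as np

def iou_wh(boxes, anchors):
    """IoU between (w, h) pairs, treating all boxes as sharing one corner."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0]) *
             np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    areas = boxes[:, 0] * boxes[:, 1]
    union = areas[:, None] + anchors[None, :, 0] * anchors[None, :, 1] - inter
    return inter / union

def kmeans_anchors(boxes_wh, k=5, iters=100):
    """Cluster training-box sizes; max IoU is equivalent to min (1 - IoU)."""
    anchors = boxes_wh[np.random.choice(len(boxes_wh), k, replace=False)].astype(float)
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes_wh, anchors), axis=1)
        for j in range(k):
            if np.any(assign == j):
                anchors[j] = boxes_wh[assign == j].mean(axis=0)
    return anchors
```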
  • In a possible implementation, the Imagenet (image database) data can be used to obtain good initialization parameters for the target area positioning network, and field-specific data is then used to fine-tune those initialization parameters, so that the resulting target area positioning network performs well in that field. The Imagenet data is an open-source data set for image classification and target detection in the computer vision field; its image data covers thousands of categories across various fields, and the amount of data exceeds one million. In the embodiments of this application, the model initialization parameters obtained through Imagenet training allow the model to converge better toward the global optimal solution; specific training is then carried out for a specific field to improve the judgment accuracy of the model in that field, for example, retraining the initialized model with endoscopic images from the medical field to obtain a model with higher accuracy in the medical field.
  • In the application phase, the computer device can input the NBI-mode endoscopic image into the target area positioning network, and the network can output the coordinates of the target area in the endoscopic image.
  • Step 340 Acquire an image corresponding to the region coordinates in the endoscopic image as a target region image.
  • The coordinates of the target area may be the coordinates of several end points of a polygon (such as a rectangle) that frames the target area. The computer device can connect the end point coordinates in turn to obtain the extent of the target area, and acquire the image within that extent as the target area image, as sketched below.
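  • The following is a minimal sketch of step 340, assuming the frame is a numpy array and the positioning network returns the end points of an axis-aligned rectangle as (x, y) pairs; the names are illustrative.

```python
import numpy as np

def crop_target_area(frame, corners):
    """corners: (x, y) end points of the rectangle framing the target area."""
    xs = [int(x) for x, _ in corners]
    ys = [int(y) for _, y in corners]
    # Connecting the end points in turn encloses the bounding extent below.
    return frame[min(ys):max(ys) + 1, min(xs):max(xs) + 1]
```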
  • Step 350 Input the target region image into the coding network to obtain the semantic features of the target region image output by the coding network; the coding network is the part of the image classification network used to extract image features; the image classification network is a machine learning network trained on the first training image and the image category of the first training image.
  • In the embodiment of the present application, the coding network is used only to obtain the semantic features of the target area image input into it, without classifying the target area image. The coding network can perform dimensionality-reduction processing on the input target area image, and the reduced-dimensionality image data of the target area is used as the input of the database and the matching network for subsequent sample matching.
  • FIG. 5 shows a schematic structural diagram of a coding network provided by an exemplary embodiment of the present application.
  • As shown in FIG. 5, the coding network may include a fully convolutional network 510 and a global pooling layer 520. The fully convolutional network 510 is used to parse the target region image input into the coding network into high-dimensional semantic features, which can be expressed as a feature map of size H*W*K, where H corresponds to the height of the feature map, W to its width, and K to the number of feature maps. The feature map is subsequently imported into the global pooling layer 520, which reduces the dimensionality of the high-dimensional semantic features to obtain a 1*K-dimensional semantic feature vector, facilitating subsequent matching of semantic features. A minimal sketch follows.
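  • The following is a minimal sketch, assuming PyTorch, of the coding network of FIG. 5: a fully convolutional backbone producing an H*W*K feature map, followed by global average pooling down to a 1*K semantic feature vector. The layer widths and K=256 are illustrative; only the FCN-plus-global-pooling shape comes from the text.

```python
import torch
import torch.nn as nn

class CodingNetwork(nn.Module):
    def __init__(self, k=256):
        super().__init__()
        self.fcn = nn.Sequential(                 # fully convolutional part (510)
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, k, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)       # global pooling layer (520)

    def forward(self, x):                         # x: [B, 3, H, W]
        fmap = self.fcn(x)                        # [B, K, H', W'] feature maps
        return self.pool(fmap).flatten(1)         # [B, K] semantic feature vector
```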
  • In a possible implementation, the database can preliminarily filter the image samples it stores based on the semantic features acquired by the coding network, to obtain samples whose semantic features are similar to those of the target area image; this reduces meaningless comparisons in the subsequent matching process and reduces the working pressure on the matching network. For example, the database can obtain the image type corresponding to the above target area image and, based on the image type, filter out the semantic features corresponding to image samples of the same image type; the semantic features of the target area image then only need to be matched against the filtered image samples, rather than against the semantic features corresponding to all the image samples in the database. The above image type may indicate, for example, the type of organ in the image.
  • The database is used to store the K-dimensional semantic features corresponding to the original samples, and it also stores related information that can be traced back to the original pictures. The storage of the K-dimensional semantic features is specially organized so that the samples in the database can be preliminarily filtered based on the input target area image. The K-dimensional semantic features of the original samples stored in the database are obtained through the same coding network that extracts the semantic features of the target area image.
  • Step 360 Input the semantic features of the target area image and the semantic features of each image sample into the matching network to obtain the matching scores between the target area image and the respective image samples output by the matching network; the matching network is obtained by training on semantic feature pairs marked with matching labels, where each semantic feature pair contains the semantic features of two images and the matching label is used to indicate whether the corresponding semantic feature pair matches.
  • The matching network can be composed of a two-branch-input similarity measurement network (Siamese network) that evaluates the matching relationship between the two samples input into it; this matching relationship can be the degree of similarity between the two, or the spatial distance between them. In a possible implementation, while the semantic features of the target area image are input, the database also inputs the semantic features of the image samples of the corresponding type into the matching network; in the matching network, the semantic features of the target area image are then matched in turn against the semantic features of each image sample selected from the database. The matching network can output the matching result based on the degree of matching between the semantic features of the target area and those of the image sample; the matching result may be a score of the matching relationship between the two, and the relationship score can take various forms, such as the Euclidean distance or the cosine similarity, which is not limited in this application.
  • In the training phase, the matching network can be trained by inputting semantic feature pairs marked with matching labels into the model training device. Each semantic feature pair contains two semantic features, and the matching label is used to indicate whether the corresponding semantic feature pair matches; that is to say, several semantic feature pairs are input into the matching network, and each semantic feature pair corresponds to a matching label. The model training device calculates a loss function based on the output result of the matching network and the matching labels, and adjusts the parameters in the matching network based on the calculation result of the loss function, so that the output of the trained matching network indicates the matching label as closely as possible.
  • FIG. 6 shows a schematic diagram of training the matching network provided by an exemplary embodiment of the present application. As shown in FIG. 6, the matching network simultaneously takes as input the K-dimensional semantic feature 610 of endoscopic image 1 and the K-dimensional semantic feature 620 of endoscopic image 2, transforms them to M dimensions through multiple fully connected layers with non-linear activation functions, and calculates the relationship score between the two, recorded as D. If the relationship score between the semantic feature of the target area image and the semantic feature of the image sample in the matching network is defined as the Euclidean distance, the loss function can take the standard contrastive form for a Siamese network with Euclidean distance:

    L = y * D^2 + (1 - y) * max(0, m - D)^2

    where y indicates whether the two semantic features match, and m represents a smoothing (margin) parameter used to suppress the relationship score of non-matching pairs.
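  • The following is a minimal sketch, assuming PyTorch, of the two-branch matching network of FIG. 6 together with the contrastive loss above: both K-dimensional semantic features pass through the same fully connected stack down to M dimensions, and the relationship score D is their Euclidean distance. K=256, M=64, and the margin value are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MatchingNetwork(nn.Module):
    def __init__(self, k=256, m_dim=64):
        super().__init__()
        self.branch = nn.Sequential(              # shared weights for both inputs
            nn.Linear(k, 128), nn.ReLU(),
            nn.Linear(128, m_dim),
        )

    def forward(self, feat1, feat2):
        e1, e2 = self.branch(feat1), self.branch(feat2)
        return F.pairwise_distance(e1, e2)        # relationship score D

def contrastive_loss(d, label, margin=1.0):
    """label = 1 for matching pairs, 0 otherwise; margin suppresses D for non-matches."""
    return (label * d.pow(2) +
            (1 - label) * F.relu(margin - d).pow(2)).mean()
```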
  • Step 370 Determine the target image sample based on the matching scores between the target area image and each image sample.
  • In one case, the computer device sorts the image samples according to the corresponding matching score from high to low, and the top n image samples are used as the target image samples, where n ≥ 1 and n is an integer. In another case, the computer device uses, among the image samples, those whose corresponding matching score is higher than a matching score threshold as the target image samples. In yet another case, the computer device sorts the image samples according to the corresponding matching score from high to low and, among the top n image samples, uses those whose matching score is higher than the matching score threshold as the target image samples.
  • The matching score between the target area image and an image sample is used to indicate the degree of similarity between them: the higher the matching score, the higher the degree of similarity between the target area image and the image sample. Therefore, after the computer device sorts the image samples in order of matching score from high to low, the higher an image sample ranks, the more similar it is to the target area image, and the top n image samples can be selected as the target image samples. However, the same matching score may correspond to multiple image samples, and taking only the top n may fail to cover all the image samples with the top matching scores; the computer device can therefore set a matching score threshold and use all image samples above the threshold as target image samples. Alternatively, the computer device may first sort the image samples by matching score from high to low, and then filter the top n image samples with the matching score threshold, thereby obtaining the image samples that both rank at the top and exceed the matching score threshold. A minimal sketch of the three strategies follows.
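  • The following is a minimal sketch of the three selection strategies above. The input is a list of (sample_id, matching_score) pairs; all names and default values are illustrative.

```python
def select_targets(scored, n=5, threshold=None):
    """Pick target image samples by rank, by score threshold, or by both."""
    ranked = sorted(scored, key=lambda x: x[1], reverse=True)  # high to low
    if threshold is None:
        return ranked[:n]                                  # top-n only
    if n is None:
        return [s for s in ranked if s[1] > threshold]     # threshold only
    return [s for s in ranked[:n] if s[1] > threshold]     # top-n AND threshold
```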
  • Step 380 Obtain a matching result based on the target image sample.
  • Optionally, the matching result includes at least one of the following: the target image sample, the image category of the target image sample, and the matching degree between the target image sample and the target area image. The computer device can obtain the above information based on the related information, stored in the database, that can be traced back to the image sample.
  • Step 390 The endoscopic image and the matching result are displayed on the endoscopic image display interface.
  • Optionally, in the endoscopic image display interface, an area mark is displayed corresponding to the endoscopic image; the area mark is used to indicate the target area in the endoscopic image.
  • FIG. 7 shows an interface diagram of an endoscopic image display interface according to an exemplary embodiment of the present application.
  • As shown in FIG. 7, an endoscopic image is displayed in area 710, an area mark 711 is displayed on the endoscopic image, and the matching result is displayed in area 720.
  • the matching result may include the target image sample 721, the image category 722 of the target image sample, and the degree of matching 723 between the target image sample and the target area image.
  • at the area mark 711 in the endoscopic image, relevant information about the target image sample with the highest degree of matching in the matching result may be correspondingly displayed, for example, the image category corresponding to the target image sample and its degree of matching with the endoscopic image.
  • In summary, the endoscopic image display method acquires the endoscopic image collected by the endoscope; locates the target area image in the endoscopic image and inputs the target area image into the coding network to obtain the semantic features of the target area image output by the coding network; matches the semantic features of the target area image with the semantic features of each image sample to obtain the matching result; and displays the endoscopic image and the matching result on the endoscopic image display interface. During use of the endoscope, locating and matching lesion images in the endoscopic image in this way improves the accuracy of endoscope-assisted diagnosis.
  • FIG. 8 shows a schematic diagram of an endoscopic image recognition process according to an exemplary embodiment of the present application.
  • As shown in FIG. 8, the user passes the endoscope into the stomach through a natural orifice of the human body or a small surgical incision, performs endoscopic image collection, and inputs the collected endoscopic images into the computer device; because the stomach may contain food residues and other factors that affect the image quality of the endoscopic images, the computer device needs to perform low-quality image filtering 810 on the endoscopic images to retain high-quality endoscopic images for further processing.
  • the computer device preprocesses the filtered high-quality endoscopic images, resizes them to a size that meets the requirements of the image mode classification network, and then starts image type recognition 820; the image type recognition process can rely on the image mode classification network to filter out NBI-mode endoscopic images from the input images, followed by suspected lesion locating 830 on the NBI-mode endoscopic images; the locating process can rely on the target area positioning network to locate the suspected lesion in the input endoscopic image and obtain the area coordinates of the lesion area corresponding to the suspected lesion; similar-data retrieval and analysis in the database 840 is then performed for the suspected lesion area, a process that can rely on the coding network and the matching network to encode the input endoscopic image, obtain its semantic features, and match them against the preliminarily screened semantic features of the sample images stored in the database, so as to obtain sample images matching the image of the suspected lesion area and thereby obtain relevant information about the suspected lesion, such as the lesion type and the degree of matching with the sample images.
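  • For illustration, the four stages of FIG. 8 chain into a single routine as in the following sketch, which assumes each stage is available as a callable; the function and method names are hypothetical:

```python
def recognize_frame(frame, quality_filter, mode_classifier,
                    lesion_locator, retrieval_system):
    """One pass of the FIG. 8 pipeline over a single endoscopic frame."""
    if not quality_filter(frame):          # 810: drop low-quality frames
        return None
    if mode_classifier(frame) != "NBI":    # 820: keep NBI-mode images only
        return None
    regions = lesion_locator(frame)        # 830: suspected lesion boxes
    results = []
    for box in regions:
        crop = frame.crop(box)             # cut out the suspected lesion area
        results.append(retrieval_system.match(crop))  # 840: in-library retrieval
    return results
```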
  • FIG. 9 shows a flowchart of an endoscopic image display method according to an exemplary embodiment of the present application.
  • the endoscopic image display method can be used in a computer device, such as the image recognition device shown in FIG. 1 described above. As shown in FIG. 9, the endoscopic image display method may include the following steps:
  • Step 910 Obtain an endoscopic image collected by an endoscope.
  • While the endoscope collects endoscopic images, the endoscopic image display interface can display the endoscopic image in real time, and in this interface the user can perform user operations on the displayed endoscopic image.
  • the user operation may include, but is not limited to, a zoom-in operation, a zoom-out operation, and a frame selection operation.
  • the computer device can obtain the endoscopic image collected by the endoscope, and can also obtain the user operation generated by the user through the interface interaction.
  • Step 920 Receive a frame selection operation performed by the user in the endoscopic image.
  • the frame selection operation performed by the user in the endoscopic image may be the user selecting a partial region of the endoscopic image through an external device such as a mouse, or the user interacting directly with the endoscope display interface to select a partial region of the endoscope interface.
  • Step 930 Acquire an image of an area corresponding to the frame selection operation in the endoscopic image as a target area image.
  • In response to the frame selection operation, a selection frame may be displayed in the active area of the operation to indicate that the area is the selected area, and the image located in the selected area is acquired as the target area image.
  • Optionally, in the NBI mode, the user can perform the frame selection operation, or, in the NBI mode, the image corresponding to the selected area can be acquired as the target area image.
  • Optionally, after the target area image is obtained through the frame selection operation, the image quality information of the target area image can be acquired.
  • the image quality information includes at least one of blur, exposure and tone abnormality, and effective resolution;
  • in response to the image quality information meeting the image quality threshold, the computer device performs the step of inputting the target area image into the coding network, so that the target area image processed by the coding network is of higher image quality, reducing the impact of low-quality images on the subsequent recognition and matching process and avoiding unnecessary workload.
  • Step 940 Input the target area image into the coding network to obtain the semantic features of the target area image output by the coding network; the coding network is the part of the image classification network used to extract image features; the image classification network is a machine learning network trained with the first training image and the image category of the first training image.
  • Step 950 Match the semantic feature of the target area image with the semantic feature of each image sample to obtain a matching result; the matching result is used to indicate the target image sample matching the target area image in each image sample.
  • FIG. 10 shows a schematic diagram of an endoscopic image retrieval system according to an exemplary embodiment of the present application.
  • As shown in FIG. 10, the endoscopic image retrieval system may include an encoding network 1010, a database 1020, and a matching network 1030.
  • When a target area image is input into the system, the endoscopic image is first passed through the encoding network 1010 to obtain the K-dimensional semantic feature corresponding to the endoscopic image, and the K-dimensional semantic feature is used as the input to the database 1020 and the matching network 1030.
  • the K-dimensional semantic feature is a semantic feature after dimensionality reduction.
  • the database 1020 performs a preliminary screening of the image samples in the database based on the K-dimensional semantic feature, to obtain the semantic features of image samples close to the semantic features of the endoscopic image, and inputs these semantic features into the matching network 1030.
  • the matching network 1030 matches the semantic features of the endoscopic image input by the encoding network 1010 against the semantic features of the image samples input by the database 1020, scores the matching relationship between the semantic features of the endoscopic image and the semantic features of each image sample, and obtains the scoring results.
  • the computer device can sort the scoring results by score and, according to certain determination rules, obtain the target image sample from among the image samples corresponding to the semantic features input by the database 1020.
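  • Putting the three components of FIG. 10 together, one retrieval query could look like the following sketch; the class and method names (encoder, database.prefilter, matcher) are assumptions for illustration:

```python
def retrieve(target_image, encoder, database, matcher, top_n=5):
    """FIG. 10 flow: encode -> preliminary screening in the database ->
    pairwise matching -> rank by score and return target image samples."""
    query = encoder(target_image)                   # 1*K semantic feature
    candidates = database.prefilter(query)          # coarse screening
    scores = [(s, matcher(query, s.feature)) for s in candidates]
    scores.sort(key=lambda x: x[1], reverse=True)   # sort by score value
    return scores[:top_n]                           # determination rule: top n
```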
  • Step 960 The endoscopic image and the matching result are displayed on the endoscopic image display interface.
  • the endoscopic image displayed on the endoscopic image display interface is the endoscopic image corresponding to the time of the user's frame selection operation, and the matching result is the matching result corresponding to the image in the area selected by the user's frame selection operation.
  • In summary, the endoscopic image display method acquires the endoscopic image collected by the endoscope; locates the target area image in the endoscopic image and inputs it into the coding network to obtain the semantic features of the target area image output by the coding network; matches the semantic features of the target area image with the semantic features of each image sample to obtain the matching result; and displays the endoscopic image and the matching result on the endoscopic image display interface. During use of the endoscope, locating and matching lesion images in the endoscopic image in this way improves the accuracy of endoscope-assisted diagnosis.
  • FIG. 11 shows a flowchart of an endoscopic image display method shown in an exemplary embodiment of the present application.
  • the endoscopic image display method can be used in a computer device, such as the image display device shown in FIG. 1 above.
  • As shown in FIG. 11, the endoscopic image display method may include the following steps:
  • Step 1111 Display a first endoscopic image on the endoscopic image display interface, where the first endoscopic image is an image collected by the endoscope in the white light mode.
  • During use of the endoscope, the user first collects images of the organ in the white light mode to obtain a global image of the organ.
  • When the user finds a suspected lesion area in the collected images, the shooting mode of the endoscope can be switched to the NBI mode.
  • In the NBI mode, the user can observe blood vessel flow, and blood appears black in images collected in the NBI mode.
  • In the NBI mode, the epithelial morphology of the digestive tract mucosa can also be accurately observed, which is convenient for users to observe and judge the lesion area.
  • Step 1120 In response to the shooting mode of the endoscope being switched to the NBI mode, a second endoscopic image is displayed on the endoscopic image display interface, the second endoscopic image being an image collected by the endoscope in the NBI mode.
  • Step 1130 In the endoscopic image display interface, a matching result is displayed corresponding to the second endoscopic image, and the matching result is used to indicate a target image sample that matches the target region image in the second endoscopic image.
  • the above-mentioned matching result of the second endoscopic image is the target image sample matching the second endoscopic image, and other related information, obtained after the image recognition device performs image recognition and matching on the second endoscopic image.
  • For the process of recognizing the second endoscopic image, reference may be made to the related content in the endoscopic image display methods shown in any of the above-mentioned embodiments of FIG. 2, FIG. 3, or FIG. 8.
  • In summary, the endoscopic image display method displays the image collected by the endoscope in the white light mode on the endoscopic image display interface; in response to the endoscope's shooting mode being switched to the NBI mode, displays the image collected by the endoscope in the NBI mode on the endoscopic image display interface; and, on the endoscopic image display interface, displays the matching result corresponding to the second endoscopic image. Locating and matching lesion images in the endoscopic image improves the accuracy of endoscope-assisted diagnosis.
  • Fig. 12 is a structural block diagram of an endoscopic image display device according to an exemplary embodiment.
  • the endoscopic image display device can be used in a computer device.
  • the computer device can be the image recognition device shown in FIG. 1, to execute all or part of the steps of the method shown in any one of the embodiments of FIG. 2, FIG. 3, or FIG. 9.
  • the endoscopic image display device may include:
  • the endoscopic image acquisition module 1210 is used to acquire the endoscopic image collected by the endoscope
  • the area image positioning module 1220 is used to locate a target area image from the endoscopic image, where the target area image is a partial image of the endoscope image that includes the target area;
  • the semantic feature extraction module 1230 is used to input the target area image into the coding network to obtain the semantic feature of the target area image output by the coding network;
  • the coding network is the network part used to extract image features in the image classification network;
  • the image classification network It is a machine learning network trained through the first training image and the image category of the first training image;
  • the matching module 1240 is used to match the semantic feature of the target area image with the semantic feature of each image sample to obtain a matching result; the matching result is used to indicate the target image sample in each image sample that matches the target area image;
  • the first display module 1250 is used for displaying the endoscopic image and the matching result on the endoscopic image display interface.
  • the matching module 1240 includes:
  • the matching score acquisition sub-module is used to input the semantic features of the target area image and the semantic features of each image sample into the matching network, and to obtain the matching scores, output by the matching network, between the target area image and each image sample;
  • the matching network is obtained by training on semantic feature pairs marked with matching labels.
  • each semantic feature pair contains the semantic features of two images, and the matching label is used to indicate whether the corresponding semantic feature pair matches;
  • the image sample determination sub-module is used to determine the target image sample based on the matching scores between the target area image and each image sample;
  • the matching result obtaining sub-module is used to obtain the matching result based on the target image sample.
  • the image sample determination sub-module is configured to:
  • sort the image samples in descending order of the corresponding matching scores and use the top n image samples as the target image samples, where n ≥ 1 and n is an integer;
  • or, use, among the image samples, those whose matching score is higher than a matching score threshold as the target image samples;
  • or, sort the image samples in descending order of the corresponding matching scores and, among the top n image samples, use those whose matching score is higher than the matching score threshold as the target image samples, where n ≥ 1 and n is an integer.
  • the matching result includes at least one of the following: the target image sample; the image category corresponding to the target image sample; and the degree of matching between the target image sample and the target area image.
  • the device further includes:
  • the second display module is used to display, on the endoscopic image display interface, an area mark corresponding to the endoscopic image; the area mark is used to indicate the target area in the endoscopic image.
  • the regional image positioning module 1220 includes:
  • the area coordinate acquisition sub-module is used to input the endoscopic image into the target area positioning network to obtain the area coordinates output by the target area positioning network;
  • the target area positioning network is a machine learning network trained with the second training image;
  • the second training image is marked with a target area;
  • the first area image acquisition sub-module is used to acquire the image corresponding to the area coordinates in the endoscopic image as the target area image.
  • the regional image positioning module 1220 includes:
  • the user operation receiving sub-module is used to receive the frame selection operation performed by the user in the endoscopic image
  • the second area image acquisition sub-module is used to acquire the image of the area corresponding to the frame selection operation in the endoscopic image as the target area image.
  • the area image positioning module 1220 is configured to perform the step of locating the target area image from the endoscopic image in response to the image mode of the endoscopic image being the NBI mode.
  • the device further includes:
  • the image mode information acquisition module is used to input the endoscopic image into the image mode classification network to obtain the image mode information output by the image mode classification network; the image mode classification network is a machine learning network trained with the third training image, and the third training image is marked with an image mode; the image mode information indicates whether the image mode of the endoscopic image is the NBI mode.
  • the device further includes:
  • the working state acquisition module is used to acquire the working state of the endoscope when the endoscope collects the endoscopic image;
  • in response to the working state being the NBI state, the image mode of the endoscopic image is determined to be the NBI mode.
  • the device further includes:
  • the image quality information acquisition module is used to acquire the image quality information of the endoscopic image, the image quality information includes at least one of blur, exposure and tone abnormality, and effective resolution;
  • the area image positioning module 1220 is configured to perform the step of locating the target area image from the endoscopic image in response to the image quality information meeting the image quality threshold.
  • In summary, the endoscopic image display device provided by the present application, applied in a computer device, acquires the endoscopic image collected by the endoscope; locates the target area image in the endoscopic image and inputs the target area image into the coding network to obtain the semantic features of the target area image output by the coding network; matches the semantic features of the target area image with the semantic features of each image sample to obtain the matching result; and displays the endoscopic image and the matching result on the endoscopic image display interface. During use of the endoscope, locating and matching lesion images in the endoscopic image improves the accuracy of endoscope-assisted diagnosis.
  • Fig. 13 is a structural block diagram of an endoscopic image display device according to an exemplary embodiment.
  • the endoscopic image display device may be used in a computer device.
  • the computer device may be the image display device shown in FIG. 1 to perform all or part of the steps of the method shown in FIG. 11.
  • the endoscopic image display device may include:
  • the first display module 1310 is configured to display a first endoscopic image in the endoscopic image display interface, where the first endoscopic image is an image collected by the endoscope in the white light mode;
  • the second display module 1320 is configured to display, in response to the shooting mode of the endoscope being switched to the NBI mode, a second endoscopic image on the endoscopic image display interface, the second endoscopic image being an image collected by the endoscope in the NBI mode;
  • the third display module 1330 is used to display, on the endoscopic image display interface, the matching result corresponding to the second endoscopic image, the matching result being used to indicate the target image sample that matches the target area image in the second endoscopic image.
  • In summary, the endoscopic image display device provided by the present application, applied in a computer device, displays the image collected by the endoscope in the white light mode on the endoscope image display interface; in response to the shooting mode of the endoscope being switched to the NBI mode, displays the image collected by the endoscope in the NBI mode on the endoscope image display interface; and, on the endoscope image display interface, displays the matching result corresponding to the second endoscopic image.
  • Fig. 14 is a structural block diagram showing a computer device 1400 according to an exemplary embodiment.
  • the computer device can be implemented as the image recognition device or the image display device in the above solution of this application.
  • the computer device 1400 includes a central processing unit (CPU) 1401, a system memory 1404 including a random access memory (RAM) 1402 and a read-only memory (ROM) 1403, and a system bus 1405 connecting the system memory 1404 and the central processing unit 1401.
  • the computer device 1400 also includes a basic input/output system (I/O system) 1406 that helps transfer information between the various components within the computer, and a mass storage device 1409 for storing an operating system 1413, application programs 1414, and other program modules 1415.
  • the basic input/output system 1406 includes a display 1408 for displaying information and an input device 1407 such as a mouse and a keyboard for the user to input information.
  • the display 1408 and the input device 1407 are both connected to the central processing unit 1401 through the input and output controller 1410 connected to the system bus 1405.
  • the basic input/output system 1406 may also include an input and output controller 1410 for receiving and processing input from multiple other devices such as a keyboard, a mouse, or an electronic stylus.
  • the input and output controller 1410 also provides output to a display screen, a printer, or other types of output devices.
  • the mass storage device 1409 is connected to the central processing unit 1401 through a mass storage controller (not shown) connected to the system bus 1405.
  • the mass storage device 1409 and its associated computer-readable medium provide non-volatile storage for the computer device 1400. That is, the mass storage device 1409 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM (Compact Disc Read-Only Memory) drive.
  • the computer-readable media may include computer storage media and communication media.
  • Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storing information such as computer readable instructions, data structures, program modules or other data.
  • Computer storage media include RAM, ROM, Erasable Programmable Read-Only Memory (EPROM), Electrically-Erasable Programmable Read-Only Memory (EEPROM), flash memory or other solid-state storage technologies; CD-ROM, Digital Versatile Disc (DVD) or other optical storage; and tape cartridges, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will know that computer storage media are not limited to the above. The system memory 1404 and the mass storage device 1409 may be collectively referred to as memory.
  • According to various embodiments of the present application, the computer device 1400 may also run through a remote computer connected to a network such as the Internet. That is, the computer device 1400 can be connected to the network 1412 through the network interface unit 1411 connected to the system bus 1405; in other words, the network interface unit 1411 can also be used to connect to other types of networks or remote computer systems (not shown).
  • the memory also includes one or more programs, which are stored in the memory; the central processing unit 1401 executes the one or more programs to implement all or part of the steps of the method shown in FIG. 2, FIG. 3, FIG. 9, or FIG. 11.
  • the functions described in the embodiments of the present application may be implemented by hardware, software, firmware, or any combination thereof.
  • these functions can be stored in a computer-readable medium or transmitted as one or more instructions or codes on the computer-readable medium.
  • the computer-readable medium includes a computer storage medium and a communication medium, where the communication medium includes any medium that facilitates the transfer of a computer program from one place to another.
  • the storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
  • the embodiments of the present application also provide a computer-readable storage medium for storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the above-mentioned endoscopic image display method.
  • the computer-readable storage medium may be ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Data Mining & Analysis (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Pathology (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Endoscopes (AREA)
  • Image Analysis (AREA)

Abstract

The present application relates to an endoscopic image display method and apparatus, a computer device, and a storage medium, in the field of machine learning. The method includes: acquiring an endoscopic image collected by an endoscope; locating a target area image in the endoscopic image; inputting the target area image into a coding network to obtain semantic features of the target area image output by the coding network; matching the semantic features of the target area image against the semantic features of each image sample to obtain a matching result; and displaying the endoscopic image and the matching result on an endoscopic image display interface. Based on artificial intelligence, the method locates and matches lesion images in the endoscopic image during use of the endoscope, improving the accuracy of endoscope-assisted diagnosis.

Description

Endoscopic image display method and apparatus, computer device, and storage medium
This application claims priority to Chinese Patent Application No. 202010067143.X, entitled "Endoscopic image display method and apparatus, computer device, and storage medium", filed on January 20, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of the present application relate to the field of machine learning, and in particular to an endoscopic image display method and apparatus, a computer device, and a storage medium.
Background
With the continuous development of medical technology, endoscopes (such as gastroscopes and colonoscopes) have gradually become an important means of assisting doctors in diagnosing digestive tract diseases.
In the related art, an operator of endoscopic equipment (such as a doctor or nurse) steers the lens of the endoscope into the patient's digestive tract; the lens captures images of the digestive tract inside the patient's body in real time and displays them on the display interface of an external monitor, from which the doctor makes a preliminary diagnosis of the patient's digestive tract disease.
However, endoscope-assisted diagnosis places high demands on the doctor's experience. At present, many doctors lack the ability to accurately diagnose digestive tract diseases by means of an endoscope, resulting in low accuracy of endoscope-assisted diagnosis.
Summary
The embodiments of the present application provide an endoscopic image display method and apparatus, a computer device, and a storage medium, which can improve the accuracy of endoscope-assisted diagnosis. The technical solutions are as follows:
In a first aspect, an endoscopic image display method is provided, the method being executed by a computer device and including:
acquiring an endoscopic image collected by an endoscope;
locating a target area image in the endoscopic image, the target area image being a partial image of the endoscopic image that contains a target area;
inputting the target area image into a coding network to obtain semantic features of the target area image output by the coding network; the coding network being the part of an image classification network used to extract image features; the image classification network being a machine learning network trained with a first training image and the image category of the first training image;
matching the semantic features of the target area image against the semantic features of each image sample to obtain a matching result, the matching result indicating a target image sample that matches the target area image among the image samples; and
displaying the endoscopic image and the matching result on an endoscopic image display interface.
In a second aspect, an endoscopic image display method is provided, the method being executed by a computer device and including:
displaying a first endoscopic image on an endoscopic image display interface, the first endoscopic image being an image collected by an endoscope in a white light mode;
in response to the shooting mode of the endoscope being switched to a narrow band imaging (NBI) mode, displaying a second endoscopic image on the endoscopic image display interface, the second endoscopic image being an image collected by the endoscope in the NBI mode; and
displaying, on the endoscopic image display interface, a matching result corresponding to the second endoscopic image, the matching result indicating a target image sample that matches a target area image in the second endoscopic image.
In a third aspect, an endoscopic image display apparatus is provided, the apparatus being used in a computer device and including:
an endoscopic image acquisition module, configured to acquire an endoscopic image collected by an endoscope;
an area image locating module, configured to locate a target area image in the endoscopic image, the target area image being a partial image of the endoscopic image that contains a target area;
a semantic feature extraction module, configured to input the target area image into a coding network to obtain semantic features of the target area image output by the coding network; the coding network being the part of an image classification network used to extract image features; the image classification network being a machine learning network trained with a first training image and the image category of the first training image;
a matching module, configured to match the semantic features of the target area image against the semantic features of each image sample to obtain a matching result, the matching result indicating a target image sample that matches the target area image among the image samples; and
a display module, configured to display the endoscopic image and the matching result on an endoscopic image display interface.
Optionally, the matching module includes:
a matching score acquisition sub-module, configured to input the semantic features of the target area image and the semantic features of each image sample into a matching network to obtain matching scores, output by the matching network, between the target area image and each image sample; the matching network being trained with semantic feature pairs marked with matching labels, each semantic feature pair containing the semantic features of two images, the matching label indicating whether the corresponding semantic feature pair matches;
an image sample determination sub-module, configured to determine the target image sample based on the matching scores between the target area image and each image sample; and
a matching result acquisition sub-module, configured to obtain the matching result based on the target image sample.
Optionally, the image sample determination sub-module is configured to:
sort the image samples in descending order of the corresponding matching scores and use the top n image samples as the target image samples, where n ≥ 1 and n is an integer;
or, use, among the image samples, those whose matching score is higher than a matching score threshold as the target image samples;
or, sort the image samples in descending order of the corresponding matching scores and, among the top n image samples, use those whose matching score is higher than the matching score threshold as the target image samples.
Optionally, the matching result includes at least one of the following:
the target image sample;
the image category corresponding to the target image sample;
and the degree of matching between the target image sample and the target area image.
Optionally, the apparatus further includes:
a second display module, configured to display, on the endoscopic image display interface, an area mark corresponding to the endoscopic image, the area mark indicating the target area in the endoscopic image.
Optionally, the area image locating module includes:
an area coordinate acquisition sub-module, configured to input the endoscopic image into a target area locating network to obtain area coordinates output by the target area locating network; the target area locating network being a machine learning network trained with a second training image; the second training image being marked with a target area; and
a first area image acquisition sub-module, configured to acquire the image corresponding to the area coordinates in the endoscopic image as the target area image.
Optionally, the area image locating module includes:
a user operation receiving sub-module, configured to receive a frame selection operation performed by the user in the endoscopic image; and
a second area image acquisition sub-module, configured to acquire the image of the area corresponding to the frame selection operation in the endoscopic image as the target area image.
Optionally, the area image locating module is configured to perform the step of locating the target area image in the endoscopic image in response to the image mode of the endoscopic image being the narrow band imaging (NBI) mode.
Optionally, the apparatus further includes:
an image mode information acquisition module, configured to input the endoscopic image into an image mode classification network to obtain image mode information output by the image mode classification network; the image mode classification network being a machine learning network trained with a third training image, the third training image being marked with an image mode; the image mode information indicating whether the image mode of the endoscopic image is the NBI mode.
Optionally, the apparatus further includes:
a working state acquisition module, configured to acquire the working state of the endoscope when it collects the endoscopic image;
in response to the working state being the NBI state, the image mode of the endoscopic image is determined to be the NBI mode.
Optionally, the apparatus further includes:
an image quality information acquisition module, configured to acquire image quality information of the endoscopic image, the image quality information including at least one of blur, exposure and tone abnormality, and effective resolution; and
the area image locating module, configured to perform the step of locating the target area image in the endoscopic image in response to the image quality information meeting an image quality threshold.
In a fourth aspect, an endoscopic image display apparatus is provided, the apparatus being used in a computer device and including:
a first display module, configured to display a first endoscopic image on an endoscopic image display interface, the first endoscopic image being an image collected by an endoscope in a white light mode;
a second display module, configured to display, in response to the shooting mode of the endoscope being switched to the narrow band imaging (NBI) mode, a second endoscopic image on the endoscopic image display interface, the second endoscopic image being an image collected by the endoscope in the NBI mode; and
a third display module, configured to display, on the endoscopic image display interface, a matching result corresponding to the second endoscopic image, the matching result indicating a target image sample that matches a target area image in the second endoscopic image.
In a fifth aspect, a computer device is provided, including a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the above endoscopic image display method.
In a sixth aspect, a computer-readable storage medium is provided, storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the above endoscopic image display method.
The technical solutions provided by the present application may include the following beneficial effects:
An endoscopic image collected by an endoscope is acquired; a target area image is located in the endoscopic image and input into a coding network to obtain semantic features of the target area image output by the coding network; the semantic features of the target area image are matched against the semantic features of each image sample to obtain a matching result; and the endoscopic image and the matching result are displayed on an endoscopic image display interface. In this way, during use of the endoscope, locating and matching lesion images in the endoscopic image improves the accuracy of endoscope-assisted diagnosis.
It should be understood that the above general description and the following detailed description are exemplary and explanatory only and do not limit the present application.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the specification, serve to explain the principles of the present application.
FIG. 1 is a framework diagram of image recognition and image display according to an exemplary embodiment;
FIG. 2 is a flowchart of an endoscopic image display method according to an exemplary embodiment of the present application;
FIG. 3 is a flowchart of an endoscopic image display method according to an exemplary embodiment of the present application;
FIG. 4 is a schematic diagram of the effective pixel area according to an exemplary embodiment of the present application;
FIG. 5 is a schematic structural diagram of a coding network according to an exemplary embodiment of the present application;
FIG. 6 is a schematic diagram of training a matching network according to an exemplary embodiment of the present application;
FIG. 7 is an interface diagram of an endoscopic image display interface according to an exemplary embodiment of the present application;
FIG. 8 is a schematic diagram of an endoscopic image recognition process according to an exemplary embodiment of the present application;
FIG. 9 is a flowchart of an endoscopic image display method according to an exemplary embodiment of the present application;
FIG. 10 is a schematic diagram of an endoscopic image retrieval system according to an exemplary embodiment of the present application;
FIG. 11 is a flowchart of an endoscopic image display method according to an exemplary embodiment of the present application;
FIG. 12 is a structural block diagram of an endoscopic image display apparatus according to an exemplary embodiment;
FIG. 13 is a structural block diagram of an endoscopic image display apparatus according to an exemplary embodiment;
FIG. 14 is a structural block diagram of a computer device according to an exemplary embodiment.
Detailed Description
Exemplary embodiments will be described in detail here, examples of which are shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present application as detailed in the appended claims.
The embodiments of the present application propose an efficient and highly accurate endoscope-assisted diagnosis solution. By locating the image of the lesion area in the endoscopic image and matching it against existing image samples, the solution helps users and their doctors quickly identify possible digestive tract diseases (such as early gastric cancer). For ease of understanding, several terms involved in the embodiments of the present application are explained below.
1) Endoscope
In the present application, an endoscope refers to a commonly used medical instrument composed of a bendable section, a light source, and a lens. An endoscope can enter the human body through a natural orifice or through a small surgical incision; in use, the endoscope is guided into the organ to be examined, allowing direct observation of changes in the relevant site.
Commonly used endoscopes include gastroscopes and colonoscopes.
A gastroscope is an endoscope in which a slender, flexible tube is passed through the pharynx into the stomach, so that the lens at the head of the tube can capture images of the digestive tract inside the patient's body in real time. With a gastroscope, a doctor can directly observe lesions of the esophagus, stomach, and duodenum on an external monitor. Through gastroscopy, the doctor can directly observe the true condition of the examined site and can further clarify the diagnosis by performing biopsy and cytological examination of suspicious lesions; it is the examination method of choice for lesions of the upper digestive tract.
2) Lesion area
This usually refers to the part of the body where a lesion occurs; for digestive tract diseases, the lesion area is the diseased region of a digestive tract organ. For example, if protruding tissue grows on the surface of the gastric mucosa, the corresponding region of the gastric mucosa is the lesion area of gastric polyposis. As another example, if the duodenum digests its own mucosa under the attack of high gastric acid and pepsin, forming a local inflammatory defect, the region corresponding to the digested duodenal mucosal epithelium is the lesion area of a duodenal ulcer.
3) Narrow Band Imaging (NBI)
NBI, also known as narrow band imaging endoscopy, is an emerging endoscopic technique that uses filters to remove the broadband spectrum from the red, blue, and green light emitted by the endoscope light source, leaving only a narrowband spectrum for diagnosing various digestive tract diseases. Images in NBI mode allow accurate observation not only of the epithelial morphology of the digestive tract mucosa, such as the epithelial pit pattern, but also of the morphology of the epithelial vascular network. This new technique helps endoscopists better distinguish gastrointestinal epithelium, changes in vascular morphology in gastrointestinal inflammation, and irregular pit changes in early gastrointestinal tumors, thereby improving the accuracy of endoscopic diagnosis.
4) Loss function
A loss function (or cost function) maps a random event, or the values of its related random variables, to non-negative real numbers representing the "risk" or "loss" of that event. In applications, the loss function is usually associated with an optimization problem as a learning criterion, that is, the model is solved and evaluated by minimizing the loss function.
5) Artificial Intelligence (AI)
Artificial intelligence is the theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, AI is a comprehensive technology of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a manner similar to human intelligence. AI studies the design principles and implementation methods of various intelligent machines, giving machines the functions of perception, reasoning, and decision-making.
AI technology is a comprehensive discipline covering a wide range of fields, involving both hardware-level and software-level technologies. Basic AI technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. AI software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning. The solutions provided in the embodiments of the present application mainly involve machine learning/deep learning technologies within AI.
6) Machine Learning (ML)
Machine learning is a multidisciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It studies how computers simulate or implement human learning behaviors to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve their own performance. Machine learning is the core of AI and the fundamental way to make computers intelligent; its applications span all fields of AI. Machine learning and deep learning usually include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
7) AI box
An AI box is a set of hardware and services deployed in a hospital; the AI engine service and the video capture card are both integrated in the AI box. The AI box can obtain frames from the real-time endoscope video stream and feed them to the AI engine service, which locates lesions in the real-time endoscope frames and instantly analyzes whether they are early cancer and the corresponding probability.
The solution of the embodiments of the present application includes a recognition stage and a display stage. FIG. 1 is a framework diagram of image recognition and image display according to an exemplary embodiment. As shown in FIG. 1, in the image recognition stage, the image recognition device 110 performs lesion recognition in real time based on the endoscopic images input from the endoscope device 120. The image recognition device 110 may be an AI box and may include a video capture module, an image recognition module, and an image output module. The video capture module obtains the image frames collected by the endoscope in real time and inputs them into the image recognition module; it may be implemented as the video capture card shown in FIG. 1. The image recognition module performs recognition processing on the image frames input into the image recognition device and obtains recognition results; it may be implemented as the AI engine and CPU server shown in FIG. 1. In one possible case, the AI engine and CPU server may be integrated in the AI box, or they may exchange information with the AI box in a wired or wireless manner. The image output module sends the recognition results obtained by the image recognition module to the image display device 130 for display; the image display device 130 may be a built-in display module of the image recognition device 110 or an external image display device connected to the image recognition device 110. In the image display stage, the image display device 130 displays the endoscopic image and the recognition result of the image recognition device 110 on the image display interface.
The image recognition device 110 may be a computer device with machine learning capability; the computer device may be a stationary computer device such as a personal computer, a server, or stationary medical equipment (such as the above AI box), or a mobile computer device such as a tablet computer, an e-book reader, or portable medical equipment.
Optionally, the image recognition device 110 and the image display device 130 may be the same device, or they may be different devices. When they are different devices, they may be of the same type, for example both may be personal computers; or they may be of different types, for example the image recognition device 110 may be an AI box while the image display device 130 may be stationary medical equipment, such as the Picture Archiving & Communication System (PACS) doctor report workstation shown in FIG. 1. The embodiments of the present application do not limit the specific types of the image recognition device 110 and the image display device 130.
The embodiments of the present application are described by taking the case where the image display device is an external device of the image recognition device (for example, an AI box) as an example.
Referring to FIG. 2, which shows a flowchart of an endoscopic image display method according to an exemplary embodiment of the present application, the method can be used in a computer device, for example the image recognition device shown in FIG. 1 above. As shown in FIG. 2, the endoscopic image display method may include the following steps:
Step 210: Acquire an endoscopic image collected by an endoscope.
The endoscopic image collected by the endoscope may be a white light image or an NBI image. An NBI image is an image collected after filters remove the broadband spectrum from the red, blue, and green light emitted by the endoscope light source during image collection. Whether the endoscope collects white light images or NBI images can be adjusted by the medical worker by changing the working mode of the endoscope: when the endoscope works in the white light mode, the collected images are white light images; when it works in the NBI mode, the collected images are NBI images.
Step 220: Locate a target area image in the endoscopic image, the target area image being a partial image of the endoscopic image that contains a target area.
When the endoscopic image shows a possible suspected lesion area, the computer device can take the image within a certain range around the suspected lesion area as the target area image and locate the position of the target area image in the endoscopic image.
Step 230: Input the target area image into a coding network to obtain semantic features of the target area image output by the coding network; the coding network is the part of an image classification network used to extract image features; the image classification network is a machine learning network trained with a first training image and the image category of the first training image.
The coding network may be a convolutional network (such as a fully convolutional network) used to extract image features, thereby obtaining the semantic features of the target area image input into it.
In the embodiments of the present application, the coding network may be part of an image classification network, and the image classification network may be composed of a coding network and a classification network. During training of the image classification network, the first training image and the image category of the first training image are input into the model training device; for example, a loss function may be computed between the output of the image classification network and the image category of the first training image, and the parameters of the image classification network are adjusted according to the loss, so that the output of the trained network indicates a category as close as possible to that of the first training image. The loss function constrains the relationship between the network output and the image category of the first training image.
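For illustration, a minimal sketch of this training loop in PyTorch, assuming a cross-entropy loss and Adam optimizer (choices not fixed by the present application):

```python
import torch.nn as nn
import torch.optim as optim

def train_classifier(image_classifier, loader, epochs=10):
    """image_classifier = coding network + classification head, trained on
    (first training image, image category) pairs so that its output
    approaches the labeled category."""
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(image_classifier.parameters(), lr=1e-4)
    for _ in range(epochs):
        for images, categories in loader:
            optimizer.zero_grad()
            loss = criterion(image_classifier(images), categories)
            loss.backward()     # loss between network output and image category
            optimizer.step()    # adjust parameters according to the loss
```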
In the embodiments of the present application, the first training image may be a marked region image in an endoscopic image, and the image category of the first training image may be the image type corresponding to that region image, that is, whether it is a target area, for example, whether it is a lesion area.
Step 240: Match the semantic features of the target area image against the semantic features of each image sample to obtain a matching result; the matching result is used to indicate a target image sample that matches the target area image among the image samples.
The semantic features of each image sample are semantic features obtained in advance by passing each image sample through the same coding model as the target area image.
Optionally, each image sample has corresponding lesion attributes, so the result of matching the semantic features of the target area image against those of the image samples indicates the lesion attributes that the target area image may correspond to.
Step 250: Display the endoscopic image and the matching result on an endoscopic image display interface.
The endoscopic image display interface may be the display screen of an external image display device connected to the computer device. It can display, in real time, the images collected by the endoscope, or images processed by the computer device, for example images marked with the position of the target image area. In the embodiments of the present application, the matching result can also be displayed on the endoscope display interface.
Optionally, the matching result may include the lesion attributes that the target image area may correspond to, obtained through the matching network, and/or the images of the image samples corresponding to the matching result.
In summary, the endoscopic image display method provided by the present application acquires an endoscopic image collected by an endoscope; locates a target area image in the endoscopic image; inputs the target area image into a coding network to obtain the semantic features of the target area image output by the coding network; matches the semantic features of the target area image against those of each image sample to obtain a matching result; and displays the endoscopic image and the matching result on the endoscopic image display interface. During use of the endoscope, locating and matching lesion images in the endoscopic image improves the accuracy of endoscope-assisted diagnosis.
Based on the endoscopic image display method shown in FIG. 2, when the step of locating the target area image in the endoscopic image is executed by the computer device, refer to FIG. 3, which shows a flowchart of an endoscopic image display method according to an exemplary embodiment of the present application. The method can be used in a computer device, such as the image recognition device shown in FIG. 1 above. As shown in FIG. 3, the endoscopic image display method may include the following steps:
Step 310: Acquire an endoscopic image collected by an endoscope.
For example, taking a computer device that includes an AI box and an AI engine: when video is captured through the video capture card in the AI box, the AI box obtains the frames of the real-time endoscope video stream and inputs them to the AI engine server connected to, or integrated in, the AI box; the AI engine server accordingly acquires the endoscopic images collected by the endoscope.
Step 320: Acquire the image mode in which the endoscope collected the endoscopic image.
The image mode of the endoscopic image can be switched manually by medical personnel. For example, when medical personnel find a suspected lesion by observing a white-light endoscopic image, they can switch the white light mode to the NBI mode. Compared with white light images, NBI images show information such as the course of blood vessels and gland duct openings more clearly; in the NBI mode, more details of the suspected lesion can be further observed, making it easier for medical personnel to judge the suspected lesion.
Therefore, optionally, acquiring the image mode in which the endoscope collected the endoscopic image may be implemented as:
acquiring the working state of the endoscope when collecting the endoscopic image;
in response to the working state being the NBI state, determining that the image mode of the endoscopic image is the NBI mode.
For example, the computer device can obtain the working state of the endoscope during image collection based on the medical personnel's mode selection: when the user operation indicates that the endoscope is in the white light mode, the images collected in this mode are in the white light mode; when the user operation indicates that the endoscope is in the NBI mode, the images collected in this mode are in the NBI mode.
Alternatively, the computer device may input the endoscopic image into an image mode classification network to obtain image mode information output by the network; the image mode classification network is a machine learning network trained with a third training image, and the image mode information indicates whether the image mode of the endoscopic image is the NBI mode. The third training image is marked with an image mode.
Optionally, the image mode classification network may be a dense convolutional network (DenseNet) used to classify and recognize endoscopic images; it can be obtained in advance through machine learning model training.
For example, during training, the model training device can input endoscopic image samples and their corresponding image modes to build the image mode classification network, compute a loss function between the network output and the corresponding image mode, and adjust the network parameters according to the loss, so that the output of the trained network indicates a mode as close as possible to that of the endoscopic image samples.
When the image mode classification network is used, the computer device inputs an endoscopic image into the network, and the network outputs the image mode corresponding to that endoscopic image.
Optionally, the image mode classification network may scale the input endoscopic image so that the scaled image meets the network's size requirements; for example, if the required image size is 224*224, the input endoscopic image is scaled to 224*224 before the image mode is judged.
Based on the working requirements of the image mode classification network in the embodiments of the present application, the present application provides an image mode classification network whose model structure is shown in Table 1 below:
Table 1: model structure of the image mode classification network (the table is reproduced as images in the original publication; its cell contents are not recoverable here).
Because in the present application the image mode classification network prefers lower-level feature combinations, such as blood vessel color, a wider and shallower combination can be adopted when configuring the depth and width of the dense convolutional network. The network structure finally used may be the above DenseNet-40 (40-layer dense convolutional network), with further tuning of network parameters: for example, the growth rate is set to 48 and the feature compression ratio of the transition layers is set to 0.5.
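For illustration, such a configuration could be instantiated with torchvision's DenseNet, whose transition layers halve the feature channels (compression 0.5) by design; the 3 × 12 block layout is an assumption following the usual (depth − 4)/3 layer count for a 40-layer variant, and the stem width is likewise assumed:

```python
from torchvision.models import DenseNet

# DenseNet-40: assumed 3 dense blocks of 12 layers each, growth rate 48;
# torchvision's transition layers compress features by 0.5 by default.
mode_classifier = DenseNet(
    growth_rate=48,
    block_config=(12, 12, 12),
    num_init_features=96,   # assumed stem width (2 * growth rate)
    num_classes=2,          # NBI mode vs. white light mode
)
```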
Optionally, before the computer device locates the target area image in the endoscopic image, the method further includes:
acquiring image quality information of the endoscopic image, the image quality information including at least one of blur, exposure and tone abnormality, and effective resolution;
in response to the image quality information meeting an image quality threshold, performing the step of locating the target area image in the endoscopic image.
During image collection by the endoscope, the collected endoscopic images may contain blurred images caused by blurred shooting or by undigested food residues in the digestive tract; such images would cause serious errors in subsequent analysis and judgment, so low-quality images among those collected by the endoscope need to be screened out. Low-quality images may include, but are not limited to, the following three cases: blurred images, images with abnormal tone or over/under-exposure, and low-resolution images.
For low-resolution images, the computer device can perform recognition by computing the effective pixel area of the image, that is, the area remaining after cropping the black borders on the top, bottom, left, and right. Referring to FIG. 4, which shows a schematic diagram of the effective pixel area according to an exemplary embodiment of the present application, the region 410 outside the black borders in FIG. 4 is the effective pixel area. The black borders can be cropped by the computer device by counting the gray-value distribution of the pixel values in each row or column of the image: when the proportion of pixels in a row or column whose gray value is below a preset threshold reaches a certain threshold, the row or column is determined to be cropped; for example, when the proportion reaches 50%, the row or column is cropped. The computer device then judges the effective pixel area after cropping: if the effective pixel area is smaller than a certain threshold, the image is judged to be a low-resolution image. The thresholds mentioned above can all be set according to the needs of the actual application.
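A minimal sketch of the black-border cropping and effective-area check, assuming a grayscale input and using the 50% row/column proportion from the example; the darkness cutoff and minimum-area threshold are illustrative assumptions:

```python
import numpy as np

def effective_pixel_area(gray, dark_threshold=20, row_ratio=0.5):
    """gray: 2-D uint8 array. Crop rows/columns where the share of
    pixels darker than dark_threshold reaches row_ratio, then return
    the remaining (effective) area in pixels."""
    dark = gray < dark_threshold
    keep_rows = dark.mean(axis=1) < row_ratio
    keep_cols = dark.mean(axis=0) < row_ratio
    cropped = gray[keep_rows][:, keep_cols]
    return cropped.shape[0] * cropped.shape[1]

def is_low_resolution(gray, min_area=100_000):  # assumed area threshold
    return effective_pixel_area(gray) < min_area
```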
For blurred images, the embodiments of the present application provide an exemplary detection algorithm. Gaussian filtering is applied to the endoscopic image to remove the moiré pattern produced during endoscope sampling; moiré refers to high-frequency interference fringes appearing on the image sensor, which cause colored, high-frequency, irregular stripes in the image. The Gaussian-filtered endoscopic image is defined as R; R is then passed through a median filter (for example, a 3*3 median filter), and the result is defined as P. The gradients of image P and image R are computed separately with a pixel-level edge detection operator (for example, the Sobel operator) to obtain the images G_P and G_R; the similarity between G_P and G_R is then computed, and whether the endoscopic image is blurred is judged from the similarity result: the higher the similarity between G_P and G_R, the more blurred the endoscopic image; the lower the similarity, the sharper the endoscopic image.
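This detection can be sketched with OpenCV as follows; the 3*3 kernels follow the example above, while the use of cosine similarity between the gradient maps and the decision cutoff are illustrative assumptions:

```python
import cv2
import numpy as np

def is_blurred(image, sim_threshold=0.85):
    """High gradient similarity between R (Gaussian-filtered image) and
    P (median-filtered R) indicates a blurred endoscopic image."""
    r = cv2.GaussianBlur(image, (3, 3), 0)   # suppress moire fringes
    p = cv2.medianBlur(r, 3)
    g_r = cv2.Sobel(r, cv2.CV_32F, 1, 1)     # gradient of R
    g_p = cv2.Sobel(p, cv2.CV_32F, 1, 1)     # gradient of P
    gr, gp = g_r.ravel(), g_p.ravel()
    sim = np.dot(gr, gp) / (np.linalg.norm(gr) * np.linalg.norm(gp) + 1e-8)
    return sim > sim_threshold               # more similar -> more blurred
```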
For images with abnormal tone or over/under-exposure, the present application provides an exemplary detection algorithm. Because abnormal tone and over/under-exposure take many forms, a standard library of images with acceptable tone and normal shooting needs to be built. When checking an endoscopic image, the image is first divided evenly into n image blocks, from which m blocks are randomly selected, where m and n are positive integers and m < n. The H, S, and V values of the m blocks are computed in HSV (Hue, Saturation, Value) space, where H is hue, S is saturation (color purity), and V is value/brightness; the HSV space is widely used in image processing. Next, with H and S as features, the H and S of the m blocks are matched against the H and S of the standard images in the standard library, and a corresponding similarity is computed for each block. In one possible case, for a given block, if there is one standard image, the similarity value between the block and that standard image is taken as the block's similarity to the standard library; if there are multiple standard images, the average similarity between the block and the standard images is taken as the block's similarity to the standard library. A similarity threshold is set: if the number of blocks among the m blocks whose similarity to the standard images reaches the similarity threshold itself reaches a certain threshold, the endoscopic image is judged to be a successfully matched image, that is, an image with normal tone and without over/under-exposure; otherwise, if the number of blocks reaching the similarity threshold does not reach that threshold, the endoscopic image is judged to be a failed match, that is, an image with abnormal tone and/or over/under-exposure. For example, the endoscopic image can be divided into 7*7 blocks, from which 9 blocks are randomly selected for H, S, V computation; with H and S as features, the 9 blocks are compared against the standard images, and if 5 or more of the 9 blocks match successfully, the endoscopic image is considered to have normal tone without over/under-exposure; if fewer than 5 blocks match successfully, the image is considered to have abnormal tone and/or over/under-exposure. The similarity threshold and the threshold for judging block matching can both be set and adjusted according to the needs of the actual application, which is not limited in the embodiments of the present application.
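A sketch of the block-based tone and exposure check, using the 7*7 grid with 9 sampled blocks and the 5-match rule from the example; the per-block H/S similarity measure against the standard library is an assumed simplification:

```python
import cv2
import numpy as np

def tone_ok(image_bgr, standards_hs, block_sim=0.7, min_matches=5,
            grid=7, sample_blocks=9):
    """Divide the image into grid*grid blocks, randomly sample some, and
    compare their H/S statistics with the standard library; standards_hs
    is a list of (H, S) mean pairs taken from the standard images."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    h, w = hsv.shape[:2]
    bh, bw = h // grid, w // grid
    idx = np.random.choice(grid * grid, sample_blocks, replace=False)
    matches = 0
    for i in idx:
        r, c = divmod(i, grid)
        block = hsv[r*bh:(r+1)*bh, c*bw:(c+1)*bw]
        hm, sm = block[..., 0].mean(), block[..., 1].mean()
        sims = [1 - (abs(hm - H) / 180 + abs(sm - S) / 255) / 2
                for H, S in standards_hs]   # assumed similarity measure
        if np.mean(sims) >= block_sim:      # average over standard images
            matches += 1
    return matches >= min_matches
```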
Optionally, the acquisition and judgment of image quality information can be performed before acquiring the working state of the endoscope during collection, to ensure that the endoscopic images input into the image mode classification network are all high-quality images, facilitating the network's judgment and improving its recognition accuracy. That is, in response to the image quality information meeting the image quality threshold, the step of acquiring the working state of the endoscope when collecting the endoscopic image is performed.
Step 330: In response to the image mode of the endoscopic image being the NBI mode, input the endoscopic image into a target area locating network to obtain area coordinates output by the target area locating network; the target area locating network is a machine learning network trained with a second training image; the second training image is marked with a target area.
When the image mode acquired in step 320 indicates that the endoscopic image is in the NBI mode, the endoscopic image is input into the target area locating network; correspondingly, if the image mode indicates that the image is in the white light mode, the step of inputting the endoscopic image into the target area locating network is not performed.
The target area locating network is used to locate a target area in the input NBI-mode endoscopic image; for example, the target area may be a suspected lesion area. When the computer device confirms through the target area locating network that a suspected lesion area exists in the endoscopic image, it can obtain and output the location coordinates of the suspected lesion area.
The target area locating network may be an end-to-end real-time object detection and recognition network, obtained in advance by training a machine learning model.
For example, during training, the model training device can input second training images marked with target areas to build the target area locating network, compute a loss function between the network output and the coordinates of the corresponding target area, and adjust the network parameters according to the loss, so that the output of the trained network indicates coordinates as close as possible to the target area coordinates of the second training images.
In one possible case, the YOLO v2 algorithm can be used to locate and detect the target area. YOLO v2 uses a single neural network to transform the object detection problem into a regression problem of extracting bounding boxes and class probabilities from the image. YOLO v2 adopts a multi-scale training method and borrows the Faster RCNN anchor box idea, which can improve detection accuracy and generalization while maintaining detection speed. When applying YOLO v2 to the lesion locating scenario of the present application, the anchor box sizes can be obtained by clustering the available training data. When training the target area locating network, the network can first be initialized with parameters trained on Imagenet (an image database) data, and these initialization parameters are then adjusted with data from this field so that the resulting target area locating network performs well in this field. Imagenet is an open-source dataset for image classification and object detection in computer vision, covering thousands of categories across many fields with more than a million images; initializing the model with parameters trained on Imagenet data helps the model converge toward a global optimum, and on this basis, field-specific training improves the model's judgment accuracy in the specific field. For example, retraining the initialized model with endoscopic images from the medical field yields a model with higher accuracy in the medical field.
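The initialize-on-Imagenet-then-fine-tune step can be sketched as follows; a torchvision ResNet-50 classifier stands in for the YOLO v2 detection network purely for illustration, and the domain data loader is assumed:

```python
import torch
from torch import nn
from torchvision.models import resnet50, ResNet50_Weights

# 1) Initialize with parameters learned on Imagenet data.
model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)

# 2) Replace the head for the domain task and fine-tune on medical
#    endoscopic data so the network performs well in this field.
model.fc = nn.Linear(model.fc.in_features, 2)

def fine_tune(model, domain_loader, epochs=5):
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    for _ in range(epochs):
        for images, labels in domain_loader:   # domain_loader is assumed
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
```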
When the target area locating network is used, the computer device inputs an NBI-mode endoscopic image into the network, and the network outputs the coordinates of the target area in that endoscopic image.
Step 340: Acquire the image corresponding to the area coordinates in the endoscopic image as the target area image.
Optionally, the coordinates of the target area may be the coordinates of several vertices of a polygon (such as a rectangle) that encloses the target area. After obtaining the area coordinates, the computer device can connect the vertices in order to obtain the extent of the target area and acquire the image within that extent as the target area image.
Step 350: Input the target area image into the coding network to obtain semantic features of the target area image output by the coding network; the coding network is the part of the image classification network used to extract image features; the image classification network is a machine learning network trained with the first training image and the image category of the first training image.
Based on the description of the coding network in the embodiment shown in FIG. 2, the coding network is used to obtain the semantic features of the target area image input into it, without classifying the target area image. During operation, the coding network can reduce the dimensionality of the input target area image and use the dimensionality-reduced data as the input to the database and the matching network for subsequent sample matching and comparison.
Referring to FIG. 5, which shows a schematic structural diagram of a coding network according to an exemplary embodiment of the present application: as shown in FIG. 5, the coding network may include a fully convolutional network 510 and a global pooling layer 520. The fully convolutional network 510 parses the input target area image into high-dimensional semantic features, which can be represented as a feature map of size H*W*K, where H corresponds to the height of the feature map, W to its width, and K to the number of feature maps. The feature map is then fed into the global pooling layer 520 for subsequent processing, which reduces the dimensionality of the high-dimensional semantic features and yields a 1*K-dimensional semantic feature vector, facilitating the subsequent matching of semantic features.
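A minimal sketch of this coding network, using a ResNet-18 trunk as an assumed fully convolutional feature extractor (here K = 512):

```python
import torch.nn as nn
from torchvision.models import resnet18

class CodingNetwork(nn.Module):
    """Fully convolutional network 510 -> H*W*K feature map,
    then global pooling layer 520 -> 1*K semantic feature vector."""
    def __init__(self):
        super().__init__()
        backbone = resnet18(weights=None)
        # drop the average-pool and classification head, keep the FCN trunk
        self.fcn = nn.Sequential(*list(backbone.children())[:-2])
        self.pool = nn.AdaptiveAvgPool2d(1)      # global pooling layer

    def forward(self, x):                        # x: N x 3 x H x W
        fmap = self.fcn(x)                       # N x K x h x w
        return self.pool(fmap).flatten(1)        # N x K (K = 512 here)
```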
Optionally, while the coding network extracts the semantic features of the target area image, the database can perform a preliminary filtering of the image samples stored in the database based on the semantic features extracted by the coding network, to obtain samples whose semantic features are close to those of the target area image, thereby reducing meaningless matching in the subsequent matching process and reducing the workload of the matching network.
For example, the database can obtain the image type corresponding to the target area image and, based on that image type, filter out the semantic features of image samples with the same image type; subsequently, the semantic features of the target area image only need to be matched against the semantic features of the filtered image samples, rather than against the semantic features of all image samples in the database. For example, the image type may indicate the organ type shown in the image.
The database is used to store the K-dimensional semantic features of the original samples; at the same time, in order to be able to trace back to the original images, the database also stores information that allows tracing back to them. Moreover, to enable the preliminary filtering of samples based on the input target area image, the database organizes the storage of K-dimensional semantic features in a special way; the K-dimensional semantic features of the original samples stored in the database are obtained through the same coding network that extracts the semantic features of the target area image.
Step 360: Input the semantic features of the target area image and the semantic features of each image sample into a matching network to obtain matching scores, output by the matching network, between the target area image and each image sample; the matching network is trained with semantic feature pairs marked with matching labels, each semantic feature pair containing the semantic features of two images, the matching label indicating whether the corresponding pair matches.
Optionally, the matching network may consist of a two-branch similarity metric network (Siamese network) used to evaluate the matching relationship between its two input samples; this matching relationship may be the degree of similarity between the two, or the spatial distance between them. During operation of the matching network, when the semantic features of a target area image to be retrieved are input into the matching network, the database also inputs the semantic features of image samples of the corresponding type; the matching network then matches the semantic features of the target area image in turn against the semantic features of each image sample filtered from the database, and outputs a matching result based on the degree of matching between the semantic features of the target area and those of the image sample. The matching result may be a score of the matching relationship between the two, and the relationship score may take many forms, such as Euclidean distance or cosine similarity, which is not limited in the present application.
During training of the matching network, semantic feature pairs marked with matching labels are input into the model training device to train the matching network; each semantic feature pair contains two paired semantic features, and the matching label indicates whether the corresponding pair matches. That is, what is input into the matching network is a number of semantic feature pairs together with the matching label of each pair; the model training device then computes a loss function between the output of the matching network and the matching labels, and adjusts the parameters of the matching network based on the loss, so that the output of the trained matching network indicates results as close as possible to the matching labels.
In one possible case, referring to FIG. 6, which shows a schematic diagram of training a matching network according to an exemplary embodiment of the present application: the matching network simultaneously takes as input the K-dimensional semantic feature 610 of endoscopic image 1 and the K-dimensional semantic feature 620 of endoscopic image 2, maps each to M dimensions through multiple fully connected layers with nonlinear activation functions, and computes a relationship score between the two, denoted D. If the relationship score between the semantic features of the target area image and those of an image sample is defined in the matching network as the Euclidean distance, then the loss function can be defined in the following contrastive form:
L(D, Y) = (1 − Y) · D² + Y · max(0, τ − D)²
where τ represents a smoothing parameter used to suppress the relationship score. When the two samples are positively correlated, Y = 0; otherwise Y = 1.
Step 370: Determine the target image sample based on the matching scores between the target area image and each image sample.
Optionally, the computer device sorts the image samples in descending order of the corresponding matching scores and uses the top n image samples as the target image samples, where n ≥ 1 and n is an integer;
or, the computer device uses, among the image samples, those whose matching score is higher than a matching score threshold as the target image samples;
or, the computer device sorts the image samples in descending order of the corresponding matching scores and, among the top n image samples, uses those whose matching score is higher than the matching score threshold as the target image samples.
The matching scores between the target area image and the image samples indicate the degree of similarity between the target area image and each image sample. Since a higher matching score means a higher degree of similarity between the target area image and an image sample, after the computer device sorts the image samples in descending order of matching score, a higher rank indicates a higher degree of similarity to the target area image; therefore, the top n image samples can be selected as the target image samples.
Alternatively, the same matching score may correspond to multiple image samples, and taking only the top n may fail to capture all image samples corresponding to the top matching scores; therefore, the computer device may set a matching score threshold and use all image samples above that threshold as the target image samples.
Alternatively, the computer device may first sort the image samples in descending order of matching score and then filter the top n image samples by the matching score threshold, thereby obtaining image samples that are both highly ranked and above the matching score threshold.
Step 380: Obtain a matching result based on the target image sample.
Optionally, the matching result includes at least one of the following:
the target image sample;
the image category corresponding to the target image sample;
and the degree of matching between the target image sample and the target area image.
The computer device can obtain the target image sample based on the information stored in the database that allows tracing back to the image samples.
Step 390: Display the endoscopic image and the matching result on the endoscopic image display interface.
Optionally, on the endoscopic image display interface, an area mark is displayed corresponding to the endoscopic image; the area mark is used to indicate the target area in the endoscopic image.
Referring to FIG. 7, which shows an interface diagram of an endoscopic image display interface according to an exemplary embodiment of the present application: as shown in FIG. 7, an endoscopic image is displayed in area 710 with an area mark 711 displayed on it, and the matching result is displayed in area 720; the matching result may include the target image sample 721, the image category 722 of the target image sample, and the degree of matching 723 between the target image sample and the target area image.
Optionally, at the area mark 711 in the endoscopic image, relevant information about the target image sample with the highest degree of matching in the matching result may be correspondingly displayed, for example, the image category corresponding to the target image sample and its degree of matching with the endoscopic image.
In summary, the endoscopic image display method provided by the present application acquires an endoscopic image collected by an endoscope; locates a target area image in the endoscopic image and inputs it into a coding network to obtain the semantic features of the target area image output by the coding network; matches the semantic features of the target area image against those of each image sample to obtain a matching result; and displays the endoscopic image and the matching result on the endoscopic image display interface. During use of the endoscope, locating and matching lesion images in the endoscopic image improves the accuracy of endoscope-assisted diagnosis.
For example, when the endoscope is collecting images of the stomach, refer to FIG. 8, which shows a schematic diagram of an endoscopic image recognition process according to an exemplary embodiment of the present application. As shown in FIG. 8, the user passes the endoscope into the stomach through a natural orifice of the human body or a small surgical incision, performs endoscopic image collection, and inputs the collected endoscopic images into the computer device. Because the stomach may contain food residues and other factors affecting the image quality of the endoscopic images, the computer device needs to perform low-quality image filtering 810 on the endoscopic images to retain high-quality endoscopic images for the next step of processing. The computer device preprocesses the filtered high-quality endoscopic images, resizes them to meet the requirements of the image mode classification network, and then starts image type recognition 820; the image type recognition process can rely on the image mode classification network to filter out NBI-mode endoscopic images from the input images, followed by suspected lesion locating 830 on the NBI-mode endoscopic images. The suspected lesion locating process can rely on the target area locating network to locate the suspected lesion in the input endoscopic image and obtain the area coordinates of the lesion area corresponding to the suspected lesion. Similar-data retrieval and analysis in the database 840 is then performed for the suspected lesion area; this process can rely on the coding network and the matching network to encode the input endoscopic image, obtain its semantic features, and match them against the preliminarily screened semantic features of the sample images stored in the database, so as to obtain sample images matching the image of the suspected lesion area and thereby obtain relevant information about the suspected lesion, for example, the lesion type and the degree of matching with the sample images.
Based on the endoscopic image display method shown in FIG. 2 or FIG. 3, when the step of locating the target area image in the endoscopic image is performed through a user operation, refer to FIG. 9, which shows a flowchart of an endoscopic image display method according to an exemplary embodiment of the present application. The method can be used in a computer device, such as the image recognition device shown in FIG. 1 above. As shown in FIG. 9, the endoscopic image display method may include the following steps:
Step 910: Acquire an endoscopic image collected by an endoscope.
Optionally, while the endoscope collects endoscopic images, the endoscopic image display interface can display the endoscopic image in real time, and in this interface the user can perform user operations on the endoscopic image displayed in the interface.
Optionally, the user operations may include, but are not limited to, a zoom-in operation, a zoom-out operation, and a frame selection operation.
The computer device can acquire the endoscopic images collected by the endoscope, and can also acquire the user operations generated by the user through interface interaction.
Step 920: Receive a frame selection operation performed by the user in the endoscopic image.
The frame selection operation performed by the user in the endoscopic image may be the user selecting a partial region of the endoscopic image through an external device such as a mouse, or the user interacting directly with the endoscope display interface to select a partial region of the endoscope interface.
Step 930: Acquire the image of the area corresponding to the frame selection operation in the endoscopic image as the target area image.
In response to the user performing a frame selection operation in the endoscope interface, a selection frame can be displayed in the active area of the operation to indicate that the area is the selected area, and the image located in the selected area is acquired as the target area image.
Optionally, in the NBI mode, the user can perform the frame selection operation; or, in the NBI mode, the image corresponding to the selected area can be acquired as the target area image.
Optionally, after the user obtains the target area image through the frame selection operation, the image quality information of the target area image can be acquired, the image quality information including at least one of blur, exposure and tone abnormality, and effective resolution;
in response to the image quality information meeting the image quality threshold, the computer device performs the step of inputting the target area image into the coding network, so that the target area image processed by the coding network is of higher image quality, reducing the impact of low-quality images on the subsequent recognition and matching process and avoiding unnecessary workload.
For the process of acquiring and judging the image quality information of the target area image, reference may be made to the related description of acquiring the image quality information of the endoscopic image in the embodiment of FIG. 3, which is not repeated here.
Step 940: Input the target area image into the coding network to obtain the semantic features of the target area image output by the coding network; the coding network is the part of the image classification network used to extract image features; the image classification network is a machine learning network trained with the first training image and the image category of the first training image.
Step 950: Match the semantic features of the target area image against the semantic features of each image sample to obtain a matching result; the matching result is used to indicate the target image sample that matches the target area image among the image samples.
Referring to FIG. 10, which shows a schematic diagram of an endoscopic image retrieval system according to an exemplary embodiment of the present application: as shown in FIG. 10, the endoscopic image retrieval system may include a coding network 1010, a database 1020, and a matching network 1030. When a target area image is input into the system, the endoscopic image is first passed through the coding network 1010 to obtain the K-dimensional semantic feature corresponding to the endoscopic image (a dimensionality-reduced semantic feature), which is used as the input to the database 1020 and the matching network 1030. The database 1020 performs a preliminary screening of the image samples in the database based on the K-dimensional semantic feature to obtain the semantic features of image samples close to the semantic features of the endoscopic image, and inputs these semantic features into the matching network 1030. The matching network 1030 matches the semantic features of the endoscopic image input by the coding network 1010 against the semantic features of the image samples input by the database 1020, scores the matching relationship between the semantic features of the endoscopic image and those of each image sample, and obtains the scoring results. The computer device can sort the scoring results by score and, according to certain determination rules, obtain the target image sample from among the image samples corresponding to the semantic features input by the database 1020.
Step 960: Display the endoscopic image and the matching result on the endoscopic image display interface.
The endoscopic image displayed on the endoscopic image display interface is the endoscopic image corresponding to the time of the user's frame selection operation, and the matching result is the matching result corresponding to the image in the area selected by the user's frame selection operation.
In summary, the endoscopic image display method provided by the present application acquires an endoscopic image collected by an endoscope; locates a target area image in the endoscopic image and inputs it into a coding network to obtain the semantic features of the target area image output by the coding network; matches the semantic features of the target area image against those of each image sample to obtain a matching result; and displays the endoscopic image and the matching result on the endoscopic image display interface. During use of the endoscope, locating and matching lesion images in the endoscopic image improves the accuracy of endoscope-assisted diagnosis.
Referring to FIG. 11, which shows a flowchart of an endoscopic image display method according to an exemplary embodiment of the present application, the method can be used in a computer device, such as the image display device shown in FIG. 1 above. As shown in FIG. 11, the endoscopic image display method may include the following steps:
Step 1111: Display a first endoscopic image on the endoscopic image display interface, the first endoscopic image being an image collected by the endoscope in the white light mode.
During use of the endoscope, the user first collects images of the organ in the white light mode to obtain a global image of the organ. When the user finds a suspected lesion area in the collected images, the shooting mode of the endoscope can be switched to the NBI mode. In the NBI mode, the user can observe blood vessel flow, and blood appears black in images collected in the NBI mode; in the NBI mode, the epithelial morphology of the digestive tract mucosa can also be accurately observed, which is convenient for the user to observe and judge the lesion area.
Step 1120: In response to the shooting mode of the endoscope being switched to the NBI mode, display a second endoscopic image on the endoscopic image display interface, the second endoscopic image being an image collected by the endoscope in the NBI mode.
Step 1130: On the endoscopic image display interface, display a matching result corresponding to the second endoscopic image, the matching result being used to indicate the target image sample that matches the target area image in the second endoscopic image.
The above matching result of the second endoscopic image is the target image sample matching the second endoscopic image, and other related information, obtained after the image recognition device performs recognition and matching on the second endoscopic image; for the process by which the computer device recognizes the second endoscopic image, reference may be made to the related content in the endoscopic image display methods shown in any of the embodiments of FIG. 2, FIG. 3, or FIG. 8 above.
In summary, the endoscopic image display method provided by the present application displays the image collected by the endoscope in the white light mode on the endoscopic image display interface; in response to the shooting mode of the endoscope being switched to the NBI mode, displays the image collected by the endoscope in the NBI mode on the endoscopic image display interface; and, on the endoscopic image display interface, displays the matching result corresponding to the second endoscopic image. Locating and matching lesion images in the endoscopic image improves the accuracy of endoscope-assisted diagnosis.
FIG. 12 is a structural block diagram of an endoscopic image display apparatus according to an exemplary embodiment. The endoscopic image display apparatus can be used in a computer device; for example, the computer device may be the image recognition device shown in FIG. 1, to execute all or part of the steps of the method shown in any one of the embodiments of FIG. 2, FIG. 3, or FIG. 9. As shown in FIG. 12, the endoscopic image display apparatus may include:
an endoscopic image acquisition module 1210, configured to acquire an endoscopic image collected by an endoscope;
an area image locating module 1220, configured to locate a target area image in the endoscopic image, the target area image being a partial image of the endoscopic image that contains a target area;
a semantic feature extraction module 1230, configured to input the target area image into a coding network to obtain the semantic features of the target area image output by the coding network; the coding network is the part of an image classification network used to extract image features; the image classification network is a machine learning network trained with a first training image and the image category of the first training image;
a matching module 1240, configured to match the semantic features of the target area image against the semantic features of each image sample to obtain a matching result; the matching result is used to indicate the target image sample that matches the target area image among the image samples; and
a first display module 1250, configured to display the endoscopic image and the matching result on an endoscopic image display interface.
Optionally, the matching module 1240 includes:
a matching score acquisition sub-module, configured to input the semantic features of the target area image and the semantic features of each image sample into a matching network to obtain the matching scores, output by the matching network, between the target area image and each image sample; the matching network is trained with semantic feature pairs marked with matching labels; each semantic feature pair contains the semantic features of two images, and the matching label indicates whether the corresponding semantic feature pair matches;
an image sample determination sub-module, configured to determine the target image sample based on the matching scores between the target area image and each image sample; and
a matching result acquisition sub-module, configured to obtain the matching result based on the target image sample.
Optionally, the image sample determination sub-module is configured to:
sort the image samples in descending order of the corresponding matching scores and use the top n image samples as the target image samples, where n ≥ 1 and n is an integer;
or, use, among the image samples, those whose matching score is higher than a matching score threshold as the target image samples;
or, sort the image samples in descending order of the corresponding matching scores and, among the top n image samples, use those whose matching score is higher than the matching score threshold as the target image samples, where n ≥ 1 and n is an integer.
Optionally, the matching result includes at least one of the following:
the target image sample;
the image category corresponding to the target image sample;
and the degree of matching between the target image sample and the target area image.
Optionally, the apparatus further includes:
a second display module, configured to display, on the endoscopic image display interface, an area mark corresponding to the endoscopic image, the area mark being used to indicate the target area in the endoscopic image.
Optionally, the area image locating module 1220 includes:
an area coordinate acquisition sub-module, configured to input the endoscopic image into a target area locating network to obtain the area coordinates output by the target area locating network; the target area locating network is a machine learning network trained with a second training image; the second training image is marked with a target area; and
a first area image acquisition sub-module, configured to acquire the image corresponding to the area coordinates in the endoscopic image as the target area image.
Optionally, the area image locating module 1220 includes:
a user operation receiving sub-module, configured to receive a frame selection operation performed by the user in the endoscopic image; and
a second area image acquisition sub-module, configured to acquire the image of the area corresponding to the frame selection operation in the endoscopic image as the target area image.
Optionally, the area image locating module 1220 is configured to perform the step of locating the target area image in the endoscopic image in response to the image mode of the endoscopic image being the NBI mode.
Optionally, the apparatus further includes:
an image mode information acquisition module, configured to input the endoscopic image into an image mode classification network to obtain the image mode information output by the image mode classification network; the image mode classification network is a machine learning network trained with a third training image, and the third training image is marked with an image mode; the image mode information indicates whether the image mode of the endoscopic image is the NBI mode.
Optionally, the apparatus further includes:
a working state acquisition module, configured to acquire the working state of the endoscope when it collects the endoscopic image;
in response to the working state being the NBI state, the image mode of the endoscopic image is determined to be the NBI mode.
Optionally, the apparatus further includes:
an image quality information acquisition module, configured to acquire the image quality information of the endoscopic image, the image quality information including at least one of blur, exposure and tone abnormality, and effective resolution; and
the area image locating module 1220, configured to perform the step of locating the target area image in the endoscopic image in response to the image quality information meeting the image quality threshold.
In summary, the endoscopic image display apparatus provided by the present application, applied in a computer device, acquires the endoscopic image collected by the endoscope; locates the target area image in the endoscopic image and inputs the target area image into the coding network to obtain the semantic features of the target area image output by the coding network; matches the semantic features of the target area image against the semantic features of each image sample to obtain the matching result; and displays the endoscopic image and the matching result on the endoscopic image display interface. During use of the endoscope, locating and matching lesion images in the endoscopic image improves the accuracy of endoscope-assisted diagnosis.
FIG. 13 is a structural block diagram of an endoscopic image display apparatus according to an exemplary embodiment. The endoscopic image display apparatus can be used in a computer device; for example, the computer device may be the image display device shown in FIG. 1, to execute all or part of the steps of the method shown in FIG. 11. As shown in FIG. 13, the endoscopic image display apparatus may include:
a first display module 1310, configured to display a first endoscopic image on an endoscopic image display interface, the first endoscopic image being an image collected by an endoscope in the white light mode;
a second display module 1320, configured to display, in response to the shooting mode of the endoscope being switched to the NBI mode, a second endoscopic image on the endoscopic image display interface, the second endoscopic image being an image collected by the endoscope in the NBI mode; and
a third display module 1330, configured to display, on the endoscopic image display interface, a matching result corresponding to the second endoscopic image, the matching result being used to indicate the target image sample that matches the target area image in the second endoscopic image.
In summary, the endoscopic image display apparatus provided by the present application, applied in a computer device, displays the image collected by the endoscope in the white light mode on the endoscopic image display interface; in response to the shooting mode of the endoscope being switched to the NBI mode, displays the image collected by the endoscope in the NBI mode on the endoscopic image display interface; and, on the endoscopic image display interface, displays the matching result corresponding to the second endoscopic image. Locating and matching lesion images in the endoscopic image improves the accuracy of endoscope-assisted diagnosis.
FIG. 14 is a structural block diagram of a computer device 1400 according to an exemplary embodiment. The computer device can be implemented as the image recognition device or the image display device in the above solutions of the present application. The computer device 1400 includes a central processing unit (CPU) 1401, a system memory 1404 including a random access memory (RAM) 1402 and a read-only memory (ROM) 1403, and a system bus 1405 connecting the system memory 1404 and the central processing unit 1401. The computer device 1400 also includes a basic input/output system (I/O system) 1406 that helps transfer information between the various components within the computer, and a mass storage device 1409 for storing an operating system 1413, application programs 1414, and other program modules 1415.
The basic input/output system 1406 includes a display 1408 for displaying information and an input device 1407, such as a mouse or keyboard, for the user to input information. Both the display 1408 and the input device 1407 are connected to the central processing unit 1401 through an input/output controller 1410 connected to the system bus 1405. The basic input/output system 1406 may also include the input/output controller 1410 for receiving and processing input from a number of other devices such as a keyboard, mouse, or electronic stylus. Similarly, the input/output controller 1410 also provides output to a display screen, printer, or other type of output device.
The mass storage device 1409 is connected to the central processing unit 1401 through a mass storage controller (not shown) connected to the system bus 1405. The mass storage device 1409 and its associated computer-readable media provide non-volatile storage for the computer device 1400. That is, the mass storage device 1409 may include a computer-readable medium (not shown) such as a hard disk or a Compact Disc Read-Only Memory (CD-ROM) drive.
Without loss of generality, the computer-readable media may include computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include RAM, ROM, Erasable Programmable Read-Only Memory (EPROM), Electrically-Erasable Programmable Read-Only Memory (EEPROM), flash memory or other solid-state storage technologies; CD-ROM, Digital Versatile Disc (DVD) or other optical storage; and tape cartridges, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will know that computer storage media are not limited to the above. The system memory 1404 and the mass storage device 1409 may be collectively referred to as memory.
According to various embodiments of the present application, the computer device 1400 may also run through a remote computer connected to a network such as the Internet. That is, the computer device 1400 can be connected to the network 1412 through a network interface unit 1411 connected to the system bus 1405; in other words, the network interface unit 1411 can also be used to connect to other types of networks or remote computer systems (not shown).
The memory also includes one or more programs, which are stored in the memory; the central processing unit 1401 executes the one or more programs to implement all or part of the steps of the method shown in FIG. 2, FIG. 3, FIG. 9, or FIG. 11.
Those skilled in the art will appreciate that, in one or more of the above examples, the functions described in the embodiments of the present application may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include computer storage media and communication media, the latter including any medium that facilitates the transfer of a computer program from one place to another. A storage medium may be any available medium accessible to a general-purpose or special-purpose computer.
The embodiments of the present application also provide a computer-readable storage medium for storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the above endoscopic image display method. For example, the computer-readable storage medium may be a ROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Those skilled in the art will easily conceive of other embodiments of the present application after considering the specification and practicing the invention disclosed herein. The present application is intended to cover any variations, uses, or adaptations that follow its general principles and include common knowledge or customary technical means in the technical field not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present application indicated by the following claims.
It should be understood that the present application is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present application is limited only by the appended claims.

Claims (25)

  1. An endoscopic image display method, wherein the method is executed by a computer device, and the method comprises:
    acquiring an endoscopic image collected by an endoscope;
    locating a target area image in the endoscopic image, the target area image being a partial image of the endoscopic image that contains a target area;
    inputting the target area image into a coding network to obtain semantic features of the target area image output by the coding network; the coding network being the part of an image classification network used to extract image features; the image classification network being a machine learning network trained with a first training image and an image category of the first training image;
    matching the semantic features of the target area image against semantic features of each image sample to obtain a matching result, the matching result indicating a target image sample that matches the target area image among the image samples; and
    displaying the endoscopic image and the matching result on an endoscopic image display interface.
  2. The method according to claim 1, wherein matching the semantic features of the target area image against the semantic features of each image sample to obtain a matching result indicating a matching image sample comprises:
    inputting the semantic features of the target area image and the semantic features of each image sample into a matching network to obtain matching scores, output by the matching network, between the target area image and each image sample; the matching network being trained with semantic feature pairs marked with matching labels, each semantic feature pair containing the semantic features of two images, the matching label indicating whether the corresponding semantic feature pair matches;
    determining the target image sample based on the matching scores between the target area image and each image sample; and
    obtaining the matching result based on the target image sample.
  3. The method according to claim 2, wherein determining the target image sample based on the matching scores between the target area image and each image sample comprises:
    sorting the image samples in descending order of the corresponding matching scores and using the top n image samples as the target image samples, where n ≥ 1 and n is an integer;
    or, using, among the image samples, those whose matching score is higher than a matching score threshold as the target image samples;
    or, sorting the image samples in descending order of the corresponding matching scores and, among the top n image samples, using those whose matching score is higher than the matching score threshold as the target image samples.
  4. The method according to claim 1, wherein the matching result comprises at least one of the following:
    the target image sample;
    the image category corresponding to the target image sample;
    and the degree of matching between the target image sample and the target area image.
  5. The method according to claim 1, further comprising:
    displaying, on the endoscopic image display interface, an area mark corresponding to the endoscopic image, the area mark being used to indicate the target area in the endoscopic image.
  6. The method according to claim 1, wherein locating the target area image in the endoscopic image comprises:
    inputting the endoscopic image into a target area locating network to obtain area coordinates output by the target area locating network; the target area locating network being a machine learning network trained with a second training image; the second training image being marked with a target area; and
    acquiring the image corresponding to the area coordinates in the endoscopic image as the target area image.
  7. The method according to claim 1, wherein locating the target area image in the endoscopic image comprises:
    receiving a frame selection operation performed by a user in the endoscopic image; and
    acquiring the image of the area corresponding to the frame selection operation in the endoscopic image as the target area image.
  8. The method according to claim 1, wherein locating the target area image in the endoscopic image comprises:
    in response to the image mode of the endoscopic image being a narrow band imaging (NBI) mode, performing the step of locating the target area image in the endoscopic image.
  9. The method according to claim 8, further comprising, before locating the target area image in the endoscopic image:
    inputting the endoscopic image into an image mode classification network to obtain image mode information output by the image mode classification network; the image mode classification network being a machine learning network trained with a third training image, the third training image being marked with an image mode; the image mode information indicating whether the image mode of the endoscopic image is the NBI mode.
  10. The method according to claim 8, further comprising, before locating the target area image in the endoscopic image:
    acquiring the working state of the endoscope when collecting the endoscopic image; and
    in response to the working state being an NBI state, determining that the image mode of the endoscopic image is the NBI mode.
  11. The method according to claim 1, further comprising, before locating the target area image in the endoscopic image:
    acquiring image quality information of the endoscopic image, the image quality information including at least one of blur, exposure and tone abnormality, and effective resolution;
    wherein locating the target area image in the endoscopic image comprises:
    in response to the image quality information meeting an image quality threshold, performing the step of locating the target area image in the endoscopic image.
  12. An endoscopic image display method, wherein the method is executed by a computer device, and the method comprises:
    displaying a first endoscopic image on an endoscopic image display interface, the first endoscopic image being an image collected by an endoscope in a white light mode;
    in response to the shooting mode of the endoscope being switched to a narrow band imaging (NBI) mode, displaying a second endoscopic image on the endoscopic image display interface, the second endoscopic image being an image collected by the endoscope in the NBI mode; and
    displaying, on the endoscopic image display interface, a matching result corresponding to the second endoscopic image, the matching result being used to indicate a target image sample that matches a target area image in the second endoscopic image.
  13. An endoscopic image display apparatus, wherein the apparatus is used in a computer device, and the apparatus comprises:
    an endoscopic image acquisition module, configured to acquire an endoscopic image collected by an endoscope;
    an area image locating module, configured to locate a target area image in the endoscopic image, the target area image being a partial image of the endoscopic image that contains a target area;
    a semantic feature extraction module, configured to input the target area image into a coding network to obtain semantic features of the target area image output by the coding network; the coding network being the part of an image classification network used to extract image features; the image classification network being a machine learning network trained with a first training image and an image category of the first training image;
    a matching module, configured to match the semantic features of the target area image against semantic features of each image sample to obtain a matching result, the matching result indicating a target image sample that matches the target area image among the image samples; and
    a display module, configured to display the endoscopic image and the matching result on an endoscopic image display interface.
  14. The apparatus according to claim 13, wherein the matching module comprises:
    a matching score acquisition sub-module, configured to input the semantic features of the target area image and the semantic features of each image sample into a matching network to obtain matching scores, output by the matching network, between the target area image and each image sample; the matching network being trained with semantic feature pairs marked with matching labels, each semantic feature pair containing the semantic features of two images, the matching label indicating whether the corresponding semantic feature pair matches;
    an image sample determination sub-module, configured to determine the target image sample based on the matching scores between the target area image and each image sample; and
    a matching result acquisition sub-module, configured to obtain the matching result based on the target image sample.
  15. The apparatus according to claim 14, wherein the image sample determination sub-module is configured to:
    sort the image samples in descending order of the corresponding matching scores and use the top n image samples as the target image samples, where n ≥ 1 and n is an integer;
    or, use, among the image samples, those whose matching score is higher than a matching score threshold as the target image samples;
    or, sort the image samples in descending order of the corresponding matching scores and, among the top n image samples, use those whose matching score is higher than the matching score threshold as the target image samples.
  16. The apparatus according to claim 13, wherein the matching result comprises at least one of the following:
    the target image sample;
    the image category corresponding to the target image sample;
    and the degree of matching between the target image sample and the target area image.
  17. The apparatus according to claim 13, further comprising:
    a second display module, configured to display, on the endoscopic image display interface, an area mark corresponding to the endoscopic image, the area mark being used to indicate the target area in the endoscopic image.
  18. The apparatus according to claim 13, wherein the area image locating module comprises:
    an area coordinate acquisition sub-module, configured to input the endoscopic image into a target area locating network to obtain area coordinates output by the target area locating network; the target area locating network being a machine learning network trained with a second training image; the second training image being marked with a target area; and
    a first area image acquisition sub-module, configured to acquire the image corresponding to the area coordinates in the endoscopic image as the target area image.
  19. The apparatus according to claim 13, wherein the area image locating module comprises:
    a user operation receiving sub-module, configured to receive a frame selection operation performed by a user in the endoscopic image; and
    a second area image acquisition sub-module, configured to acquire the image of the area corresponding to the frame selection operation in the endoscopic image as the target area image.
  20. The apparatus according to claim 13, wherein the area image locating module is configured to perform the step of locating the target area image in the endoscopic image in response to the image mode of the endoscopic image being a narrow band imaging (NBI) mode.
  21. The apparatus according to claim 20, further comprising:
    an image mode information acquisition module, configured to input the endoscopic image into an image mode classification network to obtain image mode information output by the image mode classification network; the image mode classification network being a machine learning network trained with a third training image, the third training image being marked with an image mode; the image mode information indicating whether the image mode of the endoscopic image is the NBI mode.
  22. The apparatus according to claim 20, further comprising:
    a working state acquisition module, configured to acquire the working state of the endoscope when collecting the endoscopic image; and
    in response to the working state being an NBI state, determine that the image mode of the endoscopic image is the NBI mode.
  23. The apparatus according to claim 13, further comprising:
    an image quality information acquisition module, configured to acquire image quality information of the endoscopic image, the image quality information including at least one of blur, exposure and tone abnormality, and effective resolution; and
    the area image locating module, configured to perform the step of locating the target area image in the endoscopic image in response to the image quality information meeting an image quality threshold.
  24. A computer device, comprising a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the endoscopic image display method according to any one of claims 1 to 12.
  25. A computer-readable storage medium, storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the endoscopic image display method according to any one of claims 1 to 12.
PCT/CN2020/124483 2020-01-20 2020-10-28 Endoscopic image display method and apparatus, computer device, and storage medium WO2021147429A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/674,126 US20220172828A1 (en) 2020-01-20 2022-02-17 Endoscopic image display method, apparatus, computer device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010067143.XA 2020-01-20 Endoscopic image display method and apparatus, computer device, and storage medium
CN202010067143.X 2020-01-20

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/674,126 Continuation US20220172828A1 (en) 2020-01-20 2022-02-17 Endoscopic image display method, apparatus, computer device, and storage medium

Publications (2)

Publication Number Publication Date
WO2021147429A1 true WO2021147429A1 (zh) 2021-07-29
WO2021147429A9 WO2021147429A9 (zh) 2021-12-09

Family

ID=71003339

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/124483 WO2021147429A1 (zh) 2020-01-20 2020-10-28 内窥镜图像展示方法、装置、计算机设备及存储介质

Country Status (3)

Country Link
US (1) US20220172828A1 (zh)
CN (1) CN111275041B (zh)
WO (1) WO2021147429A1 (zh)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111275041B (zh) * 2020-01-20 2022-12-13 腾讯科技(深圳)有限公司 Endoscopic image display method and apparatus, computer device, and storage medium
CN111985457A (zh) * 2020-09-11 2020-11-24 北京百度网讯科技有限公司 Traffic facility damage recognition method, apparatus, device, and storage medium
CN114092426A (zh) * 2021-11-12 2022-02-25 数坤(北京)网络科技股份有限公司 Image association method and apparatus, electronic device, and storage medium
CN113989125B (zh) * 2021-12-27 2022-04-12 武汉楚精灵医疗科技有限公司 Endoscopic image stitching method and apparatus, computer device, and storage medium
CN114549482A (zh) * 2022-02-25 2022-05-27 数坤(北京)网络科技股份有限公司 Image association method and apparatus, electronic device, and storage medium
CN116229522A (zh) * 2023-05-10 2023-06-06 广东电网有限责任公司湛江供电局 Method and system for detecting safety protection equipment of substation operation personnel
CN116913455B (zh) * 2023-09-15 2023-12-15 紫东信息科技(苏州)有限公司 Gastroscopy report generation apparatus, device, and computer-readable storage medium
CN117528131B (zh) * 2024-01-05 2024-04-05 青岛美迪康数字工程有限公司 AI integrated display system and method for medical images

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103377376A (zh) * 2012-04-13 2013-10-30 阿里巴巴集团控股有限公司 Method and system for image classification, and method and system for image retrieval
CN110136199A (zh) * 2018-11-13 2019-08-16 北京初速度科技有限公司 Camera-based vehicle localization and mapping method and apparatus
US20200005072A1 (en) * 2018-06-28 2020-01-02 General Electric Company Systems and methods of 3d scene segmentation and matching for robotic operations
CN111275041A (zh) * 2020-01-20 2020-06-12 腾讯科技(深圳)有限公司 Endoscopic image display method, apparatus, computer device, and storage medium

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106934799B (zh) * 2017-02-24 2019-09-03 安翰科技(武汉)股份有限公司 Capsule endoscope image assisted reading system and method
EP3603481B1 (en) * 2017-03-30 2023-05-03 FUJIFILM Corporation Medical image processing device, endoscope system, and method for operating medical image processing device
CN107832335B (zh) * 2017-10-10 2019-12-17 西安电子科技大学 Image retrieval method based on contextual deep semantic information
CN107679250B (zh) * 2017-11-01 2020-12-01 浙江工业大学 Multi-task hierarchical image retrieval method based on a deep auto-encoding convolutional neural network
CN108596884B (zh) * 2018-04-15 2021-05-18 桂林电子科技大学 Esophageal cancer segmentation method for chest CT images
CN108852268A (zh) * 2018-04-23 2018-11-23 浙江大学 Real-time labeling system and method for abnormal features in digestive endoscopy images
CN109410185B (zh) * 2018-10-10 2019-10-25 腾讯科技(深圳)有限公司 Image segmentation method, apparatus, and storage medium
CN109978002A (zh) * 2019-02-25 2019-07-05 华中科技大学 Deep-learning-based method and system for detecting gastrointestinal bleeding in endoscopic images
CN109903314A (zh) * 2019-03-13 2019-06-18 腾讯科技(深圳)有限公司 Image region localization method, model training method, and related apparatus
CN110414607A (zh) * 2019-07-31 2019-11-05 中山大学 Classification method, apparatus, device, and medium for capsule endoscope images
CN110689025B (zh) * 2019-09-16 2023-10-27 腾讯医疗健康(深圳)有限公司 Image recognition method, apparatus, and system, and endoscopic image recognition method and apparatus

Also Published As

Publication number Publication date
WO2021147429A9 (zh) 2021-12-09
CN111275041A (zh) 2020-06-12
CN111275041B (zh) 2022-12-13
US20220172828A1 (en) 2022-06-02

Similar Documents

Publication Publication Date Title
WO2021147429A1 (zh) Endoscopic image display method, apparatus, computer device, and storage medium
JP6657480B2 (ja) Image diagnosis support apparatus, operating method of image diagnosis support apparatus, and image diagnosis support program
CN110490856B (zh) Processing method, system, machine device, and medium for medical endoscopic images
CN111474701B (zh) Real-time acquisition and analysis system, method, apparatus, and medium for pathology microscopy images
US9445713B2 (en) Apparatuses and methods for mobile imaging and analysis
RU2765619C1 (ru) Computer classification of biological tissue
CN110600122B (zh) Digestive tract image processing method and apparatus, and medical system
CN113379693B (zh) Key lesion image detection method for capsule endoscopy based on video summarization
EP1849402B1 (en) Medical image processing device, lumen image processing device, lumen image processing method, and programs for them
WO2014155778A1 (ja) Image processing device, endoscope device, program, and image processing method
CN113573654A (zh) AI system for detecting and measuring lesion size
CN110738655B (zh) Image report generation method, apparatus, terminal, and storage medium
US9342881B1 (en) System and method for automatic detection of in vivo polyps in video sequences
CN110189303B (zh) NBI image processing method based on deep learning and image enhancement, and application thereof
CN113543694B (zh) Medical image processing apparatus, processor device, endoscope system, medical image processing method, and recording medium
WO2007119296A1 (ja) Endoscope insertion direction detection device and endoscope insertion direction detection method
WO2019130924A1 (ja) Image processing apparatus, endoscope system, image processing method, and program
CN113129287A (zh) Automatic lesion image capture method for upper gastrointestinal endoscopy images
CN113498323A (zh) Medical image processing apparatus, processor device, endoscope system, medical image processing method, and program
CN114372951A (zh) Nasopharyngeal carcinoma localization and segmentation method and system based on an image segmentation convolutional neural network
US20240005494A1 (en) Methods and systems for image quality assessment
CN116745861A (zh) Control method, apparatus, and program for a lesion judgment system based on real-time imaging
CN113139937A (zh) Deep-learning-based recognition method for digestive tract endoscopy video images
CN112788300A (zh) Novel arthroscopic endoscope and control method therefor
CN113744266B (zh) Display method and apparatus for a lesion detection box, electronic device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20915876

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14.11.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20915876

Country of ref document: EP

Kind code of ref document: A1