WO2020133636A1 - Method and system for intelligent envelope detection and warning in prostate surgery - Google Patents


Info

Publication number
WO2020133636A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
detection
early warning
outer envelope
data
Prior art date
Application number
PCT/CN2019/074084
Other languages
French (fr)
Chinese (zh)
Inventor
郭成城
王行环
毋世晓
赵亚楠
郝玉洁
Original Assignee
武汉唐济科技有限公司
Priority date
Filing date
Publication date
Application filed by 武汉唐济科技有限公司
Publication of WO2020133636A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing

Definitions

  • the invention relates to the technical field of target detection of artificial intelligence, in particular to a method and system for intelligent detection and early warning of an outer envelope in prostate surgery.
  • In the traditional image processing field, target detection is a popular key technology; widely studied applications include face detection and pedestrian detection.
  • Traditional target detection generally uses a sliding-window framework with three main steps: first, use a sliding window to select candidate regions; second, extract visual features from the candidate regions; third, use a classifier for recognition.
  • A classic algorithm of this kind is the multi-scale deformable part model, which can be regarded as an extension of the "histogram of gradients + support vector machine" approach; its disadvantages are complexity and slow computation, so it cannot support applications with strict real-time requirements.
  • SPP-net: Spatial Pyramid Pooling network
  • Fast R-CNN: Fast Region-based Convolutional Neural Network
  • Faster R-CNN: Faster Region-based Convolutional Neural Network
  • R-FCN: Region-based Fully Convolutional Network
  • YOLO: You Only Look Once, unified real-time object detection
  • SSD: Single Shot MultiBox Detector
  • Literature [1][2][3][4] adopted artificial neural networks, probabilistic neural networks, multi-layer neural networks, support vector machines, and other techniques to solve medical image processing problems.
  • Reference [5] uses a suitable filter as preprocessing to remove noise.
  • Reference [6] built an intelligent model using Principal Components Analysis (PCA) and segmentation.
  • Reference [7] uses gradient vector flow to extract tumor edges in images, and a combined principal component analysis and artificial neural network (PCA-ANN) method to detect regions of interest.
  • PCA-ANN: Principal Components Analysis combined with an artificial neural network
  • Reference [8] uses the discrete wavelet transform to obtain features of medical images, and PCA to reduce the features.
  • Reference [9] likewise uses the discrete wavelet transform to extract features and PCA to reduce them.
  • However, none of the above studies considered the real-time behavior of the algorithms; they are therefore unsuitable for minimally invasive plasma bipolar electroresection surgery, which demands high real-time performance.
  • The real-time deep-learning target detection algorithms are YOLO and SSD, but for detecting the outer envelope in prostate surgery video they still have problems guaranteeing real-time operation and localizing the target accurately. It is therefore necessary to design a new method that detects and judges the outer envelope in prostate surgery faster and more accurately.
  • The present invention proposes a method and system for intelligent detection and early warning of the outer envelope in prostate surgery, focusing on two problems: first, guaranteeing real-time outer envelope detection based on video images of the operation site; second, on the premise that no detections are missed, improving the accuracy of outer envelope localization as far as possible, to give the surgeon better warning guidance and help.
  • The method for intelligent detection and early warning of the outer envelope in prostate surgery proposed by the present invention includes the following steps:
  • Data collection: collect outer envelope image data from prostate surgery video recordings.
  • Second image preprocessing: use deep bilateral learning to enhance the outer envelope image produced by the first image preprocessing.
  • Neural network training: perform feature extraction and network training on the outer envelope image after the second image preprocessing to generate a trained detection model.
  • Detection and early warning: collect real-time dynamic images from the prostate surgery site video; the dynamic images pass through the first and second image preprocessing and are input to the detection model; when the detection model detects an outer envelope feature target, an alarm message is output.
  • Preferably, a data amplification step is included before step 2).
  • The training samples all come from prostate surgery video recordings, so it is inevitable that some captured images have indistinct or redundant features. Moreover, the video data is limited, and in application the differing habits and operating techniques of different surgeons mean the outer envelope will inevitably appear at different angles and in various shapes. The present invention therefore uses an "amplifier" to increase the number of images.
  • Step 4) is implemented on the YOLOv2 platform with the MobileNet deep learning model. Because the detection and warning system must run on an embedded device integrated with the surgical host, the greatest advantage of the mobilenet + YOLOv2 combination is that real-time performance is well guaranteed while balancing speed and accuracy, meeting the practical requirements of assisted early warning in prostate surgery.
  • The specific steps of step 3) include the following. The low-resolution stream is divided into a local path and a global path: the local path uses fully convolutional layers to learn the local features of the image data, while the global path uses convolutional layers and fully connected layers to learn the global features of the image; the outputs of the two paths are then fused into a common set of fusion features.
  • The specific steps of the data amplification step are: import the module, instantiate the pipeline object, and specify the directory containing the images to be processed; define the data augmentation operations, including perspective, angle deviation, shearing, elastic deformation, brightness, contrast, color, rotation, and cropping, and add them to the pipeline; call the pipeline's sample function to specify the total number of samples after augmentation.
  • The specific steps of step 4) include: 4.1) pre-training; 4.2) feature extraction; 4.3) bounding box prediction; 4.4) classification.
  • The invention also provides an intelligent detection and early warning system for the outer envelope in prostate surgery, comprising an image acquisition module, an image processing module, and an image detection and early warning module. The image acquisition module collects and stores image information and models; the image processing module performs the first and second image preprocessing on the collected image data; the image detection and early warning module performs network training on the processed images to generate a trained detection model, then inputs the image to be detected into the detection model to obtain detection and early warning results.
  • The image acquisition module includes a digital video interface for connecting to an endoscope, an image data memory for storing real-time image data during surgery, and an image model memory for storing processed images and deep-learning models.
  • the image processing module includes a data amplification component, an image feature extraction component and an image enhancement component.
  • the image detection and early warning module includes an image depth training component and an image detection and early warning component.
  • The working process of the present invention is as follows: first, extract a certain number of prostate outer envelope images from surgical video recordings; second, if too few envelope images are extracted, use data augmentation to increase their number; third, use PCA to extract image features as the first image preprocessing step; then use deep bilateral learning to apply a second preprocessing to images with indistinct features, and afterwards train on the images with mobilenet+YOLOv2; finally, perform outer envelope image target detection on the real-time surgical video shown on the monitor.
  • During surgery, the endoscope tracks the instrument's operating site through its probe to obtain a visual image of the operating region. Because patients' positions differ and surgeons' habitual techniques differ, the outer envelope image inevitably appears at different angles and in various shapes.
  • Data expansion can greatly enrich the original data set and avoid overfitting during deep learning, achieving better detection results.
  • FIG. 1 is a structural block diagram of a system for an intelligent detection and early warning method of an outer envelope in prostate surgery of the present invention.
  • FIG. 2 is a working flowchart of a method for intelligent detection and early warning of an outer envelope in prostate surgery of the present invention.
  • FIG. 3 is a detection effect diagram of an intelligent detection and early warning method of an outer envelope in prostate surgery of the present invention.
  • the invention is mainly for real-time early warning and recognition of the outer envelope image in the video image of minimally invasive prostate surgery.
  • the early warning system mainly includes an image acquisition module, an image processing module, and an image detection and early warning module.
  • Image acquisition module: collects and stores image information and models. It contains an adapter interface connected to the digital video interface (DVI) of the endoscopic imaging device, an image data memory, and an image model memory. The adapter interface converts the 1920×1200p/60Hz CVT-RB video stream output by the endoscope's digital video interface into a 1920×1080p/60Hz RGB24 video stream and feeds it into the management machine running the early warning analysis system.
  • The image data memory buffers real-time video data of the surgical images; the buffer space can be sized for 1080p (or 720p) image quality.
  • The image model memory stores the preprocessed images and the deep-learning-trained model.
  • the image processing module is used to perform the first image preprocessing and the second image preprocessing on the collected image data. It contains a data amplification component, an image feature extraction component, and an image enhancement component.
  • the data amplification component implements operations such as rotation, stretching, elastic deformation, and cropping on the marked outer envelope image.
  • the image feature extraction component realizes the feature acquisition of the outer envelope picture based on principal component analysis, and extracts a total of 300 feature values, including the following functions:
  • Eigenvalue decomposition is a good method for extracting matrix features, but it applies only to square matrices. Most matrices encountered in practice are not square; singular value decomposition, however, can describe the important characteristics of such general matrices. Any m×n matrix has a singular value decomposition, splitting it into the product of three matrices. Singular value decomposition can thus represent a relatively complex matrix as the product of several smaller, simpler sub-matrices that describe the important characteristics of the original matrix.
  • Since the singular vectors are ordered by singular value, the axis with the largest variance is the first singular vector and the axis with the second-largest variance is the second singular vector. The most important key features of the grayscale image can therefore be obtained from the singular value decomposition.
  • the image enhancement component implements image enhancement on some dark pictures, and determines the final training data set, including the following functions:
  • The fused features are treated as a bilateral grid unfolded along the third dimension. Image enhancement usually depends not only on local image features but also on global ones, such as the histogram, average intensity, and even scene category; our low-resolution stream is therefore further divided into a local path and a global path, and the architecture merges the two paths to produce the final coefficients representing the affine transformations.
  • The low-resolution stream's input image is resized to 256×256 and first processed by a series of convolutional layers that extract low-level features and reduce spatial resolution. The final low-level features are then processed by two asymmetric paths: one path is fully convolutional and specializes in learning local features of the image data while retaining spatial information; the second uses convolutional and fully connected layers to learn global features. Finally the outputs of the two paths are fused into a common feature set, and a pointwise linear layer outputs the final array A from the fused stream, called a bilateral grid of affine coefficients.
  • the image detection and early warning module is used to perform network training on the processed images to generate a trained detection model, and then input the image to be detected into the detection model to obtain detection and early warning results, which includes an image depth training component and an image detection and early warning component.
  • the image depth training component consists of the following functions:
  • Pre-train with the finalized data set: first train the network from scratch with 224×224 input for about 160 epochs (looping over all training data 160 times); then adjust the input to 448×448 and train for 10 more epochs.
  • Mobilenet is a lightweight deep network model proposed mainly for mobile applications. It mainly uses depthwise separable convolution to factor the standard convolution kernel and reduce the amount of computation; this network is adopted so the deep network can be deployed on embedded devices.
  • YOLOv2 (the second version of YOLO) is used for classification.
  • Although the mobilenet+YOLOv2 deep training network can achieve fast real-time detection, its detection accuracy alone is not high. We therefore expand the data before detection, extract features by principal component analysis, and use deep bilateral learning to enhance the dark images with indistinct features, finally achieving a balance between speed and accuracy.
  • The image detection and early warning component performs real-time detection, recognition, and early warning on the prostate surgery video images with the trained weights.
  • To accelerate detection, a neural network compute stick is used.
  • The Movidius Neural Compute Stick (NCS) is used; its most notable feature is delivering more than 100 billion floating-point operations per second at a power of 1 watt.
  • The steps are as follows: first, prepare the Mobilenet+Yolo deep neural network model, already trained on the caffe deep learning platform, together with the test data set.
  • The test data set for the video detection task is real-time video.
  • Second, compile the Caffe model into the NCS-specific graph file using the mvNCCompile tool provided by the NCS SDK; third, call the Python API provided by the NCS SDK to run the compiled neural network model on the compute stick, importing the mvnc module to invoke the stick for inference. When a detection's classification score reaches 94% or more, the system immediately issues an early warning signal.
  • The "amplifier" provides many image-processing classes, with operations including perspective, angle deviation, shear, elastic deformation, brightness, contrast, color, rotation, and cropping. It uses a pipeline-based processing style in which different operations are added to the pipeline in sequence to form the final operation pipeline. Its use divides into three steps:
  • 1) import the related modules, instantiate a pipeline object, and specify the directory containing the pictures to be processed; 2) define the augmentation operations and add them to the pipeline; 3) call the pipeline's sample function to specify the total number of augmented samples.
  • The expanded data set, built from the limited original image data, avoids overfitting during deep learning training and thus achieves better detection results.
  • First image preprocessing: gray-scale processing and singular value decomposition of the outer envelope data to extract the image's principal component feature values.
  • The present invention uses the dimensionality-reduction approach of principal component analysis to process the pictures and extract their main key features.
  • the advantage of this is that on the one hand, it reduces the model training time; on the other hand, it improves the location accuracy of detection and recognition.
  • the steps are: 1) Load the image. 2) Obtain the gray value of the image. 3) Perform singular value decomposition on the grayscale image.
  • The principal component analysis problem is a basis transformation, that is, a transformation from one matrix to another such that the transformed data has the largest variance.
  • the size of the variance describes the amount of information of a variable.
  • the direction with large variance is the direction of the signal, and the direction with small variance is the direction of noise.
  • Principal component analysis sequentially finds a set of mutually orthogonal coordinate axes in the original space: the first axis is the direction that maximizes the variance; the second axis is the direction of maximal variance within the plane orthogonal to the first; the third axis is the direction of maximal variance orthogonal to the first two; and so on.
  • In an n-dimensional space, n such coordinate axes can be found; taking the first r approximates the space, compressing the n-dimensional space into an r-dimensional one. The r coordinate axes should be chosen so that the loss of data during compression is minimal.
  • Each row of the matrix represents a sample and each column a feature. In matrix language, with A the original m×n image matrix and P an n×n transformation matrix that maps the n-dimensional space into another n-dimensional space (performing rotations, stretches, and other changes):

    Y(m×n) = A(m×n) · P(n×n)

  • Keeping only r (r < n) columns of P transforms the original samples with n features into samples with only r features; these r features are a refinement and compression of the original n. Compressing the original image through an n×r transformation matrix, whose columns are the selected feature vectors after sorting, gives the dimensionality-reduced matrix:

    Y(m×r) = A(m×n) · P(n×r)
  • The singular vectors obtained by singular value decomposition are likewise ordered by singular value from large to small; from the principal component analysis viewpoint, the axis with the largest variance is the first singular vector, and the axis with the second-largest variance is the second singular vector. The singular value decomposition formula is:

    A(m×n) = U(m×r) · E(r×r) · V^T(r×n)

  • Here A is an m×n matrix; U is an m×r matrix called the left singular vectors, whose column vectors are orthogonal; E is an r×r diagonal matrix whose off-diagonal elements are all 0 and whose diagonal values are called the singular values; and V^T (the transpose of V) is an r×n matrix called the right singular vectors, whose row vectors are also orthogonal.
  • Second image preprocessing: use deep bilateral learning to enhance the outer envelope image produced by the first image preprocessing.
  • Low-level feature extraction performs n_s convolutional layers on the low-resolution copy of the image; each layer convolves the previous layer's output and feeds the result into the activation function, producing the feature maps of the low-resolution image:

    S_c^i[x, y] = σ( b_c^i + Σ_{x',y',c'} w_{c,c'}^i[x', y'] · S_{c'}^{i-1}[s·x + x', s·y + y'] )

    where s is the stride of the convolution layer and i is the convolution layer index; x', y' are the pixel coordinates before convolution and x, y those after; c and c' index the channels; w is the convolution kernel weight matrix; and b is the bias.
  • The activation function σ is the ReLU, and the convolution is zero-padded: since convolution shrinks the image, pixels initialized to 0 are added around the periphery of the original picture to maintain, to some extent, the scale of the convolved image.
  • n_s is the maximum value of the convolution layer index i above. It serves two purposes: it balances learning on the low-resolution input against learning of the affine coefficients in the final grid (the larger n_s, the coarser the grid), and it controls the complexity of the prediction, since deeper layers yield more complex and abstract features.
  • Here n_s is set to 4, and the convolution kernel size is 3×3.
  • The low-resolution stream is then divided into a local path and a global path: the local path uses fully convolutional layers to learn the local features of the image data, while the global path uses convolutional layers and fully connected layers to learn the global features of the image. The outputs of the two paths are then fused into a common set of fused features.
  • The stride here is set to 1, meaning the resolution no longer changes in this part and the number of channels also stays constant; together with the convolution layers used for low-level feature extraction, the total is n_s + n_L layers.
  • A can be seen as a 16×16×8 bilateral grid, each cell of which holds a 3×4 affine color transformation matrix.
  • This conversion lets the preceding feature extraction operate in the bilateral domain: convolutions in the x and y dimensions learn features in which the z and c dimensions blend with each other. The earlier feature extraction is therefore more expressive than applying 3D convolution in a bilateral grid, since the latter can only relate slices along the z dimension.
  • It is also more efficient than a general bilateral grid, because it only needs to handle discretization along the c dimension. In short, by using 2D convolutions and treating only the last layer as a bilateral grid, the network can learn the optimal way to convert from 2D to 3D.
  • A_c[i, j, k] denotes the bilateral grid coefficients obtained from the low-resolution image, with i, j, k its three dimensions, and Ā_c[x, y] the coefficients in high-resolution space obtained by upsampling A_c[i, j, k]:

    Ā_c[x, y] = Σ_{i,j,k} τ(s_x·x − i) · τ(s_y·y − j) · τ(d·g[x, y] − k) · A_c[i, j, k]

    where τ(·) = max(1 − |·|, 0) denotes linear interpolation, s_x and s_y are the width and height ratios between the grid and the full-resolution original image, and d is the grid depth.
  • Each pixel is thus assigned a coefficient vector, the coefficients of the affine transformation above; the pixel's depth in the grid is determined by the image gray value g[x, y], i.e. A_c[x, y, g[x, y]], so the guide map is used to interpolate the grid.
  • The slicing is done using the OpenGL library. Through this operation the edges of the output image follow the edges of the input image, achieving an edge-preserving effect.
  • The guide map is obtained from the three channels of the original image through a learned pointwise transform:

    g[x, y] = b + Σ_c ρ_c( M_c^T · I_c[x, y] + b'_c )

    where M_c^T are the rows of a learned color transformation matrix, and ρ_c is a piecewise linear transfer function with thresholds t_{c,i} and slopes a_{c,i}, built from 16 ReLU activation units:

    ρ_c(x) = Σ_{i=0..15} a_{c,i} · max(x − t_{c,i}, 0)

  • The parameters M, a, t, b, b' are all obtained through learning.
  • Neural network training: perform feature extraction and network training on the outer envelope image after the second image preprocessing to generate a trained detection model. The specific steps include:
  • YOLOv2 divides pre-training into two steps: first train the network from scratch with 224×224 input for about 160 epochs (looping over all training data 160 times); then adjust the input to 448×448 and train for 10 more epochs.
  • The training structure adopted by the present invention uses MobileNet for feature extraction.
  • The core idea of MobileNet is to decompose the standard convolutional layer into two convolutional layers: a sub-channel (depthwise) convolution and a single-pixel (pointwise 1×1) convolution.
  • The sub-channel convolution uses M convolution kernels to generate M feature maps, and the single-pixel convolution linearly combines those feature maps.
  • The calculation of a MobileNet convolutional layer can thus be divided into two steps:
  • Sub-channel convolution: for each input channel, one D_K×D_K×1 convolution kernel is applied; M kernels in total give M feature maps of size D_F×D_F×1. These feature maps come from different input channels and are independent of one another. Single-pixel convolution: 1×1 kernels then combine the M feature maps across channels.
  • Compared with standard convolution, the MobileNet scheme saves roughly 8 to 9 times the computation (see the sketch below), which effectively reduces the parameter count of the Yolo algorithm, cuts the computation, and further guarantees the real-time performance of the early warning function.
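The 8- to 9-fold figure follows from the standard depthwise-separable cost analysis; a sketch in the D_K, D_F, M notation above, where N denotes the number of output channels (N is not defined in the text and is an assumption here):

```latex
% Standard convolution: M input channels, N output channels,
% D_K x D_K kernels, D_F x D_F output feature map.
C_{std} = D_K \cdot D_K \cdot M \cdot N \cdot D_F \cdot D_F
% Depthwise separable: one D_K x D_K kernel per input channel,
% followed by a 1x1 pointwise convolution across channels.
C_{dws} = D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + M \cdot N \cdot D_F \cdot D_F
% Ratio of the two costs:
\frac{C_{dws}}{C_{std}} = \frac{1}{N} + \frac{1}{D_K^2}
% For D_K = 3 and moderately large N this is close to 1/9,
% i.e. the 8-9x saving cited above.
```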
  • YOLOv2's "anchor boxes" are obtained by clustering: the training samples are analyzed and the most frequent shapes are taken as the anchor boxes. Since this data comes from the training samples, predictions made per grid cell against these priors cover the most likely cases, so the recall rate is relatively high. YOLOv2 uses the anchor boxes to predict the bounding boxes.
  • YOLOv2 performs target detection by dividing the image into grid cells, each responsible for detecting part of the picture, with 5 anchor boxes per cell. For each anchor box YOLOv2 predicts four coordinate values (t_x, t_y, t_w, t_h); given the offset (c_x, c_y) of the cell from the image's upper-left corner and the previously obtained anchor width p_w and height p_h, the bounding box is:

    b_x = σ(t_x) + c_x
    b_y = σ(t_y) + c_y
    b_w = p_w · e^(t_w)
    b_h = p_h · e^(t_h)

    where σ is the logistic (sigmoid) function.
  • YOLOv2 predicts an objectness score for each bounding box through logistic regression: if a predicted bounding box overlaps the ground-truth box more than all other predictions, its target value is 1; if the overlap does not reach a threshold (YOLOv2's default is 0.5), the predicted bounding box is ignored and contributes no loss.
  • The output vector of the YOLOv2 neural network has size 13×13×30: 13×13 divides the picture into 13 rows and 13 columns (169 cells in total), and 30 is the number of data values per cell; a decoding sketch follows.
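As an illustration of how the 13×13×30 output could be decoded, here is a minimal sketch. It assumes the 30 values per cell are laid out as 5 anchor boxes × (t_x, t_y, t_w, t_h, objectness, one class score) for the single outer-envelope class; the actual tensor layout of a given implementation may differ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_yolov2(output, anchors, conf_thresh=0.5):
    """Decode a (13, 13, 30) YOLOv2 output into boxes in grid units.

    anchors: five (p_w, p_h) prior sizes, in grid units (assumed layout).
    """
    boxes = []
    for cy in range(13):
        for cx in range(13):
            cell = output[cy, cx].reshape(5, 6)  # 5 anchors x 6 values
            for (p_w, p_h), (tx, ty, tw, th, obj, cls) in zip(anchors, cell):
                score = sigmoid(obj) * sigmoid(cls)
                if score < conf_thresh:
                    continue
                bx = cx + sigmoid(tx)   # box center, offset from cell corner
                by = cy + sigmoid(ty)
                bw = p_w * np.exp(tw)   # size scaled from the anchor prior
                bh = p_h * np.exp(th)
                boxes.append((bx, by, bw, bh, score))
    return boxes
```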
  • Detection and early warning: collect real-time dynamic images from the prostate surgery site video; the dynamic images pass through the first and second image preprocessing and are input to the detection model; when the detection model detects an outer envelope feature target, an alarm message is output.
  • the management machine reads the endoscope equipment through a dedicated video adapter card to output real-time video;
  • the real-time video is analyzed by the detection and early warning module, and the detection result is output in the form of video.
  • the buzzer sounds to alert the doctor;
  • The minimum requirements for the management machine are a Windows 7 or Ubuntu 16.04 operating system, a quad-core i5 CPU, and 8 GB of memory. Equipping it with a graphics processing unit (GPU) that supports deep learning algorithms, or with multiple Movidius neural compute sticks, can further accelerate video processing.
  • GPU: graphics processing unit
  • Movidius: neural compute stick

Abstract

A method and system for intelligent envelope detection and warning in prostate surgery. The method comprises the following steps: (1) acquiring envelope image data from a prostate surgery video; (2) performing gray-scale processing and singular value decomposition on the envelope data and extracting principal component eigenvalues of the image; (3) performing, by means of deep bilateral learning, image enhancement on the envelope image that underwent the preprocessing of step (2); (4) training a neural network; and (5) performing detection and issuing a warning. A state-of-the-art artificial-intelligence image recognition technique is applied to prostate envelope target detection, providing intelligent warnings during prostate surgery. Compared with existing static medical image recognition techniques, the method performs recognition and warning analysis on the dynamic images of a surgery video. Image preprocessing measures such as data augmentation, principal component analysis, and image enhancement strike a balance between speed and accuracy and meet the practical application requirements of auxiliary warnings for prostate surgery.

Description

Method and system for intelligent detection and early warning of the outer envelope in prostate surgery

Technical field
The invention relates to the technical field of artificial-intelligence target detection, and in particular to a method and system for intelligent detection and early warning of the outer envelope in prostate surgery.
Background
In the traditional image processing field, target detection is a popular key technology; widely studied applications include face detection and pedestrian detection. Traditional target detection generally uses a sliding-window framework with three main steps: first, use a sliding window to select candidate regions; second, extract visual features from the candidate regions; third, use a classifier for recognition. A classic algorithm of this kind is the multi-scale deformable part model, which can be regarded as an extension of the "histogram of gradients + support vector machine" approach; its disadvantages are complexity and slow computation, so it cannot support applications with strict real-time requirements.
After target detection based on deep learning was developed, real-time performance improved greatly. In 2013 Region-based Convolutional Neural Networks (R-CNN) appeared, raising the mean average precision (mAP) of detection to 48%; in 2014, after the network structure was modified, the average precision rose to 66%, making it a solution truly usable in industrial applications. Later came faster and more accurate solutions: the Spatial Pyramid Pooling network (SPP-net), Fast Region-based Convolutional Neural Networks (Fast R-CNN), Faster Region-based Convolutional Neural Networks (Faster R-CNN), the Region-based Fully Convolutional Network (R-FCN), unified real-time target detection (You Only Look Once, YOLO), and the Single Shot MultiBox Detector (SSD). Deep-learning target detection algorithms fall into two categories: region-proposal algorithms, including R-CNN, SPP-net, Fast R-CNN, Faster R-CNN, and R-FCN; and end-to-end algorithms such as YOLO and SSD. The latter two, however, suffer from long training times and insufficiently accurate localization.
Literature [1][2][3][4] adopted artificial neural networks, probabilistic neural networks, multi-layer neural networks, support vector machines, and other techniques to solve medical image processing problems. Reference [5] uses a suitable filter as preprocessing to remove noise. Reference [6] built an intelligent model using Principal Components Analysis (PCA) and segmentation. Reference [7] uses gradient vector flow to extract tumor edges in images and a combined PCA and artificial neural network (PCA-ANN) method to detect regions of interest. Reference [8] uses the discrete wavelet transform to obtain features of medical images and PCA to reduce them. Reference [9] likewise uses the discrete wavelet transform to extract features and PCA to reduce them. However, none of these studies considered the algorithms' real-time behavior, so they are unsuitable for minimally invasive plasma bipolar electroresection surgery, which demands high real-time performance. At present the real-time deep-learning target detection algorithms are YOLO and SSD, but for detecting the outer envelope in prostate surgery video they still have problems guaranteeing real-time operation and localizing targets accurately. It is therefore necessary to design a new method that detects and judges the outer envelope in prostate surgery faster and more accurately.
Summary of the invention
In view of the specific needs of early-warning analysis for minimally invasive plasma bipolar electroresection surgery and the current state of medical image processing technology, the present invention proposes a method and system for intelligent detection and early warning of the outer envelope in prostate surgery, focusing on two problems: first, guaranteeing real-time outer envelope detection based on video images of the operation site; second, on the premise that no detections are missed, improving the accuracy of outer envelope localization as far as possible, to give the surgeon better warning guidance and help.
The method for intelligent detection and early warning of the outer envelope in prostate surgery proposed by the present invention includes the following steps:
1) Data collection: collect outer envelope image data from prostate surgery video recordings;
2) First image preprocessing: perform gray-scale processing and singular value decomposition on the outer envelope data, extracting an outer envelope image carrying the principal component feature values;
3) Second image preprocessing: use deep bilateral learning to enhance the outer envelope image produced by the first image preprocessing;
4) Neural network training: perform feature extraction and network training on the outer envelope image after the second image preprocessing, generating a trained detection model;
5) Detection and early warning: collect real-time dynamic images from the prostate surgery site video; the dynamic images pass through the first and second image preprocessing and are input to the detection model; when the detection model detects an outer envelope feature target, an alarm message is output.
Preferably, a data amplification step is included before step 2). The training samples all come from prostate surgery video recordings, so it is inevitable that some captured images have indistinct or redundant features. Moreover, the video data is limited, and in application the differing habits and operating techniques of different surgeons mean the outer envelope will inevitably appear at different angles and in various shapes. The present invention therefore uses an "amplifier" to increase the number of images.
Preferably, step 4) is implemented on the YOLOv2 platform with the MobileNet deep learning model. Because the detection and warning system must run on an embedded device integrated with the surgical host, the greatest advantage of the mobilenet+YOLOv2 combination is that real-time performance is well guaranteed while balancing speed and accuracy, meeting the practical requirements of assisted early warning in prostate surgery.
Preferably, the specific steps of step 3) include:
3.1) converting the high-resolution input image into a low-resolution stream;
3.2) dividing the low-resolution stream into a local path and a global path, where the local path uses fully convolutional layers to learn local features of the image data and the global path uses convolutional layers and fully connected layers to learn global features of the image, then fusing the outputs of the two paths into a common set of fusion features;
3.3) treating the fused features as a bilateral grid unfolded along the third dimension and outputting a bilateral grid of affine coefficients;
3.4) upsampling the bilateral grid of affine coefficients through a single-channel guide map;
3.5) applying the affine transformation to the fused features and outputting at full resolution.
Preferably, the specific steps of the data amplification step are: import the module, instantiate the pipeline object, and specify the directory containing the images to be processed; define the data augmentation operations, including perspective, angle deviation, shearing, elastic deformation, brightness, contrast, color, rotation, and cropping, and add them to the pipeline; call the pipeline's sample function to specify the total number of samples after augmentation, as sketched below.
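The pipeline described here maps naturally onto the Augmentor Python library (a plausible reading of the "amplifier"; the library choice, directory name, probabilities, and sample count below are illustrative assumptions):

```python
import Augmentor

# 1) Instantiate a pipeline over the directory of labeled envelope images.
p = Augmentor.Pipeline("data/envelope_images")

# 2) Define augmentation operations and add them to the pipeline.
p.skew(probability=0.5)                          # perspective / angle deviation
p.shear(probability=0.5, max_shear_left=10, max_shear_right=10)
p.random_distortion(probability=0.5, grid_width=4,
                    grid_height=4, magnitude=4)  # elastic deformation
p.random_brightness(probability=0.5, min_factor=0.7, max_factor=1.3)
p.random_contrast(probability=0.5, min_factor=0.7, max_factor=1.3)
p.random_color(probability=0.5, min_factor=0.7, max_factor=1.3)
p.rotate(probability=0.7, max_left_rotation=15, max_right_rotation=15)
p.crop_random(probability=0.3, percentage_area=0.9)

# 3) Specify the total number of augmented samples to generate.
p.sample(10000)
```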
Preferably, the specific steps of step 4) include: 4.1) pre-training; 4.2) feature extraction; 4.3) bounding box prediction; 4.4) classification.
The present invention also provides an intelligent detection and early warning system for the outer envelope in prostate surgery based on the above method, comprising an image acquisition module, an image processing module, and an image detection and early warning module. The image acquisition module collects and stores image information and models; the image processing module performs the first and second image preprocessing on the collected image data; the image detection and early warning module performs network training on the processed images to generate a trained detection model, then inputs the image to be detected into the detection model to obtain detection and early warning results.
Further, the image acquisition module includes a digital video interface for connecting to an endoscope, an image data memory for storing real-time image data during surgery, and an image model memory for storing processed images and deep-learning models.
Furthermore, the image processing module includes a data amplification component, an image feature extraction component, and an image enhancement component.
Furthermore, the image detection and early warning module includes an image depth training component and an image detection and early warning component.
The working process of the present invention is as follows: first, extract a certain number of prostate outer envelope images from surgical video recordings; second, if too few envelope images are extracted, use data augmentation to increase their number; third, use PCA to extract image features as the first image preprocessing step; then use deep bilateral learning to apply a second preprocessing to images with indistinct features, and afterwards train on the images with mobilenet+YOLOv2; finally, perform outer envelope image target detection on the real-time surgical video shown on the monitor.
The beneficial effects of the present invention are:
1) During the operation, the endoscope tracks the instrument's operating site through its probe to obtain a visual image of the operating region. Because patients' positions differ and surgeons' habitual techniques differ, the outer envelope image inevitably appears at different angles and in various shapes. Data expansion can greatly enrich the original data set and avoid overfitting during deep learning, achieving better detection results.
2) If a target has too many feature values in an image, localization becomes imprecise. In addition, the texture, color, and other characteristics of the outer envelope are similar to those of some polyp tissue and must be observed carefully to distinguish. Preprocessing the images with principal component analysis effectively selects the key image features, which on the one hand reduces deep learning training time and on the other optimizes the detection model, yielding more accurate outer envelope localization.
3) Endoscope focusing is operated manually by the surgeon, and the distance between the light source and the subject changes constantly, so some images are inevitably unclear. Using image enhancement together with the principal component analysis preprocessing described above makes the characteristic parts of dark images more distinct, so features are better extracted during training. Adding image enhancement to the detection process also effectively improves recognition accuracy.
4) Since the detection and early warning system must run on an embedded device integrated with the surgical host, the mobilenet+YOLOv2 combination is adopted; its greatest advantage is well-guaranteed real-time performance, while its drawback is lower detection accuracy. Image preprocessing measures such as data expansion, principal component analysis, and image enhancement restore the balance between speed and accuracy, meeting the practical requirements of assisted early warning in prostate surgery.
Brief description of the drawings
FIG. 1 is a structural block diagram of the system for the intelligent detection and early warning method of the outer envelope in prostate surgery of the present invention.
FIG. 2 is a working flowchart of the method for intelligent detection and early warning of the outer envelope in prostate surgery of the present invention.
FIG. 3 is a detection effect diagram of the intelligent detection and early warning method of the outer envelope in prostate surgery of the present invention.
Detailed description
The present invention is further described in detail below with reference to the drawings and an embodiment, but the embodiment should not be construed as limiting the invention.
The invention mainly performs real-time early warning and recognition of the outer envelope image in the video images of minimally invasive prostate surgery. As shown in Figure 1, the early warning system mainly includes an image acquisition module, an image processing module, and an image detection and early warning module.
The image acquisition module collects and stores image information and models. It contains an adapter interface connected to the digital video interface (DVI) of the endoscopic imaging device, an image data memory, and an image model memory. The adapter interface converts the 1920×1200p/60Hz CVT-RB video stream output by the endoscope's digital video interface into a 1920×1080p/60Hz RGB24 video stream and feeds it into the management machine running the early warning analysis system; the image data memory buffers real-time video data of the surgical images, with buffer space sized for 1080p (or 720p) image quality; the image model memory stores the preprocessed images and the deep-learning-trained model.
The image processing module performs the first and second image preprocessing on the collected image data. It contains a data amplification component, an image feature extraction component, and an image enhancement component.
The data amplification component applies rotation, stretching, elastic deformation, cropping, and similar operations to the labeled outer envelope images.
The image feature extraction component acquires outer envelope image features based on principal component analysis, extracting 300 feature values in total, and includes the following functions:
1) Gray-scale processing of the collected outer envelope images. The color of each pixel in a color image is determined by three components, R, G, and B, each of which can take 255 values, so a pixel can range over more than 16 million (255×255×255) colors. A grayscale image is a special color image in which the R, G, and B components are equal, so each pixel ranges over only 255 values; in digital image processing, images of various formats are therefore generally converted to grayscale first to reduce the computation in subsequent steps.
2) Singular value decomposition of the grayscale image. Eigenvalue decomposition is a good method for extracting matrix features, but it applies only to square matrices, and most matrices in practice are not square; singular value decomposition, however, can describe the important characteristics of such general matrices. Any m×n matrix has a singular value decomposition, splitting it into the product of three matrices; a relatively complex matrix can thus be represented by the product of several smaller, simpler sub-matrices that describe its important characteristics. Since the singular vectors are ordered by singular value from large to small, from the principal component analysis viewpoint the axis with the largest variance is the first singular vector and the axis with the second-largest variance is the second singular vector. The most important key features of the grayscale image can therefore be obtained from the singular value decomposition.
3) Regeneration and saving of the outer envelope image. From each 300×300 image, 300 features are extracted; a sketch of this preprocessing follows.
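A minimal sketch of this grayscale + SVD preprocessing is given below; it assumes the image is regenerated from its leading singular components, and the file names are illustrative:

```python
import numpy as np
from PIL import Image

def svd_preprocess(path, k=300, out_path="envelope_pca.png"):
    # 1) Load the image and convert it to grayscale.
    gray = np.asarray(Image.open(path).convert("L"), dtype=np.float64)

    # 2) Singular value decomposition of the grayscale matrix.
    U, s, Vt = np.linalg.svd(gray, full_matrices=False)

    # 3) Keep the k largest singular values/vectors (the key features)
    #    and regenerate the image from them.
    k = min(k, len(s))
    approx = (U[:, :k] * s[:k]) @ Vt[:k, :]
    Image.fromarray(np.clip(approx, 0, 255).astype(np.uint8)).save(out_path)
    return s[:k]  # the extracted feature values

# On a 300x300 image, k=300 corresponds to the 300 features mentioned above.
features = svd_preprocess("envelope_0001.png", k=300)
```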
The image enhancement component enhances the darker images and determines the final training data set, and includes the following functions:
1) Feature extraction from low-resolution images. Converting the high-resolution input image to low resolution and performing most of the learning and training at low resolution saves a large amount of computation and allows fast model evaluation. Most of the inference is performed in the low-resolution stream on a low-resolution copy of the input image I, finally predicting local affine transformations in a representation similar to a bilateral grid.
2) The fused features as a bilateral grid unfolded along the third dimension. Image enhancement usually depends not only on local image features but also on global ones, such as the histogram, average intensity, and even scene category; the low-resolution stream is therefore further divided into a local path and a global path, and the architecture merges the two paths to produce the final coefficients representing the affine transformations. The low-resolution stream's input image is resized to 256×256 and first processed by a series of convolutional layers that extract low-level features and reduce spatial resolution. The final low-level features are then processed by two asymmetric paths: one fully convolutional path learns local features of the image data while retaining spatial information; the second uses convolutional and fully connected layers to learn global features. Finally the outputs of the two paths are fused into a common feature set, and a pointwise linear layer outputs the final array A from the fused stream, called a bilateral grid of affine coefficients.
3) Upsampling with trainable slicing. A layer based on a bilateral-grid slicing operation transforms the information from the previous step into high-resolution space. This layer takes the single-channel guide map g and the feature map A (viewed as a bilateral grid) as inputs and performs a data lookup in A; the slicing operator upsamples by trilinearly interpolating the coefficients of A at positions defined by g, and outputs a new feature map whose spatial resolution equals that of g. The slicing is done with OpenGL (Open Graphics Library); through this operation the edges of the output image follow the edges of the input image, achieving an edge-preserving effect.
4) Producing the final output at full resolution. For the input image I, features are extracted whose purposes are, first, to obtain the guidance map and, second, to serve as the regression input for the full-resolution local affine model obtained above. The guidance map is obtained by applying per-channel operations to the three channels of the original image and summing the results; the final output can be regarded as the result of applying the affine transforms to the input features.
The image detection and early warning module performs network training on the processed images to produce a trained detection model, then feeds the images to be examined into the detection model to obtain detection and warning results. It contains an image deep-training component and an image detection and early warning component.
The image deep-training component consists of the following functions:
1) Pre-training with the finalized data set. The network is first trained from scratch with 224×224 input for about 160 epochs (all training data cycled 160 times); the input is then enlarged to 448×448 and the network is trained for 10 more epochs.
2) Feature extraction on the preprocessed outer envelope images with MobileNet to generate feature maps. MobileNet is a lightweight deep network model designed primarily for mobile platforms. Its key idea is the depthwise separable convolution, which factors a standard convolution into cheaper operations and thus reduces the amount of computation. This network was chosen so that the deep network can be deployed on embedded devices.
3) Classification with YOLOv2 (the second version of YOLO) after feature extraction. Although a deep network based on MobileNet+YOLOv2 supports fast real-time detection, its detection accuracy is limited. We therefore augmented the data before detection, extracted features with principal component analysis, and enhanced the darker images with indistinct features using deep bilateral learning, ultimately striking a balance between speed and accuracy.
The image detection and early warning component performs real-time detection, recognition, and warning on the prostate surgery video images using the trained weights. To accelerate detection, a neural network compute stick is used: the Movidius Neural Compute Stick (NCS), whose chief characteristic is that it delivers more than 100 billion floating-point operations per second at a power of about 1 watt. The steps are as follows. First, prepare the MobileNet+YOLO deep neural network model already trained on the Caffe deep learning platform, together with the test data set; for the video detection task the test data are real-time video. Second, compile the Caffe model into a graph file specific to the compute stick with the mvNCCompile tool provided by the NCS SDK. Third, call the Python API provided by the NCS SDK to run the compiled network model on the compute stick; the mvnc module is imported to invoke the stick for inference. When the detected classification score exceeds 94%, the system immediately issues an early warning signal.
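For illustration, the inference loop on the compute stick might look like the following minimal sketch, assuming the NCSDK v1 Python API (mvnc) and a graph file already produced by mvNCCompile; the file name, capture device, 416×416 input size, and the decoding and alarm stubs are illustrative, not taken from the original.

```python
import numpy as np
import cv2
from mvnc import mvncapi as mvnc

def decode_score(out):
    # Placeholder: a real system would decode YOLO boxes and take the best class score.
    return float(out.max())

def trigger_alarm():
    print("WARNING: outer envelope detected")  # stands in for the buzzer

GRAPH_PATH = 'mobilenet_yolo.graph'  # hypothetical file produced by mvNCCompile

# Open the first attached Neural Compute Stick and load the compiled graph.
devices = mvnc.EnumerateDevices()
device = mvnc.Device(devices[0])
device.OpenDevice()
with open(GRAPH_PATH, 'rb') as f:
    graph = device.AllocateGraph(f.read())

cap = cv2.VideoCapture(0)  # live endoscope feed, assumed on capture device 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    img = cv2.resize(frame, (416, 416)).astype(np.float16) / 255.0
    graph.LoadTensor(img, None)      # upload the frame to the stick
    out, _ = graph.GetResult()       # download the predictions
    if decode_score(out) >= 0.94:    # the 94% threshold from the text
        trigger_alarm()

graph.DeallocateGraph()
device.CloseDevice()
```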
The method for intelligent detection and early warning of the outer envelope in prostate surgery proposed by the present invention includes the following steps:
1) Data collection: collect the outer envelope image data from prostate surgery video recordings; the outer envelope image data come from prostate surgery videos, in which the images exhibiting outer envelope features are annotated.
2) Data augmentation: the training samples all come from prostate surgery video recordings. For various reasons, some captured images inevitably have indistinct or redundant features. Moreover, since the video material is limited, the different habits and operating techniques of different surgeons in practice mean that outer envelope images may appear at different angles and in a variety of shapes. An "augmentor" is used to expand the image set. The "augmentor" is a software package for image augmentation that can generate image data for machine learning. Data augmentation is usually a multi-stage process; the "augmentor" uses a pipeline-based approach, adding operations in sequence to form the final operation pipeline. Images are fed into the pipeline, the operations act on them in turn to produce new images, which are then saved. The operations defined in the "augmentor" pipeline are applied to the images randomly according to configured probabilities.
The "augmentor" provides many classes of image-processing functions, with operations including perspective transformation, angular deviation, shearing, elastic deformation, brightness, contrast, color, rotation, and cropping. It adopts a "pipeline"-based approach in which different operations are added to the pipeline in sequence to form the final operation pipeline. The procedure has three main steps (a minimal code sketch is given after this list):
① Import the relevant modules, instantiate a pipeline object, and specify the directory containing the images to be processed;
② Define the data augmentation operations, such as perspective transformation, angular deviation, shearing, elastic deformation, brightness, contrast, color, rotation, and cropping, and add them to the pipeline;
③ Call the pipeline's sample function and specify the total number of augmented samples; whatever the initial sample count, the specified number of samples can be generated.
On the basis of the limited original image data, the expanded data set helps avoid overfitting during deep learning training and thus yields better detection results.
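A minimal sketch of the three steps above, assuming the Python Augmentor package; the directory name, operation parameters, and sample count are illustrative:

```python
import Augmentor

# Step 1: instantiate a pipeline over the directory of envelope images (path is illustrative).
p = Augmentor.Pipeline("data/envelope_images")

# Step 2: add probabilistic operations; each is applied to a given image with its probability.
p.rotate(probability=0.7, max_left_rotation=15, max_right_rotation=15)
p.shear(probability=0.4, max_shear_left=10, max_shear_right=10)
p.skew(probability=0.4, magnitude=0.3)  # perspective-style distortion
p.random_distortion(probability=0.5, grid_width=4, grid_height=4, magnitude=4)  # elastic deformation
p.zoom(probability=0.5, min_factor=1.05, max_factor=1.25)

# Step 3: generate a fixed number of augmented samples regardless of the original count.
p.sample(5000)
```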
3) First image preprocessing: grayscale conversion and singular value decomposition are applied to the outer envelope data to extract the principal component eigenvalues of the image.
If an object in an image has too many feature values, localization becomes imprecise. In addition, the texture, color, and other characteristics of the outer envelope are close to those of some polyp tissue and must be observed carefully to be distinguished. The present invention therefore processes the images with the "dimensionality reduction" of principal component analysis, extracting their main key features. The benefit is twofold: model training time is reduced, and the positional accuracy of detection and recognition is improved. The steps are: 1) load the image; 2) obtain the grayscale values of the image; 3) perform singular value decomposition on the grayscale image.
Principal component analysis is a change of basis, i.e., a transformation from one matrix to another such that the transformed data has maximal variance. The magnitude of the variance describes the information content of a variable: for machine learning data, only directions of large variance are informative. Directions of large variance carry the signal; directions of small variance carry the noise. Put simply, principal component analysis sequentially finds a set of mutually orthogonal axes in the original space: the first axis is the direction that maximizes the variance; the second axis maximizes the variance within the plane orthogonal to the first; the third axis maximizes the variance within the plane orthogonal to the first two. In an n-dimensional space, if n such axes can be found, the first r of them are taken to approximate the space, compressing an n-dimensional space into an r-dimensional one; the r axes should be chosen so that the data loss during compression is as small as possible.
Given an m×n image, represent it as a matrix whose elements are the pixel gray levels, stored by rows and columns, and denote it $A_{m\times n}$. Suppose each row of the matrix represents a sample and each column represents a feature; in matrix language,

$$A_{m\times n}=\begin{pmatrix}a_{11}&\cdots&a_{1n}\\\vdots&\ddots&\vdots\\a_{m1}&\cdots&a_{mn}\end{pmatrix}.\qquad(1)$$
To change the coordinate axes of an m×n matrix A, P is the transformation matrix that maps one n-dimensional space onto another n-dimensional space, applying rotations, stretches, and other spatial changes; $\tilde{A}$ denotes the transformed matrix. That is, A is the original image matrix, and the purpose of principal component analysis is to pass the original image matrix A through a transformation matrix P to obtain the transformed matrix $\tilde{A}$.
Transforming an m×n matrix A into an m×r matrix turns samples that originally had n features into samples with only r (r<n) features; these r features are a distillation and compression of the original n. If we compress the original image in this way, then after multiplication by an n×r transformation matrix we obtain the dimension-reduced matrix $\tilde{A}_{m\times r}$; the columns of this transformation matrix are the eigenvectors selected after sorting. In mathematical notation,

$$\tilde{A}_{m\times r}=A_{m\times n}P_{n\times r}.\qquad(2)$$
The singular vectors produced by singular value decomposition are likewise ordered by singular value from largest to smallest. From the viewpoint of principal component analysis, the axis of largest variance is the first singular vector and the axis of second-largest variance is the second singular vector. The singular value decomposition reads
$$A_{m\times n}\approx U_{m\times r}E_{r\times r}V_{r\times n}^{T},\qquad(3)$$
where A is an m×n matrix; the decomposition yields the three matrices U, E, and $V^T$ (the transpose of V). U is an m×r matrix of left singular vectors whose columns are orthogonal; E is an r×r diagonal matrix whose off-diagonal elements are all 0 and whose diagonal entries are called the singular values; $V^T$ is an r×n matrix of right singular vectors whose rows are likewise orthogonal.
Multiplying both sides of the singular value decomposition by the orthogonal matrix V, formula (3) becomes

$$A_{m\times n}V_{n\times r}\approx U_{m\times r}E_{r\times r}V_{r\times n}^{T}V_{n\times r}=U_{m\times r}E_{r\times r}.\qquad(4)$$
Comparing formula (4) with formula (2), this compresses the columns of the matrix. Similarly, to compress the rows, simply multiply both sides of the singular value decomposition by the transpose of U:

$$U_{r\times m}^{T}A_{m\times n}\approx E_{r\times r}V_{r\times n}^{T}.\qquad(5)$$
Through formulas (4) and (5) we obtain the principal component eigenvalues compressed along both directions. Once the eigenvalues are computed, the eigenvalues of the covariance matrix are arranged in descending order and the eigenvectors are reordered correspondingly; taking the first 300 eigenvectors, the image can be reconstructed to generate a compressed outer envelope image carrying the principal component eigenvalues.
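As a small numpy illustration of this truncated SVD compression (the frame size is a stand-in; the text's choice of 300 retained components is kept):

```python
import numpy as np

# Stand-in for a grayscale endoscope frame; in practice it would be loaded
# with cv2.imread(path, cv2.IMREAD_GRAYSCALE).
gray = np.random.rand(480, 640)

# SVD of formula (3): gray = U @ diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(gray, full_matrices=False)

r = 300                                   # keep the 300 leading components, as in the text
approx = (U[:, :r] * s[:r]) @ Vt[:r, :]   # rank-r reconstruction of the image

# Column compression of formula (4): A @ V_r = U_r @ E_r.
col_compressed = U[:, :r] * s[:r]
print(approx.shape, col_compressed.shape)
```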
4) Second image preprocessing: deep bilateral learning is applied to the outer envelope images from the first preprocessing step to enhance them.
In current minimally invasive prostate surgery, the endoscope is focused manually by the surgeon, and the distance between the light source and the subject changes constantly, so some images are inevitably not very clear. Based on the design requirement of a surgical early warning system, "better a false alarm than a missed detection", and given that outer envelope recognition relies mainly on texture features (color and shape matter less), we use deep bilateral learning to enhance the less clear images so that the features to be detected become more pronounced. This helps both the earlier model training and the later detection and warning. The new network architecture built by this algorithm can perform image enhancement in real time at full HD resolution on mobile devices. The processed results have an HDR (high dynamic range) quality, making the image expressive while preserving edge information, and only limited computation is needed at full resolution. The algorithm can therefore also be used for real-time image enhancement on embedded devices for minimally invasive surgery.
4.1) Feature extraction from low-resolution images. By converting the high-resolution input image to a low resolution and performing most of the learning and training at that low resolution, a large amount of computation is saved and the model can be evaluated quickly. Most of the inference is carried out on a low-resolution copy $\tilde{I}$ of the input image $I$ in the low-resolution stream, which finally predicts local affine transformations in a representation similar to a bilateral grid.
The image is resized to 256×256 and then downsampled through a series of convolutions with stride 2 (stride = 2), according to

$$S_{i}^{c}[x,y]=\sigma\!\Big(b_{i}^{c}+\sum_{x',y',c'}w_{i}^{cc'}[x',y']\,S_{i-1}^{c'}[2x+x',\,2y+y']\Big),\qquad(6)$$
where $S_i$ is the i-th strided convolutional layer and $i=1,\dots,n_s$ is the index of the convolutional layer; x′ and y′ are the horizontal and vertical pixel coordinates before convolution, and x and y those after convolution; c and c′ index the channels of the convolutional layer; w is the convolution kernel weight matrix; and b is the bias. The activation function σ is ReLU, and zero padding is used: since the image shrinks after convolution, padding the border of the original image with pixels initialized to 0 preserves the scale of the convolved image to some extent. The formula expresses an $n_s$-layer operation on the low-resolution copy of the image; each convolutional layer comprises the convolution of its kernels with the image and the feeding of the result into the activation function, which yields the feature maps of the low-resolution image.
The image is actually reduced by a factor of $2^{n_s}$ ($n_s$ being the maximum value of the convolutional layer index i above). $n_s$ has two roles: first, it drives the learning from the low-resolution input and of the affine coefficients in the final grid (the larger $n_s$, the coarser the grid); second, it controls the complexity of the prediction, since deeper layers yield more complex and more abstract features. Here $n_s=4$ and the kernel size is 3×3.
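A rough PyTorch sketch of this low-resolution stream, assuming $n_s=4$ stride-2 3×3 convolutions on a 256×256 input; the channel widths are illustrative:

```python
import torch
import torch.nn as nn

class LowResStream(nn.Module):
    """n_s = 4 stride-2 conv layers: 256x256 input -> 16x16 feature map (256 / 2**4)."""
    def __init__(self, in_ch=3, widths=(8, 16, 32, 64)):
        super().__init__()
        layers, prev = [], in_ch
        for w in widths:  # each layer halves the spatial resolution
            layers += [nn.Conv2d(prev, w, kernel_size=3, stride=2, padding=1),
                       nn.ReLU(inplace=True)]
            prev = w
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

lowres = LowResStream()
feat = lowres(torch.randn(1, 3, 256, 256))
print(feat.shape)  # torch.Size([1, 64, 16, 16])
```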
4.2) The low-resolution stream is split into a local path and a global path. The local path uses fully convolutional layers to learn local features of the image data, while the global path uses convolutional and fully connected layers to learn the global features of the image; the outputs of the two paths are then fused into a common set of fused features.
Local features: the low-resolution features are processed further by the local path $L_i$; that is, the $n_s$-th feature map $S_{n_s}$ obtained from formula (6) is passed through $n_L=2$ further convolutional layers to extract features. Here stride = 1, so the resolution of this part no longer changes; the number of channels also stays the same. Together with the convolutions used in step 4.1), there are $n_s+n_L$ layers in total.
Global features: the global path develops the features of the low-resolution feature map further; this part is denoted $G_i$, with $n_G=5$ layers. The $n_s$-th feature map $S_{n_s}$ obtained in step 4.1) is passed through two convolutional layers and three fully connected layers to extract global features. The global information carried by the global features can serve as a prior for local feature extraction; without a global feature describing a high-dimensional representation of the image information, the network may produce erroneous local features.
A pointwise affine transformation is used to fuse the global and local features, i.e., the local feature map $L_{n_L}$ and the global feature vector $G_{n_G}$ obtained above are combined by an affine addition and activated with the ReLU function. The calculation is as follows, where F denotes the fused feature map:

$$F^{c}[x,y]=\sigma\!\Big(b^{c}+\sum_{c'}w'^{\,cc'}G_{n_G}^{c'}+\sum_{c'}w^{cc'}L_{n_L}^{c'}[x,y]\Big).\qquad(7)$$
This yields a 16×16×64 feature array; feeding it into a 1×1 convolutional layer produces a 16×16 feature map with 96 output channels, computed as

$$A_{c}[x,y]=b_{c}+\sum_{c'}F_{c'}[x,y]\,w_{cc'}.\qquad(8)$$
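A rough PyTorch sketch of the fusion of formulas (7) and (8), assuming a 64-channel 16×16 local map, a 64-dimensional global vector, and 96 output channels; all widths are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FuseLocalGlobal(nn.Module):
    """Formula (7): pointwise affine mix of local map and broadcast global vector,
    then formula (8): a 1x1 conv producing the 96 affine-coefficient channels."""
    def __init__(self, ch=64, out_ch=96):
        super().__init__()
        self.w_local = nn.Conv2d(ch, ch, kernel_size=1, bias=False)
        self.w_global = nn.Linear(ch, ch, bias=True)   # bias plays the role of b^c
        self.pointwise = nn.Conv2d(ch, out_ch, kernel_size=1)

    def forward(self, local_map, global_vec):
        g = self.w_global(global_vec)[:, :, None, None]  # broadcast over x and y
        fused = F.relu(self.w_local(local_map) + g)      # formula (7)
        return self.pointwise(fused)                     # formula (8)

fuse = FuseLocalGlobal()
A = fuse(torch.randn(1, 64, 16, 16), torch.randn(1, 64))
print(A.shape)  # torch.Size([1, 96, 16, 16])
```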
4.3) Treat the fused features as a bilateral grid unrolled along the third dimension and output a bilateral grid of affine coefficients.
Treating the fused features as a bilateral grid whose third dimension has been unrolled, the channels are reinterpreted as

$$A_{c}[x,y,z]=A_{12\,z+c}[x,y],\qquad z=0,\dots,d_{c}-1,\qquad(9)$$

where $d_c=8$ is the depth of the grid. Through this conversion, A can be viewed as a 16×16×8 bilateral grid, each of whose cells holds a 3×4 affine color transformation matrix. The conversion means that the preceding feature extraction and operations take place in the bilateral domain: convolutions performed in the x and y dimensions learn features in which the z and c dimensions blend into each other. The earlier feature extraction is therefore more expressive than convolving inside a bilateral grid with 3D convolutions, which can only relate the z dimension; at the same time it is more efficient than a general bilateral grid, since only the discretization along the c dimension needs attention. In short, by using 2D convolutions and treating the last layer as a bilateral grid, the network can decide the optimal way to convert from 2D to 3D.
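Concretely, the unrolling of formula (9) amounts to a reshape of the 96 channels into depth × coefficients, as in this sketch; the 12 = 3×4 per-cell layout follows the text above:

```python
import torch

A_flat = torch.randn(1, 96, 16, 16)   # output of the pointwise layer, formula (8)
depth, coeffs = 8, 12                 # d_c = 8 grid depth, 12 = 3x4 affine entries

# Formula (9): channel index 12*z + c  ->  grid cell depth z with coefficient c.
A_grid = A_flat.view(1, depth, coeffs, 16, 16)   # (batch, z, c, y, x)
print(A_grid.shape)                              # torch.Size([1, 8, 12, 16, 16])

# Each grid cell holds a 3x4 affine color transform.
cell = A_grid[0, 0, :, 0, 0].view(3, 4)
print(cell.shape)                                # torch.Size([3, 4])
```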
4.4) Upsample the bilateral grid of affine coefficients through a single-channel guidance map.
The output of the previous step is transferred to the high-resolution space of the input by "upsampling" it through a single-channel guidance map. The upsampling of A based on the guidance map g is a trilinear interpolation of the coefficients of A at positions determined by g:

$$\bar{A}_{c}[x,y]=\sum_{i,j,k}\tau\!\left(s_{x}x-i\right)\,\tau\!\left(s_{y}y-j\right)\,\tau\!\left(d_{c}\,g[x,y]-k\right)A_{c}[i,j,k],\qquad(10)$$

where $A_c[i,j,k]$ are the bilateral grid coefficients obtained from the low-resolution image, with i, j, and k indexing its three dimensions, and $\bar{A}_c[x,y]$ are the coefficients in the high-resolution space obtained after upsampling. $\tau(\cdot)=\max(1-|\cdot|,0)$ performs the linear interpolation, and $s_x$ and $s_y$ are the width and height ratios of the grid relative to the full-resolution original image. In particular, each pixel is assigned a coefficient vector (the coefficients of the affine transform above) whose depth in the grid is determined by the image gray value g[x,y], i.e., $A_c[x,y,g[x,y]]$: the grid is interpolated using the guidance map, and after interpolation the depth assigned to each pixel follows the guidance map value at that pixel. Slicing is implemented with the OpenGL library; this operation makes the edges of the output image follow the edges of the input image, achieving the edge-preserving effect.
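A plain numpy sketch of the slicing of formula (10), written as explicit loops for clarity rather than the OpenGL implementation; the grid and image sizes are illustrative:

```python
import numpy as np

def tau(x):
    """Linear interpolation kernel: max(1 - |x|, 0)."""
    return max(1.0 - abs(x), 0.0)

def slice_grid(A, g):
    """A: (gh, gw, gd, nc) bilateral grid; g: (H, W) guidance map in [0, 1].
    Returns per-pixel coefficients of shape (H, W, nc), as in formula (10)."""
    gh, gw, gd, nc = A.shape
    H, W = g.shape
    sy, sx = gh / H, gw / W          # grid-to-image ratios s_y, s_x
    out = np.zeros((H, W, nc))
    for y in range(H):
        for x in range(W):
            gy, gx, gz = sy * y, sx * x, gd * g[y, x]
            for j in range(gh):
                wy = tau(gy - j)
                if wy == 0.0:
                    continue
                for i in range(gw):
                    wx = tau(gx - i)
                    if wx == 0.0:
                        continue
                    for k in range(gd):
                        w = wy * wx * tau(gz - k)
                        if w:
                            out[y, x] += w * A[j, i, k]
    return out

coeffs = slice_grid(np.random.rand(16, 16, 8, 12), np.random.rand(64, 64))
print(coeffs.shape)  # (64, 64, 12)
```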
4.5) Apply the affine transforms to the fused features and output at full resolution.
For the input image I, its features are extracted; their purposes are, first, to obtain the guidance map and, second, to serve as the regression input for the full-resolution local affine model obtained above.
The guidance map is obtained by applying per-channel operations to the three channels of the original image and summing the results:

$$g[x,y]=b+\sum_{c=0}^{2}\rho_{c}\!\left(M_{c}^{T}\,I[x,y]+b'_{c}\right),\qquad(11)$$
where the $M_c^T$ are the rows of a 3×3 color conversion matrix, and b and b′ are biases. $\rho_c$ is a piecewise-linear transfer function with thresholds $t_{c,i}$ and slopes $a_{c,i}$, built from 16 ReLU activation units and computed as

$$\rho_{c}(x)=\sum_{i=0}^{15}a_{c,i}\max\!\left(x-t_{c,i},\,0\right).\qquad(12)$$
The parameters M, a, t, b, and b′ are all obtained through learning.
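For illustration, the piecewise-linear transfer of formula (12) in numpy, with random stand-ins for the learned slopes a and thresholds t:

```python
import numpy as np

def rho(x, a, t):
    """Formula (12): sum of 16 shifted ReLUs with slopes a and thresholds t."""
    # x: array of intensities; a, t: shape-(16,) parameter vectors.
    return np.sum(a[:, None] * np.maximum(x[None, :] - t[:, None], 0.0), axis=0)

a = np.random.randn(16) * 0.1       # stand-in for the learned slopes a_{c,i}
t = np.linspace(0.0, 1.0, 16)       # stand-in thresholds t_{c,i} spread over [0, 1]
x = np.linspace(0.0, 1.0, 5)
print(rho(x, a, t))
```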
Applying to the original image I (identical here to the full-resolution input) the coefficient matrix $\bar{A}$ obtained in the process above, the final output O is computed; it can be regarded as the result of a per-pixel affine transform of the input:

$$O_{c}[x,y]=\bar{A}_{c,3}[x,y]+\sum_{c'=0}^{2}\bar{A}_{c,c'}[x,y]\,I_{c'}[x,y].\qquad(13)$$
5) Neural network training: feature extraction and network training are performed on the outer envelope images from the second preprocessing step, producing the trained detection model. The specific steps include:
5.1) Pre-training

YOLOv2 splits pre-training into two stages: the network is first trained from scratch with 224×224 input for about 160 epochs (all training data cycled 160 times); the input is then enlarged to 448×448 and the network is trained for 10 more epochs.
5.2) Feature extraction

The training structure adopted by the present invention uses MobileNet for feature extraction. The core idea of MobileNet is to factor the standard convolutional layer into two layers: a depthwise (per-channel) convolution and a pointwise (single-pixel) convolution. The depthwise convolution produces M feature maps with M kernels, and the pointwise convolution linearly combines those feature maps.
The computation of a MobileNet convolutional layer has two steps:

Depthwise convolution. Each input channel is convolved with its own $D_K\times D_K\times 1$ kernel; M kernels are used in total, producing M feature maps of size $D_F\times D_F\times 1$. These feature maps come from different input channels and are independent of one another.

Pointwise convolution. The M-channel input from the previous step is convolved with N standard 1×1×M kernels, producing a $D_F\times D_F\times N$ output.
Compared with a standard convolutional layer, the MobileNet convolution saves roughly a factor of 8 to 9 in computation, effectively reducing the parameter count and computational load of the YOLO algorithm and further ensuring the real-time behavior of the early warning function.
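A brief PyTorch sketch of one such depthwise separable block; channel counts are illustrative, and MobileNet additionally inserts batch normalization, omitted here:

```python
import torch
import torch.nn as nn

class DepthwiseSeparable(nn.Module):
    """Standard 3x3 conv factored into depthwise (groups=in_ch) + pointwise 1x1."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)  # one 3x3x1 kernel per channel
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)  # 1x1xM mixing
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.pointwise(self.relu(self.depthwise(x))))

block = DepthwiseSeparable(32, 64)
y = block(torch.randn(1, 32, 112, 112))
print(y.shape)  # torch.Size([1, 64, 112, 112])
```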
5.3) Bounding box prediction

YOLOv2's "anchor boxes" are obtained by clustering: statistics are gathered over the training samples, and the most frequent shapes are taken as the anchor boxes. Because the data come from the training samples, predictions made on this basis in every grid cell essentially cover the most likely cases, so the recall is relatively high. YOLOv2 predicts "bounding boxes" through the "anchor boxes".
YOLOv2 performs target detection by dividing the image into grid cells; each cell is responsible for detecting part of the picture, and each cell includes 5 "anchor boxes". For each anchor box, YOLOv2 predicts four coordinate values $(t_x, t_y, t_w, t_h)$; given the offset $(c_x, c_y)$ of the cell from the top-left corner of the image and the width $p_w$ and height $p_h$ of the prior bounding box, the equations are

$$b_{x}=\sigma(t_{x})+c_{x},$$
$$b_{y}=\sigma(t_{y})+c_{y},$$
$$b_{w}=p_{w}\,e^{t_{w}},$$
$$b_{h}=p_{h}\,e^{t_{h}}.$$
For each "bounding box", YOLOv2 predicts an objectness score through logistic regression: the value is 1 if the predicted box overlaps the ground-truth box more than all other predictions do. If the overlap does not reach a threshold (0.5 by default in YOLOv2), the predicted "bounding box" is ignored, i.e., it contributes no loss.
5.4) Classification

The vector output by the YOLOv2 neural network has size 13×13×30, where 13×13 divides the picture into 13 rows and 13 columns, 169 cells in total, and 30 is the number of values per cell. The 30 values of each cell decompose as 30=5×(5+1): each cell includes 5 "anchor boxes", and each "anchor box" carries 6 values: the object-presence confidence, the object center position (x, y), the object size (w, h), and the class information.
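A hedged numpy sketch of decoding such a 13×13×30 output with the box equations of 5.3; the anchor sizes are illustrative stand-ins for the clustered priors:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative anchor (prior) sizes in grid units; real values come from clustering the training boxes.
ANCHORS = [(1.0, 1.5), (2.5, 3.0), (4.0, 5.5), (7.0, 8.0), (10.0, 11.0)]

def decode(output, conf_thresh=0.94):
    """output: (13, 13, 30) network tensor; returns boxes (bx, by, bw, bh, score) in grid units."""
    boxes = []
    preds = output.reshape(13, 13, 5, 6)  # 5 anchors x (tx, ty, tw, th, objectness, class)
    for cy in range(13):
        for cx in range(13):
            for a, (pw, ph) in enumerate(ANCHORS):
                tx, ty, tw, th, to, _cls = preds[cy, cx, a]
                score = sigmoid(to)
                if score < conf_thresh:
                    continue
                bx = sigmoid(tx) + cx       # b_x = sigma(t_x) + c_x
                by = sigmoid(ty) + cy       # b_y = sigma(t_y) + c_y
                bw = pw * np.exp(tw)        # b_w = p_w * e^{t_w}
                bh = ph * np.exp(th)        # b_h = p_h * e^{t_h}
                boxes.append((bx, by, bw, bh, score))
    return boxes

print(len(decode(np.random.randn(13, 13, 30))))
```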
6) Detection and early warning: dynamic images from live video of the prostate surgery scene are captured in real time; the dynamic images are converted to image data, passed through the first and second image preprocessing steps, and fed into the detection model. When the detection model detects an outer envelope feature target, alarm information is output.
6.1) Detection workflow

The detection and early warning workflow of the system is shown in Figure 2.
● The management machine reads the real-time video output of the endoscope equipment through a dedicated video adapter card;
● The real-time video is analyzed by the detection and early warning module, and the detection results are output as video; when an outer envelope target appears, a buzzer sounds to alert the doctor;
● The doctor watches the detection results in real time and quickly locates the lesion.
6.2) Detection results

Part of the detection results are shown in Figure 3. The detection and recognition frame rate reaches 30 fps, and the average recognition accuracy reaches 90%.
6.3) System configuration requirements

The management machine requires at least Windows 7 or Ubuntu 16.04, a quad-core Intel Core i5 CPU, and 8 GB of memory; equipping it with a graphics processing unit (GPU) supporting deep learning algorithms or several Movidius neural compute sticks can further accelerate video processing.
The specific implementation examples described herein merely illustrate the spirit of the present invention. Those skilled in the art to which the present invention pertains can make various modifications or additions to the described examples or replace them in similar ways without departing from the spirit of the present invention or exceeding the scope defined by the appended claims.
References:
[1] Kadam D B, Gade S S, Uplane M D, et al. Neural network based brain tumor detection using MR images[J]. 2011, 2: 325-331.
[2] Othman M F, Basri M A M. Probabilistic neural network for brain tumor classification[C]// Second International Conference on Intelligent Systems, Modelling and Simulation. IEEE, 2011: 136-138.
[3] Selvam V S, Shenbagadevi S. Brain tumor detection using scalp EEG with modified Wavelet-ICA and multi layer feed forward neural network[C]// International Conference of the IEEE Engineering in Medicine & Biology Society. Conf Proc IEEE Eng Med Biol Soc, 2011: 6104.
[4] Du X, Li Y, Yao D. A support vector machine based algorithm for magnetic resonance image segmentation[C]// Fourth International Conference on Natural Computation. IEEE Computer Society, 2008: 49-53.
[5] Pujar J H, Gurjal P S, Shambhavi D S, et al. Medical image segmentation based on vigorous smoothing and edge detection ideology[J]. World Academy of Science, Engineering & Technology, 2010, 19(68): 444.
[6] Hota H S, Shukla S P, Gulhare K. Review of intelligent techniques applied for classification and preprocessing of medical image data[J]. International Journal of Computer Science Issues, 2013, 10(1).
[7] Vinod Kumar, Niranjan Khandelwal, et al. Classification of brain tumors using PCA-ANN. 978-1-4673-0126-8/11, IEEE, 2011.
[8] Rajini N H, Bhavani R. Classification of MRI brain images using k-nearest neighbor and artificial neural network[C]// International Conference on Recent Trends in Information Technology. IEEE, 2011: 563-568.
[9] Najafi S, Amirani M C, Sedghi Z. A new approach to MRI brain images classification[C]// Electrical Engineering. IEEE, 2011: 1-5.

Claims (10)

  1. A method for intelligent detection and early warning of the outer envelope in prostate surgery, characterized in that the method comprises the following steps:
    1) Data collection: collecting the outer envelope image data from prostate surgery video recordings;
    2) First image preprocessing: performing grayscale processing and singular value decomposition on the outer envelope data to extract an outer envelope image carrying the principal component eigenvalues;
    3) Second image preprocessing: enhancing the outer envelope image from the first preprocessing step by deep bilateral learning;
    4) Neural network training: performing feature extraction and network training on the outer envelope image from the second preprocessing step to produce a trained detection model;
    5) Detection and early warning: capturing dynamic images of the live prostate surgery video in real time, converting the dynamic images into image data that is passed through the first and second image preprocessing steps and fed into the detection model, and outputting alarm information when the detection model detects an outer envelope feature target.
  2. The method for intelligent detection and early warning of the outer envelope in prostate surgery according to claim 1, characterized in that a data augmentation step is further included before step 2).
  3. The method for intelligent detection and early warning of the outer envelope in prostate surgery according to claim 1, characterized in that step 4) is implemented on the basis of the YOLOv2 software platform and the MobileNet deep learning model.
  4. The method for intelligent detection and early warning of the outer envelope in prostate surgery according to claim 1, characterized in that the specific steps of step 3) comprise:
    3.1) converting the high-resolution input image into a low-resolution stream;
    3.2) splitting the low-resolution stream into a local path and a global path, the local path using fully convolutional layers to learn local features of the image data and the global path using convolutional and fully connected layers to learn global features of the image, and then fusing the outputs of the two paths into a common set of fused features;
    3.3) treating the fused features as a bilateral grid unrolled along the third dimension and outputting a bilateral grid of affine coefficients;
    3.4) upsampling the bilateral grid of affine coefficients through a single-channel guidance map;
    3.5) applying the affine transforms to the fused features and outputting at full resolution.
  5. The method for intelligent detection and early warning of the outer envelope in prostate surgery according to claim 2, characterized in that the specific steps of the data augmentation step are: importing the modules, instantiating a pipeline object, and specifying the directory containing the images to be processed; defining the data augmentation operations, including perspective transformation, angular deviation, shearing, elastic deformation, brightness, contrast, color, rotation, and cropping, and adding them to the pipeline; and calling the pipeline's sample function and specifying the total number of augmented samples.
  6. The method for intelligent detection and early warning of the outer envelope in prostate surgery according to claim 1, characterized in that the specific steps of step 4) comprise: 4.1) pre-training; 4.2) feature extraction; 4.3) bounding box prediction; 4.4) classification.
  7. A system for intelligent detection and early warning of the outer envelope in prostate surgery according to any one of claims 1 to 6, characterized by comprising an image acquisition module, an image processing module, and an image detection and early warning module; the image acquisition module is used to acquire and store image information and models; the image processing module is used to perform the first image preprocessing and the second image preprocessing on the acquired image data; the image detection and early warning module is used to perform network training on the processed images to produce a trained detection model, and then to feed images to be examined into the detection model to obtain detection and early warning results.
  8. The system for intelligent detection and early warning of the outer envelope in prostate surgery according to claim 7, characterized in that the image acquisition module comprises a digital video interface for connecting with the endoscope, an image data memory for storing real-time image data during surgery, and an image model memory for storing processed images and deep-learned models.
  9. The system for intelligent detection and early warning of the outer envelope in prostate surgery according to claim 7, characterized in that the image processing module comprises a data augmentation component, an image feature extraction component, and an image enhancement component.
  10. The system for intelligent detection and early warning of the outer envelope in prostate surgery according to claim 7, characterized in that the image detection and early warning module comprises an image depth training component and an image detection and early warning component.
PCT/CN2019/074084 2018-12-27 2019-01-31 Method and system for intelligent envelope detection and warning in prostate surgery WO2020133636A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811613042.7 2018-12-27
CN201811613042.7A CN109754007A (en) 2018-12-27 2018-12-27 Peplos intelligent measurement and method for early warning and system in operation on prostate

Publications (1)

Publication Number Publication Date
WO2020133636A1 true WO2020133636A1 (en) 2020-07-02

Family

ID=66404122

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/074084 WO2020133636A1 (en) 2018-12-27 2019-01-31 Method and system for intelligent envelope detection and warning in prostate surgery

Country Status (2)

Country Link
CN (1) CN109754007A (en)
WO (1) WO2020133636A1 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110490232B (en) * 2019-07-18 2021-08-13 北京捷通华声科技股份有限公司 Method, device, equipment and medium for training character row direction prediction model
CN112545477B (en) * 2019-09-26 2022-07-15 北京赛迈特锐医疗科技有限公司 System and method for automatically generating mpMRI prostate cancer comprehensive evaluation report
CN112545476B (en) * 2019-09-26 2022-07-15 北京赛迈特锐医疗科技有限公司 System and method for detecting prostate cancer extracapsular invasion on mpMRI
CN112545481B (en) * 2019-09-26 2022-07-15 北京赛迈特锐医疗科技有限公司 System and method for automatically segmenting and localizing prostate cancer on mpMRI
CN111091559A (en) * 2019-12-17 2020-05-01 山东大学齐鲁医院 Depth learning-based auxiliary diagnosis system for small intestine sub-scope lymphoma
CN111583192B (en) * 2020-04-21 2023-09-26 天津大学 MRI image and deep learning breast cancer image processing method and early screening system
CN113538211A (en) * 2020-04-22 2021-10-22 华为技术有限公司 Image quality enhancement device and related method
CN111815613B (en) * 2020-07-17 2023-06-27 上海工程技术大学 Liver cirrhosis disease stage identification method based on envelope line morphological feature analysis
CN112734704B (en) * 2020-12-29 2023-05-16 上海索验智能科技有限公司 Skill training evaluation method under neural network machine learning recognition objective lens
CN114397929B (en) * 2022-01-18 2023-03-31 中山东菱威力电器有限公司 Intelligent toilet lid control system capable of improving initial temperature of flushing water
CN114145844B (en) * 2022-02-10 2022-06-10 北京数智元宇人工智能科技有限公司 Laparoscopic surgery artificial intelligence cloud auxiliary system based on deep learning algorithm

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106339591B (en) * 2016-08-25 2019-04-02 汤一平 A kind of self-service healthy cloud service system of prevention breast cancer based on depth convolutional neural networks
CN109087302A (en) * 2018-08-06 2018-12-25 北京大恒普信医疗技术有限公司 A kind of eye fundus image blood vessel segmentation method and apparatus

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104582622A (en) * 2012-04-16 2015-04-29 儿童国家医疗中心 Dual-mode stereo imaging system for tracking and control in surgical and interventional procedures
CN103976790A (en) * 2014-05-21 2014-08-13 周勇 Real-time evaluation and correction method in spine posterior approach operation
CN104899891A (en) * 2015-06-24 2015-09-09 重庆金山科技(集团)有限公司 Method and device for identifying gestational sac tissue, and uterine cavity suction device
US20170007778A1 (en) * 2015-07-07 2017-01-12 National Yang-Ming University Method of obtaining a classification boundary and automatic recognition method and system using the same
CN105389589A (en) * 2015-11-06 2016-03-09 北京航空航天大学 Random-forest-regression-based rib detection method of chest X-ray film
CN107705852A (en) * 2017-12-06 2018-02-16 北京华信佳音医疗科技发展有限责任公司 Real-time the lesion intelligent identification Method and device of a kind of medical electronic endoscope

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112231183A (en) * 2020-07-13 2021-01-15 国网宁夏电力有限公司电力科学研究院 Communication equipment alarm prediction method and device, electronic equipment and readable storage medium
CN111914937A (en) * 2020-08-05 2020-11-10 湖北工业大学 Lightweight improved target detection method and detection system
CN112669312A (en) * 2021-01-12 2021-04-16 中国计量大学 Chest radiography pneumonia detection method and system based on depth feature symmetric fusion
CN113408423A (en) * 2021-06-21 2021-09-17 西安工业大学 Aquatic product target real-time detection method suitable for TX2 embedded platform
CN113408423B (en) * 2021-06-21 2023-09-05 西安工业大学 Aquatic product target real-time detection method suitable for TX2 embedded platform
CN113627472A (en) * 2021-07-05 2021-11-09 南京邮电大学 Intelligent garden defoliating pest identification method based on layered deep learning model
CN113627472B (en) * 2021-07-05 2023-10-13 南京邮电大学 Intelligent garden leaf feeding pest identification method based on layered deep learning model

Also Published As

Publication number Publication date
CN109754007A (en) 2019-05-14

Similar Documents

Publication Publication Date Title
WO2020133636A1 (en) Method and system for intelligent envelope detection and warning in prostate surgery
Afza et al. A framework of human action recognition using length control features fusion and weighted entropy-variances based feature selection
US20210264599A1 (en) Deep learning based medical image detection method and related device
US10452899B2 (en) Unsupervised deep representation learning for fine-grained body part recognition
WO2020260936A1 (en) Medical image segmentation using an integrated edge guidance module and object segmentation network
CN106408001B (en) Area-of-interest rapid detection method based on depth core Hash
JP2022505498A (en) Image processing methods, devices, electronic devices and computer readable storage media
US20220189142A1 (en) Ai-based object classification method and apparatus, and medical imaging device and storage medium
CN109685768A (en) Lung neoplasm automatic testing method and system based on lung CT sequence
Keceli et al. Combining 2D and 3D deep models for action recognition with depth information
CN112750531A (en) Automatic inspection system, method, equipment and medium for traditional Chinese medicine
CN111916206B (en) CT image auxiliary diagnosis system based on cascade connection
Zhang et al. Deepgi: An automated approach for gastrointestinal tract segmentation in mri scans
Nie et al. Recent advances in diagnosis of skin lesions using dermoscopic images based on deep learning
Wang et al. Automatic measurement of fetal head circumference using a novel GCN-assisted deep convolutional network
Chatterjee et al. A survey on techniques used in medical imaging processing
Gu et al. AYOLOv5: Improved YOLOv5 based on attention mechanism for blood cell detection
Pavithra et al. An Overview of Convolutional Neural Network Architecture and Its Variants in Medical Diagnostics of Cancer and Covid-19
Xu et al. Application of artificial intelligence technology in medical imaging
CN114842238B (en) Identification method of embedded breast ultrasonic image
Qian et al. Multi-scale context UNet-like network with redesigned skip connections for medical image segmentation
Wang et al. Optic disc detection based on fully convolutional neural network and structured matrix decomposition
Pan et al. Preferential image segmentation using trees of shapes
CN112651363A (en) Micro-expression fitting method and system based on multiple characteristic points
Aburass Cubixel: A Novel Paradigm in Image Processing Using Three-Dimensional Pixel Representation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19905558

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19905558

Country of ref document: EP

Kind code of ref document: A1