CN110335240A

CN110335240A - Method for automatic batch capture of feature images of tissue or foreign matter in the digestive tract

Info

Publication number: CN110335240A
Application number: CN201910385767.3A
Authority: CN
Inventors: 曾凡; 黄锦; 柯钦瑜; 黄勇; 邰海军; 段惠峰
Original assignee: Henan Xuan Yongtang Medical Information Technology Co Ltd
Current assignee: Henan Xuanwei Digital Medical Technology Co.,Ltd.
Priority date: 2019-05-09
Filing date: 2019-05-09
Publication date: 2019-10-15
Anticipated expiration: 2039-05-09
Also published as: CN110335240B

Abstract

The invention discloses a method for automatic batch capture of feature images of tissue in the digestive tract. The video is format-converted, the background in each video frame is removed, grayscale conversion and binarization are applied to the target feature, and contour detection of the target feature is used to output a cropped picture of the target feature, which is then stored. The method has the beneficial effect of being fast and accurate.

Description

Method for automatic batch capture of feature images of tissue or foreign matter in the digestive tract

Technical field

The present invention relates to the technical field of image recognition, and in particular to a method for automatic batch capture of feature images of tissue in the digestive tract.

Background art

Deep learning is regarded as the most effective algorithm for realizing intelligent assisted diagnosis under digestive endoscopy, and deep learning depends on feature data sets and the models trained from them. In most cases a deep learning model cannot learn from arbitrary data; the data must first be labeled and classified. Labeling and classification are usually carried out by personnel proficient in the target features, who classify and capture the pictures. Manually capturing and screening target pictures from video requires a great deal of labor, and manually cropped pictures have low accuracy: for pictures of the same kind of feature, differences in the cropped region, size, and fragment all affect the training of a machine learning model. In addition, the environment inside the digestive tract is a non-geometric, dynamic, closed tubular space with fractal structure in which the digestive endoscope moves, and the target tissue to be identified usually lies on the intestinal tract. As a result, features of the inner wall of the digestive tract contaminate the training data for identifying tissue features, which causes over-fitting to appear during prediction.

Summary of the invention

To solve the above problems, the present invention provides a method for automatic batch capture of feature images of tissue in the digestive tract, which has the advantage that specified feature pictures can be captured from format-converted historical operation videos without interruption and in automated batches.

The invention is realized by the following technical scheme:

The method for automatic batch capture of feature images of tissue in the digestive tract includes the following steps:

A): Video reading and color channel format conversion: read the video of the digestive endoscopy diagnosis and treatment process from the storage device, and convert the color channel format of the video from RGB to HSV;

B): Locate the target in the video and remove the video background: adjust the ranges of the parameters H, S and V in the HSV color space to locate the video content, adjusting H, S and V so that all background other than the target feature is removed; the target feature is any one of tissue or organs in the digestive tract, feces, and examination or surgical instruments;

C): Obtain the target feature picture: obtain the target feature picture according to the target feature;

D): Apply grayscale conversion and binarization to the target feature picture;

E): Contour detection and positioning of the target feature: perform contour detection on the binary image using the Freeman chain code, and return the position of the target feature in the picture, the contour of the target feature, and the total count of target feature points;

F): Calculate the proportion of the picture occupied by the target feature: map the target feature in the binary image to a matrix, convert the matrix into a vector by joining its rows end to end, sum the vector values and divide by 255 to obtain the number of white pixels of the feature, then calculate the ratio of white pixels to black background pixels to obtain the size of the target feature in the picture;

G): Judge frame by frame whether the target feature in the video meets the capture decision condition; if so, crop the target feature from the frame and save the cropped result.
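The arithmetic in step f) can be illustrated with a minimal sketch in plain Python. It assumes the binary picture is a list of rows whose values are 0 or 255; the function name `white_pixel_ratio` is hypothetical, and taking the share over the whole picture (rather than over the black pixels only) is an assumption chosen to match the "proportion of the entire picture" wording of step g3).

```python
def white_pixel_ratio(binary_image):
    """Return (white_count, share_of_picture) for a 0/255 binary image."""
    # Join the rows end to end into one flat vector, as step f) describes.
    flat = [value for row in binary_image for value in row]
    # Every white pixel contributes 255, so sum / 255 counts the white pixels.
    white_count = sum(flat) // 255
    # Assumption: the share is taken over all pixels of the picture.
    return white_count, white_count / len(flat)


binary = [
    [0, 255, 255, 0],
    [0, 255, 255, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
count, share = white_pixel_ratio(binary)
print(count, share)
```

Dividing the sum by 255 avoids a separate counting pass, which is why the binary image is kept at the two values 0 and 255.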

The method is characterized in that, in step c), the target feature picture is obtained as follows: a mask operation is performed between the mask and each pixel of the target feature. The target feature picture comprises a target feature region image and a non-target feature region image; pixel values in the target feature region image remain unchanged, and pixel values in the non-target feature region image are zero.
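Steps a) to c) can be sketched as follows. This is an illustration only, using the standard-library `colorsys` module on a picture stored as rows of `(R, G, B)` tuples; the function names `hsv_mask` and `apply_mask` and the example HSV range are hypothetical, and `colorsys` scales H, S and V to the range 0 to 1, which may differ from the scale used by the tool in the patent.

```python
import colorsys


def hsv_mask(image, lower, upper):
    """Return a 0/255 mask marking pixels whose HSV value lies in [lower, upper]."""
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            # Convert one RGB pixel (0-255) to HSV (each component in 0-1).
            hsv = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            inside = all(lo <= c <= hi for c, lo, hi in zip(hsv, lower, upper))
            mask_row.append(255 if inside else 0)
        mask.append(mask_row)
    return mask


def apply_mask(image, mask):
    """Keep pixels where the mask is non-zero; set all other pixels to zero."""
    return [
        [pixel if keep else (0, 0, 0) for pixel, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# A reddish "background" pixel next to a blue "instrument" pixel.
image = [[(200, 60, 50), (0, 0, 255)]]
# Hypothetical HSV range that selects blue hues only.
mask = hsv_mask(image, lower=(0.5, 0.5, 0.5), upper=(0.8, 1.0, 1.0))
masked = apply_mask(image, mask)
print(mask, masked)
```

The masked picture keeps the original pixel values inside the target region and zeros elsewhere, which is the property step c) requires.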

In step d), the grayscale image of the target feature picture is obtained using a grayscale conversion formula, a binary image is obtained from the grayscale image by a binary threshold algorithm, and morphological erosion and dilation operations are applied to the binary image to remove noise. The grayscale image of the target feature picture is a single-channel grayscale image with values in the range 0 to 255, and the binary image is a single-channel image whose values are 0 or 255.

In step g), judging whether a frame in the video meets the capture decision condition includes the following steps:

G1): judge whether the total count of target feature points from step e) is greater than 5000; if so, go to step g2); otherwise, proceed directly to the conversion of the next frame;

G2): judge whether the width-to-height ratio of the target feature contour from step e) is no more than 5 and greater than one fifth; if so, go to step g3); otherwise, proceed directly to the conversion of the next frame;

G3): judge whether the proportion of the entire picture occupied by the target feature from step f) is within the range 2%-20%; if so, crop the target feature from the frame and save it to the result set; otherwise, proceed directly to the conversion of the next frame.

The invention discloses a method for automatic batch capture of feature images of tissue in the digestive tract. The video is format-converted and the background in each video frame is removed to highlight the target feature. Grayscale conversion, binarization, denoising, and dilation are applied to the target feature, and contour detection further highlights the target feature and outputs its position information. Target features at nearby identical positions are compared to judge whether the frames show the same target feature, pictures are cropped in units of groups of video frames, and the cropped pictures are stored. The method has the beneficial effect of being fast and accurate.

Brief description of the drawings

Fig. 1 is a flow chart of the method for automatic batch capture of feature images of tissue in the digestive tract.

Fig. 2 is a schematic diagram of the slider bars for adjusting the parameters H, S and V.

Fig. 3 is the feature map after binarization when the target feature is a surgical instrument.

Fig. 4 is a picture showing the position and width of the target feature determined in the picture.

Fig. 5 shows some of the pictures captured from the video when the target feature is a surgical instrument.

Fig. 6 is a schematic diagram of the storage structure for picture vectorization in each classification data set.

Fig. 7 is the result of the neural network model identifying tissue or foreign matter in a real-time picture.

Fig. 8 is the result after the tissue or foreign matter identified in Fig. 7 is stored.

Fig. 9 shows the number of identical feature points in the feature point sets of two pictures.

Fig. 10 shows the pictures in a data set that has not been compared and filed.

Fig. 11 shows the result after the pictures in the data set of Fig. 10 are compared and filed.

Fig. 12 is the result of the high-precision convolutional neural network identifying and classifying the surgical procedure.

Fig. 13 is the image result of the electrocautery cutting-ring metal snare identified in the surgical procedure.

Fig. 14 is the image result of the metal clip opening identified in the surgical procedure.

Fig. 15 is the picture, identified in the surgical procedure, of the hemostatic titanium clip closed but not detached.

Fig. 16 is the picture, identified in the surgical procedure, of the hemostatic titanium clip detached after closure.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.

As shown in Fig. 1, the method for automatic batch capture of feature images of tissue in the digestive tract includes the following steps:

Step 1: capture feature pictures of the operation video from the video in batches.

A): Video reading and color channel format conversion: read the video of the digestive endoscopy diagnosis and treatment process from the storage device, and convert the color channel format of the video from RGB to HSV, so that a background mask can be found that removes everything outside the specific target identification region;

B): Locate the target in the video and remove the video background: as shown in Fig. 2, adjust the ranges of the parameters H, S and V in the HSV color space to locate the video content, obtain the corresponding HSV mask from the HSV color space of the video background, and use the HSV mask to locate the target feature in the video, adjusting H, S and V so that all background other than the target feature is removed; the target feature is any one of tissue or organs in the digestive tract, feces, and examination or surgical instruments;

C): Obtain the target feature picture: perform a mask operation between the mask and each pixel of the target feature. The target feature picture comprises a target feature region image and a non-target feature region image; pixel values in the target feature region image remain unchanged, and pixel values in the non-target feature region image are zero;

D): Apply grayscale conversion and binarization to the target feature picture: obtain the grayscale image of the target feature picture using the grayscale conversion formula Gray = (R*299 + G*587 + B*114 + 500)/1000, obtain a binary image from the grayscale image by a binary threshold algorithm, and apply morphological erosion and dilation operations to the binary image to remove noise. The grayscale image of the target feature picture is a single-channel grayscale image with values in the range 0-255, and the binary image is a single-channel image whose values are 0 or 255, as shown in Fig. 3;

E): Contour detection and positioning of the target feature: perform contour detection on the binary image using the Freeman chain code, and return the position of the target feature in the picture, the contour of the target feature, and the total count of target feature points, as shown in Fig. 4. Here the target feature is a surgical instrument; the position of the box in the picture is the position of the target feature in the picture, and the width of the box is the contour of the target feature;

G): Judge frame by frame whether the video frame meets the capture decision condition; if so, crop the target feature from the picture and save the cropped result. Fig. 5 shows some of the pictures captured from the video when the target feature is a surgical instrument.
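The grayscale formula in step d) and the thresholding that follows it can be sketched in a few lines of plain Python. The names `to_gray` and `binarize` and the threshold value 127 are hypothetical; the use of integer division is an assumption made so that the result stays an integer in the range 0 to 255, with the `+ 500` term rounding to the nearest integer. Erosion and dilation are not covered here.

```python
def to_gray(r, g, b):
    """Gray = (R*299 + G*587 + B*114 + 500) / 1000 using integer arithmetic."""
    # Assumption: integer division; the +500 rounds to the nearest integer.
    return (r * 299 + g * 587 + b * 114 + 500) // 1000


def binarize(gray_image, threshold=127):
    """Map each gray value to 255 if above the threshold, otherwise to 0."""
    # The threshold of 127 is a placeholder, not a value from the source.
    return [[255 if value > threshold else 0 for value in row] for row in gray_image]


rgb_row = [(255, 255, 255), (255, 0, 0), (0, 255, 0), (0, 0, 255), (0, 0, 0)]
gray_row = [to_gray(*pixel) for pixel in rgb_row]
binary_row = binarize([gray_row])[0]
print(gray_row, binary_row)
```

Because the three weights sum to 1000, pure white maps to 255 and pure black to 0, so the output covers the single-channel range the text specifies.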

In step g), judging whether a video frame meets the capture decision condition includes the following steps:

G1): judge whether the total count of target feature points from step e) is greater than 5000; if so, go to step g2); otherwise, proceed directly to the conversion of the next video frame;

G2): judge whether the width-to-height ratio of the target feature contour from step e) is no more than 5 and greater than one fifth; if so, go to step g3); otherwise, proceed directly to the conversion of the next video frame;

G3): judge whether the proportion of the picture occupied by the target feature from step f) is within the range 2%-20%; if so, crop the target feature from the frame and save it to the result set; otherwise, proceed to the conversion of the next video frame.
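The three checks g1) to g3) form a simple cascade, sketched below. The function name `should_capture` and its parameter names are hypothetical, and treating the 2% and 20% bounds as inclusive is an assumption, since the source does not say whether the endpoints count.

```python
def should_capture(point_count, width, height, area_ratio):
    """Return True if a frame passes the capture checks g1), g2) and g3)."""
    # g1) the contour must contain more than 5000 feature points.
    if point_count <= 5000:
        return False
    # g2) width-to-height ratio must be greater than 1/5 and at most 5.
    if height <= 0:
        return False
    aspect = width / height
    if not (1 / 5 < aspect <= 5):
        return False
    # g3) the feature must occupy 2%-20% of the picture (bounds assumed inclusive).
    return 0.02 <= area_ratio <= 0.20


print(should_capture(6000, 200, 100, 0.10))  # passes all three checks
print(should_capture(6000, 600, 100, 0.10))  # fails g2): ratio is 6
```

Ordering the checks from cheapest to most specific lets most frames be rejected before the area ratio is needed.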

A doctor manually screens the result set and deletes the pictures of irrelevant features; what finally remains are standard and accurate feature pictures.

On the basis of the above steps for batch capture of target features, work on videos of polyp resection by endoscopic submucosal dissection can be further carried out, specifically including the following steps:

Step 2: establish a neural network model and train it:

H): Establish the data sets: store the target feature pictures collected from digestive endoscopy by category to establish classification data sets;

A mathematical and business model of the target feature pictures is established according to the target feature attributes, the target feature pictures appearing in digestive endoscopy are captured automatically in batches, and they are stored by category to establish the classification data sets;

The target feature attributes include the following: the target features are irregular and discretely distributed in the video; the target feature occupies 3%-20% of the picture; the color of the target feature differs from the color of the digestive tract; after the endoscope lens moves and the digestive tract background is masked, the target feature appears to move within the region; the number of video frames containing the target feature is high; and professional medical personnel are needed to label the pictures, so the amount of data obtained is small;

The classification data sets are storage space allocated on a storage device, preferably stored in folder format; the storage device includes a disk or a removable hard disk. The classification data sets include a background data set, a digestive tract tissue data set, and a foreign matter data set. The target feature pictures of the background data set include pictures of content not to be identified, such as the intestinal wall, stomach wall, and esophagus. The target feature pictures in the digestive tract tissue data set include intestinal tissue that needs to be identified and recorded, such as the cardia, fundus of the stomach, polyps, and tumors. The target feature pictures in the foreign matter data set include non-intestinal-tissue content that needs to be identified and recorded, such as feces, clips, snares, and suction tubes.

I): Establish the training set, validation set, and test set: extract more than 60% of the data from each classification data set to generate the test set; divide each classification data set into a training set and a validation set according to K-fold cross-validation; and apply data vectorization to the test set, training set, and validation set;

K-fold cross-validation divides each data set into K partitions and performs K rounds of picture acquisition, each time randomly taking K-1 partitions as the training set and using the remaining partition as the validation set;

The training set and validation set are used to train the deep convolutional neural network model, and the test set is used to evaluate the actual recognition results of the deep neural network model;

Because labeled data are scarce in medical data and the content extracted from video is highly similar, the validation set would be very small. Validation results then fluctuate considerably, and the way the validation set is divided gives the deep learning neural network model a large variance when it is evaluated. K-fold cross-validation is therefore preferred for dividing the training and validation sets, and the test results of the K rounds of picture acquisition are averaged to assess the reliability of the neural network model.
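The K-fold division described above can be sketched as follows. This is a minimal illustration in plain Python, not the patent's implementation: the function name `k_fold_splits` and the fixed seed are hypothetical, and the sketch uses the conventional rotation in which each partition serves as the validation set exactly once.

```python
import random


def k_fold_splits(items, k, seed=0):
    """Yield (training, validation) pairs, one per fold."""
    shuffled = list(items)
    # Shuffle once so that neighbouring video frames land in different folds.
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled items into k partitions of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, validation


pictures = [f"frame_{n}.png" for n in range(10)]
splits = list(k_fold_splits(pictures, k=5))
for training, validation in splits:
    print(len(training), len(validation))

# The K per-fold scores would then be averaged, as the text describes.
fold_scores = [0.90, 0.92, 0.88, 0.91, 0.89]  # placeholder numbers only
mean_score = sum(fold_scores) / len(fold_scores)
```

Averaging over K folds uses every labeled picture for validation once, which is what reduces the variance caused by a single small validation set.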

In step i), vectorizing the test set, training set, and validation set includes the following steps:

I1): create a picture path vector storage unit imagePaths, and store the address information of each class of data set in the picture path vector imagePaths in turn;

I2): create separate data and label storage units, traverse all stored pictures in imagePaths, compress each picture to a size of 96x96, then traverse the picture mean values by column and join the rows end to end to obtain the vector of the picture;

I3): divide the color values of the picture vector by 255 so that they are converted to decimals in the range 0 to 1 and stored in data in turn, and store the category name corresponding to each picture vector in label in turn;

Fig. 6 is a schematic diagram of the storage structure for picture vectorization in each classification data set.
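Steps i2) and i3) can be illustrated with a short sketch that flattens a picture into a vector and scales its color values into the range 0 to 1. The names `vectorize` and `build_dataset` are hypothetical, pictures are assumed to be rows of `(R, G, B)` tuples already loaded into memory, and resizing to 96x96 is left out because it would need an image library.

```python
def vectorize(picture):
    """Join the rows end to end and scale 0-255 color values into 0-1."""
    return [channel / 255 for row in picture for pixel in row for channel in pixel]


def build_dataset(pictures_by_class):
    """Return parallel lists: picture vectors (data) and class names (label)."""
    data, label = [], []
    for class_name, pictures in pictures_by_class.items():
        for picture in pictures:
            data.append(vectorize(picture))
            label.append(class_name)
    return data, label


# Tiny 1x2 pictures stand in for the 96x96 pictures of step i2).
pictures_by_class = {
    "background": [[[(0, 0, 0), (255, 255, 255)]]],
    "foreign_matter": [[[(0, 255, 51), (51, 0, 255)]]],
}
data, label = build_dataset(pictures_by_class)
print(label, data)
```

Keeping `data` and `label` as parallel lists means the class of the n-th vector is always `label[n]`, matching the storage structure the text describes.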

J): Create the neural network model from 3D convolution, max pooling, fully connected neurons, data flattening, and probability output, and apply regularization to the test set, training set, and validation set. The neural network model includes an input layer, a first convolutional layer, a first max pooling layer, a second convolutional layer, a second max pooling layer, a third convolutional layer, a third max pooling layer, a data flattening transition layer, a fully connected data layer, and a probability output layer;

The input layer is the entry point for the vectorized pictures; the width and height of the input layer are both 150, and there are three color channels.

The first convolutional layer feeds the input content into convolution kernels; the kernel size is 3*3, with 64 hidden nodes, and the activation function is the rectified linear unit;

The first max pooling layer applies 2*2 pooling to the convolution result of the first convolutional layer;

The kernel size of the second convolutional layer is 3*3, with 128 hidden nodes, and the activation function is the rectified linear unit;

The second max pooling layer applies 2*2 pooling to the convolution result of the second convolutional layer;

The kernel size of the third convolutional layer is 3*3, with 256 hidden nodes, and the activation function is the rectified linear unit;

The third max pooling layer applies 2*2 pooling to the convolution result of the third convolutional layer;

The data flattening transition layer flattens multidimensional data into one dimension and is the transition from the convolutional layers to the fully connected layer;

The fully connected data layer passes the input parameters to 1024 hidden nodes, and the activation function is the rectified linear unit;

The probability output layer uses the log-gradient normalization of a finite discrete probability distribution (the softmax function) to produce a probability distribution over the different classes;

The neural network model is regularized using L2-norm weight regularization to reduce over-fitting of the neural network model.
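To make the layer stack concrete, the sketch below works out the feature map size after each convolution and pooling layer and the length of the flattened vector fed to the 1024-node layer. It is a worked calculation rather than a model definition. The function names are hypothetical, and stride 1 with no padding for the 3*3 convolutions, as well as treating the kernels as acting on the two spatial dimensions, are assumptions the source does not state.

```python
def conv_out(size, kernel=3):
    """Spatial size after a convolution (assumed stride 1, no padding)."""
    return size - kernel + 1


def pool_out(size, window=2):
    """Spatial size after non-overlapping max pooling."""
    return size // window


def feature_map_sizes(input_size=150, filters=(64, 128, 256)):
    """Return per-layer (kind, height, width, channels) and the flattened length."""
    size = input_size
    shapes = []
    for count in filters:
        size = conv_out(size)
        shapes.append(("conv", size, size, count))
        size = pool_out(size)
        shapes.append(("pool", size, size, count))
    flattened = size * size * filters[-1]
    return shapes, flattened


shapes, flattened = feature_map_sizes()
for shape in shapes:
    print(shape)
print("flattened length:", flattened)
```

Under these assumptions the spatial size shrinks 150, 148, 74, 72, 36, 34, 17, so the flattening layer hands 17 * 17 * 256 values to the fully connected layer; with padded convolutions the numbers would differ.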

K): Train the neural network model: set the loss function of the neural network model, initialize the network parameters of each layer, input the vectorized and regularized training set and validation set for training, set root-mean-square error as the optimizer, and update the weight parameters of each layer by gradient descent on the value of the multi-class cross-entropy loss function, so as to obtain the trained model.

L): Test the neural network model: use the trained model to test the vectorized and regularized test set in order to test its generalization ability and recognition ability.

If the generalization ability and recognition ability are insufficient, training must be carried out again;

M): Obtain real-time digestive endoscopy video, identify it, and record it: obtain a real-time digestive endoscopy video image and divide it evenly into multiple sub-regions; compress each sub-region to the picture size expected by the input of the trained model; traverse all sub-regions of the digestive endoscopy image, vectorize each sub-region, and input it into the neural network model. The model returns an identification probability vector, and the probability scalar with the largest value is taken as the result. Judge whether the probability scalar is greater than 95%; if so, store the identified target feature sub-region.

In step m), dividing the real-time digestive endoscopy image evenly into multiple sub-regions includes the following steps:

M1): obtain the image width and image height of the real-time endoscopy image, and divide the image width and image height each by ten so as to split the digestive endoscopy image into 100 sub-regions;

M2): traverse all sub-regions, compress each sub-region picture, vectorize each sub-region picture, and divide the color values of each vectorized sub-region by 255 so that the three RGB channel values are compressed to decimals in the range 0 to 1.

The picture sub-region vectors are input into the deep learning neural network model, which outputs the vector of predicted probabilities and the index value corresponding to each predicted value. The predicted value is multiplied by 100; if it is greater than 95, it is marked on the picture. As shown in Fig. 7, the tissue and foreign matter in the intestinal tract are marked with boxes. The corresponding value in label is then found from the index value to identify the name of the tissue or foreign matter in the real-time picture. The grid picture of the feature tissue or foreign matter is named with the current system time and then stored and recorded, as shown in Fig. 8.
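The grid split of step m1) and the 95% check on the model output can be sketched without any model at all. The names `grid_cells` and `pick_label` are hypothetical, the probability vectors below are made-up inputs rather than real predictions, and integer division of the width and height (which drops any remainder pixels at the right and bottom edges) is an assumption.

```python
def grid_cells(width, height, parts=10):
    """Return (x, y, cell_width, cell_height) for each of parts*parts sub-regions."""
    cell_w, cell_h = width // parts, height // parts
    return [
        (col * cell_w, row * cell_h, cell_w, cell_h)
        for row in range(parts)
        for col in range(parts)
    ]


def pick_label(probabilities, labels, threshold=95):
    """Return the label of the most probable class if it scores above the threshold."""
    index = max(range(len(probabilities)), key=probabilities.__getitem__)
    # Multiply by 100 and compare with 95, as the text describes.
    score = probabilities[index] * 100
    return labels[index] if score > threshold else None


cells = grid_cells(1920, 1080)
labels = ["background", "polyp", "clip"]
print(len(cells), cells[0], cells[-1])
print(pick_label([0.01, 0.97, 0.02], labels))
print(pick_label([0.50, 0.30, 0.20], labels))
```

Returning `None` for low-confidence sub-regions keeps the caller's storage step simple: only sub-regions with a label are boxed and saved.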

Step 3: traverse videos in batches to validate the neural network model, and generate predicted pictures according to the neural network model.

Step 4: intelligently compare pictures with high similarity, and file the pictures without similarity into the data set;

P): the processor obtains the input path and output path of the pictures, and sorts the pictures in the data set by picture modification time;

Q): read two pictures from the data set in turn, the two pictures being any picture in the data set and the previous or next picture adjacent to it in modification time;

R): judge whether the ratio of the sizes of the two pictures is within a preset ratio range; if so, go to step s); otherwise, store both pictures in the data set pointed to by the output path and go to step q). The ratio of the sizes of the two pictures is the size of the picture with the earlier modification time divided by the size of the picture with the later modification time; the size of a picture is the product of its height and width; and the preset ratio range is less than 0.5 or greater than 1.5;

S): convert the two pictures into grayscale images of the same size, apply sub-region conversion to the grayscale images, and create gray mean matrices;

T): judge whether the standard deviation of the matrix obtained by subtracting the mean matrices of the two pictures is less than a specified threshold; if so, go to step u); otherwise, store both pictures in the data set pointed to by the output path and go to step q). The specified threshold is 15;

U): perform feature value detection on the two pictures to obtain the feature point set of each picture; the feature value detection uses the SIFT (scale-invariant feature transform) feature detector;

V): count the number of identical feature points in the feature point sets of the two pictures, using LANN and KNN matching to obtain the number of identical feature points, as shown in Fig. 9; LANN (Library for Approximate Nearest Neighbors) is a fast approximate nearest-neighbor search;

W): calculate the threshold for the number of identical feature points and judge whether the number of identical feature points exceeds it. If it does not, save the picture with the later modification time to the data set pointed to by the output path; if it does, do nothing. After the comparison is complete, return to step q) to compare the next picture. The feature point number threshold is the ratio of the mean of the sizes of the two pictures to the total number of pictures in the data set.

Fig. 10 shows the pictures in a data set that has not been compared and filed, and Fig. 11 shows the result after the pictures in the data set of Fig. 10 are compared and filed.

In step s), converting the two pictures into grayscale images of the same size includes the following steps:

S1): obtain the width, height, and color channel information of the two pictures in turn;

S2): obtain the R, G, and B single-channel color values of each of the two pictures in turn according to the channel information, and apply the grayscale conversion formula to the two pictures in turn;

S3): calculate the product of width and height for each of the two pictures, and convert the picture with the larger product to the size of the picture with the smaller product.

In step s), applying sub-region conversion to the grayscale image and creating the gray mean matrix includes the following steps:

S1): obtain the width and height information of the picture;

S2): divide the width and height of the picture by the same constant to obtain the width CellWidth and the height CellHeigh of each sub-region; the constant is an integer and is the number of sub-regions of the picture along the width or height;

S3): create a matrix whose number of rows or columns equals the number of sub-regions of the picture along the width or height;

S4): traverse the pixels along the width of the picture and divide the current pixel index by the sub-region width CellWidth to find which sub-region the current pixel falls in along the width; traverse the pixels along the height of the picture and divide the current pixel index by the sub-region height CellHeigh to find which sub-region the current pixel falls in along the height; add the pixel value to the values already accumulated for that sub-region, and store the accumulated result at the row and column of the matrix corresponding to the current pixel position;

S5): divide each value in the matrix by the sub-region total to obtain the average gray value, subtract this average from 255 to obtain the inverted value, and store the inverted average in the corresponding position of the matrix.
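The gray mean matrix of step s) and the standard deviation check of step t) can be sketched in plain Python as follows. The function names are hypothetical. Dividing each accumulated sum by the number of pixels in a sub-region and using the population standard deviation are both assumptions, since the source leaves these details open; pixels beyond the last whole sub-region are ignored.

```python
from statistics import pstdev


def inverted_mean_matrix(gray, parts):
    """Return a parts x parts matrix of 255 minus the mean gray value per sub-region."""
    height, width = len(gray), len(gray[0])
    cell_h, cell_w = height // parts, width // parts
    sums = [[0] * parts for _ in range(parts)]
    for y in range(parts * cell_h):
        for x in range(parts * cell_w):
            # Integer division of the pixel index gives the sub-region index.
            sums[y // cell_h][x // cell_w] += gray[y][x]
    cell_pixels = cell_h * cell_w
    return [[255 - total / cell_pixels for total in row] for row in sums]


def mean_matrices_similar(a, b, threshold=15):
    """Step t): True if the std. deviation of the difference matrix is below 15."""
    diffs = [va - vb for ra, rb in zip(a, b) for va, vb in zip(ra, rb)]
    return pstdev(diffs) < threshold


gray = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [100, 100, 50, 50],
    [100, 100, 50, 50],
]
matrix = inverted_mean_matrix(gray, parts=2)
print(matrix)
print(mean_matrices_similar(matrix, matrix))
```

Comparing coarse sub-region means first is a cheap filter: only picture pairs that pass it go on to the more expensive SIFT feature matching in steps u) to w).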

Step 5: retrain the neural network model on the data set of pictures without similarity to obtain a high-precision neural network model; following the method in Step 2, the data set of pictures without similarity is used as the training set to retrain the network model until the overall classification accuracy reaches 95%.

Step 6: the high-precision neural network model reads pictures of the surgical procedure and classifies them;

Pictures of the hemostatic forceps opening and closing are labeled as training data to identify the hemostatic forceps in the surgical procedure; pictures of the metal clip opening and closing are labeled as training data to identify the metal clip in the surgical procedure; pictures of the electrocautery metal snare opening and tightening are labeled as training data to identify the electrocautery metal snare; and pictures of the hemostatic titanium clip not detached and detached after closure are labeled as training data to identify the hemostatic titanium clip. The classification results are shown in Fig. 12, where panel (I) shows the classified hemostatic forceps, panel (II) the classified electrocautery metal snare, panel (III) the classified metal clip, and panel (IV) the classified hemostatic titanium clip.

Step 7: the neural network model identifies specific surgical instruments to confirm the video start time and starts recording the video;

As shown in Fig. 13, the high-precision neural network model identifies the first picture of the electrocautery cutting-ring metal snare in the surgical procedure and records the time at which it appears;

As shown in Fig. 14, the high-precision neural network model identifies the first picture of the metal clip opening and records the time at which the metal clip opens;

Compare the recorded time of the electrocautery cutting-ring metal snare with the recorded time of the metal clip opening, and take the earlier of the two as the time reference. If the high-precision neural network model identifies three or more pictures of the electrocautery cutting-ring metal snare or of the metal clip opening, and no video is being recorded, start recording the video.
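The start condition in Step 7 can be written as a small decision function. This is a sketch under stated assumptions: the name `recording_start` is hypothetical, times are taken to be seconds from the start of the video, and a time of `None` is used here to mean that the instrument has not been seen yet.

```python
def recording_start(snare_time, clip_time, detections, recording):
    """Return the time reference for starting a recording, or None to do nothing."""
    # Only instruments that have actually been seen contribute a time.
    times = [t for t in (snare_time, clip_time) if t is not None]
    # Start only with three or more detections and no recording in progress.
    if recording or detections < 3 or not times:
        return None
    # The earlier of the two recorded times is the time reference.
    return min(times)


print(recording_start(12.0, 15.5, detections=3, recording=False))
print(recording_start(12.0, 15.5, detections=2, recording=False))
```

Requiring several detections before starting guards against a single misclassified frame triggering a recording.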

Step 8: the neural network model identifies specific surgical instruments to confirm the video end time and stops recording;

The high-precision neural network model identifies pictures of the hemostatic titanium clip closed but not detached, and records the time at which the last such picture appears, as shown in Fig. 15;

The high-precision neural network model identifies pictures of the hemostatic titanium clip detached after closure, and records the time at which the last such picture appears, as shown in Fig. 16;

If pictures of the hemostatic titanium clip closed but not detached appear continuously, the time of the picture of the hemostatic titanium clip detached after closure is taken as the end time;

If pictures of the hemostatic titanium clip detached after closure appear, the time of the last such picture is taken as the final end time.

Step 9: clip the video and save it.

Based on the recorded start time and end time, clip the video and save it to the preset specified path for archiving.

The technical means disclosed in the solution of the present invention are not limited to those disclosed in the above embodiments, and also include technical solutions formed by any combination of the above technical features. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications are also regarded as falling within the protection scope of the present invention.

Claims

1. A method for automatic batch capture of feature images of tissue in the digestive tract, characterized by including the following steps:

D): apply grayscale conversion and binarization to the target feature picture;

2. The method for automatic batch capture of feature images of tissue in the digestive tract according to claim 1, characterized in that, in step c), the target feature picture is obtained as follows: a mask operation is performed between the mask and each pixel of the target feature; the target feature picture comprises a target feature region image and a non-target feature region image; pixel values in the target feature region image remain unchanged, and pixel values in the non-target feature region image are zero.

3. The method for automatic batch capture of feature images of tissue in the digestive tract according to claim 1, characterized in that, in step d), the grayscale image of the target feature picture is obtained using a grayscale conversion formula, a binary image is obtained from the grayscale image by a binary threshold algorithm, and morphological erosion and dilation operations are applied to the binary image to remove noise; the grayscale image of the target feature picture is a single-channel grayscale image with values in the range 0 to 255, and the binary image is a single-channel image whose values are 0 or 255.

4. The method for automatic batch capture of feature images of tissue in the digestive tract according to claim 1, characterized in that, in step g), judging whether a frame in the video meets the capture decision condition includes the following steps: