CN116934728A

CN116934728A - Hysteroscope image target detection acceleration method based on embedded AI processor

Info

Publication number: CN116934728A
Application number: CN202310948368.XA
Authority: CN
Inventors: 张云飞; 蔡占毅; 曹黎俊
Original assignee: Jiangsu Jiyuan Medical Technology Co ltd
Current assignee: Jiangsu Jiyuan Medical Technology Co ltd
Priority date: 2023-07-31
Filing date: 2023-07-31
Publication date: 2023-10-24

Abstract

The invention discloses a hysteroscope image target detection acceleration method based on an embedded AI processor, and belongs to the field of medical image processing. With the continuous development of artificial intelligence and deep learning, the network model tends to be complicated, and the parameter and the operand are also increased. The invention utilizes an embedded AI processor to perform real-time pretreatment and feature extraction on hysteroscope images, then uses a network model which is subjected to large model focus and global knowledge distillation to detect and classify abnormal targets in the images, and finally outputs detection results to a display. The invention can effectively improve the speed and accuracy of hysteroscope image target detection, meet the real-time requirement of medical target detection, and reduce the reasoning time and the consumption of calculation resources.

Description

Hysteroscope image target detection acceleration method based on embedded AI processor

Technical Field

The invention relates to the field of medical image processing, in particular to a hysteroscope image target detection acceleration method based on an embedded AI processor.

Background

Hysteroscopes are a medical device for examination and treatment of endometrial diseases, which can transmit images of the uterine cavity to a display through optical fibers for observation and manipulation by a doctor. Hysteroscopic image target detection refers to locating and identifying a region of interest in a hysteroscopic image, such as endometrium, polyp, myoma, adhesion, etc., for a doctor to diagnose and treat. Hysteroscopic image target detection is a challenging task because hysteroscopic images typically have low resolution, low contrast, high noise, occlusion, deformation, etc., which results in difficult target recognition. In order to improve the performance and accuracy of hysteroscopic image target detection, many deep learning-based methods, such as Convolutional Neural Networks (CNNs), have emerged in recent years. These methods typically require a significant amount of labeling data and computational resources to train complex models, such as Faster R-CNN, YOLO, SSD, and the like. However, these models require a long time and high memory consumption in reasoning and are not suitable for real-time medical target detection.

To address this problem, one possible approach is to use an embedded AI processor to accelerate hysteroscopic image target detection. An embedded AI processor is a chip specifically designed to perform artificial intelligence tasks, such as NVIDIA Jetson Nano, google Coral Edge TPU, etc. The chips have the advantages of low power consumption, high performance, small size and the like, and can realize rapid image processing and reasoning on edge equipment. However, there are still some problems with deploying deep learning models directly onto embedded AI processors, such as model oversize, computationally too high, loss of accuracy, etc. To solve these problems, an effective method is distillation (Big Model Knowledge Distillation, BMKD) using large model knowledge. BMKD is a model compression technique that trains a light-weight small model (student model) by building it with the supervision information of the larger model (teacher model) with better performance in order to achieve better performance and accuracy. The BMKD can effectively reduce the parameter quantity and the calculated quantity of the model, and improves the running efficiency and the adaptability of the model on the embedded AI processor.

The invention provides a method for accelerating hysteroscope image target detection based on an embedded AI processor and a BMKD technology.

Disclosure of Invention

The invention aims to provide a hysteroscope image target detection acceleration method based on an embedded AI processor, which mainly adopts a knowledge distillation method to train a lightweight model with good instantaneity, short reasoning time and certain precision by using a large model. The current knowledge distillation method is mostly used for the task of image classification, and the model loss function is improved on the basis of the current knowledge distillation method, so that the performance of knowledge distillation on the target detection task can be improved to a certain extent, and the acceleration by using an AI processor is facilitated.

The aim of the invention is realized by the following technical scheme:

a hysteroscope image target detection acceleration method based on an embedded AI processor comprises the following steps:

step 1: preparing a set of hysteroscopic images for model training, preprocessing the images prior to training, comprising: image scaling, random horizontal overturning, random vertical overturning, random angle rotation, gamma transformation, center cutting and standardization;

step 2: using a GAIA large model as a teacher network and using an arbitrary lightweight object detection model as a student network;

step 3: inputting training images into a teacher network and a student network which are mentioned in the step 2, wherein the teacher network obtains soft labels, and the student network obtains model prediction output;

step 4: capturing single image global relation information from Neck of a network through a GcBLock module, and calculating to obtain global distillation loss L in loss function _global ；

Step 5: binary mask M is calculated through feature map output by network Neck _i,j Scale mask S _i,j Attention mask A ^C 、A ^S From this calculation, the focal distillation loss L in the loss function is obtained _focal ；

Step 6: the original loss function L is calculated according to the soft label and the student network prediction output required by distillation _original And detecting the loss function in the step 4 and the step 5Adding to obtain a comprehensive loss function;

step 7: according to the loss function L _global 、L _focal 、L _original Updating model parameters;

step 8: repeating the steps 2-7 until the training times reach the expected value;

step 9: and deploying the light weight optimal model obtained by training into a proper AI processor, accelerating by using NNIE, accelerating the abnormality detection process by using the strong accelerating capacity of the AI processor, and enhancing the instantaneity.

The aim of the invention can be further achieved by the following technical measures:

according to the hysteroscope image target detection acceleration method based on the embedded AI processor, the student network can use a two-stage detection model, a one-stage anchor-based model and a one-stage anchor-free model.

According to the hysteroscope image target detection acceleration method based on the embedded AI processor, the GcBLock module is adopted by the Neck part, and the feature map is obtained from the Neck part of the model and is used for calculating global distillation loss. The calculation formula of the global distillation loss is given by formula (I).

In the hysteroscope image target detection acceleration method based on the embedded AI processor, a binary mask is used for separating the foreground and the background of the image.

According to the hysteroscope image target detection acceleration method based on the embedded AI processor, the influence of different scale targets on the model performance is balanced by using the scale mask.

The above-mentioned hysteroscope image target detection acceleration method based on the embedded AI processor, considers the pixel and channel attention, calculates the pixel and channel attention force diagram G of the feature image ^S 、G ^C And calculates the attention mask A based on the result ^S 、A ^C . Calculating according to the binary mask, the scale mask and the attention mask to obtain the characteristic loss L _fea 。

The hysteroscope image target detection acceleration method based on the embedded AI processor uses the attention loss L _at Forcing student detector to mimicThe teacher detector's space and channel attention mask. The focal distillation loss consists of both the characteristic loss and the attention loss.

According to the hysteroscope image target detection acceleration method based on the embedded AI processor, the inference engine can adopt the 3588 embedded AI processor, a model optimized through large model distillation can be conveniently deployed, and rapid image processing and inference can be realized on edge equipment.

Compared with the closest prior art, the technical scheme provided by the invention has the following beneficial effects:

the foreground and the background in the image are balanced by adopting the focus distillation loss function and the global distillation loss function, and meanwhile, the relation information among pixels is not lost, so that the distillation method can be well applied to target detection tasks. The light model with good performance is deployed on an AI processor, so that the reasoning speed can be increased, and the requirements of medical image detection and diagnosis on real-time performance are met.

Drawings

FIG. 1 is a flow chart of a hysteroscope image target detection acceleration method in the invention;

fig. 2 is an effect diagram of a hysteroscope image target detection method in the invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Referring to fig. 1-2, the present invention provides a technical solution:

the invention discloses a hysteroscope image target detection acceleration method based on an embedded AI processor, which comprises the following steps:

step 4: capturing single image global relation information from Neck of a network through a GcBLock module, and calculating to obtain global distillation loss L in loss function _global The method comprises the steps of carrying out a first treatment on the surface of the The calculation formula of the global distillation loss is as follows:

wherein W is _v1 、W _v2 、W _k Indicating the convolution layer, LN indicating Layer Normalization, N _p The number of pixels in the feature map is represented, and λ is a super-parameter for balancing the loss.

Step 5: binary mask M is calculated through feature map output by network Neck _i,j Scale mask S _i,j Attention mask A ^C 、A ^S From this calculation, the focal distillation loss L in the loss function is obtained _focal The method comprises the steps of carrying out a first treatment on the surface of the The binary mask has the following calculation formula:

r represents the ground truth box, i, j represents the horizontal and vertical coordinates of the feature map, and if the coordinates fall within the truth box, the mask is 1, otherwise, 0.

The calculation formula of the scale mask is as follows:

H _r W _r representing the height and width of the truth box, if a pixel belongs to a different target box, using the minimum box to calculate S _i,j 。

The pixel and channel attention map calculation formula of the feature map is as follows:

H. w, C the height, width and channel number of the feature map, G ^S 、G ^C Spatial and channel attention patterns, respectively. The formula from which the attention mask is calculated is as follows:

A ^S (F)＝H·W·softmax(G ^S (F)/T)

A ^C (F)＝C·softmax(G ^C (F)/T)

t is a temperature super-parameter used to regulate distillation. By combining the above formulas, the feature loss can be calculated as follows:

where α, β are hyper-parameters for balancing foreground and background losses.

The attention loss is used to force the student detector to imitate the spatial channel attention mask of the teacher detector, and the calculation formula is as follows:

wherein t and s represent teacher and student respectively, from which focal point distillation loss can be obtained:

L _focal ＝L _fea +L _at

step 6: the original loss function L is calculated according to the soft label and the student network prediction output required by distillation _original Adding the integrated loss function with the loss function in the target detection step 4 and the target detection step 5 to obtain an integrated loss function;

the student network can use a two-stage detection model, a one-stage Anchor-based model and a one-stage Anchor-free model.

The negk section uses GcBlock modules, and feature maps are obtained from the negk section of the model for calculation of global distillation loss. The calculation formula of the global distillation loss is given by formula (I).

The acceleration method uses a binary mask to separate the foreground and background of the image. A scale mask is used to balance the impact of different scale targets on model performance.

The acceleration method considers pixel and channel attentions, calculates a pixel and channel attentions force diagram G S, G C of the feature map, and calculates an attentions mask A S, A C based on the pixel and channel attentions force diagram G S, G C. And calculating to obtain the feature loss L_fea according to the binary mask, the scale mask and the attention mask.

The attention loss L _ at is used to force the student detector to mimic the spatial and channel attention mask of the teacher detector. The focal distillation loss consists of both the characteristic loss and the attention loss.

The reasoning engine of the acceleration method can adopt 3588 embedded AI processor, the model optimized by large model distillation can be conveniently deployed, and rapid image processing and reasoning can be realized on the edge equipment.

The effect of the method in practical application is shown in fig. 2. The foreground and the background in the image are balanced by adopting the focus distillation loss function and the global distillation loss function, and meanwhile, the relation information among pixels is not lost, so that the distillation method can be well applied to target detection tasks. The light model with good performance is deployed on an AI processor, so that the reasoning speed can be increased, and the requirements of medical image detection and diagnosis on real-time performance are met.

In addition to the above embodiments, other embodiments of the present invention are possible, and all technical solutions formed by equivalent substitution or equivalent transformation are within the scope of the present invention.

Claims

1. The hysteroscope image target detection acceleration method based on the embedded AI processor is characterized by comprising the following steps of:

step 4: capturing single image global relation information from a Neck of a network through a GcBLock module, and calculating to obtain global distillation loss in a loss function;

step 5: binary mask calculation through network Neck output feature map、Scale mask, attention mask, and loss function calculated therefromFocal distillation loss in numbers;

step 6: calculating to obtain an original loss function according to the soft label and student network prediction output required by distillation, and adding the original loss function with the loss functions in the target detection step 4 and the step 5 to obtain a comprehensive loss function;

step 7: updating model parameters according to the loss function;

2. The method for accelerating hysteroscope image target detection based on an embedded AI processor as claimed in claim 1, wherein the hysteroscope image target detection is carried out by using a large model knowledge distillation model compression technology, a teacher model uses a GAIA vision large model for vision object detection, and a student model uses a convolutional neural network as a lightweight target detection model of a backbone network.

3. The hysteroscope image target detection acceleration method based on the embedded AI processor as claimed in claim 1, wherein the loss functions used include global distillation loss, focus distillation loss and original loss functions, and the problem that the knowledge distillation method does not pay attention to the key information of the picture is improved through the focus distillation loss, so that the knowledge distillation method is suitable for a target detection task; the global distillation loss considers the relation information among different pixel points in the picture, and compensates the global information which is not available for the focus distillation loss.

4. The method for accelerating hysteroscopic image target detection based on an embedded AI processor of claim 1, wherein the student network uses one of a two-stage detection model, a one-stage Anchor-based model, and a one-stage anchor-free model.

5. The hysteroscope image target detection acceleration method based on the embedded AI processor as claimed in claim 1, wherein a GcBLock module is adopted by a Neck part, and a feature map is obtained from the Neck part of the model for calculating global distillation loss; separating the foreground and background of the image by using a binary mask; the effect of different scale targets on model performance is balanced by using a scale mask.