CN115937089A - Training detection method based on improved YOLOV5 focus detection model - Google Patents


Publication number
CN115937089A
CN115937089A
Authority
CN
China
Prior art keywords
model
focus
image
training
detection
Prior art date
Legal status
Pending
Application number
CN202211274984.3A
Other languages
Chinese (zh)
Inventor
王月
谢海琼
周忠娇
刘永旭
银兴行
Current Assignee
Chongqing Biological Intelligent Manufacturing Research Institute
Original Assignee
Chongqing Biological Intelligent Manufacturing Research Institute
Priority date
Filing date
Publication date
Application filed by Chongqing Biological Intelligent Manufacturing Research Institute filed Critical Chongqing Biological Intelligent Manufacturing Research Institute
Priority to CN202211274984.3A
Publication of CN115937089A
Legal status: Pending


Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a training and detection method based on an improved YOLOV5 lesion detection model. The method comprises: constructing the original training samples required for model training; inputting the original training samples into an image detection model to obtain multi-layer feature layer data; inputting the obtained multi-layer feature layer data into the predictor sub-model of a YOLO-V5+SSFPN image detection model, which outputs a predicted bounding box for each lesion in the image and a classification result for the lesion image; comparing the model output with the ground-truth annotations of the training pictures and updating the network parameters; and saving the trained model weight file, whose weights are loaded into the model for lesion detection. A lesion image detection model is trained from pictures and annotation data so that lesions can be located and classified quickly. Because most lesions are small, the structure of the network model is modified: an SSFPN that fuses multi-resolution feature layers is added, so that the network model finally locates and identifies lesion positions quickly and accurately.

Description

Training detection method based on improved YOLOV5 focus detection model
Technical Field
The invention belongs to the field of image recognition, and particularly relates to a method for training and detecting a human lesion model based on the YOLOV5+SSFPN improved network.
Background
Medical technology has matured, and medical images (such as CT images) are widely used by medical staff. Lesions appear in these images, but the lesion region is usually small; when it is not detected automatically, medical staff must spend time on manual inspection, which may lead to missed or incorrect detections and is time- and labor-consuming.
In the field of image recognition, artificial-intelligence-based detection methods largely meet the needs of daily applications and show promise for reducing this time and labor cost. Medical artificial intelligence has demonstrated great potential in medical image analysis and computer-assisted diagnosis, and image-based lesion detection is one of the important tools of computer-assisted medicine. However, the complexity of diseased tissue and the diversity of lesions make lesion detection a challenging task.
Disclosure of Invention
Aiming at the key problems of the existing manual inspection of lesion images, namely that it is time- and labor-consuming and prone to missed or false detections, a method for training and detecting a human lesion model based on the YOLOV5+SSFPN improved network is provided, used to quickly and accurately classify and locate images of small human lesions.
In view of this, the technical scheme adopted by the invention is a training and detection method based on an improved YOLOV5 lesion detection model, comprising the following steps:
Step 1: construct the original training samples required for model training.
Step 2: input the original training samples into the image detection model to obtain multi-layer feature layer data.
Step 3: input the obtained multi-layer feature layer data into the predictor sub-model of the YOLO-V5+SSFPN image detection model, and output the predicted lesion bounding boxes and the classification result for the lesion image.
Step 4: compare the model output with the ground-truth annotations of the training pictures, and update the network parameters.
Step 5: save the trained model weight file; the weights are then loaded into the model for lesion detection.
The lesion image detection model is trained from pictures and annotation data so that lesions can be located and classified quickly. Because most lesions are small, the structure of the network model is modified: an SSFPN that fuses multi-resolution feature layers is added, so that the network model finally locates and identifies lesion positions quickly and accurately, providing technical support for reducing lesion detection errors.
The invention uses image amplification to address the scarcity of lesion image samples, increasing the number of original samples and improving the detection accuracy of the trained model.
Image amplification is applied to the small-size, small-sample lesion images, and the deep neural network YOLO-V5+SSFPN is trained on the enlarged dataset; the network can identify and locate small lesions quickly and accurately, with a smaller computational cost than two-stage networks. By continuously collecting lesion images, the detection capability of the network model can be improved continuously, alleviating the high missed-diagnosis rate of manual identification.
Drawings
FIG. 1 is a diagram of a training sample;
FIG. 2 is a diagram of the network structure of YOLO-V5;
FIG. 3 is a block diagram of SSFPN;
FIG. 4 is a flow chart of network model training detection.
Detailed Description
1. A method for training and detecting a human body lesion model based on the YOLOV5+SSFPN improved network, as shown in fig. 4, comprises the following steps:
Step 1: images are acquired and segmented at positions where lesions may exist, and the processed images are annotated to obtain the original training samples required for model training; each sample consists of a processed picture and its annotation information, as shown in fig. 1. Because images containing lesions are scarce and most lesions are small, the acquired images are suitably cropped and then amplified by image enhancement methods to increase the number of original training samples.
The image enhancement methods are, respectively: adding salt-and-pepper noise of varying intensity to the image, applying Gaussian blur and bilateral blur of varying strength to the image, and moderately changing the image brightness; these enhancement modes are also used in combination, greatly enlarging the original image dataset.
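Two of the enhancement modes above, salt-and-pepper noise and brightness change, can be sketched in NumPy as follows. The function names are illustrative, not from the patent, and the blur modes would be added analogously (e.g. with OpenCV's `GaussianBlur` and `bilateralFilter`):

```python
import numpy as np

def salt_pepper(img, amount=0.02, rng=None):
    """Corrupt roughly `amount` of the pixels with salt (255) or pepper (0)."""
    rng = rng or np.random.default_rng(0)
    out = img.copy()
    mask = rng.random(img.shape[:2])
    out[mask < amount / 2] = 0          # pepper
    out[mask > 1 - amount / 2] = 255    # salt
    return out

def adjust_brightness(img, factor=1.2):
    """Scale pixel intensities, clipping to the valid 8-bit range."""
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def augment(img, rng=None):
    """Combine enhancement modes of different degrees to turn one sample into several."""
    return [
        salt_pepper(img, 0.01, rng),
        salt_pepper(img, 0.05, rng),
        adjust_brightness(img, 0.8),
        adjust_brightness(img, 1.3),
    ]
```

Each source picture thus yields several corrupted variants, which is how a small lesion dataset can be amplified before training.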
Step 2: input the original training samples into the YOLO-V5+SSFPN image detection model to obtain multi-layer feature layer data.
Step 3: input the obtained multi-layer feature layer data into the predictor sub-model of the YOLO-V5+SSFPN image detection model (YOLO-V5 is an end-to-end detection model that can be divided into a feature-extraction sub-model and a target-bounding-box predictor sub-model), and output the predicted lesion bounding boxes and the classification result for the lesion image. The YOLO-V5 network applies a single convolutional neural network (CNN) to the whole image, divides the image into grids, and directly predicts the class probabilities and bounding boxes for each grid, greatly reducing the time needed to classify a lesion image. It also enhances the input images with Mosaic data augmentation, stitching images together by random scaling, random cropping, and random arrangement, which increases the sample data and improves the detection of small targets. The network model of YOLO-V5 is shown in FIG. 2.
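The Mosaic stitching described above can be sketched as follows. This is a simplified, hypothetical version that only tiles four images into the quadrants of one canvas; the real YOLO-V5 implementation additionally applies random scaling and cropping, jitters the mosaic center, and remaps the bounding-box labels:

```python
import numpy as np

def mosaic4(imgs, size=64):
    """Tile four grayscale images into the quadrants of one size x size canvas,
    a minimal stand-in for Mosaic data augmentation."""
    assert len(imgs) == 4
    half = size // 2
    canvas = np.zeros((size, size), dtype=np.uint8)
    for k, img in enumerate(imgs):
        # crop each source to at most half x half (stand-in for random resize/crop)
        tile = np.zeros((half, half), dtype=np.uint8)
        h, w = min(half, img.shape[0]), min(half, img.shape[1])
        tile[:h, :w] = img[:h, :w]
        r, c = divmod(k, 2)
        canvas[r*half:(r+1)*half, c*half:(c+1)*half] = tile
    return canvas
```

Because one Mosaic sample contains four (rescaled) source images, small objects appear more often per batch, which is why this augmentation helps small-target detection.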
The structure of YOLO-V5 is divided into an input stage, a Backbone, a Neck, and a prediction head. The input stage performs Mosaic data enhancement, adaptive anchor-box calculation, and adaptive picture scaling, transforming input pictures to a common size. The Backbone contains a Focus module and CSP modules. The Focus module uses a slicing operation, sampling alternate rows and columns and concatenating the results, to split a high-resolution picture (feature map) into several low-resolution feature maps, reducing the information loss caused by downsampling. The CSP structure splits the input into two branches and convolves each to halve the channel count; one branch then passes through a BottleneckN operation (N residual modules), after which the two branches are concatenated, so that the model can learn richer feature information. The Neck comprises FPN and PAN structures, which preserve features of different resolutions as the image features propagate through the model, so that feature information is extracted more effectively. The prediction head uses CIOU_Loss as the bounding-box loss function and applies non-maximum suppression (NMS), which improves the recognition of overlapping objects.
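The Focus module's slicing operation can be illustrated with a minimal NumPy sketch (the helper name is illustrative; YOLO-V5 implements this as a network layer followed by a convolution):

```python
import numpy as np

def focus_slice(x):
    """Focus-style slicing: split an (H, W, C) feature map into four
    alternately sampled sub-maps and concatenate them along channels.
    The result is (H/2, W/2, 4C) and, unlike strided downsampling,
    discards no pixels."""
    return np.concatenate(
        [x[0::2, 0::2], x[1::2, 0::2], x[0::2, 1::2], x[1::2, 1::2]],
        axis=-1,
    )
```

Spatial resolution is halved while every input value is retained in the channel dimension, which is the sense in which the slicing "reduces the information loss caused by downsampling".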
However, lesion images are complex and some lesions are very small, and the deeper features that YOLO-V5 obtains through convolution incur information loss, which makes lesion detection difficult. The SSFPN feature-extraction method treats the FPN structure as a scale space and performs a three-dimensional (3D) convolution over it, fusing the FPN image features across resolutions and further improving the network model's detection of small targets. The structure of SSFPN is shown in fig. 3.
To improve the YOLO-V5 detection model's ability to extract small-target features, the SSFPN is placed between the FPN and PAN modules of the YOLO-V5 Backbone: the feature maps of different resolutions extracted by the FPN module are resized to the same size, concatenated into a 4-dimensional tensor, passed through a 3D convolution operation, and then input to the PAN module.
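As a rough illustration of this fusion step, the NumPy sketch below resizes multi-resolution maps to a common size, stacks them into a 4-D scale-space tensor, and mixes across the scale axis. It deliberately simplifies the patent's 3D convolution to a 1x1x1 convolution over the scale dimension (i.e. a weighted sum); all names are illustrative, and a real SSFPN would use a learned 3D kernel over the full scale-space volume:

```python
import numpy as np

def upsample_nn(x, factor):
    """Nearest-neighbour upsampling of an (H, W, C) feature map."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def ssfpn_fuse(features, weights=None):
    """Resize FPN outputs to a common size, stack them along a new 'scale'
    axis into an (S, H, W, C) tensor, then mix across scales with a
    1x1x1-convolution-style weighted sum (sketch of the SSFPN fusion)."""
    target_h = max(f.shape[0] for f in features)
    resized = [upsample_nn(f, target_h // f.shape[0]) for f in features]
    vol = np.stack(resized, axis=0)                  # (S, H, W, C) scale space
    if weights is None:
        weights = np.full(vol.shape[0], 1.0 / vol.shape[0])
    return np.tensordot(weights, vol, axes=(0, 0))   # fused (H, W, C) map
```

The key design point the sketch captures is that the scale dimension becomes an explicit tensor axis, so information from coarse and fine resolutions can be combined by a single operation rather than layer-by-layer.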
Step 4: compare the result generated by the model with the ground-truth annotations of the training pictures, and update the network parameters. YOLO-V5 uses CIOU_Loss as the loss function of the bounding box, given by the following formulas:
CIOU = IOU - ρ²(b, b^gt) / c² - αv

v = (4/π²) · (arctan(w^gt/h^gt) - arctan(w/h))²

α = v / ((1 - IOU) + v)

CIOU_Loss = 1 - CIOU

CIOU is an improved form of the IOU metric that additionally considers the aspect ratio among the three elements of bounding-box regression; IOU is the intersection-over-union of the areas of the ground-truth box and the predicted box; b and b^gt denote the center points of the predicted box and the ground-truth box, and ρ(b, b^gt) is the Euclidean distance between them; c is the diagonal length of the smallest enclosing region that contains both the predicted box and the ground-truth box; w^gt, h^gt, w, and h denote the width of the ground-truth box, the height of the ground-truth box, the width of the predicted box, and the height of the predicted box, respectively.
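The CIOU_Loss above can be transcribed directly for a single pair of axis-aligned boxes (an illustrative scalar helper; YOLO-V5's own implementation is batched and vectorised):

```python
import math

def ciou_loss(box_p, box_g):
    """CIOU_Loss = 1 - CIOU for boxes given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box_p
    X1, Y1, X2, Y2 = box_g
    # intersection-over-union of the two areas
    iw = max(0.0, min(x2, X2) - max(x1, X1))
    ih = max(0.0, min(y2, Y2) - max(y1, Y1))
    inter = iw * ih
    union = (x2 - x1) * (y2 - y1) + (X2 - X1) * (Y2 - Y1) - inter
    iou = inter / union
    # squared centre distance over squared enclosing-box diagonal
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    Cx, Cy = (X1 + X2) / 2, (Y1 + Y2) / 2
    rho2 = (cx - Cx) ** 2 + (cy - Cy) ** 2
    c2 = (max(x2, X2) - min(x1, X1)) ** 2 + (max(y2, Y2) - min(y1, Y1)) ** 2
    # aspect-ratio consistency term v and its trade-off weight alpha
    v = (4 / math.pi ** 2) * (
        math.atan((X2 - X1) / (Y2 - Y1)) - math.atan((x2 - x1) / (y2 - y1))
    ) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0
    return 1 - (iou - rho2 / c2 - alpha * v)
```

For identical boxes every penalty term vanishes and the loss is 0; as overlap shrinks or centers drift apart, the loss grows, which is what makes CIOU a better regression target than plain IOU for small, offset lesions.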
Step 5: save the trained model weight file; the weights are loaded into the model for lesion detection.
Step 6: input a cropped lesion image into the detection model; the model draws the predicted lesion bounding box on the original picture, displays the classification result for the lesion image, and saves the image with its bounding box to a designated folder. The YOLO-V5 network is an end-to-end structure that requires no separate anchor-box prediction stage, so it can classify a lesion picture rapidly, reaching 50 FPS, and with the added SSFPN structure it detects small-target lesions quickly and accurately.

Claims (7)

1. A training detection method based on an improved YOLOV5 focus detection model is characterized by comprising the following steps:
step 1: constructing the original training samples required for model training;
step 2: inputting the original training samples into an image detection model to obtain multi-layer feature layer data;
step 3: inputting the obtained multi-layer feature layer data into a predictor sub-model of a YOLO-V5+SSFPN image detection model, and outputting a predicted bounding box for a lesion in the image and a classification result for the lesion image;
step 4: comparing the result generated by the model with the ground-truth annotation corresponding to the training picture, and updating the parameters of the network model;
step 5: storing the trained model weight file, and loading the weights into the model for lesion detection.
2. The method of claim 1, wherein step 1 further comprises cropping the acquired images and then amplifying them by an image enhancement method, thereby increasing the number of original training samples.
3. The method of claim 2, wherein the image enhancement comprises adding salt-and-pepper noise of varying intensity to the image, applying Gaussian blur and bilateral blur of varying strength to the image, changing the brightness of the image, or combining these enhancement modes.
4. The method of claim 1, wherein the YOLO-V5 network model comprises an input stage, a Backbone, a Neck, and a prediction head; the input stage performs the operations of Mosaic data enhancement, adaptive anchor-box calculation, and adaptive picture scaling; the Backbone comprises a Focus module and a CSP module, wherein the Focus module uses a slicing operation to split a high-resolution picture/feature map into a plurality of low-resolution pictures/feature maps, and the CSP structure splits the input into two branches that are each convolved, one branch then passing through a BottleneckN operation before the two branches are concatenated; the Neck comprises FPN and PAN structures; and the prediction head uses CIOU_Loss as the bounding-box loss function together with non-maximum suppression.
5. The method of claim 4, wherein the SSFPN is arranged between the FPN and PAN modules of the YOLOV5 Backbone; the feature information of different resolutions extracted by the FPN module is resized to the same size, concatenated into a 4-dimensional tensor, subjected to a 3D convolution operation, and then input to the PAN module.
6. The method of claim 1, wherein YOLO-V5 uses CIOU_Loss as the loss function of the bounding box, given by the following formulas:
CIOU = IOU - ρ²(b, b^gt) / c² - αv

v = (4/π²) · (arctan(w^gt/h^gt) - arctan(w/h))²

α = v / ((1 - IOU) + v)

CIOU_Loss = 1 - CIOU

wherein CIOU is an improved form of the IOU metric that additionally considers the aspect ratio among the three elements of bounding-box regression; IOU is the intersection-over-union of the areas of the ground-truth box and the predicted box; b and b^gt denote the center points of the predicted box and the ground-truth box, and ρ(b, b^gt) is the Euclidean distance between them; c is the diagonal length of the smallest enclosing region that contains both the predicted box and the ground-truth box; and w^gt, h^gt, w, and h denote the width of the ground-truth box, the height of the ground-truth box, the width of the predicted box, and the height of the predicted box, respectively.
7. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method for training a human lesion model according to any one of claims 1 to 5.
CN202211274984.3A 2022-10-18 2022-10-18 Training detection method based on improved YOLOV5 focus detection model Pending CN115937089A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211274984.3A CN115937089A (en) 2022-10-18 2022-10-18 Training detection method based on improved YOLOV5 focus detection model


Publications (1)

Publication Number Publication Date
CN115937089A true CN115937089A (en) 2023-04-07

Family

ID=86651585

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211274984.3A Pending CN115937089A (en) 2022-10-18 2022-10-18 Training detection method based on improved YOLOV5 focus detection model

Country Status (1)

Country Link
CN (1) CN115937089A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116777893A * 2023-07-05 2023-09-19 脉得智能科技(无锡)有限公司 Segmentation and identification method based on characteristic nodules of breast ultrasound transverse and longitudinal sections
CN116777893B * 2023-07-05 2024-05-07 脉得智能科技(无锡)有限公司 Segmentation and identification method based on characteristic nodules of breast ultrasound transverse and longitudinal sections
CN117911418A * 2024-03-20 2024-04-19 常熟理工学院 Focus detection method, system and storage medium based on improved YOLO algorithm

Similar Documents

Publication Publication Date Title
CN111445478B (en) Automatic intracranial aneurysm region detection system and detection method for CTA image
CN108062525B (en) Deep learning hand detection method based on hand region prediction
CN111383214B (en) Real-time endoscope enteroscope polyp detection system
CN115937089A (en) Training detection method based on improved YOLOV5 focus detection model
WO2020133636A1 (en) Method and system for intelligent envelope detection and warning in prostate surgery
CN110956126B (en) Small target detection method combined with super-resolution reconstruction
Yin et al. FD-SSD: An improved SSD object detection algorithm based on feature fusion and dilated convolution
CN111260688A (en) Twin double-path target tracking method
CN109584156A (en) Micro- sequence image splicing method and device
CN110648331B (en) Detection method for medical image segmentation, medical image segmentation method and device
CN114220015A (en) Improved YOLOv 5-based satellite image small target detection method
CN112862824A (en) Novel coronavirus pneumonia focus detection method, system, device and storage medium
CN111916206B (en) CT image auxiliary diagnosis system based on cascade connection
Buayai et al. End-to-end automatic berry counting for table grape thinning
CN113610087B (en) Priori super-resolution-based image small target detection method and storage medium
CN113988271A (en) Method, device and equipment for detecting high-resolution remote sensing image change
CN115482523A (en) Small object target detection method and system of lightweight multi-scale attention mechanism
Zhang et al. TPMv2: An end-to-end tomato pose method based on 3D key points detection
US9392146B2 (en) Apparatus and method for extracting object
CN114332582A (en) Multi-scale target detection method based on infrared and visible light
CN113971763A (en) Small target segmentation method and device based on target detection and super-resolution reconstruction
JP2021064120A (en) Information processing device, information processing method, and program
CN116894959B (en) Infrared small target detection method and device based on mixed scale and focusing network
CN114596580B (en) Multi-human-body target identification method, system, equipment and medium
CN113469172B (en) Target positioning method, model training method, interface interaction method and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination