CN116229458A - A method for detecting inclusions based on YOLOV5 - Google Patents
A method for detecting inclusions based on YOLOV5
- Publication number: CN116229458A (application CN202310231853.5A)
- Authority: CN (China)
- Prior art keywords: model, inclusions, yolov5, fluid, detection
- Prior art date: 2023-03-10
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V20/693 — Microscopic objects, e.g. biological cells or cellular parts: acquisition
- G06N3/08 — Computing arrangements based on biological models; neural networks: learning methods
- G06V10/806 — Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level: of extracted features
- G06V10/82 — Image or video recognition or understanding using pattern recognition or machine learning: using neural networks
- G06V20/695 — Microscopic objects: preprocessing, e.g. image segmentation
Abstract
Description
Technical Field
The present invention relates to the field of oil and gas exploration, and in particular to a method for detecting fluid inclusions based on YOLOv5.
Background Art
Inclusions trapped in minerals are the most complete and direct samples of original fluids (or melts) preserved to date. Microthermometry of inclusions provides a quantitative basis for determining fluid properties, fluid sources, diagenetic stages, and the diagenetic environment, and it has important guiding significance for oil and gas resource evaluation, reservoir geochemistry, fluid typing, fluid-source analysis, and exploration. The homogenization temperature of hydrocarbon inclusions can be used to determine the timing of hydrocarbon charging, and differences in charging time directly delineate the stages of hydrocarbon accumulation. Combined with the geological-structural setting of the sample and the burial and thermal history of the reservoir, homogenization temperatures can also be used to estimate accumulation timing and paleogeothermal gradients, to infer the diagenetic history, and to study source-rock maturity. The first step in any inclusion study is to locate the inclusions in their host mineral. The method commonly used at present is to examine an inclusion wafer under an optical microscope, circle the approximate locations, and then proceed to microthermometric measurement.
Object detection, a classic task in computer vision, requires a computer to automatically determine the location and class of target objects in an image or video. Current deep learning methods for object detection fall into two categories: two-stage and one-stage algorithms. A two-stage algorithm first generates a set of candidate boxes as samples and then classifies them with a convolutional neural network; a one-stage algorithm dispenses with candidate boxes and casts bounding-box localization directly as a regression problem. This difference in approach produces a difference in performance: two-stage detectors lead in detection accuracy and localization precision, while one-stage detectors lead in speed. As a one-stage detector, YOLOv5 offers low computational cost and fast recognition; it is currently widely used for object recognition and achieves good results.
Finding inclusions under an optical microscope can likewise be regarded as an image-processing task: searching for patterns of particular shapes. Most fluid inclusions are elliptical, round, or irregular in shape, and a small number of pure liquid-phase inclusions are irregular. Fluid inclusions are mainly single-phase or two-phase, and in most gas-liquid two-phase inclusions the vapor phase vibrates vigorously. This characteristic vibration produces distinctive features in an instantaneous image, so deep learning can be used to recognize images of these particular shapes, i.e., to perform inclusion object detection. YOLOv5 is a high-performance, general-purpose object detection model that performs localization and classification in a single pass, and it is therefore chosen as the basic skeleton for detection.
Because inclusions are small, searching for them under an optical microscope is generally time-consuming and labor-intensive, depends on operator experience, and manual searching inevitably misses or misidentifies some inclusions. The present invention improves the YOLOv5 model and raises the accuracy and effectiveness of fluid inclusion detection.
Summary of the Invention
The technical problem to be solved by the present invention is to provide a method for detecting fluid inclusions based on YOLOv5. Because inclusion targets are small, the invention studies and improves the feature extraction network and the feature fusion network of the YOLOv5 model and adds a small-object detection layer, improving the ability to detect fluid inclusions. The invention overcomes the shortcomings of manual searching and achieves efficient, accurate identification of inclusions.
To achieve the above object, the invention is realized through the following technical solution, whose specific steps are:
Step 1, inclusion image collection: observe inclusions under an optical microscope and record videos of them. Sample the recorded videos, extracting frames at different times to obtain images.
Step 2, image annotation and dataset partition: annotate the bounding-box positions and classes of the inclusions in the obtained images, and randomly split the annotated data into a training set and a test set at a ratio of 4:1.
Step 3, image data augmentation: process the training images with torchvision, applying rotation and cropping to increase the number of training images and thereby improve the recognition ability of the model.
Step 4, build the model: the YOLOv5 detection network consists of three parts, the backbone, the neck, and the output module. The backbone includes the BottleneckCSP module and the Focus module. The BottleneckCSP module enhances the learning capacity of the whole convolutional neural network. The Focus module slices the input image, expanding the input channels to four times the original number, and a subsequent convolution yields a downsampled feature map. The neck adopts a structure combining FPN and PAN: the conventional top-down FPN is combined with a bottom-up feature pyramid so that the extracted semantic features are fused with positional features, and features from the backbone are fused with those of the detection layers, giving the model richer feature information. The output module predicts from the image features and outputs a vector containing the class probabilities of the target object, an objectness score, and the position of the object's bounding box.
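The slicing operation of the Focus module can be illustrated with a minimal PyTorch sketch (the channel counts and kernel size are assumptions for illustration, not the exact YOLOv5 implementation):

```python
import torch
import torch.nn as nn

class Focus(nn.Module):
    """Minimal sketch of YOLOv5's Focus idea: slice, concatenate, convolve."""
    def __init__(self, c_in, c_out, k=3):
        super().__init__()
        # after slicing, the channel count is 4 * c_in
        self.conv = nn.Conv2d(4 * c_in, c_out, k, stride=1, padding=k // 2)

    def forward(self, x):
        # take every other pixel in each spatial direction:
        # (b, c, h, w) -> (b, 4c, h/2, w/2), then fuse with one convolution
        return self.conv(torch.cat([x[..., ::2, ::2],
                                    x[..., 1::2, ::2],
                                    x[..., ::2, 1::2],
                                    x[..., 1::2, 1::2]], dim=1))
```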
Adding an attention mechanism: a coordinate attention (CA) mechanism is introduced into the YOLOv5 backbone. CA encodes channel relationships and long-range dependencies using precise positional information; the operation comprises two steps, coordinate information embedding and coordinate attention generation. First, global average pooling is factorized into a horizontal and a vertical direction. Specifically, given an input X, each channel is encoded along the horizontal and vertical coordinates using pooling kernels of size (H, 1) and (1, W), respectively. The output of the c-th channel at height h can therefore be expressed as

$$z_c^h(h) = \frac{1}{W} \sum_{0 \le i < W} x_c(h, i)$$
Similarly, the output of the c-th channel at width w can be expressed as

$$z_c^w(w) = \frac{1}{H} \sum_{0 \le j < H} x_c(j, w)$$
These two transformations aggregate features along the two spatial directions, yielding a pair of direction-aware feature maps. Encoding horizontal and vertical positional information into channel attention allows a lightweight network to attend to positional information over a large range without incurring excessive computation. The mechanism captures not only inter-channel information but also direction-dependent positional information, which helps the model localize and identify targets more accurately; it is flexible and lightweight enough to be simply inserted into the core structure of a mobile network; and it aids the localization and identification of inclusions.
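A compact PyTorch sketch of such a coordinate attention block is given below (the reduction ratio, activation, and layer names are assumptions; the structure follows the two steps described above):

```python
import torch
import torch.nn as nn

class CoordAtt(nn.Module):
    """Sketch of a coordinate attention block: embedding, then attention generation."""
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        # coordinate information embedding: average-pool along W and along H
        x_h = x.mean(dim=3, keepdim=True)                      # (b, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)  # (b, c, w, 1)
        # coordinate attention generation: shared transform, then split
        y = self.act(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # attention over height
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # attention over width
        return x * a_h * a_w
```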
Optimizing the neck of the model: the PANet of the original network is replaced with a BiFPN to improve detection accuracy. The BiFPN introduces learnable weight factors that characterize the importance of different input features, while repeatedly applying top-down and bottom-up multi-scale feature fusion. It removes the PANet nodes that contribute little to feature fusion and adds a skip connection between the input and output nodes of the same scale, fusing more features at little additional cost. At a given feature scale, each bidirectional path is treated as one feature-network layer, and the same layer is reused repeatedly to achieve higher-level feature fusion and thus higher detection accuracy. In addition, a 4x downsampling step is applied to the original input image: after 4x downsampling, the image is fed into the feature fusion network to obtain a feature map of a new size whose receptive field is small and whose positional information is relatively rich, which improves the detection of small targets.
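The learnable-weight fusion at a single BiFPN node can be sketched as follows (a fast-normalized-fusion sketch, assuming all inputs have already been resampled to the same shape; the module name and epsilon value are illustrative):

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """Fast normalized fusion: learnable non-negative weights, normalized to sum to ~1."""
    def __init__(self, n_inputs, eps=1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_inputs))
        self.eps = eps

    def forward(self, xs):
        # xs: list of n_inputs feature maps with identical shapes
        w = torch.relu(self.w)        # keep the learned weights non-negative
        w = w / (w.sum() + self.eps)  # normalize so the weights sum to ~1
        return sum(wi * xi for wi, xi in zip(w, xs))
```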
Adjusting the detection scales: like conventional object detection networks, the original YOLOv5s network begins feature fusion at the third feature level. A small-object detection layer feeds the second feature level into the feature fusion network, improving the network's ability to detect small targets; the present invention adds such a small-object detection layer to the original YOLOv5s algorithm to retain shallow semantic information. The 160x160 feature map of the feature extraction network, which originally did not take part in fusion, is added to the detection head, and one additional upsampling operation and one additional downsampling operation are added to the feature fusion network, increasing the number of output detection layers to four. With the added detection layer, the number of anchor boxes rises correspondingly from 9 to 12; the three added anchors have different aspect ratios and are aimed at small-target detection.
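For illustration, a four-scale anchor layout might look like the following (the P3-P5 rows are the YOLOv5s defaults; the P2 row for the added small-object layer is hypothetical):

```python
# One (width, height) anchor triple per detection scale: 4 scales x 3 anchors = 12.
anchors = [
    [(4, 5), (8, 10), (13, 16)],          # P2/4: 160x160 map, added layer (illustrative values)
    [(10, 13), (16, 30), (33, 23)],       # P3/8 (YOLOv5s default)
    [(30, 61), (62, 45), (59, 119)],      # P4/16 (YOLOv5s default)
    [(116, 90), (156, 198), (373, 326)],  # P5/32 (YOLOv5s default)
]
```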
Step 5, train the model and tune its parameters: train the improved YOLOv5 model on the partitioned training set. At each iteration, compute the loss function and update the parameter values to minimize its value, until the model converges, while guarding against overfitting.
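A schematic training loop of this kind is sketched below (the optimizer settings, the form of the loss function, and the early-stopping patience are assumptions, not the exact YOLOv5 training pipeline):

```python
import torch

def train(model, loader, loss_fn, epochs=100, lr=0.01, patience=10):
    """Minimize the loss each iteration; stop early when it no longer improves."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.937)
    best, stale = float("inf"), 0
    for epoch in range(epochs):
        model.train()
        total = 0.0
        for imgs, targets in loader:
            opt.zero_grad()
            loss = loss_fn(model(imgs), targets)  # box + objectness + class terms
            loss.backward()
            opt.step()
            total += loss.item()
        if total < best:                  # keep the best weights seen so far
            best, stale = total, 0
            torch.save(model.state_dict(), "best.pt")
        else:
            stale += 1
            if stale >= patience:         # early stopping guards against overfitting
                break
```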
Step 6, after model training is complete, save the model weight parameters in .pt format. Reload the saved weight file and use it to detect inclusions, i.e., to determine whether inclusions are present in an image.
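Reloading the saved weights for detection could, for example, use the YOLOv5 PyTorch Hub interface (the weight and image file names are hypothetical):

```python
import torch

# Load a custom-trained YOLOv5 model from the saved .pt weight file.
model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt")
results = model("inclusion_sample.jpg")  # run detection on one microscope image
results.print()                          # summary: classes, confidences, counts
boxes = results.xyxy[0]                  # tensor of (x1, y1, x2, y2, conf, cls) rows
```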
Brief Description of the Drawings
Fig. 1: collected and processed inclusion images;
Fig. 2: image annotation with Make Sense;
Fig. 3: the CA attention mechanism;
Fig. 4: the improved YOLOv5 network model;
Fig. 5: detection results.
Detailed Description of the Embodiments
Step 1, inclusion image collection: since no existing dataset of inclusions is available, the dataset of the present invention was collected from five dolomite thin sections in the laboratory. Inclusions were observed under an optical microscope and videos of them were recorded. OpenCV was then used to sample the recorded videos at intervals, extracting frames at different times to obtain 500 images.
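Interval sampling of the recorded videos can be sketched with OpenCV as follows (the sampling interval and file naming are assumptions):

```python
import cv2

def sample_frames(video_path, out_dir, interval=30):
    """Save every `interval`-th frame of a recorded inclusion video as an image."""
    cap = cv2.VideoCapture(video_path)
    idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of video
            break
        if idx % interval == 0:
            cv2.imwrite(f"{out_dir}/frame_{saved:04d}.jpg", frame)
            saved += 1
        idx += 1
    cap.release()
    return saved
```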
Step 2, image annotation and dataset partition: Make Sense was used to annotate the bounding-box positions and classes of the objects in the obtained images, and the dataset was then divided into a training set and a validation set at a ratio of 4:1.
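The 4:1 partition can be sketched as a simple random split (the function and seed are illustrative):

```python
import random

def split_dataset(image_paths, ratio=0.8, seed=0):
    """Shuffle the annotated images and split them 4:1 into train/validation sets."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)
    k = int(len(paths) * ratio)
    return paths[:k], paths[k:]
```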
Step 3, image data augmentation: the training images were processed with torchvision, applying rotation and cropping to increase the number of training images and thereby improve the recognition ability of the model.
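A minimal torchvision sketch of such augmentation is shown below (the rotation angle and crop size are assumptions; in a detection setting the bounding-box annotations must be transformed consistently with the images):

```python
from torchvision import transforms

# Rotation and cropping to enlarge the training set; parameters are illustrative.
augment = transforms.Compose([
    transforms.RandomRotation(degrees=15),
    transforms.RandomCrop(size=(600, 600)),
    transforms.ToTensor(),
])
# Usage on a PIL image: tensor = augment(pil_image)
```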
Steps 4 through 6 proceed exactly as described in the Summary of the Invention above: the improved YOLOv5 model is built with the CA attention mechanism, the BiFPN neck, and the additional small-object detection layer; it is trained on the partitioned training set until convergence while guarding against overfitting; and the trained weights are saved in .pt format and reloaded to detect whether inclusions are present in an image.
Claims (2)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202310231853.5A | 2023-03-10 | 2023-03-10 | A method for detecting inclusions based on YOLOV5
Publications (1)
Publication Number | Publication Date
---|---
CN116229458A (en) | 2023-06-06
Family
- ID: 86588996

Family Applications (1)
Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202310231853.5A | A method for detecting inclusions based on YOLOV5 | 2023-03-10 | 2023-03-10

Country Status (1)
Country | Link
---|---
CN (1) | CN116229458A (en)
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN117496666A (en) * | 2023-11-16 | 2024-02-02 | Chengdu University of Technology | An intelligent and efficient drowning rescue system and method
Patent Citations (3)
Publication number | Priority date | Publication date | Title
---|---|---|---
CN111444833A (en) * | 2020-03-25 | 2020-07-24 | Fruit measurement production method and device, computer equipment and storage medium
CN113705531A (en) * | 2021-09-10 | 2021-11-26 | Method for identifying alloy powder inclusions based on microscopic imaging
CN114627502A (en) * | 2022-03-10 | 2022-06-14 | A target recognition detection method based on improved YOLOv5
Non-Patent Citations (2)
Title
---
Yu Pingping et al., "Dense log end-face detection method combining BiFPN and YOLOv5s" (融合BiFPN和YOLOv5s的密集型原木端面检测方法), Journal of Forestry Engineering (林业工程学报), vol. 8, no. 1, 25 January 2023, pp. 126-134. *
Zou Huijun et al., "Improved YOLO network for small-target foreign-object detection on transmission lines" (面向输电线路小目标异物检测的改进YOLO网络), Journal of Nanjing Institute of Technology (Natural Science Edition) (南京工程学院学报(自然科学版)), vol. 20, no. 3, 15 September 2022, pp. 7-14. *
Legal Events
Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | CB03 | Change of inventor or designer information | Inventors after: Wang Xingjian, Wen Xuemei, Cao Junxing, He Faqi. Inventors before: Wang Xingjian, Wen Xuemei, Cao Junxing.