Disclosure of Invention
An embodiment of the invention provides a method for automatically detecting the quantity of pulp cargo during wharf loading and unloading.
To achieve the purpose of the invention, the following technical scheme is adopted:
The application relates to a method for automatically detecting the quantity of pulp cargo during wharf loading and unloading, characterized by comprising the following steps:
S1: extracting video images of a transport vehicle loaded with pulp cargo from a real-time video stream at a certain frequency;
S2: processing the extracted video image to segment a feature map of the pulp cargo portion;
S3: processing the feature map to obtain candidate connection points;
S4: extracting a series of candidate line segment samples from the candidate connection points;
S5: filtering the candidate line segment samples;
S6: counting the number of remaining candidate line segment samples to obtain the quantity of pulp cargo.
In the present application, step S2 includes the following:
S21: inputting the extracted video image into a residual network to obtain a feature image;
S22: processing the feature image to extract rectangular candidate boxes distinguishing foreground from background;
S23: mapping the extracted rectangular candidate boxes into the feature image, and unifying the window size of the rectangular candidate boxes using a region feature aggregation technique;
S24: segmenting the feature image into a feature map of the pulp cargo portion according to the coordinate information of the rectangular candidate boxes.
In the present application, S2 further includes the following step after S23 and before S24: S23': performing bounding-box regression on the size-unified rectangular candidate boxes of step S23 to correct their coordinate information.
In the present application, step S3 includes the steps of:
S31: dividing the feature map into M grid cells of size Wx × Hx;
S32: inputting the feature map sequentially into a convolutional layer and a classification layer to calculate the confidence of each grid cell, and converting the convolutional-layer output into a connection point offset feature map O(x):

O(x) = l_i − c_x,

where V denotes the set of connection points, l_i denotes the position of connection point i ∈ V within grid cell x, and c_x denotes the center position of grid cell x;
S33: applying a threshold to the calculated confidence of each grid cell to obtain a probability feature map P(x), and classifying whether a connection point exists in each grid cell;
S34: using the connection point offset feature map O(x) to predict the relative position of the connection point within the corresponding grid cell;
S35: optimizing the relative positions of the connection points in the corresponding grid cells using linear regression.
In the present application, step S3 further includes the following step: obtaining precise relative position information of the connection points within the grid cells by a non-maximum suppression technique.
In the present application, step S4 includes:
S41: outputting endpoint coordinate information for a series of line segment samples using a mixed positive and negative sampling mechanism;
S42: performing fixed-length vectorization on each line segment sample according to its endpoint coordinates to obtain a feature vector for each sample, thereby extracting a series of line segment samples.
In the present application, step S5 may specifically be: filtering the line segment samples using the intersection-over-union (IoU) between the areas of the rectangular frames formed by taking each line segment sample as a diagonal.
Alternatively, step S5 may specifically be: filtering the line segment samples using the Euclidean distance between line segment samples.
The automatic detection method for the quantity of pulp cargo has the following advantages and beneficial effects:
After the pulp cargo arrives at port, it is loaded onto a transport vehicle whose video stream is monitored in real time in a surveillance picture. Video images are extracted from the real-time stream, the pulp cargo portion is detected in each image, and, exploiting the characteristic that pulp cargo is bound into individual packages, line segment detection is then performed on the detected cargo portion, thereby detecting the cargo quantity. The whole process requires no manual participation, reducing the manual workload; it is intelligent and automated, and because the operation is carried out by computer, detection is fast and detection efficiency is improved.
Other features and advantages of the present invention will become more apparent from the following detailed description of the invention when taken in conjunction with the accompanying drawings.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
In order to avoid manually counting pulp cargo packages during wharf loading and unloading, the present application provides an automatic detection method for the quantity of pulp cargo. Its main task is to detect the pulp cargo packages on every vehicle passing through the surveillance video and calculate the number of packages they contain.
This task, automatic detection of the quantity of pulp cargo on a transport vehicle, is achieved by the combined use of target detection and line segment detection.
Target detection has two main tasks, classification and localization of the target: foreground objects must be separated from the background, and the category and position information of the foreground must be determined.
Current target detection algorithms can be divided into candidate-region-based and end-to-end methods, with candidate-region-based methods being superior in detection accuracy and localization precision. Meanwhile, because pulp cargo is placed irregularly, auxiliary information such as volume and midpoint is difficult to obtain, which makes further determining the cargo quantity harder.
However, pulp cargo has the characteristic of being bound bundle by bundle; combining this characteristic with current line segment detection technology, the quantity information can be extracted effectively by a line segment detection method.
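Under the assumption of already-trained detection and line-segment models, the combination of the two techniques over steps S1–S6 can be sketched as follows; every name here (`detector`, `junction_model`, `segment_model`, `overlap`) is a hypothetical placeholder, not part of the original disclosure:

```python
# Hypothetical high-level sketch of the S1-S6 pipeline described in the text.
def count_pulp_packages(frame, detector, junction_model, segment_model,
                        iou_threshold=0.5):
    feature_map = detector.segment_cargo(frame)        # S2: target detection
    junctions = junction_model.predict(feature_map)    # S3: candidate connection points
    segments = segment_model.propose(junctions)        # S4: candidate line segments
    kept = filter_overlaps(segments, iou_threshold)    # S5: filtering
    return len(kept)                                   # S6: count = cargo quantity

def filter_overlaps(segments, threshold):
    """Greedy duplicate removal: drop a segment overlapping an already-kept one."""
    kept = []
    for seg in segments:
        if all(overlap(seg, k) < threshold for k in kept):
            kept.append(seg)
    return kept

def overlap(s1, s2):
    """Placeholder overlap measure (e.g. the rectangle IoU of step S5)."""
    return 0.0  # assumption: the real measure is defined later in the text
```

The greedy filter mirrors the role of step S5: with a real overlap measure substituted, duplicate line segments are suppressed before counting.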
Referring to figs. 1 to 7, the implementation of the automatic pulp cargo quantity detection method is described in detail as follows.
S1: the video stream of the vehicle loaded with pulp cargo is extracted in real time at a certain frequency.
After arrival of the pulp cargo, it is loaded onto a transport means such as a truck, which is monitored in a surveillance picture; video images are captured by extracting frames from the monitored video stream at a certain number of frames per second, see fig. 2.
It should be noted that each package of pulp cargo loaded on the transport vehicle is substantially rectangular with substantially uniform height and cross-sectional area, and is bound into a package by straps (e.g., steel wires).
S2: and processing the extracted video image to segment a characteristic diagram of the pulp cargo part.
The task of step S2 is to implement target detection, and is performed in a plurality of sections as follows.
S21: the video image is input into a residual network to obtain a feature image.
To extract semantic features of the video image and improve the ability to capture fine image details, the video image is input into a residual network to obtain a feature map.
The residual network is an asymmetric encoder-decoder structure incorporating dilated convolution.
For convenience of subsequent processing, input video images of different sizes are first resized to a uniform 512 × 512 square and then passed sequentially through several encoding and decoding modules.
In each encoding-decoding module, the encoding part comprises 3 convolution operations with stride 2, each followed by 6 residual blocks, in which the convolution kernels use a dilation rate of 2 to obtain a larger receptive field. The decoding part restores the image to the input size.
This asymmetric encoder-decoder structure fully extracts details while fusing a larger receptive field, so the residual module provides rich detail features for subsequent processing.
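The receptive-field growth produced by the stride-2 convolutions and dilation-rate-2 residual blocks can be sketched numerically. This is an illustrative calculation only; it assumes 3 × 3 kernels and two convolutions per residual block, neither of which is specified in the text:

```python
# Sketch: receptive-field growth of the encoder described above.
# Assumptions (not stated in the text): 3x3 kernels, 2 convolutions per block.
def receptive_field(layers):
    """layers: list of (kernel, stride, dilation) tuples, input to output."""
    rf, jump = 1, 1
    for k, s, d in layers:
        k_eff = d * (k - 1) + 1      # effective kernel size under dilation
        rf += (k_eff - 1) * jump     # field grows by (k_eff - 1) * output spacing
        jump *= s                    # spacing of adjacent outputs, in input pixels
    return rf

# One encoding stage: a stride-2 convolution followed by 6 residual blocks
# (two dilated 3x3 convolutions each, dilation rate 2).
stage = [(3, 2, 1)] + [(3, 1, 2)] * 12
encoder = stage * 3                  # 3 stages, as stated in the text

print(receptive_field(encoder))
```

Under these assumptions the encoder's receptive field already exceeds the 512-pixel input side, which is consistent with the text's claim of a large fused receptive field.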
S22: and processing the characteristic image, and extracting a rectangular candidate frame for distinguishing the foreground from the background.
Firstly, generating 9 rectangular frames containing 3 shapes (the length-width ratio belongs to {1, 2 }), traversing each point in the characteristic image, matching the 9 rectangular frames for each point, judging the rectangular frame belonging to the foreground through a Softmax classifier, and solving a two-classification problem; and meanwhile, the coordinate information of the rectangular frame is corrected by using border regression of the frame to form a more accurate rectangular candidate frame.
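Generating the 9 rectangular boxes at one feature-map point can be sketched in the style of candidate-region detectors such as Faster R-CNN; the scale values below are illustrative assumptions, not taken from the original disclosure:

```python
import numpy as np

# Illustrative sketch of 9-anchor generation at one feature-map point.
# The scales are assumed values; the aspect ratios follow the text.
def make_anchors(cx, cy, scales=(64, 128, 256), ratios=(0.5, 1.0, 2.0)):
    """Return 9 boxes (x1, y1, x2, y2) centred on point (cx, cy)."""
    boxes = []
    for s in scales:
        for r in ratios:
            w = s * np.sqrt(r)       # width grows with the aspect ratio
            h = s / np.sqrt(r)       # height shrinks so that w * h == s * s
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return np.array(boxes)

anchors = make_anchors(256.0, 256.0)
print(anchors.shape)  # (9, 4): 3 scales x 3 aspect ratios
```

Each of the 3 scales keeps a constant box area across the 3 ratios, so the classifier compares foreground fit over shape as well as size.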
Note that the foreground refers to the pulp cargo, and the background refers to a portion other than the pulp cargo portion.
S23: and mapping the extracted rectangular candidate frame into a feature image, and unifying the window size of the rectangular candidate frame by using a regional feature aggregation technology.
The region feature aggregation technique is proposed when used in Mask RCNN to generate a fixed-size feature map from the generated candidate box region pro-apparent map, which is a technique commonly used in the existing example segmentation architecture.
S24: and segmenting the characteristic image into a characteristic image of the pulp cargo part according to the coordinate information of the rectangular candidate frame.
And finding out the rectangular candidate frames belonging to the foreground according to the coordinate information of each rectangular candidate frame, thereby segmenting the characteristic diagram of the pulp cargo part.
In order to ensure the accuracy of the rectangular candidate frame and improve the automatic detection precision of the number of the paper pulp cargos, before the feature map of the paper pulp cargo part is segmented, the rectangular candidate frame obtained in the step S23 is subjected to sequential boundary frame regression, the coordinate information of the rectangular candidate frame is corrected, and the accuracy of the coordinate information is realized.
According to the rectangular candidate frame with accurate coordinate information, the characteristic image is divided into a characteristic diagram of the pulp cargo part according to the coordinate information of the rectangular candidate frame.
And (5) segmenting a characteristic diagram of the pulp cargo part, namely completing target detection.
Then, the line segment detection is needed to be carried out on the characteristic diagram of the pulp cargo part so as to detect the quantity of the pulp cargo in the video image.
The specific line segment detection section is described below with reference to fig. 1 to 7.
S3: and processing the characteristic graph to obtain candidate connection points.
The characteristic diagram here refers to the characteristic diagram of the pulp cargo portion divided in S24.
S31: the feature map is subjected to grid division to formMA grid cellWx×Hx。
Performing gridding treatment on the feature mapW×HIs divided intoMA grid cell having a grid area ofWx×HxWhere V represents a set of points.
In a certain grid cellxIn the method, it is required to predict whether a candidate connection point exists, and if a connection point exists, predict that the connection point exists in the grid cellxRelative position in (2).
S32: and sequentially inputting the feature maps into a convolutional layer and a classification layer for processing so as to calculate the confidence coefficient of each grid unit, and converting the feature maps processed by the convolutional layer into a connection point offset feature map O (x).
Specifically, the feature maps are processed using a network comprising 1 x 1 convolutional layers and classification layers in which confidence is calculated by a softmax classification function as to whether there is a connection point in each grid cell.
And converting the characteristic diagram into a characteristic diagram O (with offset of connection points) by using a network containing 1 × 1 convolution layersx) The following:
wherein,
lirepresenting a connection point in the set of points V
iIn the grid cell
xIn the position (a) of (b),
representing grid cells
xThe center position of (a).
S33: performing threshold value limitation on the calculated confidence of each grid unit to obtain a probability feature map P (x) And for each grid cell isWhether a connection point exists is classified.
Probability feature map P (x) The following were used:
that is, whether there is a connection point in the grid cell is a two-classification problem.
Limiting the calculated confidence of each grid unit by a threshold value p if the grid unit isxIf the confidence of (2) is greater than the threshold, then P (3) is satisfiedx) =1, consider the grid cellxIn (b) there is a connection point, otherwise P: (b)x) =0, consider the grid cellxThere are no connection points.
If the presence of a connection point in a grid cell is predicted, the relative location of the connection point in the grid cell continues to be predicted (this section is described in detail below).
S34: using a connection point offset profile O (x) The relative position of the connection point in the corresponding grid cell is predicted.
O(x) Arranged as the center point and connection point of the grid celliFor predicting the connection pointiIn the grid cellxRelative position in (2).
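The construction of the probability map P(x) and offset map O(x) from a set of connection points can be sketched as follows; the grid size, cell size and point coordinates are illustrative values, not taken from the disclosure:

```python
import numpy as np

# Illustrative sketch of building P(x) and O(x) targets on a grid.
def junction_maps(points, grid=(8, 8), cell=16.0):
    """points: (N, 2) array of (x, y) junction coordinates in pixels.
    Returns P (per-cell existence map) and O (offset of each junction
    from its cell centre, in pixels)."""
    P = np.zeros(grid)
    O = np.zeros(grid + (2,))
    for x, y in points:
        j, i = int(x // cell), int(y // cell)   # cell indices (col, row)
        cx = (j + 0.5) * cell                   # cell centre c_x
        cy = (i + 0.5) * cell
        P[i, j] = 1.0                           # P(x): a junction exists here
        O[i, j] = (x - cx, y - cy)              # O(x) = l_i - c_x
    return P, O

P, O = junction_maps(np.array([[20.0, 36.0]]))
print(P[2, 1], O[2, 1])   # cell (row 2, col 1) has centre (24, 40): offset (-4, -4)
```

At inference time the same relationship is applied in reverse: a cell with P(x) = 1 recovers its junction position as cell centre plus the predicted offset.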
S35: the relative positions of the connection points in the corresponding grid cells are optimized using linear regression.
If the grid unit x comprises the connection point i, an L2 linear regression is selected to optimize the relative position of the connection point, and the objective function of the L2 linear regression is as follows:
whereinNvIndicating the number of connection points.
In addition, a non-maximum suppression technique is employed to further eliminate non-connection points in each grid cell, i.e., to obtain more accurate relative position information of the connection points in the grid cells.
In the procedure, this may be implemented by a max-pooling operation, which keeps only locally maximal responses and thereby yields more accurate relative position information of the connection points in the grid cells.
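A minimal sketch of this max-pooling form of non-maximum suppression on a confidence map, with illustrative values: a cell survives only if it is the maximum of its 3 × 3 neighbourhood.

```python
import numpy as np

# Sketch: non-maximum suppression via 3x3 max pooling on a score map.
def nms_maxpool(scores):
    H, W = scores.shape
    padded = np.pad(scores, 1, constant_values=-np.inf)
    # 3x3 max pooling with stride 1: max over the 9 shifted views
    pooled = np.max(
        [padded[dy:dy + H, dx:dx + W] for dy in range(3) for dx in range(3)],
        axis=0,
    )
    # keep a score only where it equals its neighbourhood maximum
    return np.where(scores == pooled, scores, 0.0)

scores = np.array([[0.1, 0.9, 0.8],
                   [0.2, 0.3, 0.1],
                   [0.7, 0.1, 0.0]])
print(nms_maxpool(scores))
```

Here 0.8 is suppressed because its neighbour 0.9 is larger, while 0.9 and 0.7 survive as local maxima.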
After the processing of S31 to S35 described above, the relative positions of the K candidate connection points with the highest confidence are finally output; refer to fig. 3.
It should be noted that before step S3 is actually executed, the whole model in S3 must be trained with a cross-entropy loss function. In actual use the trained model can be used directly: with the feature map as input, it outputs the candidate connection points as described above.
S4: and extracting a series of candidate line segment samples according to the candidate connecting points acquired in the S3.
The purpose of this step is to select K candidate connection points based on the K candidate connection points obtained in S3
Obtaining T candidate line segment samples
Wherein
And
is shown as
zThe coordinates of the endpoints of the sample of candidate line segments.
S41: and acquiring the endpoint coordinate information of the T candidate line segment samples by adopting a positive and negative sample mixed sampling mechanism.
It should be noted that the mixed sampling of positive and negative samples is a preparation work for model training, in the training process, the difference between the number of positive and negative samples of the K candidate connection points is large, the number of positive and negative samples needs to be balanced, and a mixed training mode of positive and negative samples is adopted, wherein the positive samples come from the labeled true line segment samples, and the negative samples are unreal line segment samples generated randomly through heuristic learning.
When there are few accurate positive samples or training is saturated in the extracted K candidate connection points, quantitative positive/negative samples are added to help start training. Moreover, the added positive samples help the prediction points to adjust the positions, and the prediction performance is improved.
S41: and performing fixed-length vectorization processing on each line sample according to the endpoint coordinate information of each line sample to obtain a feature vector of each line sample so as to extract a series of candidate line samples.
Based on some sample of candidate line segments, e.g. of
zTwo endpoint coordinates of candidate line segment sample
And
vectorization processing of fixed length for line segment sample, i.e. calculation by two end point coordinates
N l Uniformly distributing points, and obtaining coordinates of intermediate points on the feature map output in the step S2 through bilinear interpolation:
thus, a feature vector q of line segment samples is extracted, which isC×N l In whichCThe number of channels of the feature map output in step S2.
At this time, a sample of candidate line segments is extracted.
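The fixed-length vectorization above can be sketched as follows: N_l points are sampled uniformly between the two endpoints and the feature map is read at each point by bilinear interpolation. The feature map and endpoints below are illustrative values:

```python
import numpy as np

# Sketch: fixed-length vectorization of one line segment sample.
def segment_vector(fmap, p1, p2, n_points=8):
    """fmap: (C, H, W) feature map; p1, p2: (x, y) segment endpoints.
    Returns a (C, n_points) feature vector q for the segment."""
    C, H, W = fmap.shape
    t = np.linspace(0.0, 1.0, n_points)
    xs = p1[0] + t * (p2[0] - p1[0])              # N_l uniformly spaced points
    ys = p1[1] + t * (p2[1] - p1[1])
    x0 = np.clip(np.floor(xs).astype(int), 0, W - 2)
    y0 = np.clip(np.floor(ys).astype(int), 0, H - 2)
    fx, fy = xs - x0, ys - y0                     # fractional offsets
    # bilinear interpolation over the four surrounding pixels
    q = (fmap[:, y0, x0] * (1 - fx) * (1 - fy)
         + fmap[:, y0, x0 + 1] * fx * (1 - fy)
         + fmap[:, y0 + 1, x0] * (1 - fx) * fy
         + fmap[:, y0 + 1, x0 + 1] * fx * fy)
    return q

fmap = np.arange(64, dtype=float).reshape(1, 8, 8)
q = segment_vector(fmap, (0.0, 0.0), (7.0, 0.0), n_points=8)
print(q)   # samples the linear ramp 0..7 along the top row
```

Because the number of sampled points N_l is fixed, every segment, whatever its length, yields a feature vector of the same size C × N_l for the later classifier.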
It should be noted that before a series of candidate line segment samples can be obtained from the endpoint coordinate information, the model must be trained.
During training, the feature map output in step S2 and the candidate line segment samples output in step S4 are required as input.
The training process is briefly described as follows:
first, based on a sample of candidate line segments, e.g. the first
zCoordinates of two endpoints of each candidate line segment sample
And
vectorization processing of fixed length for line segment sample, i.e. calculation by two end point coordinates
N l Uniformly distributing the points, and obtaining the coordinates of the intermediate points on the characteristic diagram output in the step S2 through bilinear interpolation:
then, the feature vector q is dimensionality reduced by a one-dimensional maximum pooling operation of step size s to becomeC×N l And/s and is expanded into a one-dimensional feature vector.
Inputting the one-dimensional feature vector into a full-connection layer for convolution processing to obtain a logic value, specifically, after performing full-connection convolution twice on the one-dimensional feature vector, taking a log value and returning to the full-connection layer
。
And true value
ySigmoid loss calculation and model optimization are carried out together to improve the prediction accuracy, wherein the loss function is as follows:
wherein the true valueyThat is, after the feature values in the feature map output in S2 by the true labels of the line segment samples are convolved, the log is taken and the value is returned.
The penalty is the log of the calculated prediction (i.e., the
) And the error between the log values (namely y) corresponding to the real labels of the line segment samples is used for model training and optimization.
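A numeric sketch of this sigmoid (binary cross-entropy) loss; the logits and labels below are made-up illustrative values:

```python
import numpy as np

# Sketch: sigmoid / binary cross-entropy loss over candidate segments.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(logits, labels):
    p = sigmoid(logits)                       # predicted probability per segment
    return float(np.mean(-(labels * np.log(p) + (1 - labels) * np.log(1 - p))))

logits = np.array([4.0, -3.0, 0.0])   # predicted logit per candidate segment
labels = np.array([1.0, 0.0, 1.0])    # 1 = true line segment, 0 = negative sample
print(round(bce_loss(logits, labels), 4))
```

Confident correct predictions (the first two entries) contribute little loss, while the uncertain third entry dominates and drives the gradient during training.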
Since repeated detection inevitably occurs, i.e., two line segment samples overlap, the line segment samples output in S4 need to be filtered in order to improve detection accuracy.
The filtering is implemented by step S5: each line segment sample is filtered either using the Intersection-over-Union (IoU) between the areas of the rectangular frames formed by taking each line segment as a diagonal, or using the Euclidean distance between line segment samples.
The filtering method is not limited here, as long as the purpose of filtering is achieved.
Filtering line segment samples by IoU is taken as an example.
Suppose the coordinates of two coincident line segment samples are L1 [(x11, y11), (x12, y12)] and L2 [(x21, y21), (x22, y22)].
The length H1, width W1 and area A1 of the rectangular frame R1 formed by L1, and the length H2, width W2 and area A2 of the rectangular frame R2 formed by L2, are respectively:

H1 = |y12 − y11|, W1 = |x12 − x11|, A1 = H1 · W1;
H2 = |y22 − y21|, W2 = |x22 − x21|, A2 = H2 · W2.

The length H, width W and area A of the rectangle where R1 and R2 intersect are respectively:

W = min(max(x11, x12), max(x21, x22)) − max(min(x11, x12), min(x21, x22)),
H = min(max(y11, y12), max(y21, y22)) − max(min(y11, y12), min(y21, y22)),
A = H · W.

If W ≤ 0 or H ≤ 0, then IoU = 0. Otherwise, the IoU is calculated using the following equation:

IoU = A / (A1 + A2 − A).

A threshold on the IoU is set and adjusted, filtering is performed, and the filtered line segment samples are finally output, as shown in fig. 4.
S6: and counting the number of the line segment samples after filtering to obtain the number of the pulp cargos.
The number of line segment samples after filtering is the number of pulp acquisitions.
Fig. 5 is a final detection diagram of the line segment candidate samples in fig. 4 mapped onto the original paper pulp goods image, and it can be seen from the detection effect that the automatic detection of the number of the paper pulp goods is accurate.
Referring to fig. 6 and 7, fig. 6 is a truncated original video image; fig. 7 is a diagram showing the effect of the pulp cargo field inspection applied to the video image in fig. 6 by using the automatic pulp cargo quantity detection method proposed in the present application, which outputs the quantity of line segment samples, i.e., the pulp cargo quantity.
As can be seen from the detection effect graphs of FIG. 5 and FIG. 7, the automatic detection method for the quantity of the paper pulp cargos, which is provided by the application, has high detection accuracy.
The automatic detection method for the quantity of the paper pulp cargos directly extracts the video images from the video stream for detection, is carried out automatically, does not need manual participation, reduces the manual task amount, and is high in detection speed due to intelligent computer calculation, and the detection efficiency is improved.
The above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions.