WO2023159927A1 - Rapid object detection method based on conditional branches and expert systems - Google Patents

Rapid object detection method based on conditional branches and expert systems

Info

Publication number
WO2023159927A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
roi
expert system
contribution
image
Prior art date
Application number
PCT/CN2022/120298
Other languages
French (fr)
Chinese (zh)
Inventor
高红霞
黄滨
廖宏宇
牛世成
Original Assignee
华南理工大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华南理工大学 (South China University of Technology)
Publication of WO2023159927A1 publication Critical patent/WO2023159927A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing


Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

A rapid object detection method based on conditional branches and expert systems. The method comprises: 1) acquiring an X-ray image; 2) obtaining image feature maps of RGB, HSV and gradient by means of conditional branches; 3) obtaining an ROI region using a region proposal network; 4) obtaining three ROI feature maps by means of branch feature alignment; 5) calculating the contribution degrees of the three feature maps, and performing feature concatenation on the basis of the contribution degrees, so as to obtain feature vectors subjected to weighted fusion; 6) inputting the three feature vectors subjected to weighted fusion into three expert system networks, so as to obtain object categories and positions; and 7) performing weighted fusion on prediction results of the three expert system networks, and identifying and marking the category and position of a tested object. Object detection is performed on the basis of conditional branches and expert systems, and a complex network is decomposed into network branches for parallel calculation, such that the inference speed of the network is increased, and the capability of mapping between a feature space and a solution space is also enhanced, thereby improving the speed and precision of object detection.

Description

Rapid Object Detection Method Based on Conditional Branches and Expert Systems

Technical Field

The present invention relates to the technical field of smart home appliance inspection, and in particular to a rapid object detection method based on conditional branches and expert systems. The method realizes automatic inspection, reduces labor costs, and improves the accuracy and efficiency of detecting product defects on PCBA home appliance production and assembly lines and contraband in X-ray security inspection.

Background Art

With the development of artificial intelligence, using machines in place of human labor has gradually become a new trend in technology, most visibly in smart home appliance inspection and X-ray security inspection. PCBA intelligent inspection arose in response to lagging manual and semi-automatic platform testing and the growing demand for production efficiency. Through a universal docking station it connects seamlessly to an existing production line, and together with existing ICT and functional test equipment it forms a complete, fully automatic online test line. X-ray security inspection has likewise been updated as machine vision theory has advanced: the relevant authorities often install X-ray security scanners in public places such as subways and airports, preventing danger at its source.

In the prior art, intelligent PCBA inspection uses algorithms to realize automatic detection, but the traditional algorithms in current use rely too heavily on prior knowledge: the algorithm is designed rigidly around the characteristics of the objects to be detected in the short term, for example through hand-picked features and fixed thresholds. Although such algorithms can realize automatic detection, their generalization ability is poor; whenever a new batch of data is introduced, the algorithm must be re-tuned to fit it. To improve detection performance, large numbers of judgment conditions are often added, which greatly reduces detection speed and leads to poor real-time performance. The same problem exists in X-ray security inspection, where current methods rely mainly on manual operation: they consume substantial human resources and require long-term professional training of inspectors. During inspection, prolonged concentration can cause inspectors' attention to flag or wander, so that inspection time grows and missed and false detections occur frequently; sometimes the running speed of the security lane must even be reduced so that inspectors can pick out the contraband.

Therefore, whether for PCBA home appliance inspection or X-ray security inspection, the detection methods currently in use are highly inefficient and unsuited to long-term operation.

Summary of the Invention

The purpose of the present invention is to overcome the shortcomings and deficiencies of the prior art by proposing a rapid object detection method based on conditional branches and expert systems. It realizes automatic inspection of home appliance production lines and contraband security screening without training dedicated staff, reduces the investment of manpower and material resources, maintains stable detection accuracy and speed, and thereby achieves highly efficient operation.

To achieve the above object, the technical solution provided by the present invention is a rapid object detection method based on conditional branches and expert systems, comprising the following steps:
1) Acquire an X-ray image of the detection object on the conveyor belt;

2) Feed the X-ray image into three conditional branches to obtain RGB, HSV, and gradient image feature maps, respectively;

3) Feed the RGB image feature map into a region proposal network to obtain ROI regions;

4) Align the ROI regions with the branch features to obtain ROI feature maps corresponding to the RGB, HSV, and gradient feature maps;

5) For each ROI region, compute the degree to which each of the three ROI feature maps can contribute to detection, assign corresponding weight vectors to the three conditional branches according to these contribution degrees, and concatenate the features according to the respective weights; a contribution vector is computed for each ROI feature map and multiplied element-wise with it, yielding three weighted, fused feature vectors;

6) Feed the three weighted, fused feature vectors into the three corresponding expert system networks to obtain object categories and positions;

7) Perform weighted fusion of the predictions of the three expert system networks according to the contribution vectors, and identify and mark the category and position of the detected object (a schematic sketch of this pipeline follows the list).
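For illustration only, the seven steps can be composed as in the following Python sketch. Every name in it (branches, rpn, align_rois, contribution, experts, fuse) is a hypothetical placeholder for the components described above, not code from the patent.

```python
# Hypothetical skeleton of steps 1)-7); every helper here is a placeholder.
def detect_objects(image, branches, rpn, align_rois, contribution, experts, fuse):
    # image: the X-ray image acquired in step 1)
    feats = [b(image) for b in branches]            # 2) RGB / HSV / gradient feature maps
    rois = rpn(feats[0])                            # 3) proposals from the RGB branch only
    roi_feats = align_rois(feats, rois)             # 4) branch feature alignment
    W = contribution(roi_feats)                     # 5) per-branch contribution vector
    fused = [w * f for w, f in zip(W, roi_feats)]   #    weighted, fused feature vectors
    preds = [e(x) for e, x in zip(experts, fused)]  # 6) three expert system networks
    return fuse(W, preds)                           # 7) weighted fusion of predictions
```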
Further, in step 1), the detection object is placed on a conveyor belt. When the conveyor belt carries the detection object into the detection area, an X-ray instrument emits a fan-shaped ray beam through a collimator to scan the object; the fan-shaped beam passes through the interior of the object and is projected onto a receiving screen, and the X-ray image of the detection object is obtained through computer rendering.

Further, in step 2), each branch is provided with a feature extraction network. The X-ray image undergoes color space transformation and is fed into the three conditional branches, which produce the RGB, HSV, and gradient image feature maps.

The feature extraction network is a deep network composed of convolutional layers, pooling layers, and nonlinear mapping layers.

Its convolution process is as follows:
f_2[x, y] = Σ_{n_i = −n_1}^{n_1} Σ_{n_j = −n_2}^{n_2} f_1[x + n_i, y + n_j] · w[n_i, n_j]
where f_1[x, y] is the image data in the (x, y) region, w[x, y] is the convolution kernel, f_2[x, y] is the feature obtained after convolution in the (x, y) region, n_i and n_j are offsets from the convolution center, n_1 and n_2 are the maximum vertical and horizontal offsets of the convolution, respectively, f_1[x + n_i, y + n_j] is the value of the image at (x + n_i, y + n_j), and w[n_i, n_j] is the weight of the convolution kernel at position (n_i, n_j).

Its nonlinear mapping process is:

f_3[x, y] = max(0, f_2[x, y])

where f_3[x, y] is the feature map obtained after the nonlinear mapping.
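As a minimal sketch of one such branch, the following PyTorch module stacks the convolution, nonlinear mapping (the max(0, ·) above is a ReLU), and pooling layers described here; the channel counts and depth are illustrative assumptions, not values from the patent.

```python
import torch.nn as nn

# Illustrative branch feature extractor: convolution + ReLU + pooling stack.
# All layer sizes are assumptions made for the sketch.
class BranchFeatureExtractor(nn.Module):
    def __init__(self, in_channels=3, out_channels=256):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),  # f2: convolution
            nn.ReLU(inplace=True),                                 # f3 = max(0, f2)
            nn.MaxPool2d(2),                                       # pooling layer
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(128, out_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):        # x: (N, 3, H, W) image in one color space
        return self.layers(x)    # feature map at 1/4 of the input resolution
```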
Further, in step 3), every point of the RGB image feature map is defined as an anchor point, and each anchor point defines 9 anchor boxes centered on itself. Anchor boxes extending beyond the image area are removed, and binary classification and bounding-box regression are performed on the feature maps of the remaining anchor boxes:

a. Binary classification: y = f[f_4(x, y)]

where y is the classification prediction of the foreground box, f_4(x, y) is the anchor-box feature map, and f is the classifier. A threshold is set manually for the classifier: predictions above the threshold are treated as foreground and passed to the subsequent computation, while predictions below it are treated as background and discarded.

b. Bounding-box regression: r = [Δx, Δy, Δh, Δw] = g(f_4[x, y])

where r is the offset of the foreground box, g is a linear regression function, Δx and Δy are the predicted center offsets of the anchor box, and Δh and Δw are its scale factors. The position and scale of each anchor box are adjusted according to the foreground regression; the anchor boxes are then screened with non-maximum suppression to remove overlapping boxes, and the top n boxes with the highest confidence are taken as ROI regions and passed to the next step.
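A hedged sketch of this proposal filtering follows; the foreground threshold, IoU threshold, and n are assumed values, and torchvision's nms stands in for the non-maximum suppression step.

```python
import torch
from torchvision.ops import nms

# Sketch of anchor filtering: threshold foreground scores, apply the regressed
# offsets, suppress overlaps, keep the top-n most confident boxes as ROIs.
def select_rois(anchors, scores, deltas, fg_thresh=0.5, iou_thresh=0.7, n=300):
    keep = scores > fg_thresh                       # binary classification
    boxes, scores, deltas = anchors[keep], scores[keep], deltas[keep]

    # bounding-box regression: shift centers by (dx, dy), rescale by (dh, dw)
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]
    cx = boxes[:, 0] + 0.5 * w + deltas[:, 0] * w
    cy = boxes[:, 1] + 0.5 * h + deltas[:, 1] * h
    h = h * torch.exp(deltas[:, 2])
    w = w * torch.exp(deltas[:, 3])
    boxes = torch.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], dim=1)

    keep = nms(boxes, scores, iou_thresh)[:n]       # NMS, then top-n by confidence
    return boxes[keep]
```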
Further, in step 4), after the ROI regions extracted by the region proposal network are obtained, the ROI regions are scale-adapted, i.e. scaled by the ratio of the original image size to the feature map size, and the scaled regions are then aligned to the RGB, HSV, and gradient feature maps, yielding three different ROI feature maps.
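The alignment can be sketched with torchvision's roi_align, as below; the 7×7 output size and single-image batch are assumptions for illustration, not the patent's implementation.

```python
from torchvision.ops import roi_align

# Sketch of branch feature alignment: one set of ROIs (from the RGB branch) is
# rescaled by the feature-map/original-image ratio and sampled from all three maps.
def align_branch_features(feature_maps, rois, image_size, output_size=7):
    aligned = []
    for fmap in feature_maps:                       # RGB, HSV, gradient branches
        scale = fmap.shape[-1] / image_size[-1]     # feature size / original size
        aligned.append(roi_align(fmap, [rois], output_size=output_size,
                                 spatial_scale=scale, sampling_ratio=2))
    return aligned                                  # three ROI feature maps
```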
Further, in step 5), for each ROI region, the degree to which each of the three ROI feature maps can contribute to detection is computed; corresponding weight vectors are assigned to the three conditional branches according to the contribution degrees, and the features are concatenated according to the respective weights.

The contribution degree is calculated by the following formulas:
m_k = (1/c) Σ_{i=1}^{c} f_i^k,    V_k = (1/c) Σ_{i=1}^{c} (f_i^k − m_k)²

W = softmax([V_1, V_2, V_3])
where c is the maximum number of feature channels, f_i^k is the feature value of the i-th channel after the k-th feature passes through the channel pooling layer, m_k is the feature mean of the k-th feature after the channel pooling layer, V_k is the contribution degree of each feature, and W is the final contribution vector. A contribution vector is computed for each ROI feature map and multiplied element-wise with it, yielding three weighted, fused feature vectors.
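A sketch of this weighting is given below. It assumes the channel pooling layer is global average pooling per channel and that V_k is the squared deviation of the pooled channel responses around their mean m_k; the exact statistic is an assumption, since the published formula image is not reproduced in this text.

```python
import torch
import torch.nn.functional as F

# Sketch of the contribution vector W = softmax([V1, V2, V3]); the form of V_k
# (variance of channel-pooled responses) is an assumption.
def contribution_vector(roi_feats):                 # list of 3 (N, C, h, w) maps
    V = []
    for fmap in roi_feats:
        f_k = fmap.mean(dim=(2, 3))                 # channel pooling -> (N, C)
        m_k = f_k.mean(dim=1, keepdim=True)         # feature mean over c channels
        V.append(((f_k - m_k) ** 2).mean(dim=1))    # assumed statistic V_k
    return F.softmax(torch.stack(V, dim=1), dim=1)  # (N, 3) contribution vector W
```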
Further, in step 6), three expert system networks are set up, and the three weighted, fused feature vectors are fed into the corresponding expert system networks; each expert system network infers an object category and position.

Each expert system network must complete two tasks, classification and regression:

Classification: y′ = max(h(f_p))

where f_p is the weighted, fused feature vector, h is a multi-class classifier, and the output y′ is the confidence of each class.

All feature vectors obtained by re-weighting each ROI feature map are classified, and the classification result with the highest confidence is taken as the classification result of the ROI feature map.

Regression: r′ = [Δx′, Δy′, Δh′, Δw′] = g(f_p)

where r′ is the offset of the predicted box, Δx′ and Δy′ are the predicted center offsets of the box, Δh′ and Δw′ are its scale factors, and g is a linear regression function.

Regression is performed on each ROI region to obtain a more accurate ROI region.
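The embodiment below describes each expert network as a channel-reduction convolutional layer followed by a fully connected layer; the following sketch adopts that structure with assumed layer sizes.

```python
import torch
import torch.nn as nn

# Sketch of one expert system network: channel-reduction convolution, then fully
# connected heads for classification h(f_p) and regression g(f_p). Sizes assumed.
class ExpertNetwork(nn.Module):
    def __init__(self, in_channels=256, roi_size=7, num_classes=10):
        super().__init__()
        self.reduce = nn.Conv2d(in_channels, 64, kernel_size=1)  # channel reduction
        self.fc = nn.Linear(64 * roi_size * roi_size, 1024)
        self.cls = nn.Linear(1024, num_classes)     # per-class confidence y'
        self.reg = nn.Linear(1024, 4)               # (dx', dy', dh', dw')

    def forward(self, f_p):                         # f_p: weighted ROI feature map
        x = torch.relu(self.reduce(f_p)).flatten(1)
        x = torch.relu(self.fc(x))
        return self.cls(x), self.reg(x)
```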
Further, in step 7), according to the contribution vector obtained in step 5), the predictions of each expert system network from step 6) are weighted and fused to obtain the final prediction:
y_f = Σ_{i=1}^{3} W_i · y_i

r_f = Σ_{j=1}^{3} W_j · r_j
where y_f is the final classification prediction, r_f is the final regression prediction, W_i is the contribution of the i-th branch to the classification prediction, y_i is the classification prediction of the i-th branch, W_j is the contribution of the j-th branch to the regression prediction, and r_j is the regression prediction of the j-th branch. Through the above process the final prediction is obtained and marked in the detection image, giving the category and position of the object.
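In code, this fusion is a per-branch weighted sum; a minimal sketch, assuming W is the (N, 3) contribution vector from step 5):

```python
# Sketch of step 7): y_f = sum_i W_i * y_i and r_f = sum_j W_j * r_j.
def fuse_predictions(W, ys, rs):    # W: (N, 3); ys: three (N, K); rs: three (N, 4)
    y_f = sum(W[:, i:i + 1] * ys[i] for i in range(3))  # fused class prediction
    r_f = sum(W[:, j:j + 1] * rs[j] for j in range(3))  # fused box regression
    return y_f, r_f
```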
Compared with the prior art, the present invention has the following advantages and beneficial effects:

1. Compared with other deep learning detection methods, the present invention increases detection speed while maintaining detection accuracy. The proposed method splits the complex feature network into multiple conditional branches and splits the detection head network into multiple expert system networks; each network is small and they run in parallel, so overall inference time is reduced. At the same time, branch feature alignment avoids redundant computation of region proposals across branches, improving detection efficiency.

2. The present invention is the first to employ conditional branches for object detection in the X-ray inspection field. Decomposing and expanding the feature space allows the network to mine more discriminative features and avoids the overfitting caused by feature redundancy and over-use on massive datasets.

3. The present invention sets up multiple expert system networks, each focused on inferring the object categories belonging to its own branch, which improves the mapping from feature space to solution space. For datasets with small inter-class distances and large intra-class distances, the proposed method achieves higher detection accuracy.

4. The method of the present invention is broadly applicable to computer vision tasks, supports end-to-end training and detection, adapts well to data, and has wide application prospects.
Brief Description of the Drawings

Fig. 1 is the test picture of this embodiment.

Fig. 2 is the feature heat map of this embodiment.

Fig. 3 is a schematic diagram of the detection results of this embodiment.

Detailed Description of the Embodiments
The present invention is described in further detail below in conjunction with the embodiments and the accompanying drawings, but the embodiments of the present invention are not limited thereto.

This embodiment discloses a rapid object detection method based on conditional branches and expert systems, comprising the following steps:

1) A parcel containing a push dagger (手刺) is placed as the detection object on a conveyor belt. When the conveyor belt carries the object into the detection area, the X-ray instrument emits a fan-shaped ray beam through a collimator to scan the object; the beam passes through the interior of the object and is projected onto a receiving screen, and the X-ray image of the dagger is obtained through computer rendering, as shown in Fig. 1.

2) The X-ray image of the dagger undergoes color space transformation and is fed into three conditional branches, each provided with a feature extraction network; the three branches produce the RGB, HSV, and gradient image feature maps, respectively. The three feature maps are superimposed to compute a low-resolution feature heat map, which is rescaled to the size of the original image and overlaid on it to generate the final feature heat map shown in Fig. 2; the features can be seen to concentrate on the object surface.
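The heat-map overlay of Fig. 2 can be sketched as follows; summing the branch maps across channels and the 50/50 blend are assumptions made for illustration.

```python
import torch.nn.functional as F

# Sketch of the Fig. 2 visualization: superimpose the three branch feature maps,
# rescale the low-resolution heat map to the image size, and overlay it.
def feature_heatmap(image, feats):                  # image: (1, 3, H, W)
    heat = sum(f.sum(dim=1, keepdim=True) for f in feats)   # superimpose branches
    heat = F.interpolate(heat, size=image.shape[-2:],
                         mode='bilinear', align_corners=False)
    heat = (heat - heat.min()) / (heat.max() - heat.min() + 1e-6)  # normalize
    return 0.5 * image + 0.5 * heat                 # blend with the original image
```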
The feature extraction network is a deep network composed mainly of convolutional layers, pooling layers, and nonlinear mapping layers.

Its convolution process is as follows:
f_2[x, y] = Σ_{n_i = −n_1}^{n_1} Σ_{n_j = −n_2}^{n_2} f_1[x + n_i, y + n_j] · w[n_i, n_j]
where f_1[x, y] is the image data in the (x, y) region, w[x, y] is the convolution kernel, f_2[x, y] is the feature obtained after convolution in the (x, y) region, n_i and n_j are offsets from the convolution center, n_1 and n_2 are the maximum vertical and horizontal offsets of the convolution, respectively, f_1[x + n_i, y + n_j] is the value of the image at (x + n_i, y + n_j), and w[n_i, n_j] is the weight of the convolution kernel at position (n_i, n_j).

Its nonlinear mapping process is:

f_3[x, y] = max(0, f_2[x, y])

where f_3[x, y] is the feature map obtained after the nonlinear mapping.
For detection objects whose prediction curves are difficult to fit from the RGB input components of the original algorithm, decomposing the feature space yields object features in three different dimensions and improves the expressive power of the features.

3) The RGB image feature map is fed into the region proposal network to obtain ROI regions.

Every point of the RGB image feature map is defined as an anchor point. To better match objects of different sizes, each anchor point defines, centered on itself, anchor boxes combining three sizes and three aspect ratios. Anchor boxes extending beyond the image area are removed, and binary classification and bounding-box regression are performed on the feature maps of the remaining anchor boxes:
a. Binary classification: y = f[f_4(x, y)]

where y is the classification prediction of the foreground box, f_4(x, y) is the anchor-box feature map, and f is the classifier. A threshold is set manually for the classifier: predictions above the threshold are treated as foreground and passed to the subsequent computation, while predictions below it are treated as background and discarded.

b. Bounding-box regression: r = [Δx, Δy, Δh, Δw] = g(f_4[x, y])

where r is the offset of the foreground box, g is a linear regression function, Δx and Δy are the predicted center offsets of the anchor box, and Δh and Δw are its scale factors. The position and scale of each anchor box are adjusted according to the foreground regression; the anchor boxes are then screened with non-maximum suppression to remove overlapping boxes, and the top n boxes with the highest confidence are taken as ROI regions and passed to the next step.
4) After the ROI regions extracted by the region proposal network are obtained, the ROI regions are scale-adapted, i.e. scaled by the ratio of the original image size to the feature map size, and the scaled regions are aligned to the RGB, HSV, and gradient feature maps, yielding three different ROI feature maps. This scheme of computing ROIs on a single feature and aligning them across multiple features avoids redundant ROI computation across branches and increases inference speed.
5) For each ROI region, the degree to which each of the three ROI feature maps can contribute to detection is computed; corresponding weight vectors are assigned to the three conditional branches according to the contribution degrees, and the features are concatenated according to the respective weights. The salient features differ from one detection object to another; learning from data which of an object's features are more useful for detection and applying an attention mechanism improves the reasoning ability of the expert system networks.

The contribution degree can be calculated by the following formulas:
m_k = (1/c) Σ_{i=1}^{c} f_i^k,    V_k = (1/c) Σ_{i=1}^{c} (f_i^k − m_k)²

W = softmax([V_1, V_2, V_3])
where c is the maximum number of feature channels, f_i^k is the feature value of the i-th channel after the k-th feature passes through the channel pooling layer, m_k is the feature mean of the k-th feature after the channel pooling layer, V_k is the contribution degree of each feature, and W is the final contribution vector. A contribution vector is computed for each ROI feature map and multiplied element-wise with it, yielding three weighted, fused feature vectors.
6) Three expert system networks are set up, and the three weighted, fused feature vectors are fed into the corresponding expert system networks; each expert system network infers an object category and position. For simplicity of design, the three expert system networks share the same structure, consisting of a channel-reduction convolutional layer and a fully connected layer.

Each expert system network must complete two tasks, classification and regression:

Classification: y′ = max(h(f_p))

where f_p is the weighted, fused feature vector, h is a multi-class classifier, and the output y′ is the confidence of each class.

All feature vectors obtained by re-weighting each ROI feature map are classified, and the classification result with the highest confidence is taken as the classification result of the ROI feature map.

Regression: r′ = [Δx′, Δy′, Δh′, Δw′] = g(f_p)

where r′ is the offset of the predicted box, Δx′ and Δy′ are the predicted center offsets of the box, Δh′ and Δw′ are its scale factors, and g is a linear regression function.

Regression is performed on each ROI region to obtain a more accurate ROI region.
7) According to the contribution vector obtained in step 5), the predictions of each expert system network from step 6) are weighted and fused to obtain the final prediction.
y_f = Σ_{i=1}^{3} W_i · y_i

r_f = Σ_{j=1}^{3} W_j · r_j
where y_f is the final classification prediction, r_f is the final regression prediction, W_i is the contribution of the i-th branch to the classification prediction, y_i is the classification prediction of the i-th branch, W_j is the contribution of the j-th branch to the regression prediction, and r_j is the regression prediction of the j-th branch. Through the above process the final prediction is obtained and marked in the detection image, giving the category and position of the object; the final detection result is shown in Fig. 3.

The above embodiment is a preferred implementation of the present invention, but the implementation of the present invention is not limited by it; any change, modification, substitution, combination, or simplification made without departing from the spirit and principles of the present invention shall be an equivalent replacement and is included within the protection scope of the present invention.

Claims (8)

  1. A rapid object detection method based on conditional branches and expert systems, characterized by comprising the following steps:

    1) acquiring an X-ray image of a detection object on a conveyor belt;

    2) feeding the X-ray image into three conditional branches to obtain RGB, HSV, and gradient image feature maps, respectively;

    3) feeding the RGB image feature map into a region proposal network to obtain ROI regions;

    4) aligning the ROI regions with the branch features to obtain ROI feature maps corresponding to the RGB, HSV, and gradient feature maps;

    5) for each ROI region, computing the degree to which each of the three ROI feature maps can contribute to detection, assigning corresponding weight vectors to the three conditional branches according to the contribution degrees, and concatenating the features according to the respective weights, wherein a contribution vector is computed for each ROI feature map and multiplied element-wise with it, yielding three weighted, fused feature vectors;

    6) feeding the three weighted, fused feature vectors into the three corresponding expert system networks to obtain object categories and positions;

    7) performing weighted fusion of the predictions of the three expert system networks according to the contribution vectors, and identifying and marking the category and position of the detected object.
  2. The rapid object detection method based on conditional branches and expert systems according to claim 1, characterized in that, in step 1), the detection object is placed on a conveyor belt; when the conveyor belt carries the detection object into the detection area, an X-ray instrument emits a fan-shaped ray beam through a collimator to scan the object, the fan-shaped beam passes through the interior of the object and is projected onto a receiving screen, and the X-ray image of the detection object is obtained through computer rendering.
  3. The rapid object detection method based on conditional branches and expert systems according to claim 1, characterized in that, in step 2), each branch is provided with a feature extraction network, the X-ray image undergoes color space transformation and is fed into the three conditional branches, and the RGB, HSV, and gradient image feature maps are obtained after the operation;

    the feature extraction network is a deep network composed of convolutional layers, pooling layers, and nonlinear mapping layers;

    its convolution process is as follows:

    f_2[x, y] = Σ_{n_i = −n_1}^{n_1} Σ_{n_j = −n_2}^{n_2} f_1[x + n_i, y + n_j] · w[n_i, n_j]

    where f_1[x, y] is the image data in the (x, y) region, w[x, y] is the convolution kernel, f_2[x, y] is the feature obtained after convolution in the (x, y) region, n_i and n_j are offsets from the convolution center, n_1 and n_2 are the maximum vertical and horizontal offsets of the convolution, respectively, f_1[x + n_i, y + n_j] is the value of the image at (x + n_i, y + n_j), and w[n_i, n_j] is the weight of the convolution kernel at position (n_i, n_j);

    its nonlinear mapping process is:

    f_3[x, y] = max(0, f_2[x, y])

    where f_3[x, y] is the feature map obtained after the nonlinear mapping.
  4. The rapid object detection method based on conditional branches and expert systems according to claim 1, characterized in that, in step 3), every point of the RGB image feature map is defined as an anchor point, each anchor point defines 9 anchor boxes centered on itself, anchor boxes extending beyond the image area are removed, and binary classification and bounding-box regression are performed on the feature maps of the remaining anchor boxes:

    a. binary classification: y = f[f_4(x, y)]

    where y is the classification prediction of the foreground box, f_4(x, y) is the anchor-box feature map, and f is the classifier; a threshold is set manually for the classifier, predictions above the threshold are treated as foreground and passed to the subsequent computation, and predictions below it are treated as background and discarded;

    b. bounding-box regression: r = [Δx, Δy, Δh, Δw] = g(f_4[x, y])

    where r is the offset of the foreground box, g is a linear regression function, Δx and Δy are the predicted center offsets of the anchor box, and Δh and Δw are its scale factors; the position and scale of each anchor box are adjusted according to the foreground regression, the anchor boxes are then screened with non-maximum suppression to remove overlapping boxes, and the top n boxes with the highest confidence are taken as ROI regions and passed to the next step.
  5. The rapid object detection method based on conditional branches and expert systems according to claim 1, characterized in that, in step 4), after the ROI regions extracted by the region proposal network are obtained, the ROI regions are scale-adapted, i.e. scaled by the ratio of the original image size to the feature map size, and the scaled regions are then aligned to the RGB, HSV, and gradient feature maps, yielding three different ROI feature maps.
  6. The rapid object detection method based on conditional branches and expert systems according to claim 1, characterized in that, in step 5), for each ROI region, the degree to which each of the three ROI feature maps can contribute to detection is computed, corresponding weight vectors are assigned to the three conditional branches according to the contribution degrees, and the features are concatenated according to the respective weights;

    the contribution degree is calculated by the following formulas:
    m_k = (1/c) Σ_{i=1}^{c} f_i^k,    V_k = (1/c) Σ_{i=1}^{c} (f_i^k − m_k)²

    W = softmax([V_1, V_2, V_3])

    where c is the maximum number of feature channels, f_i^k is the feature value of the i-th channel after the k-th feature passes through the channel pooling layer, m_k is the feature mean of the k-th feature after the channel pooling layer, V_k is the contribution degree of each feature, and W is the final contribution vector; a contribution vector is computed for each ROI feature map and multiplied element-wise with it, yielding three weighted, fused feature vectors.
  7. The rapid object detection method based on conditional branches and expert systems according to claim 1, characterized in that, in step 6), three expert system networks are set up, the three weighted, fused feature vectors are fed into the corresponding expert system networks, and each expert system network infers an object category and position;

    each expert system network must complete two tasks, classification and regression:

    classification: y′ = max(h(f_p))

    where f_p is the weighted, fused feature vector, h is a multi-class classifier, and the output y′ is the confidence of each class;

    all feature vectors obtained by re-weighting each ROI feature map are classified, and the classification result with the highest confidence is taken as the classification result of the ROI feature map;

    regression: r′ = [Δx′, Δy′, Δh′, Δw′] = g(f_p)

    where r′ is the offset of the predicted box, Δx′ and Δy′ are the predicted center offsets of the box, Δh′ and Δw′ are its scale factors, and g is a linear regression function;

    regression is performed on each ROI region to obtain a more accurate ROI region.
  8. The rapid object detection method based on conditional branches and expert systems according to claim 1, characterized in that, in step 7), according to the contribution vector obtained in step 5), the predictions of each expert system network from step 6) are weighted and fused to obtain the final prediction:

    y_f = Σ_{i=1}^{3} W_i · y_i

    r_f = Σ_{j=1}^{3} W_j · r_j

    where y_f is the final classification prediction, r_f is the final regression prediction, W_i is the contribution of the i-th branch to the classification prediction, y_i is the classification prediction of the i-th branch, W_j is the contribution of the j-th branch to the regression prediction, and r_j is the regression prediction of the j-th branch; through the above process the final prediction is obtained and marked in the detection image, giving the category and position of the object.
PCT/CN2022/120298 2022-02-25 2022-09-21 Rapid object detection method based on conditional branches and expert systems WO2023159927A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210180014.0A CN114626443B (en) 2022-02-25 2022-02-25 Object rapid detection method based on conditional branching and expert system
CN202210180014.0 2022-02-25

Publications (1)

Publication Number Publication Date
WO2023159927A1 true WO2023159927A1 (en) 2023-08-31

Family

ID=81900503

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/120298 WO2023159927A1 (en) 2022-02-25 2022-09-21 Rapid object detection method based on conditional branches and expert systems

Country Status (2)

Country Link
CN (1) CN114626443B (en)
WO (1) WO2023159927A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114626443B (en) * 2022-02-25 2024-05-03 华南理工大学 Object rapid detection method based on conditional branching and expert system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180373992A1 (en) * 2017-06-26 2018-12-27 Futurewei Technologies, Inc. System and methods for object filtering and uniform representation for autonomous systems
CN112070079B (en) * 2020-07-24 2022-07-05 华南理工大学 X-ray contraband package detection method and device based on feature map weighting

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103500719A (en) * 2013-09-29 2014-01-08 华南理工大学 Expert system-based adaptive micro-focusing X-ray detection method
US10223610B1 (en) * 2017-10-15 2019-03-05 International Business Machines Corporation System and method for detection and classification of findings in images
CN111178432A (en) * 2019-12-30 2020-05-19 武汉科技大学 Weak supervision fine-grained image classification method of multi-branch neural network model
CN111860510A (en) * 2020-07-29 2020-10-30 浙江大华技术股份有限公司 X-ray image target detection method and device
CN114626443A (en) * 2022-02-25 2022-06-14 华南理工大学 Object rapid detection method based on conditional branch and expert system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KANG JIANAN, ZHANG LIANG: "Multi-scale X-Ray Security Inspection Image Detection with Multi-channel Region Proposal", Computer Engineering and Applications, Huabei Jisuan Jishu Yanjiusuo, CN, vol. 58, no. 1, 1 January 2022 (2022-01-01), pages 224-231, XP093088580, ISSN: 1002-8331, DOI: 10.3778/j.issn.1002-8331.2008-0345 *
LIU MOYUN; XIE JINGMING; HAO JING; ZHANG YANG; CHEN XUZHAN; CHEN YOUPING: "A lightweight and accurate recognition framework for signs of X-ray weld images", Computers in Industry, Elsevier, Amsterdam, NL, vol. 135, 25 November 2021 (2021-11-25), XP086904275, ISSN: 0166-3615, DOI: 10.1016/j.compind.2021.103559 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117893840A (en) * 2024-03-15 2024-04-16 深圳市宗匠科技有限公司 Acne severity grading method and device, electronic equipment and storage medium
CN117893840B (en) * 2024-03-15 2024-06-28 深圳市宗匠科技有限公司 Acne severity grading method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN114626443A (en) 2022-06-14
CN114626443B (en) 2024-05-03

Similar Documents

Publication Publication Date Title
CN110175982B (en) Defect detection method based on target detection
WO2022111219A1 (en) Domain adaptation device operation and maintenance system and method
CN110097053B (en) Improved fast-RCNN-based electric power equipment appearance defect detection method
CN104834942B (en) Remote sensing image variation detection method and system based on mask classification
CN112233073A (en) Real-time detection method for infrared thermal imaging abnormity of power transformation equipment
CN111402224B (en) Target identification method for power equipment
Zhu et al. The defect detection algorithm for tire x-ray images based on deep learning
CN109191255B (en) Commodity alignment method based on unsupervised feature point detection
CN111209864B (en) Power equipment target identification method
CN112270681B (en) Method and system for detecting and counting yellow plate pests deeply
US20210390282A1 (en) Training data increment method, electronic apparatus and computer-readable medium
Wang et al. Research on detection technology of various fruit disease spots based on mask R-CNN
WO2023159927A1 (en) Rapid object detection method based on conditional branches and expert systems
Zou et al. Dangerous objects detection of X-ray images using convolution neural network
Lianqiao et al. Recognition and application of infrared thermal image among power facilities based on yolo
CN111310899B (en) Power defect identification method based on symbiotic relation and small sample learning
CN116542962A (en) Improved Yolov5m model-based photovoltaic cell defect detection method
Wang et al. Data augmentation method for fabric defect detection
CN116229236A (en) Bacillus tuberculosis detection method based on improved YOLO v5 model
Wu et al. Semiautomatic mask generating for electronics component inspection
Aldabbagh et al. Classification of chili plant growth using deep learning
Lin et al. CAM-Guided u-net with adversarial regularization for defect segmentation
CN113052799A (en) Osteosarcoma and osteochondroma prediction method based on Mask RCNN network
Pang et al. An Efficient Network for Obstacle Detection in Rail Transit Based on Multi-Task Learning
Xiong et al. Defect detection of biscuit packaging based on level set map