CN116385480B - Detection method and system for moving object below tower crane

Detection method and system for moving object below tower crane

Info

Publication number
CN116385480B
CN116385480B (application CN202310053361.1A)
Authority
CN
China
Prior art keywords
image
small
moving target
obtaining
point
Prior art date
Legal status
Active
Application number
CN202310053361.1A
Other languages
Chinese (zh)
Other versions
CN116385480A (en)
Inventor
葛晓东
米文忠
姜贺
房新奥
郭振威
Current Assignee
Guangdong Light Speed Intelligent Equipment Co ltd
Tenghui Technology Building Intelligence Shenzhen Co ltd
Original Assignee
Guangdong Light Speed Intelligent Equipment Co ltd
Tenghui Technology Building Intelligence Shenzhen Co ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Light Speed Intelligent Equipment Co ltd and Tenghui Technology Building Intelligence Shenzhen Co ltd
Priority: CN202310053361.1A
Publication of CN116385480A
Application granted
Publication of CN116385480B
Status: Active

Classifications

    • G06T 7/246 — Image analysis; analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/207 — Image analysis; analysis of motion for motion estimation over a hierarchy of resolutions
    • G06V 10/751 — Image or video pattern matching; comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G06V 10/761 — Proximity, similarity or dissimilarity measures
    • G06V 10/762 — Pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G06V 10/765 — Classification using rules for classification or partitioning the feature space
    • G06V 10/806 — Fusion of extracted features at the sensor, preprocessing, feature extraction or classification level
    • G06V 10/82 — Pattern recognition or machine learning using neural networks
    • G06T 2207/20081 — Training; learning
    • G06T 2207/20084 — Artificial neural networks [ANN]
    • Y02T 10/40 — Engine management systems

Abstract

The invention discloses a detection method and system for a moving object below a tower crane. The detection method comprises the following steps: extracting feature points over the image sequence of acquired moving-target images to obtain the moving-target images after feature point extraction; establishing motion estimation based on a two-dimensional affine transformation to obtain the affine transformation relation between two adjacent frames, and from it a per-pixel similarity; retaining moving-target pixels whose similarity is below a threshold as candidate moving-target pixels, performing Euclidean spatial clustering to obtain a plurality of clustering results, and obtaining a corresponding outer bounding box for each clustering result; according to the affine transformation relation of the two adjacent frames, finding the feature point pairs that do not satisfy it and taking them as the matching outlier set; obtaining the number of outliers in each outer bounding box from the matching outlier set and using the outlier counts to obtain the outer bounding boxes meeting the confidence requirement; and classifying those outer bounding boxes with a trained visual neural network to complete detection of the moving target.

Description

Detection method and system for moving object below tower crane
Technical Field
The invention relates to the technical field of target detection, in particular to a detection method and a detection system for a moving target below a tower crane.
Background
During the automated driving of an intelligent tower crane, detecting moving objects below the crane is of great significance; these objects are mainly people, trucks, bicycles, tricycles and the like. Current detection methods are mainly based on deep learning, with data collected by sensors such as cameras and laser radar (lidar).
Image-only detection methods mainly rely on deep neural networks; common detectors include the two-stage Faster R-CNN and the single-stage SSD and YOLO, with backbone networks mostly drawn from the Darknet and MobileNet families. Because the construction scene below a tower crane is highly complex, and because the camera shoots top-down, part of a target's information is easily blocked or occluded. A backbone network of high complexity is therefore needed to guarantee the expressive power of the whole network. Such methods, however, demand high computing power, so they generally require a processor with dedicated AI acceleration hardware.
Methods based on laser radar plus images fuse the two kinds of data. In loose fusion, deep-neural-network target detection is first run on the image, the corresponding lidar point cloud is found from each detection's projection frustum, and the target class is judged again from that point cloud. In tight fusion, the combined lidar point cloud and image serve as the input of a single deep neural network that is trained directly, completing end-to-end target detection.
Both of the above approaches, however, rely heavily on deep neural networks, whose inference demands powerful computing hardware, and this becomes a problem in practical large-scale deployment. The operating system of an intelligent tower crane is a large system spanning sensing, decision-making, planning, control, safety and other links, with numerous sensors, a heavy computational load and a complex architecture. Heterogeneous computing support is usually required: the computation for each sensor is completed at the terminal as far as possible, rather than aggregating all data at a central service node. In that setting, if the single task of moving-object detection is completed entirely by a high-complexity neural network, the edge computing device ends up costly and power-hungry.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a method and a system for detecting a moving object below a tower crane. They address the technical problem that existing detection methods for moving objects below a tower crane impose high cost and power consumption on the computing equipment, achieving accurate detection of the moving object while reducing the cost and power consumption of the computing device.
In order to solve the problems, the technical scheme adopted by the invention is as follows:
the detection method for the moving object below the tower crane comprises the following steps:
acquiring images of the moving object below the tower crane with an acquisition device mounted below the tower crane trolley;
extracting feature points over the image sequence of acquired moving-target images to obtain the moving-target images after feature point extraction;
establishing motion estimation for the feature-extracted moving-target images based on a two-dimensional affine transformation, obtaining the affine transformation relation between two adjacent frames;
obtaining the similarity of adjacent-frame moving-target images from the affine transformation relation of the two adjacent frames, and retaining moving-target pixels whose similarity is below a threshold as candidate moving-target pixels;
performing Euclidean spatial clustering on the candidate moving-target pixels to obtain a plurality of clustering results, and obtaining a corresponding outer bounding box for each clustering result;
according to the affine transformation relation of the two adjacent frames, finding among all feature point matching results the feature point pairs that do not satisfy it, and taking them as the matching outlier set;
obtaining the number of outliers in each outer bounding box from the matching outlier set, obtaining the moving-target confidence of each box from the outlier count, and screening out the boxes meeting the confidence requirement;
and classifying the outer bounding boxes meeting the confidence requirement with a target classification network trained using a visual neural network, determining the type and confidence of the moving target.
In a preferred embodiment of the present invention, obtaining the moving-target image after feature point extraction comprises:
establishing an image pyramid of the moving-target image by Gaussian filtering;
extracting a small number of stable feature points from the top layer of the image pyramid with a feature extraction algorithm to obtain a feature map;
searching for local maximum points in the feature map and retaining a set percentage of the local maxima found as feature points;
acquiring a local histogram of oriented gradients for each feature point and taking the direction of its maximum as the feature point's orientation;
extracting a large number of small-scale ORB feature points at the bottom layer of the image pyramid with the ORB algorithm;
and, after feature point extraction and small-scale ORB feature point extraction are completed on the moving-target images, obtaining the moving-target images after feature point extraction.
As a preferred embodiment of the present invention, extracting a large number of small-scale ORB feature points with the ORB algorithm comprises:
setting a small image window, accumulating the differences between the window's center point and the surrounding pixels, and constructing a small-scale ORB feature map from the accumulated differences;
searching for local maximum points in the small-scale ORB feature map as small-scale ORB feature candidate points, and screening the candidates to obtain the small-scale ORB feature points;
and acquiring the centroid of the small image window, connecting the centroid with the center point, and taking the direction of this line as the orientation of the small-scale ORB feature point.
As a preferred embodiment of the present invention, obtaining the affine transformation relation of the two adjacent frames comprises:
performing a brute-force search over the stable feature points and taking the pairs with the minimum Euclidean distance between features as coarse matching pairs;
obtaining the difference of each small-scale ORB feature point from the coarse matching pairs, and eliminating the small-scale ORB feature points whose difference is too large;
and performing a brute-force search again over the retained small-scale ORB feature points, taking the pairs with the minimum Euclidean distance between features as the final matching result, and obtaining the affine transformation relation of the two adjacent frames from the final matching result.
As a preferred embodiment of the present invention, eliminating the small-scale ORB feature points with excessive difference comprises:
obtaining, from the coarse matching pairs, a two-dimensional affine transformation model of the feature-extracted moving images, and projecting the small-scale ORB feature points of one feature-extracted moving-target image onto the other image with this model to obtain the corresponding projection result;
judging from the projection result whether a corresponding small-scale ORB feature point exists near the projection; if not, the small-scale ORB feature point has no corresponding (homologous) point, its difference is considered too large, and it is eliminated;
obtaining the affine transformation relation of the two adjacent frames then comprises:
solving the two-dimensional affine transformation model of the feature-extracted moving images by least squares using the final matching result, obtaining the affine transformation relation of the two adjacent frames.
As a preferred embodiment of the present invention, obtaining the similarity of adjacent-frame moving-target images comprises:
acquiring all pairs of adjacent-frame moving-target images among the feature-extracted moving-target images, and taking a small image window around each pixel of one frame in each pair; obtaining the corresponding small image window in the other frame from the affine transformation relation of the two adjacent frames;
obtaining, from each small image window and its corresponding window, the window image group of each pair of adjacent frames;
and acquiring the mean of the normalized correlation products over the R, G and B channels of the window image group, taking this mean as the similarity.
As a preferred embodiment of the present invention, obtaining the small image window corresponding to the other frame comprises:
multiplying the center point of the small image window by the affine transformation matrix to obtain the center point of the corresponding window in the other frame;
and taking the corresponding small image window with the same window size around that center point.
As a preferred embodiment of the present invention, obtaining the mean of the normalized correlation products comprises:
acquiring the gray-level mean of each window in the window image group, and subtracting the corresponding mean from every pixel to obtain the de-centered window image group;
multiplying corresponding pixels of the de-centered window image group and summing to obtain the correlation product, and dividing the correlation product by the product of the two windows' two-norm moduli to obtain the normalized correlation product;
and obtaining the mean of the normalized correlation products over the R, G and B channels of the window image group from the per-channel normalized correlation products.
As a preferred embodiment of the present invention, obtaining the moving-target confidence of each outer bounding box comprises:
taking the number of outliers in the outer bounding box divided by a set value as a weight on the confidence that the box belongs to a moving target, obtaining the outlier-proportion confidence;
acquiring, as the similarity confidence, the proportion of pixel points in the outer bounding box relative to the small image window;
and multiplying the outlier-proportion confidence by the similarity confidence to obtain the moving-target confidence of each outer bounding box.
A detection system for a moving object below a tower crane, comprising:
a feature extraction unit: used to acquire images of the moving object below the tower crane with an acquisition device mounted below the tower crane trolley, and to extract feature points over the image sequence of acquired moving-target images, obtaining the moving-target images after feature point extraction;
an outer bounding box acquisition unit: used to establish motion estimation for the feature-extracted moving-target images based on a two-dimensional affine transformation, obtaining the affine transformation relation between two adjacent frames; to obtain the similarity of adjacent-frame moving-target images from that relation and retain moving-target pixels whose similarity is below a threshold as candidate moving-target pixels; and to perform Euclidean spatial clustering on the candidate pixels, obtaining a plurality of clustering results and a corresponding outer bounding box for each;
an outer bounding box screening unit: used to find, among all feature point matching results, the feature point pairs that do not satisfy the affine transformation relation of the two adjacent frames and take them as the matching outlier set; and to count the outliers in each outer bounding box, obtain each box's moving-target confidence from the count, and screen out the boxes meeting the confidence requirement;
and a classification unit: used to classify the outer bounding boxes meeting the confidence requirement with a target classification network trained using a visual neural network, determining the type and confidence of the moving target.
Compared with the prior art, the invention has the following beneficial effects:
(1) In the hoisting process of the tower crane, different feature point extraction methods are used at different scales, which effectively improves algorithm speed while ensuring subsequent matching precision;
(2) When detecting the moving object below the tower crane, the outer bounding boxes meeting the confidence requirement are obtained by fusing image-sequence analysis with feature point matching, and the trained visual neural network is then used only for classification. The moving object is thus detected accurately and, unlike existing detection methods that rely entirely on a neural network for all computation, the cost and power consumption of the computing device are effectively reduced.
The invention is described in further detail below with reference to the drawings and the detailed description.
Drawings
FIG. 1 is a schematic diagram of a deployment location of an acquisition device according to an embodiment of the present invention;
fig. 2 is a flowchart illustrating steps of a method for detecting a moving object under a tower crane according to an embodiment of the present invention.
Reference numerals illustrate: 1. a suspension arm; 2. a tower body; 3. a tower crane trolley; 4. and a collecting device.
Detailed Description
The method for detecting the moving object below the tower crane provided by the invention, as shown in fig. 2, comprises the following steps:
step S1: image acquisition is carried out on a moving object below the tower crane by utilizing an acquisition device 4 arranged below the tower crane trolley 3;
step S2: extracting feature points of an image sequence from the acquired moving target image to obtain a moving target image after the feature points are extracted;
step S3: establishing motion estimation on the motion target image after the feature points are extracted based on two-dimensional affine transformation to obtain affine transformation relation of two adjacent frames of images;
step S4: obtaining the similarity of the moving target images of the adjacent frames by utilizing the affine transformation relation of the two adjacent frames, and reserving the moving target pixels with the similarity lower than a threshold value as candidate moving target pixels;
step S5: performing European spatial clustering on candidate moving target pixels to obtain a plurality of clustering results, and obtaining a corresponding outer boundary box aiming at each clustering result;
step S6: according to affine transformation relation of two adjacent frames of images, in all feature point matching results, finding out feature point pairs which do not meet affine transformation relation of two adjacent frames of images, and taking the feature point pairs as a matching outer point set;
step S7: obtaining the number of outliers in each outer boundary frame according to the matched outlier set, obtaining the confidence coefficient of the moving target of each outer boundary frame by using the number of outliers, and screening out the outer boundary frames meeting the confidence coefficient requirement;
step S8: and classifying the outer boundary boxes meeting the confidence coefficient requirements by utilizing a classification network of the visual neural network training target, and determining the type and the confidence coefficient of the moving target.
Specifically, the acquisition device 4 adopted by the invention is a downward-looking camera. As shown in fig. 1, the main components of the tower crane are the suspension arm 1, the tower body 2 and the tower crane trolley 3. The downward-looking camera is mounted below the tower crane trolley 3 and points vertically at the ground, so that moving objects below the crane are imaged effectively.
In step S2, obtaining the moving-target image after feature point extraction comprises:
establishing an image pyramid of the moving-target image by Gaussian filtering;
extracting a small number of stable feature points from the top layer of the image pyramid with a feature extraction algorithm to obtain a feature map;
searching for local maximum points in the feature map and retaining a set percentage of the local maxima found as feature points;
acquiring a local histogram of oriented gradients for each feature point and taking the direction of its maximum as the feature point's orientation;
extracting a large number of small-scale ORB feature points at the bottom layer of the image pyramid with the ORB algorithm;
and, after feature point extraction and small-scale ORB feature point extraction are completed on the moving-target images, obtaining the moving-target images after feature point extraction.
Further, the feature extraction algorithm adopted in the present invention is the SIFT algorithm.
Further, the set percentage in the present invention is the top twenty percent.
Specifically, the image pyramid is constructed with a Gaussian filter. The base standard deviation of the Gaussian function is 1.6 and six filterings are computed in total; the i-th filtering uses a standard deviation of 1.6 to the power of i. A small number of stable feature points are then extracted at the top of the pyramid with the SIFT algorithm, i.e. the feature map is obtained as the difference between the top and second-to-top layers of the pyramid.
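As an illustration of this construction, a minimal NumPy/OpenCV sketch is given below. It assumes a BGR input image; the function names, the 3x3 local-maximum test and the top-20% retention are illustrative choices consistent with the description, not code from the patent.

```python
import cv2
import numpy as np

def build_pyramid_and_feature_map(image, levels=6, base_sigma=1.6):
    """Gaussian pyramid as described: the i-th filtering uses a standard
    deviation of 1.6**i; the feature map is the difference between the
    top and second-to-top layers (a difference-of-Gaussians response)."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY).astype(np.float32)
    pyramid = [gray]
    for i in range(1, levels + 1):
        pyramid.append(cv2.GaussianBlur(gray, (0, 0), base_sigma ** i))
    feature_map = pyramid[-1] - pyramid[-2]
    return pyramid, feature_map

def top_layer_feature_points(feature_map, keep_ratio=0.2):
    """Local maxima of the feature map, keeping the top 20% by response
    (the 'set percentage' of the description)."""
    dilated = cv2.dilate(feature_map, np.ones((3, 3), np.uint8))
    ys, xs = np.nonzero((feature_map == dilated) & (feature_map > 0))
    order = np.argsort(-feature_map[ys, xs])
    keep = order[: max(1, int(len(order) * keep_ratio))]
    return np.stack([xs[keep], ys[keep]], axis=1)
```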
Further, extracting a large number of small-scale ORB feature points with the ORB algorithm comprises:
setting a small image window, accumulating the differences between the window's center point and the surrounding pixels, and constructing a small-scale ORB feature map from the accumulated differences;
searching for local maximum points in the small-scale ORB feature map as small-scale ORB feature candidate points, and screening the candidates to obtain the small-scale ORB feature points;
and acquiring the centroid of the small image window, connecting the centroid with the center point, and taking the direction of this line as the orientation of the small-scale ORB feature point.
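The following sketch mirrors these three steps under the same assumptions as above (NumPy/OpenCV, points far enough from the image border); the window size and the screening threshold are illustrative, not values from the patent.

```python
def small_scale_orb_points(gray, win=7, keep=500):
    """Score map: for each pixel, accumulate absolute differences between
    the window center and the surrounding pixels; keep local maxima of
    that map, screened by an illustrative mean-score threshold."""
    gray = gray.astype(np.float32)
    score = np.zeros_like(gray)
    r = win // 2
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dx == 0 and dy == 0:
                continue
            shifted = np.roll(np.roll(gray, dy, axis=0), dx, axis=1)
            score += np.abs(gray - shifted)
    dilated = cv2.dilate(score, np.ones((3, 3), np.uint8))
    ys, xs = np.nonzero((score == dilated) & (score > score.mean()))
    order = np.argsort(-score[ys, xs])[:keep]
    return np.stack([xs[order], ys[order]], axis=1)

def centroid_orientation(gray, pt, win=7):
    """Orientation = direction from the window center to its intensity
    centroid (assumes pt lies at least win//2 pixels from the border)."""
    x, y = pt
    r = win // 2
    patch = gray[y - r:y + r + 1, x - r:x + r + 1].astype(np.float32)
    ys, xs = np.mgrid[-r:r + 1, -r:r + 1]
    m00 = patch.sum() + 1e-9
    cx, cy = (xs * patch).sum() / m00, (ys * patch).sum() / m00
    return np.arctan2(cy, cx)
```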
In step S3 above, obtaining the affine transformation relation of the two adjacent frames comprises:
performing a brute-force search over the stable feature points and taking the pairs with the minimum Euclidean distance between features as coarse matching pairs;
obtaining the difference of each small-scale ORB feature point from the coarse matching pairs, and eliminating the small-scale ORB feature points whose difference is too large;
and performing a brute-force search again over the retained small-scale ORB feature points, taking the pairs with the minimum Euclidean distance between features as the final matching result, and obtaining the affine transformation relation of the two adjacent frames from the final matching result.
Further, eliminating the small-scale ORB feature points with excessive difference comprises:
obtaining, from the coarse matching pairs, a two-dimensional affine transformation model of the feature-extracted moving images, and projecting the small-scale ORB feature points of one feature-extracted moving-target image onto the other image with this model to obtain the corresponding projection result;
and judging from the projection result whether a corresponding small-scale ORB feature point exists near the projection; if not, the small-scale ORB feature point has no corresponding (homologous) point, its difference is considered too large, and it is eliminated.
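A sketch of this screening step follows, assuming a 2x3 affine matrix from the coarse matching stage; the 3-pixel search radius is an illustrative choice, not a value stated here.

```python
def filter_orb_by_projection(affine, pts_a, pts_b, radius=3.0):
    """Project image A's small-scale ORB points into image B with the
    coarse affine model; a point is kept only if some point of B lies
    within `radius` pixels of its projection, i.e. a candidate
    corresponding (homologous) point exists. Returns kept indices."""
    if len(pts_b) == 0:
        return np.array([], dtype=int)
    A, t = affine[:, :2], affine[:, 2]        # 2x3 affine: [A | t]
    projected = pts_a.astype(np.float32) @ A.T + t
    kept = [i for i, p in enumerate(projected)
            if np.linalg.norm(pts_b - p, axis=1).min() <= radius]
    return np.asarray(kept, dtype=int)
```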
Further, obtaining the affine transformation relation of the two adjacent frames comprises:
solving the two-dimensional affine transformation model of the feature-extracted moving images by least squares using the final matching result, obtaining the affine transformation relation of the two adjacent frames.
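The brute-force matching and the least-squares solve can be sketched as below, assuming descriptors and matched points are NumPy arrays; this is one standard way to set up the 2-D affine normal equations, not necessarily the patent's exact formulation.

```python
def brute_force_match(desc_a, desc_b):
    """Nearest neighbour in descriptor space by Euclidean distance
    (the brute-force search over all candidate pairs)."""
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    return np.argmin(d, axis=1)

def solve_affine(pts_a, pts_b):
    """Least-squares solution of the 2-D affine model
    [x', y']^T = A [x, y]^T + t from matched point pairs."""
    n = len(pts_a)
    M = np.zeros((2 * n, 6), dtype=np.float64)
    M[0::2, 0:2] = pts_a; M[0::2, 2] = 1.0    # rows for x' equations
    M[1::2, 3:5] = pts_a; M[1::2, 5] = 1.0    # rows for y' equations
    rhs = pts_b.reshape(-1)
    params, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    return params.reshape(2, 3)               # 2x3 matrix [A | t]
```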
In step S4 above, moving-target pixels with similarity below 0.5 are retained as candidate moving-target pixels.
In step S4 above, obtaining the similarity of adjacent-frame moving-target images comprises:
acquiring all pairs of adjacent-frame moving-target images among the feature-extracted moving-target images, and taking a small image window around each pixel of one frame in each pair; obtaining the corresponding small image window in the other frame from the affine transformation relation of the two adjacent frames;
obtaining, from each small image window and its corresponding window, the window image group of each pair of adjacent frames;
and acquiring the mean of the normalized correlation products over the R, G and B channels of the window image group, taking this mean as the similarity.
Further, obtaining the small image window corresponding to the other frame of the moving-target image comprises:
multiplying the center point of the small image window by the affine transformation matrix to obtain the center point of the corresponding window in the other frame;
and taking the corresponding small image window with the same window size around that center point.
Further, obtaining the mean of the normalized correlation products comprises:
acquiring the gray-level mean of each window in the window image group, and subtracting the corresponding mean from every pixel to obtain the de-centered window image group;
multiplying corresponding pixels of the de-centered window image group and summing to obtain the correlation product, and dividing the correlation product by the product of the two windows' two-norm moduli to obtain the normalized correlation product;
and obtaining the mean of the normalized correlation products over the R, G and B channels of the window image group from the per-channel normalized correlation products.
In step S7 above, obtaining the moving-target confidence of each outer bounding box comprises:
taking the number of outliers in the outer bounding box divided by a set value as a weight on the confidence that the box belongs to a moving target, obtaining the outlier-proportion confidence;
acquiring, as the similarity confidence, the proportion of pixel points in the outer bounding box relative to the small image window;
and multiplying the outlier-proportion confidence by the similarity confidence to obtain the moving-target confidence of each outer bounding box.
Further, the number of outliers in the outer bounding box divided by 15 is used as the weight.
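The confidence combination then reduces to a few lines, sketched below; the divisor of 15 comes from the text above, while the reading of the similarity-confidence proportion is an assumption based on the translated description.

```python
def box_confidence(n_outliers, n_box_pixels, n_window_pixels, divisor=15.0):
    """Moving-target confidence of one outer bounding box:
    (outlier count / 15) * (pixel proportion relative to the small image
    window); the proportion term is an interpretation, not patent text."""
    return (n_outliers / divisor) * (n_box_pixels / max(n_window_pixels, 1))
```

Boxes whose confidence falls below the threshold stated for step S7 are then discarded.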
In step S5 above, the specific procedure of Euclidean spatial clustering to obtain the clustering results is as follows:
Euclidean spatial clustering is performed on the candidate moving-target pixels. All candidate pixels are traversed and sorted by the number of candidate pixels in their neighborhood; the more neighbors a pixel has, the higher it ranks. Region growing starts from the top-ranked candidate pixel: every candidate pixel whose spatial distance to a pixel of the region is less than 3 is added to the region, and the same calculation is applied to the newly added pixels until no new candidate pixel can be added. The computation is then repeated on the remaining candidate pixels, so that all candidate moving-target pixels are grouped into distinct connected bodies and clustering is complete. For each connected body, the minimum bounding box is computed as its outer bounding box.
In step S7 above, outer bounding boxes with a confidence below 2 are removed.
In step S8 above, the target classification network is trained using the MobileNetV2 neural network.
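For this final step, a sketch with torchvision's MobileNetV2 is given below. The class list, the weights file name and the 224x224 input size are assumptions for illustration; the patent only states that MobileNetV2 is used to train the target classification network.

```python
import torch
import torchvision.transforms as T
from torchvision.models import mobilenet_v2

# Hypothetical class list and weights file -- assumptions, not patent text.
CLASSES = ["person", "truck", "bicycle", "tricycle", "background"]

model = mobilenet_v2(num_classes=len(CLASSES))
model.load_state_dict(torch.load("tower_crane_classifier.pt", map_location="cpu"))
model.eval()

preprocess = T.Compose([T.ToPILImage(), T.Resize((224, 224)), T.ToTensor()])

def classify_box(frame_bgr, box):
    """Crop one screened outer bounding box and classify it, returning
    the moving-target type and its confidence."""
    x0, y0, x1, y1 = [int(v) for v in box]
    crop = frame_bgr[y0:y1 + 1, x0:x1 + 1, ::-1].copy()   # BGR -> RGB
    with torch.no_grad():
        logits = model(preprocess(crop).unsqueeze(0))
        probs = torch.softmax(logits, dim=1)[0]
    conf, idx = probs.max(dim=0)
    return CLASSES[int(idx)], float(conf)
```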
The invention provides a detection system for a moving object below a tower crane, which comprises the following components:
a feature extraction unit: used to acquire images of the moving object below the tower crane with the acquisition device 4 mounted below the tower crane trolley 3, and to extract feature points over the image sequence of acquired moving-target images, obtaining the moving-target images after feature point extraction;
an outer bounding box acquisition unit: used to establish motion estimation for the feature-extracted moving-target images based on a two-dimensional affine transformation, obtaining the affine transformation relation between two adjacent frames; to obtain the similarity of adjacent-frame moving-target images from that relation and retain moving-target pixels whose similarity is below a threshold as candidate moving-target pixels; and to perform Euclidean spatial clustering on the candidate pixels, obtaining a plurality of clustering results and a corresponding outer bounding box for each;
an outer bounding box screening unit: used to find, among all feature point matching results, the feature point pairs that do not satisfy the affine transformation relation of the two adjacent frames and take them as the matching outlier set; and to count the outliers in each outer bounding box, obtain each box's moving-target confidence from the count, and screen out the boxes meeting the confidence requirement;
and a classification unit: used to classify the outer bounding boxes meeting the confidence requirement with a target classification network trained using a visual neural network, determining the type and confidence of the moving target.
Compared with the prior art, the invention has the following beneficial effects:
(1) In the hoisting process of the tower crane, different feature point extraction methods are used at different scales, which effectively improves algorithm speed while ensuring subsequent matching precision;
(2) When detecting the moving object below the tower crane, the outer bounding boxes meeting the confidence requirement are obtained by fusing image-sequence analysis with feature point matching, and the trained visual neural network is then used only for classification. The moving object is thus detected accurately and, unlike existing detection methods that rely entirely on a neural network for all computation, the cost and power consumption of the computing device are effectively reduced.
The above embodiments are only preferred embodiments of the present invention, and the scope of the present invention is not limited thereto, but any insubstantial changes and substitutions made by those skilled in the art on the basis of the present invention are intended to be within the scope of the present invention as claimed.

Claims (10)

1. A detection method for a moving object below a tower crane, characterized by comprising the following steps:
acquiring images of the moving object below the tower crane with an acquisition device mounted below the tower crane trolley;
extracting feature points over the image sequence of acquired moving-target images to obtain the moving-target images after feature point extraction;
establishing motion estimation for the feature-extracted moving-target images based on a two-dimensional affine transformation, obtaining the affine transformation relation between two adjacent frames;
obtaining the similarity of adjacent-frame moving-target images from the affine transformation relation of the two adjacent frames, and retaining moving-target pixels whose similarity is below a threshold as candidate moving-target pixels;
performing Euclidean spatial clustering on the candidate moving-target pixels to obtain a plurality of clustering results, and obtaining a corresponding outer bounding box for each clustering result;
according to the affine transformation relation of the two adjacent frames, finding among all feature point matching results the feature point pairs that do not satisfy it, and taking them as the matching outlier set;
obtaining the number of outliers in each outer bounding box from the matching outlier set, obtaining the moving-target confidence of each box from the outlier count, and screening out the boxes meeting the confidence requirement;
and classifying the outer bounding boxes meeting the confidence requirement with a target classification network trained using a visual neural network, determining the type and confidence of the moving target.
2. The method for detecting a moving object under a tower crane according to claim 1, characterized in that obtaining the moving-target image after feature point extraction comprises:
establishing an image pyramid of the moving-target image by Gaussian filtering;
extracting a small number of stable feature points from the top layer of the image pyramid with a feature extraction algorithm to obtain a feature map;
searching for local maximum points in the feature map and retaining a set percentage of the local maxima found as feature points;
acquiring a local histogram of oriented gradients for each feature point and taking the direction of its maximum as the feature point's orientation;
extracting a large number of small-scale ORB feature points at the bottom layer of the image pyramid with the ORB algorithm;
and, after feature point extraction and small-scale ORB feature point extraction are completed on the moving-target images, obtaining the moving-target images after feature point extraction.
3. The method for detecting a moving object under a tower crane according to claim 2, characterized in that extracting a large number of small-scale ORB feature points with the ORB algorithm comprises:
setting a small image window, accumulating the differences between the window's center point and the surrounding pixels, and constructing a small-scale ORB feature map from the accumulated differences;
searching for local maximum points in the small-scale ORB feature map as small-scale ORB feature candidate points, and screening the candidates to obtain the small-scale ORB feature points;
and acquiring the centroid of the small image window, connecting the centroid with the center point, and taking the direction of this line as the orientation of the small-scale ORB feature point.
4. The method for detecting a moving object under a tower crane according to claim 2, characterized in that obtaining the affine transformation relation of the two adjacent frames comprises:
performing a brute-force search over the stable feature points and taking the pairs with the minimum Euclidean distance between features as coarse matching pairs;
obtaining the difference of each small-scale ORB feature point from the coarse matching pairs, and eliminating the small-scale ORB feature points whose difference is too large;
and performing a brute-force search again over the retained small-scale ORB feature points, taking the pairs with the minimum Euclidean distance between features as the final matching result, and obtaining the affine transformation relation of the two adjacent frames from the final matching result.
5. The method for detecting a moving object under a tower crane according to claim 4, characterized in that eliminating the small-scale ORB feature points with excessive difference comprises:
obtaining, from the coarse matching pairs, a two-dimensional affine transformation model of the feature-extracted moving images, and projecting the small-scale ORB feature points of one feature-extracted moving-target image onto the other image with this model to obtain the corresponding projection result;
judging from the projection result whether a corresponding small-scale ORB feature point exists near the projection; if not, the small-scale ORB feature point has no corresponding (homologous) point, its difference is considered too large, and it is eliminated;
and obtaining the affine transformation relation of the two adjacent frames comprises:
solving the two-dimensional affine transformation model of the feature-extracted moving images by least squares using the final matching result, obtaining the affine transformation relation of the two adjacent frames.
6. The method for detecting a moving object under a tower crane according to claim 1, characterized in that obtaining the similarity of adjacent-frame moving-target images comprises:
acquiring all pairs of adjacent-frame moving-target images among the feature-extracted moving-target images, and taking a small image window around each pixel of one frame in each pair; obtaining the corresponding small image window in the other frame from the affine transformation relation of the two adjacent frames;
obtaining, from each small image window and its corresponding window, the window image group of each pair of adjacent frames;
and acquiring the mean of the normalized correlation products over the R, G and B channels of the window image group, taking this mean as the similarity.
7. The method for detecting a moving object under a tower crane according to claim 6, characterized in that obtaining the small image window corresponding to the other frame comprises:
multiplying the center point of the small image window by the affine transformation matrix to obtain the center point of the corresponding window in the other frame;
and taking the corresponding small image window with the same window size around that center point.
8. The method for detecting a moving object under a tower crane according to claim 6, characterized in that obtaining the mean of the normalized correlation products comprises:
acquiring the gray-level mean of each window in the window image group, and subtracting the corresponding mean from every pixel to obtain the de-centered window image group;
multiplying corresponding pixels of the de-centered window image group and summing to obtain the correlation product, and dividing the correlation product by the product of the two windows' two-norm moduli to obtain the normalized correlation product;
and obtaining the mean of the normalized correlation products over the R, G and B channels of the window image group from the per-channel normalized correlation products.
9. The method for detecting a moving object under a tower crane according to claim 6, characterized in that obtaining the moving-target confidence of each outer bounding box comprises:
taking the number of outliers in the outer bounding box divided by a set value as a weight on the confidence that the box belongs to a moving target, obtaining the outlier-proportion confidence;
acquiring, as the similarity confidence, the proportion of pixel points in the outer bounding box relative to the small image window;
and multiplying the outlier-proportion confidence by the similarity confidence to obtain the moving-target confidence of each outer bounding box.
10. A detection system for a moving object below a tower crane, characterized by comprising:
a feature extraction unit: used to acquire images of the moving object below the tower crane with an acquisition device mounted below the tower crane trolley, and to extract feature points over the image sequence of acquired moving-target images, obtaining the moving-target images after feature point extraction;
an outer bounding box acquisition unit: used to establish motion estimation for the feature-extracted moving-target images based on a two-dimensional affine transformation, obtaining the affine transformation relation between two adjacent frames; to obtain the similarity of adjacent-frame moving-target images from that relation and retain moving-target pixels whose similarity is below a threshold as candidate moving-target pixels; and to perform Euclidean spatial clustering on the candidate pixels, obtaining a plurality of clustering results and a corresponding outer bounding box for each;
an outer bounding box screening unit: used to find, among all feature point matching results, the feature point pairs that do not satisfy the affine transformation relation of the two adjacent frames and take them as the matching outlier set; and to count the outliers in each outer bounding box, obtain each box's moving-target confidence from the count, and screen out the boxes meeting the confidence requirement;
and a classification unit: used to classify the outer bounding boxes meeting the confidence requirement with a target classification network trained using a visual neural network, determining the type and confidence of the moving target.
CN202310053361.1A, filed 2023-02-03 — Detection method and system for moving object below tower crane — granted as CN116385480B (Active)

Priority Applications (1)

Application Number: CN202310053361.1A — Priority/Filing Date: 2023-02-03 — Title: Detection method and system for moving object below tower crane

Publications (2)

CN116385480A — published 2023-07-04
CN116385480B — granted 2023-10-20

Family ID: 86977618

Family Applications (1)

CN202310053361.1A — filed 2023-02-03 — status: granted — Detection method and system for moving object below tower crane

Country Status (1)

CN — CN116385480B

Citations (4)

* Cited by examiner, † Cited by third party

WO2014092550A2 * — priority 2012-12-10, published 2014-06-19 — Mimos Berhad — Method for camera motion estimation with presence of moving object
CN110245671A * — priority 2019-06-17, published 2019-09-17 — 艾瑞迈迪科技石家庄有限公司 — Endoscopic image feature point matching method and system
CN112418251A * — priority 2020-12-10, published 2021-02-26 — 研祥智能科技股份有限公司 — Infrared body temperature detection method and system
CN114358166A * — priority 2021-12-29, published 2022-04-15 — 青岛星科瑞升信息科技有限公司 — Multi-target positioning method based on self-adaptive k-means clustering

Family Cites Families (1)

JP4492036B2 * — priority 2003-04-28, published 2010-06-30 — ソニー株式会社 — Image recognition apparatus and method, and robot apparatus

Also Published As

Publication number Publication date
CN116385480A (en) 2023-07-04


Legal Events

PB01 — Publication
SE01 — Entry into force of request for substantive examination
GR01 — Patent grant