CN117333686A - Target positioning method, device, equipment and medium - Google Patents

Target positioning method, device, equipment and medium

Info

Publication number
CN117333686A
Authority
CN
China
Prior art keywords
image
feature
point
target
coordinates
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311303921.0A
Other languages
Chinese (zh)
Inventor
朱静 (Zhu Jing)
黎广宇 (Li Guangyu)
王俊文 (Wang Junwen)
王太伟 (Wang Taiwei)
王雪 (Wang Xue)
程为平 (Cheng Weiping)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Waterborne Transport Research Institute
Original Assignee
China Waterborne Transport Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Waterborne Transport Research Institute filed Critical China Waterborne Transport Research Institute
Priority to CN202311303921.0A
Publication of CN117333686A
Legal status: Pending

Classifications

    • G06V10/75: Image or video pattern matching; proximity measures in feature spaces; organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; coarse-fine approaches, e.g. multi-scale approaches; context analysis; selection of dictionaries
    • G06V10/40: Extraction of image or video features
    • G06V10/762: Image or video recognition or understanding using pattern recognition or machine learning, using clustering, e.g. of similar faces in social networks
    • G06V2201/07: Target detection (indexing scheme relating to image or video recognition or understanding)
    (All under G: Physics; G06: Computing, Calculating or Counting; G06V: Image or Video Recognition or Understanding)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a target positioning method, device, equipment, and medium, which can be used in the field of target positioning. In the method, after feature extraction is performed on an image to be processed, the extracted first feature points are matched with second feature points corresponding to a preset reference image, and image offset coordinates are then calculated from the matched feature matching point pairs. After target detection is performed on the image to be processed, the actual space coordinates of each target are obtained from the detected target image coordinates in combination with the image offset coordinates and a preset homography matrix. By determining the image offset coordinates and using them in the calculation of the targets' actual space coordinates, the scheme effectively improves the accuracy of target positioning.

Description

Target positioning method, device, equipment and medium
Technical Field
The present disclosure relates to the field of target positioning, and in particular, to a target positioning method, apparatus, device, and medium.
Background
With the development of technology, target positioning, that is, determining the position of a target in the world coordinate system from its position in an image, has come to be widely used in various fields.
In the prior art, target positioning typically requires capturing an image with a camera and then determining the coordinates of a target in the image in actual space from the correspondence between the image coordinate system and the spatial coordinate system of actual space.
However, in working scenes such as construction sites and ports, cameras are usually mounted high up and are easily shaken by wind, vibration from passing vehicles, and the like. The shooting range therefore changes, and the accuracy of target positioning becomes low.
Disclosure of Invention
Embodiments of the present application provide a target positioning method, device, equipment, and medium to solve the problem that existing target positioning methods, which use the camera image directly for positioning, suffer from low accuracy.
In a first aspect, an embodiment of the present application provides a target positioning method, including:
performing target detection processing on the acquired image to be processed to obtain at least one target and the image coordinates of each target;
performing feature extraction processing on the image to be processed to obtain at least one first feature point corresponding to the image to be processed and a first descriptor corresponding to each first feature point;
matching the first feature point and the second feature point according to the image coordinates of the first feature point, the first descriptor and the image coordinates of the second feature point corresponding to the preset reference image and the second descriptor corresponding to each second feature point to obtain at least one first feature matching point pair;
for each first feature matching point pair, taking the difference of the image coordinates of the feature points in the first feature matching point pair to obtain offset coordinates corresponding to the first feature matching point pair;
calculating an image offset coordinate according to the offset coordinate corresponding to the at least one first feature matching point pair;
and calculating the actual space coordinate of each target according to the image coordinate of each target, the image offset coordinate and a preset homography matrix.
In a specific embodiment, the calculating the image offset coordinate according to the offset coordinate corresponding to the at least one first feature matching point pair includes:
clustering the offset coordinates corresponding to the at least one first feature matching point pair by adopting a density-based clustering algorithm to obtain at least one coordinate class;
taking the coordinate class containing the most offset coordinates among the at least one coordinate class as a result coordinate class;
and taking the average value of the offset coordinates in the result coordinate class as the image offset coordinates.
In a specific embodiment, after the feature extraction processing is performed on the image to be processed to obtain at least one first feature point corresponding to the image to be processed and a first descriptor corresponding to each first feature point, the method further includes:
deleting the first feature points belonging to a preset position range in the image to be processed from the at least one first feature point to obtain third feature points;
deleting the second feature points belonging to the preset position range in the preset reference image from the second feature points to obtain fourth feature points;
correspondingly, the matching the first feature point and the second feature point according to the image coordinates of the first feature point, the first descriptor, and the image coordinates of the second feature point corresponding to the preset reference image and the second descriptor corresponding to each second feature point to obtain at least one first feature matching point pair, including:
and matching the third feature point and the fourth feature point according to the image coordinates of the third feature point, the first descriptor corresponding to each third feature point, the image coordinates of the fourth feature point and the second descriptor corresponding to each fourth feature point to obtain at least one first feature matching point pair.
In a specific embodiment, before performing object detection processing on the acquired image to be processed to obtain at least one object and the image coordinates of each object, the method further includes:
acquiring an original image set;
performing feature extraction processing on each image to be selected in the original image set to obtain at least one second feature point corresponding to each image to be selected and a second descriptor corresponding to each second feature point;
for each two images to be selected in the original image set, matching the second feature points corresponding to the two images to be selected according to the image coordinates of the second feature points corresponding to the two images to be selected and the second descriptors corresponding to the second feature points to obtain second feature matching point pairs corresponding to the two images to be selected;
calculating a matching point pair coordinate distance average value corresponding to the two images to be selected according to the image coordinates of the fifth feature points in the second feature matching point pairs;
if no matching point pair coordinate distance average value greater than a preset distance threshold exists among all the matching point pair coordinate distance average values, calculating, for each image to be selected, a jitter parameter value corresponding to the image to be selected according to the second feature matching point pairs corresponding to the image to be selected and each image in the original image set other than the image to be selected;
and taking the image to be selected with the smallest jitter parameter value in the original image set as the preset reference image.
In a specific embodiment, the calculating, according to the to-be-selected image and the second feature matching point pair corresponding to each image in the original image set except for the to-be-selected image, a jitter parameter value corresponding to the to-be-selected image includes:
calculating Euclidean distance between the image to be selected and a second feature matching point pair corresponding to each image except the image to be selected in the original image set;
and taking the sum of the squares of the Euclidean distances as the jitter parameter value corresponding to the image to be selected.
In one embodiment, the method further comprises:
acquiring a moving target image set;
performing moving target detection on the moving target image set to obtain initial target points;
taking an initial target point belonging to a preset target category as a result target point;
clustering the result target points to determine result reference points;
acquiring actual space coordinates of the result reference points;
and generating the preset homography matrix according to the image coordinates of the result reference points and the actual space coordinates of the result reference points.
In a specific embodiment, the clustering processing is performed on the result target points, and determining a result reference point includes:
clustering the result target points according to a preset cluster number and a preset cluster distance to obtain at least four target point classes;
taking the center point of each target point class as an initial reference point;
generating a circumscribed rectangle of the initial reference points;
and taking the union of the vertices of the circumscribed rectangle and the initial reference points as the result reference points.
In a second aspect, embodiments of the present application provide a target positioning device, including:
the target detection module is used for carrying out target detection processing on the acquired image to be processed to obtain at least one target and an image coordinate of each target;
the feature extraction module is used for carrying out feature extraction processing on the image to be processed to obtain at least one first feature point corresponding to the image to be processed and a first descriptor corresponding to each first feature point;
the matching module is used for matching the first feature point and the second feature point according to the image coordinates of the first feature point, the first descriptor, the image coordinates of the second feature point corresponding to the preset reference image and the second descriptor corresponding to each second feature point to obtain at least one first feature matching point pair;
A processing module for:
for each first feature matching point pair, taking the difference of the image coordinates of the feature points in the first feature matching point pair to obtain offset coordinates corresponding to the first feature matching point pair;
calculating an image offset coordinate according to the offset coordinate corresponding to the at least one first feature matching point pair;
and calculating the actual space coordinate of each target according to the image coordinate of each target, the image offset coordinate and a preset homography matrix.
In a third aspect, an embodiment of the present application provides an electronic device, including:
a processor, a memory, a communication interface;
the memory is used for storing executable instructions of the processor;
wherein the processor is configured to perform the object localization method of any of the first aspects via execution of the executable instructions.
In a fourth aspect, embodiments of the present application provide a readable storage medium having stored thereon a computer program which, when executed by a processor, implements the object localization method according to any of the first aspects.
In the target positioning method, device, equipment, and medium provided by the present application, after feature extraction is performed on the image to be processed, the extracted first feature points are matched with the second feature points corresponding to the preset reference image, and the image offset coordinates are then calculated from the matched feature matching point pairs. After target detection is performed on the image to be processed, the actual space coordinates of each target are obtained from the detected target image coordinates in combination with the image offset coordinates and the preset homography matrix. By determining the image offset coordinates and using them in the calculation of the targets' actual space coordinates, the scheme effectively improves the accuracy of target positioning.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are some embodiments of the present application, and that a person skilled in the art may obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of a first embodiment of a target positioning method provided in the present application;
fig. 2 is a schematic flow chart of a second embodiment of a target positioning method provided in the present application;
fig. 3 is a schematic flow chart of a third embodiment of a target positioning method provided in the present application;
FIG. 4 is a schematic structural diagram of an embodiment of a target positioning device provided in the present application;
fig. 5 is a schematic structural diagram of an electronic device provided in the present application.
Detailed Description
For the purposes of making the objects, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments are described below clearly and completely with reference to the drawings. It is apparent that the described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without inventive effort fall within the scope of protection of this application.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims of this application and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
With the development of science and technology, target positioning has been widely applied in traffic, construction site, port, and other scenarios, where it can improve operational efficiency.
In the prior art, the camera can be calibrated to determine camera parameters such as internal parameters, distortion, external parameters and the like of the camera, so as to determine the corresponding relation between an image coordinate system and a space coordinate system of an actual space.
Or, using a neural network model method to learn the corresponding relation between the image coordinate system and the space coordinate system of the actual space from a large amount of image data; or determining the corresponding relation between the image coordinate system and the space coordinate system of the actual space by calculating the homography matrix.
And further, after the camera is used for shooting to obtain an image, determining the image coordinates of the target in the image, and combining the corresponding relation to determine the coordinates of the target in the image in the actual space so as to realize target positioning.
However, in work scenes such as construction sites and ports, cameras are generally mounted high and are easily shaken by wind, vibration from passing vehicles, and the like, so that the shooting range changes. If such an image is still used directly to determine the image coordinates of a target for positioning, the accuracy of target positioning is low.
To solve this problem of the prior art, the inventors found, in the course of researching target positioning methods, that the accuracy of target positioning can be improved as follows: after an image to be processed is acquired, feature points in the image are extracted and matched with the feature points of a preset reference image, and image offset coordinates are determined from the image coordinates of the successfully matched feature points. The image coordinates of a target in the image to be processed can then be corrected with the image offset coordinates, and the actual space coordinates of the target obtained using a preset homography matrix, which effectively improves the accuracy of target positioning. The target positioning scheme of the present application is designed on the basis of this inventive concept.
The execution subject of the target positioning method in the present application may be a computer, a server, a camera, or another device; this is not limited in the present application. A computer is taken as an example in the following description.
An application scenario of the target positioning method provided in the present application is illustrated below.
In this application scenario, a camera is installed high on a high-pole lamp or a steel frame in a port operation scenario. The camera transmits captured images to be processed to a computer, and the computer performs target positioning to determine the positions of pedestrians and vehicles for scheduling.
After the computer obtains the image to be processed, the computer can perform object detection processing on the image to obtain at least one object and the image coordinates of each object.
In a port operation scenario, wind is generally strong and easily shakes the camera; in addition, vehicles in ports are generally heavy and cause the camera to shake when they pass the high-pole lamp or steel frame. The computer therefore needs to correct the image coordinates of the targets in the image to be processed.
After the computer acquires the image to be processed, the computer also needs to perform feature extraction processing on the image to be processed, the extracted feature points are matched with the feature points in the preset reference image, and then the image offset coordinates are determined according to the image coordinates of the feature points in the first feature matching point pair which are successfully matched.
The computer then subtracts the image offset coordinates from the image coordinates of each target to obtain corrected coordinates, and combines the corrected coordinates with a preset homography matrix to obtain the actual space coordinates of each target.
The computer may then display the location of each target on the actual space map based on the actual space coordinates of each target for the staff to view or schedule.
It should be noted that the above scenario is only an example of an application scenario provided by the embodiment of the present application, and the embodiment of the present application does not limit the actual forms of various devices included in the scenario, and does not limit the interaction manner between the devices, and in a specific application of the scheme, the application may be set according to actual requirements.
The following describes the technical scheme of the present application in detail through specific embodiments. It should be noted that the following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.
Fig. 1 is a schematic flow chart of a first embodiment of a target positioning method provided in the present application. In this embodiment, the computer extracts feature points from the image to be processed, matches them with the feature points of a preset reference image, determines the image offset coordinates from the matching result, and then determines the actual space coordinates of each target from the image offset coordinates. The method in this embodiment may be implemented by software, hardware, or a combination of software and hardware. As shown in fig. 1, the target positioning method specifically includes the following steps:
S101: and performing target detection processing on the acquired image to be processed to obtain at least one target and image coordinates of each target.
In this step, after the camera captures the image to be processed, it transmits the image to the computer for processing, and the computer performs target detection processing on the acquired image to obtain at least one target and the image coordinates of each target.
The target detection processing may be performed using an R-CNN (Regions with CNN Features) algorithm, a YOLO (You Only Look Once) algorithm, or a DETR (DEtection TRansformer) algorithm. The embodiment of the application does not limit the manner of target detection processing; it may be set according to the actual situation.
It should be noted that, after the target detection processing is performed on the image to be processed, targets that do not belong to the preset target categories may be deleted, and the remaining targets used for target positioning. The preset target categories may be vehicles, pedestrians, animals, and the like; the embodiment of the application does not limit them, and they may be set according to the actual situation.
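Purely for illustration, the detection step might be sketched as follows with an off-the-shelf YOLO implementation (the ultralytics package); the model file, the class filter, and the use of the box's bottom centre as the target's image coordinate are assumptions, not part of the embodiment:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                       # assumed pretrained weights
result = model("frame_to_process.png")[0]        # one image -> one result

targets = []
for box in result.boxes:
    cls_name = result.names[int(box.cls)]
    if cls_name in {"person", "car", "truck"}:   # example preset target categories
        x0, y0, x1, y1 = box.xyxy[0].tolist()
        # take the bottom centre of the box as the target's image coordinate
        targets.append((cls_name, ((x0 + x1) / 2.0, y1)))
```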
S102: and carrying out feature extraction processing on the image to be processed to obtain at least one first feature point corresponding to the image to be processed and a first descriptor corresponding to each first feature point.
In this step, after the computer obtains the image to be processed, in order to correct the image coordinates of the targets in it, a feature extraction algorithm is used to perform feature extraction processing on the image to be processed, obtaining at least one first feature point corresponding to the image to be processed and a first descriptor corresponding to each first feature point. A descriptor is a vector characterizing the features of a feature point.
It should be noted that the feature extraction algorithm may be the Scale-Invariant Feature Transform (SIFT) algorithm, the Speeded-Up Robust Features (SURF) algorithm, the ORB (Oriented FAST and Rotated BRIEF) algorithm, or the like; this is not limited in the embodiment of the application and may be determined according to the actual situation.
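For illustration only, this step might be sketched with OpenCV's ORB implementation (one of the algorithms named above); the file name and the nfeatures value are assumptions:

```python
import cv2

image = cv2.imread("frame_to_process.png", cv2.IMREAD_GRAYSCALE)
orb = cv2.ORB_create(nfeatures=2000)
# keypoints carry the image coordinates; descriptors are the per-point
# feature vectors (the "first descriptors")
keypoints, descriptors = orb.detectAndCompute(image, None)
first_points = [kp.pt for kp in keypoints]       # (x, y) image coordinates
```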
S103: and matching the first feature points with the second feature points according to the image coordinates and the first descriptors of the first feature points, and the image coordinates of the second feature points corresponding to the preset reference image and the second descriptors corresponding to each second feature point to obtain at least one first feature matching point pair.
In this step, after the computer obtains the first feature points and the first descriptors, a feature matching algorithm may be adopted to match the first feature points and the second feature points according to the image coordinates of the first feature points and the first descriptors, and the image coordinates of the second feature points corresponding to the preset reference image and the second descriptors corresponding to each second feature point, so as to obtain at least one first feature matching point pair.
It should be noted that the feature matching algorithm may be a brute-force matching (Brute-Force Match) algorithm, an algorithm from the Fast Library for Approximate Nearest Neighbors (FLANN), a cross matching algorithm, or the like.
Optionally, the first feature points belonging to a preset position range in the image to be processed are deleted from the at least one first feature point to obtain third feature points, and the second feature points belonging to the preset position range in the preset reference image are deleted from the second feature points to obtain fourth feature points.
The third feature points are then matched with the fourth feature points according to the image coordinates of the third feature points and the first descriptor corresponding to each third feature point, as well as the image coordinates of the fourth feature points and the second descriptor corresponding to each fourth feature point, to obtain the at least one first feature matching point pair. Since feature points near the image border may leave the field of view when the camera shakes, filtering them out improves the matching accuracy.
It should be noted that a feature point belongs to the preset position range if its distance from the left or right edge of the image is smaller than a preset first distance, or its distance from the top or bottom edge of the image is smaller than a preset second distance. The preset first distance and the preset second distance may each be 5, 6, or 10 pixels, or the like; the embodiment does not limit them, and they may be determined according to the actual situation.
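The matching and the optional border filtering might be sketched as follows, assuming the ORB descriptors from the previous sketch (Hamming distance suits binary descriptors); the 10-pixel margin is an assumed example of the preset first and second distances, and cross-checking here plays the role of the cross matching algorithm mentioned above:

```python
import cv2
import numpy as np

def filter_border(keypoints, descriptors, width, height, margin=10):
    """Drop feature points within `margin` pixels of the image border
    (the preset position range)."""
    kept = [(kp, d) for kp, d in zip(keypoints, descriptors)
            if margin <= kp.pt[0] <= width - margin
            and margin <= kp.pt[1] <= height - margin]
    return [kp for kp, _ in kept], np.array([d for _, d in kept])

def match_features(kp_cur, desc_cur, kp_ref, desc_ref, width, height):
    """Match current-frame descriptors against the reference image's."""
    kp_cur, desc_cur = filter_border(kp_cur, desc_cur, width, height)
    kp_ref, desc_ref = filter_border(kp_ref, desc_ref, width, height)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(desc_cur, desc_ref),
                     key=lambda m: m.distance)    # best matches first
    return kp_cur, kp_ref, matches
```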
S104: and for each first feature matching point pair, carrying out difference on the image coordinates of the feature points in the first feature matching point pair to obtain offset coordinates corresponding to the first feature matching point pair.
In this step, after the computer obtains the first feature matching point pairs, for each first feature matching point pair, the image coordinates of the feature points in the first feature matching point pair are differenced, so as to obtain the offset coordinates corresponding to the first feature matching point pair.
For example, if the image coordinates of the first feature point in the $i$-th first feature matching point pair are $(x_i, y_i)$ and the image coordinates of the matched second feature point are $(x_i', y_i')$, the offset coordinate corresponding to this pair is $(\Delta x_i, \Delta y_i)$, where $\Delta x_i = x_i - x_i'$ and $\Delta y_i = y_i - y_i'$.
It should be noted that the feature matching algorithm outputs the first feature matching point pairs and may also determine the matching degree of each pair, so the pairs can be ranked by matching degree; if the number of pairs obtained is greater than a preset culling number, the top preset offset number of pairs are selected for calculating the offset coordinates.
The preset rejection number may be 500, 1000, 5000, etc., the preset offset number may be 100, 200, 300, etc., and the embodiment of the present application does not limit the preset rejection number and the preset offset number, and may be set according to actual situations.
S105: and calculating the image offset coordinates according to the offset coordinates corresponding to the at least one first feature matching point pair.
In this step, after the computer obtains the offset coordinates corresponding to the first feature matching point pair, the computer may calculate the image offset coordinates according to the offset coordinates corresponding to at least one first feature matching point pair.
Specifically, a density-based clustering algorithm is adopted to cluster the offset coordinates corresponding to at least one first feature matching point pair, so as to obtain at least one coordinate class.
The coordinate class containing the most offset coordinates among the at least one coordinate class is taken as the result coordinate class, and the average value of the offset coordinates in the result coordinate class is taken as the image offset coordinates.
For example, if the offset coordinates in the result coordinate class are, in order, $(\Delta x_1, \Delta y_1), \ldots, (\Delta x_m, \Delta y_m)$, the image offset coordinates are $(\overline{\Delta x}, \overline{\Delta y})$, where $\overline{\Delta x} = \frac{1}{m} \sum_{k=1}^{m} \Delta x_k$, $\overline{\Delta y} = \frac{1}{m} \sum_{k=1}^{m} \Delta y_k$, and $m$ is the number of offset coordinates in the result coordinate class.
It should be noted that, the Density-based clustering algorithm may be a DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm, or may be a OPTICS (Ordering points to identify the clustering structure) algorithm, a DENCLUE (Density based Clustering) algorithm, or the like, which is not limited in the embodiment of the present application, and may be determined according to actual situations.
It should be noted that when the DBSCAN algorithm is used for clustering, the parameter eps may be set to a small preset parameter value so that the clustering result better fits the shaking scenario; the preset parameter value may be 2, 3, or 5 pixels, or the like.
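Steps S104 and S105 might then be sketched as follows, assuming the `matches`, `kp_cur`, and `kp_ref` values from the matching sketch above; eps follows the small preset parameter value just noted, and min_samples is an assumed value:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def image_offset(matches, kp_cur, kp_ref, eps=3.0, min_samples=5):
    # S104: difference the image coordinates of each matched pair
    offsets = np.array([
        (kp_cur[m.queryIdx].pt[0] - kp_ref[m.trainIdx].pt[0],
         kp_cur[m.queryIdx].pt[1] - kp_ref[m.trainIdx].pt[1])
        for m in matches])
    # S105: cluster the offsets, keep the most-populated coordinate
    # class, and average it
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(offsets)
    valid = labels[labels != -1]                  # discard noise points
    best = np.bincount(valid).argmax()            # result coordinate class
    return offsets[labels == best].mean(axis=0)
```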
It should be noted that, the execution sequence of step S101 and step S102-step S105 may be: step S101 is executed first, and then step S102-step S105 are executed; it is also possible that: step S102-step S105 are executed first, and step S101 is executed again; it is also possible that: step S101 is performed simultaneously with step S102 to step S105. The execution sequence of step S101 and step S102 to step S105 is not limited in the embodiment of the present application, and may be set according to actual situations.
S106: and calculating the actual space coordinate of each target according to the image coordinate of each target, the image offset coordinate and the preset homography matrix.
In this step, after determining the image coordinates of each target in the image to be processed and the image offset coordinates, the computer subtracts the image offset coordinates from the image coordinates of each target to obtain the corrected coordinates of the target, and the actual space coordinates of the target can then be calculated in combination with the preset homography matrix.
For example, if the image coordinates of the target are $(x, y)$ and the image offset coordinates are $(\overline{\Delta x}, \overline{\Delta y})$, the corrected coordinates of the target are $(x', y')$, where $x' = x - \overline{\Delta x}$ and $y' = y - \overline{\Delta y}$. The preset homography matrix is a $3 \times 3$ matrix $H$; augmenting the corrected coordinates to $(x', y', 1)$, it can be determined that $(X, Y, 1)^{T} = H \cdot (x', y', 1)^{T}$, and the actual space coordinates of the target are $(X, Y)$.
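A minimal sketch of this step, assuming the preset homography matrix H maps corrected homogeneous image coordinates to plane coordinates as described above (the final division normalises the homogeneous scale factor):

```python
import numpy as np

def locate(target_xy, offset, H):
    x = target_xy[0] - offset[0]                  # corrected coordinate x'
    y = target_xy[1] - offset[1]                  # corrected coordinate y'
    X, Y, w = H @ np.array([x, y, 1.0])           # project with the homography
    return X / w, Y / w                           # actual space coordinates (X, Y)
```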
In the target positioning method provided by this embodiment, after feature extraction is performed on the image to be processed, the extracted first feature points are matched with the second feature points corresponding to the preset reference image, and the image offset coordinates are calculated from the matched feature matching point pairs. After target detection is performed on the image to be processed, the actual space coordinates of each target are obtained from the detected target image coordinates in combination with the image offset coordinates and the preset homography matrix. Compared with the prior art, which calculates the actual space coordinates directly from the image coordinates of the target, this scheme corrects the image coordinates with the determined image offset coordinates and uses the corrected coordinates to calculate the actual space coordinates, thereby effectively improving the accuracy of target positioning.
Fig. 2 is a schematic flow chart of a second embodiment of the target positioning method provided in the present application, and on the basis of the foregoing embodiment, the case of acquiring a preset reference image before a computer performs target positioning in the present application is described. As shown in fig. 2, the target positioning method specifically includes the following steps:
S201: an original image set is acquired.
In this step, before the computer performs the target positioning, a preset reference image needs to be determined, and an original image set needs to be acquired first.
The camera sends recorded video to the computer; a worker may select from the video a segment with a small jitter degree and few moving targets, after which the computer extracts images from the segment to form the original image set.
Alternatively, after the camera sends the recorded video to the computer, the computer may extract images directly from the video to form the original image set.
S202: and carrying out feature extraction processing on each image to be selected in the original image set to obtain at least one second feature point corresponding to each image to be selected and a second descriptor corresponding to each second feature point.
In this step, after the computer obtains the original image set, it needs to determine whether the jitter degree of the original image set meets the requirement, and it needs to perform feature extraction processing on each image to be selected in the original image set first, so as to obtain at least one second feature point corresponding to each image to be selected and a second descriptor corresponding to each second feature point.
It should be noted that, the present step is similar to step S102 in the first embodiment, and will not be described here again.
S203: and for every two images to be selected in the original image set, matching the second characteristic points corresponding to the two images to be selected according to the image coordinates of the second characteristic points corresponding to the two images to be selected and the second descriptors corresponding to the second characteristic points to obtain second characteristic matching point pairs corresponding to the two images to be selected.
It should be noted that, the present step is similar to step S103 in the first embodiment, and will not be described here again.
S204: and calculating the average value of the matching point pair coordinates corresponding to the two images to be selected according to the image coordinates of the fifth feature point in the second feature matching point pair.
In this step, after the computer determines the second feature matching point pairs, the matching point pair coordinate distance average value corresponding to the two images to be selected is calculated according to the image coordinates of the fifth feature points in the second feature matching point pairs.
For example, suppose there are $m$ second feature matching point pairs corresponding to two images to be selected $i$ and $j$, and the image coordinates of the two fifth feature points in the $k$-th pair are $(x_{ik}, y_{ik})$ and $(x_{jk}, y_{jk})$ respectively. The matching point pair coordinate distance average value corresponding to the two images is $\bar{d}_{ij} = \frac{1}{m} \sum_{k=1}^{m} d_k$, where $d_k = \sqrt{(x_{ik} - x_{jk})^2 + (y_{ik} - y_{jk})^2}$.
It should be noted that the feature matching algorithm outputs the second feature matching point pairs and may also determine the matching degree of each pair, so the pairs can be sorted by matching degree; if the number of pairs obtained is greater than the preset culling number, the top preset offset number of pairs are selected for calculating the matching point pair coordinate distance average value.
The preset rejection number may be 500, 1000, 5000, etc., the preset offset number may be 100, 200, 300, etc., and the embodiment of the present application does not limit the preset rejection number and the preset offset number, and may be set according to actual situations.
S205: if the coordinate distance average value larger than the preset distance threshold value does not exist in all the coordinate distance average values, calculating jitter parameter values corresponding to the images to be selected according to the images to be selected and second feature matching point pairs corresponding to the images in the original image set except the images to be selected for each image to be selected.
In this step, after determining the matching point pair coordinate distance average value corresponding to each pair of images to be selected, the computer judges whether any coordinate distance average value greater than the preset distance threshold exists among all the coordinate distance average values, so as to judge whether the jitter degree of the original image set meets the requirement.
If no coordinate distance average value greater than the preset distance threshold exists, the jitter degree of the original image set meets the requirement, and for each image to be selected, the jitter parameter value corresponding to that image is calculated according to the second feature matching point pairs corresponding to the image to be selected and each other image in the original image set.
It should be noted that, the preset distance threshold may be 2 pixels, 3 pixels, 5 pixels, etc., and the embodiment of the present application does not limit the preset distance threshold, and may be set according to practical situations.
Specifically, the Euclidean distances of the second feature matching point pairs corresponding to the image to be selected and each image in the original image set other than the image to be selected are calculated.
The euclidean distance of the second feature matching point pair refers to the euclidean distance of the image coordinates of the two second feature points in the second feature matching point pair.
The sum of the squares of all the obtained Euclidean distances is taken as the jitter parameter value corresponding to the image to be selected.
It should be noted that the feature matching algorithm outputs the second feature matching point pairs and may also determine the matching degree of each pair, so the pairs can be ranked by matching degree; if the number of pairs obtained is greater than the preset culling number, the top preset offset number of pairs are selected for calculating the jitter parameter value.
If a coordinate distance average value greater than the preset distance threshold does exist among all the coordinate distance average values, the jitter degree of the original image set does not meet the requirement. The original image set must then be re-acquired and the judgment repeated, until no coordinate distance average value greater than the preset distance threshold exists, after which the jitter parameter values are calculated.
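Steps S204 and S205 might be sketched as follows; the `pair_points` dictionary holding the matched (x, y) coordinates per image pair is an assumed representation of the second feature matching point pairs:

```python
import numpy as np

def distance_mean(pts_i, pts_j):
    """Matching point pair coordinate distance average for one image pair;
    pts_i and pts_j are aligned (m, 2) arrays of matched coordinates."""
    return float(np.linalg.norm(pts_i - pts_j, axis=1).mean())

def jitter_value(i, pair_points):
    """Sum of squared Euclidean distances over every pair involving image i."""
    total = 0.0
    for (a, b), (pts_a, pts_b) in pair_points.items():
        if i in (a, b):
            total += float((np.linalg.norm(pts_a - pts_b, axis=1) ** 2).sum())
    return total
```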
S206: and collecting the original images, and taking the image to be selected with the minimum jitter parameter value as a preset reference image.
In this step, after obtaining the jitter parameter value corresponding to each image to be selected, the computer takes the image to be selected with the smallest jitter parameter value in the original image set as the preset reference image.
It should be noted that the computer also needs to store the second feature points corresponding to the preset reference image and the second descriptors corresponding to those second feature points. Alternatively, feature extraction processing may be performed again on the preset reference image, and the extracted second feature points and their corresponding second descriptors stored.
When the mounting position or orientation of the camera changes, the preset reference image needs to be re-determined according to the above steps.
In the target positioning method provided by this embodiment, feature extraction processing is performed on the acquired original image set, feature points are matched, and the matching point pair coordinate distance average values are calculated. When no matching point pair coordinate distance average value greater than the preset distance threshold exists, the jitter parameter values are calculated, and the image to be selected with the smallest jitter parameter value is chosen as the preset reference image, so that the selected preset reference image has a small jitter degree.
Fig. 3 is a schematic flow chart of a third embodiment of the target positioning method provided in the present application, and on the basis of the foregoing embodiment, the embodiment of the present application describes a case of generating a preset homography matrix after a computer determines a preset reference image. As shown in fig. 3, the target positioning method specifically includes the following steps:
s301: a set of moving object images is acquired.
In this step, before the computer performs the target positioning, a preset homography matrix needs to be determined, and a moving target image set needs to be acquired first.
The camera sends recorded video to the computer; a worker may select from the video a segment containing many moving targets, after which the computer extracts images from the segment to form the moving target image set.
S302: and detecting the moving target of the moving target image set to obtain an initial target point.
In this step, after the computer obtains the moving target image set, moving target detection is performed on the images in the set to obtain initial target points.
It should be noted that, the present step is similar to step S101 in the first embodiment, and will not be described here again.
S303: and taking the initial target point belonging to the preset target category as a result target point.
In this step, after the computer obtains the initial target points, the initial target points belonging to the preset target categories are taken as result target points. Positioning targets with a preset homography matrix determined from these result target points improves the accuracy of target positioning.
It should be noted that the preset target category may be a vehicle, a pedestrian, an animal, etc., and the embodiment of the present application does not limit the preset target category, and may be set according to actual situations.
S304: clustering is carried out on the result target points, and the result reference points are determined.
In this step, after the computer obtains the result target points, clustering processing is performed on them and the result reference points are determined.
Specifically, clustering the result target points according to the preset clustering number and the preset clustering distance to obtain at least four target point classes, wherein the distance between the central points of every two target point classes is larger than the preset clustering distance.
It should be noted that the preset cluster number is a number greater than or equal to 4, such as 4, 5, or 9, and the preset cluster distance is 1/K of the short side of the images in the moving target image set, where K is the preset cluster number. The embodiment of the application does not limit the preset cluster number or the preset cluster distance; they may be set according to the actual situation.
The clustering may be performed using, for example, a K-means algorithm; the embodiment of the application does not limit the clustering method, and it may be set according to the actual situation.
The center point of each target point class is taken as an initial reference point; a circumscribed rectangle of the initial reference points is generated, and the union of the vertices of the circumscribed rectangle and the initial reference points is taken as the result reference points.
It should be noted that if a side of the circumscribed rectangle is shorter than a preset side length threshold, that side is lengthened to the preset side length threshold. The preset side length threshold may be 100, 200, or 500 pixels, or the like; the embodiment of the application does not limit it, and it may be set according to the actual situation.
Alternatively, the initial reference points alone may be used as the result reference points.
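Step S304 might be sketched as follows under the K-means variant mentioned above, interpreting the circumscribed rectangle as the axis-aligned bounding box of the cluster centers; the minimum-side-length padding and the preset cluster distance check are omitted for brevity:

```python
import numpy as np
from sklearn.cluster import KMeans

def result_reference_points(target_points, k=4):
    km = KMeans(n_clusters=k, n_init=10).fit(np.asarray(target_points))
    centers = km.cluster_centers_                 # initial reference points
    x0, y0 = centers.min(axis=0)                  # circumscribed rectangle
    x1, y1 = centers.max(axis=0)
    corners = np.array([[x0, y0], [x1, y0], [x1, y1], [x0, y1]])
    return np.vstack([centers, corners])          # union of both point sets
```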
S305: the actual spatial coordinates of the resulting fiducial point are obtained.
In this step, after the computer obtains the result datum point, the result datum point can be marked in a preset datum image for display, after the result datum point is checked by a worker, if the result datum point is not in the same plane in the actual space, the result datum point is adjusted to be in the same plane in the actual space.
And the staff determines the actual space coordinates of each result datum point and then inputs the actual space coordinates into the computer, and the computer can acquire the actual space coordinates of the result datum points.
S306: and generating a preset homography matrix according to the image coordinates of the result datum points and the actual space coordinates of the result datum points.
In this step, after determining the image coordinates and the actual space coordinates of the result reference points, the computer generates a preset homography matrix according to the image coordinates of the result reference points and the actual space coordinates of the result reference points, because the number of the result reference points is greater than 4.
When the mounting position or orientation of the camera changes, the preset homography matrix needs to be re-determined according to the above steps.
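With more than four correspondences, the least-squares estimation of step S306 could be performed, for example, with OpenCV; the coordinate values below are illustrative placeholders only:

```python
import cv2
import numpy as np

image_pts = np.array([[100, 200], [400, 210], [420, 500], [90, 480],
                      [250, 350]], dtype=np.float32)  # result reference points (pixels)
world_pts = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 12.0], [0.0, 12.0],
                      [5.0, 6.0]], dtype=np.float32)  # actual space coordinates
H, _ = cv2.findHomography(image_pts, world_pts, method=0)  # 0 = least squares
```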
In the target positioning method provided by this embodiment, target points in the moving target image set are determined through moving target detection and clustered to obtain the result reference points, and the preset homography matrix is then generated from the image coordinates and actual space coordinates of the result reference points, which improves the accuracy of subsequent target positioning using the preset homography matrix. In addition, no camera calibration is required; because the homography matrix expresses the correspondence between the image coordinate system and the spatial coordinate system of actual space directly, errors caused by the distance between the camera and the target are avoided, so the matrix is more accurate and cheaper to generate. No model training is needed either, which improves the efficiency of generating the preset homography matrix and keeps its cost low.
The following are device embodiments of the present application, which may be used to perform method embodiments of the present application. For details not disclosed in the device embodiments of the present application, please refer to the method embodiments of the present application.
Fig. 4 is a schematic structural diagram of an embodiment of the object positioning device provided in the present application. As shown in fig. 4, the object positioning device 40 includes:
the target detection module 41 is configured to perform target detection processing on the acquired image to be processed, so as to obtain at least one target and an image coordinate of each target;
the feature extraction module 42 is configured to perform feature extraction processing on the image to be processed, so as to obtain at least one first feature point corresponding to the image to be processed and a first descriptor corresponding to each first feature point;
the matching module 43 is configured to match the first feature point with the second feature point according to the image coordinates of the first feature point, the first descriptor, and the image coordinates of the second feature point corresponding to the preset reference image, and the second descriptor corresponding to each second feature point, so as to obtain at least one first feature matching point pair;
a processing module 44 for:
for each first feature matching point pair, taking the difference of the image coordinates of the feature points in the first feature matching point pair to obtain offset coordinates corresponding to the first feature matching point pair;
Calculating an image offset coordinate according to the offset coordinate corresponding to the at least one first feature matching point pair;
and calculating the actual space coordinate of each target according to the image coordinate of each target, the image offset coordinate and a preset homography matrix.
Further, the processing module 44 is specifically configured to:
clustering the offset coordinates corresponding to the at least one first feature matching point pair by adopting a density-based clustering algorithm to obtain at least one coordinate class;
taking the coordinate class containing the most offset coordinates among the at least one coordinate class as a result coordinate class;
and taking the average value of the offset coordinates in the result coordinate class as the image offset coordinates.
Further, the processing module 44 is further configured to:
deleting the first feature points belonging to a preset position range in the image to be processed from the at least one first feature point to obtain third feature points;
deleting the second feature points belonging to the preset position range in the preset reference image from the second feature points to obtain fourth feature points;
further, the matching module 43 is specifically configured to:
and matching the third feature point and the fourth feature point according to the image coordinates of the third feature point, the first descriptor corresponding to each third feature point, the image coordinates of the fourth feature point and the second descriptor corresponding to each fourth feature point to obtain at least one first feature matching point pair.
Further, the processing module 44 is further configured to acquire an original image set;
further, the feature extraction module 42 is further configured to perform feature extraction processing on each image to be selected in the original image set, so as to obtain at least one second feature point corresponding to each image to be selected and a second descriptor corresponding to each second feature point;
further, the matching module 43 is further configured to match, for each two images to be selected in the original image set, the second feature points corresponding to the two images to be selected according to the image coordinates of the second feature points corresponding to the two images to be selected and the second descriptors corresponding to the second feature points, so as to obtain second feature matching point pairs corresponding to the two images to be selected;
further, the processing module 44 is further configured to:
calculating a coordinate distance average value of the matching point pairs corresponding to the two images to be selected according to the image coordinates of the fifth characteristic point in the second characteristic matching point pair;
if no matching point pair coordinate distance average value greater than a preset distance threshold exists among all the matching point pair coordinate distance average values, calculating, for each image to be selected, a jitter parameter value corresponding to the image to be selected according to the second feature matching point pairs corresponding to the image to be selected and each image in the original image set other than the image to be selected;
and taking the image to be selected with the smallest jitter parameter value in the original image set as the preset reference image.
Further, the processing module 44 is further configured to:
calculating the Euclidean distances of the second feature matching point pairs between the image to be selected and each image in the original image set other than the image to be selected;
and taking the sum of the squares of these Euclidean distances as the jitter parameter value corresponding to the image to be selected, as sketched below.
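Under those definitions, reference-image selection might look like the following sketch. The dictionary-of-pairs input layout and the distance threshold value are assumptions introduced for illustration, not part of the disclosure.

```python
import numpy as np

def pick_reference_image(pair_matches, num_images, dist_threshold=10.0):
    """Select the preset reference image from the original image set.

    pair_matches: dict mapping an (i, j) pair of image indices to an
    (N, 2, 2) array of second feature matching point pairs, where
    pts[k, 0] and pts[k, 1] are the two matched image coordinates.
    """
    # Euclidean distance of each matching point pair, per image pair.
    dists = {key: np.linalg.norm(pts[:, 0, :] - pts[:, 1, :], axis=1)
             for key, pts in pair_matches.items()}

    # Proceed only if no pairwise mean distance exceeds the threshold.
    if any(d.mean() > dist_threshold for d in dists.values()):
        raise ValueError("image set too unstable to select a reference image")

    # Jitter parameter value: sum of squared distances over every image
    # pair that involves the candidate image.
    jitter = np.zeros(num_images)
    for (i, j), d in dists.items():
        ssq = float((d ** 2).sum())
        jitter[i] += ssq
        jitter[j] += ssq

    return int(jitter.argmin())   # index of the preset reference image
```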
Further, the processing module 44 is further configured to acquire a moving target image set;
Further, the target detection module 41 is further configured to perform moving target detection on the moving target image set to obtain initial target points;
further, the processing module 44 is further configured to:
taking the initial target points belonging to a preset target category as result target points;
clustering the result target points to determine result reference points;
acquiring the actual space coordinates of the result reference points;
and generating the preset homography matrix according to the image coordinates of the result reference points and their actual space coordinates (see the sketch below).
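Given at least four result reference points whose image coordinates and actual space coordinates are both known, the preset homography matrix can be estimated, for example, with OpenCV; the RANSAC option and its reprojection tolerance are assumptions of this sketch.

```python
import numpy as np
import cv2

def build_homography(image_pts, world_pts):
    """image_pts, world_pts: (N, 2) arrays, N >= 4, in corresponding order.

    Returns the 3x3 preset homography mapping the image coordinates of the
    result reference points to their actual space coordinates.
    """
    H, _mask = cv2.findHomography(
        np.asarray(image_pts, dtype=np.float32),
        np.asarray(world_pts, dtype=np.float32),
        cv2.RANSAC,
        5.0)  # reprojection tolerance in pixels (assumed value)
    return H
```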
Further, the processing module 44 is further configured to:
clustering the result target points according to a preset cluster number and a preset cluster distance to obtain at least four target point classes;
taking the center point of each target point class as an initial reference point;
generating a circumscribed rectangle of the initial reference points;
and taking the union of the vertices of the circumscribed rectangle and the initial reference points as the result reference points, as in the sketch below.
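A sketch of this reference-point construction follows. Plain k-means on the preset cluster number alone is a simplification; the disclosure additionally constrains the clustering by a preset cluster distance, which this sketch omits.

```python
import numpy as np
from sklearn.cluster import KMeans

def result_reference_points(target_pts, n_clusters=4):
    """target_pts: (N, 2) image coordinates of the result target points."""
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(np.asarray(target_pts))
    centers = km.cluster_centers_             # initial reference points

    # Circumscribed (axis-aligned) rectangle of the initial reference points.
    x0, y0 = centers.min(axis=0)
    x1, y1 = centers.max(axis=0)
    corners = np.array([[x0, y0], [x1, y0], [x1, y1], [x0, y1]])

    # Result reference points: union of rectangle vertices and centers.
    return np.vstack([centers, corners])
```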
The target positioning device provided in this embodiment is configured to execute the technical solution in any of the foregoing method embodiments, and its implementation principle and technical effects are similar, and are not described herein again.
Fig. 5 is a schematic structural diagram of an electronic device provided in the present application. As shown in fig. 5, the electronic device 50 includes:
a processor 51, a memory 52, and a communication interface 53;
the memory 52 is configured to store executable instructions of the processor 51;
wherein the processor 51 is configured to perform the technical solution of any of the method embodiments described above via execution of the executable instructions.
Alternatively, the memory 52 may be separate or integrated with the processor 51.
Optionally, when the memory 52 is a device separate from the processor 51, the electronic device 50 may further include:
a bus 54, where the memory 52 and the communication interface 53 are connected to the processor 51 via the bus 54 and communicate with one another through it, and the communication interface 53 is used to communicate with other devices.
Alternatively, the communication interface 53 may be implemented by a transceiver. The communication interface is used to enable communication between the database access apparatus and other devices (e.g., clients, read-write libraries, and read-only libraries). The memory may comprise random access memory (RAM) and may also include non-volatile memory, such as at least one magnetic disk memory.
The bus 54 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is drawn as a single bold line in the figures, but this does not mean that there is only one bus or only one type of bus.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The electronic device is configured to execute the technical scheme in any of the foregoing method embodiments, and its implementation principle and technical effects are similar, and are not described herein again.
The embodiment of the application also provides a readable storage medium, on which a computer program is stored, which when executed by a processor implements the technical solution provided by any of the foregoing method embodiments.
The embodiments of the present application also provide a computer program product, which includes a computer program, where the computer program is used to implement the technical solution provided by any of the foregoing method embodiments when executed by a processor.
Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the method embodiments described above may be completed by hardware instructed by program instructions. The foregoing program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the method embodiments described above; and the foregoing storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced with equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A method of locating a target, comprising:
performing target detection processing on the acquired image to be processed to obtain at least one target and the image coordinates of each target;
performing feature extraction processing on the image to be processed to obtain at least one first feature point corresponding to the image to be processed and a first descriptor corresponding to each first feature point;
matching the first feature points and the second feature points according to the image coordinates of the first feature points, the first descriptors, the image coordinates of the second feature points corresponding to a preset reference image, and the second descriptor corresponding to each second feature point, to obtain at least one first feature matching point pair;
for each first feature matching point pair, taking the difference of the image coordinates of the two feature points in the pair to obtain the offset coordinate corresponding to that first feature matching point pair;
calculating an image offset coordinate according to the offset coordinates corresponding to the at least one first feature matching point pair;
and calculating the actual space coordinates of each target according to the image coordinates of each target, the image offset coordinate and a preset homography matrix.
2. The method of claim 1, wherein calculating an image offset coordinate according to the offset coordinates corresponding to the at least one first feature matching point pair comprises:
clustering the offset coordinates corresponding to the at least one first feature matching point pair by adopting a density-based clustering algorithm to obtain at least one coordinate class;
taking the coordinate class containing the largest number of offset coordinates among the at least one coordinate class as a result coordinate class;
and taking the average value of the offset coordinates in the result coordinate class as the image offset coordinate.
3. The method according to claim 1, wherein after performing feature extraction processing on the image to be processed to obtain at least one first feature point corresponding to the image to be processed and a first descriptor corresponding to each first feature point, the method further includes:
deleting, from the at least one first feature point, the first feature points falling within a preset position range of the image to be processed, to obtain third feature points;
deleting, from the second feature points, the second feature points falling within the preset position range of the preset reference image, to obtain fourth feature points;
Correspondingly, matching the first feature points and the second feature points according to the image coordinates of the first feature points, the first descriptors, the image coordinates of the second feature points corresponding to the preset reference image and the second descriptor corresponding to each second feature point, to obtain at least one first feature matching point pair, comprises:
and matching the third feature points and the fourth feature points according to the image coordinates of the third feature points, the first descriptor corresponding to each third feature point, the image coordinates of the fourth feature points and the second descriptor corresponding to each fourth feature point, to obtain the at least one first feature matching point pair.
4. The method of claim 1, wherein before performing target detection processing on the acquired image to be processed to obtain at least one target and the image coordinates of each target, the method further comprises:
acquiring an original image set;
performing feature extraction processing on each image to be selected in the original image set to obtain at least one second feature point corresponding to each image to be selected and a second descriptor corresponding to each second feature point;
for every two images to be selected in the original image set, matching the second feature points corresponding to the two images to be selected according to the image coordinates of those second feature points and their corresponding second descriptors, to obtain second feature matching point pairs corresponding to the two images to be selected;
calculating the matching point pair coordinate distance average value corresponding to the two images to be selected according to the image coordinates of the fifth feature points in the second feature matching point pairs;
if none of the matching point pair coordinate distance average values exceeds the preset distance threshold, calculating, for each image to be selected, the jitter parameter value according to the second feature matching point pairs between the image to be selected and each other image in the original image set;
and taking the image to be selected with the smallest jitter parameter value in the original image set as the preset reference image.
5. The method according to claim 4, wherein calculating the jitter parameter value corresponding to the image to be selected according to the second feature matching point pairs between the image to be selected and each image in the original image set other than the image to be selected comprises:
calculating the Euclidean distances of the second feature matching point pairs between the image to be selected and each image in the original image set other than the image to be selected;
and taking the sum of the squares of these Euclidean distances as the jitter parameter value corresponding to the image to be selected.
6. The method according to claim 4, wherein the method further comprises:
acquiring a moving target image set;
performing moving target detection on the moving target image set to obtain initial target points;
taking the initial target points belonging to a preset target category as result target points;
clustering the result target points to determine result reference points;
acquiring the actual space coordinates of the result reference points;
and generating the preset homography matrix according to the image coordinates of the result reference points and their actual space coordinates.
7. The method of claim 6, wherein clustering the result target points to determine result reference points comprises:
clustering the result target points according to a preset cluster number and a preset cluster distance to obtain at least four target point classes;
taking the center point of each target point class as an initial reference point;
generating a circumscribed rectangle of the initial reference points;
and taking the union of the vertices of the circumscribed rectangle and the initial reference points as the result reference points.
8. A target positioning device, comprising:
the target detection module is used for performing target detection processing on the acquired image to be processed to obtain at least one target and the image coordinates of each target;
the feature extraction module is used for carrying out feature extraction processing on the image to be processed to obtain at least one first feature point corresponding to the image to be processed and a first descriptor corresponding to each first feature point;
the matching module is used for matching the first feature points and the second feature points according to the image coordinates of the first feature points, the first descriptors, the image coordinates of the second feature points corresponding to the preset reference image and the second descriptor corresponding to each second feature point, to obtain at least one first feature matching point pair;
a processing module for:
for each first feature matching point pair, taking the difference of the image coordinates of the two feature points in the pair to obtain the offset coordinate corresponding to that first feature matching point pair;
calculating an image offset coordinate according to the offset coordinates corresponding to the at least one first feature matching point pair;
and calculating the actual space coordinates of each target according to the image coordinates of each target, the image offset coordinate and a preset homography matrix.
9. An electronic device, comprising:
a processor, a memory, a communication interface;
the memory is used for storing executable instructions of the processor;
wherein the processor is configured to perform the object localization method of any one of claims 1 to 7 via execution of the executable instructions.
10. A readable storage medium having stored thereon a computer program, which when executed by a processor implements the object localization method of any of claims 1 to 7.
CN202311303921.0A 2023-10-09 2023-10-09 Target positioning method, device, equipment and medium Pending CN117333686A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311303921.0A CN117333686A (en) 2023-10-09 2023-10-09 Target positioning method, device, equipment and medium


Publications (1)

Publication Number Publication Date
CN117333686A 2024-01-02

Family

ID=89278771

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311303921.0A Pending CN117333686A (en) 2023-10-09 2023-10-09 Target positioning method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN117333686A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination