WO2022199360A1

WO2022199360A1 - Moving object positioning method and apparatus, electronic device, and storage medium

Info

Publication number: WO2022199360A1
Application number: PCT/CN2022/079340
Authority: WO
Inventors: 吴臻志; 马欣; 祝夭龙
Original assignee: 北京灵汐科技有限公司
Priority date: 2021-03-23
Filing date: 2022-03-04
Publication date: 2022-09-29
Also published as: CN113012200A; CN113012200B

Abstract

A moving object positioning method, comprising: acquiring event stream information by means of a dynamic vision sensor, and acquiring a sampling event frame according to the event stream information; acquiring, according to the sampling event frame, the number of pixels respectively corresponding to at least two event count thresholds in an event count threshold set; determining a target event count threshold according to the number of pixels respectively corresponding to the at least two event count thresholds; and determining a target pixel the number of corresponding generated events of which is greater than or equal to the target event count threshold, and determining a location area of a moving object according to the target pixel. Also disclosed are a moving object positioning apparatus, an electronic device, and a storage medium.

Description

Positioning method, device, electronic device and storage medium for moving objects

Technical field

The present disclosure relates to the technical field of image recognition, and in particular, to a method, an apparatus, an electronic device, and a computer-readable storage medium for locating a moving object.

Background

With the continuous advancement of science and technology, image recognition technology has developed rapidly and is widely used in various fields. The positioning of moving objects in images has become an important branch of image recognition technology.

In the related art, image recognition technology usually uses an image classification model to extract image features from the whole of each acquired video image, judges from the extracted features whether a moving object is present in the image, and determines the position of the moving object in the image.

However, in such an image recognition method, feature extraction requires a large amount of computation, so moving objects are located slowly and real-time positioning is difficult to achieve; the positioning effect is especially poor for small moving objects.

Summary of the invention

Embodiments of the present disclosure provide a method, apparatus, electronic device, and computer-readable storage medium for locating a moving object, so as to locate the moving object in an image.

In a first aspect, an embodiment of the present disclosure provides a method for locating a moving object, where the method includes:

acquiring event stream information through a dynamic vision sensor, and acquiring a sampled event frame according to the event stream information;

acquiring, according to the sampled event frame, the number of pixels respectively corresponding to at least two event count thresholds in an event count threshold set, where the pixels corresponding to an event count threshold are the pixels whose number of generated events is greater than or equal to that event count threshold;

determining a target event count threshold according to the numbers of pixels respectively corresponding to the at least two event count thresholds; and

determining target pixels in the sampled event frame whose number of generated events is greater than or equal to the target event count threshold, and determining a location area of the moving object according to the target pixels.

In a second aspect, an embodiment of the present disclosure provides a positioning device for a moving object, and the positioning device includes:

an event frame acquisition module, configured to acquire event stream information through a dynamic vision sensor, and to acquire a sampled event frame according to the event stream information;

a threshold acquisition module, configured to acquire, according to the sampled event frame, the number of pixels respectively corresponding to at least two event count thresholds in an event count threshold set, where the pixels corresponding to an event count threshold are the pixels whose number of generated events is greater than or equal to that event count threshold, and to determine a target event count threshold according to the numbers of pixels respectively corresponding to the at least two event count thresholds; and

a location area acquisition module, configured to determine target pixels in the sampled event frame whose number of generated events is greater than or equal to the target event count threshold, and to determine a location area of the moving object according to the target pixels.

In a third aspect, an embodiment of the present disclosure provides an electronic device, the electronic device comprising:

one or more processors;

a memory for storing one or more programs,

where, when the one or more programs are executed by the one or more processors, the one or more processors implement the method for locating a moving object according to any embodiment of the present disclosure.

In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the method for locating a moving object described in any embodiment of the present disclosure.

According to the technical solutions disclosed in the embodiments of the present disclosure, after the sampled event frame is acquired, the target event count threshold is determined according to the numbers of pixels respectively corresponding to the at least two event count thresholds; the target pixels whose number of generated events is greater than or equal to the target event count threshold are then obtained; and finally the location area of the moving object is determined according to all the target pixels, thereby positioning the moving object. No image feature extraction or feature computation is needed when positioning the moving object, which effectively saves computing resources, improves the recognition efficiency of moving objects, and allows small moving objects to be positioned accurately.

Description of drawings

FIG. 1 is a schematic flowchart of a method for locating a moving object according to an embodiment of the present disclosure;

FIG. 2 is a schematic flowchart of another method for locating a moving object according to an embodiment of the present disclosure;

FIG. 3 is a schematic flowchart of another method for locating a moving object according to an embodiment of the present disclosure;

FIG. 4 is a schematic flowchart of another method for locating a moving object according to an embodiment of the present disclosure;

FIG. 5 is a structural block diagram of a device for positioning a moving object according to an embodiment of the present disclosure;

FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.

Detailed description

The present disclosure will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present disclosure, but not to limit the present disclosure. In addition, it should be noted that, for the convenience of description, the drawings only show some but not all structures related to the present disclosure.

FIG. 1 is a schematic flowchart of a method for locating a moving object provided by an embodiment of the present disclosure. The positioning method can be used to locate a moving object in a video image. It may be performed by a positioning apparatus that is implemented in software and/or hardware and integrated into an electronic device. The method may include the following steps S110 to S140.

Step S110: Acquire event stream information through a dynamic vision sensor, and acquire sampled event frames according to the event stream information.

A dynamic vision sensor (DVS) is an image acquisition device that uses an asynchronous pixel mechanism and is based on address-event representation (AER). Unlike conventional approaches that sequentially read all the pixel information in each "frame", a DVS does not need to read every pixel in the picture; it only needs to obtain the addresses and information of the pixels whose light intensity has changed.

When the dynamic vision sensor detects that the light intensity change of a pixel is greater than or equal to a preset threshold, it sends out an event signal for that pixel. If the change is positive, that is, the pixel jumps from low brightness to high brightness, an event signal represented by "+1" is sent and the event is marked as a positive event. If the change is negative, that is, the pixel jumps from high brightness to low brightness, an event signal represented by "-1" is sent and the event is marked as a negative event. If the light intensity change is less than the preset threshold, no event signal is sent and the pixel is marked as having no event. The dynamic vision sensor forms the event stream information by marking the event signals generated at each pixel; the event stream information records the events generated at each pixel of the picture collected by the dynamic vision sensor.
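The marking rule described above can be sketched as follows. This is a minimal illustration, not the sensor's actual circuit behaviour: the function name `event_signal` and its parameters are hypothetical, and comparing raw intensity values directly is a simplifying assumption.

```python
def event_signal(prev_intensity, curr_intensity, preset_threshold):
    """Return +1 (positive event), -1 (negative event) or 0 (no event).

    Hypothetical helper: compares two intensity readings of one pixel.
    """
    if preset_threshold <= 0:
        raise ValueError("preset_threshold must be positive")
    change = curr_intensity - prev_intensity
    if abs(change) < preset_threshold:
        return 0  # change below the preset threshold: no event signal
    return 1 if change > 0 else -1  # brighter -> +1, darker -> -1
```

A change exactly equal to the preset threshold produces an event, matching the "greater than or equal to" wording of the text.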

Compared with the background, whose light intensity changes little, the light intensity of the pixels in the area that a moving object passes through changes to varying degrees. For example, when a moving object appears, the light intensity of the pixels in the area where it appears increases significantly; when the moving object disappears, the light intensity of the pixels in the area it has left decreases significantly. Therefore, the event stream information can be used to determine which pixels in the picture may belong to a moving object.

In some embodiments, the event stream information collected by the dynamic vision sensor may be sampled according to a preset sampling period to obtain sampled event frames.

Within the preset sampling period, if the event stream information of a pixel includes a positive event or a negative event, that pixel may be related to a moving object. The sampled event frame is an image frame that summarizes all the events marked for each pixel of the captured picture within the preset sampling period; it describes the events (such as positive events or negative events) that occurred at every pixel in the picture.

The preset sampling period can be set according to actual needs. For example, to improve the detection efficiency for moving objects in video images, the preset sampling period can be set to a lower value; to reduce the processing load of the sampled images, it can be set to a higher value. In particular, because of the high detection precision of the DVS, the event signal of a pixel can be detected at the nanosecond level (for example, once every 1000 nanoseconds), whereas the preset sampling period is usually set at the millisecond level (for example, 10 milliseconds). Therefore, within one sampling period the light intensity of a pixel may change several times, so the DVS sends out multiple event signals for that pixel; in other words, one pixel generates multiple events.
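As an illustration of the sampling step, the sketch below accumulates an event stream into per-period frames that hold, for each pixel, the number of events generated in that period. The event tuple layout `(t_us, x, y, polarity)`, the function name, and the choice to count positive and negative events together are assumptions made for this example; the disclosure does not fix a data format.

```python
def sample_event_frames(events, width, height, period_us):
    """Group events into sampling periods and count events per pixel.

    events: iterable of (t_us, x, y, polarity) tuples (assumed layout).
    Returns a list of frames ordered by time; frame[y][x] is the number
    of events the pixel generated in that period. Periods that contain
    no events are omitted.
    """
    frames = {}
    for t_us, x, y, polarity in events:
        if polarity == 0:
            continue  # "no event" entries do not count
        index = t_us // period_us  # which sampling period the event falls in
        frame = frames.setdefault(
            index, [[0] * width for _ in range(height)]
        )
        frame[y][x] += 1
    return [frames[i] for i in sorted(frames)]
```

With a 10-millisecond period (`period_us=10_000`), a pixel that fires twice in the first period ends up with a count of 2 in the first frame.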

Step S120: Acquire, according to the sampled event frame, the number of pixels respectively corresponding to at least two event count thresholds in the event count threshold set.

Step S130: Determine the target event count threshold according to the numbers of pixels respectively corresponding to the at least two event count thresholds, where the pixels corresponding to each event count threshold are the pixels whose number of generated events is greater than or equal to that event count threshold.

In the sampled event frame, when the number of events generated at a pixel is greater than or equal to an event count threshold, the pixel is a pixel corresponding to that event count threshold.

In this embodiment of the present disclosure, an event count threshold is the minimum number of times the DVS must send an event signal for the same pixel within one sampling period. For example, if the event count threshold set contains two event count thresholds, one configured as 5 and the other as 6, then step S120 obtains the number of pixels in the sampled event frame that generated event signals at least 5 times within the sampling period, and the number of pixels that generated event signals at least 6 times.
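A minimal sketch of the counting in step S120, assuming the sampled event frame is stored as a 2D list of per-pixel event counts (the representation and the function name are illustrative assumptions):

```python
def pixel_counts_per_threshold(frame, thresholds):
    """Map each event count threshold to the number of pixels whose
    event count is greater than or equal to that threshold."""
    counts = {}
    for threshold in thresholds:
        counts[threshold] = sum(
            1 for row in frame for n in row if n >= threshold
        )
    return counts
```

For a frame `[[0, 5, 6], [7, 4, 5]]` and the thresholds 5 and 6 from the example above, four pixels meet threshold 5 and two pixels meet threshold 6.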

Note that the higher the event count threshold, the fewer pixels have an event count greater than or equal to it. Such pixels are less likely to be noise points (that is, falsely detected interference points), but because there are few of them they may not accurately describe the actual location area of the moving object. Conversely, the lower the event count threshold, the more pixels have an event count greater than or equal to it, and the more likely the area they occupy contains noise points; but because there are many of them, they can describe the actual motion area of the moving object more accurately. Therefore, a target event count threshold needs to be determined among the at least two event count thresholds, and the required pixels are selected with it, so that the actual motion area of the moving object is described by more pixels while the occurrence of noise points is effectively reduced.

In this embodiment of the present disclosure, the event count threshold set is a preconfigured set of at least two event count thresholds. The minimum event count threshold in the set may be preset according to actual needs; for example, it may be set to 1 or to another small value. The maximum event count threshold in the set may likewise be preset according to actual needs; for example, it may be set to a larger value (such as 50). In some embodiments, the event count threshold set may include multiple consecutive event count thresholds. For example, when the maximum event count threshold is 50, the set includes the values 1 to 50, for a total of 50 event count thresholds, and in step S130 the target event count threshold is determined among these 50 event count thresholds.

Step S140: Determine the target pixels in the sampled event frame whose number of generated events is greater than or equal to the target event count threshold, and determine the location area of the moving object according to the target pixels.

In step S140, after the target pixels whose number of events is greater than or equal to the target event count threshold are determined in the sampled event frame, all target pixels are divided into one or more densely distributed areas according to the proximity principle. If there is only one moving object in the sampled event frame, there is one densely distributed area of target pixels; if there are multiple moving objects, there are multiple densely distributed areas of target pixels. Connecting the outer-edge pixels of a densely distributed area yields the contour of the moving object in that area, from which the location area of the moving object can be determined.
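One possible way to group target pixels into densely distributed areas is sketched below. The disclosure does not define the "proximity principle" precisely; this example assumes it means 8-connectivity (pixels that touch, including diagonally, belong to the same area), and the function name is hypothetical. Contour extraction is left out.

```python
from collections import deque


def dense_regions(frame, target_threshold):
    """Group target pixels into 8-connected regions (assumed proximity rule).

    frame[y][x] is the per-pixel event count. Returns a sorted list of
    regions, each a sorted list of (x, y) target pixels.
    """
    height, width = len(frame), len(frame[0])
    targets = {
        (x, y)
        for y in range(height)
        for x in range(width)
        if frame[y][x] >= target_threshold
    }
    regions = []
    while targets:
        seed = targets.pop()
        region, queue = [seed], deque([seed])
        while queue:  # breadth-first flood fill from the seed pixel
            cx, cy = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbour = (cx + dx, cy + dy)
                    if neighbour in targets:
                        targets.remove(neighbour)
                        region.append(neighbour)
                        queue.append(neighbour)
        regions.append(sorted(region))
    return sorted(regions)
```

Each returned region corresponds to one candidate moving object; a distance-based clustering rule could be substituted if touching pixels are too strict a criterion.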

According to the technical solution of the method for locating a moving object provided by the embodiment of the present disclosure, after the sampled event frame is acquired, the target event count threshold is determined according to the numbers of pixels respectively corresponding to at least two event count thresholds; the target pixels whose number of generated events is greater than or equal to the target event count threshold are then obtained; and finally the location area of the moving object is determined according to all the target pixels, thereby positioning the moving object. No image feature extraction or feature computation is needed when positioning the moving object, which effectively saves computing resources, improves the recognition efficiency of moving objects, and allows small moving objects to be positioned accurately.

FIG. 2 is a schematic flowchart of another method for locating a moving object according to an embodiment of the present disclosure. In some embodiments of the present disclosure, as shown in FIG. 2, before the step of acquiring, according to the sampled event frame, the number of pixels respectively corresponding to at least two event count thresholds in the event count threshold set (that is, before step S120), the positioning method may further include steps S111 and S112.

Step S111: Determine, according to the sampled event frame, the candidate pixels that generated at least one event.

Step S112: Determine a matching maximum event count threshold according to the number of candidate pixels, so as to determine the event count threshold set.

In step S111, because the sampled event frame records the events (such as positive events or negative events) that occurred at every pixel in the picture collected by the dynamic vision sensor, all pixels that generated at least one event can be determined from the sampled event frame and taken as candidate pixels.

In some embodiments of the present disclosure, the larger the number of candidate pixels, the larger the location area occupied by the target moving object in the image, or the larger the total location area occupied by multiple moving objects. Correspondingly, a larger number of pixels is needed to delineate the actual location area of the moving object, so the maximum event count threshold in the event count threshold set can be set to a smaller value to obtain as many pixels as possible. The smaller the number of candidate pixels, the smaller the location area occupied by the target moving object in the image, or the smaller the total location area occupied by multiple moving objects. Correspondingly, only a smaller number of pixels is needed to delineate the actual location area of the moving object, so the maximum event count threshold in the set can be set to a larger value to reduce the occurrence of noise points. In step S112, the matching maximum event count threshold is obtained according to the number of candidate pixels, and the event count threshold set is determined from it, which can effectively improve the efficiency of obtaining the target event count threshold. For example, the event count threshold set can be configured to include the consecutive values from 1 up to the maximum event count threshold.

In some embodiments of the present disclosure, the step of determining the matching maximum event count threshold according to the number of candidate pixels may further include: obtaining the maximum event count threshold that matches the number of candidate pixels according to a predetermined correspondence between the number of candidate pixels and the maximum event count threshold.

In some embodiments of the present disclosure, the correspondence between the number of candidate pixels and the maximum event count threshold may be obtained through a pixel threshold comparison table or a preset calculation rule. The pixel threshold comparison table describes the correspondence between the number of candidate pixels and the maximum event count threshold: after the number of candidate pixels is obtained, the corresponding maximum event count threshold is looked up in the table according to the numerical interval in which the number of candidate pixels falls. Alternatively, the number of candidate pixels can be substituted as a known parameter into a pre-built calculation formula, following the preset calculation rule, to obtain the corresponding maximum event count threshold; the formula can be set according to actual needs.
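The table lookup can be sketched as follows. Every number in `PIXEL_THRESHOLD_TABLE` and the default value are invented placeholders for illustration; the disclosure gives no concrete table. The only property taken from the text is that more candidate pixels map to a smaller maximum event count threshold.

```python
# Hypothetical comparison table:
# (inclusive upper bound on candidate pixel count, maximum event count threshold)
PIXEL_THRESHOLD_TABLE = [(1_000, 50), (10_000, 30), (100_000, 20)]
DEFAULT_MAX_THRESHOLD = 10  # assumed value for counts above the last interval


def count_candidates(frame):
    """Number of pixels that generated at least one event (step S111)."""
    return sum(1 for row in frame for n in row if n >= 1)


def max_event_count_threshold(num_candidates):
    """Look up the maximum threshold for the interval the count falls in."""
    for upper_bound, max_threshold in PIXEL_THRESHOLD_TABLE:
        if num_candidates <= upper_bound:
            return max_threshold
    return DEFAULT_MAX_THRESHOLD


def event_count_threshold_set(num_candidates):
    """Consecutive thresholds from 1 up to the matched maximum (step S112)."""
    return list(range(1, max_event_count_threshold(num_candidates) + 1))
```

A formula-based rule would replace `max_event_count_threshold` with a function of `num_candidates`; the rest of the sketch stays the same.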

In some embodiments of the present disclosure, the step of determining the target event count threshold according to the numbers of pixels respectively corresponding to the at least two event count thresholds (that is, step S130) may further include: determining a critical event count threshold according to the numbers of pixels respectively corresponding to the at least two event count thresholds, and using the critical event count threshold as the target event count threshold.

In some embodiments of the present disclosure, step S130 may instead include: determining a critical event count threshold according to the numbers of pixels respectively corresponding to the at least two event count thresholds; determining an intermediate event count threshold according to the maximum event count threshold in the event count threshold set and the critical event count threshold; and using the intermediate event count threshold as the target event count threshold.

When the event count threshold is lower than the critical event count threshold, the number of pixels representing the motion area of the object increases significantly, but because of noise points the resulting location area of the moving object has a large error. Therefore, in some embodiments, the critical event count threshold, that is, the last event count threshold before the number of pixels increases substantially, may be used as the target event count threshold.

The pixels obtained with the critical event count threshold still contain a certain amount of noise, although the number of noise points does not increase significantly as it does with lower event count thresholds. Therefore, in other embodiments, to further reduce the influence of noise points on the location area of the moving object, an intermediate event count threshold lying between the critical event count threshold and the maximum event count threshold in the event count threshold set may be used as the screening condition. The intermediate event count threshold is used as the target event count threshold to obtain the corresponding target pixels, from which the location area of the moving object is determined. For example, if the critical event count threshold is 7 and the maximum event count threshold is 11, the intermediate event count threshold between the two (that is, 9) is selected as the screening condition for obtaining the target pixels.
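The text gives a single example (critical 7, maximum 11, intermediate 9), which is consistent with taking the midpoint. The sketch below assumes the midpoint rule, rounded down when the sum is odd; other choices between the two bounds would also satisfy the description.

```python
def intermediate_threshold(critical_threshold, max_threshold):
    """Midpoint between the critical and maximum event count thresholds.

    Assumption: integer midpoint, rounded down when the sum is odd.
    """
    if max_threshold < critical_threshold:
        raise ValueError("max_threshold must not be below critical_threshold")
    return (critical_threshold + max_threshold) // 2
```

For the example in the text, `intermediate_threshold(7, 11)` evaluates to 9.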

In some embodiments of the present disclosure, the step of determining the critical event count threshold according to the numbers of pixels respectively corresponding to the at least two event count thresholds may further include: performing a difference operation on the numbers of pixels corresponding to every two adjacent event count thresholds in the event count threshold set to obtain difference results; selecting, among the difference results, the target difference result with the largest value; and using the larger of the two event count thresholds corresponding to the target difference result as the critical event count threshold. For every two adjacent event count thresholds in the set, the difference result is the absolute value of the difference between the numbers of pixels corresponding to the two thresholds.

Because the event count threshold set includes multiple consecutive event count thresholds, the number of pixels corresponding to each event count threshold is obtained, the difference between the numbers of pixels corresponding to every two adjacent event count thresholds is calculated, the two event count thresholds associated with the largest difference result are identified, and the larger of the two is selected as the critical event count threshold. For example, suppose the event count threshold set includes 8 event count thresholds with the values 11, 10, 9, 8, 7, 6, 5, and 4, and the corresponding numbers of pixels are 80,000, 100,000, 120,000, 150,000, 180,000, 200,000, 270,000, and 300,000, respectively. The differences between the numbers of pixels for adjacent event count thresholds are then 20,000, 20,000, 30,000, 30,000, 20,000, 70,000, and 30,000. The largest difference is 70,000, and the two corresponding event count thresholds are 6 and 5; therefore, the event count threshold 6 is determined as the critical event count threshold.
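The largest-difference rule can be sketched as below, using the worked example as input. The dictionary representation and the function name are illustrative assumptions; when two differences tie, this sketch keeps the pair with the lower thresholds, a choice the text does not address.

```python
def critical_threshold(pixel_counts):
    """Return the larger threshold of the adjacent pair whose pixel counts
    differ the most.

    pixel_counts maps event count threshold -> number of pixels whose
    event count is >= that threshold. "Adjacent" means neighbouring
    values after sorting the thresholds.
    """
    thresholds = sorted(pixel_counts)
    if len(thresholds) < 2:
        raise ValueError("at least two event count thresholds are required")
    best_diff, best_threshold = -1, None
    for lower, upper in zip(thresholds, thresholds[1:]):
        diff = abs(pixel_counts[lower] - pixel_counts[upper])
        if diff > best_diff:
            best_diff, best_threshold = diff, upper
    return best_threshold


# Values from the worked example in the text.
example_counts = {
    11: 80_000, 10: 100_000, 9: 120_000, 8: 150_000,
    7: 180_000, 6: 200_000, 5: 270_000, 4: 300_000,
}
print(critical_threshold(example_counts))  # 6
```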

In some embodiments of the present disclosure, when the numbers of pixels corresponding to two adjacent event count thresholds in the event count threshold set differ markedly (an obvious increase or decrease), the larger of the two adjacent event count thresholds is used as the critical event count threshold.

In some embodiments of the present disclosure, the step of determining the critical event count threshold according to the numbers of pixels respectively corresponding to the at least two event count thresholds may further include: if the difference result for the numbers of pixels corresponding to any two adjacent event count thresholds in the event count threshold set is greater than or equal to a preset number threshold, using the larger of the two adjacent event count thresholds as the critical event count threshold. The difference result is the absolute value of the difference between the numbers of pixels corresponding to the two adjacent event count thresholds.

In other embodiments of the present disclosure, the step of determining the critical event count threshold may further include: if the ratio of the difference result for the numbers of pixels corresponding to any two adjacent event count thresholds in the event count threshold set to the total number of pixels of the sampled event frame is greater than or equal to a preset percentage threshold, using the larger of the two adjacent event count thresholds as the critical event count threshold.

For example, suppose the preset number threshold is 50,000. For the adjacent event count thresholds 6 and 5 above, the difference between the corresponding numbers of pixels is 70,000, which is greater than the preset number threshold of 50,000. Therefore, the larger of the event count thresholds 6 and 5, that is, 6, is used as the critical event count threshold. The difference operation then does not need to be performed for the remaining pairs of adjacent event count thresholds, which reduces the amount of computation and speeds up acquisition of the critical event count threshold.

As another example, suppose the preset percentage threshold is 10%. The total number of pixels in the sampled event frame can be determined from the resolution of the dynamic vision sensor; for example, let it be 600,000. For the adjacent event count thresholds 6 and 5 above, the difference between the corresponding numbers of pixels is 70,000, and the ratio of this difference to the total number of pixels in the sampled event frame is 70,000 ÷ 600,000 ≈ 11.7%, which is greater than the preset percentage threshold of 10%. Therefore, the larger of the event count thresholds 6 and 5, that is, 6, is used as the critical event count threshold. The difference and ratio operations then do not need to be performed for the remaining pairs of adjacent event count thresholds, which reduces the amount of computation and speeds up acquisition of the critical event count threshold.
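Both early-exit variants can be sketched with one scanning helper. The scan order is an assumption: this example walks from the highest threshold downward and stops at the first adjacent pair that satisfies the test, since the text does not state an order. All function names are hypothetical.

```python
def first_critical_threshold(pixel_counts, is_jump):
    """Scan adjacent threshold pairs from high to low and return the larger
    threshold of the first pair whose pixel-count difference satisfies
    is_jump; return None if no pair qualifies."""
    thresholds = sorted(pixel_counts, reverse=True)
    for upper, lower in zip(thresholds, thresholds[1:]):
        diff = abs(pixel_counts[upper] - pixel_counts[lower])
        if is_jump(diff):
            return upper  # remaining pairs are never evaluated
    return None


def critical_by_number(pixel_counts, number_threshold):
    """Variant using a preset number threshold on the difference."""
    return first_critical_threshold(
        pixel_counts, lambda diff: diff >= number_threshold
    )


def critical_by_percentage(pixel_counts, total_pixels, percentage_threshold):
    """Variant using a preset percentage threshold (as a fraction, 0.10 = 10%)."""
    return first_critical_threshold(
        pixel_counts, lambda diff: diff / total_pixels >= percentage_threshold
    )
```

With the pixel counts from the worked example, a number threshold of 50,000 and a percentage threshold of 10% over 600,000 pixels both select 6. When no pair qualifies, the helpers return `None`, and a caller would need a fallback such as the largest-difference rule.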

FIG. 3 is a schematic flowchart of another method for locating a moving object according to an embodiment of the present disclosure. In some embodiments of the present disclosure, as shown in FIG. 3, after the step of determining, according to the sampled event frame, the candidate pixels that generated at least one event (that is, after step S111), the positioning method further includes step S113.

Step S113: Perform lateral inhibition processing on the region of the sampled event frame in which the candidate pixels are located.

Further, the step of acquiring, according to the sampled event frame, the number of pixels respectively corresponding to at least two event count thresholds in the event count threshold set (that is, step S120) may further include: acquiring, according to the sampled event frame after lateral inhibition processing, the number of pixels respectively corresponding to at least two event count thresholds in the event count threshold set.

Lateral inhibition is the inhibitory effect that occurs between adjacent neurons: when one neuron is stimulated and becomes excited and an adjacent neuron is then stimulated, the excitation of the latter (the adjacent neuron) inhibits the former (the first neuron). Lateral inhibition is essentially mutual inhibition between adjacent receptors. In some embodiments of the present disclosure, performing lateral inhibition processing on the region in which the candidate pixels are located enhances the display of the candidate pixels and suppresses the background pixels in that region.
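The disclosure does not specify how lateral inhibition is computed. Purely as an illustration, the sketch below uses a simple centre-surround rule: each pixel's event count is reduced by a fraction of the mean count of its neighbours and clipped at zero. The rule itself, the 3×3 neighbourhood, and the `strength` parameter are all assumptions of this example, not the patented processing.

```python
def lateral_inhibition(frame, strength=0.5):
    """Centre-surround suppression over a 3x3 neighbourhood (assumed rule).

    frame[y][x] is the per-pixel event count. Returns a new frame of
    floats in which weak pixels next to strong ones are suppressed.
    """
    height, width = len(frame), len(frame[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            neighbours = [
                frame[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (nx, ny) != (x, y)
            ]
            surround = sum(neighbours) / len(neighbours) if neighbours else 0.0
            out[y][x] = max(0.0, frame[y][x] - strength * surround)
    return out
```

In this sketch a pixel with a high count next to quiet neighbours is nearly unchanged, while a low-count pixel beside a strong one drops to zero, which is the contrast-enhancing behaviour the text describes.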

In some embodiments of the present disclosure, in step S140 , determining the location area of the moving object according to the target pixel point may further include: marking the location area of the moving object through a region of interest frame according to the target pixel point.

Among them, a region of interest (ROI) is an area to be processed that is outlined by a box, circle, ellipse, polygon or the like. Because the acquired contour information of the moving object is usually an irregular figure, which is inconvenient to locate in the image, in some embodiments of the present disclosure the smallest square that contains the outline of the moving object can be marked in the image by means of a square marking frame; the square marking frame and the area within it constitute the location area of the moving object.
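Exemplarily, marking the location area from the target pixel points can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `bounding_box` and the `(x, y)` coordinate tuples are hypothetical, and the "square marking frame" is interpreted here as the smallest axis-aligned rectangle enclosing all target pixel points, which is only one possible reading of the text.

```python
from typing import Iterable, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def bounding_box(target_pixels: Iterable[Tuple[int, int]]) -> Optional[Box]:
    """Return the smallest axis-aligned box containing all target pixels,
    or None when there are no target pixels."""
    pixels = list(target_pixels)
    if not pixels:
        return None
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return min(xs), min(ys), max(xs), max(ys)


# Irregularly shaped set of target pixels (x, y).
roi = bounding_box([(4, 7), (5, 6), (6, 9), (5, 8)])
print(roi)
```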

FIG. 4 is a schematic flowchart of another method for locating a moving object according to an embodiment of the present disclosure. In some embodiments of the present disclosure, as shown in FIG. 4, after the step of determining the location area of the moving object according to the target pixel point, that is, after step S140, the positioning method further includes step S141.

Step S141 , determining the moving trajectory of the moving object according to the position regions of the moving object in the multiple sampling event frames, and determining whether the moving trajectory is the target trajectory through the trained image classification model.

In the location area of the moving object in each sampling event frame, the center point of the location area is used as the moving point of the moving object; after multiple consecutive sampling event frames are superimposed, the movement trajectory of the moving object, composed of the multiple moving points, is obtained. The image classification model is a classification model pre-trained on sample images. It extracts image features from the input image information to obtain a feature vector, and then outputs the corresponding image classification probability according to the feature vector. The image classification probability represents the probability that the input image information is a positive sample or a negative sample; classification (that is, binary classification) is then performed according to the image classification probability to determine whether the input image is a target trajectory. The type of the target trajectory is determined by the trajectory type of the positive sample images. For example, with a high-altitude parabolic trajectory as the target trajectory, the model determines whether the movement trajectory of the moving object in the sampling event frames is a high-altitude parabolic trajectory.
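Exemplarily, building the movement trajectory from the location areas of consecutive sampling event frames can be sketched as follows. This is a sketch under stated assumptions: the function name `trajectory_from_boxes`, the `(x_min, y_min, x_max, y_max)` box layout and the choice to skip frames without a location area are hypothetical. The classification of the resulting trajectory by the trained model is not shown.

```python
from typing import List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)
Point = Tuple[float, float]


def trajectory_from_boxes(boxes: Sequence[Optional[Box]]) -> List[Point]:
    """Use the center point of each frame's location area as the moving point
    and return the moving points in frame order."""
    points: List[Point] = []
    for box in boxes:
        if box is None:
            continue  # assumption: frames with no located object are skipped
        x_min, y_min, x_max, y_max = box
        points.append(((x_min + x_max) / 2, (y_min + y_max) / 2))
    return points


# Location areas from four consecutive frames; the third frame has no object.
track = trajectory_from_boxes([(0, 0, 2, 2), (2, 3, 4, 5), None, (4, 8, 6, 10)])
print(track)
```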

In some embodiments of the present disclosure, before judging whether the movement trajectory is the target trajectory through the trained image classification model, the positioning method further includes: constructing an initial image classification model based on a convolutional neural network, and performing image recognition and classification training on the initial image classification model with a sample image set to obtain the trained image classification model.

Among them, a convolutional neural network (CNN) is a feedforward neural network with a deep structure that includes convolution computation and is used in deep learning. The convolution operation improves the extraction accuracy of image features, and the pooling layer reduces the computational complexity of the image features.

Exemplarily, in the sample image set, the positive sample images are high-altitude parabolic trajectory images, and the output value for a positive sample image is 1. The negative sample images are image information that does not include a high-altitude parabolic trajectory or a high-altitude falling object trajectory, and are of various types, for example, bird flight trajectory images, silhouette flashing trajectory images, blank-pixel images, and upward parabolic movement trajectory images. The output value for a negative sample image is 0.

The initial image classification model is trained by a sample image set composed of positive sample images and negative sample images, so that the trained image classification model has image recognition and classification capabilities.

FIG. 5 is a structural block diagram of a device for positioning a moving object according to an embodiment of the present disclosure. The positioning device 200 specifically includes: an event frame obtaining module 210 , a threshold value obtaining module 220 and a location area obtaining module 230 .

The event frame obtaining module 210 is configured to obtain event flow information through the dynamic vision sensor, and obtain sampled event frames according to the event flow information.

The threshold obtaining module 220 is configured to: obtain, according to the sampled event frame, the numbers of pixels respectively corresponding to at least two event count thresholds in the event count threshold set, wherein the pixel points corresponding to an event count threshold are the pixel points whose number of generated events is greater than or equal to that event count threshold; and determine the target event count threshold according to the numbers of pixels respectively corresponding to the at least two event count thresholds.

The location area acquisition module 230 is configured to determine the target pixel points in the sampled event frame with the corresponding event times greater than or equal to the target event times threshold, and determine the location area of the moving object according to the target pixel points.
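Exemplarily, the counting performed by the threshold obtaining module 220 and the selection performed by the location area acquisition module 230 can be sketched as follows. This is a minimal sketch under stated assumptions: the sampled event frame is represented as a 2-D list holding the number of events generated at each pixel, and the function names `count_pixels_per_threshold` and `select_target_pixels` are hypothetical. How the target event count threshold is chosen from the counts is not shown here.

```python
from typing import Dict, Iterable, List, Tuple

Frame = List[List[int]]  # frame[y][x] = number of events generated at pixel (x, y)


def count_pixels_per_threshold(frame: Frame, thresholds: Iterable[int]) -> Dict[int, int]:
    """For each event count threshold, count the pixels whose number of
    generated events is greater than or equal to that threshold."""
    return {t: sum(1 for row in frame for n in row if n >= t) for t in thresholds}


def select_target_pixels(frame: Frame, target_threshold: int) -> List[Tuple[int, int]]:
    """Return the (x, y) coordinates of pixels whose number of generated
    events is greater than or equal to the target event count threshold."""
    return [
        (x, y)
        for y, row in enumerate(frame)
        for x, n in enumerate(row)
        if n >= target_threshold
    ]


frame = [
    [0, 1, 0],
    [2, 5, 6],
    [0, 7, 1],
]
counts = count_pixels_per_threshold(frame, [1, 2, 5])
targets = select_target_pixels(frame, target_threshold=5)
print(counts, targets)
```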

According to the technical solution of the device for positioning a moving object provided by the embodiment of the present disclosure, after the sampling event frame is acquired, the target event count threshold is determined according to the numbers of pixels respectively corresponding to at least two event count thresholds; the target pixel points whose number of generated events is greater than or equal to the target event count threshold are then determined; and finally the location area of the moving object is determined according to all the target pixel points, thereby positioning the moving object. No image feature extraction or computation process needs to be performed on the image, which effectively saves computing resources, improves the recognition efficiency of moving objects, and allows accurate positioning of small-volume moving objects.

In some embodiments, on the basis of the above technical solutions, the apparatus 200 for positioning a moving object may further include: a candidate pixel point acquisition module and a threshold set determination module.

The candidate pixel point acquiring module is configured to determine candidate pixel points having at least one event according to the sampled event frame.

The threshold set determination module is configured to determine a matching maximum event count threshold according to the number of candidate pixel points, so as to determine the event count threshold set.

In some embodiments, on the basis of the above technical solutions, the threshold set determination module is configured to obtain the maximum event count threshold matching the number of candidate pixel points according to a predetermined correspondence between the number of candidate pixel points and the maximum event count threshold.

In some embodiments, based on the above technical solutions, the threshold obtaining module 220 is configured to: determine the critical event count threshold according to the numbers of pixels respectively corresponding to the at least two event count thresholds, and use the critical event count threshold as the target event count threshold. Alternatively, the threshold obtaining module 220 is configured to: determine the critical event count threshold according to the numbers of pixels respectively corresponding to the at least two event count thresholds; determine an intermediate event count threshold according to the maximum event count threshold in the event count threshold set and the critical event count threshold; and use the intermediate event count threshold as the target event count threshold.

In some embodiments, based on the above technical solutions, the threshold value obtaining module 220 specifically includes: a difference value result obtaining unit and a threshold value obtaining unit.

The difference result obtaining unit is configured to perform a difference operation on the number of pixels corresponding to every two adjacent event times thresholds in the event times threshold set, and obtain a difference result.

The threshold obtaining unit is used to select the target difference result with the largest value among the difference results, and use the larger value of the two event count thresholds corresponding to the target difference result as the critical event count threshold.
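Exemplarily, the cooperation of the difference result obtaining unit and the threshold obtaining unit can be sketched as follows. This is a sketch under stated assumptions: the function name `critical_threshold_by_max_diff`, the dictionary layout and the pixel counts are hypothetical, and when several difference results share the largest value, the pair with the smaller thresholds is kept, which is a choice made here rather than one stated in the text.

```python
from typing import Dict, Optional


def critical_threshold_by_max_diff(pixel_counts: Dict[int, int]) -> Optional[int]:
    """Compute the pixel-count difference for every two adjacent thresholds and
    return the larger threshold of the pair with the largest difference."""
    thresholds = sorted(pixel_counts)
    best_upper: Optional[int] = None
    best_diff = -1
    for lower, upper in zip(thresholds, thresholds[1:]):
        # Difference result for one adjacent pair of event count thresholds.
        diff = pixel_counts[lower] - pixel_counts[upper]
        if diff > best_diff:  # strict comparison keeps the first maximum on ties
            best_upper, best_diff = upper, diff
    return best_upper  # None when fewer than two thresholds are given


# Hypothetical pixel counts for event count thresholds 1 to 4.
counts = {1: 120, 2: 110, 3: 40, 4: 35}
critical = critical_threshold_by_max_diff(counts)
print(critical)
```

With these hypothetical counts, the adjacent differences are 10, 70 and 5, so the pair (2, 3) gives the target difference result and its larger threshold is selected.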

In some embodiments, on the basis of the above technical solutions, the threshold obtaining module 220 is configured to: when the difference operation result of the numbers of pixels corresponding to any two adjacent event count thresholds in the event count threshold set is greater than or equal to a preset number threshold, use the larger value of the two adjacent event count thresholds as the critical event count threshold. Alternatively, the threshold obtaining module 220 is configured to: when the ratio of the difference operation result of the numbers of pixels corresponding to any two adjacent event count thresholds in the event count threshold set to the total number of pixels in the sampling event frame is greater than or equal to a preset percentage threshold, use the larger value of the two adjacent event count thresholds as the critical event count threshold.

In some embodiments, on the basis of the above technical solutions, the location area acquisition module 230 is configured to mark the location area of the moving object through a region of interest frame according to the target pixel point.

In some embodiments, based on the above technical solutions, the apparatus 200 for positioning a moving object further includes: a lateral inhibition processing execution module.

The lateral inhibition processing execution module is configured to perform lateral inhibition processing on the region where the candidate pixel points are located in the sampling event frame.

In some embodiments, based on the above technical solutions, the threshold obtaining module 220 is configured to obtain, according to the sampled event frame after lateral inhibition processing, the numbers of pixels respectively corresponding to at least two event count thresholds in the event count threshold set.

In some embodiments, on the basis of the above technical solutions, the apparatus 200 for positioning a moving object further includes: a movement track acquisition module.

The moving track acquisition module is used to determine the moving track of the moving object according to the position area of the moving object in the multiple sampling event frames, and judge whether the moving track is the target track through the image classification model completed by training.

In some embodiments, based on the above technical solutions, the moving object positioning apparatus 200 further includes: an image classification model acquisition module.

The image classification model acquisition module is used to construct an initial image classification model based on the convolutional neural network, and perform image recognition and classification training on the initial image classification model through the sample image set to obtain the trained image classification model.

The above apparatus can execute the method for positioning a moving object provided by any embodiment of the present disclosure, and has functional modules and beneficial effects corresponding to the execution method. For technical details not described in detail in this embodiment, reference may be made to the method provided by any embodiment of the present disclosure.

FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. FIG. 6 shows a block diagram of an exemplary electronic device 12 suitable for use in implementing embodiments of the present disclosure. The electronic device 12 shown in FIG. 6 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.

As shown in FIG. 6, the electronic device 12 takes the form of a general-purpose computing device. Components of electronic device 12 may include, but are not limited to, one or more processors or processing units 16 , memory 28 , and a bus 18 connecting various system components including memory 28 and processing unit 16 .

Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.

Electronic device 12 typically includes a variety of computer system readable media. These media can be any available media that can be accessed by electronic device 12, including both volatile and non-volatile media, removable and non-removable media.

Memory 28 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32. Electronic device 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in FIG. 6, commonly referred to as a "hard disk drive"). Although not shown in FIG. 6, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from and writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM or other optical media), may be provided. In these cases, each drive may be connected to bus 18 through one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to perform the functions of various embodiments of the present disclosure.

A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described in this disclosure.

The electronic device 12 may also communicate with one or more external devices 14 (e.g., a keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with the electronic device 12, and/or with any device (e.g., network card, modem, etc.) that enables the electronic device 12 to communicate with one or more other computing devices. Such communication may take place through input/output (I/O) interface 22. Also, the electronic device 12 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 20. As shown, network adapter 20 communicates with other modules of electronic device 12 via bus 18. It should be understood that, although not shown, other hardware and/or software modules may be used in conjunction with electronic device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives and data backup storage systems.

The processing unit 16 executes various functional applications and data processing by running the programs stored in the memory 28, for example, implementing the method for positioning a moving object provided by any embodiment of the present disclosure. An embodiment of the present disclosure also provides a computer-readable storage medium, on which a computer program is stored, and when the program is executed by a processor, implements the method for locating a moving object according to any embodiment of the present disclosure.

The computer storage medium of the embodiments of the present disclosure may adopt any combination of one or more computer-readable media. The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples (a non-exhaustive list) of computer readable storage media include: electrical connections having one or more wires, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), Erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing. In this document, a computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.

A computer-readable signal medium may include a propagated data signal in baseband or as part of a carrier wave, with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).

Note that the above are only preferred embodiments of the present disclosure and the technical principles applied. Those skilled in the art will understand that the present disclosure is not limited to the specific embodiments described herein, and that various obvious changes, readjustments and substitutions can be made by those skilled in the art without departing from the scope of protection of the present disclosure. Therefore, although the present disclosure has been described in detail through the above embodiments, the present disclosure is not limited to the above embodiments and can include further equivalent embodiments without departing from the concept of the present disclosure; the scope of the present disclosure is determined by the scope of the appended claims.

Claims

A method for positioning a moving object, comprising:

Obtain event flow information through a dynamic vision sensor, and obtain sampled event frames according to the event flow information;

According to the sampled event frame, the number of pixels respectively corresponding to at least two event number thresholds in the event number threshold set is obtained, wherein the pixel points corresponding to an event number threshold are the pixel points whose number of generated events is greater than or equal to that event number threshold;

Determine the target event number threshold according to the number of pixels corresponding to the at least two event number thresholds respectively;

Determine the target pixel points in the sampled event frame with the corresponding event times greater than or equal to the target event times threshold, and determine the position area of the moving object according to the target pixel points.
The method according to claim 1, characterized in that, before acquiring the number of pixels corresponding to at least two event number thresholds in the event number threshold set according to the sampled event frame, the method further comprises:

According to the sampled event frame, determine a candidate pixel point having at least one event;

According to the number of the candidate pixel points, a matching maximum event number threshold is determined to determine the event number threshold set.
The method according to claim 2, wherein the determining a matching maximum event number threshold according to the number of the candidate pixel points comprises:

According to a predetermined correspondence between the number of candidate pixel points and the maximum event number threshold, the maximum event number threshold matching the number of the candidate pixel points is acquired.
The method according to claim 1, wherein the determining the threshold for the number of target events according to the number of pixels corresponding to the at least two thresholds for the number of events comprises:

A critical event number threshold is determined according to the number of pixels corresponding to the at least two event number thresholds respectively, and the critical event number threshold is used as the target event number threshold.
The method according to claim 1, wherein the determining the threshold for the number of target events according to the number of pixels corresponding to the at least two thresholds for the number of events comprises:

Determine the critical event number threshold according to the number of pixels corresponding to the at least two event number thresholds respectively;

An intermediate event number threshold is determined according to the maximum event number threshold and the critical event number threshold in the event number threshold set, and the intermediate event number threshold is used as the target event number threshold.
The method according to claim 4 or 5, wherein the determining the threshold of the number of critical events according to the number of pixels corresponding to the at least two thresholds of the number of events, comprises:

Perform a difference operation on the number of pixels corresponding to every two adjacent event times thresholds in the event number threshold set, and obtain a difference result;

The target difference result with the largest value is selected from each of the difference results, and the larger value of the two event times thresholds corresponding to the target difference result is used as the critical event times threshold.
The method according to claim 4 or 5, wherein the determining the threshold of the number of critical events according to the number of pixels corresponding to the at least two thresholds of the number of events, comprises:

If the difference operation result of the numbers of pixels corresponding to any two adjacent event number thresholds in the event number threshold set is greater than or equal to a preset number threshold, the larger value of the two adjacent event number thresholds is used as the critical event number threshold; or,

If the ratio of the difference operation result of the numbers of pixels corresponding to any two adjacent event number thresholds in the event number threshold set to the total number of pixels in the sampled event frame is greater than or equal to a preset percentage threshold, the larger value of the two adjacent event number thresholds is used as the critical event number threshold.
The method according to claim 1, wherein the determining the position area of the moving object according to the target pixel point comprises:

According to the target pixel point, the position area of the moving object is marked by a region of interest frame.
The method according to claim 2, wherein after determining the candidate pixel point having at least one event according to the sampled event frame, the method further comprises:

Perform lateral inhibition processing on the area where the candidate pixel points are located in the sampling event frame;

The acquiring, according to the sampled event frame, the number of pixels corresponding to at least two event number thresholds in the event number threshold set comprises:

According to the sampled event frame after the lateral inhibition processing, the number of pixels respectively corresponding to the at least two event number thresholds is acquired.
The method according to claim 1, characterized in that after determining the position area of the moving object according to the target pixel, the method further comprises:

Determine the moving trajectory of the moving object according to the position areas of the moving objects in the plurality of sampling event frames, and determine whether the moving trajectory is the target trajectory through the image classification model that has been trained.
A positioning device for a moving object, comprising:

an event frame acquisition module, used for acquiring event stream information through a dynamic vision sensor, and acquiring sampling event frames according to the event stream information;

A threshold acquisition module, configured to acquire, according to the sampled event frame, the number of pixels respectively corresponding to at least two event number thresholds in the event number threshold set, where the pixel points corresponding to an event number threshold are the pixel points whose number of generated events is greater than or equal to that event number threshold; and determine the target event number threshold according to the number of pixels respectively corresponding to the at least two event number thresholds;

A location area acquisition module, configured to determine a target pixel point in the sampled event frame whose number of events is greater than or equal to the target event number threshold, and determine a location area of a moving object according to the target pixel point.
An electronic device, characterized in that the electronic device comprises:

one or more processors;

memory for storing one or more programs,

When the one or more programs are executed by the one or more processors, the one or more processors implement the method for locating a moving object according to any one of claims 1-10.
A computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, the method for locating a moving object according to any one of claims 1-10 is implemented.