CN115049990A - Data processing method, device and storage medium - Google Patents

Data processing method, device and storage medium

Info

Publication number: CN115049990A
Authority: CN (China)
Prior art keywords: mapping, image, road, pixel, processed
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN202210531411.8A
Other languages: Chinese (zh)
Inventors: 张彪 (Zhang Biao), 高强华 (Gao Qianghua), 邓兵 (Deng Bing)
Current assignee: Alibaba Group Holding Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original assignee: Alibaba Group Holding Ltd
Application filed by: Alibaba Group Holding Ltd
Priority: CN202210531411.8A

Classifications

    • G06V 20/54: Scenes; scene-specific elements; surveillance or monitoring of traffic activities, e.g. cars on the road, trains or boats
    • G06N 3/08: Computing arrangements based on biological models; neural networks; learning methods
    • G06V 10/20: Arrangements for image or video recognition or understanding; image preprocessing
    • G06V 10/774: Recognition using pattern recognition or machine learning; generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/82: Recognition using pattern recognition or machine learning; using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

Embodiments of the present application provide a data processing method, a data processing device, and a storage medium. In the data processing method, after the road area in an image to be processed is identified, the road area is mapped into a target image of a set size, which removes redundant information from the image to be processed and increases the effective information density of the mapped target image. Object detection is then performed on this target image of higher effective information density, which improves the accuracy of the recognition result.

Description

Data processing method, device and storage medium
Technical Field
The present application relates to the field of computer vision technologies, and in particular, to a data processing method, device, and storage medium.
Background
In intelligent traffic scenarios, a video surveillance system can be deployed along a road to capture road images, and a dedicated algorithm can be used to identify objects in those images. Based on the identified objects, object tracking can be performed or abnormal events detected.
In general, road images captured by such a video surveillance system have a low effective information density, which makes it difficult to accurately identify target objects in them. A new solution is therefore needed.
Disclosure of Invention
Aspects of the present application provide a data processing method, device, and storage medium for increasing the effective information density of an image.
An embodiment of the present application provides a data processing method, including: acquiring an image to be processed; identifying a road area in the image to be processed; mapping the road area into a target image of a set size according to a set mapping algorithm; and performing object detection on the target image to identify the objects contained in the road area.
An embodiment of the present application further provides a data processing method, including: acquiring a road area in a road image, where objects in the road area are marked with first bounding boxes; mapping the road area and the first bounding boxes according to a set mapping algorithm to obtain a training sample of a set size and second bounding boxes on the training sample; and inputting the training sample into a neural network model and training the model's object detection capability with the second bounding boxes as the supervision signal.
An embodiment of the present application further provides an electronic device, including a memory and a processor. The memory is configured to store one or more computer instructions; the processor is configured to execute the one or more computer instructions to perform the data processing method provided by the embodiments of the present application.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the data processing method provided by the embodiments of the present application.
According to the data processing method, after the road area in the image to be processed is identified, the road area is mapped into a target image of a set size, which removes redundant information from the image to be processed and increases the effective information density of the mapped target image. Performing object detection on this target image of higher effective information density improves the accuracy of the recognition result.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a schematic flowchart of a data processing method according to an exemplary embodiment of the present application;
fig. 2a is a schematic flowchart of a data processing method according to another exemplary embodiment of the present application;
FIG. 2b is a schematic representation of the geometric features of a road region provided by an exemplary embodiment of the present application;
FIG. 2c is a schematic representation of the geometric features of a road region provided by another exemplary embodiment of the present application;
FIG. 2d is a schematic diagram illustrating an image mapping effect according to an exemplary embodiment of the present application;
FIG. 2e is a schematic flow chart diagram illustrating a data processing method according to another exemplary embodiment of the present application;
FIG. 3 is a schematic flow chart diagram of a data processing method according to another exemplary embodiment of the present application;
FIG. 4 is a diagram illustrating an example application scenario provided by an exemplary embodiment of the present application;
fig. 5 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the field of intelligent traffic perception, public traffic video surveillance systems are commonly deployed along roads to capture road footage in order to enable intelligent monitoring. Based on the captured road images, recognition, tracking, and abnormal-event discovery can be performed by a computer instead of a human. Typically, when a computer identifies target objects, it can perceptually analyze the video images acquired by the monitoring devices using deep-learning methods. In the traffic perception field, a computer can perceive and analyze objects on a road, such as motor vehicles, non-motor vehicles, and pedestrians, based on the road video captured by the surveillance system.
However, the surveillance system is constrained by its viewpoint: in the road images it captures, objects on distant stretches of road occupy only a small fraction of the pixels, so the effective information density of the road image is low. In this case, the computer cannot effectively analyze objects on the distant road, and the intelligent traffic perception system therefore cannot effectively cover the distant area. In view of this technical problem, an embodiment of the present application provides a data processing method, described below with reference to the accompanying drawings.
Fig. 1 is a schematic flowchart of a data processing method according to an exemplary embodiment of the present application, and as shown in fig. 1, the method includes:
Step 101: acquiring an image to be processed.
Step 102: identifying a road area in the image to be processed.
Step 103: mapping the road area into a target image of a set size according to a set mapping algorithm.
Step 104: performing object detection on the target image to identify the objects contained in the road area.
The image to be processed contains an area produced by imaging a road, which can be described as the road area; the road area contains objects such as vehicles and pedestrians. The image to be processed may be captured by a public traffic video surveillance system installed along the road, by a dashboard camera mounted on a vehicle traveling on the road, or by a pedestrian using an image acquisition device such as a mobile phone or camera; this embodiment is not limited in this respect.
Object detection refers to a computer perceptually analyzing and marking the objects in a picture or video stream. Performing object detection on the image to be processed identifies objects such as vehicles and pedestrians on the road, whose flow characteristics can then be analyzed. When detecting objects in the image to be processed, the effective information required for detection is concentrated in the road area.
A road extends into the distance, so from the viewpoint of the image acquisition device it has both a near portion and a far portion. In an image captured by the device, the road area typically appears with the near portion occupying a large area and the far portion occupying a small one. Because the far portion occupies few pixels, its effective information density is low, which hinders object detection.
In this embodiment, a road area may be identified in the image to be processed and mapped to an image of a set size. For convenience of description and distinction, the image obtained by mapping is described as the target image. The image information contained in the target image consists of the image information contained in the road area. During mapping, small objects in the far area can be appropriately enlarged and large objects in the near area appropriately reduced, so that the pixel sizes of far and near objects in the mapped target image are roughly of the same order of magnitude, which greatly increases the effective information density.
The mapping algorithm may be determined according to the size proportions of the road area, its geometric features, its energy distribution characteristics, or its texture features; this embodiment is not limited in this respect.
After the target image is obtained by mapping, object detection can be performed on it. The target image's higher effective information density helps improve the accuracy of object detection. It should be understood that, because the target image is mapped from the road area, there is a correspondence between the objects on the target image and the objects contained in the road area: once the objects on the target image are detected, the objects contained in the road area can be derived from them.
In this embodiment, after the road area in the image to be processed is identified, the road area is mapped into a target image of a set size, which removes redundant information from the image to be processed and increases the effective information density of the mapped target image. Performing object detection on this target image of higher effective information density improves the accuracy of the recognition result.
In the above and following embodiments of the present application, any of several different mapping algorithms may be used to map the road area into a target image of a set size, such as the algorithms described above that are determined according to the size proportions of the road area, its geometric features, its energy distribution characteristics, or its texture features; this embodiment is not limited in this respect.
Optionally, in the algorithm for performing the mapping processing based on the size ratio of the road area, the size ratio of the road area may be calculated, and different local positions in the road area may be enlarged or reduced according to the size ratio to obtain the target image.
Optionally, in the algorithm for performing the mapping processing based on the geometric features of the road region, the road region may be mapped to the target image with the same size as the image to be processed according to the geometric features of the road region and the geometric features of the image to be processed.
Optionally, in an algorithm for performing mapping processing based on energy distribution characteristics of a road region, energy distribution of the road region may be detected, a local region with higher energy distribution may be subjected to amplification processing, and a local region with lower energy distribution may be subjected to reduction processing, so as to obtain a target image.
Optionally, in the algorithm for performing mapping processing based on the texture features of the road region, the texture distribution features in the road region may be identified, the local region with more concentrated texture distribution may be enlarged, and the local region with more dispersed texture distribution may be reduced, so as to obtain the target image.
In different embodiments of the present application, the mapping process of the road area may be implemented based on any one of the above embodiments or a combination of the above embodiments, and the present embodiment is not limited. In the following embodiments, an alternative implementation of the mapping process based on the geometric features of the road area will be exemplarily described with reference to fig. 2a, 2b, 2c, 2d, and 2 e.
Fig. 2a is a schematic flowchart of a data processing method according to another exemplary embodiment of the present application, and as shown in fig. 2a, the method includes:
Step 201: acquiring an image to be processed.
Step 202: identifying a road area in the image to be processed.
Step 203: determining a mapping relationship according to the geometric features of the image to be processed and of the road area.
Step 204: performing coordinate mapping on the pixels in the road area according to the mapping relationship to obtain the target image, whose size matches the image to be processed.
Step 205: identifying an object in the target image and calculating a first bounding box of the object.
Step 206: performing reverse mapping on the first bounding box according to the reverse mapping relationship corresponding to the mapping relationship, so as to map it to a second bounding box on the image to be processed.
Step 207: displaying the second bounding box on the image to be processed.
In step 201, an image to be processed is acquired. Optionally, the image to be processed may be obtained by sampling a road monitoring video.
In step 202, optionally, when a road region in the image to be processed is identified, edge detection may be performed on the image to be processed to obtain edge information included in the image to be processed; optionally, when performing edge detection on the image to be processed, an edge detection method based on search or an edge detection method based on zero crossing may be adopted, which is not limited in this embodiment.
After the edge information contained in the image to be processed is acquired, the road contour can be determined from it. In some cases, the road contour is trapezoidal or polygonal. For subsequent processing, optionally, an inscribed trapezoid of the road contour may be calculated to obtain a more regular shape that simplifies computing the image area, and the image area corresponding to the inscribed trapezoid is taken as the road area.
Fig. 2b and 2c illustrate two alternative ways of computing the inscribed trapezoid. As shown in fig. 2b, the road contour in the image to be processed is regular, and the shape of the inscribed trapezoid is close to the road contour. As shown in fig. 2c, the road contour in the image to be processed is an irregular polygon, whose inscribed trapezoid is slightly smaller than the road contour. In practice, to obtain a relatively regular road contour, capturing road images at road curves can be avoided, which improves the object detection effect.
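To make this extraction step concrete, the following is a minimal Python/OpenCV sketch of the pipeline (edge detection, contour selection, trapezoid approximation). It is illustrative only: the Canny thresholds, the largest-contour heuristic, and the corner construction from the top and bottom row extents are assumptions rather than the patent's exact inscribed-trapezoid computation.

```python
import cv2
import numpy as np

def extract_road_region(image_bgr):
    """Detect edges, take the largest contour as the road outline, and
    approximate an inscribed trapezoid from its top/bottom row extents."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)            # search-based edge detection
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    road = max(contours, key=cv2.contourArea)   # assume the road dominates

    mask = np.zeros(gray.shape, dtype=np.uint8)
    cv2.drawContours(mask, [road], -1, 255, thickness=cv2.FILLED)

    ys = np.where(mask.any(axis=1))[0]
    y_min, y_max = int(ys[0]), int(ys[-1])
    top = np.where(mask[y_min])[0]              # x-extent of the top row
    bot = np.where(mask[y_max])[0]              # x-extent of the bottom row
    trapezoid = np.array([[top[0], y_min], [top[-1], y_min],
                          [bot[-1], y_max], [bot[0], y_max]])
    return mask, trapezoid
```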
In step 203, the mapping relationship refers to a mapping relationship between the coordinates of the pixel on the image to be processed and the coordinates of the pixel on the target image. Optionally, a mapping coefficient corresponding to the coordinate of the pixel may be calculated first, and then the mapping relationship may be determined according to the mapping coefficient and the coordinate of the pixel. Optionally, the mapping coefficients include abscissa mapping coefficients, ordinate mapping coefficients, and nonlinear mapping coefficients.
The nonlinear mapping coefficients are used to implement the nonlinear mapping process. In some scenarios, when the road extends longitudinally in the captured road image, the nonlinear mapping coefficient may be applied to the vertical coordinate. When the road extends longitudinally, objects on the distant road are smaller and objects on the near road are larger. Based on the nonlinear coefficient of the vertical coordinate, the proportion of near objects in the road area can be reasonably compressed and the proportion of far objects enlarged, increasing the effective information density of the target image.
An alternative way of calculating the mapping coefficients and the mapping relationship is described below, taking an arbitrary pixel in the road area as an example. The geometric features of the image to be processed can include its length and height; the geometric features of the road area can include the abscissa range of the row in which each pixel is located within the road area and/or the ordinate range of the road area. In general, the ordinate range of the road area is the range spanned by the upper and lower bases of the road area's inscribed trapezoid.
Optionally, when calculating the abscissa mapping coefficient of the pixel, the abscissa range of the row in which the pixel is located in the road area may be obtained, and then the abscissa mapping coefficient of the pixel is calculated according to the abscissa range and the length of the image to be processed. Alternatively, a ratio of the length of the abscissa range to the length of the image to be processed may be calculated as the abscissa mapping coefficient. The following formula can be specifically referred to:
S1 = (x_max - x_min) / w    (Equation 1)

where S1 denotes the abscissa mapping coefficient of the pixel, x_max and x_min denote the maximum and minimum abscissa of the row in which the pixel is located within the road area, and w denotes the length of the image to be processed, as shown in fig. 2b and 2c.
Accordingly, after the abscissa mapping coefficient is obtained, the abscissa mapping relationship of the pixel can be calculated from the abscissa mapping coefficient and the pixel's abscissa. Specifically, the product of the pixel's abscissa and the abscissa mapping coefficient may be computed, and its sum with the minimum abscissa of the pixel's row in the road area taken as the abscissa mapping relationship. Denoting the pixel's coordinates by (x, y), the abscissa mapping relationship can be written as:
X = x_min + S1 * x    (Equation 2)

where X denotes the mapped abscissa.
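As a concrete reading of Equations 1 and 2, the short Python sketch below maps one pixel's abscissa; the row extents row_x_min and row_x_max are assumed to come from the inscribed trapezoid obtained earlier.

```python
def abscissa_map(x, row_x_min, row_x_max, w):
    """Equations 1-2: map abscissa x (0..w-1) to the abscissa X inside
    [row_x_min, row_x_max] of the pixel's row in the road area."""
    s1 = (row_x_max - row_x_min) / w   # abscissa mapping coefficient (Eq. 1)
    return row_x_min + s1 * x          # mapped abscissa X (Eq. 2)
```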
Optionally, when calculating the ordinate mapping coefficient of the pixel, the ordinate range of the road area may be obtained, and the ordinate mapping coefficient calculated from that range and the height of the image to be processed. Optionally, the ratio of the height of the ordinate range to the height of the image to be processed may be taken as the ordinate mapping coefficient. Specifically:
S2 = (y_max - y_min) / h    (Equation 3)

where S2 denotes the ordinate mapping coefficient of the pixel, y_max and y_min denote the maximum and minimum ordinate of the road area, and h denotes the height of the image to be processed, as shown in fig. 2b and 2c.
Optionally, when calculating the nonlinear mapping coefficient of a pixel, to keep the mapping result reasonable and usable, the road area may be further divided along the direction in which the road extends, and different nonlinear mapping coefficients calculated for the pixels in the different parts. The benefit of this division is that the road area is split into a near area and a far area: based on the near area's nonlinear coefficient, its pixels can be given a compressing nonlinear mapping, while based on the far area's nonlinear coefficient, its pixels can be given a stretching nonlinear mapping.
Optionally, the road area may be divided into two, three or four parts according to the central line of the image to be processed, which is not limited in this embodiment. Alternatively, when divided into two parts, one part includes pixels having an ordinate less than h/2, and the other part includes pixels having an ordinate greater than or equal to h/2.
Next, a nonlinear mapping coefficient of each pixel may be calculated according to the range to which the pixel's ordinate belongs and the pixel's abscissa mapping coefficient. One alternative form of the nonlinear mapping coefficient can be described with reference to the following equation:
S3 = f(y, S1)    (Equation 4: the nonlinear mapping coefficient, defined piecewise with one expression per ordinate range, as a function of the pixel's ordinate range and its abscissa mapping coefficient S1)
correspondingly, when the ordinate mapping relation of the pixel is calculated, the product of the ordinate of the pixel, the ordinate mapping coefficient and the nonlinear mapping coefficient can be calculated; then, the summation relation between the minimum value of the ordinate of the column where the pixel is located and the product is determined as the mapping relation of the ordinate of the pixel. The following formula can be specifically referred to:
Y = y_min + S2 * S3 * y    (Equation 5)

where Y denotes the mapped ordinate.
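The sketch below illustrates Equations 3 to 5 for one pixel, with the road area split into two ordinate ranges at h/2 as described above. Since the exact expression of Equation 4 is not recoverable from this text, the piecewise slopes (0.5 for the far half, 1.5 for the near half) are assumptions made purely for illustration; they keep the mapping continuous while stretching the far area and compressing the near one.

```python
def ordinate_map(y, y_min, y_max, h):
    """Equations 3-5 with an illustrative piecewise nonlinear coefficient."""
    s2 = (y_max - y_min) / h               # ordinate mapping coefficient (Eq. 3)
    if y < h / 2:                          # far (upper) half: sample slowly,
        g = 0.5 * y                        # which magnifies distant objects
    else:                                  # near (lower) half: sample quickly,
        g = 0.25 * h + 1.5 * (y - h / 2)   # which compresses nearby objects
    s3 = g / y if y > 0 else 0.5           # nonlinear coefficient (Eq. 4, assumed)
    return y_min + s2 * s3 * y             # mapped ordinate Y (Eq. 5)
```

Note that g(0) = 0 and g(h) = h, so the mapped ordinate spans exactly [y_min, y_max].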
After the abscissa mapping relationship and the ordinate mapping relationship are obtained, step 204 may be executed to perform coordinate mapping on pixels in the road area according to the mapping relationship to obtain a target image. Wherein the size of the target image is adapted to the image to be processed.
When performing coordinate mapping on the pixels in the road area, the mapped abscissa value of each pixel can be calculated from its abscissa mapping relationship and its abscissa value on the image to be processed, and the mapped ordinate value of each pixel from its ordinate mapping relationship and its ordinate value on the image to be processed, as in the following formula:
I'(x, y) = I(X(x, y), Y(x, y))    (Equation 6)

where I' denotes the target image obtained by mapping, I denotes the image to be processed, X denotes the abscissa mapping value, Y denotes the ordinate mapping value, and (x, y) denotes the pixel's coordinate values; that is, each pixel (x, y) of the target image takes its value from the image to be processed at the mapped coordinates (X, Y).
A typical mapping result is shown in fig. 2d: the trapezoidal road area in the image is mapped into a rectangular image of the same size as the original image, greatly increasing the effective information density of the mapped image.
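Putting the pieces together, the following sketch performs the warp of Equation 6 as a backward mapping: each target pixel (x, y) samples the source image at (X, Y). It reuses ordinate_map from the sketch above, and it interpolates each source row's x-extent linearly between the trapezoid's top and bottom edges, which is one consistent way of obtaining the per-row abscissa range; the patent text does not spell this detail out.

```python
import cv2
import numpy as np

def map_road_to_target(image, trapezoid):
    """Warp the trapezoidal road area onto a rectangle the size of `image`
    (Equation 6): target pixel (x, y) samples the source at (X, Y)."""
    h, w = image.shape[:2]
    (tx0, y_min), (tx1, _), (bx1, y_max), (bx0, _) = trapezoid  # TL, TR, BR, BL

    map_x = np.zeros((h, w), np.float32)
    map_y = np.zeros((h, w), np.float32)
    for y in range(h):
        Y = ordinate_map(y, y_min, y_max, h)        # Eq. 3-5
        t = (Y - y_min) / max(y_max - y_min, 1)     # 0 at top edge, 1 at bottom
        x_min = tx0 + t * (bx0 - tx0)               # left road edge at source row Y
        x_max = tx1 + t * (bx1 - tx1)               # right road edge at source row Y
        map_y[y, :] = Y
        map_x[y, :] = x_min + (x_max - x_min) / w * np.arange(w)  # Eq. 1-2
    return cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)      # Eq. 6
```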
Optionally, after the target image is acquired, in step 205, object detection may be performed on the target image to identify an object in the target image.
Optionally, in this implementation, object detection may be implemented based on a deep learning algorithm. Deep learning is a branch of machine learning: an algorithm that performs representation learning on data using artificial neural networks as the framework. Deep learning may use multiple processing layers, comprising complex structures or multiple nonlinear transformations, to abstract data at a high level and perform object recognition based on the features obtained from that abstraction.
Optionally, an algorithm model for object detection, i.e. the deep learning detector in fig. 2e, may be trained in advance on training samples using a deep learning algorithm. The algorithm model may be a neural network (NN) model. It should be noted that, as shown in fig. 2e, in the training stage, a road area may be identified in an existing road image and mapped as described in the foregoing embodiments to obtain a training sample. The training sample is an image with a high proportion of effective information. The mapping algorithm used on the road area in the road image is the same as the one used on the road area in the image to be processed, which ensures that the algorithm model can learn features well; the details are not repeated here.
Through this mapping, the road area is transformed into an image having the same size and shape as the original road image. In this process, small distant objects in the road image can be enlarged and relatively large nearby objects appropriately reduced, so that the pixels of near and far objects in the mapped training sample are almost of the same order of magnitude. In this case, the algorithm model can extract more effective information during object detection learning and achieve better learning performance.
Optionally, this embodiment adopts supervised model training. During supervised learning, a bounding box of each object is labeled on the training sample (the bounding box is the object's detection box, identifying its position), and the labeled bounding box serves as the bounding-box ground truth participating in the supervision of deep learning.
Optionally, as shown in fig. 2e, the bounding-box ground truth labeled on the training sample is obtained by mapping the bounding-box ground truth labeled on the road image. The mapping algorithm used on the bounding-box ground truth in the road image is the same as the one used on the road area in the image to be processed; the details are not repeated here. For example, when preparing a training sample, a road area may be extracted from a road image, and a bounding box marked on the road area according to each object's position. The road area and its bounding boxes are then mapped according to the set mapping algorithm to obtain the training sample and its bounding-box ground truth.
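A hedged sketch of this label-preparation step: the ground-truth box corners on the road image are pushed through the inverse of Equations 1 to 5 to obtain the bounding-box ground truth on the mapped training sample. The inversion below assumes the same illustrative piecewise coefficients used in the earlier sketches, so the function names and constants are assumptions, not the patent's exact formulas.

```python
def inverse_ordinate(Y, y_min, y_max, h):
    """Invert Equation 5 under the illustrative piecewise g(y) used above."""
    s2 = (y_max - y_min) / h
    g = (Y - y_min) / s2                  # g(y) = s3(y) * y
    if g < 0.25 * h:                      # source row came from the far half
        return g / 0.5
    return h / 2 + (g - 0.25 * h) / 1.5   # source row came from the near half

def box_to_training_sample(box, trapezoid, h, w):
    """Map a ground-truth box (X0, Y0, X1, Y1) on the road image to the
    second bounding box on the training sample, corner by corner."""
    (tx0, y_min), (tx1, _), (bx1, y_max), (bx0, _) = trapezoid
    def corner(X, Y):
        y = inverse_ordinate(Y, y_min, y_max, h)
        t = (Y - y_min) / max(y_max - y_min, 1)
        x_min = tx0 + t * (bx0 - tx0)
        x_max = tx1 + t * (bx1 - tx1)
        x = (X - x_min) * w / max(x_max - x_min, 1)   # invert Eq. 2
        return x, y
    X0, Y0, X1, Y1 = box
    x0, y0 = corner(X0, Y0)
    x1, y1 = corner(X1, Y1)
    return min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1)
```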
On this basis, after the road area in the image to be processed is mapped into the target image by the mapping algorithm, the target image may be input into the algorithm model, which identifies the objects in the target image and calculates their bounding boxes, as shown in fig. 2e. For convenience of description and distinction, a bounding box of an object identified in the target image is described as a first bounding box.
In step 206, optionally, after the first bounding box is obtained, it may be reverse-mapped onto the image to be processed. For convenience of description and distinction, the bounding box on the image to be processed is described as the second bounding box.
The mapping algorithm used to reverse-map the first bounding box is the inverse of the mapping algorithm used to map the road area in the image to be processed; the details are not repeated here. Through this reverse mapping, the object detection result on the target image can be mapped onto the image to be processed. Step 207 may then be executed to display the second bounding box on the image to be processed, where the second bounding box is the object's detection box on the image to be processed.
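The reverse mapping of a detected box can be sketched as follows. Because Equations 1 to 5 already express source-image coordinates (X, Y) in terms of target-image coordinates (x, y), mapping a first bounding box back onto the image to be processed amounts to evaluating them at the box corners; this reuses ordinate_map from the earlier sketch and the same assumed row-extent interpolation.

```python
def box_to_original(box, trapezoid, h, w):
    """Map a detected box (x0, y0, x1, y1) on the target image to the
    second bounding box on the image to be processed."""
    (tx0, y_min), (tx1, _), (bx1, y_max), (bx0, _) = trapezoid
    def corner(x, y):
        Y = ordinate_map(y, y_min, y_max, h)          # Eq. 3-5
        t = (Y - y_min) / max(y_max - y_min, 1)
        x_min = tx0 + t * (bx0 - tx0)
        x_max = tx1 + t * (bx1 - tx1)
        return x_min + (x_max - x_min) / w * x, Y     # Eq. 1-2
    x0, y0, x1, y1 = box
    X0, Y0 = corner(x0, y0)
    X1, Y1 = corner(x1, y1)
    return min(X0, X1), min(Y0, Y1), max(X0, X1), max(Y0, Y1)
```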
In this embodiment, after the road area in the image to be processed is identified, the road area is mapped into a target image of a set size, which removes redundant information from the image to be processed and increases the effective information density of the mapped target image. Performing object detection on this target image of higher effective information density improves the accuracy of the recognition result.
Fig. 3 is a schematic flow chart of a data processing method according to another exemplary embodiment of the present application, and as shown in the drawing, the method includes:
Step 301: acquiring a road area in a road image, where objects in the road area are marked with first bounding boxes.
Step 302: mapping the road area and the first bounding boxes according to a set mapping algorithm to obtain a training sample of a set size and second bounding boxes on the training sample.
Step 303: inputting the training sample into a neural network model and training the model's object detection capability with the second bounding boxes as the supervision signal.
For the mapping algorithm used on the road area and the first bounding boxes, refer to the description of the foregoing embodiments; the details are not repeated here. The training samples obtained by mapping are images with a high proportion of effective information, which facilitates feature learning by the neural network.
Optionally, the neural network (NN) model may be implemented as one or more of a convolutional neural network (CNN), a deep neural network (DNN), a graph convolutional network (GCN), a recurrent neural network (RNN), and a long short-term memory (LSTM) network, or as a variant of one or more of these networks; this embodiment is not limited in this respect.
In this embodiment, images with a high proportion of effective information are used as training samples when training the deep learning detection model, which greatly improves both the learning efficiency of the neural network model in the offline training stage and its detection accuracy in the online inference stage.
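A minimal sketch of the supervised training loop described here, assuming a PyTorch detector that follows the torchvision convention of returning a dictionary of losses when called with images and targets in training mode; the batch size, optimizer, and learning rate are illustrative.

```python
import torch
from torch.utils.data import DataLoader

def train_detector(model, dataset, epochs=10, lr=1e-4):
    """Each dataset item is a mapped training sample plus its second
    bounding boxes (e.g. {'boxes': ..., 'labels': ...}), which act as
    the supervision signal for the detector."""
    loader = DataLoader(dataset, batch_size=4, shuffle=True,
                        collate_fn=lambda batch: tuple(zip(*batch)))
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for images, targets in loader:
            loss_dict = model(list(images), list(targets))
            loss = sum(loss_dict.values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```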
Fig. 4 illustrates a typical application scenario of the embodiments of the present application. In fig. 4, the data processing method provided in this example is deployed in an intelligent traffic monitoring system, which may include a monitoring device 41 installed above a road, a server 42, and a management terminal 43. The monitoring device 41 may be implemented as a high-speed camera; the server 42 as a cloud server, data center, or the like; and the management terminal 43 as a user terminal of a traffic management unit, such as a computer, smartphone, or smart display screen, although this embodiment is not limited thereto.
After the monitoring device 41 captures the road monitoring video, it may send the video to the server 42. The server 42 samples the road monitoring video to obtain multiple frames of road images. Then, the server 42 may identify the road contour according to the distribution pattern and characteristics of the road in each road image, and extract the road area from the image according to that contour. The road contour is generally trapezoidal or polygonal. Next, the server 42 may map the road area into a target image of the same size as the original road image using the mapping method described in the foregoing embodiments.
Next, the server 42 may input the target image to the deep learning detector and acquire a detection result output by the deep learning detector. Based on the detection results, server 42 may employ a reverse mapping algorithm to map the detection results into the original road image to determine the target object in the road image. Next, the server 42 may issue the detection result of the target object in the road image to the management terminal 43 for display to the manager. Optionally, the server 42 may also calculate a moving track, a moving speed, and the like of the target object according to the target object detected in the multiple frames of continuous road images, and may issue the calculation result to the management terminal 43, which is not described again.
The foregoing embodiments describe the application of the data processing method provided in the present application in the field of intelligent transportation, and it should be understood that the data processing method can be extended to image processing in other fields besides the field of intelligent transportation.
For example, in some scenarios, the data processing method may be applied in the field of face recognition. In face recognition, constrained by the shooting angle and facial structure, different local areas of a face image have different proportions: the nose and forehead may occupy a large proportion of the image while the chin area occupies a small one, which hinders recognition. To improve recognition accuracy, the face region may be identified in the image and mapped to an image of a set size according to a mapping algorithm. The mapping removes information irrelevant to face recognition, enlarges facial areas with a small proportion, and compresses those with a large proportion. Performing face recognition on the mapped image effectively reduces the difficulty of recognizing small targets.
For another example, in other scenarios, the data processing method may be applied to target detection by unmanned aerial vehicles (UAVs). Images captured by UAVs typically contain both distant and nearby objects, and distant objects occupy a small proportion of the image, which hinders target detection. The data processing method provided by the embodiments of the present application can therefore be used to identify, in the UAV image, a region to be detected in which target objects may exist, and to map that region according to the mapping algorithm. The mapping enlarges local areas with a small proportion and compresses those with a large proportion, balancing the information content of far and near views and removing irrelevant information. Performing target detection on the mapped image allows both distant and nearby targets to be detected more accurately, providing the UAV with a better basis for obstacle avoidance and target tracking.
Of course, besides the above-mentioned fields, the method can also be extended to other fields requiring small target detection, and details are not repeated here.
It should be noted that, the executing subjects of the steps of the method provided in the foregoing embodiments may be the same device, or different devices may also be used as the executing subjects of the method. For example, the execution subjects of step 201 to step 203 may be device a; for another example, the execution subject of steps 201 and 202 may be device a, and the execution subject of step 203 may be device B; and so on.
In addition, in some of the flows described in the above embodiments and the drawings, a plurality of operations are included in a specific order, but it should be clearly understood that the operations may be executed out of the order presented herein or in parallel, and the sequence numbers of the operations, such as 201, 202, etc., are merely used for distinguishing different operations, and the sequence numbers do not represent any execution order per se. Additionally, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that, the descriptions of "first", "second", etc. in this document are used for distinguishing different messages, devices, modules, etc., and do not represent a sequential order, nor limit the types of "first" and "second" to be different.
Fig. 5 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application, and as shown in fig. 5, the electronic device includes: a memory 501 and a processor 502.
The memory 501 is used for storing computer programs and may be configured to store other various data to support operations on the electronic device. Examples of such data include instructions for any application or method operating on the electronic device, contact data, phonebook data, messages, pictures, videos, and so forth.
The memory 501 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
A processor 502, coupled to the memory 501, for executing computer programs in the memory 501 for: acquiring an image to be processed; identifying a road area in the image to be processed; mapping the road area into a target image with a set size according to a set mapping algorithm; and carrying out object detection on the target image so as to identify the object contained in the road area.
Further optionally, when mapping the road area into the target image of the set size according to the set mapping algorithm, the processor 502 is specifically configured to: determine a mapping relationship according to the geometric features of the image to be processed and of the road area; and perform coordinate mapping on the pixels in the road area according to the mapping relationship to obtain the target image, whose size matches the image to be processed.
Further optionally, when determining the mapping relationship according to the geometric features of the image to be processed and of the road area, the processor 502 is specifically configured to: for any pixel in the road area, calculate a mapping coefficient of the pixel according to the pixel's coordinates, the coordinate range of the road area, and/or the size of the image to be processed; and calculate a coordinate mapping relationship of the pixel according to the pixel's mapping coefficient and coordinates; where the mapping coefficients include at least one of an abscissa mapping coefficient, an ordinate mapping coefficient, and a nonlinear mapping coefficient.
Further optionally, when the processor 502 calculates the mapping coefficient of the pixel, it is specifically configured to: and calculating the abscissa mapping coefficient of the pixel according to the abscissa range of the row in which the pixel is positioned in the road area and the length of the image to be processed.
Further optionally, when the processor 502 calculates the mapping coefficient of the pixel according to the abscissa range of the row where the pixel is located in the road area and the length of the image to be processed, it is specifically configured to: and taking the ratio of the length of the abscissa range to the length of the image to be processed as the abscissa mapping coefficient.
Further optionally, when calculating the coordinate mapping relationship of the pixel according to the pixel's mapping coefficient and coordinates, the processor 502 is specifically configured to: calculate the product of the pixel's abscissa and the abscissa mapping coefficient; and determine the sum of that product and the minimum abscissa of the row in which the pixel is located within the road area as the pixel's abscissa mapping relationship.
Further optionally, when the processor 502 calculates the mapping coefficient of the pixel, it is specifically configured to: and calculating a vertical coordinate mapping coefficient of the pixel according to the vertical coordinate range of the road area and the height of the image to be processed.
Further optionally, when the processor 502 calculates the mapping coefficient of the pixel, it is specifically configured to: calculating an abscissa mapping coefficient of the pixel according to an abscissa range of the row where the pixel is located in the road area and the length of the image to be processed; and calculating the nonlinear mapping coefficient of the pixel according to the ordinate of the pixel, the range of the ordinate of the pixel and the abscissa mapping coefficient of the pixel.
Further optionally, when the processor 502 calculates the coordinate mapping relationship of the pixel according to the mapping coefficient of the pixel and the coordinate of the pixel, it is specifically configured to: calculating a product of the ordinate of the pixel, the ordinate mapping coefficient, and the nonlinear mapping coefficient; and determining the summation relation of the minimum value of the ordinate of the column where the pixel is located and the product as the ordinate mapping relation of the pixel.
Further optionally, when identifying the road region in the image to be processed, the processor 502 is specifically configured to: performing edge detection on the image to be processed to acquire edge information contained in the image to be processed; determining a road profile according to the edge information; and calculating an inscribed trapezoid of the road profile, and taking an image area corresponding to the inscribed trapezoid as the road area.
Further optionally, when performing object detection on the target image to identify the objects contained in the road area, the processor 502 is specifically configured to: identify an object in the target image and calculate a first bounding box of the object; reverse-map the first bounding box, according to the reverse mapping algorithm corresponding to the mapping algorithm, into a second bounding box on the image to be processed; and display the second bounding box on the image to be processed.
Further optionally, when identifying an object in the target image and calculating the object's first bounding box, the processor 502 is specifically configured to: input the target image into an algorithm model, so as to identify the object in the target image through the algorithm model and calculate the object's first bounding box. The training samples used to train the algorithm model are obtained by mapping road areas in road images according to the mapping algorithm, and the bounding-box ground truth labeled on the training samples is obtained by mapping the bounding-box ground truth labeled on the road images according to the mapping algorithm.
Further, as shown in fig. 5, the electronic device further includes: communication component 503, display component 504, power component 505, audio component 506, and the like. Only some of the components are schematically shown in fig. 5, and it is not meant that the electronic device comprises only the components shown in fig. 5.
Wherein the communication component 503 is configured to facilitate communication between the device in which the communication component is located and other devices in a wired or wireless manner. The device in which the communication component is located may access a wireless network based on a communication standard, such as WiFi, 2G, 3G, 4G, or 5G, or a combination thereof. In an exemplary embodiment, the communication component receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component may be implemented based on Near Field Communication (NFC) technology, Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
The display assembly 504 includes a screen, which may include a liquid crystal display assembly (LCD) and a Touch Panel (TP), among others. If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
The power supply unit 505 supplies power to various components of the device in which the power supply unit is installed. The power components may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device in which the power component is located.
In this embodiment, after the road area in the image to be processed is identified, the road area is mapped into a target image of a set size, which removes redundant information from the image to be processed and increases the effective information density of the mapped target image. Performing object detection on this target image of higher effective information density improves the accuracy of the recognition result.
In addition to the execution logic described in the foregoing embodiments, the electronic device illustrated in fig. 5 may also execute the following data processing logic through the processor 502: acquiring a road area in a road image, where objects in the road area are marked with first bounding boxes; mapping the road area and the first bounding boxes according to a set mapping algorithm to obtain a training sample of a set size and second bounding boxes on the training sample; and inputting the training sample into a neural network model and training the model's object detection capability with the second bounding boxes as the supervision signal.
Accordingly, the present application further provides a computer-readable storage medium storing a computer program, where the computer program is capable of implementing the steps that can be executed by the electronic device in the foregoing method embodiments when executed.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (15)

1. A data processing method, comprising:
acquiring an image to be processed;
identifying a road area in the image to be processed;
mapping the road area into a target image with a set size according to a set mapping algorithm;
performing object detection on the target image using a neural network model, so as to identify objects contained in the road area;
wherein the mapping algorithm is determined according to geometric features of the road area and a size ratio of the road area; the geometric features are used to determine a mapping relationship containing nonlinear mapping coefficients, the nonlinear mapping coefficients of a near-view area and a far-view area in the road area being different; and the size ratio is used to enlarge or reduce different local positions in the road area.
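Read as an algorithm, claim 1 places a geometry-aware, resolution-reallocating warp in front of an ordinary neural detector. A minimal composition sketch in Python follows; the three callables are injected parameters standing in for the per-claim sketches given further below, not functions disclosed by the application:

def detect_on_road(image, find_road, warp, detector, target_size=(512, 512)):
    # Claimed flow: identify the road area (cf. claim 10), map it to a
    # fixed-size target image (cf. claims 2-9), then run the detector.
    road_area = find_road(image)
    target = warp(image, road_area, target_size)
    return detector(target)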
2. The method of claim 1, wherein mapping the road area into a target image of a set size according to a set mapping algorithm comprises:
determining a mapping relationship according to the image to be processed and the geometric features of the road area;
and performing coordinate mapping on pixels in the road area according to the mapping relationship to obtain the target image; wherein the size of the target image matches that of the image to be processed.
3. The method according to claim 2, wherein determining a mapping relationship according to the image to be processed and the geometric features of the road area comprises:
for any pixel in the road area, calculating a mapping coefficient of the pixel according to the coordinates of the pixel, the coordinate range of the road area, and/or the size of the image to be processed;
and calculating a coordinate mapping relationship of the pixel according to the mapping coefficient of the pixel and the coordinates of the pixel;
wherein the mapping coefficients comprise: at least one of an abscissa mapping coefficient, an ordinate mapping coefficient, and a nonlinear mapping coefficient.
4. The method of claim 3, wherein calculating the mapping coefficient of the pixel comprises:
calculating the abscissa mapping coefficient of the pixel according to the abscissa range of the row where the pixel is located in the road area and the length of the image to be processed.
5. The method according to claim 4, wherein calculating the mapping coefficient of the pixel according to the abscissa range of the row where the pixel is located in the road area and the length of the image to be processed comprises:
taking the ratio of the length of the abscissa range to the length of the image to be processed as the abscissa mapping coefficient.
6. The method of claim 3, wherein calculating the coordinate mapping relationship of the pixel according to the mapping coefficient of the pixel and the coordinates of the pixel comprises:
calculating a product of the abscissa of the pixel and the abscissa mapping coefficient;
and determining the sum of the minimum abscissa of the row where the pixel is located in the road area and the product as the abscissa mapping relationship of the pixel.
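To make claims 4-6 concrete: on each row, the coefficient is the width of the road's abscissa range divided by the width of the image, and the mapped abscissa is that row's minimum abscissa plus the coefficient-scaled input. A minimal sketch with a worked example (the per-row reading and all names are illustrative assumptions, not the applicant's implementation):

def abscissa_mapping(u, x_min_row, x_max_row, image_width):
    # Claim 5: coefficient = length of the abscissa range / image length.
    s_x = (x_max_row - x_min_row) / image_width
    # Claim 6: mapped abscissa = row minimum + (coefficient * abscissa).
    return x_min_row + s_x * u

# Worked example: in a 1920-pixel-wide image whose road spans [600, 1320]
# on this row, s_x = 720 / 1920 = 0.375, so u = 1000 maps to 600 + 375 = 975.
assert abscissa_mapping(1000, 600, 1320, 1920) == 975.0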
7. The method of claim 3, wherein calculating the mapping coefficient of the pixel comprises:
calculating an ordinate mapping coefficient of the pixel according to the ordinate range of the road area and the height of the image to be processed.
8. The method of claim 3, wherein calculating the mapping coefficient of the pixel comprises:
calculating an abscissa mapping coefficient of the pixel according to an abscissa range of the row where the pixel is located in the road area and the length of the image to be processed;
and calculating the nonlinear mapping coefficient of the pixel according to the ordinate of the pixel, the ordinate range of the pixel, and the abscissa mapping coefficient of the pixel.
9. The method of claim 3, wherein calculating the coordinate mapping relationship of the pixel according to the mapping coefficient of the pixel and the coordinates of the pixel comprises:
calculating a product of the ordinate of the pixel, the ordinate mapping coefficient, and the nonlinear mapping coefficient;
and determining the sum of the minimum ordinate of the column where the pixel is located and the product as the ordinate mapping relationship of the pixel.
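Claims 7-9 mirror this on the ordinate axis, with the extra nonlinear coefficient that differs between near-view and far-view rows. The claims do not disclose a formula for that coefficient, so the sketch below substitutes an illustrative power-law term (gamma); the sum-of-minimum-plus-product structure is the claimed one:

import numpy as np

def warp_road_region(image, x_min, x_max, y_min, y_max, out_h, out_w, gamma=0.5):
    # x_min, x_max: per-row abscissa range of the road area (length y_max - y_min).
    # gamma: stand-in for the undisclosed nonlinear coefficient; values below 1
    # sample the far (upper) rows more densely, enlarging distant objects.
    out = np.zeros((out_h, out_w) + image.shape[2:], dtype=image.dtype)
    s_y = (y_max - y_min) / out_h                        # ordinate coefficient (claim 7)
    for v in range(out_h):
        k = (v / out_h) ** (gamma - 1.0) if v else 1.0   # illustrative nonlinear coefficient
        y = min(int(y_min + v * s_y * k), y_max - 1)     # claim 9: minimum plus product
        lo, hi = x_min[y - y_min], x_max[y - y_min]
        s_x = (hi - lo) / out_w                          # abscissa coefficient (claim 5)
        for u in range(out_w):
            x = min(int(lo + u * s_x), image.shape[1] - 1)   # claim 6 relation
            out[v, u] = image[y, x]
    return out

On this reading, every output row is drawn from a nonlinearly chosen source row and stretched to full width, so the narrow far end of the road receives as many target pixels as the wide near end.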
10. The method according to any one of claims 1-9, wherein identifying a road area in the image to be processed comprises:
performing edge detection on the image to be processed to acquire edge information contained in the image to be processed;
determining a road profile according to the edge information;
and calculating an inscribed trapezoid of the road profile, and taking an image area corresponding to the inscribed trapezoid as the road area.
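One plausible realization of claim 10 chains a standard edge detector, contour extraction, and a per-row approximation of the inscribed trapezoid. The Canny thresholds and the trapezoid shortcut below are illustrative choices rather than values from the application:

import cv2
import numpy as np

def find_road_area(image_bgr):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)                     # edge information
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4.x signature
    road = max(contours, key=cv2.contourArea)            # road profile = largest contour
    pts = road.reshape(-1, 2)
    y_top, y_bot = pts[:, 1].min(), pts[:, 1].max()
    top = pts[pts[:, 1] == y_top][:, 0]                  # approximate the inscribed
    bot = pts[pts[:, 1] == y_bot][:, 0]                  # trapezoid from the contour's
    return np.array([[top.min(), y_top], [top.max(), y_top],  # top and bottom extremes
                     [bot.max(), y_bot], [bot.min(), y_bot]])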
11. The method according to any one of claims 1 to 9, wherein performing object detection on the target image to identify the objects contained in the road area comprises:
identifying an object in the target image and calculating a first bounding box of the object;
performing reverse mapping on the first bounding box according to a reverse mapping algorithm corresponding to the mapping algorithm, so as to map the first bounding box into a second bounding box on the image to be processed;
and displaying the second bounding box on the image to be processed.
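Claim 11's reverse mapping sends detector output back to original-frame coordinates. Under the reading used in the sketches above, the claimed relations already compute source coordinates from target coordinates, so reverse-mapping a box reduces to pushing its corners through them (again with the illustrative gamma term):

def unmap_box(box, x_min, x_max, y_min, y_max, out_h, out_w, gamma=0.5):
    # box: (u0, v0, u1, v1) on the target image; returns source-image coordinates.
    def src_y(v):
        t = (v / out_h) ** gamma                 # same nonlinear row choice as the warp
        return min(int(y_min + t * (y_max - y_min)), y_max - 1)
    def src_x(u, y):
        lo, hi = x_min[y - y_min], x_max[y - y_min]
        return lo + (hi - lo) * u / out_w        # claim 6: row minimum plus scaled abscissa
    u0, v0, u1, v1 = box
    y0, y1 = src_y(v0), src_y(v1)
    return (src_x(u0, y0), y0, src_x(u1, y1), y1)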
12. The method of claim 11, wherein identifying an object in the target image and calculating a first bounding box for the object comprises:
inputting the target image into an algorithm model, so as to identify an object in the target image through the algorithm model and calculate a first bounding box of the object;
wherein the training samples used for training the algorithm model are obtained by mapping road areas in road images according to the mapping algorithm; and the ground-truth bounding boxes marked in the training samples are obtained by mapping the ground-truth bounding boxes marked in the road images according to the same mapping algorithm.
13. A data processing method, comprising:
acquiring a road area in a road image, wherein an object in the road area is marked with a first bounding box;
mapping the road area and the first bounding box according to a set mapping algorithm to obtain a training sample of a set size and a second bounding box on the training sample;
and inputting the training sample into a neural network model, and training the object detection capability of the neural network model by taking the second bounding box as a supervision signal.
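Claim 13 closes the loop by training on samples that went through the same mapping the detector will see at inference time. A schematic training step (PyTorch used only as a familiar vehicle; warp_fn, the model's loss interface, and the box layout are placeholders, since the application fixes no architecture):

import torch

def training_step(model, optimizer, road_crop, first_boxes, warp_fn):
    # warp_fn maps a road area and its first bounding boxes to a fixed-size
    # training sample and second bounding boxes (e.g. the sketches above).
    sample, second_boxes = warp_fn(road_crop, first_boxes)
    inputs = torch.from_numpy(sample).permute(2, 0, 1).float().unsqueeze(0)
    targets = torch.tensor(second_boxes, dtype=torch.float32)
    loss = model.compute_loss(inputs, targets)   # second boxes act as the supervision signal
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()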
14. An electronic device, comprising: a memory and a processor;
the memory is to store one or more computer instructions;
the processor is to execute the one or more computer instructions to: performing the data processing method of any one of claims 1-13.
15. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the data processing method of any one of claims 1 to 13.
CN202210531411.8A 2020-04-16 2020-04-16 Data processing method, device and storage medium Pending CN115049990A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210531411.8A CN115049990A (en) 2020-04-16 2020-04-16 Data processing method, device and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010300073.8A CN113515978B (en) 2020-04-16 2020-04-16 Data processing method, device and storage medium
CN202210531411.8A CN115049990A (en) 2020-04-16 2020-04-16 Data processing method, device and storage medium

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202010300073.8A Division CN113515978B (en) 2020-04-16 2020-04-16 Data processing method, device and storage medium

Publications (1)

Publication Number Publication Date
CN115049990A true CN115049990A (en) 2022-09-13

Family

ID=78060795

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202210531411.8A Pending CN115049990A (en) 2020-04-16 2020-04-16 Data processing method, device and storage medium
CN202010300073.8A Active CN113515978B (en) 2020-04-16 2020-04-16 Data processing method, device and storage medium

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202010300073.8A Active CN113515978B (en) 2020-04-16 2020-04-16 Data processing method, device and storage medium

Country Status (1)

Country Link
CN (2) CN115049990A (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113658085B (en) * 2021-10-20 2022-02-01 北京优幕科技有限责任公司 Image processing method and device

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4162618B2 (en) * 2004-03-12 2008-10-08 株式会社豊田中央研究所 Lane boundary judgment device
JP2005311868A (en) * 2004-04-23 2005-11-04 Auto Network Gijutsu Kenkyusho:Kk Vehicle periphery visually recognizing apparatus
JP3945494B2 (en) * 2004-05-11 2007-07-18 トヨタ自動車株式会社 Travel lane recognition device
CN101842808A (en) * 2007-11-16 2010-09-22 电子地图有限公司 Method of and apparatus for producing lane information
US10032249B2 (en) * 2014-09-05 2018-07-24 Sakai Display Products Corporation Image generating apparatus, image generating method, and computer program
CN104463873B (en) * 2014-12-10 2017-11-07 西安电子科技大学 A kind of image object method for relocating scaled based on local uniform
CN106971376A (en) * 2017-04-20 2017-07-21 太原工业学院 A kind of image-scaling method based on conspicuousness model
CN108447091B (en) * 2018-03-27 2022-09-06 北京颂泽科技有限公司 Target positioning method and device, electronic equipment and storage medium
CN110458164A (en) * 2019-08-07 2019-11-15 深圳市商汤科技有限公司 Image processing method, device, equipment and computer readable storage medium
CN110688987B (en) * 2019-10-16 2022-03-25 山东建筑大学 Pedestrian position detection and tracking method and system
CN110991453B (en) * 2019-12-23 2022-05-31 浙江凯拓机电有限公司 Method and system for correcting squint trapezium of planar image

Also Published As

Publication number Publication date
CN113515978A (en) 2021-10-19
CN113515978B (en) 2022-07-05

Similar Documents

Publication Publication Date Title
US9990546B2 (en) Method and apparatus for determining target region in video frame for target acquisition
US10489660B2 (en) Video processing with object identification
KR102150776B1 (en) Face location tracking method, apparatus and electronic device
US10740964B2 (en) Three-dimensional environment modeling based on a multi-camera convolver system
EP2915333B1 (en) Depth map generation from a monoscopic image based on combined depth cues
US9280703B2 (en) Apparatus and method for tracking hand
US20140105463A1 (en) Method and system for motion detection in an image
EP3398158B1 (en) System and method for identifying target objects
CN110176024B (en) Method, device, equipment and storage medium for detecting target in video
CN112947419B (en) Obstacle avoidance method, device and equipment
KR20210099450A (en) Far away small drone detection method Using Deep Learning
CN108229281B (en) Neural network generation method, face detection device and electronic equipment
Patel et al. Top-down and bottom-up cues based moving object detection for varied background video sequences
CN113515978B (en) Data processing method, device and storage medium
Saini et al. szoom: A framework for automatic zoom into high resolution surveillance videos
EP3044734B1 (en) Isotropic feature matching
CN113065379B (en) Image detection method and device integrating image quality and electronic equipment
CN112949401A (en) Image analysis method, device, equipment and computer storage medium
CN110223320B (en) Object detection tracking method and detection tracking device
Favorskaya et al. Fast salient object detection in non-stationary video sequences based on spatial saliency maps
CN112651351B (en) Data processing method and device
US9798932B2 (en) Video extraction method and device
CN111753766A (en) Image processing method, device, equipment and medium
CN113515990A (en) Image processing and crowd density estimation method, device and storage medium
SR OBJECT DETECTION, TRACKING AND BEHAVIOURAL ANALYSIS FOR STATIC AND MOVING BACKGROUND.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination