CN112528932B - Method and device for optimizing position information, road side equipment and cloud control platform - Google Patents


Info

Publication number
CN112528932B
Authority
CN
China
Prior art keywords
target
image frame
tracked
prediction information
image
Prior art date
Legal status
Active
Application number
CN202011526575.9A
Other languages
Chinese (zh)
Other versions
CN112528932A (en)
Inventor
高旭
Current Assignee
Apollo Zhilian Beijing Technology Co Ltd
Original Assignee
Apollo Zhilian Beijing Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Apollo Zhilian Beijing Technology Co Ltd filed Critical Apollo Zhilian Beijing Technology Co Ltd
Priority to CN202011526575.9A
Publication of CN112528932A
Application granted
Publication of CN112528932B


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/40: Scenes; Scene-specific elements in video content
    • G06V 20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Abstract

The application discloses a method and a device for optimizing position information, road side equipment and a cloud control platform, and relates to the fields of intelligent transportation and automatic driving. The specific implementation scheme is as follows: acquiring an image frame sequence comprising an image of a target to be tracked; selecting a target image frame from the sequence of image frames; generating track position prediction information of the target to be tracked according to the image frame sequence; performing target detection on the target image frame to generate detection position prediction information of the target to be tracked; and generating optimized position information according to the track position prediction information and the detection position prediction information, wherein the position information is used for indicating the position of the image of the target to be tracked in the target image frame. In this way, the position of the target to be tracked in the image frame is optimized: the accuracy of both the target detection and the track tracking results can be improved, and deviation of the tracking results caused by inaccurate detection results is avoided.

Description

Method and device for optimizing position information, road side equipment and cloud control platform
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a track tracking optimization technology in the fields of intelligent transportation and automatic driving.
Background
Object tracking is a fundamental problem in computer vision with important applications in fields such as medical imaging, intelligent transportation, and automatic driving. For example, in the intelligent traffic field, after a vehicle has been tracked, its trajectory can be analyzed to detect whether abnormal behavior occurs. Multi-object tracking is therefore a very basic and important problem in the field of intelligent transportation.
Prior-art approaches generally include Markov decision process methods, deep learning methods, and the like, but they suffer from high resource occupation and low running speed. For multi-target tracking, the prior art can also combine depth features with Kalman filtering. However, the tracking effect in the related art is susceptible to the detection effect: when a vehicle is under abnormal working conditions or is occluded, detection may be incomplete or shake severely, resulting in a poor tracking effect.
Disclosure of Invention
A method, a device, road side equipment and a cloud control platform for optimizing position information are provided.
According to a first aspect, there is provided a method for optimizing location information, the method comprising: acquiring an image frame sequence comprising an image of a target to be tracked; selecting a target image frame from the image frame sequence; generating track position prediction information of the target to be tracked according to the image frame sequence, wherein the track position prediction information is used for indicating the position of the image of the target to be tracked, which is determined by adopting a target tracking technology, in the target image frame; performing target detection on the target image frame to generate detection position prediction information of the target to be tracked, wherein the detection position prediction information is used for indicating the position of the image of the target to be tracked in the target image frame, which is determined by adopting a target detection technology; and generating optimized position information according to the track position prediction information and the detection position prediction information, wherein the position information is used for indicating the position of the image of the target to be tracked in the target image frame.
According to a second aspect, there is provided an apparatus for optimizing location information, the apparatus comprising: an acquisition unit configured to acquire an image frame sequence including an image of a target to be tracked; a first selecting unit configured to select a target image frame from a sequence of image frames; a first generation unit configured to generate track position prediction information of the target to be tracked according to the image frame sequence, wherein the track position prediction information is used for indicating the position of the image of the target to be tracked, which is determined by adopting a target tracking technology, in the target image frame; a second generating unit configured to perform target detection on the target image frame, and generate detection position prediction information of the target to be tracked, where the detection position prediction information is used to indicate a position of an image of the target to be tracked in the target image frame, which is determined by using a target detection technique; and a third generating unit configured to generate optimized position information according to the track position prediction information and the detection position prediction information, wherein the position information is used for indicating the position of the image of the target to be tracked in the target image frame.
According to a third aspect, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described in any one of the implementations of the first aspect.
According to a fourth aspect, there is provided a non-transitory computer readable storage medium storing computer instructions for enabling a computer to perform a method as described in any of the implementations of the first aspect.
According to a fifth aspect, there is provided a roadside device comprising an electronic device as described in the third aspect.
According to a sixth aspect, there is provided a cloud control platform comprising an electronic device as described in the third aspect.
According to a seventh aspect, there is provided a computer program product comprising a computer program which, when executed by a processor, implements a method as described in any of the implementations of the first aspect.
The technique according to the application combines the position determined by the object tracking technique with the position determined by the object detection technique, enabling the position of the object to be tracked in the image frame to be optimized. The accuracy of both the target detection and the track tracking results can be improved, and the problem of tracking results deviating due to inaccurate detection results is avoided.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are included to provide a better understanding of the present application and are not to be construed as limiting the application. Wherein:
FIG. 1 is a schematic diagram of a first embodiment according to the present application;
FIG. 2 is a schematic diagram of a second embodiment according to the present application;
FIG. 3 is a schematic diagram of one application scenario in which a method for optimizing location information of an embodiment of the present application may be implemented;
FIG. 4 is a schematic diagram of an apparatus for optimizing location information according to an embodiment of the present application;
fig. 5 is a block diagram of an electronic device for implementing a method for optimizing location information according to an embodiment of the application.
Detailed Description
Exemplary embodiments of the present application will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present application are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram 100 showing a first embodiment according to the present application. The method for optimizing location information comprises the steps of:
s101, acquiring an image frame sequence comprising an image of an object to be tracked.
In the present embodiment, the execution subject for optimizing the position information may acquire the image frame sequence including the image of the object to be tracked in various ways. As an example, the execution subject may be an autonomous vehicle. The target to be tracked may include various dynamic traffic participants, such as pedestrians, vehicles, riders, etc. The execution subject can acquire images shot for road traffic environment through the vehicle-mounted camera in the driving process. As yet another example, the execution subject may be various electronic monitoring devices, and the image frame sequence of the image of the target to be tracked may be a video shot for the region to be monitored.
It should be noted that the above image frame sequence generally includes at least three image frames. The number of targets to be tracked may be one or more.
S102, selecting a target image frame from the image frame sequence.
In the present embodiment, the execution subject may select the target image frame from the image frame sequence acquired in step S101 in various ways. The target image frame may be any image frame specified in advance, or may be an image frame determined according to a predetermined rule. In general, the above-described target image frame may be an image frame in which the position of the target to be tracked is to be determined.
S103, generating track position prediction information of the target to be tracked according to the image frame sequence.
In this embodiment, according to the image frame sequence acquired in step S101, the execution subject may generate the track position prediction information of the target to be tracked in various ways. The track position prediction information may be used to indicate the position of the image of the target to be tracked, which is determined by using the target tracking technique, in the target image frame selected in the step S102.
In this embodiment, the target tracking technique may include, but is not limited to, at least one of: prediction-based tracking methods (e.g., Bayesian frameworks), region-based tracking methods, and model-based tracking methods. Prediction-based tracking may further include methods based on Kalman filtering or particle filtering. Region-based tracking may further include methods based on the sum of squared differences, color statistics, shape, gray-level structural features, and the like. Model-based tracking may further include line-drawing models, two-dimensional contour models, three-dimensional models, and the like.
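As a concrete illustration of the prediction-based family, a minimal constant-velocity Kalman filter over a 2-D target centre can be sketched as follows. All matrices and noise values are illustrative assumptions, not parameters taken from the patent:

```python
import numpy as np

# State: [x, y, vx, vy]; measurement: [x, y]; dt = 1 frame.
F = np.array([[1., 0., 1., 0.],   # state transition
              [0., 1., 0., 1.],
              [0., 0., 1., 0.],
              [0., 0., 0., 1.]])
H = np.array([[1., 0., 0., 0.],   # only the position is observed
              [0., 1., 0., 0.]])
Q = np.eye(4) * 1e-2              # process noise covariance
R = np.eye(2)                     # measurement noise covariance

def kf_predict(x, P):
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z):
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)        # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P

# Track a target moving one pixel per frame along the x axis.
x, P = np.zeros(4), np.eye(4)
for t in range(1, 6):
    x, P = kf_predict(x, P)
    x, P = kf_update(x, P, np.array([float(t), 0.0]))
```

After a few frames the filter's position estimate follows the measurements and the velocity estimate converges toward one pixel per frame, which is what allows the tracker to predict the target's position in a frame before any detection is available.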
And S104, performing target detection on the target image frame to generate detection position prediction information of the target to be tracked.
In this embodiment, the execution body may perform target detection on the target image frame in various manners, and generate detection position prediction information of the target to be tracked. The detected-position prediction information may be used to indicate the position of the image of the target to be tracked, which is determined by using the target detection technique, in the target image frame selected in step S102.
In this embodiment, the above-mentioned target detection technique may include, but is not limited to, at least one of: detection algorithm based on traditional manual characteristics and detection algorithm based on deep learning. The detection algorithm based on deep learning may also include, but is not limited to: and (3) detecting based on an integrated convolution network and an algorithm based on multi-scale multi-port detection.
S105, generating optimized position information according to the track position prediction information and the detection position prediction information.
In this embodiment, the execution subject may generate the optimized position information in various ways based on the track position prediction information generated in step S103 and the detection position prediction information generated in step S104. The location information may be used to indicate a location of the image of the target to be tracked in the target image frame selected in the step S102.
As an example, the execution subject may generate the optimized position information according to the degree of matching between the track position prediction information generated in step S103 and the detection position prediction information generated in step S104. For example, in response to determining that the distance between the target center indicated by the track position prediction information and the target center indicated by the detection position prediction information is smaller than a preset threshold, the execution subject may move the track-predicted center a small distance toward the detection-predicted center. The small distance may be, for example, the distance between the two centers multiplied by a preset coefficient, which is usually between 0 and 1.
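The nudging step above can be sketched as follows; the threshold and coefficient values are illustrative assumptions, not values fixed by the patent:

```python
def nudge_center(track_c, det_c, threshold=50.0, coef=0.3):
    """If the two predicted centres are close enough (a match), move the
    track-predicted centre a small step toward the detection-predicted
    centre; otherwise keep the track prediction unchanged."""
    dx, dy = det_c[0] - track_c[0], det_c[1] - track_c[1]
    dist = (dx * dx + dy * dy) ** 0.5
    if dist >= threshold:          # no match: trust the track prediction
        return track_c
    return (track_c[0] + coef * dx, track_c[1] + coef * dy)
```

With `coef=0.3`, `nudge_center((100, 100), (110, 100))` moves the centre 3 pixels toward the detection, while a detection 100 pixels away is treated as a mismatch and ignored.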
As yet another example, the execution subject may perform weighted summation according to the coordinates indicated by the trajectory position prediction information generated in step S103 and the coordinates indicated by the detection position prediction information generated in step S104, thereby generating the optimized position information.
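The weighted-summation variant is a one-liner; the weight below is an illustrative assumption (the patent does not prescribe a value):

```python
def fuse_boxes(track_box, det_box, w_det=0.7):
    """Weighted sum of two (x, y, w, h) predictions for the same target:
    w_det weights the detection, (1 - w_det) the track prediction."""
    return tuple(w_det * d + (1.0 - w_det) * t
                 for t, d in zip(track_box, det_box))
```

For example, fusing a track box at the origin with a detection box offset by 10 pixels yields a box 70% of the way toward the detection.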
The method provided by the embodiment of the application combines the position determined by adopting the target tracking technology and the position determined by adopting the target detection technology, so as to optimize the position of the target to be tracked in the image frame. Therefore, the accuracy of target detection and track tracking results can be improved, and deviation of the tracking results caused by inaccuracy of the detection results is avoided.
In some optional implementations of this embodiment, according to the image frame sequence, the executing body may generate the track position prediction information of the target to be tracked according to the following steps:
first, based on a detection tracking (tracking by detection) technique, a position of an image of a target to be tracked in each image frame associated with the target image frame in the sequence of image frames is determined.
In these implementations, the executing entity may determine, in temporal order, the position of the image of the object to be tracked in each image frame associated with the target image frame in the sequence, based on the detection tracking technique. The detection tracking technique may include, for example, a tracking method based on a baseline framework. Alternatively, the detection tracking technique may include, for example, a target tracking algorithm that performs post-processing based on NMS (non-maximum suppression).
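As a concrete illustration of NMS-style post-processing, a plain non-maximum suppression routine over axis-aligned boxes can be sketched as follows; the `(x1, y1, x2, y2)` box format and the IoU threshold are illustrative assumptions:

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedily keep the highest-scoring box and suppress any remaining
    box whose IoU with it exceeds the threshold."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the kept box with all remaining boxes.
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        order = order[1:][iou <= iou_thresh]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)   # box 1 overlaps box 0 heavily and is suppressed
```

Here boxes 0 and 1 overlap with IoU ≈ 0.68 > 0.5, so only boxes 0 and 2 survive, which is the deduplication a detection-based tracker relies on before associating boxes across frames.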
As an example, the target image frame sequence may include 5 image frames in which vehicle A appears, and the target image frame may be, for example, the 3rd frame. The executing body may determine the position of vehicle A in the 1st and 5th frames of the sequence using a DPNMS algorithm.
And secondly, interpolating according to the determined position to generate track position prediction information of the target to be tracked.
In these implementations, the execution body may interpolate by various methods according to the position determined in the first step to generate the track position prediction information of the target to be tracked.
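Under the assumption of roughly uniform motion between the associated frames, one simple choice is linear interpolation of the centre position. This is a minimal sketch; the patent does not fix the interpolation scheme:

```python
def interpolate_position(pos_a, frame_a, pos_b, frame_b, frame_t):
    """Linearly interpolate an (x, y) centre for frame_t from positions
    found at two surrounding frames frame_a < frame_t < frame_b."""
    alpha = (frame_t - frame_a) / (frame_b - frame_a)
    return tuple(a + alpha * (b - a) for a, b in zip(pos_a, pos_b))
```

For the example above (vehicle A located in the 1st and 5th frames), the predicted centre in the 3rd frame is the midpoint of the two anchored positions.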
Based on this optional implementation, the method can determine the position of the target to be tracked in the target frame by combining a higher-accuracy detection tracking technique with interpolation. It can therefore maintain accuracy while preventing detection deviations caused by occlusion and the like from degrading the tracking effect, giving stronger robustness.
In some optional implementations of this embodiment, the executing body may generate the optimized position information according to the track position prediction information and the detected position prediction information according to the following steps:
the first step is to input the track position prediction information and the detection position prediction information into a pre-trained deep neural network, and generate a classification result for indicating whether an image of a target to be tracked is included in a target frame or not and a regression result for indicating position adjustment incremental information.
In these implementations, the deep neural network may include various two-branch (2-branch) models trained in a supervised manner. The deep neural network model can simultaneously perform classification tasks and regression tasks. The above classification result may include various forms such as "0", "1". The regression results may also include various forms such as incremental values of the center coordinates x, y of the detection frame, the width w and the height h of the detection frame.
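The shape of such a two-branch head can be sketched as below. The weights are random placeholders and the layer sizes are assumptions; a real model would be trained with supervision, and the patent does not fix the architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
W_shared = rng.normal(size=(8, 16))   # shared layer over track box + detection box
W_cls = rng.normal(size=(16, 1))      # classification branch
W_reg = rng.normal(size=(16, 4))      # regression branch: dx, dy, dw, dh

def two_branch_forward(track_box, det_box):
    x = np.concatenate([track_box, det_box])       # (8,) joint input
    h = np.maximum(0.0, x @ W_shared)              # ReLU hidden layer
    prob = 1.0 / (1.0 + np.exp(-(h @ W_cls)[0]))   # target-present probability
    deltas = h @ W_reg                             # position adjustment increments
    return float(prob), deltas

prob, deltas = two_branch_forward(np.zeros(4), np.ones(4))
```

The point is that one forward pass yields both outputs described in the text: a classification score (compared against "0"/"1") and a four-value regression result giving increments for the detection box's centre coordinates, width, and height.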
And secondly, generating optimized position information based on the classification result and the regression result.
In these implementations, the execution subject may generate the optimized location information in various ways based on the classification result and the regression result obtained in the first step. As an example, in response to determining that the classification result is used to indicate that the image of the target frame does not include the target to be tracked, the execution subject may reselect the image frame associated with the target frame for target tracking.
Based on the optional implementation manner, the method can generate the optimized position information by using the deep neural network capable of simultaneously carrying out classification and regression tasks, so that the track optimization performance can be improved.
Alternatively, based on the manner described in the second step, the execution body may update the detection position prediction information according to the regression result in response to determining that the classification result is used to indicate the image including the target to be tracked in the target frame. As an example, the execution subject may increase an increment value indicated by the regression result obtained in the first step on the basis of the detected-position prediction information generated in step S104, thereby realizing the update of the detected-position prediction information.
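One simple way to realise this update step, assuming a `(cx, cy, w, h)` box representation (the patent does not fix the format), is:

```python
def apply_regression(det_box, deltas):
    """Add the regression increments (dx, dy, dw, dh) to the detected
    box (cx, cy, w, h) to produce the updated position prediction."""
    cx, cy, w, h = det_box
    dx, dy, dw, dh = deltas
    return (cx + dx, cy + dy, w + dw, h + dh)
```

For example, increments of `(1, -2, 3, 4)` shift a `(100, 100, 40, 20)` detection box to `(101, 98, 43, 24)`.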
Based on the optional implementation manner, the method can generate optimized position information based on the result obtained by target detection and based on the result of the deep neural network fusion track tracking technology and the target detection technology, thereby improving the accuracy of track tracking and target detection.
With continued reference to fig. 2, fig. 2 is a schematic diagram 200 according to a second embodiment of the present application. The method for optimizing location information comprises the steps of:
s201, acquiring an image frame sequence comprising images of an object to be tracked.
S202, selecting a target image frame from the image frame sequence.
S203, track position prediction information of the target to be tracked is generated according to the image frame sequence.
S204, target image frames are subjected to target detection, and detection position prediction information of the target to be tracked is generated.
S205, generating optimized position information according to the track position prediction information and the detection position prediction information.
The above S201, S202, S203, S204, and S205 may be identical to S101, S102, S103, S104, and S105 and their alternative implementations in the foregoing embodiments, and the above description of S101, S102, S103, S104, and S105 and their alternative implementations is also applicable to S201, S202, S203, S204, and S205, which are not repeated herein.
S206, re-selecting the image frame corresponding to the non-optimized position information from the image frame sequence as a new target image frame.
In this embodiment, the execution subject of the method for optimizing position information may reselect, in various ways, an image frame corresponding to position information that has not been optimized from the image frame sequence as a new target image frame.
As an example, the target image frame sequence may include 5 image frames in which vehicle A appears, and the target image frame may be, for example, the 3rd frame. After the position information indicating the position of vehicle A in the 3rd frame has been optimized, the executing body may reselect an image frame corresponding to not-yet-optimized position information (for example, the 5th frame) as the new target image frame.
S207, based on the optimized position information, optimized position information indicating the position of the object to be tracked in the new object image frame is generated.
In the present embodiment, based on the optimized position information generated in step S205, the above-described execution subject may generate optimized position information indicating the position of the object to be tracked in the new target image frame in a manner consistent with the methods described in steps S103 to S105 and their alternative implementations in the foregoing embodiments.
As can be seen from fig. 2, the flow 200 of the method for optimizing position information in the present embodiment embodies the steps of reselecting a new target image frame and using the already optimized position information to continue optimizing the position information indicating the position of the target to be tracked in the other image frames of the sequence. The scheme described in this embodiment can therefore use the optimized tracking result to improve detection, and the optimized detection result to improve tracking, so that the tracking result and the detection result are optimized simultaneously.
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of a method for optimizing location information according to an embodiment of the present application. In the application scenario of fig. 3, an autonomous vehicle 301 may take a sequence of image frames 302 with an onboard camera during travel that includes an image of an object to be tracked (e.g., vehicle 1). The autonomous vehicle 301 may then select the target image frame 3022 from the sequence of image frames 302. From the positions of the vehicle 1 in the 1 st frame 3021 and the 5 th frame 3023 in the sequence of image frames 302, the autonomous vehicle 301 may generate track position prediction information 30221 of the vehicle 1. Alternatively, the track position prediction information 30221 of the vehicle 1 described above may be obtained based on interpolation of track points. Thereafter, the autonomous vehicle 301 may perform target detection for the vehicle 1 on the target image frame 3022, generating detection position prediction information 30222 of the vehicle 1. Finally, the autonomous vehicle 301 may generate optimized position information based on the trajectory position prediction information 30221 and the detected position prediction information 30222.
At present, the prior art generally adopts Markov decision process methods, deep learning methods, or depth features combined with Kalman filtering; because the tracking effect is easily influenced by the detection effect, target detection may be incomplete under abnormal working conditions or occlusion, resulting in a poor tracking effect. The method provided by the embodiment of the application combines the position determined by the target tracking technique with the position determined by the target detection technique, so that the position of the target to be tracked in the image frame is optimized. The accuracy of both the target detection and the track tracking results can be improved, and deviation of the tracking results caused by inaccurate detection results is avoided.
With further reference to fig. 4, as an implementation of the method shown in the above figures, the present application provides an embodiment of an apparatus for optimizing location information, which corresponds to the method embodiment shown in fig. 1 or fig. 2, and which is particularly applicable to various electronic devices.
As shown in fig. 4, the apparatus 400 for optimizing position information provided in the present embodiment includes an acquisition unit 401, a first selection unit 402, a first generation unit 403, a second generation unit 404, and a third generation unit 405. Wherein the acquisition unit 401 is configured to acquire an image frame sequence including an image of an object to be tracked; a first selection unit 402 configured to select a target image frame from a sequence of image frames; a first generating unit 403 configured to generate track position prediction information of the target to be tracked according to the image frame sequence, wherein the track position prediction information is used for indicating the position of the image of the target to be tracked in the target image frame, which is determined by adopting the target tracking technology; a second generating unit 404 configured to perform target detection on the target image frame, and generate detection position prediction information of the target to be tracked, where the detection position prediction information is used to indicate a position of the image of the target to be tracked in the target image frame, which is determined by using the target detection technology; a third generating unit 405 configured to generate optimized position information according to the track position prediction information and the detection position prediction information, wherein the position information is used for indicating the position of the image of the target to be tracked in the target image frame.
In the present embodiment, in the apparatus 400 for optimizing position information: specific processes of the obtaining unit 401, the first selecting unit 402, the first generating unit 403, the second generating unit 404, and the third generating unit 405 and technical effects thereof may refer to the relevant descriptions of steps S101, S102, S103, S104, and S105 in the corresponding embodiment of fig. 1, and are not repeated herein.
In some optional implementations of this embodiment, the first generating unit 403 may include: a determination module (not shown in the figures) configured to determine a position of an image of the object to be tracked in each image frame associated with the object image frame in the sequence of image frames based on the detection tracking technique. A first generating module (not shown in the figure) is configured to interpolate according to the determined position and generate track position prediction information of the object to be tracked.
In some optional implementations of this embodiment, the third generation unit 405 may include: a second generation module (not shown in the figures) configured to input the track position prediction information and the detection position prediction information into a pre-trained deep neural network and generate a classification result indicating whether the target image frame includes the image of the target to be tracked, and a regression result indicating position adjustment increment information; and a third generation module (not shown in the figures) configured to generate the optimized position information based on the classification result and the regression result.
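The two outputs of the network (a classification result and a regression result) can be mimicked with a toy two-head forward pass. The layer sizes, the 8-dimensional concatenated-box input, and the random weights are purely illustrative; the patent's network is pre-trained, and its actual architecture is not disclosed here:

```python
import numpy as np

# Toy two-head network mirroring the description above: concatenated track
# and detection boxes go in; a presence score and a 4-d regression
# increment (dx, dy, dw, dh) come out. Sizes and weights are assumptions.

rng = np.random.default_rng(0)
W_h = rng.normal(size=(8, 16)); b_h = np.zeros(16)     # shared hidden layer
W_cls = rng.normal(size=(16, 1)); b_cls = np.zeros(1)  # classification head
W_reg = rng.normal(size=(16, 4)); b_reg = np.zeros(4)  # regression head

def two_head_forward(track_box, det_box):
    x = np.concatenate([track_box, det_box])       # 8-d joint input
    h = np.maximum(x @ W_h + b_h, 0)               # ReLU hidden activation
    cls = 1 / (1 + np.exp(-(h @ W_cls + b_cls)))   # sigmoid presence score
    reg = h @ W_reg + b_reg                        # position increments
    return float(cls[0]), reg

score, deltas = two_head_forward([10.0, 10.0, 4.0, 4.0], [12.0, 10.0, 4.0, 4.0])
print(0.0 <= score <= 1.0, deltas.shape)  # True (4,)
```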
In some optional implementations of this embodiment, the third generation module may be further configured to: in response to determining that the classification result indicates that the target image frame includes the image of the target to be tracked, update the detection position prediction information according to the regression result.
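That conditional update rule is straightforward to express. The 0.5 decision threshold and the additive (x, y, w, h) increment form are assumptions; the patent only states that the detection position is updated when the classification result is positive:

```python
# Hedged sketch of the update rule above: apply the regression increments
# to the detection box only when the classification score says the target
# image is present in the frame.

def refine_box(det_box, cls_score, deltas, threshold=0.5):
    if cls_score < threshold:          # target judged absent: keep detection
        return det_box
    dx, dy, dw, dh = deltas            # position adjustment increments
    x, y, w, h = det_box
    return (x + dx, y + dy, w + dw, h + dh)

print(refine_box((12.0, 10.0, 4.0, 4.0), 0.9, (1.0, 0.5, 0.0, 0.0)))
# (13.0, 10.5, 4.0, 4.0)
print(refine_box((12.0, 10.0, 4.0, 4.0), 0.2, (1.0, 0.5, 0.0, 0.0)))
# (12.0, 10.0, 4.0, 4.0)
```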
In some optional implementations of this embodiment, the apparatus for optimizing position information may further include: a second selection unit (not shown in the figures) configured to reselect, from the image frame sequence, an image frame whose position information has not yet been optimized as a new target image frame; and a fourth generation unit (not shown in the figures) configured to generate, based on the optimized position information, optimized position information indicating the position of the target to be tracked in the new target image frame.
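The reselection loop can be sketched as follows. Carrying the most recent optimized box forward unchanged is a placeholder assumption standing in for the full re-run of the prediction and fusion steps on each new target frame:

```python
# Illustrative loop for the reselection step above: frames whose position
# has not yet been optimized are picked as new target frames in order,
# each seeded from the most recent optimized position.

def propagate(optimized, total_frames):
    """optimized: {frame_index: box}; fill remaining frames in order."""
    for idx in range(total_frames):
        if idx in optimized:
            continue                   # this frame is already optimized
        prev = max((i for i in optimized if i < idx), default=None)
        if prev is not None:
            optimized[idx] = optimized[prev]  # seed from last optimized box
    return optimized

boxes = propagate({0: (10, 10, 4, 4)}, 3)
print(boxes[2])  # (10, 10, 4, 4)
```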
The apparatus provided in the above embodiment of the present application uses the third generation unit 405 to combine the position determined by the first generation unit 403 with the target tracking technique and the position determined by the second generation unit 404 with the target detection technique, thereby optimizing the position of the target to be tracked in the image frame. This improves the accuracy of both the target detection and the track tracking results, and avoids the tracking result deviating because of an inaccurate detection result.
Referring now to fig. 5, the present application also provides an electronic device and a readable storage medium according to an embodiment of the present application.
Fig. 5 is a block diagram of an electronic device for the method of optimizing position information according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as the automatic control system of an autonomous vehicle, personal digital assistants, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown here, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit the implementations of the application described and/or claimed herein.
As shown in fig. 5, the electronic device includes one or more processors 501, a memory 502, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The components are interconnected by different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device (such as a display device coupled to an interface). In other embodiments, multiple processors and/or multiple buses may be used together with multiple memories, if desired. Likewise, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 501 is taken as an example in fig. 5.
The memory 502 is the non-transitory computer readable storage medium provided by the present application. The memory stores instructions executable by at least one processor, so that the at least one processor performs the method for optimizing position information provided by the present application. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the method for optimizing position information provided by the present application.
As a non-transitory computer readable storage medium, the memory 502 may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as the program instructions/modules corresponding to the method for optimizing position information in the embodiments of the present application (e.g., the acquisition unit 401, the first selection unit 402, the first generation unit 403, the second generation unit 404, and the third generation unit 405 shown in fig. 4). By running the non-transitory software programs, instructions, and modules stored in the memory 502, the processor 501 executes the various functional applications and data processing of the server, that is, implements the method for optimizing position information in the above method embodiments.
The memory 502 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store data created according to the use of the electronic device for optimizing position information, and the like. In addition, the memory 502 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or another non-transitory solid-state storage device. In some embodiments, the memory 502 may optionally include memories remotely located with respect to the processor 501, and these remote memories may be connected over a network to the electronic device for optimizing position information. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device for the method of optimizing position information may further include an input device 503 and an output device 504. The processor 501, the memory 502, the input device 503, and the output device 504 may be connected by a bus or in other manners; connection by a bus is taken as an example in fig. 5.
The input device 503 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for optimizing position information; examples include a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, and a joystick. The output device 504 may include a display device, auxiliary lighting devices (e.g., LEDs), haptic feedback devices (e.g., vibration motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor capable of receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memories, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as machine-readable signals. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
In addition to the electronic device, the roadside device may further include a communication component and the like; the electronic device and the communication component may be integrated or arranged separately. The electronic device can acquire data, such as pictures and videos, from a sensing device (e.g., a roadside camera), so as to perform image/video processing and data computation.
The cloud control platform performs processing in the cloud; the electronic device included in the cloud control platform can acquire data, such as pictures and videos, from a sensing device (e.g., a roadside camera), so as to perform image/video processing and data computation. The cloud control platform may also be called a vehicle-road collaboration management platform, an edge computing platform, a cloud computing platform, a central system, a cloud server, and the like.
According to the technical solution of the embodiments of the present application, the position determined by the target tracking technique and the position determined by the target detection technique can be combined, so that the position of the target to be tracked in the image frame is optimized. This improves the accuracy of target detection and track tracking results and avoids the tracking result deviating because of an inaccurate detection result.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, as long as the desired results of the technical solution disclosed in the present application can be achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.

Claims (12)

1. A method for optimizing location information, comprising:
acquiring an image frame sequence comprising an image of a target to be tracked;
selecting a target image frame from the image frame sequence;
generating track position prediction information of the target to be tracked according to the image frame sequence, wherein the track position prediction information is used for indicating the position of the image of the target to be tracked, which is determined by adopting a target tracking technology, in the target image frame;
performing target detection on the target image frame to generate detection position prediction information of the target to be tracked, wherein the detection position prediction information is used for indicating the position of the image of the target to be tracked in the target image frame, which is determined by adopting a target detection technology;
generating optimized position information according to the track position prediction information and the detection position prediction information, wherein the optimized position information is used for indicating the position of the image of the target to be tracked in the target image frame;
wherein generating optimized position information according to the track position prediction information and the detection position prediction information includes:
inputting the track position prediction information and the detection position prediction information into a pre-trained deep neural network, and generating a classification result indicating whether the target image frame includes the image of the target to be tracked and a regression result indicating position adjustment increment information; and
generating optimized position information based on the classification result and the regression result.
2. The method of claim 1, wherein the generating track position prediction information of the object to be tracked from the sequence of image frames comprises:
determining a position of an image of the target to be tracked in each image frame associated with the target image frame in the sequence of image frames based on a detection tracking technique;
and interpolating according to the determined position to generate track position prediction information of the target to be tracked.
3. The method of claim 1, wherein the generating optimized location information based on the classification result and the regression result comprises:
in response to determining that the classification result indicates that the target image frame includes the image of the target to be tracked, updating the detection position prediction information according to the regression result.
4. A method according to one of claims 1-3, wherein the method further comprises:
re-selecting an image frame corresponding to the unoptimized position information from the image frame sequence as a new target image frame;
and generating optimized position information for indicating the position of the target to be tracked in the new target image frame based on the optimized position information.
5. An apparatus for optimizing location information, comprising:
an acquisition unit configured to acquire an image frame sequence including an image of a target to be tracked;
a first selecting unit configured to select a target image frame from the image frame sequence;
a first generation unit configured to generate track position prediction information of the target to be tracked according to the image frame sequence, wherein the track position prediction information is used for indicating the position of the image of the target to be tracked, which is determined by adopting a target tracking technology, in the target image frame;
a second generating unit configured to perform target detection on the target image frame and generate detection position prediction information of the target to be tracked, wherein the detection position prediction information is used for indicating the position of the image of the target to be tracked in the target image frame, which is determined by adopting a target detection technology;
a third generating unit configured to generate optimized position information according to the track position prediction information and the detection position prediction information, wherein the optimized position information is used for indicating the position of the image of the target to be tracked in the target image frame;
wherein the third generating unit includes:
a second generation module configured to input the track position prediction information and the detection position prediction information into a pre-trained deep neural network, and generate a classification result indicating whether the target image frame includes the image of the target to be tracked and a regression result indicating position adjustment increment information;
and a third generation module configured to generate optimized location information based on the classification result and the regression result.
6. The apparatus of claim 5, the first generation unit comprising:
a determination module configured to determine a position of an image of the object to be tracked in each image frame associated with the object image frame in the sequence of image frames based on a detection tracking technique;
the first generation module is configured to interpolate according to the determined position and generate track position prediction information of the target to be tracked.
7. The apparatus of claim 5, the third generation module further configured to:
in response to determining that the classification result indicates that the target image frame includes the image of the target to be tracked, update the detection position prediction information according to the regression result.
8. The apparatus according to one of claims 5-7, the apparatus further comprising:
a second selecting unit configured to reselect an image frame corresponding to the non-optimized position information from the image frame sequence as a new target image frame;
and a fourth generation unit configured to generate optimized position information indicating a position of the object to be tracked in the new object image frame based on the optimized position information.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-4.
10. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-4.
11. A roadside device comprising the electronic device of claim 9.
12. A cloud control platform comprising the electronic device of claim 9.
CN202011526575.9A 2020-12-22 2020-12-22 Method and device for optimizing position information, road side equipment and cloud control platform Active CN112528932B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011526575.9A CN112528932B (en) 2020-12-22 2020-12-22 Method and device for optimizing position information, road side equipment and cloud control platform


Publications (2)

Publication Number Publication Date
CN112528932A CN112528932A (en) 2021-03-19
CN112528932B true CN112528932B (en) 2023-12-08

Family

ID=75002430

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011526575.9A Active CN112528932B (en) 2020-12-22 2020-12-22 Method and device for optimizing position information, road side equipment and cloud control platform

Country Status (1)

Country Link
CN (1) CN112528932B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113516013A (en) * 2021-04-09 2021-10-19 阿波罗智联(北京)科技有限公司 Target detection method and device, electronic equipment, road side equipment and cloud control platform
CN113516686A (en) * 2021-07-09 2021-10-19 东软睿驰汽车技术(沈阳)有限公司 Target tracking method, device, equipment and storage medium

Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004003849A1 (en) * 2002-06-28 2004-01-08 Seeing Machines Pty Ltd Tracking method and apparatus
CN102982559A (en) * 2012-11-28 2013-03-20 大唐移动通信设备有限公司 Vehicle tracking method and system
CN103889879A (en) * 2011-10-19 2014-06-25 克朗设备公司 Identifying, matching and tracking multiple objects in a sequence of images
CN105046220A (en) * 2015-07-10 2015-11-11 华为技术有限公司 Multi-target tracking method, apparatus and equipment
CN105678808A (en) * 2016-01-08 2016-06-15 浙江宇视科技有限公司 Moving object tracking method and device
WO2017219529A1 (en) * 2016-06-23 2017-12-28 乐视控股(北京)有限公司 Target tracking method, device, and system, remote monitoring system, and electronic apparatus
WO2018090912A1 (en) * 2016-11-15 2018-05-24 北京市商汤科技开发有限公司 Target object detection method, apparatus and system and neural network structure
CN108256506A (en) * 2018-02-14 2018-07-06 北京市商汤科技开发有限公司 Object detecting method and device, computer storage media in a kind of video
CN108875577A (en) * 2018-05-11 2018-11-23 深圳市易成自动驾驶技术有限公司 Object detection method, device and computer readable storage medium
CN108986138A (en) * 2018-05-24 2018-12-11 北京飞搜科技有限公司 Method for tracking target and equipment
CN109712188A (en) * 2018-12-28 2019-05-03 科大讯飞股份有限公司 A kind of method for tracking target and device
WO2019091464A1 (en) * 2017-11-12 2019-05-16 北京市商汤科技开发有限公司 Target detection method and apparatus, training method, electronic device and medium
CN109859239A (en) * 2019-05-05 2019-06-07 深兰人工智能芯片研究院(江苏)有限公司 A kind of method and apparatus of target tracking
CN110276783A (en) * 2019-04-23 2019-09-24 上海高重信息科技有限公司 A kind of multi-object tracking method, device and computer system
CN110443824A (en) * 2018-05-02 2019-11-12 北京京东尚科信息技术有限公司 Method and apparatus for generating information
CN110472594A (en) * 2019-08-20 2019-11-19 腾讯科技(深圳)有限公司 Method for tracking target, information insertion method and equipment
CN110751674A (en) * 2018-07-24 2020-02-04 北京深鉴智能科技有限公司 Multi-target tracking method and corresponding video analysis system
WO2020147348A1 (en) * 2019-01-17 2020-07-23 北京市商汤科技开发有限公司 Target tracking method and device, and storage medium
CN111462174A (en) * 2020-03-06 2020-07-28 北京百度网讯科技有限公司 Multi-target tracking method and device and electronic equipment
CN111640140A (en) * 2020-05-22 2020-09-08 北京百度网讯科技有限公司 Target tracking method and device, electronic equipment and computer readable storage medium
CN111832343A (en) * 2019-04-17 2020-10-27 北京京东尚科信息技术有限公司 Eye tracking method and device and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9945928B2 (en) * 2014-10-30 2018-04-17 Bastille Networks, Inc. Computational signal processing architectures for electromagnetic signature analysis
US20190130583A1 (en) * 2017-10-30 2019-05-02 Qualcomm Incorporated Still and slow object tracking in a hybrid video analytics system
CN108960046A (en) * 2018-05-23 2018-12-07 北京图森未来科技有限公司 A kind of training data method of sampling and its device, computer server


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Z. Liu et al.; "Robust Movement-Specific Vehicle Counting at Crowded Intersections"; 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops; 2617-2625 *
Ren Jiamin; Gong Ningsheng; Han Zhenyang; "Multi-target tracking algorithm based on YOLOv3 and Kalman filtering"; Computer Applications and Software, No. 5; full text *
Zhang Chengming; "Data association multi-target tracking based on deep learning"; China Master's Theses Full-text Database, Information Science and Technology, No. 7; full text *


Similar Documents

Publication Publication Date Title
CN111640140B (en) Target tracking method and device, electronic equipment and computer readable storage medium
CN111694287B (en) Obstacle simulation method and device in unmanned simulation scene
CN110929639B (en) Method, apparatus, device and medium for determining the position of an obstacle in an image
CN111553282B (en) Method and device for detecting a vehicle
CN111625950B (en) Automatic driving simulation scene reconstruction method, device, equipment and medium
CN111753961B (en) Model training method and device, prediction method and device
CN111860319B (en) Lane line determining method, positioning accuracy evaluating method, device and equipment
CN112528786B (en) Vehicle tracking method and device and electronic equipment
CN111666891B (en) Method and device for estimating movement state of obstacle
CN110968718B (en) Target detection model negative sample mining method and device and electronic equipment
CN111524192B (en) Calibration method, device and system for external parameters of vehicle-mounted camera and storage medium
CN111177869B (en) Method, device and equipment for determining sensor layout scheme
CN111402161B (en) Denoising method, device, equipment and storage medium for point cloud obstacle
CN112528932B (en) Method and device for optimizing position information, road side equipment and cloud control platform
CN111310840B (en) Data fusion processing method, device, equipment and storage medium
CN110675635B (en) Method and device for acquiring external parameters of camera, electronic equipment and storage medium
US20210362714A1 (en) Speed planning method and apparatus for self-driving, device, medium and vehicle
CN112507949A (en) Target tracking method and device, road side equipment and cloud control platform
CN113591573A (en) Training and target detection method and device for multi-task learning deep network model
KR102643425B1 (en) A method, an apparatus an electronic device, a storage device, a roadside instrument, a cloud control platform and a program product for detecting vehicle's lane changing
CN111539347B (en) Method and device for detecting target
CN111753739B (en) Object detection method, device, equipment and storage medium
CN111767843A (en) Three-dimensional position prediction method, device, equipment and storage medium
CN112528931B (en) Method and device for generating position prediction information and automatic driving vehicle
CN114091515A (en) Obstacle detection method, obstacle detection device, electronic apparatus, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20211012

Address after: 100176 101, floor 1, building 1, yard 7, Ruihe West 2nd Road, Beijing Economic and Technological Development Zone, Daxing District, Beijing

Applicant after: Apollo Zhilian (Beijing) Technology Co.,Ltd.

Address before: 2 / F, baidu building, 10 Shangdi 10th Street, Haidian District, Beijing 100085

Applicant before: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant