CN112528932A - Method and device for optimizing position information, road side equipment and cloud control platform - Google Patents

Publication number
CN112528932A
CN112528932A (application number CN202011526575.9A; granted publication CN112528932B)
Authority
CN
China
Prior art keywords
target
image frame
tracked
prediction information
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011526575.9A
Other languages
Chinese (zh)
Other versions
CN112528932B (en)
Inventor
高旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Apollo Intelligent Connectivity Beijing Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202011526575.9A priority Critical patent/CN112528932B/en
Publication of CN112528932A publication Critical patent/CN112528932A/en
Application granted granted Critical
Publication of CN112528932B publication Critical patent/CN112528932B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/41 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/22 - Matching criteria, e.g. proximity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a method and apparatus for optimizing position information, a roadside device, and a cloud control platform, relating to the fields of intelligent transportation and autonomous driving. The specific implementation scheme is as follows: acquire an image frame sequence containing images of a target to be tracked; select a target image frame from the sequence; generate trajectory position prediction information for the target according to the image frame sequence; perform target detection on the target image frame to generate detection position prediction information for the target; and generate optimized position information from the trajectory position prediction information and the detection position prediction information, where the position information indicates the position of the target's image within the target image frame. The position of the target to be tracked in the image frame is thereby optimized, which improves the accuracy of both target detection and trajectory tracking and prevents inaccurate detection results from skewing the tracking result.

Description

Method and device for optimizing position information, road side equipment and cloud control platform
Technical Field
Embodiments of the present application relate to the field of computer technology, and in particular to trajectory-tracking optimization in the fields of intelligent transportation and autonomous driving.
Background
Target tracking is a fundamental problem in computer vision, with important applications in medical imaging, intelligent transportation, autonomous driving, and other fields. In intelligent transportation, for example, a vehicle's tracked trajectory can be analyzed to detect whether abnormal behavior occurs. Multi-target tracking is therefore a basic and important problem in that field.
Existing approaches generally include Markov decision process methods and deep learning methods, but these suffer from high resource consumption and slow execution, among other problems. For multi-target tracking, the prior art may combine deep features with Kalman filtering. However, the tracking quality of these methods is sensitive to detection quality: under abnormal operating conditions or occlusion, vehicle detections may be incomplete or jitter severely, and the tracking result deteriorates.
Disclosure of Invention
A method and a device for optimizing position information, road side equipment and a cloud control platform are provided.
According to a first aspect, there is provided a method for optimizing location information, the method comprising: acquiring an image frame sequence including an image of a target to be tracked; selecting a target image frame from an image frame sequence; generating track position prediction information of the target to be tracked according to the image frame sequence, wherein the track position prediction information is used for indicating the position of the image of the target to be tracked in the target image frame determined by adopting a target tracking technology; performing target detection on a target image frame to generate detection position prediction information of a target to be tracked, wherein the detection position prediction information is used for indicating the position of an image of the target to be tracked in the target image frame determined by adopting a target detection technology; and generating optimized position information according to the track position prediction information and the detection position prediction information, wherein the position information is used for indicating the position of the image of the target to be tracked in the target image frame.
According to a second aspect, there is provided an apparatus for optimizing location information, the apparatus comprising: an acquisition unit configured to acquire an image frame sequence including an image of a target to be tracked; a first selecting unit configured to select a target image frame from a sequence of image frames; a first generating unit configured to generate trajectory position prediction information of the target to be tracked according to the image frame sequence, wherein the trajectory position prediction information is used for indicating the position of the image of the target to be tracked in the target image frame determined by adopting a target tracking technology; a second generation unit configured to perform target detection on the target image frame and generate detection position prediction information of the target to be tracked, wherein the detection position prediction information is used for indicating the position of the image of the target to be tracked in the target image frame determined by adopting a target detection technology; and a third generating unit configured to generate optimized position information according to the track position prediction information and the detection position prediction information, wherein the position information is used for indicating the position of the image of the target to be tracked in the target image frame.
According to a third aspect, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executable by the at least one processor to enable the at least one processor to perform the method as described in any one of the implementations of the first aspect.
According to a fourth aspect, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for enabling a computer to perform the method as described in any one of the implementations of the first aspect.
According to a fifth aspect, there is provided a roadside apparatus including the electronic apparatus as described in the third aspect.
According to a sixth aspect, there is provided a cloud control platform comprising the electronic device as described in the third aspect.
According to a seventh aspect, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method as described in any of the implementations of the first aspect.
The disclosed technology optimizes the position of the target to be tracked in an image frame by combining the position determined by a target tracking technique with the position determined by a target detection technique. This improves the accuracy of both the detection and trajectory-tracking results and prevents inaccurate detections from skewing the tracking result.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present application;
FIG. 2 is a schematic diagram according to a second embodiment of the present application;
FIG. 3 is a schematic diagram of an application scenario in which a method for optimizing location information according to an embodiment of the present application may be implemented;
FIG. 4 is a schematic diagram of an apparatus for optimizing location information according to an embodiment of the present application;
fig. 5 is a block diagram of an electronic device for implementing a method for optimizing location information according to an embodiment of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of those embodiments to aid understanding; these details are to be considered exemplary only. Those of ordinary skill in the art will accordingly recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present application. Descriptions of well-known functions and constructions are likewise omitted for clarity and conciseness.
Fig. 1 is a schematic diagram 100 illustrating a first embodiment according to the present application. The method for optimizing location information includes the steps of:
s101, acquiring an image frame sequence including an image of a target to be tracked.
In the present embodiment, the execution subject of the method for optimizing position information may acquire an image frame sequence containing images of the target to be tracked in various ways. As an example, the execution subject may be an autonomous vehicle, and the target to be tracked may be any dynamic traffic participant, such as a pedestrian, a vehicle, or a cyclist. The execution subject may capture images of the road traffic environment while driving, using an on-board camera. As another example, the execution subject may be an electronic monitoring device, in which case the image frame sequence may be video shot of the area to be monitored.
It should be noted that the image frame sequence generally includes at least three images, and there may be one or more targets to be tracked.
S102, selecting a target image frame from the image frame sequence.
In the present embodiment, the execution subject described above may select the target image frame from the image frame sequence acquired in step S101 in various ways. The target image frame may be any image frame designated in advance, or an image frame determined according to a preset rule. In general, the target image frame is the frame in which the position of the target to be tracked is to be determined.
And S103, generating track position prediction information of the target to be tracked according to the image frame sequence.
In the present embodiment, the execution subject described above may generate trajectory position prediction information of the target to be tracked in various ways according to the image frame sequence acquired in step S101. The track position prediction information may be used to indicate the position of the image of the target to be tracked, determined by using a target tracking technique, in the target image frame selected in step S102.
In the present embodiment, the target tracking technique may include, but is not limited to, at least one of the following: prediction-based tracking methods (e.g., within a Bayesian framework), region-based tracking methods, and model-based tracking methods. Prediction-based methods include those based on Kalman filtering or particle filtering. Region-based methods include sum-of-squared-differences matching, color statistics, shape-based matching, gray-level structural features, and the like. Model-based methods include line-graph models, two-dimensional contour models, three-dimensional stereo models, and the like.
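As an illustration of the prediction-based family mentioned above, the sketch below implements a minimal one-dimensional constant-velocity Kalman filter in plain Python. It is a hypothetical teaching aid, not the patent's implementation; practical trackers filter the full bounding-box state (center, size, and velocities), and the noise values here are arbitrary assumptions.

```python
class Kalman1D:
    """Minimal constant-velocity Kalman filter for one coordinate.

    Illustrative only: real trackers filter the full box state
    [x, y, w, h] plus velocities; q and r are assumed noise levels.
    """

    def __init__(self, x0, q=1e-2, r=1.0):
        self.x = [x0, 0.0]                  # state: position, velocity
        self.p = [[1.0, 0.0], [0.0, 1.0]]   # state covariance
        self.q, self.r = q, r               # process / measurement noise

    def predict(self, dt=1.0):
        x, v = self.x
        self.x = [x + v * dt, v]
        p = self.p
        # P = F P F^T + Q for F = [[1, dt], [0, 1]]
        p00 = p[0][0] + dt * (p[1][0] + p[0][1]) + dt * dt * p[1][1] + self.q
        p01 = p[0][1] + dt * p[1][1]
        p10 = p[1][0] + dt * p[1][1]
        p11 = p[1][1] + self.q
        self.p = [[p00, p01], [p10, p11]]
        return self.x[0]

    def update(self, z):
        # Kalman gain for measurement model H = [1, 0]
        s = self.p[0][0] + self.r
        k0, k1 = self.p[0][0] / s, self.p[1][0] / s
        resid = z - self.x[0]
        self.x = [self.x[0] + k0 * resid, self.x[1] + k1 * resid]
        p = self.p
        self.p = [[(1 - k0) * p[0][0], (1 - k0) * p[0][1]],
                  [p[1][0] - k1 * p[0][0], p[1][1] - k1 * p[0][1]]]
        return self.x[0]
```

Fed positions moving at a constant rate, the filter's velocity estimate converges and its prediction for the next frame lands near the true next position.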
And S104, carrying out target detection on the target image frame to generate detection position prediction information of the target to be tracked.
In this embodiment, the executing entity may perform target detection on the target image frame in various ways, and generate the detection position prediction information of the target to be tracked. The detected position prediction information may be used to indicate the position of the image of the target to be tracked, which is determined by the target detection technology, in the target image frame selected in step S102.
In the present embodiment, the target detection technique may include, but is not limited to, at least one of the following: detection algorithms based on traditional hand-crafted features, and detection algorithms based on deep learning. The latter may further include, but are not limited to, detection based on integrated convolutional networks and multi-scale detection algorithms.
And S105, generating optimized position information according to the track position prediction information and the detection position prediction information.
In the present embodiment, the execution subject may generate optimized position information in various ways based on the trajectory position prediction information generated in step S103 and the detected position prediction information generated in step S104. The position information may be used to indicate a position of the image of the target to be tracked in the target image frame selected in step S102.
As an example, the execution subject may generate optimized position information according to the degree of matching between the trajectory position prediction information generated in step S103 and the detection position prediction information generated in step S104. For example, in response to determining that the distance between the target center indicated by the trajectory position prediction information and the target center indicated by the detection position prediction information is smaller than a preset threshold, the execution subject may move the former center a small distance toward the latter. This small distance may be, for example, a preset coefficient multiplied by the distance between the two centers, where the coefficient is typically between 0 and 1.
As another example, the execution subject may compute a weighted sum of the coordinates indicated by the trajectory position prediction information generated in step S103 and the coordinates indicated by the detection position prediction information generated in step S104, thereby generating optimized position information.
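The two fusion strategies just described, nudging the tracked center toward the detected center when the two agree within a threshold, and taking a weighted sum of their coordinates, can be sketched as follows. The function name, threshold, and coefficient are illustrative assumptions, not parameters specified by the patent.

```python
def fuse_positions(track_xy, det_xy, dist_threshold=30.0, alpha=0.7):
    """Fuse a tracked center with a detected center (illustrative sketch).

    If the two centers agree to within `dist_threshold` pixels, nudge the
    tracked center toward the detection by the coefficient `alpha` in
    (0, 1); this is equivalent to the weighted sum
    (1 - alpha) * track + alpha * detection. Otherwise the detection is
    treated as unreliable and the tracked position is kept as-is.
    """
    dx = det_xy[0] - track_xy[0]
    dy = det_xy[1] - track_xy[1]
    if (dx * dx + dy * dy) ** 0.5 >= dist_threshold:
        return track_xy  # detection too far off: likely a bad box
    return (track_xy[0] + alpha * dx, track_xy[1] + alpha * dy)
```

With `alpha=0.5` this reduces to the midpoint of the two centers; larger values of `alpha` trust the detector more.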
In the method provided by the above embodiment of the present application, the position determined by a target tracking technique and the position determined by a target detection technique are combined to optimize the position of the target to be tracked in the image frame. This improves the accuracy of both the detection and trajectory-tracking results and prevents inaccurate detections from skewing the tracking result.
In some optional implementations of the embodiment, based on the image frame sequence, the execution subject may generate the trajectory position prediction information of the target to be tracked as follows:
In a first step, based on a tracking-by-detection technique, the position of the target's image is determined in each image frame of the sequence associated with the target image frame.
In these implementations, the execution subject may determine, in temporal order, the position of the target's image in each associated image frame based on a tracking-by-detection technique. Such a technique may include, for example, a tracking method based on a Bayesian framework, or a target tracking algorithm that applies NMS (Non-Maximum Suppression) as a post-processing step.
As an example, the target image frame sequence may include 5 image frames in which vehicle A appears, with the 3rd frame as the target image frame. The execution subject may determine the position of vehicle A in the 1st and 5th frames of the sequence using a DPNMS algorithm.
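The NMS post-processing mentioned above can be sketched as standard greedy IoU-based suppression: keep the highest-scoring box, drop boxes that overlap it too much, and repeat. This is a generic textbook version, not the patent's DPNMS algorithm.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns indices of kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # keep box i only if it does not heavily overlap a kept box
        if all(iou(boxes[i], boxes[j]) < iou_threshold for j in keep):
            keep.append(i)
    return keep
```

In a tracking-by-detection pipeline, NMS collapses duplicate detections of the same object before the boxes are associated across frames.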
In a second step, interpolation is performed on the determined positions to generate the trajectory position prediction information of the target to be tracked.
In these implementations, the execution subject may interpolate, by various methods, between the positions determined in the first step to generate the trajectory position prediction information.
With this optional implementation, the position of the target to be tracked in the target frame is determined by combining a high-accuracy tracking-by-detection technique with interpolation. Accuracy is preserved while the tracking result is shielded from detection deviations caused by occlusion and similar conditions, yielding high robustness.
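A minimal sketch of the interpolation step: given the target's box in two frames bracketing the target frame (e.g., frames 1 and 5 in the vehicle A example), the box in the target frame (e.g., frame 3) is interpolated between them. Linear interpolation is an assumption here; the text leaves the interpolation method open.

```python
def interpolate_box(box_a, frame_a, box_b, frame_b, frame_t):
    """Linearly interpolate a bounding box (x, y, w, h) between two frames.

    box_a / box_b are the target's boxes in frames frame_a / frame_b,
    both obtained by tracking-by-detection; the result predicts the box
    in the intermediate frame frame_t.
    """
    t = (frame_t - frame_a) / (frame_b - frame_a)
    return tuple(a + t * (b - a) for a, b in zip(box_a, box_b))
```

For a target moving at roughly constant speed across a short window, this yields a stable position for the middle frame even when the detector's output there is jittery or missing.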
In some optional implementations of this embodiment, based on the trajectory position prediction information and the detection position prediction information, the execution subject may generate the optimized position information as follows:
First, the trajectory position prediction information and the detection position prediction information are input into a pre-trained deep neural network, which produces a classification result indicating whether the target frame includes an image of the target to be tracked, and a regression result indicating position-adjustment increment information.
In these implementations, the deep neural network may be any two-branch (2-branch) model trained in a supervised manner, performing a classification task and a regression task simultaneously. The classification result may take various forms, such as "0" or "1". The regression result may likewise take various forms, such as increment values for the detection-box center coordinates x and y, width w, and height h.
And secondly, generating optimized position information based on the classification result and the regression result.
In these implementations, the execution subject may generate the optimized position information in various ways based on the classification result and the regression result obtained in the first step. As an example, in response to determining that the classification result indicates that the target frame does not include an image of the target to be tracked, the execution subject may reselect an image frame associated with the target frame for target tracking.
With this optional implementation, the optimized position information is generated by a deep neural network that performs classification and regression simultaneously, which can improve trajectory-optimization performance.
Alternatively, in response to determining that the classification result indicates that the target frame does include an image of the target to be tracked, the execution subject may, as described in the second step, update the detection position prediction information according to the regression result. As an example, the execution subject may add the increment values indicated by the regression result to the detection position prediction information generated in step S104, thereby updating it.
With this optional implementation, the optimized position information is generated by fusing, via the deep neural network, the result of target detection with the result of trajectory tracking, improving the accuracy of both.
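Putting the classification gate and the regression update together, a hypothetical sketch of how the two-branch head's outputs might be consumed: the classification result is treated as a presence score compared against a threshold, and the regression result as additive increments (dx, dy, dw, dh) to the detected box. The function name, threshold, and score interpretation are assumptions for illustration.

```python
def refine_box(det_box, cls_score, deltas, cls_threshold=0.5):
    """Apply a two-branch head's outputs to a detected box (hypothetical).

    cls_score gates whether the target is present in the target frame;
    deltas = (dx, dy, dw, dh) are additive increments to the detected
    box (x, y, w, h), as described for the regression branch.
    """
    if cls_score < cls_threshold:
        return None  # target absent: caller should reselect an image frame
    x, y, w, h = det_box
    dx, dy, dw, dh = deltas
    return (x + dx, y + dy, w + dw, h + dh)
```

Returning `None` mirrors the branch in which the execution subject reselects another associated image frame instead of updating the box.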
With continued reference to fig. 2, fig. 2 is a schematic diagram 200 of a second embodiment according to the present application. The method for optimizing location information includes the steps of:
s201, an image frame sequence including an image of a target to be tracked is acquired.
S202, selecting a target image frame from the image frame sequence.
And S203, generating track position prediction information of the target to be tracked according to the image frame sequence.
And S204, carrying out target detection on the target image frame to generate detection position prediction information of the target to be tracked.
And S205, generating optimized position information according to the track position prediction information and the detection position prediction information.
S201 through S205 correspond to S101 through S105 and their optional implementations in the foregoing embodiment; the descriptions there apply equally here and are not repeated.
S206, re-selecting the image frame corresponding to the position information which is not optimized from the image frame sequence as a new target image frame.
In this embodiment, the execution subject of the method for optimizing position information may, in various ways, reselect from the image frame sequence an image frame whose position information has not yet been optimized, as a new target image frame.
As an example, the target image frame sequence may include 5 image frames in which vehicle A appears, with the 3rd frame as the target image frame. After the position information indicating vehicle A's position in the 3rd frame has been optimized, the execution subject may reselect an image frame whose position information is not yet optimized (for example, the 5th frame) as the new target image frame.
S207, generating, based on the optimized position information, optimized position information indicating the position of the target to be tracked in the new target image frame.
In the present embodiment, based on the optimized position information generated in step S205, the execution subject may generate optimized position information indicating the target's position in the new target image frame, in a manner consistent with steps S103 through S105 and their optional implementations in the foregoing embodiment.
As can be seen from fig. 2, the flow 200 of the method for optimizing position information in the present embodiment adds the steps of reselecting a new target image frame and using already-optimized position information to continue optimizing the position information of the target in the remaining image frames of the sequence. The scheme therefore uses optimized tracking results to refine detection and optimized detection results to refine tracking, so that the two are optimized together.
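The reselection loop of this second embodiment can be sketched as repeatedly picking a frame whose position information is not yet optimized and optimizing it with access to the results already obtained. `optimize_frame` below is a hypothetical callback standing in for steps S203 through S205.

```python
def optimize_sequence(frames, optimize_frame):
    """Sketch of the second embodiment's outer loop (hypothetical).

    Repeatedly reselects a frame whose position information is not yet
    optimized and optimizes it; `optimize_frame(frame, optimized)` is a
    stand-in for the trajectory-prediction, detection, and fusion steps,
    receiving the results accumulated so far.
    """
    optimized = {}
    pending = list(range(len(frames)))
    while pending:
        idx = pending.pop(0)  # reselect an unoptimized target frame
        optimized[idx] = optimize_frame(frames[idx], optimized)
    return optimized
```

Because each iteration sees the previously optimized results, improved tracking can feed improved detection and vice versa, as the embodiment describes.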
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of a method for optimizing location information according to an embodiment of the present application. In the application scenario of fig. 3, an autonomous vehicle 301 may take a sequence of image frames 302 including images of an object to be tracked (e.g., vehicle 1) with an onboard camera while in motion. The autonomous vehicle 301 may then select a target image frame 3022 from the image frame sequence 302. From the position of the vehicle 1 in the 1 st frame 3021 and the 5 th frame 3023 in the image frame sequence 302, the autonomous vehicle 301 may generate trajectory position prediction information 30221 for the vehicle 1. Alternatively, the trajectory position prediction information 30221 of the vehicle 1 described above may be obtained based on interpolation of trajectory points. Thereafter, the autonomous vehicle 301 may perform target detection for the vehicle 1 on the target image frame 3022, generating detected position prediction information 30222 of the vehicle 1. Finally, the autonomous vehicle 301 may generate optimized position information based on the trajectory position prediction information 30221 and the detected position prediction information 30222.
The prior art typically adopts Markov decision process methods, deep learning methods, or methods based on deep features and Kalman filtering. Because their tracking quality is sensitive to detection quality, target detection is incomplete under abnormal operating conditions or in occluded scenes, and tracking suffers. In the method provided by the above embodiment of the present application, the position of the target to be tracked in the image frame is optimized by combining the position determined by target tracking with the position determined by target detection. This improves the accuracy of both the detection and trajectory-tracking results and prevents inaccurate detections from skewing the tracking result.
With further reference to fig. 4, as an implementation of the methods shown in the above-mentioned figures, the present application provides an embodiment of an apparatus for optimizing location information, which corresponds to the method embodiment shown in fig. 1 or fig. 2, and which may be applied in various electronic devices.
As shown in fig. 4, the apparatus 400 for optimizing location information provided by the present embodiment includes an obtaining unit 401, a first selecting unit 402, a first generating unit 403, a second generating unit 404, and a third generating unit 405. Wherein the acquiring unit 401 is configured to acquire an image frame sequence including an image of a target to be tracked; a first selecting unit 402 configured to select a target image frame from a sequence of image frames; a first generating unit 403 configured to generate trajectory position prediction information of the target to be tracked according to the image frame sequence, wherein the trajectory position prediction information is used for indicating the position of the image of the target to be tracked in the target image frame determined by adopting the target tracking technology; a second generating unit 404 configured to perform target detection on the target image frame and generate detection position prediction information of the target to be tracked, wherein the detection position prediction information is used for indicating the position of the image of the target to be tracked in the target image frame determined by adopting a target detection technology; a third generating unit 405 configured to generate optimized position information according to the trajectory position prediction information and the detected position prediction information, wherein the position information is used for indicating the position of the image of the target to be tracked in the target image frame.
In the present embodiment, in the apparatus 400 for optimizing location information: the specific processing of the obtaining unit 401, the first selecting unit 402, the first generating unit 403, the second generating unit 404, and the third generating unit 405 and the technical effects thereof can refer to the related descriptions of steps S101, S102, S103, S104, and S105 in the corresponding embodiment of fig. 1, respectively, and are not repeated herein.
In some optional implementations of this embodiment, the first generating unit 403 may include: a determination module (not shown in the figure) configured to determine, based on a detection-and-tracking technique, the position of the image of the target to be tracked in each image frame associated with the target image frame in the image frame sequence; and a first generating module (not shown in the figure) configured to perform interpolation according to the determined positions to generate the trajectory position prediction information of the target to be tracked.
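The determine-then-interpolate step above can be sketched as follows. This is a minimal linear-interpolation example; the `interpolate_track` helper, the box-center representation, and the frame indices are illustrative assumptions, not the patent's specified scheme:

```python
import numpy as np

def interpolate_track(known, target_frame):
    """Linearly interpolate a (cx, cy) position for target_frame from
    positions observed in the associated frames.

    known: dict mapping frame index -> (cx, cy) box center."""
    frames = sorted(known)
    xs = [known[f][0] for f in frames]
    ys = [known[f][1] for f in frames]
    cx = float(np.interp(target_frame, frames, xs))
    cy = float(np.interp(target_frame, frames, ys))
    return cx, cy

# Positions determined by detection-and-tracking in frames 0 and 4
track = {0: (0.0, 0.0), 4: (8.0, 4.0)}
print(interpolate_track(track, 2))  # (4.0, 2.0)
```

The same idea extends to interpolating box width and height, or to higher-order (e.g., spline) interpolation when the motion is not well approximated as linear.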
In some optional implementations of the present embodiment, the third generating unit 405 may include: a second generating module (not shown in the figure) configured to input the trajectory position prediction information and the detection position prediction information into a pre-trained deep neural network, and generate a classification result indicating whether the target frame includes the image of the target to be tracked and a regression result indicating position adjustment increment information; and a third generating module (not shown in the figure) configured to generate the optimized position information based on the classification result and the regression result.
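A minimal sketch of the fusion network's interface, under stated assumptions: the patent specifies only the inputs (the two position predictions) and the outputs (a classification result and a regression result), so the one-hidden-layer architecture, layer sizes, and random weights below are illustrative, not the patent's pre-trained network:

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse(track_box, det_box, w1, w2_cls, w2_reg):
    """Toy forward pass: concatenate the two 4-d box predictions, run one
    hidden layer, and emit a classification score plus 4 box deltas."""
    x = np.concatenate([track_box, det_box])         # shape (8,)
    h = np.maximum(0.0, w1 @ x)                      # ReLU hidden layer
    cls_score = 1.0 / (1.0 + np.exp(-(w2_cls @ h)))  # P(target in box)
    deltas = w2_reg @ h                              # (dx, dy, dw, dh)
    return cls_score, deltas

# Illustrative (untrained) weights, just to show the shapes involved
w1 = rng.normal(size=(16, 8))
w2_cls = rng.normal(size=(16,))
w2_reg = rng.normal(size=(4, 16))
score, deltas = fuse(np.ones(4), np.ones(4), w1, w2_cls, w2_reg)
print(deltas.shape)  # (4,)
```

In a real system these weights would come from training on pairs of trajectory/detection predictions labeled with the true target positions.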
In some optional implementations of this embodiment, the third generating module may be further configured to: in response to determining that the classification result indicates that the target frame includes the image of the target to be tracked, update the detection position prediction information according to the regression result.
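One common way to apply such position adjustment increments is the center/log-size delta parameterization used by many detectors; the patent does not specify the parameterization, so the following is a hedged sketch under that assumption:

```python
import math

def apply_deltas(box, deltas):
    """Refine an (x1, y1, x2, y2) detection box with (dx, dy, dw, dh)
    increments, using the common center/log-size parameterization."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    dx, dy, dw, dh = deltas
    cx, cy = cx + dx * w, cy + dy * h           # shift center, scaled by size
    w, h = w * math.exp(dw), h * math.exp(dh)   # rescale width and height
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)

print(apply_deltas((0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 0.0, 0.0)))
# zero increments leave the box unchanged: (0.0, 0.0, 10.0, 10.0)
```

The log-size form keeps width and height positive for any regression output, which is why it is a popular choice for box refinement.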
In some optional implementations of this embodiment, the apparatus for optimizing location information may further include: a second selecting unit (not shown in the figure) configured to reselect, from the image frame sequence, an image frame whose position information has not been optimized as a new target image frame; and a fourth generating unit (not shown in the figure) configured to generate, based on the already-optimized position information, optimized position information indicating the position of the target to be tracked in the new target image frame.
The apparatus provided in the above embodiment of the present application optimizes the position of the target to be tracked in the image frame by combining, through the third generating unit 405, the position determined by the first generating unit 403 using the target tracking technology with the position determined by the second generating unit 404 using the target detection technology. This improves the accuracy of both the target detection and trajectory tracking results, and avoids deviations in the tracking result caused by inaccurate detection results.
Referring now to fig. 5, the present application further provides an electronic device and a readable storage medium according to embodiments of the present application.
Fig. 5 is a block diagram of an electronic device for optimizing location information according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as the automatic control system of an autonomous vehicle, personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the present application described and/or claimed herein.
As shown in fig. 5, the electronic device includes: one or more processors 501, a memory 502, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing part of the necessary operations (e.g., as a server array, a group of blade servers, or a multiprocessor system). In fig. 5, one processor 501 is taken as an example.
Memory 502 is a non-transitory computer readable storage medium as provided herein. Wherein the memory stores instructions executable by at least one processor to cause the at least one processor to perform the method for optimizing location information provided herein. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the method for optimizing location information provided herein.
The memory 502, as a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for optimizing location information in the embodiments of the present application (e.g., the obtaining unit 401, the first selecting unit 402, the first generating unit 403, the second generating unit 404, and the third generating unit 405 shown in fig. 4). The processor 501 executes various functional applications of the server and performs data processing by running the non-transitory software programs, instructions, and modules stored in the memory 502, that is, implements the method for optimizing location information in the above method embodiments.
The memory 502 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device for optimizing the location information, and the like. Further, the memory 502 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 502 optionally includes memory located remotely from processor 501, which may be connected to an electronic device for optimizing location information over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the method for optimizing location information may further include: an input device 503 and an output device 504. The processor 501, the memory 502, the input device 503 and the output device 504 may be connected by a bus or other means, and fig. 5 illustrates the connection by a bus as an example.
The input device 503 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for optimizing location information; examples include a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, and a joystick. The output device 504 may include a display device, auxiliary lighting devices (e.g., LEDs), haptic feedback devices (e.g., vibration motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
The roadside device may include the above electronic device and a communication component, and the electronic device may be integrated with the communication component or provided separately. The electronic device can acquire data, such as pictures and videos, from a sensing device (such as a roadside camera), so as to perform image and video processing and data computation.
The cloud control platform performs processing in the cloud, and the electronic device included in the cloud control platform can acquire data, such as pictures and videos, from a sensing device (such as a roadside camera), so as to perform image and video processing and data computation. The cloud control platform may also be referred to as a vehicle-road collaboration management platform, an edge computing platform, a cloud computing platform, a central system, a cloud server, and the like.
According to the technical solution of the embodiments of the present application, the position determined by the target tracking technology and the position determined by the target detection technology can be combined, so that the position of the target to be tracked in the image frame is optimized. This improves the accuracy of both the target detection and trajectory tracking results, and avoids deviations in the tracking result caused by inaccurate detection results.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present application can be achieved, which is not limited herein.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (15)

1. A method for optimizing location information, comprising:
acquiring an image frame sequence including an image of a target to be tracked;
selecting a target image frame from the image frame sequence;
generating track position prediction information of the target to be tracked according to the image frame sequence, wherein the track position prediction information is used for indicating the position of the image of the target to be tracked in the target image frame determined by adopting a target tracking technology;
performing target detection on the target image frame, and generating detection position prediction information of the target to be tracked, wherein the detection position prediction information is used for indicating the position of an image of the target to be tracked in the target image frame, which is determined by adopting a target detection technology;
and generating optimized position information according to the track position prediction information and the detection position prediction information, wherein the optimized position information is used for indicating the position of the image of the target to be tracked in the target image frame.
2. The method of claim 1, wherein the generating trajectory position prediction information of the target to be tracked from the sequence of image frames comprises:
determining the position of the image of the target to be tracked in each image frame associated with the target image frame in the image frame sequence based on a detection tracking technology;
and carrying out interpolation according to the determined position to generate track position prediction information of the target to be tracked.
3. The method of claim 1, wherein generating optimized location information based on the trajectory location prediction information and the detected location prediction information comprises:
inputting the track position prediction information and the detection position prediction information into a pre-trained deep neural network, and generating a classification result for indicating whether the target frame comprises the image of the target to be tracked and a regression result for indicating position adjustment increment information;
and generating optimized position information based on the classification result and the regression result.
4. The method of claim 3, wherein the generating optimized location information based on the classification results and the regression results comprises:
and in response to determining that the classification result is used for indicating that the target frame comprises the image of the target to be tracked, updating the detection position prediction information according to the regression result.
5. The method according to one of claims 1-4, wherein the method further comprises:
reselecting an image frame corresponding to the unoptimized position information from the image frame sequence as a new target image frame;
and generating optimized position information used for indicating the position of the target to be tracked in the new target image frame based on the optimized position information.
6. An apparatus for optimizing location information, comprising:
an acquisition unit configured to acquire an image frame sequence including an image of a target to be tracked;
a first selecting unit configured to select a target image frame from the image frame sequence;
a first generating unit configured to generate trajectory position prediction information of the target to be tracked according to the image frame sequence, wherein the trajectory position prediction information is used for indicating the position of the image of the target to be tracked in the target image frame determined by adopting a target tracking technology;
a second generating unit configured to perform target detection on the target image frame and generate detection position prediction information of the target to be tracked, wherein the detection position prediction information is used for indicating the position of an image of the target to be tracked, which is determined by adopting a target detection technology, in the target image frame;
a third generating unit configured to generate optimized position information according to the trajectory position prediction information and the detection position prediction information, wherein the optimized position information is used for indicating the position of the image of the target to be tracked in the target image frame.
7. The apparatus of claim 6, the first generating unit comprising:
a determination module configured to determine a position of an image of the target to be tracked in each image frame associated with the target image frame in the sequence of image frames based on a detection tracking technique;
and the first generation module is configured to perform interpolation according to the determined position to generate track position prediction information of the target to be tracked.
8. The apparatus of claim 6, the third generating unit comprising:
a second generation module configured to input the trajectory position prediction information and the detection position prediction information to a pre-trained deep neural network, and generate a classification result indicating whether the image of the target to be tracked is included in the target frame and a regression result indicating position adjustment increment information;
a third generating module configured to generate optimized location information based on the classification result and the regression result.
9. The apparatus of claim 8, the third generation module further configured to:
and in response to determining that the classification result is used for indicating that the target frame comprises the image of the target to be tracked, updating the detection position prediction information according to the regression result.
10. The apparatus according to one of claims 6-9, the apparatus further comprising:
the second selecting unit is configured to reselect the image frame corresponding to the position information which is not optimized from the image frame sequence as a new target image frame;
a fourth generating unit configured to generate optimized position information indicating a position of the target to be tracked in the new target image frame based on the optimized position information.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-5.
13. A roadside apparatus comprising the electronic apparatus of claim 11.
14. A cloud controlled platform comprising the electronic device of claim 11.
15. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-5.
Publications (2)

Publication Number Publication Date
CN112528932A true CN112528932A (en) 2021-03-19
CN112528932B CN112528932B (en) 2023-12-08


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113516013A (en) * 2021-04-09 2021-10-19 阿波罗智联(北京)科技有限公司 Target detection method and device, electronic equipment, road side equipment and cloud control platform
CN113516686A (en) * 2021-07-09 2021-10-19 东软睿驰汽车技术(沈阳)有限公司 Target tracking method, device, equipment and storage medium

Citations (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004003849A1 (en) * 2002-06-28 2004-01-08 Seeing Machines Pty Ltd Tracking method and apparatus
CN102982559A (en) * 2012-11-28 2013-03-20 大唐移动通信设备有限公司 Vehicle tracking method and system
CN103889879A (en) * 2011-10-19 2014-06-25 克朗设备公司 Identifying, matching and tracking multiple objects in a sequence of images
CN105046220A (en) * 2015-07-10 2015-11-11 华为技术有限公司 Multi-target tracking method, apparatus and equipment
US20160127931A1 (en) * 2014-10-30 2016-05-05 Bastille Networks, Inc. Efficient Localization of Transmitters Within Complex Electromagnetic Environments
CN105678808A (en) * 2016-01-08 2016-06-15 浙江宇视科技有限公司 Moving object tracking method and device
WO2017219529A1 (en) * 2016-06-23 2017-12-28 乐视控股(北京)有限公司 Target tracking method, device, and system, remote monitoring system, and electronic apparatus
WO2018090912A1 (en) * 2016-11-15 2018-05-24 北京市商汤科技开发有限公司 Target object detection method, apparatus and system and neural network structure
CN108256506A (en) * 2018-02-14 2018-07-06 北京市商汤科技开发有限公司 Object detecting method and device, computer storage media in a kind of video
CN108875577A (en) * 2018-05-11 2018-11-23 深圳市易成自动驾驶技术有限公司 Object detection method, device and computer readable storage medium
CN108986138A (en) * 2018-05-24 2018-12-11 北京飞搜科技有限公司 Method for tracking target and equipment
US20190130583A1 (en) * 2017-10-30 2019-05-02 Qualcomm Incorporated Still and slow object tracking in a hybrid video analytics system
CN109712188A (en) * 2018-12-28 2019-05-03 科大讯飞股份有限公司 A kind of method for tracking target and device
WO2019091464A1 (en) * 2017-11-12 2019-05-16 北京市商汤科技开发有限公司 Target detection method and apparatus, training method, electronic device and medium
CN109859239A (en) * 2019-05-05 2019-06-07 深兰人工智能芯片研究院(江苏)有限公司 A kind of method and apparatus of target tracking
CN110276783A (en) * 2019-04-23 2019-09-24 上海高重信息科技有限公司 A kind of multi-object tracking method, device and computer system
CN110443824A (en) * 2018-05-02 2019-11-12 北京京东尚科信息技术有限公司 Method and apparatus for generating information
CN110472594A (en) * 2019-08-20 2019-11-19 腾讯科技(深圳)有限公司 Method for tracking target, information insertion method and equipment
US20190384982A1 (en) * 2018-05-23 2019-12-19 Tusimple, Inc. Method and apparatus for Sampling Training Data and Computer Server
CN110751674A (en) * 2018-07-24 2020-02-04 北京深鉴智能科技有限公司 Multi-target tracking method and corresponding video analysis system
WO2020147348A1 (en) * 2019-01-17 2020-07-23 北京市商汤科技开发有限公司 Target tracking method and device, and storage medium
CN111462174A (en) * 2020-03-06 2020-07-28 北京百度网讯科技有限公司 Multi-target tracking method and device and electronic equipment
CN111640140A (en) * 2020-05-22 2020-09-08 北京百度网讯科技有限公司 Target tracking method and device, electronic equipment and computer readable storage medium
CN111832343A (en) * 2019-04-17 2020-10-27 北京京东尚科信息技术有限公司 Eye tracking method and device and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Z. LIU ET AL.: "Robust Movement-Specific Vehicle Counting at Crowded Intersections", 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 2617 - 2625 *
REN Jiamin; GONG Ningsheng; HAN Zhenyang: "Multi-target tracking algorithm based on YOLOv3 and Kalman filtering", Computer Applications and Software, no. 05 *
ZHANG Chengming: "Data-association multi-target tracking based on deep learning", China Master's Theses Full-text Database, Information Science and Technology, no. 7 *

