CN115471520A - Measuring method and device - Google Patents

Measuring method and device

Info

Publication number
CN115471520A
Authority
CN
China
Prior art keywords
observation
state
target object
image
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110655268.9A
Other languages
Chinese (zh)
Inventor
杨思静
张强
苏惠荞
魏志方
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202110655268.9A
Publication of CN115471520A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/207 Analysis of motion for motion estimation over a hierarchy of resolutions
    • G06T 7/277 Analysis of motion involving stochastic approaches, e.g. using Kalman filters

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a measurement method and apparatus, which can be used for assisted driving and automatic driving. The method comprises the following steps: processing an image with a neural network to obtain a first observation result, replacing the hardware-based observation in the nonlinear filtering process; predicting the motion state of the target object through the prediction process of the nonlinear filtering in combination with a motion model to obtain a prediction result; estimating the motion track of the target object according to the prediction result to obtain a first estimation result; and adjusting the prediction result according to the residual between the first observation result and the first estimation result to obtain a more accurate measurement result. The method can be used in the processing of image data acquired by a camera, improves the Advanced Driving Assistance System (ADAS) capability in automatic driving or assisted driving, and can be applied to the Internet of Vehicles, for example vehicle-to-everything (V2X), long term evolution for vehicle communication (LTE-V), and vehicle-to-vehicle (V2V) communication.

Description

Measuring method and device
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a measurement method and apparatus.
Background
With the development of society, intelligent terminals such as intelligent transportation equipment, smart home devices, and robots are gradually entering people's daily life. Sensors play an important role in intelligent terminals. Various sensors installed on an intelligent terminal, such as millimeter wave radar, laser radar, cameras, and ultrasonic radar, sense the surrounding environment while the intelligent terminal moves, collect data, identify and track moving objects, identify static scenes such as lane lines and signboards, and plan paths in combination with a navigator and map data. The sensors can detect possible danger in advance and assist in, or even autonomously take, the necessary evasive action, effectively improving the safety and comfort of the intelligent terminal.
The information obtained by the sensors enables functions such as classifying, identifying, and tracking the surrounding environment and objects. In the case of autonomous driving, an autonomous vehicle acquires external information from vehicle-mounted sensors and performs control operations such as steering and deceleration based on that information, thereby avoiding collisions with other vehicles. For example, by measuring the course angle of a target vehicle, the future travel trajectory of the target vehicle can be predicted. When the target vehicle overtakes or merges, its course angle changes noticeably; the autonomous vehicle can judge the driving state of the target vehicle from this change and adjust its own driving state, for example by decelerating, so as to avoid scraping or colliding with the target vehicle.
Currently, several measurement methods can measure the course angle of a target vehicle. For example, the vehicle body information in the world coordinate system can be obtained directly through a laser radar, a millimeter wave radar, or a multi-view camera, and the course angle of the target vehicle is derived from it. However, the sensors used in such methods (laser radar, millimeter wave radar, and multi-view cameras) are expensive, bulky, and inconvenient to install, and when the internal and external parameters of the camera change, the observation angle needs to be recalibrated with the laser radar. A monocular camera is low in cost, but in the related art the forward calculation of a Convolutional Neural Network (CNN) needs to be performed at least twice on the image acquired by the monocular camera, which is computationally expensive; furthermore, in the related art the course angle of the target vehicle is calculated from the matching result between the target vehicle and the wheel grounding points, and when there are many vehicles the matching may be inaccurate, so the calculation result has errors and jitters strongly.
Disclosure of Invention
In view of this, a measurement method and apparatus are provided, in which a monocular camera is used to collect images, so as to obtain a more accurate measurement result.
In a first aspect, an embodiment of the present application provides a measurement method applied to a monocular camera, where the method includes: acquiring a first image by the monocular camera; identifying a first observation of a target object in the first image using a neural network model; the first observation comprises pixel plane coordinates of a plurality of observation points included in the target object; performing a nonlinear filtering prediction process according to the initial motion state and the motion model of the target object to obtain a first prediction state and update parameters of nonlinear filtering; wherein the initial motion state is a measurement result obtained from a history image before the first image, the initial motion state represents a final motion state of the target object before the camera acquires the first image, and the initial motion state includes an initial course angle of the target object; obtaining a first estimation result of the pixel plane coordinates of the plurality of observation points according to the first prediction state; obtaining a residual error according to the first observation result and the first estimation result; and carrying out a nonlinear filtering updating process according to the first prediction state, the residual error and the updating parameter to obtain a new measurement result of the motion state of the target object, wherein the new measurement result comprises a course angle of the target object.
According to the measurement method provided by the embodiment of the application, the monocular camera collects the image, the image is processed by combining the neural network model, the nonlinear filtering model, and the motion model, and the course angle of the target object in the image is predicted. The neural network processes the image to obtain the first observation result, replacing the step of the nonlinear filtering process in which hardware (radar and the like) observes the target object. In other words, processing the image collected by the monocular camera with the neural network model realizes the observation process, so there is no need to install bulky and expensive radar to observe the target object; the camera is cheaper than a radar, which saves cost, and is much smaller than a radar, which makes it convenient to install.
The initial motion state of the target object and the motion model are combined to predict the motion state (including the course angle) of the target object through the prediction process of the nonlinear filtering, yielding a prediction result (the first prediction state). The motion track of the target object is estimated according to the prediction result to obtain the first estimation result, and the first prediction state is adjusted according to the residual between the first observation result and the first estimation result, so a more accurate measurement result can be obtained. Therefore, compared with calibrating a single frame image through a radar, or measuring the course angle by matching the target vehicle and the wheel grounding points in a single frame image through a CNN as in the related art, the measurement method provided by the application performs the prediction process of the nonlinear filtering on the initial motion state of the target object obtained from the historical images, can measure the course angle by combining multiple frames of images, produces a more accurate measurement result, realizes the observation of the target object through the neural network model, and can save cost.
According to the first aspect, in a first possible implementation manner, the initial motion state includes an initial state value and an initial covariance matrix, and performing the prediction process of the nonlinear filtering according to the initial motion state of the target object and the motion model to obtain the first prediction state and the update parameters of the nonlinear filtering includes: extracting, from a Gaussian distribution whose mean is the initial state value and whose covariance matrix is the initial covariance matrix, a plurality of representative points including the point corresponding to the initial state value, and a weight corresponding to each representative point; obtaining a first estimated state value of each representative point according to the motion model and the state values of the plurality of representative points; calculating a first prediction state mean and a first prediction covariance matrix of the first prediction state according to the first estimated state value and the weight of each representative point; converting the first estimated state value of each representative point into the pixel plane coordinate system to obtain the pixel plane coordinates of each representative point; calculating a mean and a covariance matrix of the plurality of representative points in the pixel plane coordinate system according to the pixel plane coordinates and the weight of each representative point; obtaining a cross correlation matrix in the update parameters according to the first estimated state value and the first prediction state mean of each representative point, the pixel plane coordinates of each representative point, the mean of the plurality of representative points in the pixel plane coordinate system, and the weights; and obtaining the Kalman gain in the update parameters according to the cross correlation matrix and the covariance matrix of the pixel plane coordinates of the plurality of representative points.
In a second possible implementation manner, obtaining the first estimation result of the pixel plane coordinates of the plurality of observation points according to the first prediction state includes: calculating the world coordinates of each observation point according to the first prediction state mean, and converting the world coordinates of each observation point into pixel plane coordinates to obtain the first estimation result of each observation point.
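As an illustration of this implementation, the sketch below derives rough world coordinates of the observation corners from a state vector of the form (x, y, v, phi, dphi, L, W) described later in the text and projects them with a standard pinhole model; the corner convention, the intrinsic matrix K, and the extrinsics R, t are assumptions for illustration, not details fixed by the application.

import numpy as np

def corner_world_coords(state):
    # Rough corner geometry: (x, y) locates the midpoint of the rear bottom
    # edge, phi is the course angle, L and W are length and width.  The exact
    # corner convention here is only illustrative.
    x, y, _, phi, _, L, W = state
    c, s = np.cos(phi), np.sin(phi)
    rear_mid = np.array([x, y, 0.0])
    half_w = np.array([-s, c, 0.0]) * (W / 2)   # lateral half-width direction
    forward = np.array([c, s, 0.0]) * L         # direction along the vehicle length
    return [rear_mid - half_w,                  # near rear corner
            rear_mid + half_w,                  # far rear corner
            rear_mid + half_w + forward]        # far front corner

def world_to_pixel(p_world, K, R, t):
    # Standard pinhole projection from the world frame to pixel coordinates.
    p_cam = R @ p_world + t
    u, v, w = K @ p_cam
    return np.array([u / w, v / w])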
According to the first possible implementation manner of the first aspect, in a third possible implementation manner, performing an update process of a nonlinear filter according to the first prediction state, the residual error, and the update parameter to obtain a new measurement result of the motion state of the target object includes: calculating the product of the residual error and the Kalman gain to obtain a state increment; summing the first predicted state mean and the state increment to obtain a second predicted state mean of the new measurement result; and obtaining a second prediction covariance matrix of the new measurement result according to the first prediction covariance matrix, the Kalman gain and the cross correlation matrix.
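This update step can be written compactly. The sketch below assumes the standard unscented-filter relation in which the second prediction covariance is obtained from the first prediction covariance, the cross correlation matrix, and the Kalman gain; it is a minimal illustration rather than the application's exact formulation.

import numpy as np

def ukf_update(mean_pred, P_pred, residual, K_gain, T_cross):
    # State increment: product of the residual and the Kalman gain.
    delta = K_gain @ residual
    # Second predicted state mean: first predicted mean plus the increment.
    mean_new = mean_pred + delta
    # Second prediction covariance from the first prediction covariance,
    # the cross correlation matrix and the Kalman gain (standard UKF form).
    P_new = P_pred - T_cross @ K_gain.T
    return mean_new, P_new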
According to the first aspect or any one of the first to third possible implementation manners of the first aspect, in a fourth possible implementation manner, the method further includes: before the loss satisfies the convergence condition, the following process is performed:
moving an optical flow tracking area in an observation area of the target object, and determining a plurality of pairs of sample matching points of the first image and a history image before the first image from the moved optical flow tracking area; the observation region of the target object is a corresponding region of a plane region on the target object on an image, and each pair of sample matching points corresponds to the same point of the target object; performing homography solution according to at least part of the matching points in the plurality of pairs of sample matching points to obtain a homography matrix; obtaining a first course angle change rate according to the homography matrix and a normal vector of a plane where the observation area on the first image is located, wherein the first course angle change rate is the course angle change rate of the target object in the historical image and the first image; and determining the loss according to the difference value of the first course angle change rate and the second course angle change rate and the number of points meeting the homography matrix in the plurality of pairs of sample matching points, wherein the second course angle change rate is the course angle change rate included in the first prediction state mean value.
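A minimal sketch of this observation branch is given below, using OpenCV's homography estimation and decomposition; the use of RANSAC, the normal-selection rule, the yaw-extraction axis convention, and the loss weights are assumptions for illustration rather than details fixed by the application.

import cv2
import numpy as np

def course_rate_from_homography(pts_prev, pts_cur, K, plane_normal, dt):
    # Fit a homography to the matched points of the tracked planar face;
    # the RANSAC inlier mask counts the points that satisfy the homography.
    H, inlier_mask = cv2.findHomography(pts_prev, pts_cur, cv2.RANSAC, 3.0)
    n_inliers = int(inlier_mask.sum())
    # Decompose into candidate rotations/normals and keep the candidate whose
    # normal is closest to the normal of the observed plane.
    _, rotations, _, normals = cv2.decomposeHomographyMat(H, K)
    best = max(range(len(normals)),
               key=lambda i: abs(float(normals[i].ravel() @ plane_normal)))
    R = rotations[best]
    yaw = np.arctan2(R[1, 0], R[0, 0])   # rotation angle (axis convention assumed)
    return yaw / dt, n_inliers           # first course angle change rate, inlier count

def tracking_loss(rate_obs, rate_pred, n_inliers, alpha=1.0, beta=0.01):
    # One possible loss: penalize the gap between the observed and predicted
    # course angle change rates, reward the number of inlier points.
    return alpha * abs(rate_obs - rate_pred) - beta * n_inliers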
According to the fourth possible implementation manner of the first aspect, in a fifth possible implementation manner, the residual further includes the difference between the first course angle change rate and the second course angle change rate.
According to the measurement method, the homography matrix is calculated using the side plane and/or the rear plane of the vehicle, an observed value of the course angle change rate is added, and the measurement result is adjusted according to the residual between the observed value and the estimated value, so the measurement precision can be improved. The optimal optical flow tracking area is determined through the loss function, reducing the influence of homography solution errors caused by non-planar regions.
According to the fourth possible implementation manner of the first aspect, in a sixth possible implementation manner, the method further includes: obtaining a first observation noise of each observation point according to the size of the part of the target object associated with that observation point; or obtaining a second observation noise according to the size of the optical flow tracking area;
the method for performing a nonlinear filtering prediction process according to an initial motion state and a motion model of a target object to obtain a first prediction state and update parameters of the nonlinear filtering includes: extracting a plurality of representative points including points corresponding to the initial state values and weights corresponding to each representative point from a Gaussian distribution with a mean value as the initial state value and a covariance matrix as the initial covariance matrix; obtaining a first estimated state value of each representative point according to the motion model and the state values of the plurality of representative points; calculating a first prediction state mean and a first prediction covariance matrix of the first prediction state according to the first estimation state value and the weight of each representative point; converting the first estimation state value of each representative point into a pixel plane coordinate system to obtain the pixel plane coordinate of each representative point; calculating a mean and covariance matrix of the plurality of representative points at pixel plane coordinates based on the pixel plane coordinates and weight of each of the representative points and one or both of the first observation noise and the second observation noise; obtaining a cross correlation matrix in the updated parameters according to the first estimated state value and the first predicted state mean value of each representative point, the pixel plane coordinates of each representative point, the mean value of the pixel plane coordinates of a plurality of representative points and the weight; and obtaining the Kalman gain in the updated parameters according to the cross correlation matrix and the covariance matrix of the pixel plane coordinates of the plurality of representative points.
Errors in the observation results can be corrected through the first observation noise and/or the second observation noise, so a more accurate measurement result can be obtained. The first observation noise and the second observation noise are dynamically adjusted during calculation according to the target object in the image (specifically, according to the size of the observed corresponding surface). Dynamically adjusting the observation noise reduces the influence of poor regression precision of the neural network model and low homography solving precision in crossing or straight-going situations, so the measurement precision can be improved.
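One way to realize such size-dependent noise is sketched below; the inverse-area heuristic and the constants are assumptions used purely for illustration.

import numpy as np

def observation_noise(face_pixel_area, base_sigma=2.0, ref_area=10000.0):
    # Heuristic: the smaller the observed face (or optical flow tracking
    # area) appears in the image, the larger the assumed observation noise.
    scale = np.sqrt(ref_area / max(face_pixel_area, 1.0))
    sigma = base_sigma * scale                 # pixel standard deviation
    return np.diag([sigma ** 2, sigma ** 2])   # 2x2 noise block for one point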
In a seventh possible implementation form of the method according to the first aspect as such or any one of the first to sixth possible implementation forms of the first aspect, the target object and the observation object are vehicles, and the plurality of observation points include: one or more of two corner points of the rear of the vehicle and one corner point of the front of the vehicle.
In an eighth possible implementation form of the method according to the seventh possible implementation form of the first aspect, the observation region includes one or more of a corresponding region of a rear of the vehicle on the image and a corresponding region of a side of the vehicle on the image.
In a ninth possible implementation manner, according to the eighth possible implementation manner of the first aspect, identifying the first observation of the target object in the first image by using the neural network model includes: recognizing the vehicle in the first image by using the neural network model to obtain a ground line, a boundary line, and a Bounding Box, wherein the ground line represents the straight line on which the wheel grounding points of the vehicle are located, and the boundary line represents the dividing line between two planes of the vehicle; and determining the plurality of observation points and the pixel plane coordinates of the plurality of observation points according to the ground line, the boundary line, and the Bounding Box.
The neural network model observes the 2.5D BBox of the target vehicle to obtain the first observation result of the target vehicle, replacing the step of the nonlinear filtering process in which hardware (radar and the like) performs the observation. The image acquired by the monocular camera is processed through the neural network model, so the observation process can be realized and the cost can be saved.
In a tenth possible implementation manner, when only one surface of the vehicle is visible, the residual includes the difference between an azimuth angle and the course angle in the first prediction state mean, where the azimuth angle is calculated from the ordinate and the abscissa in the first prediction state mean.
In embodiments of the application where only one surface of the vehicle is visible, the azimuth angle may be used to improve the accuracy of the course angle calculation.
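A minimal sketch of this residual term follows, assuming the state layout (x, y, v, phi, ...) described in the detailed description; the angle-wrapping step is an implementation detail added here for robustness, not quoted from the application.

import numpy as np

def azimuth_residual(mean_pred):
    # x: longitudinal coordinate, y: lateral coordinate, phi: course angle.
    x, y, _, phi = mean_pred[:4]
    azimuth = np.arctan2(y, x)                       # azimuth of the target position
    diff = azimuth - phi
    return (diff + np.pi) % (2 * np.pi) - np.pi      # wrap to [-pi, pi)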
In a second aspect, an embodiment of the present application provides a measuring apparatus applied to a monocular camera, where the apparatus includes: the image acquisition module is used for acquiring a first image through the monocular camera; a first observation module for identifying a first observation of a target object in the first image using a neural network model; the first observation comprises pixel plane coordinates of a plurality of observation points included in the target object; the prediction module is used for carrying out a nonlinear filtering prediction process according to the initial motion state and the motion model of the target object to obtain a first prediction state and an update parameter of the nonlinear filtering; wherein the initial motion state is a measurement result obtained from a history image before the first image, the initial motion state represents a final motion state of the target object before the camera acquires the first image, and the initial motion state includes an initial course angle of the target object; the estimation module is used for obtaining a first estimation result of the pixel plane coordinates of the plurality of observation points according to the first prediction state; the first calculation module is used for obtaining a residual error according to the first observation result and the first estimation result; and the updating module is used for carrying out a nonlinear filtering updating process according to the first prediction state, the residual error and the updating parameter to obtain a new measurement result of the motion state of the target object, wherein the new measurement result comprises a course angle of the target object.
According to the measuring device provided by the embodiment of the application, the monocular camera is used for collecting images, the images are processed by combining the neural network model, the nonlinear filtering and the motion model, and the course angle of the target object in the images is predicted. The neural network is adopted to process the image to obtain a first observation result, the process of observing by adopting hardware (radar and the like) in the nonlinear filtering process is replaced, namely, the process of observing can be realized by processing the image collected by the monocular camera through the neural network model, the target object does not need to be observed by expensive radar and the like with large installation volume, the price is lower compared with that of a radar camera, the cost can be saved, and the camera volume is small compared with that of the radar and is convenient to install.
The initial motion state and the motion model of the target object are combined to predict the motion state (including the course angle) of the target object through the prediction process of the nonlinear filtering, yielding a prediction result (the first prediction state). The motion track of the target object is estimated according to the prediction result to obtain the first estimation result, and the first prediction state is adjusted according to the residual between the first observation result and the first estimation result, so a more accurate measurement result can be obtained. Therefore, compared with calibrating a single frame image through a radar, or measuring the course angle by matching the target vehicle and the wheel grounding points in a single frame image through a Convolutional Neural Network (CNN), the measurement device provided by the application performs the prediction process of the nonlinear filtering on the initial motion state of the target object obtained from the historical images, can measure the course angle by combining multiple frames of images, produces a more accurate measurement result, realizes the observation of the target object through the neural network model, and can save cost.
In a first possible implementation form according to the second aspect, the initial motion state includes an initial state value and an initial covariance matrix, and the prediction module is further configured to: extracting a plurality of representative points including points corresponding to the initial state values and weights corresponding to each representative point from a Gaussian distribution with a mean value as the initial state value and a covariance matrix as the initial covariance matrix; obtaining a first estimated state value of each representative point according to the motion model and the state values of the plurality of representative points; calculating a first prediction state mean and a first prediction covariance matrix of the first prediction state according to the first estimation state value and the weight of each representative point; converting the first estimation state value of each representative point into a pixel plane coordinate system to obtain a pixel plane coordinate of each representative point; calculating a mean value and a covariance matrix of the plurality of representative points in a pixel plane coordinate system according to the pixel plane coordinate of each representative point and the weight; obtaining a cross correlation matrix in the updated parameters according to the first estimated state value and the first predicted state mean value of each representative point, the pixel plane coordinates of each representative point, the mean values of a plurality of representative points in a pixel plane coordinate system and the weights; and obtaining the Kalman gain in the updated parameters according to the cross correlation matrix and the covariance matrix of the pixel plane coordinates of the plurality of representative points.
In a second possible implementation manner, according to the first possible implementation manner of the second aspect, the estimation module is further configured to: and calculating the world coordinate of each observation point according to the first prediction state mean value, and converting the world coordinate of each observation point into a pixel plane coordinate to obtain a first estimation result of each observation point.
In a third possible implementation manner, according to the first possible implementation manner of the second aspect, the updating module is further configured to: calculating the product of the residual error and the Kalman gain to obtain a state increment; summing the first prediction state mean value and the state increment to obtain a second prediction state mean value of the new measurement result; and obtaining a second prediction covariance matrix of the new measurement result according to the first prediction covariance matrix, the Kalman gain and the cross correlation matrix.
In a fourth possible implementation manner, according to the second aspect or any one of the first to third possible implementation manners of the second aspect, the apparatus further includes: a second observation module, configured to perform the following process before the loss satisfies the convergence condition: moving an optical flow tracking area in an observation area of the target object, and determining a plurality of pairs of sample matching points of the first image and a history image before the first image from the moved optical flow tracking area; the observation region of the target object is a corresponding region of a plane region on the target object on an image, and each pair of sample matching points corresponds to the same point of the target object; performing homography solution according to at least part of the matching points in the plurality of pairs of sample matching points to obtain a homography matrix; obtaining a first course angle change rate according to the homography matrix and a normal vector of a plane where the observation area on the first image is located, wherein the first course angle change rate is the course angle change rate of the target object in the historical image and the first image; and determining the loss according to the difference value of the first course angle change rate and the second course angle change rate and the number of points which meet the homography matrix in the plurality of pairs of sample matching points, wherein the second course angle change rate is the course angle change rate included in the first prediction state mean value.
In a fifth possible implementation manner, according to the fourth possible implementation manner of the second aspect, the residual further includes the difference between the first course angle change rate and the second course angle change rate. The homography matrix is calculated using the side plane and/or the rear plane of the vehicle, an observed value of the course angle change rate is added, and the precision of the measurement result can be improved.
According to a fourth possible implementation manner of the second aspect, in a sixth possible implementation manner, the apparatus further includes: the third observation module is used for obtaining first observation noise of each observation point according to the size of the part of the target object associated with each observation point in the target object; the fourth observation module is used for obtaining second observation noise according to the size of the optical flow tracking area; the prediction module is further to: extracting a plurality of representative points including points corresponding to the initial state values and weights corresponding to each representative point from a Gaussian distribution with a mean value as the initial state value and a covariance matrix as the initial covariance matrix; obtaining a first estimated state value of each representative point according to the motion model and the state values of the plurality of representative points; calculating a first prediction state mean and a first prediction covariance matrix of the first prediction state according to the first estimation state value and the weight of each representative point; converting the first estimation state value of each representative point into a pixel plane coordinate system to obtain the pixel plane coordinate of each representative point; calculating a mean and covariance matrix of the plurality of representative points at pixel plane coordinates based on the pixel plane coordinates and weight of each of the representative points and one or both of the first observation noise and the second observation noise; obtaining a cross correlation matrix in the updating parameters according to the first estimated state value and the first predicted state mean value of each representative point, the pixel plane coordinate of each representative point, the mean value of the pixel plane coordinates of a plurality of representative points and the weight; and obtaining the Kalman gain in the updated parameters according to the cross correlation matrix and the covariance matrix of the pixel plane coordinates of the plurality of representative points.
Errors in the observation results can be corrected through the first observation noise and/or the second observation noise, so a more accurate measurement result is obtained. The first observation noise and the second observation noise are dynamically adjusted during calculation according to the target object in the image (specifically, according to the size of the observed corresponding surface). Dynamically adjusting the observation noise reduces the influence of poor regression precision of the neural network model and low homography solving precision in crossing or straight-going situations, so the measurement precision can be improved.
In a seventh possible implementation form of the method according to the second aspect as such or any one of the first to sixth possible implementation forms of the second aspect, the target object and the observation object are vehicles, and the plurality of observation points include: one or more of two corner points of the rear of the vehicle and one corner point of the front of the vehicle.
In a seventh possible implementation form of the second aspect, in an eighth possible implementation form, the observation region includes one or more of a corresponding region of a rear of the vehicle on the image and a corresponding region of a side of the vehicle on the image.
In a ninth possible implementation manner, according to the eighth possible implementation manner of the second aspect, the first observation module is further configured to: identify the vehicle in the first image by using the neural network model to obtain a ground line, a boundary line, and a Bounding Box, wherein the ground line represents the straight line on which the wheel grounding points of the vehicle are located, and the boundary line represents the dividing line between two planes of the vehicle; and determine the plurality of observation points and the pixel plane coordinates of the plurality of observation points according to the ground line, the boundary line, and the Bounding Box.
The 2.5D BBox of the target vehicle is observed through the neural network model to obtain the first observation result of the target vehicle, replacing the step of the nonlinear filtering process in which hardware (radar and the like) performs the observation; the image collected by the monocular camera is processed through the neural network model to realize the observation process, and the cost can be saved.
In a tenth possible implementation manner, when only one surface of the vehicle is visible, the residual includes the difference between an azimuth angle and the course angle in the first prediction state mean, and the azimuth angle is calculated from the ordinate and the abscissa in the first prediction state mean.
In embodiments of the application where only one surface of the vehicle is visible, the azimuth angle may be used to improve the accuracy of the course angle calculation.
In a third aspect, embodiments of the present application provide a vehicle comprising a measurement device of one or more of the second aspect or the various possible implementations of the second aspect.
In a fourth aspect, an embodiment of the present application provides a measurement apparatus, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to implement the measurement method of the first aspect or of one or more of the multiple possible implementations of the first aspect when executing the instructions.
In a fifth aspect, embodiments of the present application provide a computer program product, which includes computer readable code or a non-transitory computer readable storage medium carrying computer readable code, and when the computer readable code runs in an electronic device, a processor in the electronic device executes a measurement method of the first aspect or one or more of the multiple possible implementations of the first aspect.
In a sixth aspect, an embodiment of the present application provides a non-transitory computer-readable storage medium, on which computer program instructions are stored, where the computer program instructions, when executed by a processor, implement the measurement method of the first aspect or one or more of the multiple possible implementation manners of the first aspect.
These and other aspects of the present application will be more readily apparent from the following description of the embodiment(s).
Drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features, and aspects of the application and, together with the description, serve to explain the principles of the application.
Fig. 1 shows a schematic diagram of an application scenario according to an embodiment of the present application.
Fig. 2 shows a flow chart of a measurement method according to an embodiment of the present application.
FIG. 3 illustrates a schematic diagram of identifying a first observation according to an embodiment of the present application.
Fig. 4 shows a schematic diagram for calculating world coordinates of an observation point according to an embodiment of the present application.
Fig. 5a shows a schematic diagram of a first image according to an embodiment of the application.
Fig. 5b shows a schematic diagram of a first image according to an embodiment of the application.
FIG. 6 illustrates an example of an observation region and an optical-flow tracking region according to an embodiment of the present application.
Fig. 7 is a schematic diagram showing a positional relationship between a target vehicle and an observation vehicle according to an embodiment of the present application.
FIG. 8 shows a block diagram of a measurement device according to an embodiment of the present application.
Fig. 9 shows a block diagram of data flow during a measurement method according to an embodiment of the present application.
FIG. 10 shows a block diagram of a measurement device according to an embodiment of the present application.
Detailed Description
Various exemplary embodiments, features and aspects of the present application will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration. Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present application. It will be understood by those skilled in the art that the present application may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present application.
For the sake of clarity of the description of the method of the present application, the terms referred to in the present application are explained first.
Course angle of the target object: the included angle between the orientation of the target object and the longitudinal axis of the coordinate system of the observation object. The observation object is the object that observes the state of the target object. Taking a vehicle as an example, the course angle of the target vehicle is the included angle between the heading direction of the target vehicle and the longitudinal axis of the coordinate system of the observation vehicle (the host vehicle coordinate system), where the observation vehicle is the vehicle observing the running state of the target vehicle; the coordinate system of the observation vehicle may be a world coordinate system. The regulation and control system predicts the future running track of the target vehicle through the course angle so as to avoid collision with the target vehicle.
In order to solve the technical problem, the application provides a measuring method and a measuring device. The method provided by the embodiment of the application can be applied to terminal equipment with a sensing function or parts arranged on the terminal equipment. The terminal equipment can be transportation equipment such as vehicles, intelligent household equipment, robots and the like. The components arranged on the terminal equipment can be controllers, chips, other sensors such as radar or cameras, other components and the like on the terminal equipment.
The measurement method provided by the embodiment of the application can be applied to a vehicle-mounted system, such as an Advanced Driving Assistance System (ADAS), or to an automatic driving scenario. The method can be used to judge driving states of the target vehicle such as overtaking, merging, driving in the same direction, driving in the opposite direction, and crossing, providing important external information for the regulation and control system. The method can also be applied to the vehicle scheduling system of a city or a factory to predict the driving tracks of vehicles and provide an important basis for vehicle scheduling.
Fig. 1 shows a schematic diagram of an application scenario according to an embodiment of the present application. Fig. 1 shows vehicles and a pedestrian traveling on a road; the vehicles in Fig. 1 include an observation vehicle and a target vehicle, and the measurement method provided by the embodiment of the present application can be applied to the observation vehicle. The observation vehicle can be provided with a camera that collects surrounding images, and can also be provided with a processor that processes the images collected by the camera, identifies the target vehicle in the images, and calculates the course angle and other information of the target vehicle. The camera can be a monocular camera or another type of camera, which is not limited in this application; the monocular camera is low in cost, so the measurement cost can be saved.
The embodiment of the application can be applied to the condition that two sides of the vehicle are visible, and can also be applied to the scene that only one side of the vehicle is visible. The two-sided visible scene may be a side + front combined scene or a side + rear combined scene, and this embodiment is described by taking a side + rear combined scene as an example. A scene that is visible on one side may be front, side, or rear visible.
The processor may include an Application Processor (AP), a modem processor, a Graphic Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a neural-Network Processing Unit (NPU), etc. The different processing units may be separate devices or may be integrated into one or more processors.
Besides the camera, other sensors for sensing the surrounding environment, such as a laser radar and a millimeter wave radar, can be omitted from the observation vehicle shown in Fig. 1; the course angle can be accurately measured using only the images acquired by the camera, which saves cost, and the camera is small compared with a radar, which makes it convenient to install.
The relationship between the observation vehicle and the target vehicle shown in fig. 1 is merely an example of the present application, and the present application is not limited thereto, and any one of the vehicles in fig. 1 may be the observation vehicle, and the vehicle in the image captured by the camera of the observation vehicle may be the target vehicle for observation. In addition, the scenario of vehicle automatic driving shown in fig. 1 is an example of the present application, and the present application is not limited thereto, and for example, the present application may also be applied to other intelligent transportation devices such as an unmanned aerial vehicle and a ship, or an intelligent home device and a robot. In the embodiments of the present application, a method for measuring a heading angle will be described by taking a vehicle as an example.
Fig. 2 shows a flow chart of a measurement method according to an embodiment of the present application. As shown in fig. 2, the measurement method of the embodiment of the present application may include the following steps:
step S200, acquiring a first image through a monocular camera;
step S201, a first observation result of a target object in the first image is identified by adopting a neural network model; the first observation comprises pixel plane coordinates of a plurality of observation points included in the target object;
step S202, a nonlinear filtering prediction process is carried out according to the initial motion state and the motion model of the target object, and a first prediction state and updating parameters of the nonlinear filtering are obtained;
wherein the initial motion state is a measurement result obtained from a history image before the first image, the initial motion state represents a final motion state of the target object before the camera acquires the first image, and the initial motion state includes an initial heading angle of the target object.
Step S203, obtaining a first estimation result of the pixel plane coordinates of the plurality of observation points according to the first prediction state;
step S204, obtaining a residual error according to the first observation result and the first estimation result;
step S205, performing a non-linear filtering updating process according to the first prediction state, the residual error and the updating parameter to obtain a new measurement result of the motion state of the target object, where the new measurement result includes a course angle of the target object.
According to the measurement method provided by the embodiment of the application, the monocular camera collects the image, the image is processed by combining the neural network model, the nonlinear filtering model, and the motion model, and the course angle of the target object in the image is predicted. The neural network processes the image to obtain the first observation result, replacing the step of the nonlinear filtering process in which hardware (radar and the like) observes the target object. In other words, processing the image collected by the monocular camera with the neural network model realizes the observation process, so there is no need to install bulky and expensive radar to observe the target object; the camera is cheaper than a radar, which saves cost, and is much smaller than a radar, which makes it convenient to install.
The initial motion state and the motion model of the target object are combined to predict the motion state (including the course angle) of the target object through the prediction process of the nonlinear filtering to obtain a prediction result (first prediction state), the motion track of the target object is estimated according to the prediction result to obtain a first estimation result, and the first prediction state is adjusted according to the first observation result and the residual error of the first estimation result, so that a more accurate measurement result can be obtained.
Compared with the mode that the course angle is measured by calibrating a single frame image through a radar or matching a target vehicle and a wheel grounding point in the single frame image through a Convolutional Neural Network (CNN) in the related technology, the measuring method provided by the application adopts a prediction process of carrying out nonlinear filtering on the initial motion state of the target object obtained according to a historical image, can be used for measuring the course angle by combining multiple frame images, is more accurate in measuring result, realizes the observation process of the target object through a neural network model, and can save the cost.
In step S201, the neural network model may be CNN, or may be another neural network model, which is not limited in this application. The target object may be a vehicle, which may be a car, truck, bus, etc. type of vehicle, which is not limited in this application. The target object may also be an intelligent home device, such as a sweeping robot. The target object may also be a drone, or the like. The plurality of observation points may be points on the target object in the first image, and taking the target object as a vehicle as an example, the plurality of observation points may include: one or more of two corner points of the rear of the vehicle and one corner point of the head of the vehicle.
In one possible implementation manner, step S201 may include: recognizing the vehicle in the first image by adopting the neural network model to obtain a grounding line, a boundary line and a Bounding Box; and determining the plurality of observation points and the pixel plane coordinates of the plurality of observation points according to the grounding line, the boundary line and the Bounding Box.
FIG. 3 illustrates a schematic diagram of identifying a first observation according to an embodiment of the present application. As shown in Fig. 3, the ground line represents the straight line on which the wheel grounding points of the vehicle are located, the boundary line represents the dividing line between two planes of the vehicle, for example the dividing line between the side of the vehicle and the front plane of the vehicle as shown in Fig. 3, or the dividing line between the side of the vehicle and the rear plane of the vehicle, and the Bounding Box may be the border of a plane of the vehicle identified by the neural network model. The neural network model can extract features of the vehicle in the first image to obtain the vehicle, the ground line of the vehicle, the boundary line, and the Bounding Box. The neural network can determine one observation point from the intersection of the ground line and the boundary line, determine another observation point from the intersection of the Bounding Box and the ground line, and determine the pixel plane coordinates of each observation point from its position in the first image, thereby obtaining the first observation result. This process yields a 2.5D BBox (2.5-Dimension Bounding Box); through the 2.5D BBox labeling process, the depth information of the vehicle can be obtained with only a monocular camera, the labeling difficulty is small, and the method can be adapted to camera modules with different internal and external parameters. The measurement method of the application is applicable to all objects whose shape can be described by a 2.5D BBox, such as ships.
The observation object can also identify the pixel plane coordinates of the observation points through the neural network, obtaining the pixel plane coordinates (U21, V21) of observation point 1 (the rear corner point of the vehicle nearer the observation vehicle), the pixel plane coordinates (U22, V22) of observation point 2 (the corner point of the vehicle at the far end), and the pixel plane coordinates (U23, V23) of observation point 3 (the far front corner point).
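The corner extraction can be implemented with simple homogeneous line intersections; the sketch below is illustrative, and the mapping of intersections to observation points is an assumption based on Fig. 3 rather than something the text fixes exactly.

import numpy as np

def line_intersection(l1, l2):
    # Lines are given in homogeneous form (a, b, c) with a*u + b*v + c = 0;
    # their intersection is the cross product, de-homogenized.
    p = np.cross(l1, l2)
    return p[:2] / p[2]

# Illustrative use (hypothetical variable names; which intersection yields
# which observation point depends on which faces of the vehicle are visible):
# corner_a = line_intersection(ground_line, boundary_line)
# corner_b = line_intersection(ground_line, bbox_edge_line)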
According to the embodiment of the application, the first observation result can be obtained by performing network forward calculation once, the time consumption is low, and the real-time performance is good.
For step S202, the motion model may represent a vehicle motion model, and may be, for example, a Constant Turn Rate and Velocity (CTRV) model, a Constant Turn Rate and Acceleration (CTRA) model, or the like. The motion state of the vehicle at the next moment can be estimated through the vehicle motion model, and the estimation process is described in the embodiment of the application by taking the CTRV model as an example.
In the embodiment of the present application, taking a CTRV model as an example, the state quantities of the model may include:
x: the longitudinal coordinate of the midpoint of the bottom edge of the tail of the target vehicle in the own vehicle coordinate system of the observation vehicle;
y: the abscissa of the midpoint of the bottom edge of the tail of the target vehicle in the own vehicle coordinate system of the observation vehicle;
v: the speed of the target vehicle relative to the observation vehicle in the own vehicle coordinate system of the observation vehicle;
phi: in the own vehicle coordinate system of the observation vehicle, the course angle of the target vehicle relative to the observation vehicle;
dphi: in the own vehicle coordinate system of the observation vehicle, the course angle change rate of the target vehicle relative to the observation vehicle;
l: a length of the target vehicle;
w: the width of the target vehicle.
Thus, the state vector of the model may be represented as X = (x, y, v, phi, dphi, L, W), where x, y, v, phi, dphi represent coordinates of the target vehicle relative to the observation vehicle in a world coordinate system with the observation vehicle as the reference point, with the specific meanings given above. The value of the motion state of the target object may be represented by the state vector. In the embodiment of the present application, it is assumed that the motion state of the target object follows a Gaussian distribution, so the distribution of the motion state can be represented by a mean and a covariance, with the mean represented by the state vector of the motion state. In the embodiment of the application, it is also assumed that the motion of the target object is nonlinear; when the motion state is predicted, the nonlinear distribution is approximated by extracting representative points near the mean point, applying the nonlinear transformation to them, and so on, which yields an approximation closer to the real distribution.
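For reference, a standard CTRV transition over one time step dt under the state layout above is sketched below; treating the length and width as constant and using the straight-line limit for near-zero turn rate are the usual CTRV conventions rather than details quoted from the application.

import numpy as np

def ctrv_predict(state, dt):
    # state = (x, y, v, phi, dphi, L, W); x longitudinal, y lateral,
    # phi course angle, dphi course angle change rate.
    x, y, v, phi, dphi, L, W = state
    if abs(dphi) > 1e-6:
        x += (v / dphi) * (np.sin(phi + dphi * dt) - np.sin(phi))
        y += (v / dphi) * (np.cos(phi) - np.cos(phi + dphi * dt))
    else:
        # Near-zero turn rate: plain constant-velocity motion along phi.
        x += v * np.cos(phi) * dt
        y += v * np.sin(phi) * dt
    phi += dphi * dt
    return np.array([x, y, v, phi, dphi, L, W])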
In the embodiment of the application, nonlinear filtering is performed by combining a motion model, and a multi-frame image acquired by a monocular camera is combined to predict the course angle of a target object in an image. Specifically, the motion state (course angle) of the target object in the current frame image is obtained by predicting according to the motion state measured by the previous image frame, the motion state of the target object in the current frame image is used as the initial motion state of the motion state prediction of the target object in the next frame image, and the motion state of the target object is measured in an iterative manner to obtain the course angle.
Therefore, in step S202, the observation object may perform a prediction process of the nonlinear filtering according to the motion model and an initial motion state of the target object, so as to obtain a first prediction state and an update parameter of the nonlinear filtering, where the initial motion state of the target object is a measurement result obtained according to a history image before the first image, and the initial motion state represents a final motion state of the target object before the first image is acquired by the camera.
In one possible implementation, the initial motion state includes an initial state value and an initial covariance matrix, and the initial state value includes: in a world coordinate system with the observation object as the reference point, the ordinate, the abscissa, the speed, the course angle, and the course angle change rate of the target object relative to the observation object, as well as the length and the width of the target object. The initial state value of the initial motion state may be represented as X[0] = [x0, y0, v0, phi(0), dphi(0), L, W]. The initial covariance matrix may be denoted sigma.
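A toy initialization under this layout might look as follows; all numbers are placeholders chosen for illustration, not values from the application.

import numpy as np

X0 = np.array([12.0,   # x0: longitudinal position of the rear-edge midpoint (m)
               -1.5,   # y0: lateral position (m)
               3.0,    # v0: relative speed (m/s)
               0.05,   # phi(0): initial course angle (rad)
               0.00,   # dphi(0): initial course angle change rate (rad/s)
               4.5,    # L: vehicle length (m)
               1.8])   # W: vehicle width (m)
sigma0 = np.diag([1.0, 1.0, 2.0, 0.1, 0.1, 0.5, 0.2]) ** 2   # initial covariance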
In one possible implementation, for step S202, the nonlinear filtering may be Extended Kalman Filtering (EKF) or Unscented Kalman Filtering (UKF). In the embodiment of the present application, the prediction process and the update process are described by taking the UKF as an example.
In this embodiment, in step S202, performing a prediction process of a nonlinear filter according to an initial motion state and a motion model of a target object, to obtain a first prediction state and update parameters of the nonlinear filter, where the method may include:
step S2021, extracting a plurality of representative points including a point corresponding to the initial state value and a weight corresponding to each representative point from a gaussian distribution whose mean is the initial state value and whose covariance matrix is the initial covariance matrix;
step S2022, obtaining a first estimated state value of each representative point according to the motion model and the state values of the plurality of representative points;
step S2023, calculating a first prediction state mean and a first prediction covariance matrix of the first prediction state according to the first estimated state value and the weight of each of the representative points;
step S2024, converting the first estimated state value of each representative point into a pixel plane coordinate system to obtain a pixel plane coordinate of each representative point;
step S2025, calculating a mean value and a covariance matrix of the plurality of representative points in the pixel plane coordinate according to the pixel plane coordinate of each representative point and the weight;
step S2026, obtaining a cross correlation matrix in the updated parameters according to the first estimated state value and the first predicted state mean value of each representative point, the pixel plane coordinates of each representative point, the mean values of a plurality of representative points in a pixel plane coordinate system, and the weights;
step S2027, obtaining Kalman gain in the updated parameters according to the cross correlation matrix and the covariance matrix of the multiple representative points in the pixel plane coordinate.
Because the motion of the target object may be nonlinear and its distribution may be irregular, the original Gaussian distribution can be subjected to a nonlinear transformation to obtain an approximate Gaussian distribution for predicting the future motion state. In the embodiment of the present application, a plurality of representative points (referred to as sigma points), including a point corresponding to the initial state value, and a weight corresponding to each representative point are extracted from a Gaussian distribution whose mean is the initial state value and whose covariance matrix is the initial covariance matrix.
In one possible implementation, the number of sigma points may be determined according to the dimension of the state vector (state value). For example, the number N of sigma points may be N = 2n + 1, where n represents the dimension of the state vector; in the embodiment of the present application, the dimension of the state vector X[0] is 7, so 15 sigma points may be extracted. It should be noted that extracting 15 sigma points is only an example of the present application: the more sigma points are extracted, the more accurate the measurement, but the larger the corresponding amount of calculation.
Specifically, the representative point can be calculated by the following formula:
χ[0] = X[0]

χ[i] = X[0] + (√((n+λ)·Σ))_i, for i = 1, …, n

χ[i] = X[0] − (√((n+λ)·Σ))_{i−n}, for i = n+1, …, 2n

where (√((n+λ)·Σ))_i denotes the i-th column of the matrix square root of (n+λ)·Σ.
wherein χ represents the sigma point matrix: each column of the χ matrix is the state vector of one sigma point, the number of columns of the χ matrix equals the number of extracted sigma points, and the number of rows equals the dimension of the state vector. For example, in the embodiment of the present application the dimension of the state vector is 7, so if 15 sigma points are extracted the size of the χ matrix is 7 × 15. λ is a parameter controlling the distance of the extracted sigma points from the mean, and χ[0] is the extracted point corresponding to the initial state value.
The weight representing the point correspondence can be calculated by the following formula:
w[0] = λ/(n+λ)

w[i] = 1/(2·(n+λ)), for i = 1, …, 2n
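A minimal Python/NumPy sketch of this sigma point extraction (step S2021), assuming a Cholesky factor is used as the matrix square root and λ = 1.0 purely for illustration, might be:

```python
import numpy as np

def extract_sigma_points(x0, sigma0, lam=1.0):
    """Extract 2n+1 sigma points and their weights from N(x0, sigma0) (step S2021)."""
    n = x0.shape[0]                                    # state dimension, 7 in this example
    sqrt_mat = np.linalg.cholesky((n + lam) * sigma0)  # matrix square root of (n+λ)·Σ
    chi = np.zeros((n, 2 * n + 1))                     # one sigma point per column
    chi[:, 0] = x0
    for i in range(n):
        chi[:, i + 1] = x0 + sqrt_mat[:, i]
        chi[:, n + 1 + i] = x0 - sqrt_mat[:, i]
    w = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    w[0] = lam / (n + lam)
    return chi, w
```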
for step S2022, the process of nonlinear transformation may process the state values of the representative points by using a motion model to obtain a first estimated state value of each of the representative points.
Specifically, the state vector χ [ i ] for each sigma point may be substituted into the following formula to obtain the first estimated state value corresponding to each sigma point:
For a sigma point with state χ[i] = [x, y, v, phi, dphi, L, W] and a time interval Δt between the two frames:

x' = x + (v/dphi)·(sin(phi + dphi·Δt) − sin(phi))

y' = y + (v/dphi)·(cos(phi) − cos(phi + dphi·Δt))

v' = v + a_v·Δt

phi' = phi + dphi·Δt

dphi' = dphi + a_phi·Δt

L' = L, W' = W
where a_v represents the acceleration noise in the longitudinal direction and a_phi represents the angular acceleration noise. Assuming the calculated χ'[i] = [xi, yi, vi, phi(i), dphi(i), L, W], χ'[i] represents the first estimated state value.
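A minimal sketch of this propagation (step S2022) under the constant turn rate and velocity assumptions above might look as follows; the sign conventions, the handling of the near-zero turn rate case and the function name are assumptions, and the noise terms default to zero.

```python
import numpy as np

def ctrv_propagate(chi, dt, a_v=0.0, a_phi=0.0):
    """Propagate each sigma point (x, y, v, phi, dphi, L, W) over dt with a CTRV-style model."""
    chi_pred = chi.copy()
    for i in range(chi.shape[1]):
        x, y, v, phi, dphi, L, W = chi[:, i]
        if abs(dphi) > 1e-6:
            x += v / dphi * (np.sin(phi + dphi * dt) - np.sin(phi))
            y += v / dphi * (np.cos(phi) - np.cos(phi + dphi * dt))
        else:                                   # near-zero turn rate: straight-line motion
            x += v * np.cos(phi) * dt
            y += v * np.sin(phi) * dt
        chi_pred[:, i] = [x, y, v + a_v * dt, phi + dphi * dt, dphi + a_phi * dt, L, W]
    return chi_pred
```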
Then, in step S2023, a first predicted state mean and a first predicted covariance matrix of the first predicted state may be calculated according to the first estimated state value and the corresponding weight of each representative point, and specifically, the first predicted state mean and the first predicted covariance matrix may be calculated by using the following formulas:
X' = Σ_{i=0…2n} w[i]·χ'[i]

Σ' = Σ_{i=0…2n} w[i]·(χ'[i] − X')·(χ'[i] − X')^T
where X' represents the mean of the approximate Gaussian distribution, that is, the first predicted state mean, and can be represented as [x', y', v', phi', dphi', L', W']. Σ' represents the covariance matrix of the approximate Gaussian distribution, that is, the first prediction covariance matrix.
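The weighted mean and covariance of step S2023 can be computed as in the following sketch (illustrative function name, NumPy broadcasting over the sigma-point columns):

```python
import numpy as np

def predicted_mean_cov(chi_pred, w):
    """First predicted state mean X' and first prediction covariance matrix Σ' (step S2023)."""
    x_mean = chi_pred @ w              # weighted sum over the sigma-point columns
    d = chi_pred - x_mean[:, None]
    cov = (w * d) @ d.T                # sum_i w[i]·(χ'[i]-X')·(χ'[i]-X')^T
    return x_mean, cov
```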
The observation object can also convert the first estimation state values of the plurality of representative points into a pixel plane coordinate system to obtain the pixel plane coordinate of each representative point; and calculating a mean value and a covariance matrix of the plurality of representative points in a pixel plane coordinate system according to the pixel plane coordinate and the weight of each representative point.
Specifically, the mean and covariance matrix of the plurality of representative points in the pixel plane coordinate system can be calculated according to the following formulas:

Z = F·χ'

Z' = Σ_{i=0…2n} w[i]·Z[i]

S = Σ_{i=0…2n} w[i]·(Z[i] − Z')·(Z[i] − Z')^T
Wherein F is a transformation matrix for transforming the state vector (first estimated state value) of the present application into a pixel plane coordinate, and the process of determining the transformation matrix is the same as that in the related art, and is not described again. Z represents a matrix composed of pixel plane coordinates of the representative point, Z' is a mean value of the pixel plane coordinates of the representative point, and S is a covariance matrix of the pixel plane coordinates of the representative point.
For step S2026, the observed object may calculate a cross-correlation matrix according to the following formula:
T = Σ_{i=0…2n} w[i]·(χ'[i] − X')·(Z[i] − Z')^T
where T represents a cross-correlation matrix.
According to the cross-correlation matrix T and the covariance matrix S of the pixel plane coordinates, the Kalman gain K = T·S^(-1) can be obtained by calculation.
It should be noted that the sequence numbers of steps S2021 to S2027 only represent different calculation procedures, and do not represent a sequential relationship, for example, step S2024 may be executed while step S2023 is executed.
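Steps S2024 to S2027 can be sketched together as follows, assuming the pixel-plane conversion is a plain matrix F as described above and optionally adding an observation noise term Q (see the later discussion of the first and second observation noise); all names are illustrative.

```python
import numpy as np

def update_parameters(chi_pred, x_mean, F, w, Q=None):
    """Pixel-plane statistics, cross-correlation matrix T and Kalman gain K (steps S2024-S2027)."""
    Z = F @ chi_pred              # project each first estimated state value to the pixel plane
    z_mean = Z @ w
    dz = Z - z_mean[:, None]
    dx = chi_pred - x_mean[:, None]
    S = (w * dz) @ dz.T           # covariance of the pixel plane coordinates
    if Q is not None:             # optional observation noise (e.g. Q1 and/or Q2)
        S = S + Q
    T = (w * dx) @ dz.T           # cross-correlation between state space and pixel space
    K = T @ np.linalg.inv(S)      # Kalman gain K = T·S^(-1)
    return z_mean, S, T, K
```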
In a possible implementation manner, in step S203, obtaining a first estimation result of the pixel plane coordinates of the plurality of observation points according to the first prediction state may include:
and calculating the world coordinate of each observation point according to the first prediction state mean value, and converting the world coordinate of each observation point into a pixel plane coordinate to obtain a first estimation result of each observation point.
In one possible implementation manner, the ordinate and abscissa of the target object relative to the observation object are the ordinate and abscissa of the vehicle tail midpoint relative to the vehicle head midpoint of the observation vehicle, and the calculating the world coordinate of each observation point according to the first predicted state mean value may include: and calculating the world coordinate of each observation point according to the ordinate, the abscissa, the course angle, the length and the width in the first prediction state mean value.
FIG. 4 shows a schematic diagram of computing world coordinates of an observation point according to an embodiment of the present application. As shown in fig. 4, the world coordinates of observation point 1 (the corner point of the near vehicle tail) can be calculated by geometric relationship as:
(x' − W'/2·sin(phi'), y' − W'/2·cos(phi'), 0)^T; the world coordinates of observation point 2 (the far vehicle tail corner point) are (x' + W'/2·sin(phi'), y' + W'/2·cos(phi'), 0)^T; and the world coordinates of observation point 3 (the far vehicle head corner point) are:
(x' − W'/2·sin(phi') + L'·cos(phi'), y' − W'/2·cos(phi') − L'·sin(phi'), 0)^T.
The observation object can convert the world coordinates of each observation point into pixel plane coordinates, obtaining the pixel plane coordinates (U11, V11) of observation point 1, (U12, V12) of observation point 2 and (U13, V13) of observation point 3.
In a possible implementation manner, converting a world coordinate into a pixel plane coordinate may be implemented by multiplying the world coordinate by a conversion matrix to obtain the corresponding pixel plane coordinate. The conversion matrix may be obtained through the rigid body transformation, perspective projection and discretization processes in the related art: a rigid body transformation (rotation and translation) of the world coordinate yields the camera coordinate, a perspective projection of the camera coordinate (multiplication by a matrix obtained from the camera parameters) yields the image coordinate, and discretizing the image coordinate yields the pixel plane coordinate.
The first estimation results (U11, V11), (U12, V12) and (U13, V13) can be obtained by the above process.
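A minimal sketch of this estimation step, with the conversion matrix treated as an assumed 3×4 projection followed by a homogeneous division, might be:

```python
import numpy as np

def observation_points_world(x, y, phi, L, W):
    """World coordinates of observation points 1-3 from the first predicted state mean (see fig. 4)."""
    s, c = np.sin(phi), np.cos(phi)
    p1 = np.array([x - W / 2 * s, y - W / 2 * c, 0.0])                    # near vehicle tail corner
    p2 = np.array([x + W / 2 * s, y + W / 2 * c, 0.0])                    # far vehicle tail corner
    p3 = np.array([x - W / 2 * s + L * c, y - W / 2 * c - L * s, 0.0])    # far vehicle head corner
    return p1, p2, p3

def world_to_pixel(p_world, conversion_matrix):
    """Project a world point with a 3x4 conversion matrix (rigid body + perspective + discretization)."""
    uvw = conversion_matrix @ np.append(p_world, 1.0)
    return uvw[:2] / uvw[2]                                               # (U, V)
```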
For step S204, the first estimation result may be subtracted from the first observation result to obtain the residual Res = ((U21, V21) − (U11, V11), (U22, V22) − (U12, V12), (U23, V23) − (U13, V13)).
After the update parameter, the residual and the first prediction state are calculated, the first prediction state may be updated to obtain a measurement result of the first image, that is, a new measurement result of the motion state of the target object in the first image.
In one possible implementation manner, step S205 may include: calculating the product of the residual error and the Kalman gain to obtain a state increment; summing the first prediction state mean value and the state increment to obtain a second prediction state mean value of the new measurement result; and obtaining a second prediction covariance matrix of the new measurement result according to the first prediction covariance matrix, the Kalman gain and the cross correlation matrix.
In particular, the new measurement result may be calculated according to the following formula:
X[1]=X’+K×Res
Σ(1) = Σ' − K·T^T
where Res represents the residual, and X[1] and Σ(1) represent the motion state of the target object in the first image; they also serve as the initial state value and initial covariance matrix of the target object for predicting the second frame image, and phi in X[1] can be used as the heading angle of the target vehicle when predicting the second frame image. For the second image subsequent to the first image, the process of steps S200-S205 can be repeated with X[1] and Σ(1) as the initial motion state to obtain the measurement result of the second image.
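A minimal sketch of this update process, using the covariance form written with the cross-correlation matrix as above, might be:

```python
import numpy as np

def ukf_update(x_mean, cov, K, T, res):
    """Update process: add the state increment K·Res and shrink the covariance (step S205)."""
    x_new = x_mean + K @ res          # X[1] = X' + K × Res
    cov_new = cov - K @ T.T           # Σ(1) = Σ' − K·T^T
    return x_new, cov_new
```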
According to the measuring method provided by the embodiment of the application, the monocular camera is used for collecting the image, and the image is processed by combining the neural network model, the nonlinear filtering model and the motion model, so that the accurate course angle of the target object can be obtained. The course angle is measured by combining the multi-frame images, the measurement result is more accurate, the observation process of the target object is realized through the neural network model, the requirement on hardware is low, only a monocular camera is needed, and the cost can be saved.
In a possible implementation manner, the measurement method in the embodiment of the present application further includes: and obtaining a first observation noise of each observation point according to the size of the part of the target object associated with each observation point in the target object.
In the embodiment of the present application, the part of the target object associated with an observation point may be a plane on the target vehicle, such as the vehicle side plane or the vehicle tail plane, and the size of that part may refer to the area of the plane. For example, taking the schematic diagram shown in fig. 4 as an example, the part associated with observation point 2 is the vehicle tail plane, and the part associated with observation point 3 is the left side surface of the vehicle. The first observation noise is introduced mainly because, when the target vehicle is at certain angles, the first estimation result and the first observation result of the observation points may be affected by noise and the obtained result may be inaccurate, so the observation noise can be used for correction.
For example, for observation point 2, when the target vehicle crosses in front of the observation vehicle, more of the vehicle side is exposed to the camera and the visible area of the vehicle tail is smaller, which affects the accuracy of the first observation result and the first estimation result of observation point 2 calculated in the embodiment of the present application. Fig. 5a shows a schematic diagram of a first image according to an embodiment of the application. As shown in fig. 5a, the white frame indicates the 2.5D BBox of the target vehicle obtained by the CNN; observation point 2 of the target vehicle is not completely exposed to the line of sight, so the calculated result for observation point 2 may have a relatively large error and low reliability. The w indicated in fig. 5a is the visible width of the vehicle tail plane, and the first observation noise corresponding to observation point 2 may have a monotonically decreasing relationship with this visible width, for example, an inversely proportional relationship with w.
For observation point 3, when the target vehicle is in front of the observation vehicle and driving straight, the visible area of its side is small; for example, the visible side area is small when a vehicle in the adjacent lane is about to complete a lane merge. In the extreme case where the target vehicle is directly in front of the observation vehicle, observation point 3 is hardly visible and the visible side area is very small. In the above cases, the calculated result for observation point 3 may have a large error and low reliability, and the error can be corrected with the observation noise. Fig. 5b shows a schematic diagram of a first image according to an embodiment of the present application. As shown in fig. 5b, the white frame indicates the 2.5D BBox of the target vehicle obtained through the CNN, and observation point 3 of the target vehicle is not completely exposed to the line of sight. The i indicated in fig. 5b is the visible length of the vehicle side, and the first observation noise corresponding to observation point 3 may be monotonically decreasing in i, for example, inversely proportional to i.
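A minimal sketch of such a first observation noise, assuming a simple inverse-proportional form with an illustrative scale factor and function name, might be:

```python
def first_observation_noise(visible_size, scale=1.0, eps=1e-3):
    """Observation noise that shrinks as the visible tail width w (or side length i) grows."""
    return scale / max(visible_size, eps)   # e.g. inversely proportional, as suggested for fig. 5a/5b
```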
In the present embodiment, step S202 is different from the above process. Step S2025 in the above embodiment may be replaced with: and calculating a mean value and a covariance matrix of the plurality of representative points in a pixel plane coordinate system according to the pixel plane coordinate and the weight of each representative point and the first observation noise. Specifically, the mean and covariance matrices of a plurality of representative points in the pixel plane coordinate system can be calculated according to the following formulas:
Z = F·χ'

Z' = Σ_{i=0…2n} w[i]·Z[i]

S = Σ_{i=0…2n} w[i]·(Z[i] − Z')·(Z[i] − Z')^T + Q1

where Q1 represents the first observation noise.
The error of the observation result can be corrected through the first observation noise, so that a more accurate measurement result is obtained. The first observation noise is dynamically adjusted during the calculation according to the target object in the image (it is related to the size of the observed corresponding surface); dynamically adjusting the observation noise reduces the influence caused by poor regression precision of the neural network model and low solution precision of the homography matrix in the crossing case, and can therefore improve the measurement precision.
In a possible implementation manner, the measurement method in the embodiment of the present application may also observe the course angle change rate through a CNN and a homography solution process. The homography matrix can be calculated using the vehicle side and/or the vehicle tail plane, the course angle change between two frames of images is calculated through the homography matrix, and the course angle change rate dyaw is then obtained by dividing this change by the update period. By adding an observed value of the course angle change rate and adjusting the measurement result according to the residual between the observed value and the estimated value, the measurement precision can be improved.
Specifically, the measurement method according to the embodiment of the present application may further include the following steps:
before the loss satisfies the convergence condition, the following process is performed:
moving an optical flow tracking area in an observation area of the target object, and determining a plurality of pairs of sample matching points of the first image and a history image before the first image from the moved optical flow tracking area; the observation region of the target object is a corresponding region of a plane region on the target object on an image, and each pair of sample matching points corresponds to the same point of the target object;
performing homography solution according to at least part of the matching points in the plurality of pairs of sample matching points to obtain a homography matrix;
obtaining a first course angle change rate according to the homography matrix and a normal vector of a plane where the observation area on the first image is located, wherein the first course angle change rate is the course angle change rate of the target object in the historical image and the first image;
and determining the loss according to the difference value of the first course angle change rate and the second course angle change rate and the number of points which meet the homography matrix in the plurality of pairs of sample matching points, wherein the second course angle change rate is the course angle change rate included in the first prediction state mean value.
The convergence condition may be that the calculated loss is less than a loss threshold, or that a set maximum number of loop calculations is reached, in which case the calculation result corresponding to the iteration with the smallest loss among the multiple calculated losses is selected. According to the measurement method of the embodiment of the present application, the optimal optical flow tracking area is determined through the loss function, which reduces the influence of homography solution errors caused by non-planar regions.
In an embodiment of the present application, taking the target object as a target vehicle as an example, the observation region includes one or more of the corresponding region of the vehicle tail on the image and the corresponding region of the vehicle side on the image. The optical flow tracking area is a planar area within the observation area; fig. 6 shows an example of the observation areas and optical flow tracking areas according to an embodiment of the present application. As shown in fig. 6, the white solid line frames indicate the 2.5D BBox of the target vehicle obtained by the CNN, namely the observation area of the vehicle side and the observation area of the vehicle tail. In the observation area corresponding to the side of the target vehicle (the left solid line frame in fig. 6), the optical flow tracking area is a planar area in the lower middle of the side, that is, the area indicated by the left dashed frame in fig. 6; in the observation area corresponding to the vehicle tail (the right solid line frame in fig. 6), the optical flow tracking area is a planar area in the middle of the tail, that is, the area formed by the right dashed line in fig. 6.
In an embodiment of the present application, the observation region and the optical flow tracking region in the image may be identified by CNN. The plane area of the side surface or the tail of the vehicle can be different for different vehicles, so that the optical flow tracking area can be moved in the observation area, and the optimal optical flow tracking area can be determined through a loss function.
Thus, for each optical flow tracking area obtained by the movement: a plurality of pairs of sample matching points of the first image and a history image before the first image are determined, homography solution is performed on at least some of the pairs of sample matching points to obtain a homography matrix, and a first course angle change rate is obtained according to the homography matrix and the normal vector of the plane where the observation area on the first image is located.
In one possible implementation, the observation object may determine the pairs of sample matching points of the first image and the history image by means of feature point matching or optical flow tracking. The history image may be one of the multiple frames of images before the first image, for example the frame immediately before the first image; specifically, the history image is the frame of image corresponding to the initial motion state. For example, if the first image is the Nth frame image, the observation object obtains the motion state of the Mth frame image according to the Mth frame image and the motion state obtained from the history images before the Mth frame image, takes the motion state of the Mth frame image as the initial motion state, and executes the measurement method provided by the embodiment of the present application according to the Mth frame image and the Nth frame image to obtain the new measurement result (the measurement result of the Nth frame image). Feature point matching or optical flow tracking is then performed according to the Mth frame image and the Nth frame image to obtain the plurality of pairs of sample matching points. N and M are both positive integers, and N − M is a positive integer; for example, N − M may be 1 or 2, which is not limited in the present application.
The homography solution process based on the matching points can be found in the related art: for example, at least 4 pairs of matching points are selected from the plurality of pairs of sample matching points, the coordinates of the 4 pairs of matching points are substituted into the equation set of the homography solution, and the rotation matrix R and the plane normal vector n can be obtained by solving.
The system of equations for the homography solution is:
p2 = H·p1

H = K·(R + t·n^T/d)·K^(-1)
wherein p1 and p2 are a pair of matching points, H denotes the homography matrix, K denotes the camera intrinsic parameter matrix, R denotes the rotation matrix from the history image to the first image, t denotes the translation from the history image to the first image, n denotes the normal vector of the object plane, and d denotes the distance from the camera to the object plane in the history image. H is calculated according to the obtained matching points and then decomposed to obtain two groups of solutions, each group comprising a rotation matrix R and a plane normal vector n; the unique R can be determined from the two groups of solutions according to the normal vector of the plane where the observation region on the first image is located (that is, the normal vector of the plane of the observation region on the real target vehicle). R includes roll, pitch and yaw, that is, the relative roll angle change, relative pitch angle change and relative heading angle change between the observation object and the target object.
And obtaining the course angle change rate according to the relative course angle change and the time between the historical image and the first image. It should be noted that, for a scene where two surfaces of the vehicle are visible, two homography-solved heading angle change rates can be obtained: dyaw side and dyaw tail.
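In practice, the matching, homography estimation and decomposition can be done with standard computer-vision tooling. The following is a minimal Python sketch using OpenCV; note that cv2.decomposeHomographyMat may return up to four candidate groups rather than the two discussed above, the RANSAC reprojection threshold of 3.0 pixels and the yaw-extraction convention are assumptions, and the function and parameter names are illustrative. expected_normal is the plane normal of the observed vehicle surface expressed in the first-frame camera coordinate system.

```python
import cv2
import numpy as np

def heading_rate_from_homography(pts_prev, pts_cur, K_intrinsic, expected_normal, dt):
    """Course angle change rate between two frames from matched points on one vehicle plane."""
    H, mask = cv2.findHomography(pts_prev, pts_cur, cv2.RANSAC, 3.0)
    num, Rs, Ts, Ns = cv2.decomposeHomographyMat(H, K_intrinsic)   # candidate (R, t, n) groups
    # keep the candidate whose plane normal is closest to the expected normal of the
    # observed vehicle plane, which resolves the decomposition ambiguity
    best = min(range(num), key=lambda i: np.linalg.norm(Ns[i].ravel() - expected_normal))
    R = Rs[best]
    yaw = np.arctan2(R[1, 0], R[0, 0])          # relative heading change (rad), assumed convention
    return yaw / dt, int(mask.sum())            # dyaw and the RANSAC inlier count
```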
One example of solving for the rate of change of heading angle is as follows:
1) And obtaining matching points through feature point matching or an optical flow method.
2) In order to ensure enough parallax, homography solution is performed on images several frames apart (more than two frames) to obtain the homography matrix. The homography solution is applied to planes, taking the middle region of the vehicle tail and the middle-lower region of the vehicle side (half the height of the BBox) as the optical flow tracking regions, as shown in fig. 6.
3) The homography solution will obtain two groups of solutions, and the unique solution can be determined by the normal vector of the target plane where the observation region is located on the first image. Solving the homography matrix gives two groups of solutions, each group comprising (roll, pitch, yaw, n), where roll, pitch and yaw respectively represent the relative roll angle change, relative pitch angle change and relative heading angle change between the observation vehicle and the target vehicle, and n contains the components along each axis (x, y and z) of the normal vector of the plane where the observation area of the target vehicle is located, expressed in the camera coordinate system of the first frame. The unique roll, pitch and yaw can be determined by combining the heading angle and the normal vector of the target plane. Taking the vehicle tail plane as an example, assume the heading angle of the target vehicle is 10°; the z-axis component of the normal vector of the target plane in the own vehicle coordinate system is then cos(10°) = 0.984, and the two groups of solutions of the homography decomposition are:
Solution 1: (0.001313°, −0.001348°, 0.002651°, (0.376, 0.913, −0.153)^T);
Solution 2: (0.0366°, 0.0628°, 0.5494°, (0.047, 0.239, 0.969)^T).
The z-axis component 0.969 of the target plane normal vector corresponding to the solution 2 in the coordinate system of the self-vehicle is closer to 0.984, so that the solution 2 is a unique solution, and the obtained course angle change is 0.5494 degrees.
That is to say, decomposing the homography matrix gives two groups of solutions. One component of the normal vector of the plane where the observation area of the target vehicle is located can be obtained from the course angle of the target vehicle, and the unique solution is determined according to how close this component is to the corresponding normal vector component in each of the two groups of solutions; the rotation matrix in the unique solution contains the course angle change. Determining the unique value of the homography solution through the normal vector of the target plane solves the ambiguity problem of the homography decomposition.
4) The optical flow tracking area is moved up and down near the preset area, and the optimal optical flow tracking area is determined through a loss function. Because the planar area of the vehicle side or tail can differ between vehicles, moving the optical flow tracking area up and down near the preset area yields a series of candidate optical flow tracking areas, from which the optimal one can be determined through a loss function. For each candidate optical flow tracking area, two loss terms, lossdyaw and lossinlier, can be obtained. lossdyaw = |dyawpre − dyawhomo|, where dyawpre is the course angle change rate predicted by the UKF (that is, the second course angle change rate) and dyawhomo is the course angle change rate obtained from the homography solution; lossdyaw ensures that the calculation result is consistent with the historical track. lossinlier = 1/inlier, where inlier is the number of inliers in the random sample consensus (RANSAC) process; lossinlier ensures that enough matching points lie on the same plane, which improves the solution precision of the homography matrix. Combining the two terms, the total loss function is loss = λ·lossdyaw + lossinlier, where λ is a weight coefficient that balances the two terms. Finally, among the series of candidate optical flow tracking areas, the area with the smallest loss is selected as the optimal optical flow tracking area (a minimal sketch of this loss appears after this list).
The optimal optical flow tracking area is determined through the loss function, and the course angle change rate of observation is determined according to the optimal optical flow tracking area, so that the observation precision can be improved, and the precision of a measurement result is improved.
5) The course angle changes of the vehicle side and the vehicle tail are solved respectively on the final optical flow tracking areas and divided by the update period to obtain the course angle change rates dyaw side and dyaw tail.
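The loss described in step 4) can be written compactly as in the following sketch; it assumes the two rates, the inlier count and the weight λ are already available, and the function name is illustrative.

```python
def tracking_area_loss(dyaw_pred, dyaw_homo, n_inliers, lam=1.0):
    """Total loss for one candidate optical flow tracking area (step 4 above)."""
    loss_dyaw = abs(dyaw_pred - dyaw_homo)    # consistency with the UKF-predicted rate
    loss_inlier = 1.0 / max(n_inliers, 1)     # favour areas with many RANSAC inliers
    return lam * loss_dyaw + loss_inlier
```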
In a possible implementation manner, the residual error in this embodiment further includes a difference between the first heading angle change rate and the second heading angle change rate. And subtracting the predicted course angle change rate dphi (namely the second course angle change rate) from the dyaw side and/or the dyaw tail respectively to obtain corresponding residual errors, wherein the residual errors can be used for state updating.
The homography matrix is calculated by utilizing the side surface and/or the vehicle tail plane, the observed value of the course angle change rate is increased, and the precision of the measuring result can be improved.
In one possible implementation, the method further includes: and obtaining a second observation noise according to the size of the area tracked by the optical flow.
Wherein the second observed noise represents an observed noise of the rate of change of the heading angle. When the visible area of the tail (side surface) is small, the area of the optical flow tracking area of the tail (side surface) is small, the tracking points are few, the parallax is small, the solving precision of the homography matrix is low, and the error of the dyaw tail (dyaw side) is large. For example, when the vehicle traverses, the vehicle tail area is small, and the reliability of the obtained heading angle change rate is low. Therefore, the observation noise at the dyaw tail (dyaw side) and the area of the tail (side) optical flow tracking area are in a monotonically decreasing relationship, and may be in an inversely proportional relationship.
In the present embodiment, step S202 is different from the above process. Step S2025 in the above embodiment may be replaced with:
and calculating a mean value and a covariance matrix of the plurality of representative points in a pixel plane coordinate system according to the pixel plane coordinate and the weight of each representative point and the second observation noise.
Specifically, the formula for solving the covariance matrix is:
S = Σ_{i=0…2n} w[i]·(Z[i] − Z')·(Z[i] − Z')^T + Q2

where Q2 represents the second observation noise.
In another possible implementation manner, step S2025 may also be: and calculating a mean value and a covariance matrix of the plurality of representative points in a pixel plane coordinate system according to the pixel plane coordinate and the weight of each representative point and the first observation noise and the second observation noise.
Specifically, the formula for solving the covariance matrix is:
S = Σ_{i=0…2n} w[i]·(Z[i] − Z')·(Z[i] − Z')^T + Q1 + Q2
the error of the observation result can be corrected through the first observation noise and/or the second observation noise, so that a more accurate measurement result is obtained. In addition, in the embodiment of the present application, the first observation noise and the second observation noise are dynamically adjusted (related to the observed size of the corresponding surface) according to the target object in the image during the calculation process, and the observation noise is dynamically adjusted, so that the influence caused by poor CNN regression accuracy and low homography matrix solution accuracy in the case of crossing or straight lines is reduced, and the measurement accuracy can be improved.
In a possible implementation manner, when only one surface of the vehicle is visible, the residual includes the difference between an azimuth angle and the heading angle in the first predicted state mean, where the azimuth angle is calculated from the ordinate and abscissa in the first predicted state mean.
In an embodiment where only one surface of the vehicle is visible, the residuals corresponding to observation points 1 and 2 are calculated as described above, and the residual corresponding to observation point 3 is the difference between the azimuth angle calculated from the ordinate and abscissa of the first predicted state mean and the heading angle of the first predicted state mean.
Fig. 7 is a schematic diagram showing a positional relationship between a target vehicle and an observation vehicle according to an embodiment of the present application. As shown in fig. 7, when the side of the target vehicle is not visible, observation point 3 is invisible and the azimuth angle of the target vehicle should be equal to the heading angle, that is, arctan(y'/x') − phi' = 0, where x' and y' are the ordinate and abscissa of the first predicted state mean and phi' is the heading angle of the first predicted state mean. Thus, the difference between arctan(y'/x') − phi' and 0 gives the corresponding residual, which can be used for the state update.
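A minimal sketch of this residual, assuming the predicted coordinates and heading are available and using arctan2 purely to keep the quadrant consistent, might be:

```python
import numpy as np

def azimuth_residual(x_pred, y_pred, phi_pred):
    """Residual for observation point 3 when only the vehicle tail surface is visible."""
    return np.arctan2(y_pred, x_pred) - phi_pred   # arctan(y'/x') − phi', expected near zero
```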
The process of observing the heading angle change rate also differs from the above embodiment: since only one surface is visible, the homography solution process can only be applied to the visible surface, so only one observed value of the heading angle change rate, dyaw side or dyaw tail, can be obtained.
In the embodiment of the application in which only one side is visible, the azimuth angle can be used for improving the calculation accuracy of the course angle, and the homography matrix can also be calculated by using the side surface or the vehicle tail plane, so that the solution accuracy of the course angle is improved.
Application example
FIG. 8 shows a block diagram of a measurement device according to an embodiment of the present application. Fig. 9 shows a block diagram of data flow during a measurement method according to an embodiment of the present application. The measurement method according to the embodiment of the present application is described with reference to fig. 8 and 9.
The embodiment of the application measures the course angle by combining Unscented Kalman Filtering (UKF) with a vehicle motion model, and the vehicle motion model can be a constant turn rate and velocity (CTRV) model.
According to the measuring method provided by the embodiment of the application, the course angle of the target vehicle is measured through the UKF combined multi-frame image, and the specific process is as follows:
as shown in fig. 8, the camera continuously captures a plurality of frames of images and transmits the captured plurality of frames of images to the processor. And processing the image by the processor root to obtain a measurement result.
In an embodiment of the present application, the initial motion state of the target object may be calculated by:
when measurement is started, for the first frame of collected images, the processor identifies the target vehicle, the tail middle point of the target vehicle and the wheel grounding point in the first frame of images, determines the pixel plane coordinates (u, v) of the tail middle point of the target vehicle and the wheel grounding point, converts the pixel plane coordinates of the tail middle point and the wheel grounding point into a world coordinate system, and obtains the initial coordinates (x 0, y0, z 0) of the tail middle point in the world coordinate system and the initial coordinates of the wheel grounding point in the world coordinate system. Determining the initial course angle phi of the target vehicle according to the initial coordinates of the wheels (the same side wheel or the two rear wheels) on the same side of the target vehicle 0
For the second frame image of the multiple frames, the processor identifies the target vehicle, the vehicle tail midpoint and the wheel grounding points in the second frame image, determines their pixel plane coordinates (u, v), and converts these pixel plane coordinates into the world coordinate system to obtain the coordinates (x1, y1, z1) of the vehicle tail midpoint and the coordinates of the wheel grounding points in the world coordinate system. The heading angle phi1 of the target vehicle is determined according to the coordinates of the wheels on the same side of the target vehicle (the same-side wheels or the two rear wheels) in the second frame image.
According to the initial course angle phi0 corresponding to the first frame image, the course angle phi1 corresponding to the second frame image, and the time interval at which the camera collected the first and second frame images, the initial heading angle change rate dphi0 is determined.
The initial value v0 of the speed of the target vehicle is determined according to the initial coordinates (x0, y0, z0) of the vehicle tail midpoint in the first frame image, the coordinates (x1, y1, z1) of the vehicle tail midpoint in the second frame image in the world coordinate system, and the time interval at which the camera collected the first and second frame images.
The coordinates x0 and y0 of the vehicle tail midpoint in the first frame image in the world coordinate system, the speed v0, the initial heading angle phi0 and the initial heading angle change rate dphi0 are taken as the initial values of the state vector; the initial values of L and W in the state vector may be set in advance.
Through the above process, the initial value X[0] of the state vector of the vehicle motion model, that is, the mean of the Gaussian distribution satisfied by the vehicle motion model, can be determined: X[0] = [x0, y0, v0, phi(0), dphi(0), L, W].
The initial value of the covariance matrix Σ [0] of the gaussian distribution that the vehicle motion model satisfies may be set in advance.
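A minimal sketch of assembling these initial values from the first two frames might look as follows; the default length and width and the function name are illustrative assumptions, not values from the embodiment.

```python
import numpy as np

def initial_state(p_tail_0, p_tail_1, phi0, phi1, dt, L=4.5, W=1.8):
    """Assemble the initial state vector X[0] from the first two frames (L, W defaults illustrative)."""
    p0, p1 = np.asarray(p_tail_0, float), np.asarray(p_tail_1, float)
    v0 = np.linalg.norm(p1 - p0) / dt        # speed from the vehicle-tail-midpoint displacement
    dphi0 = (phi1 - phi0) / dt               # initial heading angle change rate
    x0, y0 = p0[0], p0[1]
    return np.array([x0, y0, v0, phi0, dphi0, L, W])
```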
Assuming that the second frame image is the first image, and taking X[0] and Σ[0] as the initial motion state, as shown in fig. 9, X[1] and Σ(1) are calculated according to the above procedure as the measurement result of the second frame image. X[1] and Σ(1) may then serve as the initial motion state of the next frame image, and the measurement result of the next frame image may be obtained by continuing to perform the measurement method of the embodiment of the present application according to X[1], Σ(1) and the next frame image.
That is to say, the measurement result obtained by each calculation contains the information of the previous multi-frame image, and the measurement result of the embodiment of the application is combined with the information of the previous multi-frame image, so that the accuracy and the stability of the course angle calculation are improved.
As shown in fig. 9, the measurement method of the embodiment of the present application may be executed by combining the measurement result of the previous frame image with the current frame image to obtain the measurement result of the current frame, or by combining the measurement result of the mth frame image before the current frame image with the current frame image, where m may be a positive integer greater than 1. In the example shown in fig. 9, the values of m, n and k may be the same or different, and the present application does not limit them; in one example, m = n = k = 1. According to the measurement method of the embodiment of the present application, combining multi-frame information through the UKF improves the accuracy and stability of the course angle calculation.
The embodiment of the application also provides a measuring device which can be applied to a monocular camera. FIG. 10 shows a block diagram of a measurement device according to an embodiment of the present application. As shown in fig. 10, the measurement apparatus of the embodiment of the present application may include:
an image acquisition module 100, configured to acquire a first image through the monocular camera;
a first observation module 101, configured to identify a first observation result of a target object in the first image by using a neural network model; the first observation comprises pixel plane coordinates of a plurality of observation points included in the target object;
the prediction module 102 is configured to perform a prediction process of the nonlinear filtering according to an initial motion state and a motion model of the target object, so as to obtain a first prediction state and an update parameter of the nonlinear filtering; wherein the initial motion state is a measurement result obtained from a history image before the first image, the initial motion state represents a final motion state of the target object before the camera acquires the first image, and the initial motion state includes an initial course angle of the target object;
the estimation module 103 is configured to obtain a first estimation result of the pixel plane coordinates of the plurality of observation points according to the first prediction state;
a first calculating module 104, configured to obtain a residual according to the first observation result and the first estimation result;
an updating module 105, configured to perform an updating process of the nonlinear filtering according to the first prediction state, the residual error, and the update parameter, to obtain a new measurement result of the motion state of the target object, where the new measurement result includes a heading angle of the target object.
According to the measuring device provided by the embodiment of the application, the monocular camera is used to collect images, the images are processed by combining the neural network model, the nonlinear filtering model and the motion model, and the course angle of the target object in the images is predicted. A neural network is used to process the image to obtain the first observation result, replacing the process in nonlinear filtering in which hardware (radar and the like) is used to obtain the observation result. In other words, processing the image acquired by the monocular camera through the neural network model can realize the observation process without installing bulky and expensive radar to observe the target object; compared with radar, the camera is cheaper, which saves cost, and the camera is much smaller than radar and therefore easy to install.
The initial motion state and the motion model of the target object are combined to predict the motion state (including the course angle) of the target object through the prediction process of the nonlinear filtering to obtain a prediction result (the first prediction state), the motion track of the target object is estimated according to the prediction result to obtain a first estimation result, and the first prediction state is adjusted according to the residual of the first observation result and the first estimation result, so that a more accurate measurement result can be obtained. Therefore, compared with measuring the course angle by calibrating a single frame image through a radar, or by matching the target vehicle and the wheel grounding points in a single frame image through a CNN (convolutional neural network), the measuring device provided by the application performs the prediction process of nonlinear filtering starting from the initial motion state of the target object obtained from the history image, so it can measure the course angle by combining multiple frames of images and the measurement result is more accurate; the observation process of the target object is realized through the neural network model, which can also save cost.
In one possible implementation, the initial motion state includes an initial state value and an initial covariance matrix, and the prediction module is further configured to: extracting a plurality of representative points including a point corresponding to the initial state value and a weight corresponding to each representative point from a Gaussian distribution with a mean value as the initial state value and a covariance matrix as the initial covariance matrix; obtaining a first estimated state value of each representative point according to the motion model and the state values of the plurality of representative points; calculating a first prediction state mean and a first prediction covariance matrix of the first prediction state according to the first estimation state value and the weight of each representative point; converting the first estimation state value of each representative point into a pixel plane coordinate system to obtain a pixel plane coordinate of each representative point; calculating a mean value and a covariance matrix of the plurality of representative points in a pixel plane coordinate system according to the pixel plane coordinate of each representative point and the weight; obtaining a cross correlation matrix in the updating parameters according to the first estimated state value and the first predicted state mean value of each representative point, the pixel plane coordinate of each representative point, the mean value of a plurality of representative points in a pixel plane coordinate system and the weight; and obtaining the Kalman gain in the updated parameters according to the cross correlation matrix and the covariance matrix of the pixel plane coordinates of the plurality of representative points.
In one possible implementation, the estimation module is further configured to: and calculating the world coordinate of each observation point according to the first prediction state mean value, and converting the world coordinate of each observation point into a pixel plane coordinate to obtain a first estimation result of each observation point.
In one possible implementation, the update module is further configured to: calculating the product of the residual error and the Kalman gain to obtain a state increment; summing the first predicted state mean and the state increment to obtain a second predicted state mean of the new measurement result; and obtaining a second prediction covariance matrix of the new measurement result according to the first prediction covariance matrix, the Kalman gain and the cross correlation matrix.
In one possible implementation, the apparatus further includes: a second observation module, configured to perform the following process before the loss satisfies the convergence condition: moving an optical flow tracking area in an observation area of the target object, and determining a plurality of pairs of sample matching points of the first image and a history image before the first image from the moved optical flow tracking area; the observation region of the target object is a corresponding region of a plane region on the target object on an image, and each pair of sample matching points corresponds to the same point of the target object; performing homography solution according to at least part of the matching points in the plurality of pairs of sample matching points to obtain a homography matrix; obtaining a first course angle change rate according to the homography matrix and a normal vector of a plane where the observation area on the first image is located, wherein the first course angle change rate is the course angle change rate of the target object in the historical image and the first image; and determining the loss according to the difference value of the first course angle change rate and the second course angle change rate and the number of points which meet the homography matrix in the plurality of pairs of sample matching points, wherein the second course angle change rate is the course angle change rate included in the first prediction state mean value.
In a possible implementation, the residual further includes a difference between the first heading angle change rate and the second heading angle change rate. The homography matrix is calculated by utilizing the side surface and/or the vehicle tail plane, the observed value of the course angle change rate is increased, and the precision of the measuring result can be improved.
In one possible implementation, the apparatus further includes: the third observation module is used for obtaining first observation noise of each observation point according to the size of the part of the target object associated with each observation point in the target object; the fourth observation module is used for obtaining second observation noise according to the size of the optical flow tracking area; the prediction module is further to: extracting a plurality of representative points including a point corresponding to the initial state value and a weight corresponding to each representative point from a Gaussian distribution with a mean value as the initial state value and a covariance matrix as the initial covariance matrix; obtaining a first estimated state value of each representative point according to the motion model and the state values of the plurality of representative points; calculating a first prediction state mean and a first prediction covariance matrix of the first prediction state according to the first estimation state value and the weight of each representative point; converting the first estimation state value of each representative point into a pixel plane coordinate system to obtain the pixel plane coordinate of each representative point; calculating a mean and covariance matrix of the plurality of representative points at pixel plane coordinates based on the pixel plane coordinates and weight of each of the representative points and one or both of the first observation noise and the second observation noise; obtaining a cross correlation matrix in the updated parameters according to the first estimated state value and the first predicted state mean value of each representative point, the pixel plane coordinates of each representative point, the mean value of the pixel plane coordinates of a plurality of representative points and the weight; and obtaining the Kalman gain in the updated parameters according to the cross correlation matrix and the covariance matrix of the pixel plane coordinates of the plurality of representative points.
Errors of the observation results can be corrected through the first observation noise and/or the second observation noise, so that more accurate measurement results can be obtained. The first observation noise and the second observation noise are dynamically adjusted according to the target object in the image (related to the observed size of the corresponding surface) in the calculation process, the observation noise is dynamically adjusted, the influence caused by poor regression precision of a neural network model and low solving precision of a homography matrix under the condition of crossing or straight-going is reduced, and the measurement precision can be improved.
In one possible implementation, the target object and the observation object are vehicles, and the plurality of observation points include: one or more of two corner points of the rear of the vehicle and one corner point of the head of the vehicle.
In one possible implementation, the observation region includes one or more of a corresponding region of a rear of the vehicle on the image and a corresponding region of a side of the vehicle on the image.
In a possible implementation manner, the first observation module is further configured to: identifying the vehicle in the first image by adopting the neural network model to obtain a grounding line, a boundary line and a Bounding Box; wherein the ground line represents a straight line on which a wheel ground point of the vehicle is located, and the dividing line represents a dividing line of two planes of the vehicle; and determining the plurality of observation points and the pixel plane coordinates of the plurality of observation points according to the grounding line, the boundary line and the Bounding Box.
The neural network model is used for observing the 2.5DBBox of the target vehicle, so that a first observation result of the target vehicle is obtained, the process that hardware (radars and the like) is adopted in the nonlinear filtering process for observation to obtain the observation result is replaced, the image acquired by the monocular camera is processed through the neural network model, the observation process can be realized, and the cost can be saved.
In a possible implementation manner, when only one surface of the vehicle is visible, the residual includes the difference between an azimuth angle and the heading angle in the first predicted state mean, where the azimuth angle is calculated from the ordinate and abscissa in the first predicted state mean. In embodiments of the present application where only one surface is visible, the azimuth angle may be used to improve the accuracy of the heading angle calculation.
The measuring device may be a vehicle with a sensing function, or another component with a sensing function. Such measuring devices include, but are not limited to: a vehicle-mounted terminal, a vehicle-mounted controller, a vehicle-mounted module, a vehicle-mounted component, a vehicle-mounted chip, a vehicle-mounted unit, a vehicle-mounted radar or a camera, through which the vehicle can implement the method provided by the present application.
The measuring device can also be, or be arranged in a component of other intelligent terminals with sensing functions besides vehicles. The intelligent terminal can be other terminal equipment such as intelligent transportation equipment, intelligent home equipment and robots. The device includes but is not limited to a smart terminal or a controller, a chip, other sensors such as radar or a camera, and other components in the smart terminal.
The measuring device may be a general purpose device or a special purpose device. In a specific implementation, the apparatus may also be a desktop computer, a laptop computer, a network server, a Personal Digital Assistant (PDA), a mobile phone, a tablet computer, a wireless terminal device, an embedded device, or other devices with processing functions. The embodiment of the present application does not limit the type of the measuring device.
The measurement device may also be a chip or processor with processing functionality, and the measurement device may comprise a plurality of processors. The processor may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. The chip or processor having the processing function may be provided in the sensor, or may be provided not in the sensor but on a receiving end of the sensor output signal.
An embodiment of the present application provides a measurement apparatus, including: a processor and a memory for storing processor-executable instructions; wherein the processor is configured to implement the above method when executing the instructions.
Embodiments of the present application provide a non-transitory computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described method.
Embodiments of the present application provide a computer program product comprising computer readable code, or a non-transitory computer readable storage medium carrying computer readable code, which when run in a processor of an electronic device, the processor in the electronic device performs the above method.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove having instructions recorded thereon, and any suitable combination of the foregoing.
The computer program instructions for carrying out operations of the present application may be assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Smalltalk or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), can be personalized with state information of the computer readable program instructions, and the electronic circuitry can execute the computer readable program instructions to implement aspects of the present application.
Various aspects of the present application are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
Having described embodiments of the present application, the foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application, or the improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (24)

1. A measuring method, applied to a monocular camera, characterized by comprising the following steps:
acquiring a first image by the monocular camera;
identifying a first observation of a target object in the first image using a neural network model; the first observation comprises pixel plane coordinates of a plurality of observation points included in the target object;
performing a nonlinear filtering prediction process according to the initial motion state and the motion model of the target object to obtain a first prediction state and update parameters of nonlinear filtering; wherein the initial motion state is a measurement result obtained according to a history image before the first image, the initial motion state represents a final motion state of the target object before the camera acquires the first image, and the initial motion state comprises an initial course angle of the target object;
obtaining a first estimation result of the pixel plane coordinates of the plurality of observation points according to the first prediction state;
obtaining a residual error according to the first observation result and the first estimation result;
and carrying out a nonlinear filtering updating process according to the first prediction state, the residual error and the updating parameter to obtain a new measurement result of the motion state of the target object, wherein the new measurement result comprises a course angle of the target object.
2. The method of claim 1, wherein the initial motion state comprises an initial state value and an initial covariance matrix,
performing a prediction process of nonlinear filtering according to the initial motion state and the motion model of the target object to obtain a first prediction state and update parameters of the nonlinear filtering, including:
extracting a plurality of representative points including points corresponding to the initial state values and weights corresponding to each representative point from a Gaussian distribution with a mean value as the initial state value and a covariance matrix as the initial covariance matrix;
obtaining a first estimated state value of each representative point according to the motion model and the state values of the plurality of representative points;
calculating a first prediction state mean and a first prediction covariance matrix of the first prediction state according to the first estimation state value and the weight of each representative point;
converting the first estimation state value of each representative point into a pixel plane coordinate system to obtain the pixel plane coordinate of each representative point;
calculating a mean value and a covariance matrix of the plurality of representative points in a pixel plane coordinate system according to the pixel plane coordinate of each representative point and the weight;
obtaining a cross correlation matrix in the updating parameters according to the first estimated state value and the first predicted state mean value of each representative point, the pixel plane coordinate of each representative point, the mean value of a plurality of representative points in a pixel plane coordinate system and the weight;
and obtaining the Kalman gain in the updated parameters according to the cross correlation matrix and the covariance matrix of the pixel plane coordinates of the plurality of representative points.
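For illustration only (not part of the claims): a minimal unscented-transform sketch of the prediction process described in claim 2, assuming a caller-supplied motion model f, an additive process-noise covariance Q, and conventional UKF scaling parameters; none of these specifics are prescribed by the claim.

```python
import numpy as np

def sigma_points(x, P, alpha=1e-3, beta=2.0, kappa=0.0):
    """Representative points and weights drawn from N(x, P); the first point is the mean."""
    n = x.size
    lam = alpha ** 2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * P)
    pts = np.vstack([x, x + S.T, x - S.T])            # 2n+1 representative points
    wm = np.full(2 * n + 1, 0.5 / (n + lam))          # mean weights
    wc = wm.copy()                                    # covariance weights
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + (1.0 - alpha ** 2 + beta)
    return pts, wm, wc

def predict(x, P, f, Q, dt):
    """Propagate the representative points through the motion model f."""
    pts, wm, wc = sigma_points(x, P)
    est = np.array([f(p, dt) for p in pts])           # first estimated state values
    x_pred = wm @ est                                  # first predicted state mean
    d = est - x_pred
    P_pred = d.T @ (d * wc[:, None]) + Q               # first prediction covariance matrix
    return est, wm, wc, x_pred, P_pred
```

The same weighted-sum pattern, applied to the points after they are converted to pixel plane coordinates, yields the pixel-plane mean, the pixel-plane covariance, the cross correlation matrix and, from those, the Kalman gain.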
3. The method of claim 2,
obtaining a first estimation result of the pixel plane coordinates of the plurality of observation points according to the first prediction state, including:
and calculating the world coordinate of each observation point according to the first prediction state mean value, and converting the world coordinate of each observation point into a pixel plane coordinate to obtain a first estimation result of each observation point.
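A pinhole-projection sketch of the conversion in claim 3 from a world coordinate to a pixel plane coordinate, assuming the camera intrinsic matrix K and the extrinsic rotation R and translation t are known from calibration.

```python
import numpy as np

def world_to_pixel(p_world, K, R, t):
    """Project a 3-D world point into pixel plane coordinates (u, v)."""
    p_cam = R @ p_world + t        # world frame -> camera frame
    uvw = K @ p_cam                # camera frame -> homogeneous pixel coordinates
    return uvw[:2] / uvw[2]
```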
4. The method according to claim 2, wherein performing an update procedure of the non-linear filtering according to the first prediction state, the residual error and the update parameter to obtain a new measurement result of the motion state of the target object comprises:
calculating the product of the residual error and the Kalman gain to obtain a state increment;
summing the first predicted state mean and the state increment to obtain a second predicted state mean of the new measurement result;
and obtaining a second prediction covariance matrix of the new measurement result according to the first prediction covariance matrix, the Kalman gain and the cross correlation matrix.
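For illustration only: the update process of claim 4, written out with numpy and reusing the quantities produced by the prediction sketch above; the covariance form P_pred - K·Cᵀ is the usual unscented Kalman filter expression and is an assumption here.

```python
import numpy as np

def update(x_pred, P_pred, residual, K_gain, C):
    """Fold the residual back into the predicted state (nonlinear filtering update)."""
    dx = K_gain @ residual                 # state increment
    x_new = x_pred + dx                    # second predicted state mean
    P_new = P_pred - K_gain @ C.T          # second prediction covariance matrix
    return x_new, P_new
```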
5. The method according to any one of claims 1-4, further comprising:
before the loss satisfies the convergence condition, the following process is performed:
moving an optical flow tracking area in an observation area of the target object, and determining a plurality of pairs of sample matching points of the first image and a history image before the first image from the moved optical flow tracking area; the observation region of the target object is a corresponding region of a plane region on the target object on an image, and each pair of sample matching points corresponds to the same point of the target object;
performing homography solution according to at least part of the matching points in the multiple pairs of sample matching points to obtain a homography matrix;
obtaining a first course angle change rate according to the homography matrix and a normal vector of a plane where the observation area on the first image is located, wherein the first course angle change rate is the course angle change rate of the target object in the historical image and the first image;
and determining the loss according to the difference value of the first course angle change rate and the second course angle change rate and the number of points which meet the homography matrix in the plurality of pairs of sample matching points, wherein the second course angle change rate is the course angle change rate included in the first prediction state mean value.
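A sketch, under stated assumptions, of the homography branch in claim 5: matched points from the optical-flow tracking area are passed to OpenCV's RANSAC homography, the homography is decomposed, and the in-plane rotation of the tracked face gives a heading-angle change rate. The frame conventions and the selection of the decomposition by plane normal are simplifications, not details from the claim.

```python
import numpy as np
import cv2

def heading_rate_from_homography(pts_prev, pts_curr, K, plane_normal, dt):
    """Heading-angle change rate of the tracked face between two frames.

    pts_prev, pts_curr : Nx2 pixel coordinates of the sample matching points
    K                  : 3x3 camera intrinsic matrix
    plane_normal       : expected normal of the observed face (camera frame)
    dt                 : time between the history image and the first image
    """
    H, mask = cv2.findHomography(pts_prev.astype(np.float32),
                                 pts_curr.astype(np.float32),
                                 cv2.RANSAC, 3.0)
    inliers = int(mask.sum()) if mask is not None else 0   # points satisfying H
    _, rotations, _, normals = cv2.decomposeHomographyMat(H, K)
    # Keep the decomposition whose plane normal best matches the observed face.
    best = max(range(len(rotations)),
               key=lambda i: abs(float(np.dot(normals[i].ravel(), plane_normal))))
    R = rotations[best]
    # Rotation about the vertical axis (assumed camera y axis) -> heading change.
    d_heading = np.arctan2(R[0, 2], R[2, 2])
    return d_heading / dt, inliers
```

The loss in the claim could then combine the gap between this first course angle change rate and the predicted one with the inlier count; how the two terms are weighted is not spelled out here.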
6. The method of claim 5, wherein the residual further comprises a difference between the first and second heading angular rates of change.
7. The method of claim 5, further comprising:
obtaining first observation noise of each observation point according to the size of the part of the target object associated with each observation point in the target object;
or obtaining a second observation noise according to the size of the optical flow tracking area;
the method for performing a nonlinear filtering prediction process according to an initial motion state and a motion model of a target object to obtain a first prediction state and update parameters of the nonlinear filtering includes:
extracting a plurality of representative points including a point corresponding to the initial state value and a weight corresponding to each representative point from a Gaussian distribution with a mean value as the initial state value and a covariance matrix as the initial covariance matrix;
obtaining a first estimated state value of each representative point according to the motion model and the state values of the plurality of representative points;
calculating a first prediction state mean and a first prediction covariance matrix of the first prediction state according to the first estimation state value and the weight of each representative point;
converting the first estimation state value of each representative point into a pixel plane coordinate system to obtain the pixel plane coordinate of each representative point;
calculating a mean and covariance matrix of the plurality of representative points at pixel plane coordinates based on the pixel plane coordinates and weight of each of the representative points and one or both of the first observation noise and the second observation noise;
obtaining a cross correlation matrix in the updating parameters according to the first estimated state value and the first predicted state mean value of each representative point, the pixel plane coordinate of each representative point, the mean value of the pixel plane coordinates of a plurality of representative points and the weight;
and obtaining the Kalman gain in the updated parameters according to the cross correlation matrix and the covariance matrix of the pixel plane coordinates of the plurality of representative points.
8. The method of any one of claims 1-7, wherein the target object and the observation object are vehicles, and wherein the plurality of observation points comprise: one or more of two corner points of the rear of the vehicle and one corner point of the front of the vehicle.
9. The method of claim 8, wherein the observation region comprises one or more of a corresponding region of a rear of the vehicle on the image and a corresponding region of a side of the vehicle on the image.
10. The method of claim 9, wherein identifying a first observation of a target object in the first image using a neural network model comprises:
identifying the vehicle in the first image by adopting the neural network model to obtain a ground line, a dividing line and a Bounding Box; wherein the ground line represents the straight line on which the wheel grounding points of the vehicle are located, and the dividing line represents the dividing line between two faces of the vehicle;
and determining the plurality of observation points and the pixel plane coordinates of the plurality of observation points according to the ground line, the dividing line and the Bounding Box.
11. The method of claim 10, wherein the residual error comprises a difference between an azimuth angle calculated from an ordinate and an abscissa of the first predicted state mean and a heading angle of the first predicted state mean when a face of the vehicle is visible.
12. A measuring device, applied to a monocular camera, characterized by comprising:
the image acquisition module is used for acquiring a first image through the monocular camera;
a first observation module for identifying a first observation of a target object in the first image using a neural network model; the first observation comprises pixel plane coordinates of a plurality of observation points included in the target object;
the prediction module is used for carrying out a nonlinear filtering prediction process according to the initial motion state and the motion model of the target object to obtain a first prediction state and an update parameter of the nonlinear filtering; wherein the initial motion state is a measurement result obtained from a history image before the first image, the initial motion state represents a final motion state of the target object before the camera acquires the first image, and the initial motion state includes an initial course angle of the target object;
the estimation module is used for obtaining a first estimation result of the pixel plane coordinates of the plurality of observation points according to the first prediction state;
the first calculation module is used for obtaining a residual error according to the first observation result and the first estimation result;
and the updating module is used for carrying out a nonlinear filtering updating process according to the first prediction state, the residual error and the updating parameter to obtain a new measurement result of the motion state of the target object, wherein the new measurement result comprises a course angle of the target object.
13. The apparatus of claim 12, wherein the initial motion state comprises an initial state value and an initial covariance matrix,
the prediction module is further to:
extracting a plurality of representative points including points corresponding to the initial state values and weights corresponding to each representative point from a Gaussian distribution with a mean value as the initial state value and a covariance matrix as the initial covariance matrix;
obtaining a first estimated state value of each representative point according to the motion model and the state values of the plurality of representative points;
calculating a first prediction state mean and a first prediction covariance matrix of the first prediction state according to the first estimation state value and the weight of each representative point;
converting the first estimation state value of each representative point into a pixel plane coordinate system to obtain the pixel plane coordinate of each representative point;
calculating a mean value and a covariance matrix of the plurality of representative points in a pixel plane coordinate system according to the pixel plane coordinate of each representative point and the weight;
obtaining a cross correlation matrix in the updating parameters according to the first estimated state value and the first predicted state mean value of each representative point, the pixel plane coordinate of each representative point, the mean value of a plurality of representative points in a pixel plane coordinate system and the weight;
and obtaining the Kalman gain in the updated parameters according to the cross correlation matrix and the covariance matrix of the pixel plane coordinates of the plurality of representative points.
14. The apparatus of claim 13,
the estimation module is further to:
and calculating the world coordinate of each observation point according to the first prediction state mean value, and converting the world coordinate of each observation point into a pixel plane coordinate to obtain a first estimation result of each observation point.
15. The apparatus of claim 13, wherein the update module is further configured to:
calculating the product of the residual error and the Kalman gain to obtain a state increment;
summing the first predicted state mean and the state increment to obtain a second predicted state mean of the new measurement result;
and obtaining a second prediction covariance matrix of the new measurement result according to the first prediction covariance matrix, the Kalman gain and the cross correlation matrix.
16. The apparatus of any one of claims 12-15, further comprising: a second observation module, configured to perform the following process before the loss satisfies the convergence condition:
moving an optical flow tracking area in an observation area of the target object, and determining a plurality of pairs of sample matching points of the first image and a history image before the first image from the moved optical flow tracking area; the observation region of the target object is a corresponding region of a plane region on the target object on an image, and each pair of sample matching points corresponds to the same point of the target object;
performing homography solution according to at least part of the matching points in the multiple pairs of sample matching points to obtain a homography matrix;
obtaining a first course angle change rate according to the homography matrix and a normal vector of a plane where the observation area on the first image is located, wherein the first course angle change rate is the course angle change rate of the target object in the historical image and the first image;
and determining the loss according to the difference value of the first course angle change rate and the second course angle change rate and the number of points meeting the homography matrix in the plurality of pairs of sample matching points, wherein the second course angle change rate is the course angle change rate included in the first prediction state mean value.
17. The apparatus of claim 16, wherein the residual further comprises a difference between the first heading angular rate of change and the second heading angular rate of change.
18. The apparatus of claim 16, further comprising:
the third observation module is used for obtaining first observation noise of each observation point according to the size of the part of the target object associated with each observation point in the target object;
the fourth observation module is used for obtaining second observation noise according to the size of the optical flow tracking area;
the prediction module is further to:
extracting a plurality of representative points including a point corresponding to the initial state value and a weight corresponding to each representative point from a Gaussian distribution with a mean value as the initial state value and a covariance matrix as the initial covariance matrix;
obtaining a first estimated state value of each representative point according to the motion model and the state values of the plurality of representative points;
calculating a first prediction state mean and a first prediction covariance matrix of the first prediction state according to the first estimation state value and the weight of each representative point;
converting the first estimation state value of each representative point into a pixel plane coordinate system to obtain the pixel plane coordinate of each representative point;
calculating a mean and covariance matrix of the plurality of representative points at pixel plane coordinates based on the pixel plane coordinates and weight of each of the representative points and one or both of the first observation noise and the second observation noise;
obtaining a cross correlation matrix in the updated parameters according to the first estimated state value and the first predicted state mean value of each representative point, the pixel plane coordinates of each representative point, the mean value of the pixel plane coordinates of a plurality of representative points and the weight;
and obtaining the Kalman gain in the updated parameters according to the cross correlation matrix and the covariance matrix of the pixel plane coordinates of the plurality of representative points.
19. The apparatus of any one of claims 12-18, wherein the target object and the observation object are vehicles, and wherein the plurality of observation points comprise: one or more of two corner points of the rear of the vehicle and one corner point of the front of the vehicle.
20. The apparatus of claim 19, wherein the observation region comprises one or more of a corresponding region of a rear of the vehicle on the image and a corresponding region of a side of the vehicle on the image.
21. The apparatus of claim 20, wherein the first observation module is further configured to:
identifying the vehicle in the first image by adopting the neural network model to obtain a ground line, a dividing line and a Bounding Box; wherein the ground line represents the straight line on which the wheel grounding points of the vehicle are located, and the dividing line represents the dividing line between two faces of the vehicle;
and determining the plurality of observation points and the pixel plane coordinates of the plurality of observation points according to the ground line, the dividing line and the Bounding Box.
22. The apparatus of claim 21, wherein the residual error comprises a difference between an azimuth angle calculated from a vertical coordinate and a horizontal coordinate of the first predicted state mean and a heading angle of the first predicted state mean when a face of the vehicle is visible.
23. A measurement device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the method of any one of claims 1-11 when executing the instructions.
24. A non-transitory computer readable storage medium having stored thereon computer program instructions, wherein the computer program instructions, when executed by a processor, implement the method of any one of claims 1-11.
CN202110655268.9A 2021-06-11 2021-06-11 Measuring method and device Pending CN115471520A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110655268.9A CN115471520A (en) 2021-06-11 2021-06-11 Measuring method and device

Publications (1)

Publication Number Publication Date
CN115471520A true CN115471520A (en) 2022-12-13

Family

ID=84365198

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110655268.9A Pending CN115471520A (en) 2021-06-11 2021-06-11 Measuring method and device

Country Status (1)

Country Link
CN (1) CN115471520A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination