CN115205324A - Target object orientation determining method and device


Info

Publication number
CN115205324A
Authority
CN
China
Prior art keywords
target object
target
video frame
orientation
determining
Prior art date
Legal status
Pending
Application number
CN202110378944.2A
Other languages
Chinese (zh)
Inventor
朱静
王兵
卿泉
王刚
刘挺
Current Assignee
Taobao China Software Co Ltd
Original Assignee
Taobao China Software Co Ltd
Priority date
Filing date
Publication date
Application filed by Taobao China Software Co Ltd filed Critical Taobao China Software Co Ltd
Priority to CN202110378944.2A
Publication of CN115205324A


Classifications

    • G06T 7/246, G06T 7/251 — Image analysis; analysis of motion using feature-based methods, e.g. the tracking of corners or segments, involving models
    • G06N 3/02, G06N 3/08 — Computing arrangements based on biological models; neural networks; learning methods
    • G06T 7/13 — Image analysis; segmentation; edge detection
    • G06T 7/277 — Image analysis; analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • G06T 7/73, G06T 7/75 — Image analysis; determining position or orientation of objects or cameras using feature-based methods involving models
    • G06T 2207/10016 — Image acquisition modality: video; image sequence
    • G06T 2207/20081 — Special algorithmic details: training; learning
    • G06T 2207/20084 — Special algorithmic details: artificial neural networks [ANN]
    • G06T 2207/30244 — Subject of image: camera pose
    • G06T 2207/30252 — Subject of image: vehicle exterior; vicinity of vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the specification provide a target object orientation determining method and apparatus. The method includes: receiving an (i + 1)-th video frame containing a target object, and acquiring the shape, key points, and edge detection frame of the target object in the (i + 1)-th video frame; determining a target attribute value of the target object based on the shape of the target object and the key points; determining a first orientation of the target object in the (i + 1)-th video frame according to the target attribute value of the target object and the edge detection frame; determining a second orientation of the target object in the (i + 1)-th video frame based on the target orientation of the target object in the i-th video frame; and determining the target orientation of the target object in the (i + 1)-th video frame based on the first orientation and the second orientation.

Description

Target object orientation determining method and device
Technical Field
The embodiment of the specification relates to the technical field of computers, in particular to a target object orientation determining method. One or more embodiments of the present specification also relate to a target object orientation determining apparatus, a computing device, and a computer-readable storage medium.
Background
Against the backdrop of vehicle autonomous driving gradually maturing and being pushed toward deployment and mass production, higher requirements are placed on the stability, comprehensiveness, and precision of obstacle perception and its structured output. In the field of autonomous driving, computing the vehicle orientation is a basic task of the perception system and forms part of vehicle pose estimation; the vehicle's subsequent trajectory prediction and planning control depend on this orientation computation.
It is therefore highly desirable to provide a target object orientation determining method that improves the accuracy and stability of vehicle orientation recognition.
Disclosure of Invention
In view of this, the embodiments of the present disclosure provide a method for determining an orientation of a target object. One or more embodiments of the present specification also relate to a target object orientation determining apparatus, a computing device, and a computer-readable storage medium to address technical deficiencies in the prior art.
According to a first aspect of embodiments herein, there is provided a target object orientation determination method, including:
receiving an i +1 th video frame containing a target object, and acquiring the shape, key points and an edge detection frame of the target object in the i +1 th video frame;
determining a target attribute value of the target object based on the shape of the target object and the key points;
determining a first orientation of a target object in the (i + 1) th video frame according to the target attribute value of the target object and an edge detection frame;
determining a second orientation of the target object in the (i + 1)-th video frame based on the target orientation of the target object in the i-th video frame, and determining the target orientation of the target object in the (i + 1)-th video frame based on the first orientation and the second orientation.
According to a second aspect of embodiments herein, there is provided a target object orientation determining apparatus comprising:
a first video receiving module, configured to receive an (i + 1)-th video frame containing a target object and acquire the shape, key points, and edge detection frame of the target object in the (i + 1)-th video frame;
a first determination module, configured to determine a target attribute value of the target object based on the shape of the target object and the key points;
a second determining module configured to determine a first orientation of a target object in the (i + 1) th video frame according to a target attribute value of the target object and an edge detection box;
a first target orientation determination module, configured to determine a second orientation of the target object in the (i + 1)-th video frame based on the target orientation of the target object in the i-th video frame, the target orientation of the target object in the (i + 1)-th video frame being determined based on the first orientation and the second orientation.
According to a third aspect of embodiments herein, there is provided a computing device comprising:
a memory and a processor;
the memory is for storing computer-executable instructions, and the processor is for executing the computer-executable instructions, which when executed by the processor, implement the steps of the target object orientation determination method.
According to a fourth aspect of embodiments herein, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the target object orientation determining method.
One embodiment of the specification realizes a target object orientation determining method and apparatus, the method comprising: receiving an (i + 1)-th video frame containing a target object, and acquiring the shape, key points, and edge detection frame of the target object in the (i + 1)-th video frame; determining a target attribute value of the target object based on the shape of the target object and the key points; determining a first orientation of the target object in the (i + 1)-th video frame according to the target attribute value of the target object and the edge detection frame; determining a second orientation of the target object in the (i + 1)-th video frame based on the target orientation of the target object in the i-th video frame; and determining the target orientation of the target object in the (i + 1)-th video frame based on the first orientation and the second orientation. Specifically, the method decomposes the computation of the target orientation into multiple steps, such as shape estimation, target attribute value computation, and orientation computation of the target object, so that the implementation steps can be well decoupled and fused, and an accurate, stable target orientation of the target object is finally obtained.
Drawings
Fig. 1 is an exemplary diagram of a specific application scenario of a target object orientation determining method according to an embodiment of the present specification;
FIG. 2 is a flow chart of a target object orientation determination method provided in one embodiment of the present description;
fig. 3 is a schematic diagram of a video frame including a target object in a target object orientation determination method according to an embodiment of the present specification;
fig. 4 is a schematic diagram of a projection relationship of a target object in a world coordinate system in a method for determining an orientation of a target object according to an embodiment of the present specification;
FIG. 5 is a flow chart illustrating the application of a target object orientation determination method to vehicle autonomous driving provided by one embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of a target object orientation determining apparatus according to an embodiment of the present specification;
fig. 7 is a block diagram of a computing device according to an embodiment of the present disclosure.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present description. This description may be implemented in many ways other than those specifically set forth herein, and those skilled in the art will appreciate that the present description is susceptible to similar generalizations without departing from the scope of the description, and thus is not limited to the specific implementations disclosed below.
The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms "first", "second", etc. may be used in one or more embodiments to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, "first" may also be referred to as "second", and similarly "second" as "first", without departing from the scope of one or more embodiments of the present specification. The word "if", as used herein, may be interpreted as "when", "upon", or "in response to determining", depending on the context.
First, the noun terms to which one or more embodiments of the present specification relate are explained.
Projective geometry: distinguished from Euclidean geometry, it is commonly applied in transformations and computations between the real world and the image plane.
Long-tail problem: limited by sample richness and model performance, the field of autonomous driving frequently faces corner cases that cause existing systems to make mistakes.
Interpretability: an end-to-end model is usually a black box and thus faces the problem of poor interpretability.
Edge computing: computation that must be completed in real time on the device; in contrast to cloud computing, it generally emphasizes high real-time performance.
Camera extrinsics: camera extrinsic parameters (Camera Extrinsics), i.e., parameters in the world coordinate system, such as the position and rotation of the camera.
Kalman filtering: an algorithm that uses a linear system state equation and the system's observed input/output data to optimally estimate the system state (a minimal sketch follows this terminology list).
Bicycle model: a kinematic vehicle model; once the vehicle heading at the current time is obtained, the model can predict the vehicle heading at the next time.
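To make the Kalman filtering term above concrete, the following is a minimal scalar Kalman filter sketch in Python; the function name, the constant-state process model, and the default noise variances are illustrative assumptions and are not specified by this patent.

```python
# Minimal scalar Kalman filter sketch (illustrative assumption: a
# constant-state process model; q and r are assumed noise variances).

def kalman_update(x, p, z, q=1e-3, r=1e-1):
    """One predict/update cycle for a scalar state.

    x, p -- previous state estimate and its variance
    z    -- new measurement
    """
    x_pred, p_pred = x, p + q          # predict: state assumed constant
    k = p_pred / (p_pred + r)          # Kalman gain
    x_new = x_pred + k * (z - x_pred)  # correct with the measurement
    p_new = (1.0 - k) * p_pred
    return x_new, p_new
```

Later sketches in this section reuse this helper wherever a predicted quantity is fused with a measured one.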
In the field of autonomous driving, computing the vehicle orientation is a basic task of the perception system and part of vehicle pose estimation; subsequent trajectory prediction and planning control also depend on the orientation. The embodiments of this specification provide a vision-based target object orientation determining method that computes the vehicle orientation more accurately. For vehicle applications equipped with lidar, pairing it with the high-precision visual orientation computation of this specification can greatly improve the precision and stability of orientation computation; for vehicle application scenes without lidar, or inside radar blind zones, the visual scheme provided by this specification can independently take over orientation computation and output. The target object orientation method provided by this specification is therefore applicable to a variety of application scenes and improves user experience.
Based on this, in the present specification, a target object orientation determining method is provided. One or more embodiments of the present specification relate to a target object orientation determining apparatus, a computing device, and a computer-readable storage medium, which are described in detail in the following embodiments one by one.
Referring to fig. 1, fig. 1 is a diagram illustrating an example of a specific application scenario of a target object orientation determination method according to an embodiment of the present disclosure.
The application scenario of fig. 1 includes an image capturing terminal 102, an image receiving terminal 104, and a server 106. Specifically, the image receiving terminal 104 receives, in real time, an image a containing a vehicle captured by the image capturing terminal 102; after receiving image a, the image receiving terminal 104 sends it to the server 106. After receiving image a, the server 106 inputs it into a vehicle type detection model to obtain the vehicle type of the vehicle in image a, inputs it into a key point detection model to obtain the key points of the vehicle in image a, and inputs it into a whole vehicle detection model to obtain the whole vehicle detection frame of the vehicle in image a; the vehicle type detection model, the key point detection model, and the whole vehicle detection model can each be understood as a deep learning model trained with a convolutional neural network.
In the specific calculation, the vehicle type of the vehicle in image a is first used to determine the range of the vehicle's length and width, i.e., vehicle length-width 1. For example, if the vehicle type is an SUV: a typical compact SUV is about 4.4 m to 4.65 m long and about 1.8 m wide, and a large SUV is about 4.7 m to 5 m long and about 1.9 m wide, so the length range determined from the vehicle type is 4.4 m to 5 m and the width range is 1.8 m to 1.9 m.
Then, the true length and width of the vehicle in image a, i.e., vehicle length-width 2, is calculated from the key points of the vehicle in image a and the extrinsics of the camera that captured image a; the key points are points such as E, F, D, and G in image a of fig. 1. If image a is the first frame containing the vehicle and vehicle length-width 2 lies within the range of vehicle length-width 1, vehicle length-width 2 is taken as the target length and width of the vehicle in image a, i.e., vehicle length-width 3 in fig. 1. If image a is the second, third, fourth, or a later frame after the first frame containing the vehicle, the predicted length and width of the vehicle in the current image a is first predicted from the true length and width of the vehicle in the previous frame; the true length and width of the vehicle in image a obtained from the key points and camera extrinsics, together with this predicted length and width, are then input into a Kalman filter for fused correction to obtain the target length and width of the vehicle in image a. In practical applications, the camera extrinsics corresponding to each frame change as the vehicle travels, so when calculating the target length and width of the vehicle in an image, the length and width must be computed iteratively against the per-frame camera extrinsics; this yields an accurate length and width of the vehicle in every frame and safeguards the accuracy of the subsequent vehicle orientation calculation.
After the target length and width of the vehicle in image a, i.e., vehicle length-width 3, is obtained, the whole vehicle detection frame of the vehicle in image a is projected in a preset coordinate system via projective geometry, yielding the mapped key points of the whole vehicle detection frame in that coordinate system; the preset coordinate system is the camera's world coordinate system. Once the target length and width and the mapped key points are determined, the vehicle orientation in image a can be calculated from the target length and width and the coordinate values of the mapped key points in the world coordinate system. In practical applications, if image a is the first frame containing the vehicle, this computed orientation is the target vehicle orientation of the vehicle in image a; if a previous frame containing the vehicle exists, a predicted vehicle orientation of the vehicle in image a can be obtained through a bicycle model, and the predicted vehicle orientation and the orientation calculated from the vehicle's length, width, and mapped key points are input into a Kalman filter for processing, thereby obtaining the target vehicle orientation of the vehicle in image a.
When the target object orientation determining method provided by the embodiments of the specification is applied to vehicle orientation calculation, the calculation is decomposed into parts such as vehicle type estimation, length-width calculation, and Kalman filtering; through the decoupling and fusion of these different implementations, an accurate and stable orientation of the vehicle is finally obtained, and because the overall orientation calculation model of the vehicle depends only on key points, the computational load is small.
Referring to fig. 2, fig. 2 is a flowchart illustrating a target object orientation determining method according to an embodiment of the present disclosure, which specifically includes the following steps.
Step 202: receiving an i +1 th video frame containing a target object, and acquiring the shape, key points and an edge detection frame of the target object in the i +1 th video frame.
The target object includes, but is not limited to, a two-wheeled vehicle, a three-wheeled vehicle, a four-wheeled vehicle, or another multi-wheeled vehicle, or a logistics vehicle, a public service vehicle, a medical service vehicle, a terminal service vehicle, and the like. In addition, i is a positive integer; for example, if i is 1, then i + 1 is 2. In practical applications, the target object orientation determining method provided by the embodiments of the present specification can predict the orientation of a stationary vehicle or of a traveling vehicle.
Specifically, receiving the (i + 1) th video frame including the target object may be understood as receiving the (i + 1) th video frame including the vehicle in motion acquired by the camera, and acquiring the shape, the key point, and the edge detection frame of the target object in the (i + 1) th video frame.
Taking i as 1 as an example, the (i + 1)-th video frame is the 2nd video frame containing the traveling vehicle. After the 2nd video frame is obtained, the shape, key points, and edge detection frame of the vehicle in the 2nd video frame are acquired; the shape of the vehicle can be understood as the vehicle type, and the edge detection frame of the vehicle as the whole vehicle detection frame.
In specific implementation, the acquiring the shape, the key point, and the edge detection frame of the target object in the (i + 1) th video frame includes:
and respectively inputting the (i + 1) th video frame into a first recognition model, a second recognition model and a third recognition model, and obtaining the shape, the key point and the edge detection frame of the target object in the (i + 1) th video frame.
The first recognition model, the second recognition model and the third recognition model can be deep learning models which are trained by adopting a convolutional neural network.
Following the above example, inputting the 2nd video frame into the first recognition model yields the vehicle type of the vehicle in the 2nd video frame; inputting the 2nd video frame into the second recognition model yields the key points of the vehicle in the 2nd video frame; and inputting the 2nd video frame into the third recognition model yields the whole vehicle detection frame of the vehicle in the 2nd video frame. The whole vehicle detection frame shows the extent of the vehicle's three-dimensional detection and the vehicle's edge information; the key points show wheel points and lamp points, which have definite physical meaning and texture, and the key point information of the vehicle also includes the information of the vehicle's side frame.
In the embodiments of the specification, the shape, key points, and edge detection frame of the target object are obtained through different recognition models: a dedicated model is used to accurately extract each feature of the target object, and the features are then fused and computed to obtain an accurate orientation of the target object.
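As an illustration of this three-model decomposition, the following is a minimal Python sketch that runs the three recognizers on one video frame. The FrameFeatures container, the model objects, and their predict method are hypothetical stand-ins assumed for illustration; the patent does not name its models or interfaces.

```python
# Sketch of the three-model feature extraction for one video frame.
# The model objects are hypothetical stand-ins for the trained
# convolutional models described above.

from dataclasses import dataclass

@dataclass
class FrameFeatures:
    shape: str       # vehicle type, e.g. "SUV"
    keypoints: list  # [(x, y), ...] wheel/lamp points in pixels
    bbox: tuple      # whole vehicle detection frame (x1, y1, x2, y2)

def extract_features(frame, type_model, kp_model, det_model) -> FrameFeatures:
    return FrameFeatures(
        shape=type_model.predict(frame),    # first recognition model
        keypoints=kp_model.predict(frame),  # second recognition model
        bbox=det_model.predict(frame),      # third recognition model
    )
```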
Referring to fig. 3, fig. 3 is a schematic diagram illustrating a video frame containing a target object in a target object orientation determination method according to an embodiment of the present specification.
As can be seen from fig. 3, the target object in the video frame is a vehicle, and the video frames including the vehicle in fig. 3 are respectively input into three recognition models, so as to obtain a whole vehicle detection frame of the vehicle in fig. 3, that is, a rectangular edge detection frame composed of ABCD and surrounding the vehicle, and obtain key points of the vehicle in fig. 3: E. f, G and D, and obtaining the vehicle model of the vehicle in the figure 3.
Specifically, before receiving the i +1 th video frame including the target object, the method further includes:
receiving an ith video frame containing a target object, and acquiring the shape, key points and an edge detection frame of the target object in the ith video frame;
determining a target attribute value of the target object based on the shape of the target object and the keypoints;
and determining the target orientation of the target object in the ith video frame according to the target attribute value of the target object and the edge detection frame.
And the target object in the ith video frame and the target object in the (i + 1) th video frame are the same target object.
In practical applications, the target object orientation determining method is applied to scenes in which a vehicle is traveling. For a traveling vehicle, in order to obtain an accurate and stable orientation, the vehicle's orientation in the current frame is first predicted from the vehicle orientation at the moment preceding the current frame; the target orientation of the vehicle in the current frame is then obtained from this predicted orientation together with the orientation calculated from the vehicle's length, width, key points, and so on in the current frame.
Therefore, when determining the target orientation of the vehicle in the i +1 th video frame, it is necessary to acquire the target orientation of the vehicle in the i-th video frame.
According to the above example, if i is still 1, first receiving a 1 st video frame including a vehicle, and acquiring a shape, a key point and an edge detection frame of the vehicle in the 1 st video frame; then, determining a target attribute value of the vehicle based on the shape of the vehicle and the key points; and finally, determining the target orientation of the target object in the 1 st video frame according to the target attribute value of the vehicle and the edge detection frame.
For a manner of obtaining the shape, the key point, and the edge detection frame of the target object in the ith video frame, reference may be made to specific descriptions of the shape, the key point, and the edge detection frame of the target object in the (i + 1) th video frame in the foregoing embodiment, which are not described herein again.
In specific implementation, the determining a target attribute value of the target object based on the shape of the target object and the key point includes:
determining a first initial attribute value of a target object in the ith video frame based on a shape of the target object in the ith video frame;
determining a second initial attribute value of the target object in the ith video frame based on the key point of the target object in the ith video frame and the camera external parameter for acquiring the ith video frame;
and taking the second initial attribute value as a target attribute value of a target object in the ith video frame when the second initial attribute value is less than or equal to the first initial attribute value.
Here, the attribute value may be understood as a length and a width, and in a case where the target object is a vehicle, the attribute value may be understood as a length and a width of the vehicle.
Specifically, when the i-th video frame is the first video frame containing the target object, no other video frame precedes it; the second initial attribute value of the target object in the i-th video frame, computed from the key points of the target object in the i-th video frame and the camera extrinsics of the i-th video frame, is taken as the target attribute value, which then necessarily lies within the first initial attribute value.
Continuing the above example, after the vehicle type, key points, and whole vehicle detection frame of the vehicle in the 1st video frame are obtained, a first initial length and width of the vehicle, i.e., the vehicle's length-width range, is determined from the vehicle type; a second initial length and width of the vehicle in the 1st video frame is then obtained from the key points of the vehicle in the 1st video frame and the extrinsics of the camera that captured it; finally, when the second initial length and width is less than or equal to the first initial length and width, the second initial length and width is taken as the target length and width of the vehicle in the 1st video frame.
In practical applications, the maximum and minimum length and width for a vehicle type can be obtained from the vehicle's type, and these are taken as the range of the vehicle's length and width. After the actual length and width of the vehicle are computed from the vehicle's key points and the camera extrinsics, if they fall within this range, the computed values are accurate and can be taken as the target length and width of the vehicle in the 1st video frame; if the true length and width of the vehicle are not within the range, the length and width computed from the key points and camera extrinsics are erroneous.
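The range constraint described here can be sketched as follows; the TYPE_RANGES lookup table and the function name are assumptions for illustration, with the SUV bounds echoing the example given for fig. 1.

```python
# Sketch of the vehicle-type range constraint on the measured size.
# TYPE_RANGES is an assumed lookup table; extend it per supported type.

TYPE_RANGES = {  # type -> (len_min, len_max, wid_min, wid_max), metres
    "SUV": (4.4, 5.0, 1.8, 1.9),
}

def constrained_length_width(vehicle_type, length, width):
    """Return (length, width) if inside the type range, else None (rejected)."""
    len_min, len_max, wid_min, wid_max = TYPE_RANGES[vehicle_type]
    if len_min <= length <= len_max and wid_min <= width <= wid_max:
        return length, width  # usable as the target length and width
    return None               # measurement inconsistent with the vehicle type
```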
In this embodiment of the present specification, when the i-th video frame is the first video frame containing the target object, the first initial attribute value determined from the shape of the target object serves as a constraint condition, and the second initial attribute value of the vehicle in the i-th video frame, computed from the key points of the target object and the extrinsics of the camera that captured the i-th video frame, is taken as the target attribute value, so that the orientation of the target object in the first video frame can be obtained quickly afterwards.
Step 204: determining a target attribute value of the target object based on the shape of the target object and the keypoints.
Specifically, the determining a target attribute value of the target object based on the shape of the target object and the key point includes:
determining a first initial attribute value of a target object in the (i + 1) th video frame based on the shape of the target object in the (i + 1) th video frame;
determining a second initial attribute value of the target object in the (i + 1) th video frame based on the key point of the target object in the (i + 1) th video frame and the camera external parameter for acquiring the (i + 1) th video frame;
determining a third initial attribute value of a target object in the i +1 th video frame based on a target attribute value of the target object in the i th video frame if the second initial attribute value is less than or equal to the first initial attribute value;
determining a target attribute value for a target object in the (i + 1) th video frame based on the second initial attribute value and the third initial attribute value.
Specifically, the specific calculation manner of the target attribute value of the target object in the i +1 th video frame is different from the calculation manner of the target attribute value of the target object in the i th video frame in the above embodiment.
Firstly, determining a first initial attribute value of a target object in an (i + 1) th video frame based on the shape of the target object in the (i + 1) th video frame, and then obtaining a second initial attribute value of the target object in the (i + 1) th video frame according to a key point of the target object in the (i + 1) th video frame and camera extrinsic parameters of the target object in the (i + 1) th video frame; the calculation of the first initial attribute value and the second initial attribute value of the target object in the i +1 th video frame is the same as the calculation of the first initial attribute value and the second initial attribute value of the target object in the i-th video frame, which is not described herein again.
Because the first initial attribute value of the target object in the (i + 1)-th video frame is a constraint condition on the second initial attribute value, after both are obtained it is judged whether the second initial attribute value of the target object in the (i + 1)-th video frame lies within the first initial attribute value; if so, a third initial attribute value of the target object in the (i + 1)-th video frame can be determined based on the target attribute value of the target object in the i-th video frame. In practical applications, the target attribute value of the target object in the i-th video frame is input into a Kalman filter to predict the third initial attribute value of the target object in the (i + 1)-th video frame.
And finally, performing fusion correction on the second initial attribute value and the third initial attribute value of the target object in the (i + 1) th video frame to obtain the target attribute value of the target object in the (i + 1) th video frame.
When the target object orientation determining method of the embodiments of the specification is applied to a vehicle driving scene, a camera captures images containing the vehicle in real time and sends them to a server. After receiving an image containing the vehicle, the server can predict the length and width of the vehicle in the current video frame from the length and width of the vehicle in the video frame at the previous moment, and then input the predicted length and width, together with the length and width obtained from the key points of the vehicle in the current frame and the camera extrinsics of the current video frame, into a Kalman filter for fused correction, thereby obtaining an accurate and stable vehicle length and width in the current video frame.
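A minimal sketch of this per-frame fused correction of the vehicle size follows, assuming the scalar kalman_update helper from the terminology sketch above is in scope; treating the true length and width as constant states across frames is a modeling assumption consistent with the Kalman prediction described here.

```python
# Sketch: fuse the size carried over from the previous frame with the
# size measured from key points + extrinsics in the current frame.

def fuse_size(state, measured_len, measured_wid):
    """state = ((length, length_var), (width, width_var)) between frames."""
    (l, pl), (w, pw) = state
    l, pl = kalman_update(l, pl, measured_len)  # predict, then correct length
    w, pw = kalman_update(w, pw, measured_wid)  # predict, then correct width
    return (l, pl), (w, pw)                     # target length/width + variances
```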
Step 206: and determining the first orientation of the target object in the (i + 1) th video frame according to the target attribute value of the target object and the edge detection frame.
Specifically, the determining the first orientation of the target object in the (i + 1) th video frame according to the target attribute value of the target object and the edge detection frame includes:
mapping the edge detection frame of the target object in the (i + 1) th video frame in a preset coordinate system to obtain a mapping key point of the edge detection frame of the target object in the (i + 1) th video frame;
and determining a first orientation of the target object in the (i + 1) th video frame based on the coordinate value of the mapping key point in the preset coordinate system and the target attribute value of the target object.
In practical application, when the first orientation of the target object in the (i + 1) th video frame is calculated, firstly mapping an edge detection frame of the target object in the (i + 1) th video frame in a world coordinate system to obtain a mapping key point of the edge detection frame of the target object in the (i + 1) th video frame; and then calculating to obtain a first orientation of the target object in the (i + 1) th video frame based on the coordinate value of the mapping key point in the world coordinate system and the target attribute value of the target object in the (i + 1) th video frame.
Referring to fig. 4, fig. 4 is a schematic diagram illustrating a projection relationship of a target object in a world coordinate system in a target object orientation determination method according to an embodiment of the present specification.
With reference to fig. 3: the rectangle formed by ABCD in fig. 3 is the edge detection frame of the vehicle, and A, B, C in fig. 4 are three visible points of the vehicle. Through camera imaging, the three points corresponding to A, B, C on the lowest horizontal line (i.e., the imaging plane) are A', B', C'; the specific pixel values, i.e., coordinate values, of A', B', C' on the imaging plane can be obtained from the XOY coordinate system in fig. 4. From the coordinate values of A', B', C' on the imaging plane together with AC (the width of the vehicle) and BC (the length of the vehicle), the direction of BC in the XOY coordinate system, i.e., the orientation of the vehicle in the video frame, can be obtained.
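The heading extraction implied by this geometry can be sketched as follows; project_to_ground is a placeholder assumed for the projective mapping from image pixels to world ground-plane coordinates using the camera extrinsics, which the patent relies on but does not spell out.

```python
# Sketch: the first orientation as the direction of the projected long
# side BC, per the geometry of fig. 4. project_to_ground() is an assumed
# callback mapping a pixel point to world ground-plane coordinates.
import math

def first_orientation(b_px, c_px, project_to_ground):
    bx, by = project_to_ground(b_px)     # world coordinates of B
    cx, cy = project_to_ground(c_px)     # world coordinates of C
    return math.atan2(cy - by, cx - bx)  # heading angle of edge BC (rad)
```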
In the embodiments of the specification, the orientation of the target object is computed by geometric modeling, which by design is insensitive to the visual imaging and applies to pinhole and fisheye cameras alike, so the method is insensitive to modules with different parameters and has good mass-production characteristics. Moreover, by exploiting the constraint relation between the target object's length, width, and orientation, the orientation of the target object in the video frame can be computed more accurately from the coordinate values of the mapped key points of the target object's edge detection frame in the world coordinate system and the target object's length and width.
Specifically, the determining the first orientation of the target object in the i +1 th video frame based on the coordinate value of the mapping keypoint in the preset coordinate system and the target attribute value of the target object includes:
determining a first target edge and a second target edge of the target object based on the edge detection frame;
determining an edge value of the first target edge and an edge value of the second target edge according to the target attribute value of the target object;
and calculating to obtain a first orientation of the target object in the (i + 1) th video frame based on the coordinate value of the mapping key point in the preset coordinate system, the edge value of the first target edge and the edge value of the second target edge.
In practical applications, which edge is the length of the target object and which edge is the width can be determined from the edge detection frame of the target object. After the long edge and the wide edge of the target object are determined, length information is assigned to the long edge and width information to the wide edge according to the target attribute value of the target object; the orientation of the target object is then calculated from the coordinate values of the mapped key points in the world coordinate system, the length of the long edge, and the width of the wide edge.
In the embodiment of the present specification, by using a constraint relationship between the length and the width of the target object, the orientation of the target object in the video frame can be calculated and obtained more accurately through the coordinate value of the mapping key point of the edge detection frame of the target object in the world coordinate system and the length and the width of the target object.
Step 208: determining a second orientation of the target object in the (i + 1)-th video frame based on the target orientation of the target object in the i-th video frame, and determining the target orientation of the target object in the (i + 1)-th video frame based on the first orientation and the second orientation.
Specifically, the determining the target orientation of the target object in the ith video frame according to the target attribute value of the target object and the edge detection frame includes:
mapping the edge detection frame of the target object in the ith video frame in a preset coordinate system to obtain a mapping key point of the edge detection frame of the target object in the ith video frame;
determining a first orientation of a target object in the ith video frame based on a coordinate value of the mapping key point in the preset coordinate system and a target attribute value of the target object;
and in the case that i is 1, taking the first orientation of the target object in the ith video frame as the target orientation of the target object in the ith video frame.
Specifically, when the i-th video frame is the first video frame containing the target object and the target orientation of the target object in the i-th video frame is being calculated, the target orientation of the i-th video frame can be obtained directly by calculation from the coordinate values of the mapped key points in the world coordinate system and the length and width of the target object; when the i-th video frame is not the first video frame, its target orientation can be obtained in the same manner as the target orientation of the target object in the (i + 1)-th video frame.
In the embodiment of the present specification, when the ith video frame is the first video frame including the target object or the still video frame, there is no constraint relationship with speed, distance, and the like, and at this time, the target orientation of the ith video frame can be quickly obtained by directly mapping the coordinate values of the key points in the world coordinate system and the length and width of the target object.
Wherein the determining a first orientation of a target object in the ith video frame based on the coordinate values of the mapping keypoint in the preset coordinate system and the target attribute value of the target object comprises:
determining a first target edge and a second target edge of the target object based on the edge detection frame;
determining an edge value of the first target edge and an edge value of the second target edge according to the target attribute value of the target object;
and calculating to obtain a first orientation of the target object in the ith video frame based on the coordinate value of the mapping key point in the preset coordinate system, the edge value of the first target edge and the edge value of the second target edge.
In this embodiment of the present specification, when the ith video frame is the first video frame or a still video frame, the first orientation of the target object in the ith video frame may be obtained through calculation according to the mapping key point of the edge detection frame of the target object in the ith video frame and the length and width of the target object, and a more accurate orientation of the target object may be obtained through calculation by using a constraint relationship between the length and width of the target object and the orientation.
When the video frame is neither the first video frame nor a still video frame, in order to obtain a stable and accurate target orientation of the target object, the orientation of the target object needs to be further calculated and corrected through a bicycle model and a filter, implemented as follows:
said determining a second orientation of the target object in the i +1 th video frame based on the target orientation of the target object in the i th video frame comprises:
and inputting the target orientation of the target object in the ith video frame into a bicycle model to obtain a second orientation of the target object in the (i + 1) th video frame.
In practical applications, a bicycle model of the vehicle is established in advance, and the calculated orientation is used as the observation of a Kalman filter. In the bicycle model, the orientation of the vehicle is consistent with its velocity direction, and the differences between range measurements over the time sequence provide the speed information; thus a vehicle moving at high speed trusts the velocity direction more as its orientation, while a slow-moving or stationary vehicle trusts the geometrically constrained calculated orientation more. For a traveling vehicle, the motion speed can be obtained through ranging over the vehicle's time sequence or through lidar; the target orientation of the target object in the i-th video frame is input into the bicycle model, and the second orientation of the target object in the (i + 1)-th video frame can be predicted through the constraint relation between the target orientation and the vehicle's motion speed.
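For illustration, the heading update of a standard kinematic bicycle model is sketched below; the wheelbase and effective steering angle are assumptions that, in this tracking setting, would themselves have to be estimated from the observed motion rather than read from the target vehicle.

```python
# Sketch: kinematic bicycle model heading prediction for the next frame.
import math

def predict_heading(psi, v, delta, wheelbase, dt):
    """psi: current heading (rad); v: speed (m/s); delta: steering angle (rad)."""
    return psi + (v / wheelbase) * math.tan(delta) * dt
```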
The first orientation and the second orientation of the target object in the (i + 1)-th video frame are input into a Kalman filter, which weights and fuses them to obtain the target orientation of the target object in the (i + 1)-th video frame. In addition, the observations of the Kalman filter may further include the key point information of the target object in the (i + 1)-th video frame; in the orientation computation, the key points have a positive effect.
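A sketch of this weighted fusion follows; the logistic speed weight is an assumed stand-in for the Kalman gain, encoding the heuristic above (a fast vehicle trusts the motion-based prediction, a slow or stationary one trusts the geometric estimate), and the angles are blended through sine and cosine to handle wrap-around.

```python
# Sketch: speed-weighted fusion of the geometric (first) and predicted
# (second) orientations into the target orientation.
import math

def fuse_orientation(theta_geom, theta_pred, speed, v0=3.0):
    w = 1.0 / (1.0 + math.exp(-(speed - v0)))  # fast -> trust prediction
    s = (1 - w) * math.sin(theta_geom) + w * math.sin(theta_pred)
    c = (1 - w) * math.cos(theta_geom) + w * math.cos(theta_pred)
    return math.atan2(s, c)                    # fused target orientation (rad)
```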
In the embodiments of the specification, the target object orientation determining method provides a new geometric modeling scheme that applies to vehicles near and far, and to stationary and moving vehicles alike, with no limitation in principle. Replacing the prior-art deep learning model with geometric modeling has advantages in computing resources and time consumption, so the method suits highly real-time autonomous driving applications. Disassembling the end-to-end deep learning model into several sub-models plus a new geometric model adds redundancy, mitigates instability and the long-tail problem, and improves the safety of the autonomous driving application. The geometric modeling is by design insensitive to the visual imaging and applies to pinhole and fisheye cameras, so it is insensitive to modules with different parameters (such as camera extrinsics) and has good mass-production characteristics; and as a physical modeling scheme, it has strong interpretability.
In the following, with reference to fig. 5, taking an application of the target object orientation determining method provided in this specification in automatic driving of a vehicle as an example, the target object orientation determining method is further described, specifically including the following steps.
Step 502: receiving a video frame containing a vehicle, and obtaining the whole vehicle detection frame, key points, and vehicle type of the vehicle in the video frame.
Specifically, the whole vehicle detection frame, key points, and vehicle type of the vehicle in the video frame are obtained from the three deep learning models.
Step 504: obtaining the initial length and width of the vehicle in the video frame according to the vehicle type of the vehicle.
Step 506: under the constraint of the initial length and width of the vehicle, determining the true length and width of the vehicle according to the key points of the vehicle and the camera extrinsics of the captured vehicle video frame.
Step 508: calculating the initial orientation of the vehicle in the video frame according to the length and width of the vehicle and the constraint relation of the mapped key points of the whole vehicle detection frame in the world coordinate system.
Step 510: establishing a bicycle motion model for the vehicle in advance, obtaining the predicted orientation of the vehicle in the video frame, and estimating the initial orientation and the predicted orientation using a Kalman filter.
Step 512: obtaining the stable orientation of the vehicle after the initial orientation and the predicted orientation are weighted and fused by the Kalman filter.
The target object orientation determining method provided by the embodiments of the specification, applied to vehicle autonomous driving, obtains a stable and accurate vehicle orientation by modeling with the constraint relation between the vehicle's length, width, and orientation. It is applicable to stationary and traveling vehicles, and to vehicles near and far, with no limitation in principle and better overall performance.
Corresponding to the above method embodiment, the present specification further provides an embodiment of a target object orientation determining apparatus, and fig. 6 illustrates a schematic structural diagram of a target object orientation determining apparatus provided in an embodiment of the present specification. As shown in fig. 6, the apparatus includes:
a first video receiving module 602, configured to receive an i +1 th video frame containing a target object, and obtain a shape, a key point, and an edge detection frame of the target object in the i +1 th video frame;
a first determination module 604 configured to determine a target attribute value of the target object based on the shape of the target object and the keypoints;
a second determining module 606 configured to determine a first orientation of a target object in the i +1 th video frame according to a target attribute value of the target object and an edge detection box;
a first target orientation determination module 608, configured to determine a second orientation of the target object in the (i + 1)-th video frame based on the target orientation of the target object in the i-th video frame, the target orientation of the target object in the (i + 1)-th video frame being determined based on the first orientation and the second orientation.
Optionally, the apparatus further includes:
the second video receiving module is configured to receive an ith video frame containing a target object and acquire the shape, key points and an edge detection frame of the target object in the ith video frame;
a third determination module configured to determine a target attribute value of the target object based on the shape of the target object and the keypoints;
a second target orientation determination module configured to determine a target orientation of a target object in the ith video frame according to the target attribute value of the target object and the edge detection box.
Optionally, the first video receiving module 602 is further configured to:
and respectively inputting the (i + 1) th video frame into a first recognition model, a second recognition model and a third recognition model, and acquiring the shape, key points and an edge detection frame of the target object in the (i + 1) th video frame.
Optionally, the third determining module is further configured to:
determining a first initial attribute value of a target object in the ith video frame based on a shape of the target object in the ith video frame;
determining a second initial attribute value of the target object in the ith video frame based on the key point of the target object in the ith video frame and the camera external parameter for acquiring the ith video frame;
and taking the second initial attribute value as a target attribute value of a target object in the ith video frame when the second initial attribute value is less than or equal to the first initial attribute value.
Optionally, the first determining module 604 is further configured to:
determining a first initial attribute value of a target object in the (i + 1) th video frame based on the shape of the target object in the (i + 1) th video frame;
determining a second initial attribute value of a target object in the (i + 1) th video frame based on key points of the target object in the (i + 1) th video frame and camera external parameters for acquiring the (i + 1) th video frame;
determining a third initial attribute value of a target object in the (i + 1) th video frame based on a target attribute value of the target object in the ith video frame if the second initial attribute value is less than or equal to the first initial attribute value;
determining a target attribute value for a target object in the (i + 1) th video frame based on the second initial attribute value and the third initial attribute value.
Optionally, the second determining module 606 is further configured to:
mapping the edge detection frame of the target object in the (i + 1) th video frame in a preset coordinate system to obtain a mapping key point of the edge detection frame of the target object in the (i + 1) th video frame;
and determining a first orientation of the target object in the (i + 1) th video frame based on the coordinate value of the mapping key point in the preset coordinate system and the target attribute value of the target object.
Optionally, the second determining module 606 is further configured to:
determining a first target edge and a second target edge of the target object based on the edge detection frame;
determining an edge value of the first target edge and an edge value of the second target edge according to the target attribute value of the target object;
calculating a first orientation of the target object in the (i + 1) th video frame based on the coordinate value of the mapping key point in the preset coordinate system, the edge value of the first target edge and the edge value of the second target edge.
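One way to concretize this edge-based calculation is sketched below: among the four sides of the mapped box, the side whose measured length best matches the known long-edge value is taken as the heading direction. This matching heuristic is an assumption, as is the corner ordering.

```python
import math

def first_orientation(corners, long_edge: float, short_edge: float) -> float:
    """Estimate the first orientation (radians) from the four mapping key
    points of the edge detection frame in the preset bird's-eye coordinate
    system.

    corners: list of four (x, y) points in box order. long_edge/short_edge:
    the edge values derived from the target attribute value. Picking the
    projected side closest in length to the long edge is an assumed
    concretization of the calculation.
    """
    best_side, best_err = None, float("inf")
    for a, b in zip(corners, corners[1:] + corners[:1]):  # the four sides
        side_len = math.hypot(b[0] - a[0], b[1] - a[1])
        err_long = abs(side_len - long_edge)
        # Keep only sides that look more like the long edge than the short one.
        if err_long < abs(side_len - short_edge) and err_long < best_err:
            best_side, best_err = (a, b), err_long
    if best_side is None:
        raise ValueError("no side matches the long edge; box may be degenerate")
    (ax, ay), (bx, by) = best_side
    return math.atan2(by - ay, bx - ax)
```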
Optionally, the second target orientation determining module is further configured to:
mapping the edge detection frame of the target object in the ith video frame in a preset coordinate system to obtain a mapping key point of the edge detection frame of the target object in the ith video frame;
determining a first orientation of a target object in the ith video frame based on a coordinate value of the mapping key point in the preset coordinate system and a target attribute value of the target object;
in the case that i is 1, taking the first orientation of the target object in the ith video frame as the target orientation of the target object in the ith video frame.
Optionally, the second target orientation determining module is further configured to:
determining a first target edge and a second target edge of the target object based on the edge detection frame;
determining an edge value of the first target edge and an edge value of the second target edge according to the target attribute value of the target object;
calculating a first orientation of the target object in the ith video frame based on the coordinate value of the mapping key point in the preset coordinate system, the edge value of the first target edge and the edge value of the second target edge.
Optionally, the first target orientation determining module 608 is further configured to:
inputting the target orientation of the target object in the ith video frame into a bicycle model to obtain a second orientation of the target object in the (i + 1) th video frame.
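A standard kinematic bicycle model update is sketched below; the speed, steering angle, wheelbase and inter-frame interval are assumed inputs, since the embodiments only name the bicycle model.

```python
import math

def bicycle_model_heading(theta_i: float, v: float, delta: float,
                          wheelbase: float, dt: float) -> float:
    """Predict the second orientation for the (i + 1) th frame from the
    target orientation in the ith frame with a kinematic bicycle model.

    v (speed, m/s), delta (front-wheel steering angle, rad), wheelbase (m)
    and dt (inter-frame time, s) are assumed inputs.
    """
    # Standard kinematic bicycle heading update: d(theta)/dt = v/L * tan(delta).
    theta_next = theta_i + (v / wheelbase) * math.tan(delta) * dt
    # Normalize back into (-pi, pi].
    return math.atan2(math.sin(theta_next), math.cos(theta_next))
```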
An embodiment of the present specification thus provides a target object orientation determining apparatus that decomposes the calculation of the target orientation into multiple steps, such as shape estimation, target attribute value calculation and orientation calculation, so that the individual steps are well decoupled yet easily fused, and an accurate and stable target orientation of the target object is finally obtained.
The foregoing is a schematic configuration of the target object orientation determining apparatus of the present embodiment. It should be noted that the technical solution of the target object orientation determining apparatus and the technical solution of the target object orientation determining method belong to the same concept, and for details that are not described in detail in the technical solution of the target object orientation determining apparatus, reference may be made to the description of the technical solution of the target object orientation determining method.
The target object orientation determining apparatus provided in the embodiments of the present specification may be applied to an autonomous driving scenario, in which the orientation of a vehicle is calculated by the functional modules of the embodiments (for example, the first determining module, the second determining module and the first target orientation determining module).
Of course, these algorithm modules (such as the functional modules described above) may vary with the type of autonomous vehicle. For example, different algorithm modules may be involved for logistics vehicles, public service vehicles, medical service vehicles and terminal service vehicles. The algorithm modules for these four types of autonomous vehicle are illustrated below:
the logistics vehicle refers to a vehicle used in a logistics scene, and may be, for example, a logistics vehicle with an automatic sorting function, a logistics vehicle with a refrigeration and heat preservation function, and a logistics vehicle with a measurement function. These logistics vehicles may involve different algorithm modules.
For example, a logistics vehicle may be provided with an automatic sorting device that automatically takes out, conveys, sorts and stores the goods once the vehicle reaches its destination. This involves an algorithm module for goods sorting, which mainly implements the control logic for taking out, conveying, sorting and storing the goods.
For another example, in a cold-chain logistics scenario, the logistics vehicle may further include a refrigeration and insulation device that refrigerates or insulates transported fruits, vegetables, aquatic products, frozen foods and other perishable goods, keeping the transport environment at a suitable temperature and solving the problem of long-distance transport of perishables. This involves an algorithm module for adaptive temperature control, which dynamically calculates a suitable refrigeration or insulation temperature from information such as the nature and perishability of the food (or goods), the transport time, the current season and the climate, and adjusts the refrigeration and insulation device accordingly. Transport workers therefore no longer need to set the temperature manually for different goods, which frees them from tedious temperature regulation and improves the efficiency of refrigerated transport.
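An illustrative-only sketch of such a set-point calculation follows; every coefficient and threshold here is invented for illustration, since the module is described only qualitatively.

```python
def suitable_temperature(perishability: float, transport_hours: float,
                         ambient_c: float) -> float:
    """Map cargo perishability (0..1), transport time and ambient climate to
    a refrigeration set-point in degrees Celsius.

    All coefficients are invented for illustration; only the inputs are
    suggested by the description above.
    """
    setpoint = 4.0                      # nominal cold-chain temperature
    setpoint -= 3.0 * perishability     # keep highly perishable cargo colder
    setpoint -= 0.1 * transport_hours   # extra margin for long trips
    if ambient_c > 30.0:                # hot climates get a further offset
        setpoint -= 1.0
    return max(setpoint, -18.0)         # clamp at a deep-freeze floor
```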
For another example, most logistics scenarios charge by the volume and/or weight of the parcel, but the number of parcels is very large, and having couriers measure volume and weight by hand is inefficient and costly in labor. Some logistics vehicles are therefore additionally fitted with a measuring device that automatically measures the volume and/or weight of a parcel and calculates its shipping fee. This involves an algorithm module for parcel measurement, which mainly identifies the type of parcel, selects a measurement mode (volume measurement, weight measurement, or a combination of the two), completes the measurement in the selected mode, and calculates the fee from the measurement result.
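A minimal sketch of the charging step is given below, assuming a volumetric-weight style rule; the rates and the rule itself are illustrative inventions.

```python
from typing import Optional

def parcel_fee(volume_m3: Optional[float], weight_kg: Optional[float],
               rate_volume: float = 300.0, rate_weight: float = 2.0) -> float:
    """Charge by volume, weight, or the larger of the two when both were
    measured (a volumetric-weight style rule).

    The rule and the rates are invented for illustration.
    """
    if volume_m3 is not None and weight_kg is not None:
        return max(volume_m3 * rate_volume, weight_kg * rate_weight)
    if volume_m3 is not None:
        return volume_m3 * rate_volume
    if weight_kg is not None:
        return weight_kg * rate_weight
    raise ValueError("no measurement available for this parcel")
```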
The public service vehicle refers to a vehicle providing a certain public service, and may be, for example, a fire truck, an ice removal truck, a water sprinkler, a snow clearer, a garbage disposal vehicle, a traffic guidance vehicle, and the like. These public service vehicles may involve different algorithm modules.
For example, the main task of an autonomous fire-fighting vehicle is to carry out a reasonable fire-fighting operation at the fire scene. This involves an algorithm module for the fire-fighting task, which at least requires logic for identifying the fire situation, planning the fire-fighting scheme and automatically controlling the fire-fighting equipment.
For another example, the main task of an ice removal vehicle is to clear ice and snow from the road surface. This involves an algorithm module for ice removal, which at least needs to recognize the ice and snow conditions on the road, formulate an ice removal scheme accordingly (for example, which road sections need de-icing and which do not, whether to spread salt and in what quantity), and automatically control the de-icing device once the scheme is determined.
The medical service vehicle is an autonomous vehicle that can provide one or more medical services, such as disinfection, temperature measurement, medicine dispensing and isolation. This involves algorithm modules for the various self-service medical services. These modules mainly identify a disinfection demand and control the disinfection device so that it can disinfect for patients; or locate a patient and control the temperature measuring device to approach, for example, the patient's forehead automatically and take a temperature reading; or judge symptoms, issue a prescription from the judgment result, identify medicines or medicine containers, and control a medicine-picking manipulator so that it can fetch the prescribed medicines for the patient; and the like.
The terminal service vehicle is a self-service autonomous vehicle that can take the place of certain terminal devices and provide convenient services for users, for example printing, attendance checking, scanning, unlocking, payment and retail services.
For example, in some application scenarios a user often has to go to a specific location to print or scan a document, which is time-consuming and laborious. Terminal service vehicles providing printing/scanning services have therefore appeared. Such a vehicle can interconnect with the user's terminal device; the user sends a print instruction from the terminal device, and the vehicle responds by automatically printing the required document and delivering it to the user's location, so that the user no longer queues at a printer and printing efficiency is greatly improved. Likewise, the vehicle can respond to a scan instruction sent from the terminal device by driving to the user's location, where the user places the document on the vehicle's scanning tool to complete the scan without queueing at a printer/scanner, saving time and effort. This involves an algorithm module for the print/scan service, which at least needs to implement interconnection with the user's terminal device, response to print/scan instructions, locating the user's position, and travel control.
For another example, with the development of new retail, more and more e-commerce merchants deliver goods to office buildings and public areas through vending machines, yet vending machines are fixed in place: users must still walk to them to buy what they need, so convenience remains limited. Autonomous vehicles providing retail services have therefore appeared. Such a vehicle can carry goods and move autonomously, and provides a self-service shopping APP or shopping entrance through which a user can place an order from a terminal such as a mobile phone, the order containing the names and quantities of the goods to be purchased. After receiving the order request, the vehicle determines whether the requested goods are among its current stock and whether the quantity is sufficient. This involves an algorithm module for the retail service, which mainly implements the logic of responding to customer order requests, order processing, goods information maintenance, customer positioning, payment management and the like.
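The stock check at the heart of the order-processing logic might look like the following sketch; the name-to-quantity data shapes are assumptions made for illustration.

```python
def process_order(order, stock):
    """Accept an order only if every requested item is on board in
    sufficient quantity, then reserve the goods.

    order and stock are dicts mapping item name to quantity; these data
    shapes are assumptions for illustration.
    """
    for item, qty in order.items():
        if stock.get(item, 0) < qty:
            return False        # reject: item missing or quantity insufficient
    for item, qty in order.items():
        stock[item] -= qty      # reserve the purchased goods
    return True
```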
Referring to fig. 7, fig. 7 illustrates a block diagram of a computing device 700 provided in accordance with one embodiment of the present description. Components of the computing device 700 include, but are not limited to, a memory 710 and a processor 720. Processor 720 is coupled to memory 710 via bus 730, and database 750 is used to store data.
Computing device 700 also includes an access device 740 that enables computing device 700 to communicate via one or more networks 760. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the Internet. The access device 740 may include one or more of any type of wired or wireless network interface (e.g., a Network Interface Card (NIC)), such as an IEEE 802.11 Wireless Local Area Network (WLAN) interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 700, as well as other components not shown in FIG. 7, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 7 is for purposes of example only and is not limiting as to the scope of the present description. Other components may be added or replaced as desired by those skilled in the art.
Computing device 700 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet computer, personal digital assistant, laptop computer, notebook computer, netbook, etc.), mobile phone (e.g., smartphone), wearable computing device (e.g., smartwatch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 700 may also be a mobile or stationary server.
The processor 720 is configured to execute computer-executable instructions that, when executed by the processor, implement the steps of the target object orientation determination method.
The foregoing is a schematic diagram of a computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the target object orientation determining method described above belong to the same concept, and for details that are not described in detail in the technical solution of the computing device, reference may be made to the description of the technical solution of the target object orientation determining method described above.
An embodiment of the present specification also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the target object orientation determination method.
The above is an illustrative scheme of a computer-readable storage medium of the embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the technical solution of the target object orientation determining method described above, and for details that are not described in detail in the technical solution of the storage medium, reference may be made to the description of the technical solution of the target object orientation determining method described above.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The computer instructions comprise computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be added to or removed from as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions the computer-readable medium excludes electrical carrier signals and telecommunication signals.
It should be noted that, for the sake of simplicity, the foregoing method embodiments are described as a series of acts, but those skilled in the art should understand that the present embodiment is not limited by the described acts, because some steps may be performed in other sequences or simultaneously according to the present embodiment. Further, those skilled in the art should also appreciate that the embodiments described in this specification are preferred embodiments and that acts and modules referred to are not necessarily required for an embodiment of the specification.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are intended only to aid in the description of the specification. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the embodiments. The specification is limited only by the claims and their full scope and equivalents.

Claims (12)

1. A target object orientation determination method, comprising:
receiving an (i + 1) th video frame containing a target object, and acquiring the shape, key points and an edge detection frame of the target object in the (i + 1) th video frame;
determining a target attribute value of the target object based on the shape of the target object and the key points;
determining a first orientation of a target object in the (i + 1) th video frame according to the target attribute value of the target object and an edge detection frame;
determining a second orientation of the target object in the (i + 1) th video frame based on the target orientation of the target object in the ith video frame, and determining the target orientation of the target object in the (i + 1) th video frame based on the first orientation and the second orientation.
2. The target object orientation determination method of claim 1, prior to receiving an (i + 1) th video frame containing a target object, further comprising:
receiving an ith video frame containing a target object, and acquiring the shape, key points and an edge detection frame of the target object in the ith video frame;
determining a target attribute value of the target object based on the shape of the target object and the keypoints;
determining the target orientation of the target object in the ith video frame according to the target attribute value of the target object and the edge detection frame.
3. The method for determining the orientation of a target object according to claim 1 or 2, wherein the obtaining of the shape, the key points and the edge detection frame of the target object in the (i + 1) th video frame comprises:
inputting the (i + 1) th video frame into a first recognition model, a second recognition model and a third recognition model, respectively, and obtaining the shape, the key point and the edge detection frame of the target object in the (i + 1) th video frame.
4. The target object orientation determination method of claim 2, the determining a target property value of the target object based on a shape of the target object and a keypoint, comprising:
determining a first initial attribute value of a target object in the ith video frame based on a shape of the target object in the ith video frame;
determining a second initial attribute value of the target object in the ith video frame based on the key point of the target object in the ith video frame and the camera external parameter for acquiring the ith video frame;
when the second initial attribute value is less than or equal to the first initial attribute value, taking the second initial attribute value as the target attribute value of the target object in the ith video frame.
5. The target object orientation determination method of claim 4, the determining a target property value of the target object based on a shape of the target object and keypoints, comprising:
determining a first initial attribute value of a target object in the (i + 1) th video frame based on the shape of the target object in the (i + 1) th video frame;
determining a second initial attribute value of a target object in the (i + 1) th video frame based on key points of the target object in the (i + 1) th video frame and camera external parameters for acquiring the (i + 1) th video frame;
determining a third initial attribute value of a target object in the (i + 1) th video frame based on a target attribute value of the target object in the ith video frame if the second initial attribute value is less than or equal to the first initial attribute value;
determining a target attribute value for a target object in the (i + 1) th video frame based on the second initial attribute value and the third initial attribute value.
6. The target object orientation determination method of claim 1, wherein determining the first orientation of the target object in the (i + 1) th video frame according to the target attribute value of the target object and the edge detection box comprises:
mapping the edge detection frame of the target object in the (i + 1) th video frame in a preset coordinate system to obtain a mapping key point of the edge detection frame of the target object in the (i + 1) th video frame;
determining a first orientation of the target object in the (i + 1) th video frame based on the coordinate value of the mapping key point in the preset coordinate system and the target attribute value of the target object.
7. The target object orientation determination method of claim 6, wherein the determining the first orientation of the target object in the i +1 th video frame based on the coordinate values of the mapping keypoint in the preset coordinate system and the target attribute value of the target object comprises:
determining a first target edge and a second target edge of the target object based on the edge detection frame;
determining an edge value of the first target edge and an edge value of the second target edge according to the target attribute value of the target object;
calculating a first orientation of the target object in the (i + 1) th video frame based on the coordinate value of the mapping key point in the preset coordinate system, the edge value of the first target edge and the edge value of the second target edge.
8. The method for determining the target orientation of a target object according to claim 2, wherein determining the target orientation of the target object in the ith video frame according to the target attribute value of the target object and an edge detection box comprises:
mapping the edge detection frame of the target object in the ith video frame in a preset coordinate system to obtain a mapping key point of the edge detection frame of the target object in the ith video frame;
determining a first orientation of a target object in the ith video frame based on a coordinate value of the mapping key point in the preset coordinate system and a target attribute value of the target object;
in the case that i is 1, taking the first orientation of the target object in the ith video frame as the target orientation of the target object in the ith video frame.
9. The target object orientation determining method according to claim 8, said determining a first orientation of a target object in the i-th video frame based on coordinate values of the mapping keypoint in the preset coordinate system and a target attribute value of the target object, comprising:
determining a first target edge and a second target edge of the target object based on the edge detection frame;
determining an edge value of the first target edge and an edge value of the second target edge according to the target attribute value of the target object;
calculating a first orientation of the target object in the ith video frame based on the coordinate value of the mapping key point in the preset coordinate system, the edge value of the first target edge and the edge value of the second target edge.
10. A target object orientation determination apparatus comprising:
a first video receiving module configured to receive an (i + 1) th video frame containing a target object and acquire the shape, key points and an edge detection frame of the target object in the (i + 1) th video frame;
a first determination module configured to determine a target attribute value of the target object based on a shape of the target object and a keypoint;
a second determining module configured to determine a first orientation of a target object in the (i + 1) th video frame according to a target attribute value of the target object and an edge detection box;
a first target orientation determination module configured to determine a second orientation of the target object in the (i + 1) th video frame based on the target orientation of the target object in the ith video frame, the target orientation of the target object in the (i + 1) th video frame being determined based on the first orientation and the second orientation.
11. A computing device, comprising:
a memory and a processor;
the memory is for storing computer-executable instructions for execution by the processor, which when executed by the processor implement the steps of the target object orientation determination method of any one of claims 1 to 9.
12. A computer readable storage medium storing computer executable instructions which, when executed by a processor, carry out the steps of the target object orientation determination method of any one of claims 1 to 9.
CN202110378944.2A 2021-04-08 2021-04-08 Target object orientation determining method and device Pending CN115205324A (en)

Priority Applications (1)

Application Number: CN202110378944.2A; Priority Date: 2021-04-08; Filing Date: 2021-04-08; Title: Target object orientation determining method and device

Publications (1)

Publication Number: CN115205324A; Publication Date: 2022-10-18


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination