CN111723716B - Method, device, system, medium and electronic equipment for determining target object orientation - Google Patents


Info

Publication number: CN111723716B
Authority: CN (China)
Prior art keywords: dimensional coordinate, determining, dimensional, depth, coordinate system
Legal status: Active (granted)
Application number: CN202010527892.6A
Other languages: Chinese (zh)
Other versions: CN111723716A (en)
Inventors: 杨哲宁, 刘强, 李秦
Current Assignee: Shenzhen Horizon Robotics Science and Technology Co Ltd
Original Assignee: Shenzhen Horizon Robotics Science and Technology Co Ltd
Application filed by: Shenzhen Horizon Robotics Science and Technology Co Ltd
Priority to: CN202010527892.6A
Publication of CN111723716A
Application granted; publication of CN111723716B

Classifications

    • G PHYSICS > G06 COMPUTING; CALCULATING OR COUNTING > G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING > G06V 20/00 Scenes; Scene-specific elements > G06V 20/50 Context or environment of the image > G06V 20/59 Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G PHYSICS > G06 COMPUTING; CALCULATING OR COUNTING > G06F ELECTRIC DIGITAL DATA PROCESSING > G06F 18/00 Pattern recognition > G06F 18/20 Analysing > G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation > G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS > G06 COMPUTING; CALCULATING OR COUNTING > G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS > G06N 3/00 Computing arrangements based on biological models > G06N 3/02 Neural networks > G06N 3/04 Architecture, e.g. interconnection topology > G06N 3/045 Combinations of networks
    • G PHYSICS > G06 COMPUTING; CALCULATING OR COUNTING > G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL > G06T 7/00 Image analysis > G06T 7/50 Depth or shape recovery
    • G PHYSICS > G06 COMPUTING; CALCULATING OR COUNTING > G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL > G06T 7/00 Image analysis > G06T 7/70 Determining position or orientation of objects or cameras


Abstract

Disclosed are a method, an apparatus, a system, a medium and an electronic device for determining the orientation of a target object. The method includes: determining a first three-dimensional coordinate of a key point of a monitored target from a first depth map acquired by a first depth camera, the first three-dimensional coordinate being a coordinate in a first three-dimensional coordinate system of the first depth camera; acquiring a second three-dimensional coordinate of a preset target point in a second three-dimensional coordinate system, the preset target point being located outside the field of view of the first depth camera; performing coordinate conversion on one of the first three-dimensional coordinate and the second three-dimensional coordinate according to the positional relationship between the first three-dimensional coordinate system and the second three-dimensional coordinate system, to obtain a third three-dimensional coordinate; and determining orientation information of the monitored target according to the third three-dimensional coordinate and the other of the first three-dimensional coordinate and the second three-dimensional coordinate. This helps reduce the cost of determining the orientation of a target object and makes it possible to determine the orientation of a target object in a confined space.

Description

Method, device, system, medium and electronic equipment for determining target object orientation
Technical Field
The present disclosure relates to computer vision technology, and more particularly, to a method of determining a target object orientation, an apparatus for determining a target object orientation, a system for determining a target object orientation, a storage medium, and an electronic device.
Background
In applications such as a DMS (Driver Monitoring System), it is often necessary to determine the line-of-sight information of the driver in captured images, that is, the direction in which the driver is looking. The driver's line-of-sight information can be used to train the neural networks in the DMS.
Collecting images in the actual driving environment and determining the driver's line-of-sight information in those images helps improve the training of the neural networks and, in turn, the performance of the DMS. However, because the actual driving environment is usually a relatively confined space, determining the driver's line-of-sight information in the collected images is rather difficult.
Disclosure of Invention
The present disclosure has been made in order to solve the above technical problems. The embodiment of the disclosure provides a method, a device, a storage medium and electronic equipment for determining the orientation of a target object.
According to a first aspect of embodiments of the present disclosure, there is provided a method of determining the orientation of a target object, the method comprising: determining a first three-dimensional coordinate of a key point of a monitored target from a first depth map acquired by a first depth camera, wherein the first three-dimensional coordinate is a three-dimensional coordinate in a first three-dimensional coordinate system of the first depth camera; acquiring a second three-dimensional coordinate of a preset target point in a second three-dimensional coordinate system, wherein the preset target point is located outside the field of view of the first depth camera; performing coordinate conversion on one of the first three-dimensional coordinate and the second three-dimensional coordinate according to the positional relationship between the first three-dimensional coordinate system and the second three-dimensional coordinate system, to obtain a third three-dimensional coordinate; and determining orientation information of the monitored target according to the third three-dimensional coordinate and the other of the first three-dimensional coordinate and the second three-dimensional coordinate.
According to a second aspect of the embodiments of the present disclosure, there is provided an apparatus for determining an orientation of a target object, including: the first acquisition module is used for determining a first three-dimensional coordinate of a key point of the monitored target from a first depth map acquired by the first depth camera; the first three-dimensional coordinates are three-dimensional coordinates in a first three-dimensional coordinate system of the first depth image pickup device; the second acquisition module is used for acquiring a second three-dimensional coordinate of the preset target point in a second three-dimensional coordinate system; wherein the preset target point is located outside the visual field range of the first depth camera; the coordinate conversion module is used for carrying out coordinate conversion on one three-dimensional coordinate in the first three-dimensional coordinate and the second three-dimensional coordinate according to the position relationship between the first three-dimensional coordinate system and the second three-dimensional coordinate system to obtain a third three-dimensional coordinate; and the orientation determining module is used for determining orientation information of the monitored target according to the third three-dimensional coordinate and the other three-dimensional coordinate of the first three-dimensional coordinate and the second three-dimensional coordinate.
According to a third aspect of embodiments of the present disclosure, there is provided a system for determining the orientation of a target object, the system comprising: the first depth camera is arranged in the driving space of the vehicle and faces the monitored target, and is used for collecting a first depth map containing the monitored target; a plurality of preset target points; the second depth image pick-up device is arranged in the driving space of the vehicle, is opposite to the first depth image pick-up device, faces the preset target point and is used for collecting a second depth image containing the preset target point; and the device is used for determining the orientation information of the monitored target according to the first depth map acquired by the first depth camera and the second depth map acquired by the second depth camera.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium storing a computer program for implementing the above method.
According to a fifth aspect of embodiments of the present disclosure, there is provided an electronic device, comprising: a processor; a memory for storing the processor-executable instructions; the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method described above.
According to the method, apparatus and system for determining the orientation of a target object provided by the embodiments of the present disclosure, the first three-dimensional coordinate of a key point of the monitored target is obtained from the first depth map acquired by the first depth camera, and the second three-dimensional coordinate of a preset target point located outside the field of view of the first depth camera is obtained separately, so that the orientation information of the monitored target can be determined. Even in a confined space, the orientation information of the monitored target can be conveniently obtained simply by installing a depth camera facing the monitored target and setting preset target points in that space, so that the orientation information of a monitored target such as a driver can be conveniently and quickly obtained in a confined real space such as an actual driving space.
The technical scheme of the present disclosure is described in further detail below through the accompanying drawings and examples.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing embodiments thereof in more detail with reference to the accompanying drawings. The accompanying drawings are included to provide a further understanding of embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, not to limit the disclosure. In the drawings, like reference numerals generally refer to like parts or steps.
FIG. 1 is a schematic illustration of a scenario in which the present disclosure is applicable;
FIG. 2 is a flow chart of one embodiment of a method of determining a target object orientation of the present disclosure;
FIG. 3 is a flow chart of one embodiment of the present disclosure for determining a first three-dimensional coordinate of a keypoint of a monitored target;
FIG. 4 is a flowchart of one embodiment of acquiring a second three-dimensional coordinate of a preset target point in a second three-dimensional coordinate system according to the present disclosure;
FIG. 5 is a flow chart of one embodiment of the present disclosure for obtaining a positional relationship between a first three-dimensional coordinate system and a second three-dimensional coordinate system;
FIG. 6 is a schematic view of one embodiment of a calibration plate used in the present disclosure;
FIG. 7 is a schematic structural view of one embodiment of an apparatus for determining the orientation of a target object of the present disclosure;
FIG. 8 is a schematic diagram of one embodiment of a system of determining target object orientation of the present disclosure;
fig. 9 is a block diagram of an electronic device according to an exemplary embodiment of the present application.
Detailed Description
Example embodiments according to the present disclosure will be described in detail below with reference to the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present disclosure and not all of the embodiments of the present disclosure, and that the present disclosure is not limited by the example embodiments described herein.
It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless it is specifically stated otherwise.
It will be appreciated by those skilled in the art that the terms "first," "second," etc. in embodiments of the present disclosure are used merely to distinguish between different steps, devices or modules, and represent neither any particular technical meaning nor any necessary logical order between them.
It should also be understood that in embodiments of the present disclosure, "plurality" may refer to two or more, and "at least one" may refer to one, two or more.
It should also be appreciated that any component, data, or structure referred to in the embodiments of the present disclosure may generally be understood as one or more, unless explicitly limited or the context clearly indicates otherwise.
In addition, the term "and/or" in this disclosure merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B may indicate: A exists alone, A and B exist together, or B exists alone. In addition, the character "/" in the present disclosure generally indicates that the associated objects before and after it are in an "or" relationship.
It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and that the same or similar features may be referred to each other, and for brevity, will not be described in detail.
Meanwhile, it should be understood that, for convenience of description, the sizes of the various parts shown in the drawings are not drawn to actual scale.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but are intended to be part of the specification where appropriate.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
Embodiments of the present disclosure are applicable to electronic devices such as terminal devices, computer systems, servers, etc., which are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with the terminal device, computer system, or server, include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above systems, and the like.
Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc., that perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment. In a distributed cloud computing environment, tasks may be performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computing system storage media including memory storage devices.
Summary of the disclosure
In the process of realizing the present disclosure, the inventors found that, because the actual driving space of some vehicles is usually relatively narrow, in practical applications a large number of images are often collected in a simulated environment, and the orientation information (such as line-of-sight information) of a monitored target (such as a driver) in each image is determined, so as to form a plurality of training samples; the neural network is then trained using these training samples. However, since there are differences between the actual driving environment and the simulated environment, a neural network trained in this way is likely, when applied to the actual driving environment, to show a large deviation in the predicted direction of the driver's line of sight and the like, thereby affecting the performance of the DMS.
Exemplary overview
The techniques for determining the orientation of a target object provided by the present disclosure are applicable to application scenarios such as forming training samples. Forming training samples is also referred to as sample acquisition. An example of an application scenario to which the techniques for determining the orientation of a target object provided in the present disclosure are applicable is shown in fig. 1.
In fig. 1, a depth camera 100 is provided in the driving space of a vehicle, and the lens of the depth camera 100 faces the driver. Labels 101 are provided at a plurality of target points, such as the front windshield, the rear-view mirror, the steering wheel, the display screen, the insides of the two front doors, and the panel in front of the passenger seat. The label 101 may be embodied as a sticker or the like. Fig. 1 only schematically shows labels 101 provided at 13 target points; in practical applications, labels 101 may be provided at a larger number of target points.
A driver sitting in the driver seat is prompted to look at the label 101 at each target point one by one, and when the driver looks at the label 101 at a target point, the depth camera 100 is controlled to capture a depth map as well as a two-dimensional image (i.e., a 2D image). The present disclosure can determine the driver's line-of-sight direction (also called gaze direction) in each depth map from the three-dimensional coordinates of each target point and the three-dimensional coordinates of the driver's pupil key points in that depth map, so that line-of-sight-direction annotation information can be set for each of the captured 2D images, thereby forming a plurality of training samples. The present disclosure may train the neural networks in the DMS using the training samples generated as described above.
Exemplary method
Fig. 2 is a flow chart of one embodiment of a method of determining a target object orientation of the present disclosure. The method as shown in fig. 2 includes: s200, S201, S202, and S203. The steps are described separately below.
S200, determining first three-dimensional coordinates of key points of the monitored target from a first depth map acquired by the first depth camera.
The first depth image pickup apparatus in the present disclosure may refer to an image pickup apparatus having a depth detection function. The first depth image pickup device can obtain depth information of each scene, object, person, etc. within the shooting visual field. The first depth image pickup device can obtain not only a depth map but also a 2D image by photographing. The first depth map in the present disclosure may refer to a set of depth information formed by depth values of a plurality of pixels (e.g., all pixels) in a 2D image.
The monitored target in the present disclosure may refer to a target whose orientation information needs to be determined, for example, a driver whose orientation information needs to be determined, or a passenger whose orientation information needs to be determined.
The keypoints of the monitored object in the present disclosure may include one or more keypoints, and the keypoints of the monitored object are typically related to orientation information to be determined. That is, the orientation information formed by the key points of the monitored target can be used as the orientation information of the monitored target. For example, when the direction information to be determined is the line of sight direction, the key point of the monitored target in the present disclosure may refer to the pupil key point of the monitored target. For another example, when the direction information to be determined is finger pointing, the key point of the monitored target in the present disclosure may refer to a fingertip key point of the corresponding finger, or the like.
The first three-dimensional coordinates in the present disclosure may be three-dimensional coordinates in a first three-dimensional coordinate system of the first depth image capturing apparatus. The direction of the Z axis of the first three-dimensional coordinate system may be the direction of the optical axis of the first depth image pickup device. The X-axis of the first three-dimensional coordinate system may be directed horizontally rightward along the optical axis direction of the first depth image pickup device. The Y-axis of the first three-dimensional coordinate system may be oriented perpendicular to the X-axis direction and downward.
The method and the device can determine the corresponding pixel points of the key points of the monitored target in the first depth map, and can determine the first three-dimensional coordinates of the key points of the monitored target.
S201, acquiring a second three-dimensional coordinate of the preset target point in a second three-dimensional coordinate system.
The preset target point in the present disclosure is located outside the field of view of the first depth image capturing apparatus. The preset target point may be located in the space where the monitored target is located (e.g., in the vehicle where the driver is located), or may be located outside the space where the monitored target is located (e.g., outside the vehicle where the driver is located). The preset target point may be a target point specifically set for implementing the present disclosure, or may be an existing target point.
The second three-dimensional coordinate system in the present disclosure is a coordinate system different from the first three-dimensional coordinate system of the first depth image capturing apparatus. For example, the second three-dimensional coordinate system may be a world coordinate system, or may be a three-dimensional coordinate system of another imaging device (such as another depth imaging device), or the like.
The key points of the monitored target in the present disclosure are directed toward the preset target point. For example, the key points of the monitored target look at/point to the preset target point.
S202, according to the position relation between the first three-dimensional coordinate system and the second three-dimensional coordinate system, carrying out coordinate transformation on one three-dimensional coordinate in the first three-dimensional coordinate system and the second three-dimensional coordinate system, and obtaining a third three-dimensional coordinate.
The positional relationship between the first three-dimensional coordinate system and the second three-dimensional coordinate system in the present disclosure may refer to position conversion information capable of converting a point in one of the two three-dimensional coordinate systems into the other three-dimensional coordinate system. The present disclosure may utilize the position conversion information to perform position conversion on the first three-dimensional coordinate, so that the first three-dimensional coordinate is converted into a three-dimensional coordinate in the second three-dimensional coordinate system, that is, a third three-dimensional coordinate. The present disclosure may also use the position conversion information to perform position conversion on the second three-dimensional coordinate, so that the second three-dimensional coordinate is converted into a three-dimensional coordinate in the first three-dimensional coordinate system, that is, a third three-dimensional coordinate.
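As a purely illustrative sketch (not code taken from the disclosure), the coordinate conversion described above can be carried out with a rotation matrix and a translation vector describing the positional relationship between the two coordinate systems; the function and parameter names below are assumptions:

    import numpy as np

    def convert_point(point_xyz, rotation, translation):
        # Map a 3D point from the source coordinate system into the target one:
        # p_target = R @ p_source + t
        p = np.asarray(point_xyz, dtype=np.float64)
        return rotation @ p + translation

    def convert_point_back(point_xyz, rotation, translation):
        # Inverse mapping, from the target coordinate system back to the source one.
        p = np.asarray(point_xyz, dtype=np.float64)
        return rotation.T @ (p - translation)

With such a conversion, either the first three-dimensional coordinate or the second three-dimensional coordinate can be expressed in the other coordinate system, yielding the third three-dimensional coordinate.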
S203, determining the orientation information of the monitored target according to the third three-dimensional coordinate and the other three-dimensional coordinate of the first three-dimensional coordinate and the second three-dimensional coordinate.
The other three-dimensional coordinate (hereinafter simply referred to as another three-dimensional coordinate) among the first three-dimensional coordinate and the second three-dimensional coordinate in the present disclosure may refer to a three-dimensional coordinate in which position conversion is not performed among the first three-dimensional coordinate and the second three-dimensional coordinate. For example, in S202, if the first three-dimensional coordinate is subjected to position conversion, the second three-dimensional coordinate is another three-dimensional coordinate in S203. For another example, in S202, if the second three-dimensional coordinate is subjected to position conversion, the first three-dimensional coordinate is another three-dimensional coordinate in S203. Since the third three-dimensional coordinate and the other three-dimensional coordinate are in the same three-dimensional coordinate system, the present disclosure can obtain the orientation information of the monitored object by determining the direction of the line between the third three-dimensional coordinate and the other three-dimensional coordinate.
The orientation information of the monitored object in the present disclosure may be represented by using an angle between the above-mentioned connecting line and the X-axis, an angle between the above-mentioned connecting line and the Y-axis, and an angle between the above-mentioned connecting line and the Z-axis.
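As a minimal sketch of this representation (an assumption about how the angles could be computed, not the disclosure's own code):

    import numpy as np

    def orientation_from_points(keypoint_xyz, target_xyz):
        # Direction of the line from the key point to the preset target point,
        # plus its angles (in degrees) with the X, Y and Z axes of the shared
        # coordinate system.
        v = np.asarray(target_xyz, np.float64) - np.asarray(keypoint_xyz, np.float64)
        v = v / np.linalg.norm(v)          # unit direction vector
        angles = np.degrees(np.arccos(v))  # angle with each coordinate axis
        return v, angles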
The orientation information of the monitored target in the present disclosure may include: the line of sight direction of the monitored object or the finger pointing direction of the monitored object, and the like.
In the present disclosure, the first three-dimensional coordinate of the key point of the monitored target is obtained from the first depth map acquired by the first depth camera, and the second three-dimensional coordinate of the preset target point located outside the field of view of the first depth camera is obtained separately, so that the orientation information of the monitored target can be obtained. Therefore, even in a confined space, the orientation information of the monitored target can be conveniently obtained by installing a depth camera facing the monitored target and setting preset target points in that space, so that the orientation information of a monitored target such as a driver can be conveniently and quickly obtained in a confined real space such as an actual driving space.
When the orientation information of the monitored target is used as annotation information for images, the present disclosure can conveniently obtain a large number of training samples without manual annotation. Training the neural network with samples obtained in the real space avoids the situation in which, because of the differences between a simulated environment and the real environment, a successfully trained neural network shows a large deviation in the predicted orientation information (such as the line-of-sight direction) of the monitored target when applied in the real environment. In summary, the technical solution provided by the present disclosure helps reduce the cost of acquiring training samples, improves the convenience of acquiring them, and improves the training effect of the neural network.
In one optional example, the monitored target in the present disclosure may include: a driver located in a vehicle. Of course, the monitored target in the present disclosure may also include: a passenger located in a vehicle, and the like. The vehicle in the present disclosure may include: vehicles such as automobiles, trains, or airplanes.
Optionally, the direction information of the monitored target determined by the present disclosure may include: one or more of a line of sight direction of the monitored object, a finger direction of the monitored object, and the like. That is, the present disclosure may determine the line of sight direction of the monitored target according to the third three-dimensional coordinate and the other three-dimensional coordinate of the first three-dimensional coordinate and the second three-dimensional coordinate, and may determine the finger direction of the monitored target according to the third three-dimensional coordinate and the other three-dimensional coordinate of the first three-dimensional coordinate and the second three-dimensional coordinate.
Alternatively, the gaze direction and finger pointing obtained by the present disclosure may be applied in forming training samples. In one specific example, the present disclosure may obtain training samples with gaze-direction labels by obtaining the gaze direction of a monitored target, for example, the gaze direction of a driver in a vehicle; after a corresponding neural network in a DMS is trained with such training samples, the successfully trained neural network can predict the driver's gaze direction in real time from images of the driver's face acquired in real time, so that the DMS can take corresponding measures to ensure safe driving when the driver is fatigued or distracted. In another specific example, the present disclosure may obtain training samples with finger-pointing labels by obtaining the finger pointing of a monitored target, for example, the finger pointing of a driver in a vehicle; after a corresponding neural network in a DMS is trained with such training samples, the successfully trained neural network can predict the driver's finger pointing in real time from images of the driver's hand acquired in real time, so as to implement functions such as automatic control in the vehicle according to the driver's finger pointing.
In an alternative example, an embodiment of the present disclosure for determining the first three-dimensional coordinates of the keypoints of the monitored target from the first depth map acquired by the first depth image capturing device may be as shown in fig. 3.
In fig. 3, S300, two-dimensional coordinates of a key point of the monitored object in the two-dimensional image acquired by the first depth image capturing device are acquired.
Optionally, the first depth camera of the present disclosure may acquire 2D images in addition to the first depth map (i.e., depth image). The 2D image and the first depth map typically have the same viewing angle. The origin of the two-dimensional coordinate system of the 2D image may be at the upper left corner position of the 2D image or at another position such as the lower right corner of the 2D image. The two-dimensional coordinate system of the 2D image may be a coordinate system formed by an X-axis and a Y-axis in the first three-dimensional coordinate system.
Alternatively, the present disclosure may perform keypoint detection on the 2D image using a neural network (e.g., a convolutional neural network) for detecting keypoints, so that two-dimensional coordinates of the keypoints of the monitored object may be obtained from the output of the neural network.
Optionally, in a normal case, the present disclosure may enable the 2D image acquired by the first depth image capturing device to include only a face of a target, which is a monitored target, by adjusting a setting position of the first depth image capturing device. In the case where the face of the plurality of targets is included in the 2D image acquired by the first depth image pickup device, the present disclosure can obtain the face of the monitored target in the 2D image by the target tracking algorithm.
Alternatively, the present disclosure may obtain a face detection frame of a monitored target in a 2D image by using a first neural network, then, intercept a face image block from the 2D image by using the face detection frame, and provide the intercepted face image block as an input to a second neural network, where the second neural network may predict two-dimensional coordinates of a plurality of face key points in the input face image block, and the present disclosure may obtain two-dimensional coordinates of pupil key points from the two-dimensional coordinates of the plurality of face key points.
Optionally, the present disclosure may obtain a hand detection frame of a monitored target in a 2D image by using a first neural network, then intercept a hand image block from the 2D image by using the hand detection frame, and provide the intercepted hand image block as an input to a second neural network, where the second neural network may predict two-dimensional coordinates of a plurality of hand keypoints in the input hand image block, and the present disclosure may obtain two-dimensional coordinates of a finger tip keypoint of a corresponding finger from the two-dimensional coordinates of the plurality of hand keypoints.
Alternatively, when the keypoints of the monitored target are other keypoints than the pupil keypoint and the finger tip keypoints of the corresponding fingers, the present disclosure may obtain the two-dimensional coordinates of the corresponding keypoints of the monitored target in a similar manner, which is not illustrated here.
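A hedged sketch of the two-stage pipeline described above; detector and keypoint_net stand in for the first and second neural networks and are hypothetical placeholders, not components defined by the disclosure:

    import numpy as np

    def keypoints_2d(image, detector, keypoint_net):
        # Stage 1: detect the face (or hand) bounding box in the 2D image.
        x1, y1, x2, y2 = detector(image)
        patch = image[y1:y2, x1:x2]
        # Stage 2: predict key points inside the cropped image block.
        kps = np.asarray(keypoint_net(patch), dtype=np.float64)  # (N, 2)
        # Map the key points back to full-image pixel coordinates.
        kps[:, 0] += x1
        kps[:, 1] += y1
        return kps  # pupil or fingertip key points are then selected from these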
S301, determining a depth value corresponding to the two-dimensional coordinate according to a first depth map acquired by the first depth image pickup device.
Optionally, since the first depth map in the present disclosure includes depth values of a plurality of pixels (such as all pixel points) in the 2D image, after obtaining two-dimensional coordinates of a key point of the monitored target, the present disclosure may determine a depth value corresponding to the two-dimensional coordinates according to the two-dimensional coordinates.
S302, determining a first three-dimensional coordinate of a key point of the monitored object according to the two-dimensional coordinate and the depth value.
Optionally, the present disclosure may calculate two-dimensional coordinates of a key point of the monitored target using internal parameters (such as a focal length and a distortion factor) of the first depth image capturing device, so that an X-axis coordinate and a Y-axis coordinate of the key point in a three-dimensional coordinate system of the first depth image capturing device may be obtained. The X-axis coordinates and Y-axis coordinates form a first three-dimensional coordinate of the keypoint together with the depth value of the keypoint.
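The computation described in S301-S302 can be sketched as follows, assuming a pinhole model with intrinsics fx, fy, cx, cy and a depth value already expressed in the desired unit; lens distortion is assumed to have been corrected beforehand:

    import numpy as np

    def backproject(u, v, depth, fx, fy, cx, cy):
        # Lift pixel (u, v) with depth value `depth` into the depth camera's
        # three-dimensional coordinate system.
        x = (u - cx) * depth / fx
        y = (v - cy) * depth / fy
        return np.array([x, y, depth], dtype=np.float64)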
According to the method and the device, the two-dimensional coordinates of the key points of the monitored target are obtained through the two-dimensional images, and the first three-dimensional coordinates of the key points of the monitored target are obtained through the two-dimensional coordinates and the first depth map, so that the three-dimensional coordinates of the key points of the monitored target can be obtained conveniently.
In an alternative example, an embodiment of the present disclosure for acquiring the second three-dimensional coordinates of the preset target point in the second three-dimensional coordinate system may be as shown in fig. 4.
In fig. 4, S400, a second depth map acquired by a second depth camera is acquired.
Optionally, the second depth image capturing device of the present disclosure is disposed opposite to the first depth image capturing device, and the preset target point that cannot appear in the field of view of the first depth image capturing device may appear in the field of view of the second depth image capturing device.
Optionally, in the case where the present disclosure uses the second depth image capturing apparatus, the second three-dimensional coordinate system in the present disclosure is a three-dimensional coordinate system of the second depth image capturing apparatus. The direction of the Z axis of the second three-dimensional coordinate system may be the direction of the optical axis of the second depth image pickup device. The X-axis of the second three-dimensional coordinate system may be directed horizontally rightward along the optical axis direction of the second depth image pickup device. The Y-axis of the second three-dimensional coordinate system may be oriented perpendicular to the X-axis direction and downward.
S401, determining a second three-dimensional coordinate of the preset target point in a second three-dimensional coordinate system based on the second depth map.
Alternatively, the preset target point in the present disclosure may be a brightly colored paper label or pushpin, a magnetic label, a light source, or the like. The second depth camera of the present disclosure may acquire 2D images in addition to the second depth map (i.e., depth image). The 2D image and the second depth map typically have the same viewing angle. The origin of the two-dimensional coordinate system of the 2D image may be at the upper left corner of the 2D image or at another position such as the lower right corner. The two-dimensional coordinate system of the 2D image may be a coordinate system formed by the X-axis and Y-axis of the second three-dimensional coordinate system.
Alternatively, the present disclosure may detect the 2D image using a neural network (e.g., a convolutional neural network) for detecting the preset target point, so that two-dimensional coordinates of the preset target point in the 2D image may be obtained from an output of the neural network. In addition, the present disclosure may also determine two-dimensional coordinates of a preset target point in the 2D image using manual labeling.
Optionally, since the second depth map in the present disclosure includes depth values of a plurality of pixels (such as all pixel points) in the 2D image, after two-dimensional coordinates of a preset target point are obtained, the present disclosure may determine a depth value corresponding to the two-dimensional coordinates according to the two-dimensional coordinates.
Optionally, the present disclosure may calculate the two-dimensional coordinates of the preset target point using the internal parameters (such as the focal length and the distortion factor) of the second depth image capturing device, so as to obtain the X-axis coordinates and the Y-axis coordinates of the preset target point in the three-dimensional coordinate system of the second depth image capturing device. The X-axis coordinates and the Y-axis coordinates form a second three-dimensional coordinate of the preset target point together with the depth value of the preset target point.
According to the method and the device, the two-dimensional coordinates of the preset target point are obtained through the two-dimensional image, and the second three-dimensional coordinates of the preset target point are obtained through the two-dimensional coordinates and the second depth map, so that the second three-dimensional coordinates of the preset target point can be obtained conveniently. Based on the description of fig. 4, the present disclosure may implement a method of determining the orientation of a target object using two depth cameras.
In an alternative example, the first depth image capturing device and the second depth image capturing device in the present disclosure are both disposed within the driving space of the vehicle, and the second depth image capturing device is disposed facing all preset target points. Because the second depth image pickup device and the first depth image pickup device are arranged oppositely, and the second depth image pickup device faces all preset target points, all the preset target points cannot appear in the visual field range of the first depth image pickup device.
Alternatively, the first depth image pickup device in the present disclosure may be fixed at a front windshield or a driving console or the like. The second depth image pickup device in the present disclosure may be provided on the seat back of the passenger seat or at a rear space of the seat back of the passenger seat or the like. The view angles of the first and second depth image pickup devices of the present disclosure may be not less than 180 degrees.
By arranging the two depth cameras opposite each other in the driving space of the vehicle, the present disclosure makes it convenient to obtain the first three-dimensional coordinates of the key points of the monitored target and the second three-dimensional coordinates of the preset target points, and avoids the difficulties that an overly narrow driving space would otherwise bring to data acquisition.
In an alternative example, the present disclosure may perform external parameter calibration processing on the first depth image capturing device and the second depth image capturing device using a calibration plate, thereby obtaining a positional relationship between the first three-dimensional coordinate system and the second three-dimensional coordinate system. One embodiment of the present disclosure for obtaining a positional relationship between a first three-dimensional coordinate system and a second three-dimensional coordinate system using a calibration plate is shown in fig. 5.
In fig. 5, S500, a third depth map and a fourth depth map are obtained, captured by the first depth camera and the second depth camera respectively of the same calibration plate located at the same position.
Alternatively, the calibration plate in the present disclosure may be a checkerboard calibration plate, for example, a plate provided with a black-and-white checkerboard. An example of a black-and-white checkerboard provided on one face of the checkerboard calibration plate is shown in fig. 6. The calibration plate 602 in the present disclosure includes a plurality of corner points, which may be regarded as the vertices of the black squares in the calibration plate 602 in fig. 6, or as the vertices of the white squares. In addition, both surfaces of the calibration plate 602 may be provided with black-and-white checkerboards, with each black square on one surface exactly aligned with a black square on the other surface, and each white square on one surface exactly aligned with a white square on the other surface.
Alternatively, after the first depth image capturing device 600 and the second depth image capturing device 601 are fixed in the driving space, a calibration plate may be disposed between the first depth image capturing device 600 and the second depth image capturing device 601 (for example, the first depth image capturing device 600 faces one surface of the calibration plate 602 and the second depth image capturing device 601 faces the other surface of the calibration plate 602), and in the case that the position of the calibration plate 602 is not changed, the first depth image capturing device 600 is controlled to capture the calibration plate 602 to obtain a third depth image, and the second depth image capturing device 601 is controlled to capture the calibration plate 602 to obtain a fourth depth image. The method and the device can control the two depth cameras to shoot the calibration plate 602 at the same time, and can also control the two depth cameras to shoot the calibration plate 602 successively.
S501, fourth three-dimensional coordinates in the first three-dimensional coordinate system and fifth three-dimensional coordinates in the second three-dimensional coordinate system are obtained for a plurality of corner points at the same positions, based on the third depth map and the fourth depth map.
Optionally, the present disclosure may determine pixel points corresponding to each of a plurality of corner points in the 2D image captured by the first depth image capturing device, so as to determine two-dimensional coordinates of the plurality of corner points in the image coordinate system. Then, since the third depth map in the present disclosure includes depth values of a plurality of pixels (such as all pixel points) in the 2D image, after two-dimensional coordinates of a plurality of corner points are obtained, the present disclosure may determine depth values corresponding to the two-dimensional coordinates of the plurality of corner points according to the two-dimensional coordinates of the plurality of corner points. And then, the two-dimensional coordinates of the plurality of corner points can be respectively calculated by using the internal parameters (such as focal length, distortion factor and the like) of the first depth image pickup device, so that the X-axis coordinates and the Y-axis coordinates of the plurality of corner points in the three-dimensional coordinate system of the first depth image pickup device can be obtained. The X-axis coordinates and the Y-axis coordinates of the plurality of corner points form fourth three-dimensional coordinates of the plurality of corner points together with the depth values of the plurality of corner points, respectively.
Likewise, the method and the device can determine the pixel points corresponding to the corner points in the 2D image shot by the second depth camera, so that the two-dimensional coordinates of the corner points in the image coordinate system can be determined. Then, since the fourth depth map in the present disclosure includes depth values of a plurality of pixels (such as all pixel points) in the 2D image, after two-dimensional coordinates of a plurality of corner points are obtained, the present disclosure may determine depth values corresponding to the two-dimensional coordinates of the plurality of corner points according to the two-dimensional coordinates of the plurality of corner points. And then, the two-dimensional coordinates of the plurality of corner points can be respectively calculated by using the internal parameters (such as focal length, distortion factor and the like) of the second depth image pickup device, so that the X-axis coordinates and the Y-axis coordinates of the plurality of corner points in the three-dimensional coordinate system of the second depth image pickup device can be obtained. The X-axis coordinates and the Y-axis coordinates of the plurality of corner points form fifth three-dimensional coordinates of the plurality of corner points together with the depth values of the plurality of corner points, respectively.
It should be noted that, if the thickness of the calibration plate is neglected, the corner points in the 2D image captured by the first depth camera and the corresponding corner points in the 2D image captured by the second depth camera are located at the same physical positions. In addition, the number of corner points may depend on the number of coordinates required to calculate the rotation information and the translation information; for example, at least five corner points may be used.
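As a sketch of one way to obtain the corner pixel coordinates from the 2D images (using OpenCV's checkerboard detector; the pattern size is an assumption):

    import cv2

    def detect_corners(gray_image, pattern_size=(7, 6)):
        # Detect the inner checkerboard corners in the 2D image from one depth
        # camera; pattern_size is the assumed number of inner corners per row/column.
        found, corners = cv2.findChessboardCorners(gray_image, pattern_size)
        if not found:
            raise RuntimeError("calibration pattern not found")
        # Refine the corner locations to sub-pixel accuracy.
        criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
        corners = cv2.cornerSubPix(gray_image, corners, (11, 11), (-1, -1), criteria)
        return corners.reshape(-1, 2)  # (N, 2) pixel coordinates

Each detected corner can then be lifted to a three-dimensional coordinate with the corresponding depth value, as in the back-projection sketch above.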
S502, determining rotation information and translation information between the first three-dimensional coordinate system and the second three-dimensional coordinate system according to the fourth three-dimensional coordinate and the fifth three-dimensional coordinate of the plurality of corner points.
Optionally, the present disclosure may calculate the fourth three-dimensional coordinates and the fifth three-dimensional coordinates of the plurality of corner points using a SolvePnP algorithm, and obtain rotation information and translation information between the first three-dimensional coordinate system and the second three-dimensional coordinate system according to the calculation result. The rotation information and the translation information may represent a positional relationship between the first three-dimensional coordinate system and the second three-dimensional coordinate system, that is, external parameters of the first depth image pickup device and the second depth image pickup device. The rotation information may be a rotation matrix, and the translation information may be a translation vector.
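The disclosure names the SolvePnP algorithm for this computation; as a hedged alternative sketch (a standard substitute technique, not the disclosure's own method), the rotation and translation between the two coordinate systems can also be recovered from the matched fourth and fifth three-dimensional coordinates with an SVD-based (Kabsch) rigid alignment:

    import numpy as np

    def rigid_transform_3d(points_src, points_dst):
        # Estimate R and t such that R @ p_src + t ≈ p_dst, given matched 3D
        # corner coordinates in the two camera coordinate systems (at least
        # three non-collinear correspondences).
        src = np.asarray(points_src, dtype=np.float64)
        dst = np.asarray(points_dst, dtype=np.float64)
        src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
        h = (src - src_c).T @ (dst - dst_c)      # cross-covariance matrix
        u, _, vt = np.linalg.svd(h)
        d = np.sign(np.linalg.det(vt.T @ u.T))   # guard against reflections
        r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
        t = dst_c - r @ src_c
        return r, t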
According to the method, the calibration plate is arranged between the first depth camera and the second depth camera, and the plurality of corner points on the calibration plate can be conveniently utilized to obtain the position relationship between the first three-dimensional coordinate system and the second three-dimensional coordinate system.
In an alternative example, the present disclosure may implement a method of determining the orientation of a target object using a depth camera. At this time, the second three-dimensional coordinate system in the present disclosure may be a world coordinate system. The second three-dimensional coordinates of the preset target point in the second three-dimensional coordinate system in the present disclosure may be the second three-dimensional coordinates of the preset target point in the world coordinate system. The present disclosure may position the preset target point using a high-precision positioning technique (e.g., carrier-phase differential technique, etc.), thereby obtaining a second three-dimensional coordinate of the preset target point. At this time, the positional relationship between the first three-dimensional coordinate system and the second three-dimensional coordinate system in the present disclosure is actually the positional relationship between the first three-dimensional coordinate system and the world coordinate system. The present disclosure may utilize existing coordinate transformation approaches to convert a first three-dimensional coordinate to a three-dimensional coordinate in a world coordinate system, and may also convert a second three-dimensional coordinate in the world coordinate system to a three-dimensional coordinate in the first three-dimensional coordinate system.
The present disclosure provides another implementation way to obtain the third three-dimensional coordinate by using the world coordinate system as the second three-dimensional coordinate system, and obtaining the third three-dimensional coordinate without setting external parameters between the depth image capturing devices.
In an alternative example, determining the orientation information of the monitored target according to the present disclosure may specifically be: connecting the third three-dimensional coordinate with the other of the first three-dimensional coordinate and the second three-dimensional coordinate, and determining the direction of the connecting line. In one example, the present disclosure may connect the three-dimensional coordinate point of a pupil key point of the driver and the three-dimensional coordinate point of a preset target point located in the same three-dimensional coordinate system, and take the direction of the connecting line as the driver's line-of-sight direction. In another example, the present disclosure connects the three-dimensional coordinate point of a key point of the tip of the driver's index finger with the three-dimensional coordinate point of a preset target point in the same three-dimensional coordinate system, and takes the direction of the connecting line as the driver's finger pointing. Using the direction of the line connecting the key point of the monitored target and the preset target point makes it possible to determine the orientation information of the monitored target conveniently and accurately.
In an alternative example, the present disclosure may use the two-dimensional image acquired by the first depth camera and the above-determined orientation information as training samples (may also be referred to as acquired data). I.e. the orientation information is used as annotation information for the two-dimensional image. Of course, the training samples of the present disclosure may also include other labeling information, such as key point labeling information, and the like.
The training sample is formed by utilizing the orientation information and the two-dimensional image, so that the acquisition cost of the training sample is reduced, and the acquisition convenience of the training sample is improved. The method and the device can be used for shooting the two-dimensional image by using the first depth camera in the practical application environment of the neural network and setting corresponding labeling information for the two-dimensional image, so that the training sample is used for training the neural network, and the training effect of the neural network is improved. When the neural network is applied to the DMS, the performance of the DMS is improved.
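A minimal sketch of how a captured 2D image and the computed orientation might be packaged as a training sample; the file layout and field names are assumptions, not something prescribed by the disclosure:

    import json

    def save_training_sample(image_path, direction_vector, axis_angles_deg, out_path):
        # Store the orientation information as annotation for the 2D image.
        annotation = {
            "image": image_path,
            "direction": [float(c) for c in direction_vector],       # unit vector
            "axis_angles_deg": [float(a) for a in axis_angles_deg],  # angles with X, Y, Z
        }
        with open(out_path, "w", encoding="utf-8") as f:
            json.dump(annotation, f, indent=2)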
Exemplary apparatus
Fig. 7 is a schematic structural diagram of an embodiment of an apparatus for determining the orientation of a target object of the present disclosure. The apparatus of this embodiment may be used to implement the corresponding method embodiments of the present disclosure. The apparatus shown in fig. 7 includes: a first acquisition module 700, a second acquisition module 701, a coordinate conversion module 702, and an orientation determining module 703. Optionally, the apparatus of the present disclosure may further include: at least one of a positional-relationship determining module 704 and an acquisition-data generating module 705.
The first acquisition module 700 is configured to determine a first three-dimensional coordinate of a key point of a monitored target from a first depth map acquired by a first depth camera. The first three-dimensional coordinates are three-dimensional coordinates in a first three-dimensional coordinate system of the first depth image pickup device.
Alternatively, the first acquisition module 700 may obtain the two-dimensional coordinates of a key point of the monitored target in the two-dimensional image acquired by the first depth camera, determine the depth value corresponding to the two-dimensional coordinates according to the first depth map acquired by the first depth camera, and determine the first three-dimensional coordinate of the key point of the monitored target according to the two-dimensional coordinates and the depth value.
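A minimal sketch of the back-projection assumed here, using a pinhole camera model: the key point's two-dimensional pixel coordinate plus the depth value read from the first depth map yields its first three-dimensional coordinate. The intrinsic parameters fx, fy, cx, cy of the first depth camera and all numeric values are illustrative assumptions:

import numpy as np

def pixel_to_camera(u, v, depth, fx, fy, cx, cy):
    # Back-project pixel (u, v) with metric depth into the camera coordinate system.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

# Illustrative intrinsics and measurement.
first_coordinate = pixel_to_camera(u=412.0, v=233.0, depth=0.62,
                                   fx=580.0, fy=580.0, cx=320.0, cy=240.0)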
The second acquisition module 701 is configured to acquire a second three-dimensional coordinate of the preset target point in a second three-dimensional coordinate system. The preset target point is located outside the visual field range of the first depth camera.
Optionally, the second acquisition module 701 may first obtain a second depth map acquired by the second depth camera, and determine, based on the second depth map, the second three-dimensional coordinate of the preset target point in the second three-dimensional coordinate system. The second depth camera is disposed opposite to the first depth camera, and the second three-dimensional coordinate system is the three-dimensional coordinate system of the second depth camera.
Optionally, the first depth image capturing device and the second depth image capturing device in the present disclosure are both disposed in a driving space of a vehicle, and the second depth image capturing device is disposed facing a preset target point.
Alternatively, the second acquisition module 701 may acquire the second three-dimensional coordinate of the preset target point in the world coordinate system.
The coordinate conversion module 702 is configured to perform coordinate conversion on one of the first three-dimensional coordinate and the second three-dimensional coordinate according to a positional relationship between the first three-dimensional coordinate system and the second three-dimensional coordinate system, so as to obtain a third three-dimensional coordinate.
The orientation determining module 703 is configured to determine the orientation information of the monitored target according to the third three-dimensional coordinate and the other of the first three-dimensional coordinate and the second three-dimensional coordinate.
Alternatively, the orientation determining module 703 may determine the line-of-sight direction of the monitored target according to the third three-dimensional coordinate and the other of the first three-dimensional coordinate and the second three-dimensional coordinate.
Alternatively, the orientation determining module 703 may determine the finger pointing direction of the monitored target according to the third three-dimensional coordinate and the other of the first three-dimensional coordinate and the second three-dimensional coordinate.
Alternatively, the orientation determining module 703 may determine the direction of the connecting line between the key point of the monitored target and the preset target point according to the third three-dimensional coordinate and the other of the first three-dimensional coordinate and the second three-dimensional coordinate, and determine the orientation information of the monitored target based on the direction of the connecting line; for example, the orientation determining module 703 takes the direction of the connecting line as the orientation information of the monitored target.
The positional relationship determining module 704 is configured to determine a third depth map and a fourth depth map captured by the first depth camera and the second depth camera, respectively, for the same calibration board located at the same position; to acquire, based on a plurality of corner points at the same positions in the third depth map and the fourth depth map, fourth three-dimensional coordinates of the corner points in the first three-dimensional coordinate system and fifth three-dimensional coordinates of the corner points in the second three-dimensional coordinate system; and to determine rotation information and translation information between the first three-dimensional coordinate system and the second three-dimensional coordinate system according to the fourth three-dimensional coordinates and the fifth three-dimensional coordinates of the plurality of corner points. The rotation information and the translation information are the positional relationship between the first three-dimensional coordinate system and the second three-dimensional coordinate system.
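A minimal sketch of one possible way (not necessarily the one used in the disclosure) to compute the rotation information and translation information from matched calibration-board corner points, using the SVD-based Kabsch method:

import numpy as np

def estimate_rigid_transform(pts_first, pts_second):
    # Find R, t such that R @ pts_first[i] + t ≈ pts_second[i] for all corner correspondences.
    A = np.asarray(pts_first, dtype=float)    # corner points in the first coordinate system, shape (N, 3)
    B = np.asarray(pts_second, dtype=float)   # the same corner points in the second coordinate system
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)                 # cross-covariance of the centred point sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T   # guard against reflections
    t = cb - R @ ca
    return R, t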
The acquisition data generating module 705 is configured to generate acquisition data including the two-dimensional image collected by the first depth camera and the direction of the connecting line.
Fig. 8 is a schematic structural diagram of one embodiment of a system for determining the orientation of a target object of the present disclosure. The system of this embodiment may be used to implement the corresponding method embodiments of the present disclosure. The system as shown in fig. 8 includes: a first depth camera 800, a plurality of preset target points 801, a second depth camera 802, and a means 803 for determining the orientation of a target object.
The first depth camera 800 is disposed in the driving space of the vehicle, faces the monitored target, and is configured to collect a first depth map containing the monitored target.
The number of preset target points 801 may be set according to actual requirements.
The second depth camera 802 is disposed in the driving space of the vehicle, is disposed opposite to the first depth camera 800, faces all of the preset target points 801, and is configured to collect a second depth map containing the preset target points.
The device 803 for determining the orientation of the target object is configured to determine orientation information of the monitored target according to the first depth map acquired by the first depth camera 800 and the second depth map acquired by the second depth camera 802.
The structure of the device 803 for determining the orientation of the target object may be as described above with respect to Fig. 7 and is not repeated here.
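For illustration only, a minimal end-to-end sketch of what the device 803 might compute, reusing the illustrative pixel_to_camera and line_direction helpers sketched above and assuming R, t map the second camera coordinate system into the first; all names are assumptions rather than the disclosure's interfaces:

import numpy as np

def determine_orientation(keypoint_uv, depth_value, target_in_second_frame,
                          fx, fy, cx, cy, R, t):
    # First three-dimensional coordinate of the key point (first camera coordinate system).
    p_key = pixel_to_camera(keypoint_uv[0], keypoint_uv[1], depth_value, fx, fy, cx, cy)
    # Third three-dimensional coordinate: preset target point converted into the first coordinate system.
    p_target = R @ np.asarray(target_in_second_frame, dtype=float) + t
    # Orientation information: unit direction of the connecting line.
    return line_direction(p_key, p_target)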
Exemplary electronic device
An electronic device according to an embodiment of the present disclosure is described below with reference to fig. 9. Fig. 9 shows a block diagram of an electronic device according to an embodiment of the disclosure. As shown in fig. 9, the electronic device 91 includes one or more processors 911 and memory 912.
The processor 911 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities and may control other components in the electronic device 91 to perform desired functions.
The memory 912 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or a cache. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 911 may execute the program instructions to implement the method of determining the target object orientation of the various embodiments of the present disclosure described above and/or other desired functions. Various contents such as an input signal, a signal component, and a noise component may also be stored in the computer-readable storage medium.
In one example, the electronic device 91 may further include an input device 913, an output device 914, and the like, interconnected by a bus system and/or other forms of connection mechanisms (not shown). The input device 913 may include, for example, a keyboard, a mouse, and the like. The output device 914 may output various kinds of information to the outside and may include, for example, a display, speakers, a printer, and a communication network and the remote output devices connected thereto.
Of course, only some of the components of the electronic device 91 relevant to the present disclosure are shown in fig. 9 for simplicity, components such as buses, input/output interfaces, and the like being omitted. In addition, the electronic device 91 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer-readable storage medium
In addition to the methods and apparatus described above, embodiments of the present disclosure may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform steps in a method of determining a target object orientation according to various embodiments of the present disclosure described in the above "exemplary methods" section of this specification.
The computer program product may write program code for performing the operations of embodiments of the present disclosure in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium, having stored thereon computer program instructions, which when executed by a processor, cause the processor to perform steps in a method of determining a target object orientation according to various embodiments of the present disclosure described in the above "exemplary method" section of the present disclosure.
The computer readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium may include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present disclosure have been described above in connection with specific embodiments, however, it should be noted that the advantages, benefits, effects, etc. mentioned in the present disclosure are merely examples and not limiting, and these advantages, benefits, effects, etc. are not to be considered as necessarily possessed by the various embodiments of the present disclosure. Furthermore, the specific details disclosed herein are for purposes of illustration and understanding only, and are not intended to be limiting, since the disclosure is not necessarily limited to practice with the specific details described.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different manner from other embodiments, so that the same or similar parts between the embodiments are mutually referred to. For system embodiments, the description is relatively simple as it essentially corresponds to method embodiments, and reference should be made to the description of method embodiments for relevant points.
The block diagrams of the devices, apparatuses, equipment, and systems referred to in this disclosure are merely illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. As will be appreciated by those skilled in the art, such devices, apparatuses, equipment, and systems may be connected, arranged, or configured in any manner. Words such as "including", "comprising", "having", and the like are open-ended words that mean "including but not limited to" and may be used interchangeably therewith. The term "or" as used herein refers to, and is used interchangeably with, the term "and/or", unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and is used interchangeably with, the phrase "such as, but not limited to".
The methods and apparatus of the present disclosure may be implemented in a number of ways. For example, the methods and apparatus of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, firmware. The above-described sequence of steps for the method is for illustration only, and the steps of the method of the present disclosure are not limited to the sequence specifically described above unless specifically stated otherwise. Furthermore, in some embodiments, the present disclosure may also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
It is also noted that in the apparatus, devices and methods of the present disclosure, components or steps may be disassembled and/or assembled. Such decomposition and/or recombination should be considered equivalent to the present disclosure.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects, and the like, will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the disclosure to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, changes, additions, and sub-combinations thereof.

Claims (10)

1. A method of determining the orientation of a target object, comprising:
determining a first three-dimensional coordinate of a key point of a monitored target from a first depth map acquired by a first depth camera; the first three-dimensional coordinates are three-dimensional coordinates in a first three-dimensional coordinate system of the first depth image pickup device;
acquiring a second three-dimensional coordinate of a preset target point in a second three-dimensional coordinate system; the preset target point is located outside the visual field range of the first depth camera, and the key point of the monitored target faces the preset target point;
according to the positional relationship between the first three-dimensional coordinate system and the second three-dimensional coordinate system, performing coordinate conversion on one of the first three-dimensional coordinate and the second three-dimensional coordinate to obtain a third three-dimensional coordinate;
and determining the orientation information of the monitored target according to the third three-dimensional coordinate and the other three-dimensional coordinate of the first three-dimensional coordinate and the second three-dimensional coordinate.
2. The method of claim 1, wherein the determining the orientation information of the monitored object from the third three-dimensional coordinate and the other of the first and second three-dimensional coordinates comprises:
determining the line-of-sight direction of the monitored target according to the third three-dimensional coordinate and the other of the first three-dimensional coordinate and the second three-dimensional coordinate; or
determining the finger pointing direction of the monitored target according to the third three-dimensional coordinate and the other of the first three-dimensional coordinate and the second three-dimensional coordinate.
3. The method of claim 1, wherein determining the first three-dimensional coordinates of the keypoints of the monitored object from the first depth map acquired by the first depth camera comprises:
acquiring two-dimensional coordinates of key points of a monitored target in a two-dimensional image acquired by the first depth camera;
determining a depth value corresponding to the two-dimensional coordinate according to a first depth map acquired by the first depth camera;
and determining a first three-dimensional coordinate of the key point of the monitored target according to the two-dimensional coordinate and the depth value.
4. A method according to any one of claims 1 to 3, wherein the acquiring second three-dimensional coordinates of the preset target point in a second three-dimensional coordinate system comprises:
acquiring a second depth map acquired by a second depth camera;
determining a second three-dimensional coordinate of the preset target point in a second three-dimensional coordinate system based on the second depth map;
the second depth image pickup device is arranged opposite to the first depth image pickup device, and the second three-dimensional coordinate system is the three-dimensional coordinate system of the second depth image pickup device.
5. The method of claim 4, wherein the method further comprises:
determining a third depth map and a fourth depth map which are shot by the first depth camera and the second depth camera respectively aiming at the same calibration plate positioned at the same position;
acquiring a fourth three-dimensional coordinate in the first three-dimensional coordinate system and a fifth three-dimensional coordinate in the second three-dimensional coordinate system based on a plurality of corner points at the same positions in the third depth map and the fourth depth map;
determining rotation information and translation information between the first three-dimensional coordinate system and the second three-dimensional coordinate system according to the fourth three-dimensional coordinate and the fifth three-dimensional coordinate of the plurality of corner points;
wherein the rotation information and the translation information are the positional relationship between the first three-dimensional coordinate system and the second three-dimensional coordinate system.
6. The method of claim 1, wherein the determining the orientation information of the monitored object from the third three-dimensional coordinate and the other of the first and second three-dimensional coordinates comprises:
determining the direction of a connecting line of the key point of the monitored target and the preset target point according to the third three-dimensional coordinate and the other three-dimensional coordinate of the first three-dimensional coordinate and the second three-dimensional coordinate;
and determining the orientation information of the monitored target based on the direction of the connecting line.
7. An apparatus for determining the orientation of a target object, comprising:
the first acquisition module is used for determining a first three-dimensional coordinate of a key point of the monitored target from a first depth map acquired by the first depth camera; the first three-dimensional coordinates are three-dimensional coordinates in a first three-dimensional coordinate system of the first depth image pickup device;
the second acquisition module is used for acquiring a second three-dimensional coordinate of the preset target point in a second three-dimensional coordinate system; the preset target point is located outside the visual field range of the first depth camera, and the key point of the monitored target faces the preset target point;
the coordinate conversion module is used for carrying out coordinate conversion on one of the first three-dimensional coordinate and the second three-dimensional coordinate according to the positional relationship between the first three-dimensional coordinate system and the second three-dimensional coordinate system to obtain a third three-dimensional coordinate;
and the orientation determining module is used for determining orientation information of the monitored target according to the third three-dimensional coordinate and the other three-dimensional coordinate of the first three-dimensional coordinate and the second three-dimensional coordinate.
8. A system for determining the orientation of a target object, comprising:
the first depth camera is arranged in the driving space of the vehicle and faces the monitored target, and is used for collecting a first depth map containing the monitored target;
a plurality of preset target points;
the second depth camera is arranged in the driving space of the vehicle, is opposite to the first depth camera, faces the preset target point and is used for collecting a second depth map containing the preset target point, and the key point of the monitored target faces the preset target point;
and the device is used for determining the orientation information of the monitored target according to the first depth map acquired by the first depth camera and the second depth map acquired by the second depth camera.
9. A computer readable storage medium storing a computer program for performing the method of any one of the preceding claims 1-6.
10. An electronic device, the electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor being configured to read the executable instructions from the memory and execute the instructions to implement the method of any of the preceding claims 1-6.
CN202010527892.6A 2020-06-11 2020-06-11 Method, device, system, medium and electronic equipment for determining target object orientation Active CN111723716B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010527892.6A CN111723716B (en) 2020-06-11 2020-06-11 Method, device, system, medium and electronic equipment for determining target object orientation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010527892.6A CN111723716B (en) 2020-06-11 2020-06-11 Method, device, system, medium and electronic equipment for determining target object orientation

Publications (2)

Publication Number Publication Date
CN111723716A CN111723716A (en) 2020-09-29
CN111723716B true CN111723716B (en) 2024-03-08

Family

ID=72568015

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010527892.6A Active CN111723716B (en) 2020-06-11 2020-06-11 Method, device, system, medium and electronic equipment for determining target object orientation

Country Status (1)

Country Link
CN (1) CN111723716B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112740268B (en) * 2020-11-23 2022-06-07 华为技术有限公司 Target detection method and device
CN112541553B (en) * 2020-12-18 2024-04-30 深圳地平线机器人科技有限公司 Method, device, medium and electronic equipment for detecting state of target object
CN113096151B (en) * 2021-04-07 2022-08-09 地平线征程(杭州)人工智能科技有限公司 Method and apparatus for detecting motion information of object, device and medium

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1650712A2 (en) * 2004-10-25 2006-04-26 Sony Corporation Generation of a 3D image with controlled viewing angle for navigation
CN102830793A (en) * 2011-06-16 2012-12-19 北京三星通信技术研究有限公司 Sight tracking method and sight tracking device
CN104935893A (en) * 2015-06-17 2015-09-23 浙江大华技术股份有限公司 Monitoring method and device
CN107861625A (en) * 2017-12-04 2018-03-30 北京易真学思教育科技有限公司 Gaze tracking system and method based on 3d space model
CN108986161A (en) * 2018-06-19 2018-12-11 亮风台(上海)信息科技有限公司 A kind of three dimensional space coordinate estimation method, device, terminal and storage medium
CN110826357A (en) * 2018-08-07 2020-02-21 北京市商汤科技开发有限公司 Method, device, medium and equipment for three-dimensional detection and intelligent driving control of object
CN109492514A (en) * 2018-08-28 2019-03-19 初速度(苏州)科技有限公司 A kind of method and system in one camera acquisition human eye sight direction
WO2020042345A1 (en) * 2018-08-28 2020-03-05 初速度(苏州)科技有限公司 Method and system for acquiring line-of-sight direction of human eyes by means of single camera
CN110969061A (en) * 2018-09-29 2020-04-07 北京市商汤科技开发有限公司 Neural network training method, neural network training device, visual line detection method, visual line detection device and electronic equipment
CN109859270A (en) * 2018-11-28 2019-06-07 浙江合众新能源汽车有限公司 A kind of human eye three-dimensional coordinate localization method and separate type binocular camera shooting device
CN110458104A (en) * 2019-08-12 2019-11-15 广州小鹏汽车科技有限公司 The human eye sight direction of human eye sight detection system determines method and system
CN110780305A (en) * 2019-10-18 2020-02-11 华南理工大学 Track cone bucket detection and target point tracking method based on multi-line laser radar
CN110928627A (en) * 2019-11-22 2020-03-27 北京市商汤科技开发有限公司 Interface display method and device, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Virtual-viewpoint-adaptive gaze correction method; Yin Linglin et al.; Journal of Computer-Aided Design & Computer Graphics; 2013-12-15 (No. 12); full text *
Joint calibration of a time-of-flight depth camera and a color camera; Zhou Jie et al.; Journal of Signal Processing; 2017-01-25 (No. 01); full text *

Also Published As

Publication number Publication date
CN111723716A (en) 2020-09-29

Similar Documents

Publication Publication Date Title
CN111723716B (en) Method, device, system, medium and electronic equipment for determining target object orientation
CN109242903B (en) Three-dimensional data generation method, device, equipment and storage medium
US10726576B2 (en) System and method for identifying a camera pose of a forward facing camera in a vehicle
KR101320683B1 (en) Display correction method and module based on augmented reality, object information display method and system using the same
CN112509047B (en) Pose determining method and device based on image, storage medium and electronic equipment
US20200184726A1 (en) Implementing three-dimensional augmented reality in smart glasses based on two-dimensional data
US11816865B2 (en) Extrinsic camera parameter calibration method, extrinsic camera parameter calibration apparatus, and extrinsic camera parameter calibration system
CN112541553B (en) Method, device, medium and electronic equipment for detecting state of target object
CN112326258B (en) Method, device and system for detecting automatic driving state and electronic equipment
JP5624370B2 (en) Moving body detection apparatus and moving body detection method
CN110796738A (en) Three-dimensional visualization method and device for tracking state of inspection equipment
CN111309141B (en) Screen estimation
CN113689508B (en) Point cloud labeling method and device, storage medium and electronic equipment
CN115147683A (en) Pose estimation network model training method, pose estimation method and device
CN112446347B (en) Face direction determining method and device, storage medium and electronic equipment
JP5418427B2 (en) Collision time calculation device, collision time calculation method and program
CN111429519B (en) Three-dimensional scene display method and device, readable storage medium and electronic equipment
CN110827337B (en) Method and device for determining posture of vehicle-mounted camera and electronic equipment
CN112668596A (en) Three-dimensional object recognition method and device and recognition model training method and device
JP2023100258A (en) Pose estimation refinement for aerial refueling
CN116030139A (en) Camera detection method and device, electronic equipment and vehicle
WO2019055260A1 (en) Systems and methods for calibrating imaging and spatial orientation sensors
Kim et al. Method for user interface of large displays using arm pointing and finger counting gesture recognition
KR102601438B1 (en) System and method for overlaying target-point based on virtual environment
CN118038560B (en) Method and device for predicting face pose of driver

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant