WO2023273499A1 - Depth measurement method and apparatus, electronic device and storage medium


Info

Publication number
WO2023273499A1
Authority
WO
WIPO (PCT)
Prior art keywords
detected
target object
frame
key point
point detection
Application number
PCT/CN2022/085920
Other languages
English (en)
Chinese (zh)
Inventor
赵佳
谢符宝
刘文韬
钱晨
Original Assignee
上海商汤智能科技有限公司
Application filed by 上海商汤智能科技有限公司
Publication of WO2023273499A1



Classifications

    • G06T 7/50 Image analysis; depth or shape recovery
    • G06N 3/045 Neural networks; combinations of networks
    • G06N 3/08 Neural networks; learning methods
    • G06T 7/0002 Image analysis; inspection of images, e.g. flaw detection
    • G06T 7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T 2207/10016 Image acquisition modality; video; image sequence
    • G06T 2207/20132 Special algorithmic details; image segmentation details; image cropping

Definitions

  • the present disclosure relates to the technical field of computers, and in particular to a depth detection method and device, electronic equipment and a storage medium.
  • the depth information can reflect the distance of the human body in the image relative to the image acquisition device, and based on the depth information, the human body object in the image can be spatially positioned.
  • the binocular camera is a relatively common and widely used image acquisition device. Based on at least two images collected by a binocular camera, the depth information of the human body in the image can be determined by matching between the images. However, the matching calculation between images is complex and its accuracy is easily affected. How to conveniently and accurately determine the depth information of the human body in an image has therefore become an urgent problem to be solved.
  • the present disclosure proposes a technical solution for depth detection.
  • a depth detection method, including:
  • acquiring multiple frames to be detected, wherein the multiple frames to be detected include image frames obtained by capturing images of the target object from at least two acquisition angles of view; performing key point detection of a target area of the target object according to the frames to be detected, and determining a plurality of key point detection results corresponding to the multiple frames to be detected, wherein the target area includes a head area and/or a shoulder area; and determining depth information of the target object according to the multiple key point detection results.
  • the determining of the depth information of the target object according to the multiple key point detection results includes: acquiring at least two preset device parameters respectively corresponding to at least two acquisition devices, where the at least two acquisition devices are used to capture images of the target object from at least two acquisition angles of view; and determining the depth information of the target object according to the at least two preset device parameters and the multiple key point detection results.
  • the depth information includes a depth distance, and the depth distance includes a distance between the target object and the optical center of the acquisition device; the determining of the depth information of the target object in the frame to be detected according to the at least two preset device parameters and the multiple key point detection results includes: obtaining the depth distance according to the preset external parameters among the at least two preset device parameters and the coordinates of the multiple key point detection results in at least two forms; wherein the preset external parameters include relative parameters formed between the at least two acquisition devices.
  • the depth information includes an offset angle
  • the offset angle includes a spatial angle of the target object relative to the optical axis of the acquisition device
  • the determining of the depth information of the target object in the frame to be detected includes: obtaining the offset angle according to the preset internal parameters among the at least two preset device parameters and the coordinates of the multiple key point detection results in at least two forms; wherein the preset internal parameters include device parameters respectively corresponding to the at least two acquisition devices.
  • the performing of the key point detection of the target area of the target object according to the frame to be detected includes: performing the key point detection on the target area of the target object in the frame to be detected according to the position information of the target object in the reference frame, to obtain the key point detection result corresponding to the frame to be detected, wherein the reference frame is a video frame located before the frame to be detected in the target video to which the frame to be detected belongs.
  • the performing of the key point detection on the target area of the target object in the frame to be detected to obtain the key point detection result corresponding to the frame to be detected includes: clipping the frame to be detected according to a first position of the target object in the reference frame to obtain a clipping result; and performing key point detection on the target area of the target object in the clipping result to obtain the key point detection result corresponding to the frame to be detected.
  • the performing of the key point detection on the target area of the target object in the frame to be detected to obtain the key point detection result corresponding to the frame to be detected includes: obtaining a second position of the target area of the target object in the reference frame; clipping the frame to be detected according to the second position to obtain a clipping result; and performing key point detection on the target area of the target object in the clipping result to obtain the key point detection result corresponding to the frame to be detected.
  • the obtaining of the second position of the target area of the target object in the reference frame includes: identifying the target area in the reference frame by using a first neural network to obtain the second position output by the first neural network; and/or obtaining the second position of the target area in the reference frame according to the key point detection result corresponding to the reference frame.
  • the method further includes: determining a position of the target object in a three-dimensional space according to depth information of the target object.
  • a depth detection device including:
  • An acquisition module configured to acquire multiple frames to be detected, wherein the multiple frames to be detected include image frames obtained by image acquisition of the target object from at least two acquisition angles of view;
  • a key point detection module configured to perform key point detection of the target area of the target object according to the frames to be detected, and determine a plurality of key point detection results corresponding to the multiple frames to be detected, wherein the target area includes a head area and/or a shoulder area;
  • a depth detection module configured to determine the depth information of the target object according to the multiple key point detection results.
  • the depth detection module is configured to: acquire at least two preset device parameters respectively corresponding to at least two acquisition devices, where the at least two acquisition devices are used to capture images of the target object; and determine the depth information of the target object in the frame to be detected according to the at least two preset device parameters and the multiple key point detection results.
  • the depth information includes a depth distance
  • the depth distance includes a distance between the target object and the optical center of the acquisition device
  • the depth detection module is further configured to: obtain the depth distance according to the preset external parameters among the at least two preset device parameters and the coordinates of the plurality of key point detection results in at least two forms; wherein the preset external parameters include relative parameters formed between the at least two acquisition devices.
  • the depth information includes an offset angle
  • the offset angle includes a spatial angle of the target object relative to the optical axis of the acquisition device
  • the depth detection module is further configured to: obtain the offset angle according to the preset internal parameters among the at least two preset device parameters and the coordinates of the plurality of key point detection results in at least two forms; wherein the preset internal parameters include device parameters respectively corresponding to the at least two devices.
  • the key point detection module is configured to: perform key point detection on the target area of the target object in the frame to be detected according to the position information of the target object in the reference frame, to obtain a key point detection result corresponding to the frame to be detected, wherein the reference frame is a video frame before the frame to be detected in the target video to which the frame to be detected belongs.
  • the key point detection module is further configured to: clip the frame to be detected according to the first position of the target object in the reference frame to obtain a clipping result; The key point detection is performed on the target area of the target object in the clipping result, and the key point detection result corresponding to the frame to be detected is obtained.
  • the key point detection module is further configured to: acquire a second position of the target area of the target object in the reference frame; Clipping the frame to obtain a clipping result; performing key point detection on the target area of the target object in the clipping result to obtain a key point detection result corresponding to the frame to be detected.
  • the key point detection module is further configured to: use a first neural network to identify the target area in the reference frame to obtain a second position output by the first neural network; and /or, obtain the second position of the target area in the reference frame according to the key point detection result corresponding to the reference frame.
  • the apparatus is further configured to: determine the position of the target object in a three-dimensional space according to the depth information of the target object.
  • an electronic device including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to call the instructions stored in the memory to execute the above-mentioned method.
  • a computer-readable storage medium on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the above method is implemented.
  • a computer program product including computer readable codes, and when the computer readable codes are run in an electronic device, a processor in the electronic device executes the above method.
  • in this way, the parallax formed between the multiple frames to be detected captured from at least two acquisition angles of view can be combined with the multiple key point detection results corresponding to the target area in those frames to compute depth information from the parallax, which effectively reduces the amount of data processed in the parallax-based calculation and improves the efficiency and accuracy of depth detection.
  • Fig. 1 shows a flowchart of a depth detection method according to an embodiment of the present disclosure.
  • FIG. 2 shows a schematic diagram of a target area according to an embodiment of the present disclosure.
  • Fig. 3 shows a flowchart of a depth detection method according to an embodiment of the present disclosure.
  • FIG. 4 shows a block diagram of a depth detection device according to an embodiment of the present disclosure.
  • Fig. 5 shows a schematic diagram of an application example according to the present disclosure.
  • FIG. 6 shows a block diagram of an electronic device according to an embodiment of the present disclosure.
  • FIG. 7 shows a block diagram of an electronic device according to an embodiment of the present disclosure.
  • Fig. 1 shows a flowchart of a depth detection method according to an embodiment of the present disclosure.
  • the method can be performed by a depth detection device, and the depth detection device can be an electronic device such as a terminal device or a server, and the terminal device can be user equipment (User Equipment, UE), mobile device, user terminal, terminal, cellular phone, cordless phone, personal Digital assistants (Personal Digital Assistant, PDA), handheld devices, computing devices, in-vehicle devices, wearable devices, etc.
  • the method may be implemented by a processor invoking computer-readable instructions stored in a memory.
  • the method can be performed by a server.
  • the method may include:
  • Step S11 acquiring multiple frames to be detected, wherein the multiple frames to be detected include image frames obtained by collecting images of the target object from at least two collection angles of view.
  • the frame to be detected may be any image frame that requires depth detection, for example, it may be an image frame extracted from a captured video, or an image frame obtained by capturing an image.
  • the number of multiple frames to be detected is not limited in this embodiment of the present disclosure, and may include two or more frames.
  • the acquisition angle of view can be the angle of image acquisition of the target object, and different frames to be detected can be acquired by image acquisition devices set at different acquisition angles of view, or can be acquired by the same image device under different acquisition angles of view.
  • the frame to be detected includes the target object to be subjected to depth detection.
  • the type of the target object is not limited in the embodiments of the present disclosure, and may include various human objects, animal objects, or some mechanical objects, such as robots. Subsequent disclosed embodiments are described by taking the target object as a person object as an example. Implementations in which the target object is other types can be flexibly expanded by referring to the subsequent disclosed embodiments, and will not be elaborated one by one.
  • the number of target objects contained in the frame to be detected is also not limited in the embodiments of the present disclosure, and may contain one or more target objects, which can be flexibly determined according to actual conditions.
  • frame extraction may be performed on one or more videos to obtain multiple frames to be detected, wherein the frame extraction may include one or more methods such as frame-by-frame extraction, frame sampling at a certain interval, or random frame sampling.
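  • As an illustration only (not part of the disclosure), the following minimal Python sketch shows frame-by-frame and interval-based frame extraction; the video paths, the interval value and the helper name extract_frames are hypothetical, and OpenCV is assumed to be available.

```python
import cv2  # OpenCV, assumed available for reading video files

def extract_frames(video_path, interval=1):
    """Extract frames to be detected from a video.

    interval=1 corresponds to frame-by-frame extraction; a larger interval
    keeps one frame out of every `interval` frames.
    """
    frames = []
    cap = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % interval == 0:
            frames.append(frame)
        index += 1
    cap.release()
    return frames

# Hypothetical paths: one video per acquisition angle of view
left_frames = extract_frames("left_view.mp4", interval=1)
right_frames = extract_frames("right_view.mp4", interval=1)
```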
  • Step S12 performing key point detection of the target area in the target object according to the frame to be detected, and determining multiple key point detection results corresponding to multiple frames to be detected.
  • the key point detection result may include the position of the detected key point in the frame to be detected.
  • the number and types of detected key points can be flexibly determined according to the actual situation.
  • the number of detected key points can range from 2 to 150, etc.
  • the detected key points can include 14 limb key points of the human body (such as head key points, shoulder key points, neck key points, elbow key points, wrist key points, crotch key points, leg key points and foot key points, etc.), or 59 contour key points on the outline of the human body (such as some key points on the periphery of the head or the periphery of the shoulders) and the like.
  • the detected key points may also only include three key points including the key point of the head, the key point of the left shoulder and the key point of the right shoulder.
  • multiple key point detection results can correspond respectively to the multiple frames to be detected. For example, if key point detection is performed on multiple frames to be detected, each frame to be detected can correspond to one key point detection result, so that multiple key point detection results can be obtained.
  • the target area may include a head area and/or a shoulder area
  • the head area of the target object may be the area where the head of the target object is located, such as the area formed between the key points of the head and the key points of the neck; the shoulder area may be the area where the shoulder and neck of the target object are located, such as the area formed between the key points of the neck and the key points of the shoulder.
  • Fig. 2 shows a schematic diagram of a target area according to an embodiment of the present disclosure.
  • as shown in Fig. 2, the head key point, the left shoulder key point and the right shoulder key point can be connected to form a head-and-shoulders box, which is used as the target area. The head-and-shoulders box can be a rectangle as shown in Fig. 2; it can be seen from Fig. 2 that the head-and-shoulders box connects the head key point at the head vertex of the target object, the left shoulder key point at the left shoulder joint, and the right shoulder key point at the right shoulder joint.
  • the head-shoulders frame may also be in other shapes, such as polygons, circles, or other irregular shapes.
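  • As a non-authoritative illustration, the sketch below builds a rectangular head-and-shoulders box as the circumscribed rectangle of the three key points; the function name head_shoulder_box and the optional margin parameter are hypothetical, and the coordinates are placeholder pixel values.

```python
import numpy as np

def head_shoulder_box(head_pt, left_shoulder_pt, right_shoulder_pt, margin=0.0):
    """Circumscribed rectangle of the head, left-shoulder and right-shoulder
    key points, used as the target area (head-and-shoulders box).

    Each point is an (x, y) pixel coordinate; `margin` optionally expands the
    box by a fraction of its width/height.
    """
    pts = np.array([head_pt, left_shoulder_pt, right_shoulder_pt], dtype=float)
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    w, h = x_max - x_min, y_max - y_min
    return (x_min - margin * w, y_min - margin * h,
            x_max + margin * w, y_max + margin * h)

# Placeholder coordinates: head vertex, left shoulder joint, right shoulder joint
box = head_shoulder_box((320, 80), (280, 200), (360, 200))
```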
  • the frame to be detected can be input into any neural network with a key point detection function to realize key point detection; in some possible implementations, key point identification can also be performed on the frame to be detected through a relevant key point identification algorithm to obtain a key point detection result; in some possible implementations, key point detection can also be performed on only a part of the image area of the frame to be detected, according to a known position, to obtain key point detection results, etc.
  • for the implementation of step S12, reference may be made to the following disclosed embodiments in detail, which will not be expanded here.
  • Step S13 according to the multiple key point detection results, determine the depth information of the target object in the frame to be detected.
  • the information content contained in the depth information can be flexibly determined according to the actual situation, and any information that can reflect the depth of the target object in the three-dimensional space can be used as a realization method of the depth information.
  • the depth information may include a depth distance and/or an offset angle.
  • the depth distance can be the distance between the target object and the collection device, and the collection device can be any device that collects images of the target object.
  • the collection device can be a static image collection device, such as a camera, etc. ;
  • the collection device may also be a device for collecting dynamic images, such as a video camera or a camera.
  • different frames to be detected can be collected by image acquisition devices set at different acquisition angles of view, or can be acquired by the same image acquisition device under different acquisition angles of view; therefore, the number of acquisition devices can be one or more.
  • the depth detection method proposed by the embodiments of the present disclosure can be implemented based on at least two acquisition devices; in this case, the at least two acquisition devices can capture images of the target object from at least two acquisition angles of view to obtain the multiple frames to be detected.
  • the types of different collection devices may be the same or different, which can be flexibly selected according to the actual situation, and there is no limitation in this embodiment of the present disclosure.
  • the depth distance can be the distance between the target object and the collection device; this distance can be the distance between the target object and the collection device as a whole, or the distance between the target object and a certain part of the collection device. In some possible implementations, the distance between the target object and the optical center of the acquisition device may be used as the depth distance.
  • the offset angle may be an offset angle of the target object relative to the collection device, and in a possible implementation manner, the offset angle may be a spatial angle of the target object relative to the optical axis of the collection device.
  • multiple key point detection results can correspond to multiple frames to be detected, and the multiple frames to be detected can be obtained by capturing images of the target object from at least two acquisition angles of view; therefore, based on the multiple key point detection results, the parallax formed between the multiple frames to be detected can be determined, and the depth information of the target object can then be calculated from that parallax.
  • the parallax-based calculation method based on the key point detection results can be flexibly determined according to the actual situation, and any method for depth ranging based on parallax can be used in the implementation of step S13; for details, see the following disclosed embodiments, which will not be expanded here.
  • in this way, the parallax formed between the multiple frames to be detected captured from at least two acquisition angles of view can be combined with the multiple key point detection results corresponding to the target area in those frames to compute depth information from the parallax, which effectively reduces the amount of data processed in the parallax-based calculation and improves the efficiency and accuracy of depth detection.
  • step S12 may include:
  • key point detection is performed on the target area of the target object in the frame to be detected, and a key point detection result corresponding to the frame to be detected is obtained.
  • the reference frame may be a video frame located before the frame to be detected in the target video, and the target video may be a video including the frame to be detected.
  • different frames to be detected may respectively belong to different target videos, and in this case, reference frames corresponding to different frames to be detected may also be different.
  • the reference frame can be the frame immediately preceding the frame to be detected in the target video; in some possible implementations, the reference frame can also be a video frame in the target video that is located before the frame to be detected and whose distance from the frame to be detected does not exceed a preset distance, where the preset distance can be flexibly determined according to the actual situation and can be one or more frames, which is not limited in this embodiment of the present disclosure.
  • the position of the target object in the reference frame may be relatively close to the position of the target object in the frame to be detected.
  • according to the position information of the target object in the reference frame, the position information of the target object in the frame to be detected can be roughly determined.
  • in this way, key point detection of the target area of the target object in the frame to be detected can be more targeted and involves a smaller amount of data, so that more accurate key point detection results can be obtained and the efficiency of key point detection can also be improved.
  • the manner of performing key point detection on the target area of the target object in the frame to be detected can be flexibly determined according to the actual situation; for example, according to the position information of the target object in the reference frame, the frame to be detected can be clipped at the corresponding position and key point detection then performed, or key point detection can be performed directly on the image area corresponding to that position in the frame to be detected, etc.
  • the key point detection is performed on the target area of the target object in the frame to be detected, and the key point detection result corresponding to the frame to be detected is obtained, including:
  • clipping the frame to be detected according to the first position of the target object in the reference frame to obtain a clipping result; and performing the key point detection on the target area of the target object in the clipping result to obtain the key point detection result corresponding to the frame to be detected.
  • the first position may be the overall position coordinates of the target object in the reference frame.
  • the first position may be the position coordinates of the body frame of the target object in the reference frame.
  • the manner of clipping the frame to be detected according to the first position is also not limited in the embodiments of the present disclosure, and is not limited to the following disclosed embodiments.
  • the first coordinates of the human body frame in the reference frame can be determined according to the first position and, combined with the correspondence between position coordinates in the reference frame and in the frame to be detected, the second coordinates of the human body frame of the target object in the frame to be detected can be determined; the frame to be detected is then clipped based on the second coordinates to obtain a clipping result.
  • the first coordinates of the human body frame in the reference frame and the border length of the human body frame can also be determined according to the first position and, combined with the position coordinate correspondence between the reference frame and the frame to be detected, the second coordinates of the human body frame of the target object in the frame to be detected can be determined; the frame to be detected is then clipped based on the second coordinates and the frame length to obtain a clipping result, wherein, in the clipping based on the second coordinates and the frame length, the second coordinates can determine the position of the clipping endpoints and the frame length can determine the size of the clipping result.
  • the length of the clipping result can be consistent with the frame length.
  • the length of the clipping result can also be proportional to the frame length, such as N times the frame length, etc., N can be any value not less than 1, etc.
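  • As an illustrative sketch only, the following Python function clips a frame around a box carried over from the reference frame, with a scale factor playing the role of N; the function name crop_around_box and its parameters are hypothetical.

```python
import numpy as np

def crop_around_box(frame, box, scale=1.5):
    """Clip the frame to be detected around a box taken from the reference frame.

    `frame` is an H x W x C image array, `box` is (x_min, y_min, x_max, y_max)
    in pixel coordinates, and `scale` corresponds to N: the clipping result is
    N times the box size, centred on the box and clipped to the image borders.
    """
    h, w = frame.shape[:2]
    x_min, y_min, x_max, y_max = box
    cx, cy = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0
    half_w = (x_max - x_min) * scale / 2.0
    half_h = (y_max - y_min) * scale / 2.0
    x0, x1 = int(max(0, cx - half_w)), int(min(w, cx + half_w))
    y0, y1 = int(max(0, cy - half_h)), int(min(h, cy + half_h))
    # the (x0, y0) offset lets key points found in the clip be mapped back
    return frame[y0:y1, x0:x1], (x0, y0)
```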
  • in this way, the target object in the frame to be detected can be preliminarily located according to the first position of the target object in the reference frame to obtain the clipping result, and key point detection of the target area can be performed based on the clipping result, which reduces the amount of data to be detected and improves detection efficiency.
  • the accuracy of key point detection can be improved.
  • the key point detection is performed on the target area of the target object in the frame to be detected, and the key point detection result corresponding to the frame to be detected is obtained, including: obtaining a second position of the target area of the target object in the reference frame; clipping the frame to be detected according to the second position to obtain a clipping result; and performing key point detection on the target area of the target object in the clipping result to obtain the key point detection result corresponding to the frame to be detected.
  • the second position may be the position coordinates of the target area of the target object in the reference frame.
  • the target area may include the head area and/or the shoulder area, so in a possible implementation manner, the second position may be the position coordinates of the head-and-shoulders frame of the target object in the reference frame.
  • the implementation form can be flexibly determined according to the actual situation; for example, it can be realized by performing head-and-shoulders frame and/or key point recognition on the reference frame, see the following disclosed embodiments for details, which will not be expanded here.
  • the key point detection method for the target object in the clipping result can be the same as the key point detection method based on the clipping result obtained from the first position, or it can be different, which will not be expanded here.
  • the key point detection result can be obtained according to the second position of the target area of the target object in the reference frame.
  • the target area can be more targeted, thereby further reducing the amount of data processing. Therefore, the accuracy and efficiency of depth detection are further improved.
  • obtaining the second position of the target area of the target object in the reference frame may include:
  • the second position of the target area in the reference frame is obtained.
  • the first neural network may be any network used to determine the second position, and its implementation form is not limited in the embodiments of the present disclosure.
  • the first neural network may be an object area detection network for identifying the second location of the object area directly from the reference frame.
  • the object area detection network may be based on Faster Regions with Convolutional Neural Networks (Faster RCNN); in some possible implementations, the first neural network can also be a key point detection network, which identifies one or more key points in the reference frame and then determines the second position of the target area in the reference frame according to the positions of the identified key points.
  • the reference frame may also be used as the frame to be detected for depth detection.
  • the reference frame may have undergone key point detection and a corresponding key point detection result has been obtained. Therefore, in some possible implementation manners, the second position of the target area in the reference frame may be obtained according to the key point detection result corresponding to the reference frame.
  • the key point detection may also be directly performed on the reference frame to obtain the key point detection result.
  • the key point detection method reference may be made to other disclosed embodiments, which will not be repeated here.
  • the second position of the target area in the reference frame can be flexibly determined in multiple ways according to the actual situation of the reference frame, which improves the flexibility and versatility of depth detection; and in some possible implementations, the second position can be determined directly based on intermediate results of the reference frame in the depth detection, thereby reducing repeated calculation of data and improving the efficiency and precision of depth detection.
  • the key point detection is performed on the target object in the clipping result to obtain the key point detection result, which may include:
  • the second neural network is used to perform key point detection on the target object in the clipping result to obtain a key point detection result.
  • the second neural network may be any neural network used to realize key point detection, and its implementation is not limited in the embodiments of the present disclosure; when the first neural network is a key point detection network, the second neural network may be implemented in the same manner as the first neural network or in a different manner.
  • key point detection may also be performed on the target object in the clipping result through a related key point recognition algorithm, and the key point recognition algorithm to be used is also not limited in the embodiments of the present disclosure.
  • FIG. 3 shows a flowchart of a depth detection method according to an embodiment of the present disclosure.
  • step S13 may include:
  • Step S131 acquiring at least two preset device parameters respectively corresponding to at least two capture devices, the at least two capture devices are used to capture images of the target object from at least two capture angles of view.
  • Step S132 Determine the depth information of the target object in the frame to be detected according to at least two preset device parameters and a plurality of key point detection results.
  • the at least two preset device parameters may include preset internal parameters respectively corresponding to at least two acquisition devices.
  • the preset internal parameters may be some calibration parameters of the collection device itself, and the types and types of parameters contained therein may be flexibly determined according to the actual situation of the collection device.
  • the preset internal parameters may include an internal reference matrix of the acquisition device, and the internal reference matrix may include one or more focal length parameters of the camera, principal point positions of one or more cameras, and the like.
  • since the collection device may include at least two collection devices, the at least two preset device parameters may also include preset external parameters, wherein the preset external parameters may be relative parameters formed between the different collection devices, used to describe the relative positions of the different acquisition devices in the world coordinate system.
  • the preset external parameters may include an external parameter matrix formed between different acquisition devices.
  • the external parameter matrix may include a rotation matrix and/or a translation vector matrix, and the like.
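  • For concreteness only, the snippet below writes out such parameters in Python; every numeric value is a placeholder standing in for what camera calibration would actually provide, and the variable names (K_left, K_right, R, T) are chosen here for illustration.

```python
import numpy as np

# Preset internal parameters: intrinsic matrix of each acquisition device,
# holding the focal length parameters (fx, fy) and principal point (u0, v0).
K_left = np.array([[1000.0,    0.0, 640.0],
                   [   0.0, 1000.0, 360.0],
                   [   0.0,    0.0,   1.0]])
K_right = K_left.copy()

# Preset external parameters: rotation matrix R and translation vector T of the
# right camera relative to the left camera (placeholder horizontal baseline).
R = np.eye(3)
T = np.array([-0.1, 0.0, 0.0])
```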
  • the way to obtain the preset device parameters is not limited in the embodiments of the present disclosure.
  • the preset device parameters can be directly obtained according to the actual situation of the acquisition device.
  • the preset device parameters can also be obtained by calibrating the acquisition device.
  • the parallax formed between different frames to be detected in the three-dimensional world coordinate system can be determined.
  • the information content contained in the depth information can be flexibly determined according to the actual situation; therefore, as the content of the depth information differs, the process of determining the depth information according to the preset device parameters and the multiple key point detection results may also differ accordingly.
  • At least two preset device parameters and multiple key point detection results can be used to determine the disparity formed between different frames to be detected, and to determine the depth information simply and conveniently.
  • this method involves a small amount of calculation and produces a more accurate result, which can improve the accuracy and efficiency of depth detection.
  • step S132 may include:
  • the depth distance is obtained according to the preset external parameters among the at least two preset device parameters and the coordinates of the multiple key point detection results in at least two forms.
  • the coordinates of the key point detection results in at least two forms can be the coordinates corresponding to the key point detection results in different coordinate systems; for example, they may include the pixel coordinates formed by the key point detection results in the image coordinate system, and/or the homogeneous coordinates formed separately in the different acquisition devices, etc. Which form of coordinates to choose can be flexibly selected according to the actual situation, and is not limited to the following disclosed embodiments.
  • which key points in the key point detection results are selected for the calculation is not limited in the embodiments of the present disclosure.
  • the head key point, left shoulder key point and right shoulder key point can be selected.
  • the center of the head and shoulders can also be chosen.
  • the center point of the head and shoulders may be the center point of the head and shoulders frame mentioned in the above disclosed embodiments.
  • the position coordinates of the head key point, the left shoulder key point and the right shoulder key point may be used to determine the overall position coordinates of the head-and-shoulders frame, and the position coordinates of the center point of the head and shoulders may then be determined based on the overall position coordinates of the head-and-shoulders frame; in some possible implementations, the center point of the head and shoulders can also be directly used as a key point to be detected, so that its position coordinates can be obtained directly from the key point detection results.
  • the calculation method for obtaining the depth distance can be flexibly changed, and is not limited to the following disclosed embodiments.
  • it may include two acquisition devices, a left camera and a right camera.
  • the process of obtaining the depth distance can be expressed by the following formulas (1) and (2):
  • where d is the depth distance; the key point has original coordinates in homogeneous form in the frame to be detected collected by the left camera, together with the transformed coordinates obtained by linearly transforming those original coordinates; the key point also has coordinates in homogeneous form in the frame to be detected collected by the right camera; R is the rotation matrix of the right camera relative to the left camera among the preset external parameters; and T is the translation vector matrix of the right camera relative to the left camera among the preset external parameters.
  • in this way, the homogeneous-form coordinates of the key points in the different camera coordinate systems and their linearly transformed coordinates can be combined with the relative preset external parameters between the cameras to determine the depth distance accurately with a small amount of calculation, thereby improving the accuracy and efficiency of depth detection.
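  • Since formulas (1) and (2) themselves are not reproduced in this text, the sketch below is only a standard two-view triangulation written under that caveat, reusing the placeholder K_left, K_right, R and T from the earlier snippet; the function names are hypothetical and do not come from the disclosure.

```python
import numpy as np

def pixel_to_normalized(uv, K):
    """Pixel coordinate (u, v) -> homogeneous-form coordinate (x/z, y/z, 1)
    via the inverse of the intrinsic matrix K."""
    return np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])

def depth_from_pair(uv_left, uv_right, K_left, K_right, R, T):
    """Triangulate the depth of one key point (e.g. the head-and-shoulders
    centre) seen by both cameras.

    With x_l, x_r the homogeneous-form coordinates in the left/right views,
    the relation d_r * x_r = R * (d_l * x_l) + T is solved for the scalar
    depths d_l, d_r in the least-squares sense; the distance from the
    left-camera optical centre is returned as the depth distance.
    """
    x_l = pixel_to_normalized(uv_left, K_left)
    x_r = pixel_to_normalized(uv_right, K_right)
    A = np.stack([R @ x_l, -x_r], axis=1)        # 3 x 2 system in (d_l, d_r)
    d_l, d_r = np.linalg.lstsq(A, -T, rcond=None)[0]
    return float(np.linalg.norm(d_l * x_l))      # distance to the optical centre

# Placeholder pixel coordinates of the same key point in the two views
depth = depth_from_pair((700.0, 300.0), (640.0, 300.0), K_left, K_right, R, T)
```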
  • step S132 may also include:
  • the offset angle is obtained according to the preset internal parameters in the at least two preset device parameters and the coordinates of the multiple key point detection results in at least two forms.
  • the way of obtaining the offset angle can also be flexibly selected, and is not limited to the following disclosed embodiments.
  • the type of selected key points can also be flexibly selected according to the actual situation. You can refer to the type of key points selected in the above-mentioned determination of the depth distance, which will not be repeated here.
  • the calculation method for obtaining the offset angle can also be flexibly changed, and is not limited to the following disclosed embodiments.
  • the process of obtaining the offset angle relative to the target camera can be expressed by the following formulas (3) to (5):
  • ⁇ x is the offset angle of the target object in the x-axis direction
  • ⁇ y is the offset angle of the target object in the y-axis direction
  • f_x and f_y are the focal length parameters in the intrinsic parameter matrix K of the target camera, and u_0 and v_0 are the principal point positions in the intrinsic parameter matrix K of the target camera.
  • in this way, the offset angle can be determined simply and conveniently by using the preset internal parameters and the coordinates, in different forms, of the key point detection results obtained in the depth detection process; this determination method does not need to obtain additional data and is convenient to compute, which can improve the efficiency and convenience of depth detection.
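  • Again, because formulas (3) to (5) are not reproduced here, the following is only a plausible pinhole-model reading of the offset angles, using the placeholder intrinsic matrix from the earlier snippet; the function name offset_angles is hypothetical.

```python
import numpy as np

def offset_angles(uv, K):
    """Offset angles of a key point relative to the camera optical axis.

    theta_x and theta_y are the angles in the x and y directions, derived from
    the normalized coordinates ((u - u0)/fx, (v - v0)/fy) given by the
    intrinsic parameter matrix K.
    """
    fx, fy = K[0, 0], K[1, 1]
    u0, v0 = K[0, 2], K[1, 2]
    theta_x = np.arctan((uv[0] - u0) / fx)
    theta_y = np.arctan((uv[1] - v0) / fy)
    return theta_x, theta_y

theta_x, theta_y = offset_angles((700.0, 300.0), K_left)
```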
  • the method proposed in the embodiments of the present disclosure may further include: determining the position of the target object in three-dimensional space according to the depth information of the target object.
  • the position of the target object in the three-dimensional space may be the three-dimensional coordinates of the target object in the three-dimensional space.
  • the way to determine the position in the three-dimensional space based on the depth information can be flexibly selected according to the actual situation.
  • the two-dimensional coordinates of the target object in the frame to be detected can be determined according to the key point detection results of the target object.
  • the two-dimensional coordinates are combined with the depth distance and/or offset angle in the depth information, so as to determine the three-dimensional coordinates of the target object in the three-dimensional space.
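  • As a hedged sketch of that combination (assuming the depth value is interpreted as the distance along the optical axis; the helper name to_3d is hypothetical and reuses the placeholder K_left from above):

```python
import numpy as np

def to_3d(uv, depth_z, K):
    """Back-project a key point into 3D camera coordinates.

    The 3D position is depth_z * (x/z, y/z, 1), i.e. the two-dimensional pixel
    coordinate converted to a homogeneous-form coordinate with the intrinsic
    matrix K and scaled by the depth along the optical axis.
    """
    x_norm = np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])
    return depth_z * x_norm

# e.g. a head key point roughly 1.67 units in front of the left camera
position_3d = to_3d((700.0, 300.0), 1.67, K_left)
```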
  • the depth information can be used to perform three-dimensional positioning of the target object, so as to realize various operations such as interaction with the target object.
  • the distance and angle between the target object and the smart air conditioner can be determined according to the position of the target object in three-dimensional space, so as to dynamically adjust the wind direction and/or wind speed of the smart air conditioner; in some possible implementations, the target object can also be positioned in the game scene based on the position of the target object in three-dimensional space in an AR game platform, so that human-computer interaction in the AR scene can be realized more realistically and naturally.
  • the present disclosure also provides depth detection devices, electronic equipment, computer-readable storage media, and programs, all of which can be used to implement any depth detection method provided by the present disclosure.
  • FIG. 4 shows a block diagram of a depth detection device according to an embodiment of the present disclosure.
  • device 20 includes:
  • the obtaining module 21 is configured to obtain multiple frames to be detected, wherein the multiple frames to be detected include image frames obtained by collecting images of a target object from at least two collection angles of view.
  • the key point detection module 22 is used to perform key point detection of the target area in the target object according to the frame to be detected, and determine a plurality of key point detection results corresponding to multiple frames to be detected, wherein the target area includes the head area and/or shoulder area.
  • the depth detection module 23 is configured to determine the depth information of the target object according to the multiple key point detection results.
  • the depth detection module is configured to: acquire at least two preset device parameters respectively corresponding to at least two acquisition devices, where the at least two acquisition devices are used to capture images of the target object from at least two acquisition angles of view; and determine the depth information of the target object in the frame to be detected according to the at least two preset device parameters and the multiple key point detection results.
  • the depth information includes a depth distance
  • the depth distance includes a distance between the target object and the optical center of the acquisition device
  • the depth detection module is further configured to: obtain the depth distance according to the preset external parameters among the at least two preset device parameters and the coordinates of the multiple key point detection results in at least two forms; wherein the preset external parameters include relative parameters formed between the at least two acquisition devices.
  • the depth information includes an offset angle
  • the offset angle includes a spatial angle of the target object relative to the optical axis of the acquisition device
  • the depth detection module is further configured to: obtain the offset angle according to the preset internal parameters among the at least two preset device parameters and the coordinates of the multiple key point detection results in at least two forms; wherein the preset internal parameters include device parameters respectively corresponding to the at least two devices.
  • the key point detection module is used to: perform key point detection on the target area of the target object in the frame to be detected according to the position information of the target object in the reference frame, and obtain the key point corresponding to the frame to be detected The point detection result, wherein the reference frame is a video frame before the frame to be detected in the target video to which the frame to be detected belongs.
  • the key point detection module is further used to: crop the frame to be detected according to the first position of the target object in the reference frame to obtain the cropping result; key the target area of the target object in the cropping result Point detection to obtain key point detection results corresponding to the frame to be detected.
  • the key point detection module is further used to: obtain the second position of the target area of the target object in the reference frame; according to the second position, the frame to be detected is cropped to obtain the cropping result; the cropping result The key point detection is performed on the target area of the target object in , and the key point detection result corresponding to the frame to be detected is obtained.
  • the key point detection module is further configured to: use the first neural network to identify the target area in the reference frame to obtain the second position output by the first neural network; and/or, according to the reference frame The corresponding key point detection result obtains the second position of the target area in the reference frame.
  • the device is further configured to: determine the position of the target object in the three-dimensional space according to the depth information of the target object.
  • the functions or modules included in the device provided by the embodiments of the present disclosure can be used to execute the methods described in the method embodiments above; for specific implementations, reference may be made to the descriptions of the method embodiments above, which, for brevity, are not repeated here.
  • Fig. 5 shows a schematic diagram of an application example according to the present disclosure.
  • the application example of the present disclosure proposes a depth detection method, which may include the following process:
  • Step S31 use the Faster RCNN neural network to detect the head-and-shoulders frame of the human body in the two frames to be detected captured by the binocular camera (comprising a left camera and a right camera), and obtain the position of the head-and-shoulders box in the first frame of the left camera and the position of the head-and-shoulders box in the first frame of the right camera.
  • Step S32 obtain the target videos respectively corresponding to the left camera and the right camera; starting from the second frame of each target video, use the video frame as the frame to be detected and the previous frame of the frame to be detected as the reference frame; according to the reference frame, perform key point detection on the frame to be detected through the key point detection network to obtain the position coordinates of the three key points, namely the head key point, the left shoulder key point and the right shoulder key point, and use the circumscribed rectangle of the three key points as the head-and-shoulders frame in the frame to be detected.
  • Step S33 according to the coordinates of the key points in the frame to be detected in at least two forms, and the internal reference matrix of the camera, calculate the offset angle of the target object relative to the camera:
  • according to the pixel coordinates (u, v, 1) of the head key point in the frame to be detected and the internal reference matrix K of the camera, the corresponding homogeneous-form coordinates (x/z, y/z, 1) of the head key point and the offset angles θ x and θ y relative to the camera optical axis are obtained.
  • Step S34 according to the homogeneous coordinates of the key points in the frame to be detected in the left camera and the right camera, and the extrinsic matrix of the right camera relative to the left camera, calculate the depth distance of the target object:
  • the next frame of the frame to be detected in the target videos corresponding to the left camera and the right camera can also be used as the frame to be detected, and the process returns to step S32 to perform depth detection again.
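  • Purely as an illustrative assembly of the earlier sketches (crop_around_box, head_shoulder_box, offset_angles, depth_from_pair), the hypothetical loop below mirrors steps S31 to S34; detect_head_shoulder_box and detect_keypoints stand in for the Faster RCNN detector and the key point detection network, which are not specified here.

```python
def process_stereo_video(left_frames, right_frames, K_left, K_right, R, T,
                         detect_head_shoulder_box, detect_keypoints):
    # Step S31: head-and-shoulders box in the first frame of each view.
    boxes = [detect_head_shoulder_box(left_frames[0]),
             detect_head_shoulder_box(right_frames[0])]
    results = []
    for frame_l, frame_r in zip(left_frames[1:], right_frames[1:]):
        head_points = []
        for i, frame in enumerate((frame_l, frame_r)):
            # Step S32: clip around the box from the reference (previous) frame,
            # detect the head / left-shoulder / right-shoulder key points, then
            # update the box with their circumscribed rectangle.
            clip, (ox, oy) = crop_around_box(frame, boxes[i])
            head, l_sh, r_sh = detect_keypoints(clip)
            head = (head[0] + ox, head[1] + oy)
            l_sh = (l_sh[0] + ox, l_sh[1] + oy)
            r_sh = (r_sh[0] + ox, r_sh[1] + oy)
            boxes[i] = head_shoulder_box(head, l_sh, r_sh)
            head_points.append(head)
        # Step S33: offset angles from the intrinsics of the left camera.
        angles = offset_angles(head_points[0], K_left)
        # Step S34: depth distance from the extrinsics between the two cameras.
        depth = depth_from_pair(head_points[0], head_points[1],
                                K_left, K_right, R, T)
        results.append((depth, angles))
    return results
```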
  • the head and shoulders frame of the human body and the key points in the head and shoulders frame can be used to calculate the disparity formed by the frames to be detected collected under different viewing angles.
  • in this way, the amount of calculation is smaller and the application scenarios are wider.
  • the writing order of the steps does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
  • Embodiments of the present disclosure also provide a computer-readable storage medium, on which computer program instructions are stored, and the above-mentioned method is implemented when the computer program instructions are executed by a processor.
  • the computer readable storage medium may be a non-volatile computer readable storage medium or a volatile computer readable storage medium.
  • An embodiment of the present disclosure also proposes an electronic device, including: a processor; a memory for storing instructions executable by the processor; wherein the processor is configured to invoke the instructions stored in the memory to execute the above method.
  • An embodiment of the present disclosure also provides a computer program product, including computer-readable codes; when the computer-readable codes run on a device, a processor in the device executes instructions for implementing the depth detection method provided in any of the above embodiments.
  • the embodiments of the present disclosure also provide another computer program product, which is used for storing computer-readable instructions, and when the instructions are executed, the computer executes the operation of the depth detection method provided by any of the above-mentioned embodiments.
  • Electronic devices may be provided as terminals, servers, or other forms of devices.
  • FIG. 6 shows a block diagram of an electronic device according to an embodiment of the present disclosure.
  • the electronic device 800 may be a terminal such as a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, or a personal digital assistant.
  • electronic device 800 may include one or more of the following components: processing component 802, memory 804, power supply component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814 , and the communication component 816.
  • the processing component 802 generally controls the overall operations of the electronic device 800, such as those associated with display, telephone calls, data communications, camera operations, and recording operations.
  • the processing component 802 may include one or more processors 820 to execute instructions to complete all or part of the steps of the above method. Additionally, processing component 802 may include one or more modules that facilitate interaction between processing component 802 and other components. For example, processing component 802 may include a multimedia module to facilitate interaction between multimedia component 808 and processing component 802 .
  • the memory 804 is configured to store various types of data to support operations at the electronic device 800 . Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and the like.
  • the memory 804 can be implemented by any type of volatile or non-volatile storage device or their combination, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable Programmable Read Only Memory (EPROM), Programmable Read Only Memory (PROM), Read Only Memory (ROM), Magnetic Memory, Flash Memory, Magnetic or Optical Disk.
  • the power supply component 806 provides power to various components of the electronic device 800 .
  • Power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for electronic device 800 .
  • the multimedia component 808 includes a screen providing an output interface between the electronic device 800 and the user.
  • the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user.
  • the touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may not only sense a boundary of a touch or swipe action, but also detect duration and pressure associated with the touch or swipe action.
  • the multimedia component 808 includes a front camera and/or a rear camera. When the electronic device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focal length and optical zoom capability.
  • the audio component 810 is configured to output and/or input audio signals.
  • the audio component 810 includes a microphone (MIC), which is configured to receive external audio signals when the electronic device 800 is in operation modes, such as call mode, recording mode and voice recognition mode. Received audio signals may be further stored in memory 804 or sent via communication component 816 .
  • the audio component 810 also includes a speaker for outputting audio signals.
  • the I/O interface 812 provides an interface between the processing component 802 and a peripheral interface module, which may be a keyboard, a click wheel, a button, and the like. These buttons may include, but are not limited to: a home button, volume buttons, start button, and lock button.
  • Sensor assembly 814 includes one or more sensors for providing status assessments of various aspects of electronic device 800 .
  • the sensor component 814 can detect the open/closed state of the electronic device 800 and the relative positioning of components, such as the display and the keypad of the electronic device 800; the sensor component 814 can also detect a change in position of the electronic device 800 or of a component of the electronic device 800, the presence or absence of user contact with the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and temperature changes of the electronic device 800.
  • Sensor assembly 814 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact.
  • Sensor assembly 814 may also include an optical sensor, such as a complementary metal-oxide-semiconductor (CMOS) or charge-coupled device (CCD) image sensor, for use in imaging applications.
  • the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
  • the communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices.
  • the electronic device 800 can access a wireless network based on a communication standard, such as WiFi, second-generation mobile communication technology (2G), third-generation mobile communication technology (3G), or a combination thereof.
  • the communication component 816 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communication component 816 also includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wide Band (UWB) technology, Bluetooth (BT) technology and other technologies.
  • electronic device 800 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components for performing the methods described above.
  • a non-volatile computer-readable storage medium such as the memory 804 including computer program instructions, which can be executed by the processor 820 of the electronic device 800 to implement the above method.
  • FIG. 7 shows a block diagram of an electronic device according to an embodiment of the present disclosure.
  • the electronic device 1900 may be provided as a server.
  • electronic device 1900 includes processing component 1922 , which further includes one or more processors, and memory resources represented by memory 1932 for storing instructions executable by processing component 1922 , such as application programs.
  • the application programs stored in memory 1932 may include one or more modules each corresponding to a set of instructions.
  • the processing component 1922 is configured to execute instructions to perform the above method.
  • Electronic device 1900 may also include a power supply component 1926 configured to perform power management of electronic device 1900, a wired or wireless network interface 1950 configured to connect electronic device 1900 to a network, and an input-output (I/O) interface 1958 .
  • the electronic device 1900 can operate based on an operating system stored in the memory 1932, such as the Microsoft server operating system (Windows Server™), the graphical user interface-based operating system introduced by Apple Inc. (Mac OS X™), the multi-user, multi-process computer operating system (Unix™), the free and open-source Unix-like operating system (Linux™), the open-source Unix-like operating system (FreeBSD™), or the like.
  • a non-transitory computer-readable storage medium such as a memory 1932 including computer program instructions, which can be executed by the processing component 1922 of the electronic device 1900 to implement the above method.
  • the present disclosure can be a system, method and/or computer program product.
  • a computer program product may include a computer readable storage medium having computer readable program instructions thereon for causing a processor to implement various aspects of the present disclosure.
  • a computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device.
  • a computer readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • Computer-readable storage media include: a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), compact disc read-only memory (CD-ROM), digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove having instructions recorded thereon, and any suitable combination of the above.
  • computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., pulses of light through fiber optic cables), or transmitted electrical signals.
  • Computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device, or downloaded to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or a network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in each computing/processing device .
  • Computer program instructions for performing the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • Computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or the connection can be made to an external computer (for example, through the Internet using an Internet service provider).
  • an electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), can be personalized with state information of the computer-readable program instructions, and the electronic circuit can execute the computer-readable program instructions, thereby implementing various aspects of the present disclosure.
  • These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce an apparatus for implementing the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • These computer-readable program instructions may also be stored in a computer-readable storage medium; these instructions cause computers, programmable data processing apparatuses, and/or other devices to work in a specific way, so that the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions for implementing various aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of instructions that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or can be implemented by a combination of dedicated hardware and computer instructions.
  • the computer program product can be specifically realized by means of hardware, software or a combination thereof.
  • the computer program product is embodied as a computer storage medium, and in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK) and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

The present disclosure relates to a depth measurement method and apparatus, an electronic device, and a storage medium. The method comprises: obtaining a plurality of frames to be detected, the plurality of frames to be detected comprising image frames obtained by performing image acquisition on a target object from at least two acquisition viewing angles; performing key point detection on a target area of the target object according to the frames to be detected, and determining a plurality of key point detection results corresponding to the plurality of frames to be detected, the target area comprising a head area and/or a shoulder area; and determining depth information of the target object according to the plurality of key point detection results.
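As a rough illustration of the pipeline summarized above, the sketch below shows one way head/shoulder key point detection from two viewing angles could be turned into a depth value, assuming rectified left/right frames, a hypothetical detect_head_shoulder_keypoints() detector, and the pinhole-stereo relation Z = f·B/d; the function names, the rectification assumption, and the median aggregation are illustrative choices and are not taken from the disclosure.

```python
# Minimal sketch, not the claimed implementation: assumes a rectified stereo
# pair and a hypothetical key point detector for the head/shoulder area.
import numpy as np

def detect_head_shoulder_keypoints(frame: np.ndarray) -> np.ndarray:
    """Placeholder for a key point detection network.

    Expected to return an (N, 2) array of (x, y) pixel coordinates for the
    head/shoulder key points of the target object, in a fixed order.
    """
    raise NotImplementedError("substitute a trained key point detector here")

def depth_from_stereo_keypoints(frame_left: np.ndarray,
                                frame_right: np.ndarray,
                                focal_length_px: float,
                                baseline_m: float) -> float:
    # Key point detection on the target area in each frame to be detected.
    # Both calls are assumed to return the same key points in the same order.
    kps_left = detect_head_shoulder_keypoints(frame_left)
    kps_right = detect_head_shoulder_keypoints(frame_right)

    # For rectified views, depth follows the pinhole-stereo relation
    # Z = f * B / d, where d is the horizontal disparity of a matched point.
    disparities = kps_left[:, 0] - kps_right[:, 0]
    disparities = disparities[disparities > 0]  # drop degenerate matches
    if disparities.size == 0:
        raise ValueError("no usable key point matches between the two views")
    depths = focal_length_px * baseline_m / disparities

    # Aggregate the per-key-point depths into one depth value for the target
    # object; the median is one robust choice.
    return float(np.median(depths))
```

The disclosure itself only requires that depth information be determined from the multiple key point detection results; the triangulation shown here is one standard way of doing so and is given purely as a sketch.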
PCT/CN2022/085920 2021-06-28 2022-04-08 Procédé et appareil de mesure de profondeur, dispositif électronique et support de stockage WO2023273499A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110721270.1 2021-06-28
CN202110721270.1A CN113345000A (zh) 2021-06-28 2021-06-28 深度检测方法及装置、电子设备和存储介质

Publications (1)

Publication Number Publication Date
WO2023273499A1 true WO2023273499A1 (fr) 2023-01-05

Family

ID=77479236

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/085920 WO2023273499A1 (fr) 2021-06-28 2022-04-08 Procédé et appareil de mesure de profondeur, dispositif électronique et support de stockage

Country Status (3)

Country Link
CN (1) CN113345000A (fr)
TW (1) TW202301276A (fr)
WO (1) WO2023273499A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113344999A (zh) * 2021-06-28 2021-09-03 北京市商汤科技开发有限公司 深度检测方法及装置、电子设备和存储介质
CN113345000A (zh) * 2021-06-28 2021-09-03 北京市商汤科技开发有限公司 深度检测方法及装置、电子设备和存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764091A (zh) * 2018-05-18 2018-11-06 北京市商汤科技开发有限公司 活体检测方法及装置、电子设备和存储介质
CN108876835A (zh) * 2018-03-28 2018-11-23 北京旷视科技有限公司 深度信息检测方法、装置和系统及存储介质
US10319154B1 (en) * 2018-07-20 2019-06-11 The University Of North Carolina At Chapel Hill Methods, systems, and computer readable media for dynamic vision correction for in-focus viewing of real and virtual objects
CN110942032A (zh) * 2019-11-27 2020-03-31 深圳市商汤科技有限公司 活体检测方法及装置、存储介质
CN113345000A (zh) * 2021-06-28 2021-09-03 北京市商汤科技开发有限公司 深度检测方法及装置、电子设备和存储介质
CN113344999A (zh) * 2021-06-28 2021-09-03 北京市商汤科技开发有限公司 深度检测方法及装置、电子设备和存储介质

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897675B (zh) * 2017-01-24 2021-08-17 上海交通大学 双目视觉深度特征与表观特征相结合的人脸活体检测方法
CN111222509B (zh) * 2020-01-17 2023-08-18 北京字节跳动网络技术有限公司 目标检测方法、装置及电子设备
CN111780673B (zh) * 2020-06-17 2022-05-31 杭州海康威视数字技术股份有限公司 一种测距方法、装置及设备
CN112419388A (zh) * 2020-11-24 2021-02-26 深圳市商汤科技有限公司 深度检测方法、装置、电子设备和计算机可读存储介质

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108876835A (zh) * 2018-03-28 2018-11-23 北京旷视科技有限公司 深度信息检测方法、装置和系统及存储介质
CN108764091A (zh) * 2018-05-18 2018-11-06 北京市商汤科技开发有限公司 活体检测方法及装置、电子设备和存储介质
US10319154B1 (en) * 2018-07-20 2019-06-11 The University Of North Carolina At Chapel Hill Methods, systems, and computer readable media for dynamic vision correction for in-focus viewing of real and virtual objects
CN110942032A (zh) * 2019-11-27 2020-03-31 深圳市商汤科技有限公司 活体检测方法及装置、存储介质
CN113345000A (zh) * 2021-06-28 2021-09-03 北京市商汤科技开发有限公司 深度检测方法及装置、电子设备和存储介质
CN113344999A (zh) * 2021-06-28 2021-09-03 北京市商汤科技开发有限公司 深度检测方法及装置、电子设备和存储介质

Also Published As

Publication number Publication date
TW202301276A (zh) 2023-01-01
CN113345000A (zh) 2021-09-03

Similar Documents

Publication Publication Date Title
WO2023273499A1 (fr) Procédé et appareil de mesure de profondeur, dispositif électronique et support de stockage
WO2017215224A1 (fr) Procédé et appareil d'invite à l'entrée d'une empreinte digitale
US20210158560A1 (en) Method and device for obtaining localization information and storage medium
WO2023273498A1 (fr) Procédé et appareil de détection de profondeur, dispositif électronique et support de stockage
CN112991553B (zh) 信息展示方法及装置、电子设备和存储介质
WO2023155532A1 (fr) Procédé de détection de pose, appareil, dispositif électronique et support de stockage
WO2022193466A1 (fr) Procédé et appareil de traitement d'images, dispositif électronique et support de stockage
WO2022134475A1 (fr) Procédé et appareil de construction de carte de nuage de points, dispositif électronique, support de stockage et programme
WO2022121577A1 (fr) Procédé et appareil de traitement d'images
WO2023051356A1 (fr) Procédé et appareil d'affichage d'objet virtuel, dispositif électronique et support de stockage
WO2022151686A1 (fr) Procédé et appareil d'affichage d'images de scènes, dispositif, support de stockage, programme et produit
WO2022017140A1 (fr) Procédé et appareil de détection de cible, dispositif électronique et support de stockage
US20200402321A1 (en) Method, electronic device and storage medium for image generation
CN112184787A (zh) 图像配准方法及装置、电子设备和存储介质
TW202211671A (zh) 一種資訊處理方法、電子設備、儲存媒體和程式
KR20220123218A (ko) 타깃 포지셔닝 방법, 장치, 전자 기기, 저장 매체 및 프로그램
CN111860388A (zh) 图像处理方法及装置、电子设备和存储介质
CN110930351A (zh) 一种光斑检测方法、装置及电子设备
CN114581525A (zh) 姿态确定方法及装置、电子设备和存储介质
US20170302908A1 (en) Method and apparatus for user interaction for virtual measurement using a depth camera system
CN112613447A (zh) 关键点检测方法及装置、电子设备和存储介质
WO2023155350A1 (fr) Procédé et appareil de positionnement de foule, dispositif électronique et support de stockage
WO2022151687A1 (fr) Procédé et appareil de génération d'image photographique de groupe, dispositif, support de stockage, programme informatique et produit
WO2022110801A1 (fr) Procédé et appareil de traitement de données, dispositif électronique et support de stockage
CN114266305A (zh) 对象识别方法及装置、电子设备和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22831330

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE