WO2022213729A1 - Method and apparatus, device, and medium for detecting motion information of a target - Google Patents
Method and apparatus, device, and medium for detecting motion information of a target
- Publication number: WO2022213729A1 (application PCT/CN2022/076765)
- Authority: WO (WIPO, PCT)
- Prior art keywords: image, target, detection frame, information, coordinate system
Classifications
- G06T7/254—Analysis of motion involving subtraction of images
- G06T7/20—Analysis of motion
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06V10/24—Aligning, centring, orientation detection or correction of the image
- G06V10/82—Image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06T2207/10016—Video; Image sequence
- G06T2207/20084—Artificial neural networks [ANN]
- G06T2207/30244—Camera pose
- G06T2207/30252—Vehicle exterior; Vicinity of vehicle
- G06T2207/30261—Obstacle
Definitions
- the present disclosure relates to computer vision technology, and in particular to a method and apparatus for detecting motion information of a target, a method and apparatus for controlling a traveling object based on the motion information of the target, an electronic device, and a storage medium.
- estimating the movement speed and direction of objects is a focus of research in fields such as unmanned driving, security monitoring, and scene understanding.
- for example, based on such estimates, the decision-making layer can control the vehicle to slow down or even stop to ensure the safe driving of the vehicle.
- lidar is mostly used for data collection.
- a laser beam is emitted at a high frequency, and the distance to the target point is then calculated from the emission time and reception time of the laser beam.
- target detection and target tracking are performed on the point cloud data collected at the two moments bounding a certain time range, and the movement speed and direction of the target within that time range are then calculated.
- Embodiments of the present disclosure provide a method and apparatus for detecting motion information of a target, a method and apparatus for controlling a traveling object based on the motion information of the target, an electronic device, and a storage medium.
- a method for detecting motion information of a target, including:
- performing target detection on a first image to obtain a detection frame of a first target, where the first image is an image of the scene outside a driving object collected by a camera device on the driving object during driving;
- acquiring depth information of the first image in a corresponding first camera coordinate system, determining depth information of the detection frame of the first target from it, and determining first coordinates of the first target in the first camera coordinate system based on the position of the detection frame of the first target in the image coordinate system and the depth information of the detection frame;
- acquiring pose change information of the camera device from collecting a second image to collecting the first image, where the second image is an image that precedes the first image in the image sequence containing the first image and is spaced from the first image by a preset number of frames;
- converting second coordinates of a second target in a second camera coordinate system corresponding to the second image into third coordinates in the first camera coordinate system according to the pose change information, where the second target is the target in the second image corresponding to the first target;
- determining, based on the first coordinates and the third coordinates, the motion information of the first target within the time range from the acquisition moment of the second image to the acquisition moment of the first image.
- an intelligent driving control method including:
- the image sequence of the scene outside the driving object is collected by the camera device on the driving object;
- a control command for controlling the traveling state of the traveling object is generated according to the motion information of the target.
- an apparatus for detecting motion information of a target including:
- a detection module configured to perform target detection on a first image to obtain a detection frame of a first target, where the first image is an image of the scene outside the driving object collected by a camera device on the driving object during driving;
- a first acquisition module configured to acquire depth information of the first image in the corresponding first camera coordinate system
- a first determination module configured to determine the depth information of the detection frame of the first target according to the depth information of the first image acquired by the first acquisition module
- a second determination module configured to determine first coordinates of the first target in the first camera coordinate system based on the position, in the image coordinate system, of the detection frame of the first target obtained by the detection module and the depth information of the detection frame determined by the first determination module;
- a second acquisition module configured to acquire pose change information of the camera device from collecting a second image to collecting the first image, where the second image precedes the first image in the image sequence containing the first image and is spaced from the first image by a preset number of frames;
- a conversion module configured to convert, according to the pose change information acquired by the second acquisition module, second coordinates of a second target in a second camera coordinate system corresponding to the second image into third coordinates in the first camera coordinate system, where the second target is the target in the second image corresponding to the first target;
- a third determination module configured to determine, based on the first coordinates determined by the second determination module and the third coordinates obtained by the conversion module, the motion information of the first target within the time range from the acquisition moment of the second image to the acquisition moment of the first image.
- an intelligent driving control device including:
- a camera device which is arranged on the driving object, and is used for collecting the image sequence of the scene outside the driving object during the driving process of the driving object;
- a motion information detection device configured to use at least one frame in the image sequence as a first image, and at least one frame that precedes the first image in the image sequence and is spaced from the first image by a preset number of frames as a second image, to determine the motion information of a target in the scene;
- the motion information detection device includes the device for detecting the motion information of the target according to any embodiment of the present disclosure
- a control device configured to generate a control instruction for controlling the traveling state of the traveling object according to the motion information of the target detected by the motion information detection device.
- a computer-readable storage medium storing a computer program, where the computer program is used to execute the method for detecting motion information of a target according to any of the above embodiments of the present disclosure, or the method for controlling a traveling object based on the motion information of the target.
- an electronic device comprising:
- a memory for storing the processor-executable instructions
- a processor configured to read the executable instructions from the memory and execute them to implement the method for detecting motion information of a target described in any of the foregoing embodiments of the present disclosure, or the method for controlling a traveling object based on the motion information of the target.
- in the embodiments of the present disclosure, an image of the scene outside the driving object is collected by the camera device on the driving object during driving; target detection is performed on the collected first image to obtain the detection frame of the first target; depth information of the first image in the corresponding first camera coordinate system is acquired; and the depth information of the detection frame of the first target is determined from the depth information of the first image.
- the embodiments of the present disclosure thus utilize computer vision technology to determine the motion information of the target in the driving scene based on the image sequence.
- in the intelligent driving control method and device, the image sequence of the scene outside the traveling object is collected by the camera on the traveling object; at least one frame in the sequence is used as the first image, and at least one frame that precedes the first image and is spaced from it by a preset number of frames is used as the second image; the method for detecting motion information of a target described in any embodiment of the present disclosure is used to determine the motion information of the target in the driving scene; and a control instruction for controlling the driving state of the driving object is then generated according to that motion information.
- detecting the motion information of targets in the driving scene and intelligently controlling the driving object by computer vision technology helps meet the demand for real-time intelligent driving control in unmanned scenarios, ensuring the safe driving of the driving object.
- FIG. 1 is a scene diagram to which the present disclosure is applicable.
- FIG. 2 is a schematic flowchart of a method for detecting motion information of a target provided by an exemplary embodiment of the present disclosure.
- FIG. 3 is a schematic flowchart of a method for detecting motion information of a target provided by another exemplary embodiment of the present disclosure.
- FIG. 4 is a schematic flowchart of a method for detecting motion information of a target provided by another exemplary embodiment of the present disclosure.
- FIG. 5 is a schematic flowchart of a method for detecting motion information of a target provided by yet another exemplary embodiment of the present disclosure.
- FIG. 6 is a schematic flowchart of a method for detecting motion information of a target provided by yet another exemplary embodiment of the present disclosure.
- FIG. 7 is a schematic flowchart of a method for detecting motion information of a target provided by another exemplary embodiment of the present disclosure.
- FIG. 8 is a schematic diagram of an application flow of a method for detecting motion information of a target provided by an exemplary embodiment of the present disclosure.
- FIG. 9 is a schematic flowchart of a method for controlling a traveling object based on motion information of a target provided by an exemplary embodiment of the present disclosure.
- FIG. 10 is a schematic structural diagram of an apparatus for detecting motion information of a target provided by an exemplary embodiment of the present disclosure.
- FIG. 11 is a schematic structural diagram of an apparatus for detecting motion information of a target provided by another exemplary embodiment of the present disclosure.
- FIG. 12 is a schematic structural diagram of an apparatus for controlling a traveling object based on motion information of a target provided by an exemplary embodiment of the present disclosure.
- FIG. 13 is a structural diagram of an electronic device provided by an exemplary embodiment of the present disclosure.
- "a plurality" may refer to two or more, and "at least one" may refer to one, two, or more.
- the term "and/or" in the present disclosure is only an association relationship to describe associated objects, indicating that there can be three kinds of relationships, for example, A and/or B, it can mean that A exists alone, and A and B exist at the same time , there are three cases of B alone.
- the character "/" in the present disclosure generally indicates that the related objects are an "or" relationship.
- Embodiments of the present disclosure can be applied to electronic devices such as terminal devices, computer systems, servers, etc., which can operate with numerous other general-purpose or special-purpose computing system environments or configurations.
- Examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with electronic devices such as terminal devices, computer systems, and servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing environments including any of the foregoing.
- Electronic devices such as terminal devices, computer systems, servers, etc., may be described in the general context of computer system-executable instructions, such as program modules, being executed by the computer system.
- program modules may include routines, programs, object programs, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types.
- Computer systems/servers may be implemented in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network.
- program modules may be located on local or remote computing system storage media including storage devices.
- lidar can obtain the depth values of many points in a scene at an instant, but cannot directly obtain information such as the moving speed and direction of an object.
- the embodiments of the present disclosure provide a technical solution that uses computer vision technology to obtain the motion information of targets in a driving scene from an image sequence of the scene. A camera device on the driving object collects images of the scene outside the driving object during driving; target detection and target tracking are performed on a first image and a second image that are separated by a preset number of frames in the collected sequence; the second coordinates of a target in the second camera coordinate system corresponding to the second image are converted into third coordinates in the first camera coordinate system corresponding to the first image; and the motion information of the same target within the time range between the acquisition moments of the two images is then determined based on its first coordinates and the third coordinates.
- the embodiments of the present disclosure do not need to rely on lidar, which can avoid a large amount of calculation processing, save processing time, improve processing efficiency, and help meet the needs of scenarios with high real-time requirements such as unmanned driving.
- a control instruction for controlling the driving state of the driving object can be generated according to the motion information of the target, so that computer vision technology is used both to detect the motion information of targets in the driving scene and to intelligently control the driving object, which helps meet the demand for real-time intelligent driving control in unmanned scenarios and ensures the safe driving of the driving object.
- the embodiments of the present disclosure can be applied to intelligent driving control scenarios for vehicles, robots, toy cars, and other driving objects: a control command for controlling the driving state of the driving object is generated according to the detected motion information, and the driving state of the driving object is controlled accordingly.
- FIG. 1 is a scene diagram to which the present disclosure is applicable.
- an image sequence collected by an image acquisition module 101 (e.g., a camera device such as a camera) is provided to the motion information detection device 102.
- the motion information detection device 102 takes each frame in the image sequence, or one frame selected every several frames, as the second image, and takes a frame that follows the second image in the sequence and is spaced from it by a certain number of frames as the first image. It performs target detection on the first image to obtain the detection frame of the first target; acquires the depth information of the first image in the corresponding first camera coordinate system and determines the depth information of the detection frame of the first target from the depth information of the first image; determines the first coordinates of the first target in the first camera coordinate system based on the position of the detection frame of the first target in the image coordinate system and the depth information of the detection frame; converts the second coordinates of the second target in the second camera coordinate system corresponding to the second image into third coordinates in the first camera coordinate system according to the pose change information of the camera device from collecting the second image to collecting the first image; and, based on the first coordinates and the third coordinates, determines and outputs the motion information of the first target within the time range from the acquisition moment of the second image to the acquisition moment of the first image.
- the control device 103 controls the traveling state of a traveling object such as a vehicle, robot, or toy car based on the motion information of the first target output by the motion information detection device 102.
- in the application scenario of controlling the driving state of a driving object, if it is determined from the motion information of the first target (which may include its movement speed and movement direction) and the driving state of the driving object (which may include its driving speed and driving direction) that the driving object and the first target may collide within the next 5 seconds, the control device 103 generates a control command for decelerating the driving object and outputs it to the driving object, so that the driving object decelerates and avoids colliding with the first target (a sketch of such a check follows below).
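- a minimal, hedged sketch of such a decision (the linear-extrapolation collision test, safety radius, and helper names are assumptions; the source only specifies that a predicted collision within 5 seconds triggers deceleration):

```python
import numpy as np

def may_collide(rel_pos, target_vel, ego_vel, horizon=5.0, radius=2.0):
    """Linear-extrapolation collision check: advance the target's position
    relative to the ego object for `horizon` seconds and test whether the
    separation ever falls below an assumed safety radius (in metres)."""
    rel_vel = target_vel - ego_vel
    for t in np.linspace(0.0, horizon, 51):
        if np.linalg.norm(rel_pos + rel_vel * t) < radius:
            return True
    return False

# Hypothetical values: a target 20 m ahead moving at 1 m/s while the ego
# object travels at 6 m/s in the same direction.
if may_collide(np.array([0.0, 0.0, 20.0]),
               np.array([0.0, 0.0, 1.0]),
               np.array([0.0, 0.0, 6.0])):
    print("generate deceleration command")
```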
- the embodiments of the present disclosure do not limit specific application scenarios.
- FIG. 2 is a schematic flowchart of a method for detecting motion information of a target provided by an exemplary embodiment of the present disclosure.
- This embodiment can be applied to electronic devices, and can also be applied to traveling objects such as vehicles, robots, and toy cars.
- the method for detecting motion information of a target in this embodiment includes the following steps:
- Step 201 performing target detection on the first image to obtain a detection frame of the first target.
- the first image is an image of a scene outside the driving object collected by a camera on the driving object during the driving process of the driving object.
- the first image may be an RGB (red, green and blue) image or a grayscale image, and the embodiment of the present disclosure does not limit the first image.
- the target in the embodiments of the present disclosure may be any target of interest in the scene outside the driving object, such as a moving or stationary person, small animal, or object, where the object may be, for example, a vehicle, buildings on both sides of the road, green plants, road markings, traffic lights, etc. The embodiments of the present disclosure do not limit the targets to be detected, which can be determined according to actual needs.
- a preset target detection framework can be used to perform target detection on the first image, for example, region-based algorithms such as the Region-based Convolutional Neural Network (RCNN), Fast RCNN, and Mask RCNN; regression-based algorithms such as You Only Look Once (YOLO); or the Single Shot MultiBox Detector (SSD) algorithm, which combines ideas from Faster RCNN and YOLO.
- the first target is the target in the first image; it may be one target or multiple targets, and multiple targets may be of the same type (for example, all persons) or of different types (for example, including people, vehicles, etc.). Correspondingly, target detection on the first image may yield one or more detection frames of the first target.
- the embodiments of the present disclosure do not limit the quantity and type of the first targets.
- the detection box in the embodiment of the present disclosure is the bounding box of the target (Bounding Box).
- a four-dimensional vector (x, y, w, h) can be used to represent each detection frame, where (x, y) represents the coordinates of the detection frame in the image coordinate system, which may be the coordinates of the center point or of a preset vertex of the detection frame in the image coordinate system; w and h represent the width and height of the detection frame, respectively.
- Step 202 Acquire depth information of the first image in the corresponding first camera coordinate system.
- the depth information is used to represent the distance between each point in the scene (corresponding to each pixel in the image) and the camera device.
- the depth information can be specifically expressed as a depth map.
- a depth map is an image or image channel that contains distance information between points in the scene and the camera. It is similar to a grayscale image: each pixel value is the actual distance from the camera to the corresponding point in the scene, and each pixel value may occupy a short (16-bit) storage length (a minimal sketch follows below).
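- a minimal sketch of this representation (frame size and values are hypothetical; the uint16 array mirrors the per-pixel "short" storage described above):

```python
import numpy as np

# Depth map: a 2D array aligned with the image, one distance per pixel,
# stored here as uint16 ("short") millimetres.
depth_mm = np.zeros((480, 640), dtype=np.uint16)  # hypothetical 640x480 frame
depth_mm[240, 320] = 5430                         # a point 5.43 m from the camera

def depth_at(depth_map_mm, i, j):
    """Distance (in metres) from the camera to the scene point at pixel (i, j)."""
    return depth_map_mm[i, j] / 1000.0

print(depth_at(depth_mm, 240, 320))  # 5.43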
- a neural network can be used to acquire depth information of the first image in the corresponding first camera coordinate system.
- the neural network is a pre-trained neural network, which can perform depth prediction based on the input image and output the depth information of the scene in the image.
- for example, an end-to-end U-shaped deep neural network and a deep-learning-based monocular depth prediction method can be used to perform depth prediction on the input first image to obtain the depth information of the first image in the corresponding first camera coordinate system.
- the camera coordinate system is a three-dimensional (3D) coordinate system established with the focal center of the camera device as the origin and the optical axis (i.e., the depth direction) as the Z axis.
- since the camera device on the driving object is in motion, the pose of the camera device keeps changing, and the corresponding 3D coordinate system differs from moment to moment; the first camera coordinate system corresponding to the first image is the 3D coordinate system at the moment the camera device captures the first image.
- step 202 and step 201 may be performed simultaneously, or may be performed in any time sequence, which is not limited in this embodiment of the present disclosure.
- Step 203 Determine the depth information of the detection frame of the first target according to the depth information of the first image, and determine the first coordinates of the first target in the first camera coordinate system based on the position of the detection frame of the first target in the image coordinate system and the depth information of the detection frame of the first target (a sketch of this back-projection follows below).
- the depth information of the first image refers to the depth information of the first image in the corresponding first camera coordinate system determined in step 202.
- the depth information of the detection frame of the first target refers to the depth information of the detection frame of the first target in the first camera coordinate system.
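- one way this step can be realized, sketched under the standard pinhole camera model with a hypothetical intrinsic matrix K (the application embodiment later describes the same K-based conversion):

```python
import numpy as np

def unproject(u, v, depth, K):
    """Back-project pixel (u, v) with a known depth into 3D camera
    coordinates: P = depth * K^-1 @ [u, v, 1]."""
    return depth * np.linalg.inv(K) @ np.array([u, v, 1.0])

# Hypothetical intrinsics and a detection-frame position (x, y) with the
# depth value determined for that detection frame.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
first_coords = unproject(350.0, 260.0, 12.5, K)  # first coordinates (3D)
```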
- Step 204 acquiring the pose change information of the camera device from the collection of the second image to the collection of the first image.
- the second image is an image whose time sequence precedes the first image in the image sequence where the first image is located and is spaced from the first image by a preset number of frames.
- the specific value of the preset number of frames can be set according to actual needs (for example, the specific scene, the motion state of the driving object, the image collection frequency of the camera device, etc.), and can be 0, 1, 2, 3, and so on.
- when the preset number of frames is 0, the second image and the first image are two adjacent frames.
- for example, in a scene where the driving object moves quickly, a small value of the preset number of frames prevents a target that appears in the second image from having moved out of the shooting range of the camera device by the time the first image is captured, so that the motion information of targets in the scene outside the driving object can be detected effectively.
- in a crowded urban road scene where the driving object moves slowly, a larger value of the preset number of frames still allows the same target to be detected from the moment the second image is collected to the moment the first image is collected, while avoiding the computing and storage resources consumed by executing the motion information detection method too frequently, thereby improving resource utilization.
- the pose change information in the embodiment of the present disclosure refers to the difference between the pose of the camera device when the first image is collected and the pose when the second image is collected.
- the pose change information is the pose change information based on the 3D space, which can be specifically expressed as a matrix, so it can be called a pose change matrix.
- the pose change information may include translation information and rotation information of the camera.
- the translation information of the camera device may include: displacement amounts of the camera device on the three coordinate axes XYZ in the 3D coordinate system respectively.
- the rotation information of the camera device may be rotation vectors based on roll (Roll), yaw (Yaw), and pitch (Pitch), which include rotation component vectors in these three rotation directions, where Roll, Yaw, and Pitch respectively represent rotations of the camera device around the three coordinate axes X, Y, and Z of the 3D coordinate system (a sketch follows below).
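- a sketch of one conventional way to assemble such pose change information into a single pose change matrix (the composition order is an assumption; the patent only states that the information comprises per-axis displacements and Roll/Yaw/Pitch rotations):

```python
import numpy as np

def pose_change_matrix(roll, yaw, pitch, tx, ty, tz):
    """Build a 4x4 pose change matrix from the three rotation angles
    (radians) and the three axis displacements described above, following
    the text's assignment of Roll, Yaw, Pitch to the X, Y, Z axes."""
    cr, sr = np.cos(roll), np.sin(roll)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])   # Roll about X
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])   # Yaw about Y
    Rz = np.array([[cp, -sp, 0], [sp, cp, 0], [0, 0, 1]])   # Pitch about Z
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx  # one common composition order; conventions vary
    T[:3, 3] = [tx, ty, tz]
    return T
```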
- visual technology can be used to acquire the pose change information of the camera device from collecting the second image to collecting the first image, for example, by using a Simultaneous Localization and Mapping (SLAM) method.
- for example, the first image (an RGB image) and its depth information, together with the second image and its depth information, can be input into the Red Green Blue Depth (RGBD) model of the open-source Oriented FAST and Rotated BRIEF (ORB)-SLAM framework, which outputs the pose change information.
- the embodiments of the present disclosure may also adopt other manners, for example, using the Global Positioning System (GPS) and an angular velocity sensor, to acquire the pose change information of the camera device from collecting the second image to collecting the first image.
- the embodiments of the present disclosure do not limit the specific manner of acquiring the pose change information of the camera device from collecting the second image to collecting the first image.
- Step 205 Convert the second coordinates of the second target in the second camera coordinate system corresponding to the second image into the third coordinates in the first camera coordinate system, according to the pose change information of the camera device from collecting the second image to collecting the first image (a sketch follows below).
- the second target is the target in the second image corresponding to the first target; like the first target, the second target may be one target or multiple targets, and multiple targets may be of the same type (for example, all persons) or of different types (for example, including people, vehicles, etc.).
- the second camera coordinate system corresponding to the second image is the 3D coordinate system when the camera device collects the second image.
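- a minimal sketch of this conversion, assuming the pose change information is expressed as a 4x4 matrix as described above (the numeric values are hypothetical):

```python
import numpy as np

def to_first_frame(p_second, T_change):
    """Convert a point from the second camera coordinate system into the
    first camera coordinate system (the "third coordinates") using the
    camera's 4x4 pose change matrix between the two capture moments."""
    return (T_change @ np.append(p_second, 1.0))[:3]

T_change = np.eye(4)
T_change[:3, 3] = [0.0, 0.0, -0.8]          # hypothetical translation part
second_coords = np.array([2.0, 0.5, 14.2])  # hypothetical target point
third_coords = to_first_frame(second_coords, T_change)
```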
- steps 204 to 205 and steps 201 to 203 may be performed simultaneously, or may be performed in any time sequence, which is not limited in this embodiment of the present disclosure.
- Step 206 based on the first coordinates and the third coordinates, determine the motion information of the first target within a time range corresponding to the time when the second image is collected to the time when the first image is collected.
- the motion information of the first target may include a motion speed and a motion direction of the first target within a corresponding time range.
- in this embodiment, images of the scene outside the driving object are collected by the camera device on the driving object during driving; target detection is performed on the collected first image to obtain the detection frame of the first target; the depth information of the first image in the corresponding first camera coordinate system is acquired and the depth information of the detection frame of the first target is determined from it; and the first coordinates of the first target in the first camera coordinate system are determined based on the position of the detection frame in the image coordinate system and the depth information of the detection frame. The pose change information of the camera device from collecting the second image to collecting the first image is acquired, where the second image precedes the first image in the image sequence and is spaced from the first image by a preset number of frames. According to the pose change information, the target in the second image corresponding to the first target is taken as the second target, and its second coordinates in the second camera coordinate system corresponding to the second image are converted into third coordinates in the first camera coordinate system. Finally, based on the first coordinates and the third coordinates, the motion information of the first target within the time range from the acquisition moment of the second image to the acquisition moment of the first image is determined.
- the embodiments of the present disclosure use computer vision technology to determine the motion information of targets in the driving scene from the image sequence of the scene, without relying on lidar to emit laser beams, construct point cloud data, and perform target detection and tracking on two sets of point clouds to calculate target speed and direction. This avoids a large amount of computation, saves processing time, and improves processing efficiency, which helps meet the needs of scenarios with high real-time requirements such as unmanned driving.
- FIG. 3 is a schematic flowchart of a method for detecting motion information of a target provided by another exemplary embodiment of the present disclosure. As shown in FIG. 3 , on the basis of the above-mentioned embodiment shown in FIG. 2 , step 203 may include the following steps:
- Step 2031 Obtain the depth value of each pixel in the detection frame of the first target from the depth information of the first image.
- the depth information of the first image includes the depth value of each pixel in the first image, and the depth value of each pixel in the detection frame of the first target can be queried from the depth information of the first image.
- Step 2032 Determine the depth information of the detection frame of the first target based on the depth value of each pixel in the detection frame of the first target in a preset manner.
- the detection frame of the first target includes a plurality of pixels, each with its own depth value. In this embodiment, the depth information of the detection frame is determined comprehensively from the depth values of all pixels in the frame, so that the first coordinates of the first target in the first camera coordinate system can be determined accurately from that depth information and the position of the detection frame in the image coordinate system, improving the accuracy of the first coordinates.
- the depth value with the highest frequency of occurrence may be selected as the depth information of the detection frame of the first target.
- the inventor found through research that, in practical applications, vibration and lighting during vehicle driving may degrade the quality of the image captured by the camera, producing some noise points in the image whose depth values cannot be obtained accurately, so that the depth values of these noise points in the depth information are too large or too small.
- since the points on the same target are at similar distances from the camera device, the corresponding pixels also have similar depth values, and the depth value with the highest frequency of occurrence is the one shared by the most pixels.
- selecting it therefore ignores the depth values of the few pixels that differ greatly, avoiding the influence of the depth values of noise pixels in the first image on the depth information of the detection frame of the entire first target and improving the accuracy of the depth information of the detection frame of the first target.
- alternatively, the depth information of the detection frame of the first target may be determined from the depth value range containing the largest number of pixels: for example, the maximum value, the minimum value, the average of the maximum and minimum values, or the median of that range may be used as the depth value of the detection frame of the first target.
- depth value ranges may be divided in advance, and the number of pixels in the detection frame of the first target whose depth values fall within each preset range is counted; the more pixels fall within a range, the more points on the surface of the first target correspond to it, so determining the depth of the detection frame from the range with the largest number of pixels likewise ignores the depth values of a few widely differing pixels, avoids the influence of noise pixels in the first image, and improves the accuracy of the depth information of the detection frame of the first target.
- alternatively, the average of the depth values of the pixels in the detection frame of the first target may be used as the depth information of the detection frame. This determines the depth information of the detection frame quickly and reduces the influence of individual pixels with widely differing depth values on the depth information of the entire detection frame, thereby improving its accuracy (a sketch of these strategies follows below).
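- a sketch of the most-frequent-value and mean strategies above (the 0.1 m quantization used to find the most frequent depth is an assumed bin width; box coordinates are assumed to be integers):

```python
import numpy as np

def box_depth(depth_map, box, method="mode"):
    """Summarize the per-pixel depths inside a detection frame (x, y, w, h)
    into a single depth value for the frame."""
    x, y, w, h = box
    d = depth_map[y:y + h, x:x + w].astype(float).ravel()
    if method == "mean":
        return d.mean()  # average of all pixel depths in the frame
    # "mode": take the most frequent (quantized) depth value, which ignores
    # the few widely differing noise pixels described above.
    bins = np.round(d / 0.1)
    values, counts = np.unique(bins, return_counts=True)
    return values[counts.argmax()] * 0.1
```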
- FIG. 4 is a schematic flowchart of a method for detecting motion information of a target provided by another exemplary embodiment of the present disclosure. As shown in FIG. 4 , on the basis of the above-mentioned embodiment shown in FIG. 2 or FIG. 3 , before step 205 , the following steps may be further included:
- Step 301 Determine the correspondence between at least one object in the first image and at least one object in the second image.
- At least one object in the first image includes the above-mentioned first object.
- At least one target in the first image and at least one target in the second image may be any target of interest in the scene outside the driving object, such as people, vehicles, buildings, and other types of targets.
- the first target is one or more of the at least one target in the first image; correspondingly, the second target is one or more of the at least one target in the second image.
- the first target is a target that needs to be detected with motion information in the first image
- the second target is a target that belongs to the same target as the first target in the second image.
- Step 302 determine the target in the second image corresponding to the first target as the second target.
- after the correspondence between the at least one target in the first image and the at least one target in the second image is determined in step 301, the target in the second image corresponding to the first target in the first image can be determined as the second target based on that correspondence.
- in this way, the correspondence between the targets in the two images is determined once for the two images, and the second target in the second image corresponding to the first target can then be determined directly from that correspondence, improving the efficiency of determining the second target in the second image.
- the detection frame of at least one target in the second image may be tracked to obtain the correspondence between the at least one target in the first image and the at least one target in the second image.
- the correspondence between objects in different images can be obtained by tracking the detection frame of the object.
- FIG. 5 is a schematic flowchart of a method for detecting motion information of a target provided by yet another exemplary embodiment of the present disclosure. As shown in FIG. 5 , in other embodiments, step 301 may include the following steps:
- Step 3011 Obtain optical flow information from the second image to the first image.
- the optical flow information is used to represent motion or timing information of pixels between images in a video or image sequence.
- the optical flow information from the second image to the first image, i.e., the two-dimensional motion field of pixels from the second image to the first image, represents the movement of the pixels of the second image into the first image.
- computer vision technology can be used to obtain it; for example, the second image and the first image are input into a model based on the open-source computer vision library OpenCV, and the model outputs the optical flow information between the second image and the first image.
- Step 3012 For the detection frame of each of the at least one target in the second image, determine, based on the optical flow information and the detection frame of the target in the second image, the positions to which the pixels in the detection frame are transferred in the first image.
- Step 3013 Obtain the Intersection over Union (IoU) between the set of positions to which the pixels in the detection frame of the target in the second image are transferred in the first image and each detection frame in the first image, i.e., the coverage ratio between the set and each detection frame in the first image.
- specifically, the intersection I and the union U between the above set and each detection frame in the first image can be obtained, and the ratio of the intersection I to the union U is taken as the coverage ratio between the set and that detection frame.
- Step 3014 Establish the correspondence between the target in the second image and the target whose detection frame in the first image has the largest IoU; that is, take the target corresponding to the detection frame with the largest IoU in the first image as the target corresponding to the target in the second image.
- in this way, the set of positions to which the pixels in the detection frame of a target in the second image are transferred in the first image is determined from the optical flow between the two images, and the IoU between this set and each detection frame in the first image is obtained. The larger the IoU, the greater the overlap between that detection frame and the transferred pixel set, so the detection frame with the largest IoU is most likely the detection frame of the same target. Determining the correspondence between the targets in the two images in this way is accurate and objective (a sketch follows below).
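- a minimal sketch of steps 3011 to 3014 combined (Farneback optical flow is one OpenCV option for step 3011; approximating the transferred pixel set by its bounding box before the IoU test is a simplification of the set-versus-frame comparison described above; box coordinates are assumed to be integers):

```python
import cv2
import numpy as np

def iou(a, b):
    """Intersection over Union of two (x, y, w, h) detection frames."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def match_targets(prev_gray, curr_gray, prev_boxes, curr_boxes):
    """Associate each detection frame of the second (earlier) image with the
    detection frame of the first (later) image having the largest IoU with
    the flow-transferred pixel set."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    matches = {}
    for k, (x, y, w, h) in enumerate(prev_boxes):
        ys, xs = np.mgrid[y:y + h, x:x + w]          # pixels of the frame
        xs2 = xs + flow[ys, xs, 0]                   # transferred positions
        ys2 = ys + flow[ys, xs, 1]
        moved = (xs2.min(), ys2.min(),               # bounding box of the
                 xs2.max() - xs2.min(),              # transferred pixel set
                 ys2.max() - ys2.min())
        matches[k] = int(np.argmax([iou(moved, b) for b in curr_boxes]))
    return matches
```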
- FIG. 6 is a schematic flowchart of a method for detecting motion information of a target provided by yet another exemplary embodiment of the present disclosure. As shown in FIG. 6 , on the basis of the above-mentioned embodiment shown in FIG. 2 or FIG. 3 , step 206 may include the following steps:
- Step 2061 Obtain a vector formed from the third coordinate to the first coordinate.
- the vector formed from the third coordinates to the first coordinates is the displacement vector from the third coordinates to the first coordinates, i.e., the directed line segment between them; the magnitude of the displacement vector is the straight-line distance from the third coordinates to the first coordinates, and its direction points from the third coordinates to the first coordinates.
- Step 2062 Determine, based on the direction of the vector formed from the third coordinates to the first coordinates, the movement direction of the first target within the time range from the acquisition moment of the second image to the acquisition moment of the first image; and determine, based on the norm of that vector and the length of the time range, the movement speed of the first target within the time range.
- for example, the ratio of the norm of the vector formed from the third coordinates to the first coordinates to the length of the time range can be taken as the movement speed of the first target within the time range.
- the movement direction and movement speed of the first target within the above-mentioned time range constitute the movement information of the first target in the above-mentioned time range.
- in this way, the movement direction and movement speed of the first target within the corresponding time range can be accurately determined based on the vector formed from the third coordinates to the first coordinates, so that the movement state of the first target is known (a sketch follows below).
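- a minimal sketch of this computation (the coordinate values and elapsed time dt are hypothetical):

```python
import numpy as np

def motion_info(third_coords, first_coords, dt):
    """Movement direction (unit vector) and speed (norm / elapsed time) of
    the target, from the vector third coordinates -> first coordinates."""
    v = first_coords - third_coords
    norm = np.linalg.norm(v)
    speed = norm / dt
    direction = v / norm if norm > 0 else v
    return speed, direction

speed, direction = motion_info(np.array([2.0, 0.5, 14.2]),
                               np.array([2.3, 0.5, 13.1]), dt=0.1)
```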
- FIG. 7 is a schematic flowchart of a method for detecting motion information of a target provided by another exemplary embodiment of the present disclosure. As shown in FIG. 7 , on the basis of the above-mentioned embodiments shown in FIGS. 2 to 6 , before step 205 , the following steps may be further included:
- Step 401 perform target detection on the second image to obtain a detection frame of the second target.
- Step 402 acquiring depth information of the second image in the second camera coordinate system.
- the depth information of the detection frame of the second target is determined according to the depth information of the second image in the second camera coordinate system.
- the depth information of the detection frame of the second target refers to the depth information of the detection frame of the second target in the second camera coordinate system.
- Step 403 Determine the second coordinates of the second target in the second camera coordinate system based on the position of the detection frame of the second target in the image coordinate system and the depth information of the detection frame of the second target.
- in this way, target detection and depth information acquisition can be performed in advance for the second image, which precedes the first image in the image sequence, and the second coordinates of the second target in the second camera coordinate system determined from them; subsequently, the second coordinates can be converted directly to determine the motion information of the first target within the corresponding time range, improving the detection efficiency of target motion information in the scene.
- the second coordinates of the second target may also be stored for direct querying later, further improving the detection efficiency of target motion information in the scene.
- further, the first image can then be taken as a new second image, and a third image positioned after the first image in the image sequence taken as a new first image, and the method for detecting motion information of a target described in any of the foregoing embodiments performed again to determine the motion information of the target in the third image within the time range from the acquisition moment of the first image to the acquisition moment of the third image.
- in this way, the motion information of targets can be detected frame by frame, or every several frames, over the image sequence, continuously tracking the motion state of targets in the scene outside the driving object during driving, so that the driving of the driving object can be controlled according to the motion state of the targets and safe driving ensured (a sketch of such a processing loop follows below).
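- a sketch of such a frame-by-frame processing loop (the helper functions, frame gap, and frame rate are assumptions standing in for the detection, unprojection, and pose stages described earlier):

```python
import numpy as np

# Hypothetical stand-ins for the detection/unprojection and pose stages
# described in the embodiments above; real implementations would replace them.
def detect_and_unproject(frame_index):
    return {0: np.array([2.0, 0.5, 14.0])}    # target id -> camera-frame 3D

def relative_pose(t_prev, t_curr):
    return np.eye(4)                          # 4x4 pose change matrix

GAP = 2      # preset number of frames between the second and first image
FPS = 10.0   # assumed camera frame rate
cache = {}   # frame index -> stored per-target second coordinates

for t in range(100):                          # stand-in for the image sequence
    cache[t] = detect_and_unproject(t)
    t_prev = t - (GAP + 1)
    if t_prev in cache:
        T = relative_pose(t_prev, t)
        dt = (t - t_prev) / FPS
        for k, p2 in cache[t_prev].items():
            if k in cache[t]:                 # same target in both frames
                third = (T @ np.append(p2, 1.0))[:3]
                speed = np.linalg.norm(cache[t][k] - third) / dt
                # report speed (and direction) for target k here
        del cache[t_prev]                     # stored coordinates used once
```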
- FIG. 8 is a schematic diagram of an application flow of a method for detecting motion information of a target provided by an exemplary embodiment of the present disclosure.
- the method for detecting motion information of a target in an embodiment of the present disclosure is further described below by taking an application embodiment as an example.
- the application embodiment includes:
- Step 501 During the driving process of the driving object, a camera on the driving object collects images of scenes outside the driving object to obtain an image sequence.
- for the camera device, step 506 is performed (in parallel with the image processing steps below).
- Step 502 Using a preset target detection framework, perform target detection on the second image $I_{t-1}$ to obtain the detection frames of the targets in $I_{t-1}$. Since one or more targets may be detected, the detection frame set $BBox_{t-1}$ is used to represent the detected frames, and the detection frame of the target numbered k at time t-1 (hereinafter: target k) is described as a four-dimensional vector (x, y, w, h), where (x, y) represents the coordinates of the detection frame of target k in the image coordinate system, and w and h represent the width and height of the detection frame of target k, respectively.
- Step 503 Using a preset depth estimation method, perform depth estimation on the second image $I_{t-1}$ to obtain the depth map $D_{t-1}$ corresponding to $I_{t-1}$.
- the depth map $D_{t-1}$ includes the depth values, in the second camera coordinate system at time t-1, corresponding to the different pixels in $I_{t-1}$; the depth value of pixel (i, j) of $I_{t-1}$ in the second camera coordinate system can be expressed as $D_{t-1}(i, j)$.
- Step 504 Obtain the depth value of each pixel in the detection frame of each target in the second image $I_{t-1}$ from the depth map $D_{t-1}$ corresponding to $I_{t-1}$, and determine, in a preset manner, the depth value of the detection frame of each target in $I_{t-1}$ based on the depth values of the pixels in that detection frame.
- here, the depth value of each pixel in the detection frame of each target in $I_{t-1}$ means the depth value of that pixel in the second camera coordinate system.
- steps 503 to 504 and step 502 may be performed simultaneously, or may be performed in any time sequence, which is not limited in this embodiment of the present disclosure.
- Step 505 For the detection frame of each target in the second image $I_{t-1}$, determine the 3D coordinates (the second coordinates) of the target in the second camera coordinate system at time t-1, based on the position of the detection frame in the image coordinate system and the depth value of the detection frame.
- for example, the 3D coordinates in the second camera coordinate system corresponding to the detection frame of target k at time t-1 can be obtained as

  $$P_{t-1}^{k} = D_{t-1}^{k} \cdot K^{-1} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

  where $D_{t-1}^{k}$ is the depth value of the detection frame of target k and (x, y) is its position in the image coordinate system.
- K is an internal parameter matrix of the camera device, which is used to represent the properties of the camera device itself and can be obtained by calibration in advance (its conventional form is sketched below).
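- for reference, a sketch of the conventional pinhole form that such an intrinsic matrix takes (the patent does not spell out the entries of K, so this form is an assumption):

$$K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$

  where $f_x$ and $f_y$ are the focal lengths in pixels and $(c_x, c_y)$ is the principal point; with this form, $K^{-1}[x, y, 1]^{T}$ is the viewing ray of pixel (x, y), and scaling it by the depth value yields the 3D point.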
- Step 506: Obtain the pose change matrix T_{t-1→t} of the camera device from time t-1 to time t.
- Step 506, steps 502 to 505, and steps 508 to 513 may be performed simultaneously or in any temporal order, which is not limited in this embodiment of the present disclosure.
- Step 507: According to the above-mentioned pose change matrix T_{t-1→t}, convert the second coordinates of each target in the second image I_{t-1} from the second camera coordinate system to 3D coordinates in the first camera coordinate system (the third coordinates above).
- For example, the second coordinates P_k^{t-1} of the detection frame of target k in I_{t-1} are converted to the third coordinates P'_k = T_{t-1→t} · P_k^{t-1} (with the points expressed in homogeneous form).
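- Assuming the pose change matrix is given in 4×4 homogeneous form (the disclosure only calls it a pose change matrix, so this representation is an assumption), step 507 can be sketched as:

```python
import numpy as np

def transform_point(T_prev_to_curr: np.ndarray, p: np.ndarray) -> np.ndarray:
    """Apply a 4x4 homogeneous pose-change matrix T_{t-1->t} to a 3D point
    expressed in the camera frame at time t-1, returning its coordinates
    in the camera frame at time t."""
    p_h = np.append(p, 1.0)             # to homogeneous coordinates
    return (T_prev_to_curr @ p_h)[:3]   # back to 3D
```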
- Step 508: Using a preset target detection frame, perform target detection on the first image I_t to obtain the detection frames of targets in I_t (the first targets above). Since one or more targets may be detected, the set BBox_t denotes all detection frames of the first targets; the detection frame of the target numbered k' among the first targets at time t (hereinafter: target k') is described as bbox_{k'}^t = (x, y, w, h),
- where (x, y) are the coordinates of the detection frame of target k' in the image coordinate system,
- and w and h represent the width and height of the detection frame of target k', respectively.
- Step 509: Using a preset depth estimation method, perform depth estimation on the first image I_t to obtain the corresponding depth map D_t.
- The depth map D_t contains the depth values, in the first camera coordinate system at time t, of the pixels of the first image I_t; the depth value of pixel (i, j) of I_t in the first camera coordinate system can be written D_t(i, j).
- Step 510: Obtain the depth value of each pixel in the detection frame of the first target from the depth map D_t, and determine, in a preset manner, the depth value of the detection frame of the first target from the depth values of the pixels inside it.
- The depth value of the detection frame of the first target refers to its depth value in the first camera coordinate system.
- Steps 509 to 510 and step 508 may be performed simultaneously or in any temporal order, which is not limited in this embodiment of the present disclosure.
- Step 511: Based on the position of the detection frame of the first target in the image coordinate system and the depth value of that detection frame, determine the first coordinates of the first target in the first camera coordinate system at time t.
- the first target may be one target or multiple targets.
- When there are multiple first targets, this step is performed for each of them: based on the position of each target's detection frame in the image coordinate system and its depth value, the 3D coordinates in the first camera coordinate system corresponding to the detection frame at time t (the first coordinates above) are determined.
- For example, continuing with target k' at time t, the 3D coordinates corresponding to its detection frame can be obtained by back-projection: P_{k'}^t = d_{k'}^t * K^{-1} [x, y, 1]^T.
- K is again the intrinsic parameter matrix of the camera device, which represents the properties of the device itself and can be obtained by calibration in advance.
- Step 512: Determine the correspondence between the first targets in the first image I_t and the targets in the second image I_{t-1}.
- Step 513: Determine the target in the second image corresponding to the first target as the second target.
- The second target may be one target or multiple targets, and multiple targets may be of the same type (for example, all people) or of different types (for example, including people, vehicles, buildings, etc.).
- The second target in the second image corresponding to the first target may be determined in the manner described in any of the above-mentioned embodiments of FIG. 4 to FIG. 5 of the present disclosure.
- Steps 512 to 513 may be executed once steps 502 and 508 have been completed; they may be executed simultaneously with the other steps of this application embodiment or in any temporal order, which is not limited in this embodiment of the present disclosure.
- Step 514: Based on the first coordinates of the first target and the third coordinates of the corresponding second target, determine the motion information of the first target within the time range Δt from time t-1 to time t.
- the first target may be one target or multiple targets.
- step 514 is performed for each first target respectively.
- For example, for the first target k' with first coordinates P_{k'}^t and the corresponding second target k with third coordinates P'_k at time t-1, the motion information of k' within the time range Δt is determined as follows: obtain the vector v = P_{k'}^t - P'_k formed from the third coordinates to the first coordinates,
- take the direction of the vector as the movement direction of the first target k' within Δt, and take the norm of the vector divided by Δt, i.e. ‖v‖ / Δt, as its movement speed within Δt.
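- A minimal sketch of this direction-and-speed computation (names are illustrative):

```python
import numpy as np

def motion_info(third_coord: np.ndarray, first_coord: np.ndarray, dt: float):
    """Vector from the position at t-1 (third coordinate) to the position
    at t (first coordinate): its direction is the movement direction, and
    its norm divided by dt is the movement speed."""
    v = first_coord - third_coord
    n = float(np.linalg.norm(v))
    direction = v / n if n > 0 else v   # unit direction (zero if static)
    return direction, n / dt
```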
- FIG. 9 is a schematic flowchart of a method for controlling a traveling object based on motion information of a target provided by an exemplary embodiment of the present disclosure. This embodiment can be applied to traveling objects such as vehicles, robots, and toy cars. As shown in FIG. 9 , the method for controlling a traveling object based on the motion information of the target in this embodiment includes the following steps:
- Step 601: During the driving process of the driving object, an image sequence of the scene outside the driving object is collected by a camera device on the driving object.
- Step 602: Take at least one frame in the image sequence as the first image and at least one frame that precedes the first image in the sequence and is separated from it by a preset number of frames as the second image,
- and determine the motion information of targets in the scene outside the driving object by using the method for detecting motion information of a target according to any of the above embodiments of the present disclosure.
- Step 603: Generate a control instruction for controlling the driving state of the driving object according to the motion information of the above targets, so as to control the driving state of the driving object.
- In this embodiment, the method for detecting motion information of a target described in any of the embodiments of the present disclosure can be used to determine the motion information of targets in the driving scene, and a control instruction for controlling the driving state of the driving object is then generated according to that motion information (a minimal decision-rule sketch follows the list of instruction types below).
- Computer vision technology is thereby used to detect the motion information of targets in the driving scene and to realize intelligent driving control of the driving object, which helps meet the real-time intelligent driving control required in unmanned scenarios and ensures the safe driving of the driving object.
- The control instructions may include, but are not limited to, at least one of the following:
- a control instruction for maintaining the movement speed, or for adjusting it (for example, a control instruction for decelerating or a control instruction for accelerating);
- a control instruction for maintaining the movement direction, or for adjusting it (for example, a control instruction for left steering, right steering, merging to the left lane, or merging to the right lane);
- a control instruction for early warning (for example, a reminder to pay attention to the target ahead);
- a control instruction for switching the driving mode (for example, switching to the automatic cruise driving mode or to the manual driving mode).
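- The disclosure does not specify the decision logic that maps motion information to these instruction types; purely as an illustration, a toy time-to-collision rule might look like:

```python
def control_instruction(closing_speed: float, distance: float,
                        min_ttc: float = 3.0) -> str:
    """Toy rule: pick a control instruction for a leading target from its
    closing speed (m/s) and distance (m). All thresholds are illustrative."""
    if closing_speed <= 0:            # target holding distance or pulling away
        return "maintain speed"
    ttc = distance / closing_speed    # seconds until contact at current rates
    if ttc < min_ttc:
        return "decelerate; warn: pay attention to the target ahead"
    if ttc < 2 * min_ttc:
        return "maintain speed"
    return "accelerate"
```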
- the method for detecting the motion information of the target or the method for controlling the traveling object based on the motion information of the target provided by any of the above embodiments of the present disclosure can be executed by any appropriate device with data processing capabilities, including but not limited to: terminal equipment and servers etc.
- The method for detecting the motion information of the target or the method for controlling the traveling object based on the motion information of the target provided by any of the above-mentioned embodiments of the present disclosure may also be executed by a processor: for example, the processor calls the corresponding instructions stored in a memory to execute that method. This will not be repeated below.
- FIG. 10 is a schematic structural diagram of an apparatus for detecting motion information of a target provided by an exemplary embodiment of the present disclosure.
- The device for detecting the motion information of the target may be installed in electronic equipment such as terminal devices and servers, or on traveling objects such as vehicles, robots, and toy cars, to execute the method for detecting motion information of a target according to any of the above embodiments of the present disclosure.
- The apparatus for detecting motion information of a target includes: a detection module 701, a first acquisition module 702, a first determination module 703, a second determination module 704, a second acquisition module 705, a conversion module 706 and a third determination module 707, wherein:
- the detection module 701 is configured to perform target detection on a first image to obtain a detection frame of the first target, where the first image is an image of a scene outside the driving object collected by a camera on the driving object during the driving process of the driving object.
- the first acquiring module 702 is configured to acquire depth information of the first image in the corresponding first camera coordinate system.
- the first determining module 703 is configured to determine the depth information of the detection frame of the first target according to the depth information of the first image obtained by the first obtaining module 702 .
- The second determination module 704 is configured to determine the first coordinates of the first target in the first camera coordinate system, based on the position in the image coordinate system of the detection frame of the first target obtained by the detection module 701 and the depth information of that detection frame determined by the first determination module 703.
- the second obtaining module 705 is configured to obtain the pose change information of the camera device from collecting the second image to collecting the first image.
- the second image is an image whose time sequence is located before the first image in the image sequence where the first image is located and is spaced from the first image by a preset number of frames.
- The conversion module 706 is configured to convert, according to the pose change information obtained by the second acquisition module 705, the second coordinates of the second target in the second camera coordinate system corresponding to the second image into the third coordinates in the first camera coordinate system.
- the second target is the target in the second image corresponding to the first target.
- The third determination module 707 is configured to determine, based on the first coordinates determined by the second determination module 704 and the third coordinates obtained by the conversion module 706, the motion information of the first target within the time range from the capture time of the second image to the capture time of the first image.
- In this way, computer vision technology is used to determine the motion information of targets in the driving scene from images of the scene outside the driving object collected during driving, without the aid of lidar.
- Compared with obtaining the movement speed and direction of a target with lidar, there is no need to construct point cloud data by emitting laser beams at high frequency, perform target detection and tracking on two point clouds, and then calculate the movement speed and direction; a large amount of computation is therefore avoided, processing time is saved, processing efficiency is improved, and the needs of scenarios with high real-time requirements, such as unmanned driving, are more easily met.
- FIG. 11 is a schematic structural diagram of an apparatus for detecting motion information of a target provided by another exemplary embodiment of the present disclosure.
- In this embodiment, the first determination module 703 includes: a first acquiring unit 7031, configured to obtain the depth value of each pixel in the detection frame of the first target from the depth information of the first image; and a first determining unit 7032, configured to determine, in a preset manner, the depth information of the detection frame of the first target based on the depth values of the pixels obtained by the first acquiring unit 7031.
- In one implementation, the first determining unit 7032 is specifically configured to select, among the depth values of the pixels in the detection frame of the first target acquired by the first acquiring unit 7031, the depth value with the highest frequency of occurrence as the depth information of the detection frame of the first target.
- In another implementation, the first determining unit 7032 is specifically configured to count, for each of several preset depth value ranges, the number of pixels in the detection frame of the first target whose depth values fall within that range, and to determine the depth information of the detection frame of the first target based on the depth value range containing the largest number of pixels.
- In a further implementation, the first determining unit 7032 is specifically configured to take the average of the depth values of the pixels in the detection frame of the first target as the depth information of the detection frame of the first target.
- Optionally, the apparatus for detecting the motion information of the target in the above embodiment may further include a fourth determination module 708 and a fifth determination module 709, wherein:
- the fourth determination module 708 is configured to determine the correspondence between at least one object in the first image and at least one object in the second image; wherein, the objects in the first image include the above-mentioned first object.
- The fifth determination module 709 is configured to determine, according to the correspondence determined by the fourth determination module 708, the target in the second image corresponding to the first target as the above-mentioned second target.
- the fourth determination module 708 is specifically configured to track the detection frame of at least one target in the second image to obtain at least one target in the first image and the target in the second image. Correspondence between at least one target.
- Alternatively, the fourth determination module 708 may include: a second acquiring unit 7081, configured to acquire optical flow information from the second image to the first image; and a second determining unit 7082, configured to determine, for the detection frame of each of the at least one target in the second image, based on the optical flow information and the detection frame of the target in the second image, the positions to which the pixels in that detection frame are transferred in the first image;
- a third acquiring unit 7083, configured to obtain the intersection-over-union between the set of positions to which the pixels in the detection frame of the target are transferred in the first image and each detection frame in the first image; and an establishing unit 7084, configured to establish the correspondence between the target in the second image and the target corresponding to the detection frame with the largest intersection-over-union in the first image.
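- A simplified sketch of this flow-based association: rather than tracking every pixel individually, it shifts the whole detection frame by the mean optical flow inside it, which approximates the set of transferred pixel positions described above (all names are illustrative):

```python
import numpy as np

def iou(a, b) -> float:
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def match_by_flow(box_prev, flow, boxes_curr):
    """Shift a t-1 detection frame by the mean flow inside it (flow is an
    H x W x 2 array of per-pixel displacements) and return the index of the
    detection frame in image t with the highest IoU, or None."""
    x, y, w, h = (int(v) for v in box_prev)
    dx = float(flow[y:y + h, x:x + w, 0].mean())
    dy = float(flow[y:y + h, x:x + w, 1].mean())
    moved = (box_prev[0] + dx, box_prev[1] + dy, w, h)
    scores = [iou(moved, b) for b in boxes_curr]
    return int(np.argmax(scores)) if scores else None
```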
- In this embodiment, the third determination module 707 includes: a fourth acquiring unit 7071, configured to acquire the vector formed from the third coordinates to the first coordinates; and a third determining unit 7072, configured to determine the movement direction of the first target within the above time range based on the direction of the vector obtained by the fourth acquiring unit 7071, and to determine the movement speed of the first target within that time range based on the norm of the vector and the time range.
- the detection module 701 may also be configured to perform target detection on the second image to obtain a detection frame of the second target.
- the first obtaining module 702 may also be configured to obtain depth information of the second image in the second camera coordinate system.
- The second determination module 704 can also be configured to determine the second coordinates of the second target in the second camera coordinate system, based on the position in the image coordinate system of the detection frame of the second target obtained by the detection module 701 and the depth information of that detection frame determined by the first determination module 703.
- Optionally, the apparatus for detecting motion information of a target in the above embodiment may further include a storage module 710 configured to store the second coordinates of the second target determined by the second determination module 704.
- After that, the first image may in turn be used as a new second image, and a third image located after the first image in the image sequence may be used as a new first image;
- each module in the device for detecting the motion information of the target then performs the corresponding operations to determine the motion information of the target in the third image within the time range from the capture time of the first image to the capture time of the third image.
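- A minimal sketch of this sliding-window continuation, where detect_motion is a hypothetical stand-in for the per-pair processing of steps 502 to 514:

```python
def detect_over_sequence(frames, detect_motion, gap: int = 1):
    """Pair each frame with the frame `gap` steps earlier (second image,
    first image) so motion information is produced continuously over the
    sequence collected during driving."""
    for t in range(gap, len(frames)):
        yield detect_motion(first_image=frames[t], second_image=frames[t - gap])
```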
- FIG. 12 is a schematic structural diagram of an apparatus for controlling a traveling object based on motion information of a target provided by an exemplary embodiment of the present disclosure.
- The device for controlling the traveling object based on the motion information of the target can be installed on traveling objects such as vehicles, robots, and toy cars, so as to control the traveling object based on the motion information of the target.
- The device for controlling the traveling object based on the motion information of the target includes a camera device 801, a motion information detection device 802 and a control device 803, wherein:
- the camera device 801 is arranged on the driving object, and is used for collecting an image sequence of a scene outside the driving object during the driving process of the driving object.
- The motion information detection device 802 is configured to take at least one frame in the above image sequence as the first image and at least one frame that precedes the first image in the sequence and is separated from it by a preset number of frames as the second image, and to determine the motion information of targets in the scene outside the driving object.
- the motion information detection apparatus 802 may be specifically implemented by the apparatus for detecting motion information of a target according to any of the embodiments in FIG. 10 to FIG. 11 .
- the control device 803 is configured to generate a control instruction for controlling the traveling state of the traveling object according to the motion information of the target detected by the motion information detection device 802 .
- In this way, the image sequence of the scene outside the driving object is collected by the camera device on the driving object; at least one frame in the sequence is used as the first image, and at least one frame that precedes the first image and is separated from it by a preset number of frames is used as the second image;
- the motion information of targets in the driving scene is determined by the method for detecting motion information of a target described in any embodiment of the present disclosure, and a control instruction for controlling the driving state of the driving object is then generated according to that motion information. Computer vision technology is thus used to detect the motion information of targets in the driving scene and realize intelligent driving control of the driving object, which helps meet the real-time intelligent driving control required in unmanned scenarios and ensures the safe driving of the driving object.
- The control instructions may include, but are not limited to, at least one of the types listed above: a control instruction for maintaining or adjusting the movement speed, a control instruction for maintaining or adjusting the movement direction, a control instruction for early warning, a control instruction for switching the driving mode, etc.
- FIG. 13 illustrates a block diagram of an electronic device according to an embodiment of the present disclosure.
- the electronic device includes one or more processors 11 and memory 12 .
- the processor 11 may be a central processing unit (Central Processing Unit, CPU) or other form of processing unit with data processing capability and/or instruction execution capability, and may control other components in the electronic device 10 to perform desired functions.
- Memory 12 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
- the volatile memory may include, for example, random access memory (Random Access Memory, RAM) and/or cache memory (cache).
- the non-volatile memory may include, for example, a read-only memory (Read-Only Memory, ROM), a hard disk, a flash memory, and the like.
- One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 11 may execute the program instructions to implement the method for detecting motion information of a target or the method for controlling a traveling object based on the motion information of the target in the various embodiments of the present disclosure described above, and/or other desired functions.
- Various contents such as depth information of an image, depth information of a detection frame of a target, and pose change information of a camera can also be stored in the computer-readable storage medium.
- the electronic device 10 may also include an input device 13 and an output device 14 interconnected by a bus system and/or other form of connection mechanism (not shown).
- The input device 13 may be a microphone or microphone array, or a communication network connector.
- the input device 13 may also include, for example, a keyboard, a mouse, and the like.
- The output device 14 can output various information to the outside, including the determined motion information of the first target within the time range from the capture time of the second image to the capture time of the first image.
- the output devices 14 may include, for example, displays, speakers, printers, and communication networks and their connected remote output devices, among others.
- the electronic device 10 may also include any other suitable components according to the specific application.
- Embodiments of the present disclosure may also be computer program products comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps of the method for detecting motion information of a target or the method for controlling a traveling object based on the motion information of the target according to the various embodiments of the present disclosure described in the "Exemplary Method" section above.
- The computer program product may write program code for performing the operations of embodiments of the present disclosure in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
- The program code may execute entirely on the user computing device, partly on the user device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on a remote computing device or server.
- Embodiments of the present disclosure may also be computer-readable storage media having computer program instructions stored thereon that, when executed by a processor, cause the processor to perform the steps of the method for detecting motion information of a target or the method for controlling a traveling object based on the motion information of the target according to the various embodiments of the present disclosure described in the "Exemplary Method" section above.
- the computer-readable storage medium may employ any combination of one or more readable media.
- the readable medium may be a readable signal medium or a readable storage medium.
- the readable storage medium may include, for example, but not limited to, electrical, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses or devices, or a combination of any of the above.
- More specific examples of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
- the methods and apparatus of the present disclosure may be implemented in many ways.
- the methods and apparatus of the present disclosure may be implemented in software, hardware, firmware, or any combination of software, hardware, and firmware.
- the above-described order of steps for the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise.
- the present disclosure can also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing methods according to the present disclosure.
- the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
- Each component or each step may be decomposed and/or recombined; these decompositions and/or recombinations should be regarded as equivalent solutions of the present disclosure.
Claims (11)
- A method for detecting motion information of a target, comprising: performing target detection on a first image to obtain a detection frame of a first target, the first image being an image of a scene outside a traveling object collected by a camera device on the traveling object while the traveling object is traveling; acquiring depth information of the first image in a corresponding first camera coordinate system; determining depth information of the detection frame of the first target according to the depth information of the first image, and determining first coordinates of the first target in the first camera coordinate system based on the position of the detection frame of the first target in an image coordinate system and the depth information of the detection frame of the first target; acquiring pose change information of the camera device from the capture of a second image to the capture of the first image, wherein the second image is an image that precedes the first image in the image sequence containing the first image and is separated from the first image by a preset number of frames; converting, according to the pose change information, second coordinates of a second target in a second camera coordinate system corresponding to the second image into third coordinates in the first camera coordinate system, wherein the second target is the target in the second image corresponding to the first target; and determining, based on the first coordinates and the third coordinates, motion information of the first target within the time range from the capture time of the second image to the capture time of the first image.
- The method according to claim 1, wherein determining the depth information of the detection frame of the first target according to the depth information of the first image in the corresponding first camera coordinate system comprises: obtaining the depth value of each pixel in the detection frame of the first target from the depth information of the first image; and determining, in a preset manner, the depth information of the detection frame of the first target based on the depth values of the pixels in the detection frame of the first target.
- The method according to claim 1, wherein before converting, according to the pose change information, the second coordinates of the second target in the second camera coordinate system corresponding to the second image into the third coordinates in the first camera coordinate system, the method further comprises: determining a correspondence between at least one target in the first image and at least one target in the second image, wherein the at least one target in the first image includes the first target; and determining, according to the correspondence, the target in the second image corresponding to the first target as the second target.
- The method according to claim 3, wherein determining the correspondence between at least one target in the first image and at least one target in the second image comprises: tracking the detection frame of at least one target in the second image to obtain the correspondence between at least one target in the first image and at least one target in the second image; or acquiring optical flow information from the second image to the first image; determining, for the detection frame of each of the at least one target in the second image, based on the optical flow information and the detection frame of the target in the second image, the positions to which the pixels in the detection frame of the target in the second image are transferred in the first image; obtaining the intersection-over-union between the set of positions to which the pixels in the detection frame of the target are transferred in the first image and each detection frame in the first image; and establishing a correspondence between the target in the second image and the target corresponding to the detection frame with the largest intersection-over-union in the first image.
- The method according to claim 1, wherein determining, based on the first coordinates and the third coordinates, the motion information of the first target within the time range from the capture time of the second image to the capture time of the first image comprises: obtaining the vector formed from the third coordinates to the first coordinates; and determining the movement direction of the first target within the time range based on the direction of the vector, and determining the movement speed of the first target within the time range based on the norm of the vector and the time range, wherein the motion information of the first target within the time range includes the movement direction and the movement speed of the first target within the time range.
- The method according to any one of claims 1 to 5, wherein before converting, according to the pose change information, the second coordinates of the second target in the second camera coordinate system corresponding to the second image into the third coordinates in the first camera coordinate system, the method further comprises: performing target detection on the second image to obtain the detection frame of the second target; acquiring depth information of the second image in the second camera coordinate system, and determining depth information of the detection frame of the second target according to the depth information of the second image in the second camera coordinate system; and determining the second coordinates of the second target in the second camera coordinate system based on the position of the detection frame of the second target in the image coordinate system and the depth information of the detection frame of the second target.
- A method for controlling a traveling object based on motion information of a target, comprising: collecting, by a camera device on the traveling object while the traveling object is traveling, an image sequence of a scene outside the traveling object; taking at least one frame in the image sequence as the first image and at least one frame that precedes the first image in the image sequence and is separated from the first image by a preset number of frames as the second image, and determining the motion information of a target in the scene by the method according to any one of claims 1-7; and generating, according to the motion information of the target, a control instruction for controlling the traveling state of the traveling object.
- An apparatus for detecting motion information of a target, comprising: a detection module configured to perform target detection on a first image to obtain a detection frame of a first target, the first image being an image of a scene outside a traveling object collected by a camera device on the traveling object while the traveling object is traveling; a first acquisition module configured to acquire depth information of the first image in a corresponding first camera coordinate system; a first determination module configured to determine depth information of the detection frame of the first target according to the depth information of the first image acquired by the first acquisition module; a second determination module configured to determine first coordinates of the first target in the first camera coordinate system based on the position in an image coordinate system of the detection frame of the first target obtained by the detection module and the depth information of the detection frame of the first target determined by the first determination module; a second acquisition module configured to acquire pose change information of the camera device from the capture of a second image to the capture of the first image, wherein the second image is an image that precedes the first image in the image sequence containing the first image and is separated from the first image by a preset number of frames; a conversion module configured to convert, according to the pose change information acquired by the second acquisition module, second coordinates of a second target in a second camera coordinate system corresponding to the second image into third coordinates in the first camera coordinate system, wherein the second target is the target in the second image corresponding to the first target; and a third determination module configured to determine, based on the first coordinates determined by the second determination module and the third coordinates obtained by the conversion module, motion information of the first target within the time range from the capture time of the second image to the capture time of the first image.
- An apparatus for controlling a traveling object based on motion information of a target, comprising: a camera device arranged on the traveling object and configured to collect, while the traveling object is traveling, an image sequence of a scene outside the traveling object; a motion information detection apparatus configured to take at least one frame in the image sequence as the first image and at least one frame that precedes the first image in the image sequence and is separated from the first image by a preset number of frames as the second image, and to determine the motion information of a target in the scene, the motion information detection apparatus comprising the apparatus according to any one of claims 10-16; and a control apparatus configured to generate, according to the motion information of the target detected by the motion information detection apparatus, a control instruction for controlling the traveling state of the traveling object.
- A computer-readable storage medium storing a computer program for executing the method according to any one of claims 1-7.
- An electronic device, comprising: a processor; and a memory for storing instructions executable by the processor, wherein the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method according to any one of claims 1-7.
Priority Applications (2)

| Application Number | Publication | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| JP2022557731A | JP7306766B2 (ja) | 2021-04-07 | 2022-02-18 | ターゲット動き情報検出方法、装置、機器及び媒体 (Target motion information detection method, apparatus, device and medium) |
| EP22783799.4A | EP4246437A1 (en) | 2021-04-07 | 2022-02-18 | Method and apparatus for detecting motion information of target, and device and medium |
Applications Claiming Priority (2)

| Application Number | Publication | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| CN202110373003.XA | CN113096151B (zh) | 2021-04-07 | 2021-04-07 | 对目标的运动信息进行检测的方法和装置、设备和介质 (Method and apparatus, device and medium for detecting motion information of a target) |
| CN202110373003.X | | 2021-04-07 | | |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| WO2022213729A1 (zh) | 2022-10-13 |
Family
ID=76674988
Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2022/076765 | WO2022213729A1 (zh) | 2021-04-07 | 2022-02-18 |

Country Status (4)

| Country | Link |
|---|---|
| EP (1) | EP4246437A1 (zh) |
| JP (1) | JP7306766B2 (zh) |
| CN (1) | CN113096151B (zh) |
| WO (1) | WO2022213729A1 (zh) |
Also Published As

| Publication Number | Publication Date |
|---|---|
| CN113096151A (zh) | 2021-07-09 |
| JP2023523527A (ja) | 2023-06-06 |
| JP7306766B2 (ja) | 2023-07-11 |
| CN113096151B (zh) | 2022-08-09 |
| EP4246437A1 (en) | 2023-09-20 |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | ENP | Entry into the national phase | Ref document number: 2022557731; Country of ref document: JP; Kind code of ref document: A |
| | WWE | WIPO information: entry into national phase | Ref document number: 17907662; Country of ref document: US |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22783799; Country of ref document: EP; Kind code of ref document: A1 |
| 20230614 | ENP | Entry into the national phase | Ref document number: 2022783799; Country of ref document: EP; Effective date: 20230614 |
| | NENP | Non-entry into the national phase | Ref country code: DE |