WO2023273467A1 - Method and apparatus for determining true value data, method and apparatus for neural network training, and method and apparatus for driving control - Google Patents

Method and apparatus for determining true value data, method and apparatus for neural network training, and method and apparatus for driving control Download PDF

Info

Publication number
WO2023273467A1
WO2023273467A1 (PCT application PCT/CN2022/084393, CN2022084393W)
Authority
WO
WIPO (PCT)
Prior art keywords
image
state data
data
detected
target
Prior art date
Application number
PCT/CN2022/084393
Other languages
English (en)
Chinese (zh)
Inventor
罗铨
鞠孝亮
蒋沁宏
李弘扬
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司 filed Critical 上海商汤智能科技有限公司
Publication of WO2023273467A1 publication Critical patent/WO2023273467A1/fr

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W30/00Purposes of road vehicle drive control systems not related to the control of a particular sub-unit, e.g. of systems using conjoint control of vehicle sub-units
    • B60W30/08Active safety systems predicting or avoiding probable or impending collision or attempting to minimise its consequences
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • the present disclosure relates to the technical field of image processing, and in particular, to a method, device, electronic device, and storage medium for determining truth data, neural network training, and driving control.
  • A neural network is a machine learning model: a mathematical model that imitates the behavioral characteristics of biological neural networks and performs distributed, parallel information processing.
  • training samples can be used to train the neural network, and the trained neural network can be used for image recognition, image classification, and the like.
  • the neural network can be applied in the field of driving, and the scene images collected on the driving device can be detected through the trained neural network.
  • the present disclosure at least provides a method, device, electronic device, and storage medium for determining truth data, training neural networks, and driving control.
  • the present disclosure provides a method for determining truth value data, including:
  • wherein the continuous state data includes state data of the object to be detected at a plurality of point cloud collection moments, determined based on point cloud data collected by at least one radar device;
  • in this way, the target state data of the object to be detected at the image acquisition time corresponding to the image to be processed can be predicted. Since the continuous state data includes state data at multiple point cloud acquisition moments, the target state data of the object to be detected at the image acquisition time corresponding to the image to be processed can be predicted more accurately based on the state data at the multiple point cloud acquisition moments corresponding to the object to be detected; this eases the deviation in the state data of the object to be detected caused by the inconsistency between the image acquisition time and the point cloud acquisition time and improves the accuracy of the target state data.
  • the determining the continuous state data that matches the object to be detected included in the image to be processed includes:
  • to determine the target object matching the object to be detected, the Euclidean distance between each candidate object and the object to be detected can be determined using the intermediate state data corresponding to each candidate object and the image to be processed, and the candidate object with the smallest Euclidean distance can be confirmed as the target object matching the object to be detected; the continuous state data corresponding to that target object is then determined as the continuous state data of the object to be detected.
  • the determining the continuous state data that matches the object to be detected included in the image to be processed includes:
  • the target object can be tracked in real time through the point cloud data, that is, the continuous state data of the target object in the point cloud data can be determined; therefore, the target point cloud data associated with the image to be processed can be determined based on the image acquisition time corresponding to the image to be processed and the corresponding point cloud acquisition times, the target object matching the object to be detected can then be determined in the target point cloud data, and the continuous state data corresponding to that target object can be used as the continuous state data of the object to be detected.
  • the predicting the target state data of the object to be detected at the time of image acquisition corresponding to the image to be processed based on the continuous state data corresponding to the object to be detected includes:
  • if the point cloud collection time corresponding to any state data is consistent with the image collection time, that state data is determined as the target state data corresponding to the object to be detected;
  • if the point cloud collection time corresponding to each state data is inconsistent with the image collection time, and the image collection time lies between any two of the multiple point cloud collection times, the target state data corresponding to the object to be detected is predicted based on the state data corresponding to those two point cloud collection times.
  • the accuracy of the generated target state data can be improved.
  • the method also includes:
  • when the image acquisition time is after the latest of the multiple point cloud acquisition times, prediction data of the object to be detected under each constructed motion model, together with the confidence corresponding to each prediction data, is generated based on the continuous state data corresponding to the image to be processed and each constructed motion model, and the target state data corresponding to the object to be detected is predicted on that basis.
  • the method further includes:
  • the target state data is adjusted based on the measurement state data to obtain adjusted target state data.
  • the target point cloud data matching the object to be detected can thus be used to determine the measurement state data of the object to be detected; since the accuracy of the measurement state data is high, the target state data can be adjusted based on the measurement state data to obtain the adjusted target state data, so that the accuracy of the adjusted target state data is relatively high.
  • the present disclosure provides a neural network training method, including:
  • a neural network is trained to obtain a target neural network.
  • the present disclosure provides a driving control method, including:
  • the target detection neural network is used to detect the road image to obtain the state data of the target object included in the road image, wherein the target detection neural network is obtained by training with sample data, and the true value data corresponding to the sample data is determined by using the method for determining true value data described in any one of the first aspect;
  • the traveling device is controlled based on the state data of the target object included in the road image.
  • the present disclosure provides a device for determining truth value data, including:
  • the first acquisition module is used to acquire the image to be processed collected by the image acquisition device and the point cloud data collected by at least one radar device;
  • a determining module configured to determine continuous state data that matches the object to be detected included in the image to be processed; wherein, the continuous state data includes the to-be-detected object determined based on point cloud data collected by at least one radar device The state data of the object at multiple point cloud collection moments;
  • a predicting module configured to predict target state data of the object to be detected at an image acquisition moment corresponding to the image to be processed based on the continuous state data corresponding to the object to be detected.
  • the present disclosure provides a neural network training device, including:
  • the second acquisition module is configured to acquire sample data, wherein the true value data corresponding to the sample data is determined by using the method for determining true value data in the first aspect;
  • the training module is used to train the neural network based on the sample data to obtain the target neural network.
  • the present disclosure provides a driving control device, including:
  • the third acquisition module is used to acquire road images collected by the driving device during driving;
  • a detection module configured to use a target detection neural network to detect the road image to obtain state data of target objects included in the road image, wherein the target detection neural network is obtained by using sample data training, and the The true value data corresponding to the sample data is determined by using the method for determining the true value data described in any one of the first aspect;
  • a control module configured to control the traveling device based on the state data of the target object included in the road image.
  • the present disclosure provides an electronic device, including a processor, a memory, and a bus; the memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor and the memory communicate with each other through the bus, and when the machine-readable instructions are executed by the processor, the steps of the method for determining truth value data described in the first aspect above or any implementation thereof are executed, or the steps of the neural network training method described in the second aspect above are executed, or the steps of the driving control method described in the third aspect above are executed.
  • the present disclosure provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the method described in the first aspect above or any implementation thereof are executed.
  • FIG. 1 shows a schematic flowchart of a method for determining truth value data provided by an embodiment of the present disclosure
  • FIG. 2 shows a schematic flowchart of a neural network training method provided by an embodiment of the present disclosure
  • Fig. 3 shows a schematic flowchart of a driving control method provided by an embodiment of the present disclosure
  • Fig. 4 shows a schematic structural diagram of a device for determining truth value data provided by an embodiment of the present disclosure
  • FIG. 5 shows a schematic diagram of the architecture of a neural network training device provided by an embodiment of the present disclosure
  • Fig. 6 shows a schematic structural diagram of a driving control device provided by an embodiment of the present disclosure
  • FIG. 7 shows a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • FIG. 8 shows a schematic structural diagram of another electronic device provided by an embodiment of the present disclosure.
  • FIG. 9 shows a schematic structural diagram of another electronic device provided by an embodiment of the present disclosure.
  • the training samples can be used to train the neural network, and the trained neural network can be used for image recognition and image classification.
  • the neural network can be applied in the field of driving, and the scene images collected on the driving device can be detected through the trained neural network. Therefore, how to obtain the true value data of training samples quickly, in real time, and in large quantities has become an important issue in the process of neural network research and development.
  • the true value data of an image can be obtained through coordination between positioning information and vehicles to be tested. For example, multiple vehicles to be tested can be placed around a target vehicle, the positioning information of each vehicle to be tested can be determined, and the image acquisition equipment installed on the target vehicle can be controlled to collect sample images; the true value data of each vehicle to be tested in the sample image is then determined according to the determined positioning information of the vehicles to be tested.
  • an embodiment of the present disclosure provides a method for determining truth value data.
  • the execution subject of the truth data determination method, the neural network training method, and the driving control method provided by the embodiments of the present disclosure is generally a computer device with certain computing power; the computer device includes, for example, a terminal device, a server, or another processing device, where the terminal device may be user equipment (User Equipment, UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (Personal Digital Assistant, PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, etc.
  • the method for determining the truth data, the neural network training method, and the driving control method can be implemented by calling the computer-readable instructions stored in the memory by the processor.
  • FIG. 1 it is a schematic flowchart of a method for determining true value data provided by an embodiment of the present disclosure.
  • the method for determining true value data includes S101-S103, wherein:
  • S102: Determine the continuous state data that matches the object to be detected included in the image to be processed; wherein the continuous state data includes state data of the object to be detected at multiple point cloud acquisition moments, determined based on the point cloud data collected by at least one radar device;
  • in this way, the target state data of the object to be detected at the image acquisition time corresponding to the image to be processed can be predicted. Since the continuous state data includes state data at multiple point cloud acquisition moments, the target state data of the object to be detected at the image acquisition time corresponding to the image to be processed can be predicted more accurately based on the state data at the multiple point cloud acquisition moments corresponding to the object to be detected; this eases the deviation in the state data of the object to be detected caused by the inconsistency between the image acquisition time and the point cloud acquisition time and improves the accuracy of the target state data.
  • the image acquisition device may be any sensor capable of image acquisition, and at least one radar device may be a lidar device, a millimeter-wave radar device, or the like.
  • an image acquisition device, a lidar device, and a millimeter-wave radar device can be installed on the traveling device; during the movement of the traveling device, the image acquisition device is controlled to collect images to be processed in real time, and the lidar device and the millimeter-wave radar device are controlled to collect point cloud data in real time, so that the execution subject can obtain the image to be processed collected by the image acquisition device, the first point cloud data collected by the lidar device, and the second point cloud data collected by the millimeter-wave radar device.
  • the image to be processed corresponds to a first time stamp, and the first time stamp can represent the image acquisition time corresponding to the image to be processed;
  • the point cloud data corresponds to a second time stamp, and the second time stamp can represent the point cloud acquisition time corresponding to the point cloud data.
  • the image to be processed may include one or more objects to be detected.
  • the objects to be detected may be motor vehicles, non-motor vehicles, pedestrians, animals, and the like.
  • the state data of the object to be detected may include size data, position data, moving speed, orientation data, etc. of the three-dimensional detection frame corresponding to the object to be detected.
  • when the radar equipment includes a lidar device and a millimeter-wave radar device, the point cloud data collected by the lidar device can be used to determine the state data of the object to be detected at multiple point cloud collection times (the point cloud collection times corresponding to the lidar device), and the state data at the point cloud collection times corresponding to the millimeter-wave radar device can likewise be determined from the point cloud data collected by the millimeter-wave radar device.
  • when point cloud data collected by the lidar device and point cloud data collected by the millimeter-wave radar device both exist at the same point cloud collection time, the point cloud data collected by the lidar device can be chosen to determine the state data, due to the higher accuracy of the lidar device.
  • the continuous state data corresponding to the object to be detected can be stored in a state buffer, and the state buffer can store state data corresponding to the object to be detected at multiple point cloud acquisition moments within the target time period.
  • the state buffer may store the state data corresponding to the object to be detected at multiple point cloud collection moments within the last 10 seconds.
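  • as a non-limiting illustration only, the following Python sketch shows one way such a state buffer could be organized, assuming a fixed 10-second window and a simple dictionary-based state record (the class and field names are illustrative and not part of the disclosure):

```python
from collections import deque

class StateBuffer:
    """Stores an object's state data at successive point cloud collection moments,
    keeping only entries within a sliding time window (e.g. the last 10 seconds)."""

    def __init__(self, window_seconds: float = 10.0):
        self.window = window_seconds
        self.entries = deque()  # each entry: (timestamp, state_dict)

    def add(self, timestamp: float, state: dict) -> None:
        self.entries.append((timestamp, state))
        # Drop state data older than the target time period.
        while self.entries and timestamp - self.entries[0][0] > self.window:
            self.entries.popleft()

    def continuous_state_data(self):
        """Returns the retained (timestamp, state) pairs, i.e. the continuous state data."""
        return list(self.entries)
```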
  • the continuous state data matching the object to be detected included in the image to be processed can be determined according to the following two methods:
  • Step A1 for each object to be detected in the image to be processed, based on the stored candidate state data corresponding to each candidate object, determine the intermediate state data of the candidate object at the image acquisition time corresponding to the image to be processed;
  • Step A2 based on the intermediate state data corresponding to each candidate object and the image to be processed, determine the target object matching the object to be detected from each candidate object, and determine the continuous state data corresponding to the target object as the continuous state data of the object to be detected .
  • the candidate object may be a historical object to be detected located in the historical image to be processed (historical image collected before the image to be processed in the current frame).
  • the candidate state data corresponding to the candidate object includes the state data at each point cloud acquisition time corresponding to the candidate object determined based on the point cloud data.
  • if the candidate state data corresponding to the candidate object includes state data whose point cloud acquisition time is consistent with the image acquisition time, that state data is determined as the intermediate state data corresponding to the candidate object;
  • otherwise, the intermediate state data corresponding to the candidate object may be predicted based on the candidate state data corresponding to the candidate object.
  • for example, suppose the image acquisition time corresponding to the image to be processed is 10:10:10, and the candidate state data corresponding to the candidate object does not include state data at 10:10:10. The size data of the candidate object at 10:10:09 can be taken as its size data at 10:10:10; according to the moving speed and moving state of the candidate object at 10:10:09 (such as uniform motion, uniform acceleration, etc.), the distance moved from 10:10:09 to 10:10:10 and the moving speed of the candidate object at 10:10:10 can be determined; then, based on the moving distance and the position data and orientation data of the candidate object at 10:10:09, the position data and orientation data of the candidate object at 10:10:10 can be predicted, that is, the intermediate state data of the candidate object at 10:10:10 (the image acquisition time) is predicted.
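  • a minimal Python sketch of this kind of intermediate-state prediction is given below; it assumes a uniform (constant-velocity) motion state and a simple state record with position, speed, heading, and size fields, all of which are illustrative names rather than elements of the disclosure:

```python
import math

def predict_intermediate_state(last_state: dict, last_time: float, image_time: float) -> dict:
    """Dead-reckons a candidate object's state from its last point cloud collection
    moment to the image acquisition moment, assuming uniform (constant-velocity) motion."""
    dt = image_time - last_time
    distance = last_state["speed"] * dt  # distance moved between the two moments
    heading = last_state["heading"]      # orientation kept unchanged under uniform motion
    return {
        "x": last_state["x"] + distance * math.cos(heading),
        "y": last_state["y"] + distance * math.sin(heading),
        "speed": last_state["speed"],    # unchanged for uniform motion
        "heading": heading,
        "size": last_state["size"],      # size data is carried over unchanged
    }

# Example: state at 10:10:09 propagated to the image time 10:10:10 (one second later).
state_0909 = {"x": 12.0, "y": 3.5, "speed": 8.0, "heading": 0.0, "size": (4.5, 1.8, 1.5)}
intermediate = predict_intermediate_state(state_0909, last_time=9.0, image_time=10.0)
```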
  • the target object matching the object to be detected can be determined from each candidate object.
  • the Euclidean distance between each candidate object and the object to be detected can be determined based on the intermediate state data corresponding to each candidate object and the image to be processed. For example, the position information of the object to be detected in the image to be processed can be converted to the coordinate system corresponding to the real scene to generate the position information of the object to be detected in that coordinate system, and the Euclidean distance between the candidate object and the object to be detected can then be generated from this position information and the position information indicated by the intermediate state data of the candidate object.
  • alternatively, the position information indicated by the intermediate state data of the candidate object can be converted to the image coordinate system corresponding to the image to be processed to generate converted position information, and the Euclidean distance between the candidate object and the object to be detected can be generated based on the converted position information corresponding to the candidate object and the position information of the object to be detected in the image to be processed.
  • the candidate object corresponding to the minimum Euclidean distance can then be determined as the target object matching the object to be detected.
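  • the matching by minimum Euclidean distance described above could look roughly like the following Python sketch, assuming the positions of the detected object and of each candidate's intermediate state have already been converted into the same (real-scene) coordinate system; identifiers are illustrative only:

```python
import math

def match_by_euclidean_distance(detected_xy, candidates):
    """Picks, from the candidate objects, the target object closest to the detected object.
    `detected_xy` is the object's position in the real-scene coordinate system;
    `candidates` maps a candidate id to its intermediate-state position (x, y)."""
    best_id, best_dist = None, float("inf")
    for cand_id, (cx, cy) in candidates.items():
        dist = math.hypot(detected_xy[0] - cx, detected_xy[1] - cy)
        if dist < best_dist:
            best_id, best_dist = cand_id, dist
    return best_id, best_dist

# Example usage: candidate 7 is the nearest, so its continuous state data would be reused.
target_id, _ = match_by_euclidean_distance((10.2, 3.4), {7: (10.0, 3.5), 9: (25.0, -1.0)})
```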
  • alternatively, a first bird's-eye view including the candidate object can be determined, that is, a first bird's-eye view corresponding to each candidate object is obtained, and a second bird's-eye view corresponding to the image to be processed is determined.
  • based on the first bird's-eye view and the second bird's-eye view, the Intersection over Union (IoU) between the candidate object and the object to be detected can be determined;
  • the target object matching the object to be detected is then determined according to the intersection over union of each candidate object.
  • for example, the candidate object corresponding to the maximum IoU may be determined as the target object matching the object to be detected.
  • the continuous state data corresponding to the target object can be determined as the continuous state data of the object to be detected.
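  • a simplified Python sketch of the bird's-eye-view IoU matching is shown below; it assumes axis-aligned bird's-eye-view boxes in the form (x1, y1, x2, y2), which is a simplification of the views described above, and the function names are illustrative only:

```python
def bev_iou(box_a, box_b):
    """Intersection over Union of two axis-aligned bird's-eye-view boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_by_iou(detected_box, candidate_boxes):
    """Returns the candidate id whose bird's-eye-view box has the maximum IoU with the detection."""
    return max(candidate_boxes, key=lambda cid: bev_iou(detected_box, candidate_boxes[cid]))
```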
  • in this way, to determine the target object matching the object to be detected, the Euclidean distance between each candidate object and the object to be detected can be determined using the intermediate state data corresponding to each candidate object and the image to be processed, and the candidate object with the smallest Euclidean distance can be confirmed as the target object matching the object to be detected; the continuous state data corresponding to that target object is then determined as the continuous state data of the object to be detected.
  • Step B1 based on the image acquisition time corresponding to the image to be processed and the point cloud acquisition time corresponding to each frame of point cloud data, determine the target point cloud data associated with the image to be processed;
  • Step B2 for each object to be detected in the image to be processed, determine the target object in the target point cloud data that matches the object to be detected; and use the continuous state data corresponding to the target object as the continuous state data of the object to be detected.
  • the target point cloud data associated with the image to be processed may be determined based on the image acquisition time corresponding to the image to be processed and the point cloud acquisition time corresponding to each frame of point cloud data. For example, the point cloud data whose difference between the point cloud collection time and the image collection time is smaller than the set time difference may be determined as the target point cloud data associated with the image to be processed. And/or, the point cloud data with the smallest difference between the point cloud collection time and the image collection time may be determined as the target point cloud data associated with the image to be processed.
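  • one possible Python sketch of this timestamp-based association, covering both the threshold rule and the closest-frame rule mentioned above, is given below (the 0.05 s threshold is a placeholder value, not taken from the disclosure):

```python
def associate_point_cloud_frames(image_time, frame_times, max_diff=0.05):
    """Selects target point cloud frames for an image: frames whose collection time differs
    from the image acquisition time by less than `max_diff` seconds, falling back to the
    single closest frame if none is within the threshold."""
    within = [t for t in frame_times if abs(t - image_time) < max_diff]
    if within:
        return within
    return [min(frame_times, key=lambda t: abs(t - image_time))]

# Example: an image captured at t = 12.34 s against lidar frames every 0.1 s.
frames = [12.2, 12.3, 12.4, 12.5]
targets = associate_point_cloud_frames(12.34, frames)  # -> [12.3] under a 0.05 s threshold
```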
  • the target object in the target point cloud data that matches the object to be detected can be determined based on the Euclidean distance or IoU.
  • the process of determining the target object in the target point cloud data that matches the object to be detected can refer to the description in the first method, and will not be described in detail here.
  • the continuous state data of the target object in the point cloud data can be used as the continuous state data of the object to be detected.
  • in implementation, the process of the first method can be used directly to determine the continuous state data matching the object to be detected included in the image to be processed; or the process of the second method can be used directly; or, based on the image acquisition time corresponding to the image to be processed and the point cloud acquisition time corresponding to each frame of point cloud data, it can first be determined whether target point cloud data associated with the image to be processed exists: if it exists, the process of the second method is used to determine the continuous state data matching the object to be detected included in the image to be processed, and if it does not exist, the process of the first method is used.
  • the target object can be tracked in real time through the point cloud data, that is, the continuous state data of the target object in the point cloud data can be determined; therefore, the target point cloud data associated with the image to be processed can be determined based on the image acquisition time corresponding to the image to be processed and the corresponding point cloud acquisition times, the target object matching the object to be detected can then be determined in the target point cloud data, and the continuous state data corresponding to that target object can be used as the continuous state data of the object to be detected.
  • the target state data of the object to be detected at the image acquisition moment corresponding to the image to be processed can then be predicted based on the continuous state data corresponding to the object to be detected.
  • predicting the target state data of the object to be detected at the time of image acquisition corresponding to the image to be processed may include the following situations:
  • Case 1: in the continuous state data corresponding to the object to be detected, if there is any state data whose corresponding point cloud acquisition time is consistent with the image acquisition time, that state data is determined as the target state data corresponding to the object to be detected;
  • Case 2: if the point cloud acquisition time corresponding to each state data is inconsistent with the image acquisition time, and the image acquisition time lies between any two of the multiple point cloud acquisition times, the target state data corresponding to the object to be detected is predicted based on the state data corresponding to those two point cloud acquisition times;
  • Case 3: when the image acquisition time is after the latest of the multiple point cloud acquisition times, the prediction data of the object to be detected under each constructed motion model, and the confidence corresponding to each prediction data, are generated based on the continuous state data corresponding to the image to be processed and each constructed motion model; the target state data corresponding to the object to be detected is then predicted based on the prediction data under each motion model and the corresponding confidences.
  • each state data included in the continuous state data corresponding to the object to be detected is determined according to the point cloud data collected by the radar equipment, each point cloud data corresponds to a point cloud collection time, and that point cloud collection time is taken as the point cloud collection time corresponding to the state data; it can therefore be determined whether the continuous state data corresponding to the object to be detected contains state data whose point cloud collection time is consistent with the image collection time, and if so, in Case 1 the state data whose point cloud collection time is consistent with the image collection time can be determined as the target state data corresponding to the object to be detected.
  • when the image acquisition time is between any two of the multiple point cloud acquisition times, that is, between the earliest and the latest moments indicated by the multiple point cloud acquisition times, the target state data corresponding to the object to be detected can be predicted by interpolation.
  • for example, suppose the continuous state data corresponding to the object to be detected includes state data at multiple point cloud collection moments of 10 minutes 10 seconds, 10 minutes 12 seconds, 10 minutes 15 seconds, and 10 minutes 20 seconds; if the image acquisition time is 10 minutes 14 seconds, the state data corresponding to 10 minutes 12 seconds and the state data corresponding to 10 minutes 15 seconds can be used to predict the target state data corresponding to the object to be detected.
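  • a minimal Python sketch of such interpolation is shown below, assuming the state fields are numeric and linear interpolation is acceptable between the two bracketing point cloud collection moments; the field names and values are illustrative only:

```python
def interpolate_state(t, t0, state0, t1, state1):
    """Linearly interpolates numeric state fields (e.g. position, speed) between two
    point cloud collection moments t0 < t < t1 to estimate the state at the image time t."""
    alpha = (t - t0) / (t1 - t0)
    return {key: state0[key] + alpha * (state1[key] - state0[key]) for key in state0}

# Example from the text: moments at 10 min 12 s and 10 min 15 s, image time 10 min 14 s.
s12 = {"x": 20.0, "y": 4.0, "speed": 9.0}
s15 = {"x": 47.0, "y": 4.3, "speed": 9.6}
target_state = interpolate_state(614.0, 612.0, s12, 615.0, s15)  # alpha = 2/3
```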
  • when the image acquisition time is after the latest of the multiple point cloud acquisition times, the prediction data of the object to be detected under each motion model, and the confidence corresponding to each prediction data, can be generated based on the continuous state data corresponding to the image to be processed and the constructed motion models. For example, a motion model can be a uniform motion model, a uniform acceleration motion model, a uniform deceleration motion model, etc., and the motion models can be constructed as needed.
  • the Kalman filter algorithm can be used to predict the predicted data of the object to be detected under each motion model and the confidence corresponding to each predicted data .
  • the target state data corresponding to the object to be detected can then be predicted based on the predicted data under each motion model and the confidence levels corresponding to the predicted data. For example, the prediction data with the highest confidence can be determined as the target state data corresponding to the object to be detected.
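  • the following Python sketch illustrates the idea of predicting under several motion models and keeping the prediction with the highest confidence; it is a deliberately simplified stand-in (one-dimensional state, hand-set confidences) for the per-model Kalman prediction described above, and all names are illustrative:

```python
def predict_with_motion_models(state, dt, model_confidences):
    """Extrapolates a simple state (position p, velocity v, acceleration a) under several
    motion models and returns the prediction from the model with the highest confidence.
    This is a simplified stand-in for the per-model Kalman prediction step."""
    p, v, a = state["p"], state["v"], state["a"]
    predictions = {
        "uniform":       {"p": p + v * dt,                     "v": v,          "a": 0.0},
        "uniform_accel": {"p": p + v * dt + 0.5 * a * dt ** 2, "v": v + a * dt, "a": a},
        "uniform_decel": {"p": p + v * dt - 0.5 * abs(a) * dt ** 2,
                          "v": max(0.0, v - abs(a) * dt),      "a": -abs(a)},
    }
    best_model = max(model_confidences, key=model_confidences.get)
    return best_model, predictions[best_model]

best, target_state = predict_with_motion_models(
    {"p": 5.0, "v": 10.0, "a": 1.0}, dt=0.08,
    model_confidences={"uniform": 0.5, "uniform_accel": 0.3, "uniform_decel": 0.2})
```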
  • further, the target point cloud data associated with the image to be processed can be determined based on the image acquisition time corresponding to the image to be processed and the point cloud acquisition time corresponding to each frame of point cloud data; the target point cloud data matched with the object to be detected is then used to determine the measurement state data of the object to be detected; and, based on the measurement state data of the object to be detected, the prediction data under each motion model and the confidence corresponding to the prediction data are adjusted, and the target state data corresponding to the object to be detected is determined based on the adjusted confidences.
  • the predicted data corresponding to the adjusted maximum confidence is determined as the target state data corresponding to the object to be detected.
  • Case 4: when the image acquisition time is before the earliest of the multiple point cloud acquisition times, the image to be processed is discarded, that is, the true value data of the image to be processed is not determined.
  • if Case 4 occurs, the radar equipment can be inspected and maintained.
  • the accuracy of the generated target state data can be improved.
  • Step 1 Based on the image acquisition time corresponding to the image to be processed and the point cloud acquisition time corresponding to each frame of point cloud data, determine the target point cloud data associated with the image to be processed;
  • Step 2 using the target point cloud data matched with the object to be detected to determine the measurement state data of the object to be detected;
  • Step 3 Adjust the target state data based on the measurement state data to obtain adjusted target state data.
  • the point cloud data whose difference between the point cloud collection time and the image collection time is smaller than the set time difference may be determined as the target point cloud data associated with the image to be processed. And/or, the point cloud data with the smallest difference between the point cloud collection time and the image collection time may be determined as the target point cloud data associated with the image to be processed.
  • the target point cloud data matched with the object to be detected can be used to determine the measurement state data of the object to be detected, where the measurement state data can include at least one of the following: the local point cloud data, within the target point cloud data associated with the image to be processed, that is located on the object to be detected; and the measured position, measured size, measured orientation, measured speed, etc. of the three-dimensional detection frame corresponding to the object to be detected.
  • the target state data can be adjusted based on the measurement state data to obtain the adjusted target state data.
  • for example, the update process in the Kalman filter algorithm can be used to adjust the target state data based on the measurement state data to obtain the adjusted target state data.
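  • as an illustration of the adjustment idea, the following Python sketch applies a standard scalar Kalman update to a single predicted quantity; it is a simplification of the full update described above, with placeholder variances:

```python
def kalman_update(pred_mean, pred_var, measured, meas_var):
    """Standard scalar Kalman update: blends the predicted target state with the
    measurement state data according to their variances and returns the adjusted state."""
    gain = pred_var / (pred_var + meas_var)            # Kalman gain
    adjusted_mean = pred_mean + gain * (measured - pred_mean)
    adjusted_var = (1.0 - gain) * pred_var
    return adjusted_mean, adjusted_var

# Example: predicted x-position 20.0 m (variance 0.5) adjusted with a lidar measurement
# of 20.6 m (variance 0.1); the result is pulled strongly toward the measurement.
x_adj, var_adj = kalman_update(20.0, 0.5, 20.6, 0.1)   # x_adj = 20.5
```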
  • the target point cloud data matching the object to be detected can thus be used to determine the measurement state data of the object to be detected; since the accuracy of the measurement state data is high, the target state data can be adjusted based on the measurement state data to obtain the adjusted target state data, so that the accuracy of the adjusted target state data is relatively high.
  • the following takes radar equipment including a lidar device and a millimeter-wave radar device as an example to illustrate the method for determining the true value data:
  • in the first step, the image to be processed collected by the image acquisition device, the first point cloud data collected by the lidar device, and the second point cloud data collected by the millimeter-wave radar device can be obtained.
  • in the second step, based on the image acquisition time of the image to be processed, the first point cloud acquisition time corresponding to the first point cloud data, and the second point cloud acquisition time corresponding to the second point cloud data, it can be judged whether matching target first point cloud data and/or target second point cloud data exist; if so, the target first point cloud data and/or target second point cloud data are associated with the image to be processed. Based on the target first point cloud data and/or the target second point cloud data, the measurement state data of the object to be detected included in the image to be processed can be determined.
  • the third step is to determine the continuous state data matching the object to be detected included in the image to be processed. For a specific determination process, reference may be made to the above specific description about S102.
  • in the fourth step, among the state data in the continuous state data corresponding to the object to be detected, when there is state data whose corresponding point cloud acquisition time is consistent with the image acquisition time, that state data can be determined as the target state data corresponding to the object to be detected.
  • when the continuous state data corresponding to the object to be detected contains no state data whose corresponding point cloud acquisition time is consistent with the image acquisition time, and the image acquisition time lies between two of the multiple point cloud acquisition times, the interpolation method is used to predict the target state data corresponding to the object to be detected.
  • when the image acquisition time is after the latest of the multiple point cloud acquisition times, the prediction data of the object to be detected under each motion model, and the confidence corresponding to each prediction data, are generated; based on the prediction data under each motion model and the corresponding confidences, the target state data corresponding to the object to be detected is predicted.
  • when the image acquisition time is before the earliest of the multiple point cloud acquisition times, the image to be processed may be discarded, that is, the true value data of the image to be processed may not be determined.
  • a fifth step may also be included: the measurement state data corresponding to the object to be detected determined in the second step may be used to adjust the target state data generated in the fourth step (the target state data can be regarded as initial state data at this point), generating the adjusted target state data of the object to be detected at the image acquisition moment.
  • each object to be detected can be matched with multiple motion models, and each motion model corresponds to a Kalman filter; the Kalman filter can then be used to determine the state data of the object to be detected under each motion model and/or the confidence of each motion model.
  • the Kalman filter algorithm can be used to adjust the target state data using the measurement state data of the object to be detected, generating the adjusted target state data; in addition, the confidence of each motion model for the object to be detected can be updated, for example, the confidence of the motion model corresponding to the target state data can be increased while the confidences of the other motion models are decreased.
  • in detail, Kalman filtering can be used to predict the prediction data of the object to be detected under each motion model and the first confidence corresponding to the prediction data (which can be approximated as the first confidence of the motion model for the object to be detected); the measurement state data of the object to be detected is then used to adjust the generated prediction data under each motion model to obtain the adjusted state data and the second confidence corresponding to the adjusted state data (which can be approximated as the second confidence corresponding to the motion model); the maximum of the second confidences is determined, and the adjusted state data corresponding to that maximum is used as the adjusted target state data.
  • the confidence degree corresponding to the motion model may be generated, so as to update the confidence degree of the motion model.
  • the generated target state data corresponding to the object to be detected can also be stored in the state buffer corresponding to the object to be detected, and the state data stored in the state buffer can be updated, so that the continuous state data in the updated state buffer can be used to determine the target state data of the object to be detected in subsequently collected images to be processed.
  • FIG. 2 it is a schematic flowchart of a neural network training method provided by an embodiment of the present disclosure.
  • the method includes S201-S202, wherein:
  • the sample data may be collected sample images, and the true value data corresponding to the sample images may be determined by using the method for determining true value data described in the above-mentioned embodiments.
  • the sample data can then be used to train the neural network until the accuracy of the trained neural network is greater than a set accuracy threshold, or until the loss value of the trained neural network is less than a set loss threshold, to obtain the target neural network.
  • the network structure of the neural network can be determined as required.
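  • purely as an illustration of a training loop with the stopping criteria described above, the following sketch assumes a PyTorch network, data loader, loss function, and accuracy function; all thresholds and names are placeholders, not values from the disclosure:

```python
import torch

def train_until_threshold(network, loader, loss_fn, accuracy_fn,
                          acc_threshold=0.95, loss_threshold=0.01, max_epochs=100):
    """Trains the neural network on sample data whose labels are the determined true value
    data, stopping once accuracy exceeds the set threshold or the loss falls below the
    set loss threshold (thresholds here are placeholder values)."""
    optimizer = torch.optim.Adam(network.parameters(), lr=1e-4)
    for epoch in range(max_epochs):
        epoch_loss, epoch_acc = 0.0, 0.0
        for images, true_value_data in loader:
            optimizer.zero_grad()
            predictions = network(images)
            loss = loss_fn(predictions, true_value_data)
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
            epoch_acc += accuracy_fn(predictions, true_value_data)
        epoch_loss /= len(loader)
        epoch_acc /= len(loader)
        if epoch_acc > acc_threshold or epoch_loss < loss_threshold:
            break
    return network  # the target neural network
```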
  • the true value data of the sample data can be determined more accurately, and when the sample data is used to train the neural network, the performance of the obtained target neural network can be improved.
  • FIG. 3 it is a schematic flowchart of a driving control method provided by an embodiment of the present disclosure.
  • the method includes S301-S303, wherein:
  • S302: use the target detection neural network to detect the road image to obtain the state data of the target object included in the road image, wherein the target detection neural network is obtained by training with sample data, and the true value data corresponding to the sample data is determined by using the method for determining true value data described in the above embodiments;
  • the driving device may be a self-driving vehicle, a vehicle equipped with an Advanced Driving Assistance System (Advanced Driving Assistance System, ADAS), or a robot.
  • the road image may be an image collected by the driving device in real time during driving.
  • the target object may be any object that may appear on the road.
  • the target object may be animals and pedestrians appearing on the road, or other vehicles (including motor vehicles and non-motor vehicles) on the road.
  • the state data can include size data, position data, orientation data, moving speed, etc.
  • the size data can be the length, width and height of the three-dimensional detection frame of the target object
  • the position data can be the coordinate information of the center point of the three-dimensional detection frame of the target object in the camera coordinate system, and the orientation data can be the deviation angle between a reference plane of the target object and a preset direction.
  • when controlling the driving device, the driving device can be controlled to accelerate, decelerate, turn, brake, etc., or voice prompt information can be played to prompt the driver to control the driving device to accelerate, decelerate, turn, brake, etc.
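  • as a toy illustration only (thresholds and rules are placeholders, not part of the disclosure), the following Python sketch maps the state data of detected target objects to a simple control decision:

```python
def choose_control_action(ego_speed, objects, safe_gap=10.0):
    """Toy decision rule: given the state data (distance ahead and speed) of detected
    target objects, brake or decelerate when an object is within the safe gap, otherwise
    maintain the current driving state. Thresholds are illustrative only."""
    ahead = [obj for obj in objects if 0.0 < obj["distance"] < safe_gap]
    if not ahead:
        return "maintain"
    closest = min(ahead, key=lambda obj: obj["distance"])
    if closest["distance"] < safe_gap * 0.4 or closest["speed"] < ego_speed * 0.5:
        return "brake"
    return "decelerate"

action = choose_control_action(
    ego_speed=12.0,
    objects=[{"distance": 6.5, "speed": 4.0}, {"distance": 30.0, "speed": 12.0}])
# -> "brake": the nearer object is inside the safe gap and much slower than the ego vehicle.
```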
  • the true value data of the sample data can be determined more accurately, and when the sample data is used to train the neural network, the performance of the obtained target detection neural network can be better.
  • when the target detection neural network with better performance is used to detect the road image, the state data of the target object can be determined more accurately; the driving device can then be controlled more accurately based on the state data of the target object included in the road image, improving driving safety.
  • the writing order of the steps does not imply a strict execution order and does not constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
  • the embodiment of the present disclosure also provides a device for determining truth value data; as shown in FIG. 4, the device includes a first acquisition module 401, a determining module 402, and a prediction module 403, specifically:
  • the first acquisition module 401 is configured to acquire the image to be processed collected by the image acquisition device and the point cloud data collected by at least one radar device;
  • a determining module 402 configured to determine continuous state data that matches the object to be detected included in the image to be processed; wherein, the continuous state data includes the to-be-detected object determined based on point cloud data collected by at least one radar device Detect the state data of the object at multiple point cloud acquisition moments;
  • the prediction module 403 is configured to predict the target state data of the object to be detected at the image acquisition moment corresponding to the image to be processed based on the continuous state data corresponding to the object to be detected.
  • the determining module 402 when determining the continuous state data matching the object to be detected included in the image to be processed, is configured to:
  • the determining module 402 when determining the continuous state data matching the object to be detected included in the image to be processed, is configured to:
  • the prediction module 403, when predicting the target state data of the object to be detected at the image acquisition moment corresponding to the image to be processed based on the continuous state data corresponding to the object to be detected, is configured to:
  • when the point cloud collection time corresponding to any state data is consistent with the image collection time, determine that state data as the target state data corresponding to the object to be detected;
  • when the point cloud collection time corresponding to each state data is inconsistent with the image collection time, and the image collection time lies between any two of the multiple point cloud collection times, predict the target state data corresponding to the object to be detected based on the state data corresponding to those two point cloud collection times.
  • the prediction module 403 is further configured to:
  • when the image acquisition time is after the latest of the multiple point cloud acquisition times, generate prediction data of the object to be detected under each constructed motion model, together with the confidence corresponding to each prediction data, based on the continuous state data corresponding to the image to be processed and each constructed motion model, and predict the target state data corresponding to the object to be detected on that basis.
  • in a possible implementation, the device further includes an adjusting module 404, configured to:
  • the target state data is adjusted based on the measurement state data to obtain adjusted target state data.
  • the embodiment of the present disclosure also provides a neural network training device; as shown in FIG. 5, the device includes a second acquisition module 501 and a training module 502, specifically:
  • the second acquisition module 501 is configured to acquire sample data, wherein the true-value data corresponding to the sample data is determined by using the method for determining true-value data described in the above-mentioned embodiment;
  • the training module 502 is configured to train a neural network based on the sample data to obtain a target neural network.
  • the embodiment of the present disclosure also provides a driving control device; as shown in FIG. 6, the device includes a third acquisition module 601, a detection module 602, and a control module 603, specifically:
  • a third acquiring module 601, configured to acquire road images collected by the traveling device during driving
  • the detection module 602 is configured to use a target detection neural network to detect the road image to obtain state data of target objects included in the road image, wherein the target detection neural network is obtained by using sample data training, and the The true value data corresponding to the sample data is determined by the method for determining the true value data described in the above embodiment;
  • a control module 603, configured to control the driving device based on the state data of the target object included in the road image.
  • the functions of the devices provided by the embodiments of the present disclosure, or the modules they include, can be used to execute the methods described in the above method embodiments; for their specific implementation, reference may be made to the description of the above method embodiments, which is not repeated here for brevity.
  • an embodiment of the present disclosure also provides an electronic device.
  • FIG. 7 it is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure, including a processor 701 , a memory 702 , and a bus 703 .
  • the memory 702 is used to store execution instructions, including a memory 7021 and an external memory 7022; the memory 7021 here is also called an internal memory, and is used to temporarily store calculation data in the processor 701 and exchange data with an external memory 7022 such as a hard disk.
  • the processor 701 exchanges data with the external memory 7022 through the memory 7021.
  • the processor 701 communicates with the memory 702 through the bus 703, so that the processor 701 executes the following instructions:
  • wherein the continuous state data includes state data of the object to be detected at a plurality of point cloud collection moments, determined based on point cloud data collected by at least one radar device;
  • an embodiment of the present disclosure also provides an electronic device.
  • FIG. 8 it is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure, including a processor 801 , a memory 802 , and a bus 803 .
  • the memory 802 is used to store execution instructions, including a memory 8021 and an external memory 8022; the memory 8021 here is also called an internal memory, and is used to temporarily store calculation data in the processor 801 and exchange data with an external memory 8022 such as a hard disk.
  • the processor 801 exchanges data with the external memory 8022 through the memory 8021.
  • the processor 801 communicates with the memory 802 through the bus 803, so that the processor 801 executes the following instructions:
  • a neural network is trained to obtain a target neural network.
  • an embodiment of the present disclosure also provides an electronic device.
  • FIG. 9 it is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure, including a processor 901 , a memory 902 , and a bus 903 .
  • the memory 902 is used to store execution instructions, including a memory 9021 and an external memory 9022; the memory 9021 here is also called an internal memory, and is used to temporarily store calculation data in the processor 901 and exchange data with an external memory 9022 such as a hard disk.
  • the processor 901 exchanges data with the external memory 9022 through the memory 9021.
  • the processor 901 communicates with the memory 902 through the bus 903, so that the processor 901 executes the following instructions:
  • the target detection neural network is used to detect the road image to obtain the state data of the target object included in the road image, wherein the target detection neural network is obtained by training with sample data, and the true value data corresponding to the sample data is determined by using the method for determining true value data described in the above embodiments;
  • the traveling device is controlled based on the state data of the target object included in the road image.
  • an embodiment of the present disclosure also provides a computer-readable storage medium, on which a computer program is stored.
  • when the computer program is run by a processor, the steps of the method for determining truth value data, the neural network training method, or the driving control method described in the above-mentioned method embodiments are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • the embodiment of the present disclosure also provides a computer program product; the computer program product carries program code, and the instructions included in the program code can be used to execute the steps of the method for determining truth value data, the neural network training method, or the driving control method; reference may be made to the above-mentioned method embodiments for details, which are not repeated here.
  • the above-mentioned computer program product may be specifically implemented by means of hardware, software or a combination thereof.
  • the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), and the like.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • if the functions are realized in the form of software functional units and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor.
  • based on this understanding, the technical solution of the present disclosure, in essence, or the part contributing to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the aforementioned storage media include: a USB flash drive, a mobile hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disc, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Automation & Control Theory (AREA)
  • Transportation (AREA)
  • Mechanical Engineering (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure relates to a method and apparatus for determining true value data, a method and apparatus for neural network training, a method and apparatus for driving control, and an electronic device and storage medium. The method for determining true value data comprises: acquiring an image to be processed collected by an image acquisition device, and point cloud data collected by at least one radar device; determining continuous state data matching an object to be detected included in said image, the continuous state data comprising state data of said object at a plurality of point cloud collection moments, determined on the basis of the point cloud data collected by the radar device(s); and, on the basis of the continuous state data corresponding to said object, predicting target state data of said object at an image collection moment corresponding to said image.
PCT/CN2022/084393 2021-06-30 2022-03-31 Procédé et appareil de détermination de données de valeur vraie, procédé et appareil d'apprentissage de réseau neuronal, et procédé et appareil de commande de déplacement WO2023273467A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110739051.6 2021-06-30
CN202110739051.6A CN113469042A (zh) 2021-06-30 2021-06-30 真值数据确定、神经网络训练、行驶控制方法及装置

Publications (1)

Publication Number Publication Date
WO2023273467A1 true WO2023273467A1 (fr) 2023-01-05

Family

ID=77876640

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/084393 WO2023273467A1 (fr) 2021-06-30 2022-03-31 Procédé et appareil de détermination de données de valeur vraie, procédé et appareil d'apprentissage de réseau neuronal, et procédé et appareil de commande de déplacement

Country Status (2)

Country Link
CN (1) CN113469042A (fr)
WO (1) WO2023273467A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117612070A (zh) * 2024-01-19 2024-02-27 福思(杭州)智能科技有限公司 静态真值数据的校正方法和装置、存储介质

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469042A (zh) * 2021-06-30 2021-10-01 上海商汤临港智能科技有限公司 真值数据确定、神经网络训练、行驶控制方法及装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109300143A (zh) * 2018-09-07 2019-02-01 百度在线网络技术(北京)有限公司 运动向量场的确定方法、装置、设备、存储介质和车辆
EP3438872A1 (fr) * 2017-08-04 2019-02-06 Bayerische Motoren Werke Aktiengesellschaft Procédé, appareil et programme informatique pour véhicule
CN110163904A (zh) * 2018-09-11 2019-08-23 腾讯大地通途(北京)科技有限公司 对象标注方法、移动控制方法、装置、设备及存储介质
CN112381873A (zh) * 2020-10-23 2021-02-19 北京亮道智能汽车技术有限公司 一种数据标注方法及装置
CN112926461A (zh) * 2021-02-26 2021-06-08 商汤集团有限公司 神经网络训练、行驶控制方法及装置
CN113469042A (zh) * 2021-06-30 2021-10-01 上海商汤临港智能科技有限公司 真值数据确定、神经网络训练、行驶控制方法及装置

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107025642B (zh) * 2016-01-27 2018-06-22 百度在线网络技术(北京)有限公司 基于点云数据的车辆轮廓检测方法和装置
CN109658418A (zh) * 2018-10-31 2019-04-19 百度在线网络技术(北京)有限公司 场景结构的学习方法、装置及电子设备
CN110008851B (zh) * 2019-03-15 2021-11-19 深兰科技(上海)有限公司 一种车道线检测的方法及设备
CN111427979B (zh) * 2020-01-15 2021-12-21 深圳市镭神智能系统有限公司 基于激光雷达的动态地图构建方法、系统及介质
CN111523600B (zh) * 2020-04-26 2023-12-19 上海商汤临港智能科技有限公司 神经网络训练、目标检测、及智能设备控制的方法及装置
CN112950622B (zh) * 2021-03-29 2023-04-18 上海商汤临港智能科技有限公司 一种目标检测方法、装置、计算机设备和存储介质
CN112990200A (zh) * 2021-03-31 2021-06-18 上海商汤临港智能科技有限公司 一种数据标注方法、装置、计算机设备及存储介质

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3438872A1 (fr) * 2017-08-04 2019-02-06 Bayerische Motoren Werke Aktiengesellschaft Procédé, appareil et programme informatique pour véhicule
CN109300143A (zh) * 2018-09-07 2019-02-01 百度在线网络技术(北京)有限公司 运动向量场的确定方法、装置、设备、存储介质和车辆
CN110163904A (zh) * 2018-09-11 2019-08-23 腾讯大地通途(北京)科技有限公司 对象标注方法、移动控制方法、装置、设备及存储介质
CN112381873A (zh) * 2020-10-23 2021-02-19 北京亮道智能汽车技术有限公司 一种数据标注方法及装置
CN112926461A (zh) * 2021-02-26 2021-06-08 商汤集团有限公司 神经网络训练、行驶控制方法及装置
CN113469042A (zh) * 2021-06-30 2021-10-01 上海商汤临港智能科技有限公司 真值数据确定、神经网络训练、行驶控制方法及装置

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117612070A (zh) * 2024-01-19 2024-02-27 福思(杭州)智能科技有限公司 静态真值数据的校正方法和装置、存储介质
CN117612070B (zh) * 2024-01-19 2024-05-03 福思(杭州)智能科技有限公司 静态真值数据的校正方法和装置、存储介质

Also Published As

Publication number Publication date
CN113469042A (zh) 2021-10-01

Similar Documents

Publication Publication Date Title
US10077054B2 (en) Tracking objects within a dynamic environment for improved localization
EP3714290B1 (fr) Localisation lidar à l'aide d'un réseau cnn 3d pour inférence de solution dans des véhicules à conduite autonome
US11594011B2 (en) Deep learning-based feature extraction for LiDAR localization of autonomous driving vehicles
US11137762B2 (en) Real time decision making for autonomous driving vehicles
CN109937343B (zh) 用于自动驾驶车辆交通预测中的预测轨迹的评估框架
US11364931B2 (en) Lidar localization using RNN and LSTM for temporal smoothness in autonomous driving vehicles
US10997729B2 (en) Real time object behavior prediction
US11199846B2 (en) Learning-based dynamic modeling methods for autonomous driving vehicles
JP6578331B2 (ja) 自律走行車のコマンド遅延を決定するための方法
EP3511863B1 (fr) Apprentissage par représentation distribuable permettant d'associer des observations de plusieurs véhicules
US10990855B2 (en) Detecting adversarial samples by a vision based perception system
WO2023273467A1 (fr) Procédé et appareil de détermination de données de valeur vraie, procédé et appareil d'apprentissage de réseau neuronal, et procédé et appareil de commande de déplacement
US11250240B1 (en) Instance segmentation using sensor data having different dimensionalities
CN114758502B (zh) 双车联合轨迹预测方法及装置、电子设备和自动驾驶车辆
CN112558036A (zh) 用于输出信息的方法和装置
US20210383213A1 (en) Prediction device, prediction method, computer program product, and vehicle control system
CN116311943B (zh) 交叉路口的平均延误时间的估算方法及装置
US20230194659A1 (en) Target-based sensor calibration
CN116580367A (zh) 数据处理方法、装置、电子设备和存储介质
CN117724361A (zh) 应用于自动驾驶仿真场景的碰撞事件检测方法及装置
CN116466685A (zh) 针对自动驾驶感知算法的评测方法及装置、设备和介质

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE