WO2022205211A1 - Method and apparatus for controlling vehicle driving, and vehicle - Google Patents

Method and apparatus for controlling vehicle driving, and vehicle

Info

Publication number
WO2022205211A1
WO2022205211A1 · PCT/CN2021/084731 · CN2021084731W
Authority
WO
WIPO (PCT)
Prior art keywords
vehicle
user
slot value
driving
training
Prior art date
Application number
PCT/CN2021/084731
Other languages
English (en)
Chinese (zh)
Inventor
苏琪
聂为然
许明霞
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to CN202180001475.0A (published as CN113226886A)
Priority to PCT/CN2021/084731 (published as WO2022205211A1)
Publication of WO2022205211A1

Classifications

    • B60W: Conjoint control of vehicle sub-units of different type or different function; control systems specially adapted for hybrid vehicles; road vehicle drive control systems for purposes not related to the control of a particular sub-unit
    • B60W60/001: Planning or execution of driving tasks (drive control systems specially adapted for autonomous road vehicles)
    • B60W30/18: Propelling the vehicle (purposes of road vehicle drive control systems not related to the control of a particular sub-unit)
    • B60W2555/20: Ambient conditions, e.g. wind or rain (input parameters relating to exterior conditions not covered by groups B60W2552/00 or B60W2554/00)

Definitions

  • The present application relates to the field of automatic driving, and more particularly, to a method and apparatus for controlling the driving of a vehicle, and to a vehicle.
  • Artificial intelligence (AI) is a theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that responds in a manner similar to human intelligence.
  • Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines have the functions of perception, reasoning, and decision-making.
  • Research in the field of artificial intelligence includes robotics, natural language processing, computer vision, decision-making and reasoning, human-computer interaction, recommendation and search, and basic AI theory.
  • Automatic driving is a mainstream application in the field of artificial intelligence.
  • Automatic driving technology relies on the cooperation of computer vision, radar, monitoring devices, and global positioning systems to allow motor vehicles to drive autonomously without active human operation.
  • Autonomous vehicles use various computing systems to help transport users from one location to another. Some autonomous vehicles may require some initial or continuous input from a user, such as a pilot, driver, or passenger.
  • An autonomous vehicle permits the operator to switch from a manual operating mode to an autonomous driving mode, or to a mode in between. Because automatic driving technology does not require a human to drive the motor vehicle, it can in theory effectively avoid human driving errors, reduce traffic accidents, and improve highway transportation efficiency. Autonomous driving technology is therefore receiving more and more attention.
  • An autonomous vehicle drives on the basis of a preset destination and the surrounding environment of the vehicle obtained by various sensors, and finally delivers the user to the destination along a planned route.
  • However, based on the visual information around the vehicle, the user may have some temporary intentions that differ from simply driving to the destination, for example, wanting to keep a greater distance when the vehicle is close to the car in front.
  • Under existing autonomous driving technology, if the user has such a temporary intention, the user can only take over control of the vehicle temporarily through manual intervention and then execute the temporary intention. Since the vehicle has then been switched to the manual driving mode, the user can no longer enjoy the more worry-free and safer driving experience brought by autonomous driving technology.
  • Automation levels are defined by the Society of Automotive Engineers (SAE), with Level 5 being the highest level of automation.
  • In view of this, the present application provides a method, an apparatus, and a vehicle for controlling the driving of a vehicle, which can improve the user's experience during automatic driving.
  • In a first aspect, a method for controlling the driving of a vehicle is provided. The method provided by the present application can be executed by an electronic device that supports controlling the driving of the vehicle.
  • An electronic device refers to a device that can be abstracted as a computer system.
  • An electronic device that supports controlling the driving of the vehicle may also be referred to as an apparatus for controlling the driving of the vehicle.
  • The apparatus for controlling the driving of the vehicle may be the complete electronic device, or may be a part of the electronic device, for example, a chip related to the function of controlling the driving of the vehicle, such as a system chip.
  • The system chip is also called a system on chip (SoC) or an SoC chip.
  • Specifically, the apparatus for controlling the driving of the vehicle may be a terminal device or an in-vehicle device in the vehicle, such as an in-vehicle computer, an in-vehicle machine, or a mobile phone, or may be a processor, a system on chip, or another type of in-vehicle chip.
  • The method includes: in an automatic driving mode of the vehicle, acquiring a user instruction; acquiring environmental information around the vehicle; performing multimodal understanding on the user instruction and the environmental information around the vehicle to determine the user's driving intention; and generating an automatic driving control instruction for the vehicle according to the user's driving intention.
  • In the solution of the present application, in the automatic driving mode of the vehicle, the user's driving intention can be determined by acquiring the user instruction and the environmental information around the vehicle and performing multimodal understanding on them; an automatic driving control instruction for the vehicle is then generated according to the user's driving intention.
  • In this way, the user's temporary driving intention can be executed without the user manually taking over control, so that the user's experience during automatic driving can be improved.
  • Optionally, the driving intention includes at least one intent; each of the at least one intent includes n slots, and each of the n slots includes a slot name, a slot value, and a classification of the slot value, where n is an integer greater than or equal to 0.
  • the intent includes at least one of: stopping, overtaking, decelerating, following, and turning.
  • the slot name includes at least one of: a parking position, a speed value, an overtaking or following object, and a steering orientation.
  • The classification of the slot value is an enumeration-type slot value, a text-type slot value, or an environment-type slot value, where:
  • an enumeration-type slot value indicates that the slot value is a predefined enumeration value;
  • a text-type slot value indicates that the slot value is a substring of the user instruction or text generated according to the user instruction;
  • an environment-type slot value indicates that the slot value is identified in the environmental information according to the content mentioned in the user instruction.
  • Optionally, the environment-type slot value includes an image-type slot value, since an image can reflect the environment around the vehicle. The image-type slot value therefore indicates that the slot value is identified in the image information according to the content mentioned in the user instruction.
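  • To make this structure concrete, the following is a minimal, purely illustrative sketch of how a driving intention with intents and slots could be represented in code. The class and field names (and the example parse) are assumptions for illustration, not part of the present application.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Union

class SlotValueType(Enum):
    """Classification of a slot value, as described above."""
    ENUMERATION = "enumeration"  # predefined enumeration value
    TEXT = "text"                # substring of / text generated from the user instruction
    ENVIRONMENT = "environment"  # identified in the environment info (e.g., an image region)

@dataclass
class Slot:
    name: str                       # e.g., "parking_position", "speed_value"
    value: Union[str, float, dict]  # e.g., "roadside", 60.0, or an image bounding box
    value_type: SlotValueType

@dataclass
class Intent:
    name: str                                        # e.g., "stop", "overtake", "decelerate", "follow", "turn"
    slots: List[Slot] = field(default_factory=list)  # n >= 0 slots

@dataclass
class DrivingIntention:
    intents: List[Intent]  # at least one intent

# Example: "pull over next to the white car" might be parsed into a "stop"
# intent whose parking position is an environment-type slot value, i.e., a
# region identified in the camera image (the bounding box is hypothetical).
example = DrivingIntention(intents=[
    Intent(name="stop", slots=[
        Slot(name="parking_position",
             value={"bbox": [120, 40, 260, 180]},
             value_type=SlotValueType.ENVIRONMENT),
    ]),
])
```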
  • Optionally, generating the automatic driving control instruction for the vehicle according to the user's driving intention includes: judging whether the driving intention is feasible according to the driving intention, the surrounding environment, and traffic regulations; and, if the driving intention is feasible, generating the automatic driving control instruction for the vehicle.
  • Optionally, if the driving intention is not feasible, prompt information may be generated and sent to the user.
  • the prompt information may include the reason why the driving intention is not feasible.
  • In this solution, after the driving intention is determined, whether it is feasible is judged according to the driving intention, the surrounding environment, and traffic regulations; only if the driving intention is feasible is the automatic driving control instruction for the vehicle generated. In this way, violations of traffic laws and other problems can be avoided when executing the user's driving intention in the automatic driving mode, which ensures both the user experience and the safety of automatic driving.
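  • As an illustration of this feasibility gate, the sketch below (reusing the Intent and DrivingIntention types from the previous sketch) checks each intent against a toy rule set before generating a control command, and returns prompt information with the reason when the intention is infeasible. The rule checks and environment flags are assumptions; a real system would query codified traffic regulations, HD maps, and perception output.

```python
from typing import Optional

def violates_traffic_rules(intent: "Intent", environment: dict) -> Optional[str]:
    # Hypothetical rule checks standing in for a real traffic-regulation database.
    if intent.name == "stop" and environment.get("no_parking_zone"):
        return "stopping here would violate a no-parking rule"
    if intent.name == "overtake" and environment.get("solid_lane_line"):
        return "overtaking across a solid lane line is not allowed"
    return None

def is_unsafe(intent: "Intent", environment: dict) -> Optional[str]:
    # Hypothetical safety check based on the perceived surrounding environment.
    if intent.name == "overtake" and environment.get("oncoming_traffic"):
        return "oncoming traffic makes overtaking unsafe"
    return None

def generate_control_command(intention: "DrivingIntention", environment: dict) -> dict:
    """Gate each intent on traffic rules and safety before planning."""
    for intent in intention.intents:
        reason = violates_traffic_rules(intent, environment) or is_unsafe(intent, environment)
        if reason:
            # Infeasible: return prompt information including the reason.
            return {"type": "prompt", "message": f"Cannot execute '{intent.name}': {reason}"}
    # Feasible: hand the intention off to the planner (placeholder representation).
    return {"type": "control", "plan": [intent.name for intent in intention.intents]}
```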
  • the user instruction includes any one or more of a user voice instruction, a user text instruction, and a user air gesture instruction.
  • Optionally, the user voice instruction or the user air gesture instruction can first be converted into a user text instruction, and multimodal understanding is then performed on the text instruction and the environmental information around the vehicle.
  • Alternatively, multimodal understanding may be performed directly on the user voice instruction or the user air gesture instruction together with the environmental information, which is not limited in this application.
  • Optionally, the method further includes: sending a photographing activation signal to a photographing device to activate the photographing device to photograph the environmental information around the vehicle. Acquiring the environmental information around the vehicle then includes: acquiring the environmental information around the vehicle photographed by the photographing device according to the photographing activation signal.
  • The environmental information photographed by the photographing device may also be referred to as image information.
  • It should be understood that the acquired environmental information may be not only image information captured by a photographing device, but also environmental information acquired by lidar, vehicle-mounted sensors, and/or the Internet of Vehicles, etc., which is not limited in this application.
  • acquiring environmental information around the vehicle includes: acquiring environmental information around the vehicle periodically photographed by the photographing device.
  • Optionally, the user's driving intention is presented to the user through an augmented reality head-up display (AR-HUD) or a central control screen.
  • Presenting the user's driving intention on the AR-HUD or the central control screen allows the user to judge the correctness of the multimodal understanding result in time.
  • In a second aspect, an apparatus for controlling the driving of a vehicle is provided, including an acquisition unit and a processing unit.
  • The acquisition unit is configured to acquire a user instruction and to acquire environmental information around the vehicle; the processing unit is configured to perform multimodal understanding on the user instruction and the environmental information around the vehicle to determine the user's driving intention, and to generate an automatic driving control instruction for the vehicle according to the user's driving intention.
  • Optionally, the driving intention includes at least one intent; each intent includes n slots, and each of the n slots includes a slot name, a slot value, and a classification of the slot value, where n is an integer greater than or equal to 0.
  • the intent includes at least one of: stopping, overtaking, decelerating, following, and turning.
  • the slot name includes at least one of: a parking position, a speed value, an overtaking or following object, and a steering orientation.
  • Optionally, the classification of the slot value is an enumeration-type slot value, a text-type slot value, or an environment-type slot value, where the enumeration-type slot value indicates that the slot value is a predefined enumeration value, the text-type slot value indicates that the slot value is a substring of the user instruction or text generated according to the user instruction, and the environment-type slot value indicates that the slot value is identified in the environmental information according to the content mentioned in the user instruction.
  • Optionally, the processing unit is further configured to: judge whether the driving intention is feasible according to the driving intention, the surrounding environment, and traffic regulations; and, if the driving intention is feasible, generate the automatic driving control instruction for the vehicle.
  • the user instructions include: any one or more of user voice instructions, user text instructions, and user air gesture instructions.
  • Optionally, the apparatus further includes a sending unit, where the sending unit is configured to send a photographing activation signal to the photographing device, so as to activate the photographing device to photograph the environmental information around the vehicle;
  • the acquisition unit is further configured to acquire the environmental information around the vehicle photographed by the photographing device according to the photographing activation signal.
  • Optionally, the acquisition unit is further configured to acquire environmental information around the vehicle periodically photographed by the photographing device.
  • Optionally, the user's driving intention is presented to the user through an augmented reality head-up display (AR-HUD) or a central control screen.
  • In a third aspect, a training method for a multimodal processing module is provided, including: acquiring training data, where the training data includes training input data and training target data, the training input data includes user instructions and environmental information around the vehicle, and the training target data includes the driving intention corresponding to the training input data; and training the multimodal processing module according to the training input data and the training target data.
  • Optionally, the driving intention includes at least one intent; each intent includes n slots, and each of the n slots includes a slot name, a slot value, and a classification of the slot value, where n is an integer greater than or equal to 0.
  • the intent includes at least one of: stopping, overtaking, decelerating, following, and turning.
  • the slot name includes at least one of: a parking position, a speed value, an overtaking or following object, and a steering orientation.
  • Optionally, the classification of the slot value is an enumeration-type slot value, a text-type slot value, or an environment-type slot value, where the enumeration-type slot value indicates that the slot value is a predefined enumeration value, the text-type slot value indicates that the slot value is a substring of the user instruction or text generated according to the user instruction, and the environment-type slot value indicates that the slot value is identified in the environmental information according to the content mentioned in the user instruction.
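  • A minimal supervised-training sketch for such a multimodal processing module is shown below, assuming a PyTorch-style model that maps (instruction text, environment image) pairs to an intent label and per-token slot tags. The model interface, batch layout, and hyperparameters are illustrative assumptions.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader

def train_multimodal_module(model: nn.Module,
                            dataset,  # yields (text_ids, image, intent_label, slot_labels)
                            epochs: int = 10,
                            lr: float = 1e-4) -> nn.Module:
    """Train on training input data (user instruction + environment image)
    against training target data (the corresponding driving intention)."""
    loader = DataLoader(dataset, batch_size=32, shuffle=True)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    intent_loss_fn = nn.CrossEntropyLoss()
    slot_loss_fn = nn.CrossEntropyLoss()

    model.train()
    for _ in range(epochs):
        for text_ids, image, intent_label, slot_labels in loader:
            intent_logits, slot_logits = model(text_ids, image)
            # Joint objective: intent classification plus per-token slot tagging.
            loss = intent_loss_fn(intent_logits, intent_label) \
                 + slot_loss_fn(slot_logits.flatten(0, 1), slot_labels.flatten())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```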
  • In a fourth aspect, a training apparatus for a multimodal processing module is provided, including an acquisition unit and a processing unit. The acquisition unit is configured to acquire training data, where the training data includes training input data and training target data, the training input data includes user instructions and environmental information around the vehicle, and the training target data includes the driving intention corresponding to the training input data; the processing unit is configured to train the multimodal processing module according to the training input data and the training target data.
  • Optionally, the driving intention includes at least one intent; each intent includes n slots, and each of the n slots includes a slot name, a slot value, and a classification of the slot value, where n is an integer greater than or equal to 0.
  • the intent includes at least one of: stopping, overtaking, decelerating, following, and turning.
  • the slot name includes at least one of: a parking location, a speed value, an overtaking or following object, and a steering orientation.
  • Optionally, the classification of the slot value is an enumeration-type slot value, a text-type slot value, or an environment-type slot value, where the enumeration-type slot value indicates that the slot value is a predefined enumeration value, the text-type slot value indicates that the slot value is a substring of the user instruction or text generated according to the user instruction, and the environment-type slot value indicates that the slot value is identified in the environmental information according to the content mentioned in the user instruction.
  • In a fifth aspect, another method for controlling the driving of a vehicle is provided, including: in an automatic driving mode of the vehicle, acquiring a user instruction; acquiring environmental information around the vehicle; determining the user's driving intention according to the user instruction and the environmental information; generating, at least according to the user's driving intention, an automatic driving control instruction for the vehicle; and controlling the vehicle to drive based on the automatic driving control instruction.
  • In this solution, in the automatic driving mode of the vehicle, the user's driving intention can be determined by acquiring the user instruction and the environmental information around the vehicle; an automatic driving control instruction for the vehicle is then generated according to the driving intention.
  • In this way, the user's temporary driving intention can be executed without the user manually taking over control, so that the user's experience during automatic driving can be improved.
  • Optionally, determining the user's driving intention according to the user instruction and the environmental information includes: performing multimodal understanding on the user instruction and the environmental information, and determining the user's driving intention from the result of the multimodal understanding.
  • the user instruction includes at least one of a user voice instruction, a user text instruction, and a user air gesture instruction.
  • Optionally, the driving intention includes at least one intent; each of the at least one intent includes n slots, and each of the n slots includes a slot name, a slot value, and a classification of the slot value, where n is an integer greater than or equal to 0.
  • the intent includes at least one of: stopping, overtaking, decelerating, following, and turning.
  • the slot name includes at least one of: a parking position, a speed value, an overtaking or following object, and a steering orientation.
  • The classification of the slot value is an enumeration-type slot value, a text-type slot value, or an environment-type slot value, where:
  • an enumeration-type slot value indicates that the slot value is a predefined enumeration value;
  • a text-type slot value indicates that the slot value is a substring of the user instruction or text generated according to the user instruction;
  • an environment-type slot value indicates that the slot value is identified in the environmental information according to the content mentioned in the user instruction.
  • Optionally, generating the automatic driving control instruction for the vehicle according to the user's driving intention includes: judging whether the driving intention is feasible according to the driving intention, the surrounding environment, and traffic regulations; and, if the driving intention is feasible, generating the automatic driving control instruction for the vehicle.
  • Optionally, if the driving intention is not feasible, prompt information may be generated and sent to the user.
  • the prompt information may include the reason why the driving intention is not feasible.
  • In this solution, after the driving intention is determined, whether it is feasible is judged according to the driving intention, the surrounding environment, and traffic regulations; only if the driving intention is feasible is the automatic driving control instruction for the vehicle generated. This avoids violations of traffic laws and other problems when executing the user's driving intention in the automatic driving mode, thereby ensuring both the user experience and the safety of automatic driving.
  • Optionally, acquiring the user instruction means acquiring a user text instruction. The user's natural voice instruction or air gesture instruction may be acquired first and then converted into a text instruction.
  • Optionally, the method further includes: sending a photographing activation signal to a photographing device to activate the photographing device to photograph the environmental information around the vehicle. Acquiring the environmental information around the vehicle then includes: acquiring the environmental information around the vehicle photographed by the photographing device according to the photographing activation signal.
  • acquiring the environmental information around the vehicle includes: acquiring the environmental information around the vehicle periodically photographed by the photographing device.
  • Optionally, the user's driving intention is presented to the user through an augmented reality head-up display (AR-HUD) or a central control screen.
  • Presenting the user's driving intention on the AR-HUD or the central control screen allows the user to judge the correctness of the multimodal understanding result in time.
  • In a sixth aspect, another apparatus for controlling the driving of a vehicle is provided, including modules capable of implementing the method for controlling the driving of a vehicle in the fifth aspect or any possible implementation manner of the fifth aspect.
  • In a seventh aspect, a processing method for a multimodal processing module is provided, where the multimodal processing module is obtained by training according to the training method in the third aspect or any possible implementation manner of the third aspect. The processing method includes: the multimodal processing module acquires input data, where the input data includes a user instruction and environmental information around the vehicle; and the multimodal processing module outputs a driving intention according to the input data.
  • In an eighth aspect, a multimodal processing module is provided, where the multimodal processing module is obtained by training according to the training method in the third aspect or any possible implementation manner of the third aspect. The multimodal processing module includes: an acquisition unit configured to acquire input data, where the input data includes a user instruction and environmental information around the vehicle; and a processing unit configured to output a driving intention according to the input data.
  • In a ninth aspect, an autonomous driving vehicle is provided, including the apparatus in the second aspect or any possible implementation manner of the second aspect; and/or the apparatus in the fourth aspect or any possible implementation manner of the fourth aspect; and/or the apparatus in the sixth aspect or any possible implementation manner of the sixth aspect; and/or the module in the eighth aspect or any possible implementation manner of the eighth aspect.
  • In a tenth aspect, an apparatus for controlling the driving of a vehicle is provided, including a processor and a memory, where the memory is configured to store program instructions, and the processor is configured to call the program instructions to execute the method for controlling the driving of a vehicle in the first aspect or any possible implementation manner of the first aspect, and/or to call the program instructions to execute the another method for controlling the driving of a vehicle in the fifth aspect or any possible implementation manner of the fifth aspect.
  • In an eleventh aspect, a training apparatus for a multimodal processing module is provided, including a processor and a memory, where the memory is configured to store program instructions, and the processor is configured to call the program instructions to execute the training method for the multimodal processing module in the third aspect or any possible implementation manner of the third aspect.
  • In a twelfth aspect, a system is provided, where the system includes the apparatus in the second aspect or any possible implementation manner of the second aspect; and/or the apparatus in the sixth aspect or any possible implementation manner of the sixth aspect.
  • the system may be a vehicle, or may be an on-board system on a vehicle, which is not limited in this application.
  • In a thirteenth aspect, a computer program product containing instructions is provided; when the computer program product runs on a computer, it causes the computer to execute the method for controlling the driving of a vehicle in the first aspect or any possible implementation manner of the first aspect, and/or to execute the another method for controlling the driving of a vehicle in the fifth aspect or any possible implementation manner of the fifth aspect.
  • In a fourteenth aspect, a computer program product containing instructions is provided; when the computer program product runs on a computer, it causes the computer to execute the training method for the multimodal processing module in the third aspect or any possible implementation manner of the third aspect.
  • In a fifteenth aspect, a computer-readable storage medium is provided, where the computer-readable medium stores program code for execution by a device, the program code including instructions for executing the method for controlling the driving of a vehicle in the first aspect or any possible implementation manner of the first aspect, and/or the another method for controlling the driving of a vehicle in the fifth aspect or any possible implementation manner of the fifth aspect.
  • In a sixteenth aspect, a computer-readable storage medium is provided, where the computer-readable medium stores program code for execution by a device, the program code including instructions for executing the training method for the multimodal processing module in the third aspect or any possible implementation manner of the third aspect.
  • In a seventeenth aspect, a chip is provided, where the chip includes a processor and a data interface, and the processor reads, through the data interface, instructions stored in a memory to execute the method for controlling the driving of a vehicle in the first aspect or any possible implementation manner of the first aspect, and/or to execute the another method for controlling the driving of a vehicle in the fifth aspect or any possible implementation manner of the fifth aspect.
  • Optionally, the chip may further include a memory in which instructions are stored; the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor is configured to execute the method for controlling the driving of a vehicle in the first aspect or any possible implementation manner of the first aspect, and/or the another method for controlling the driving of a vehicle in the fifth aspect or any possible implementation manner of the fifth aspect.
  • In an eighteenth aspect, a chip is provided, where the chip includes a processor and a data interface, and the processor reads, through the data interface, instructions stored in a memory to execute the training method for the multimodal processing module in the third aspect or any possible implementation manner of the third aspect.
  • Optionally, the chip may further include a memory in which instructions are stored; the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor is configured to execute the training method for the multimodal processing module in the third aspect or any possible implementation manner of the third aspect.
  • FIG. 1 is a functional block diagram of a vehicle provided by an embodiment of the present application;
  • FIG. 2 is an example diagram of an automatic driving system to which an embodiment of the present application is applicable;
  • FIG. 3 is an example diagram of a cloud-side command being applied to an autonomous driving vehicle according to an embodiment of the present application;
  • FIG. 4 is an example diagram of a method for controlling the driving of a vehicle provided by an embodiment of the present application;
  • FIG. 5 is an example diagram of a system architecture provided by an embodiment of the present application;
  • FIG. 6 is an example diagram of a specific implementation provided by an embodiment of the present application;
  • FIG. 7 is an example diagram of another specific implementation provided by an embodiment of the present application;
  • FIG. 8 is an example diagram of a multimodal processing method provided by an embodiment of the present application;
  • FIG. 9 is an example diagram of another multimodal processing method provided by an embodiment of the present application;
  • FIG. 10 is an example diagram of a training method for a multimodal processing module provided by an embodiment of the present application;
  • FIG. 11 is an example diagram of an application scenario provided by an embodiment of the present application;
  • FIG. 12 is an example diagram of an apparatus for controlling the driving of a vehicle provided by an embodiment of the present application;
  • FIG. 13 is an example diagram of a training apparatus for a multimodal processing module provided by an embodiment of the present application;
  • FIG. 14 is a schematic structural diagram of an apparatus provided by an embodiment of the present application;
  • FIG. 15 is an example diagram of a computer program product provided by an embodiment of the present application.
  • FIG. 1 is a functional block diagram of a vehicle provided by an embodiment of the present application.
  • the vehicle 100 is configured in a fully or partially autonomous driving mode.
  • While in the autonomous driving mode, the vehicle 100 can control itself: it can determine the current state of the vehicle and its surrounding environment, determine the possible behavior of at least one other vehicle in the surrounding environment, determine a confidence level corresponding to the likelihood that the other vehicle performs that possible behavior, and control the vehicle 100 based on the determined information.
  • When in the autonomous driving mode, the vehicle 100 may also be set to operate without human interaction.
  • Vehicle 100 may include various subsystems, such as travel system 102 , sensor system 104 , control system 106 , one or more peripherals 108 and power supply 110 , computer system 112 , and user interface 116 .
  • vehicle 100 may include more or fewer subsystems, and each subsystem may include multiple elements. Additionally, each of the subsystems and elements of the vehicle 100 may be interconnected by wire or wirelessly.
  • the travel system 102 may include components that provide powered motion for the vehicle 100 .
  • travel system 102 may include engine 118 , energy source 119 , transmission 120 , and wheels/tires 121 .
  • the engine 118 may be an internal combustion engine, an electric motor, an air compression engine, or other types of engine combinations, such as a gasoline engine and electric motor hybrid engine, an internal combustion engine and an air compression engine hybrid engine.
  • Engine 118 converts energy source 119 into mechanical energy.
  • Examples of energy sources 119 include gasoline, diesel, other petroleum-based fuels, propane, other compressed gas-based fuels, ethanol, solar panels, batteries, and other sources of electricity.
  • the energy source 119 may also provide energy to other systems of the vehicle 100 .
  • Transmission 120 may transmit mechanical power from engine 118 to wheels 121 .
  • Transmission 120 may include a gearbox, a differential, and a driveshaft.
  • transmission 120 may also include other devices, such as clutches.
  • the drive shaft may include one or more axles that may be coupled to one or more wheels 121 .
  • the sensor system 104 may include several sensors that sense information about the environment surrounding the vehicle 100 .
  • In an embodiment, the sensor system 104 may include a positioning system 122 (which may be a global positioning system (GPS), a BeiDou system, or another positioning system), an inertial measurement unit (IMU) 124, a radar 126, a laser rangefinder 128, and a camera 130.
  • The sensor system 104 may also include sensors that monitor internal systems of the vehicle 100 (e.g., an in-vehicle air quality monitor, a fuel gauge, an oil temperature gauge, etc.). Sensor data from one or more of these sensors can be used to detect objects and their corresponding characteristics (position, shape, orientation, velocity, etc.). Such detection and identification is a critical function for the safe operation of the autonomous vehicle 100.
  • the positioning system 122 may be used to estimate the geographic location of the vehicle 100 .
  • the IMU 124 is used to sense position and orientation changes of the vehicle 100 based on inertial acceleration.
  • IMU 124 may be a combination of an accelerometer and a gyroscope.
  • Radar 126 may utilize radio signals to sense objects within the surrounding environment of vehicle 100 . In some embodiments, in addition to sensing objects, radar 126 may be used to sense the speed and/or heading of objects.
  • the laser rangefinder 128 may utilize laser light to sense objects in the environment in which the vehicle 100 is located.
  • the laser rangefinder 128 may include one or more laser sources, laser scanners, and one or more detectors, among other system components.
  • Camera 130 may be used to capture multiple images of the surrounding environment of vehicle 100 .
  • Camera 130 may be a still camera or a video camera.
  • Control system 106 controls the operation of the vehicle 100 and its components.
  • Control system 106 may include various elements including steering system 132 , throttle 134 , braking unit 136 , sensor fusion algorithms 138 , computer vision system 140 , route control system 142 , and obstacle avoidance system 144 .
  • the steering system 132 is operable to adjust the heading of the vehicle 100 .
  • it may be a steering wheel system.
  • the throttle 134 is used to control the operating speed of the engine 118 and thus the speed of the vehicle 100 .
  • the braking unit 136 is used to control the deceleration of the vehicle 100 .
  • the braking unit 136 may use friction to slow the wheels 121 .
  • the braking unit 136 may convert the kinetic energy of the wheels 121 into electrical current.
  • the braking unit 136 may also take other forms to slow the wheels 121 to control the speed of the vehicle 100 .
  • Computer vision system 140 may be operable to process and analyze images captured by camera 130 in order to identify objects and/or features in the environment surrounding vehicle 100 .
  • the objects and/or features may include traffic signals, road boundaries and obstacles.
  • Computer vision system 140 may use object recognition algorithms, Structure from Motion (SFM) algorithms, video tracking, and other computer vision techniques.
  • the computer vision system 140 may be used to map the environment, track objects, estimate the speed of objects, and the like.
  • the route control system 142 is used to determine the travel route of the vehicle 100 .
  • the route control system 142 may combine data from the sensors 138 , the GPS 122 , and one or more predetermined maps to determine a driving route for the vehicle 100 .
  • the obstacle avoidance system 144 is used to identify, evaluate, and avoid or otherwise traverse potential obstacles in the environment of the vehicle 100 .
  • Optionally, the control system 106 may additionally or alternatively include components other than those shown and described, or some of the components shown above may be omitted.
  • Peripherals 108 may include a wireless communication system 146 , an onboard computer 148 , a microphone 150 and/or a speaker 152 .
  • peripherals 108 provide a means for a user of vehicle 100 to interact with user interface 116 .
  • the onboard computer 148 may provide information to the user of the vehicle 100 .
  • User interface 116 may also operate on-board computer 148 to receive user input.
  • the onboard computer 148 can be operated via a touch screen.
  • peripheral devices 108 may provide a means for vehicle 100 to communicate with other devices located within the vehicle.
  • microphone 150 may receive audio (eg, voice commands or other audio input) from a user of vehicle 100 .
  • speakers 152 may output audio to a user of vehicle 100 .
  • Wireless communication system 146 may wirelessly communicate with one or more devices, either directly or via a communication network.
  • For example, the wireless communication system 146 may use 3G cellular communication, such as code division multiple access (CDMA), global system for mobile communications (GSM), or general packet radio service (GPRS); 4G cellular communication, such as long term evolution (LTE); or 5G cellular communication.
  • the wireless communication system 146 may communicate with a wireless local area network (WLAN) using WiFi.
  • the wireless communication system 146 may communicate directly with the device using an infrared link, Bluetooth, or the like.
  • Other wireless protocols may also be used, such as various vehicle communication systems; for example, the wireless communication system 146 may include one or more dedicated short range communications (DSRC) devices, which may include public and/or private data communication between vehicles and/or roadside stations.
  • the power supply 110 may provide power to various components of the vehicle 100 .
  • the power source 110 may be a rechargeable lithium-ion or lead-acid battery.
  • One or more battery packs of such a battery may be configured as a power source to provide power to various components of the vehicle 100 .
  • power source 110 and energy source 119 may be implemented together, such as in some all-electric vehicles.
  • Computer system 112 may include at least one processor 113 that executes instructions 115 stored in a non-transitory computer-readable medium such as memory 114 .
  • Computer system 112 may also be multiple computing devices that control individual components or subsystems of vehicle 100 in a distributed fashion.
  • the processor 113 may be any conventional processor, such as a commercially available CPU. Alternatively, the processor may be a dedicated device such as an ASIC or other hardware-based processor.
  • Although FIG. 1 functionally illustrates the processor, memory, and other elements of the computer 110 in the same block, one of ordinary skill in the art will understand that the processor, computer, or memory may actually comprise multiple processors, computers, or memories that may or may not be stored within the same physical housing.
  • the memory may be a hard drive or other storage medium located within an enclosure other than computer 110 .
  • reference to a processor or computer will be understood to include reference to a collection of processors or computers or memories that may or may not operate in parallel.
  • For example, some components, such as the steering component and the deceleration component, may each have their own processor that performs only the computations related to that component's specific functions.
  • a processor may be located remotely from the vehicle and in wireless communication with the vehicle. In other aspects, some of the processes described herein are performed on a processor disposed within the vehicle while others are performed by a remote processor, including taking steps necessary to perform a single maneuver.
  • the memory 114 may contain instructions 115 (eg, program logic) executable by the processor 113 to perform various functions of the vehicle 100 , including those described above.
  • Memory 114 may also contain additional instructions, including instructions to send data to, receive data from, interact with, and/or control one or more of travel system 102 , sensor system 104 , control system 106 , and peripherals 108 . instruction.
  • memory 114 may store data such as road maps, route information, vehicle location, direction, speed, and other such vehicle data, among other information. Such information may be used by the vehicle 100 and the computer system 112 during operation of the vehicle 100 in autonomous, semi-autonomous and/or manual modes.
  • a user interface 116 for providing information to or receiving information from a user of the vehicle 100 .
  • the user interface 116 may include one or more input/output devices within the set of peripheral devices 108 , such as a wireless communication system 146 , an onboard computer 148 , a microphone 150 and a speaker 152 .
  • Computer system 112 may control functions of vehicle 100 based on input received from various subsystems (eg, travel system 102 , sensor system 104 , and control system 106 ) and from user interface 116 .
  • computer system 112 may utilize input from control system 106 in order to control steering unit 132 to avoid obstacles detected by sensor system 104 and obstacle avoidance system 144 .
  • computer system 112 is operable to provide control of various aspects of vehicle 100 and its subsystems.
  • one or more of these components described above may be installed or associated with the vehicle 100 separately.
  • memory 114 may exist partially or completely separate from vehicle 100.
  • the above-described components may be communicatively coupled together in a wired and/or wireless manner.
  • FIG. 1 should not be construed as a limitation on the embodiments of the present application.
  • a self-driving car traveling on a road can recognize objects within its surroundings to determine adjustments to the current speed.
  • the objects may be other vehicles, traffic control equipment, or other types of objects.
  • Optionally, each identified object may be considered independently, and the object's respective characteristics, such as its current speed, acceleration, and distance from the vehicle, may be used to determine the speed to which the autonomous vehicle is to adjust.
  • Optionally, the autonomous vehicle 100, or a computing device associated with the autonomous vehicle 100, may predict the behavior of the identified objects based on the characteristics of the identified objects and the state of the surrounding environment (e.g., traffic, rain, ice on the road, etc.).
  • Optionally, the behaviors of the identified objects may depend on one another, so all of the identified objects can also be considered together to predict the behavior of a single identified object.
  • the vehicle 100 can adjust its speed based on the predicted behavior of the identified object.
  • In other words, the self-driving car can determine what stable state the vehicle will need to adjust to (e.g., accelerate, decelerate, or stop) based on the predicted behavior of the object.
  • other factors may also be considered to determine the speed of the vehicle 100, such as the lateral position of the vehicle 100 in the road being traveled, the curvature of the road, the proximity of static and dynamic objects, and the like.
  • In addition to adjusting the speed of the self-driving car, the computing device may also provide instructions to modify the steering angle of the vehicle 100, so that the self-driving car follows a given trajectory and/or maintains safe lateral and longitudinal distances from objects in the vicinity of the self-driving car (e.g., cars in adjacent lanes on the road).
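  • To illustrate how such factors might be combined, the deliberately simplified sketch below derives a target speed from the predicted gap to the nearest relevant object, the road curvature, and the speed limit. The formula and thresholds are arbitrary assumptions; real systems solve this with trajectory optimization over the predicted behavior of the identified objects.

```python
def target_speed(predicted_gap_m: float,   # predicted distance to the nearest relevant object
                 road_curvature: float,    # 1/m, 0 for a straight road
                 speed_limit_mps: float) -> float:
    """Toy speed-adjustment rule combining the factors mentioned above."""
    # Slow down when the predicted gap is small (simple ~2 s time-headway rule).
    headway_limited = predicted_gap_m / 2.0
    # Slow down in curves: limit lateral acceleration a = v^2 * curvature to ~3 m/s^2.
    if road_curvature > 0:
        curve_limited = (3.0 / road_curvature) ** 0.5
    else:
        curve_limited = speed_limit_mps
    return max(0.0, min(speed_limit_mps, headway_limited, curve_limited))
```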
  • Optionally, the autonomous vehicle 100, or a computing device associated with the autonomous vehicle 100, may also predict the availability of autonomous driving on the road ahead based on the state of the vehicle and the detected environmental information, and control the switching between the autonomous and manual driving modes.
  • The above-mentioned vehicle 100 may be a car, a truck, a motorcycle, a bus, a boat, an airplane, a helicopter, a lawn mower, a recreational vehicle, a playground vehicle, construction equipment, a tram, a golf cart, a train, a cart, etc., which is not particularly limited in the embodiments of the present application.
  • FIG. 2 is an example diagram of an automatic driving system provided by an embodiment of the present application.
  • the automatic driving system shown in FIG. 2 includes a computer system 101 , wherein the computer system 101 includes a processor 103 , and the processor 103 is coupled with a system bus 105 .
  • the processor 103 may be one or more processors, each of which may include one or more processor cores.
  • a video adapter 107 which can drive a display 109, is coupled to the system bus 105.
  • the system bus 105 is coupled to an input/output (I/O) bus 113 through a bus bridge 111 .
  • I/O interface 115 is coupled to the I/O bus.
  • The I/O interface 115 communicates with various I/O devices, such as an input device 117 (e.g., a keyboard, a mouse, a touch screen, etc.) and a media tray 121 (e.g., a compact disc read-only memory (CD-ROM), a multimedia interface, etc.).
  • The I/O interface 115 may also communicate with a transceiver 123 (which can transmit and/or receive radio communication signals), a camera 155 (which can capture static scenery and dynamic digital video images), and an external universal serial bus (USB) interface.
  • the processor 103 may be any conventional processor, including a reduced instruction set computing (reduced instruction set computer, RISC) processor, a complex instruction set computing (complex instruction set computer, CISC) processor or a combination of the above.
  • the processor may be a dedicated device such as an application specific integrated circuit (ASIC).
  • the processor 103 may be a neural network processor or a combination of a neural network processor and the above-mentioned conventional processors.
  • computer system 101 may be located remotely from the autonomous vehicle and may communicate wirelessly with the autonomous vehicle.
  • some of the processes described herein are performed on a processor disposed within the autonomous vehicle, others are performed by a remote processor, including taking actions required to perform a single maneuver.
  • Network interface 129 is a hardware network interface, such as a network card.
  • the network 127 may be an external network, such as the Internet, or an internal network, such as an Ethernet network or a virtual private network (VPN).
  • the network 127 may also be a wireless network, such as a WiFi network, a cellular network, and the like.
  • the hard disk drive interface is coupled to the system bus 105 .
  • the hard drive interface is connected to the hard drive.
  • System memory 135 is coupled to system bus 105 . Data running in system memory 135 may include operating system 137 and application programs 143 of computer 101 .
  • The operating system includes a shell 139 and a kernel 141.
  • the shell 139 is an interface between the user and the kernel of the operating system.
  • the shell is the outermost layer of the operating system.
  • the shell manages the interaction between the user and the operating system: waiting for user input, interpreting user input to the operating system, and processing various operating system output.
  • The kernel 141 consists of those parts of the operating system that manage memory, files, peripherals, and system resources. Interacting directly with the hardware, the operating system kernel usually runs processes, provides inter-process communication, and provides CPU time-slice management, interrupt handling, memory management, I/O management, and more.
  • The application program 143 includes programs that control the automatic driving of the car, for example, programs that manage the interaction of the autonomous car with obstacles on the road, programs that control the route or speed of the autonomous car, and programs that control the interaction of the autonomous car with other autonomous vehicles on the road.
  • The application program 143 also exists on the system of a deploying server 149.
  • In one embodiment, the computer system 101 may download the application program 143 from the deploying server 149 when the application program 143 needs to be executed.
  • For example, the application program 143 may be a program that controls the autonomous vehicle to activate or deactivate an assisted autonomous driving function.
  • Sensor 153 is associated with computer system 101 .
  • the sensor 153 is used to detect the environment around the computer 101 .
  • For example, the sensor 153 can detect animals, cars, obstacles, pedestrian crossings, and the like; further, the sensor can also detect the environment around such objects, for example, other animals around an animal, weather conditions, and the ambient light level.
  • the sensors may be cameras, infrared sensors, chemical detectors, microphones, and the like.
  • Computer system 112 in FIG. 1 may also receive information from or transfer information to other computer systems.
  • sensor data collected from the sensor system 104 of the vehicle 100 may be transferred to another computer for processing of the data.
  • data from the computer system 312 may be transmitted via a network to a server 320 on the cloud side (which may also be referred to as the cloud) for further processing.
  • Networks and intermediate nodes may include various configurations and protocols, including the Internet, the World Wide Web, intranets, virtual private networks, wide area networks, local area networks, private networks using proprietary communication protocols of one or more companies, Ethernet, WiFi, the hypertext transfer protocol (HTTP), and various combinations of the foregoing.
  • Such communications may be by any device capable of transferring data to and from other computers, such as modems and wireless interfaces.
  • data such as vehicle status and environmental information are transmitted to the cloud-side server 320 for further processing.
  • Optionally, the cloud-side server can use a variety of neural network models to identify and process these data and feed the identification results back to the computer system 312, so that the computer system 312 can determine whether the assisted automatic driving function should be turned on or off.
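  • A rough sketch of such a vehicle-to-cloud round trip is given below. The endpoint, payload layout, and response field are hypothetical, since this document does not specify the wire format; the `requests` HTTP client is used purely for illustration.

```python
import json
import requests  # third-party HTTP client, used here only for illustration

# Hypothetical endpoint; a production system would use its own protocol.
CLOUD_ENDPOINT = "https://example.com/v1/drive-assist/evaluate"

def query_cloud(vehicle_state: dict, environment: dict, timeout_s: float = 0.5) -> bool:
    """Send vehicle status and environment info to the cloud-side server and
    return whether the assisted-driving function should be enabled."""
    payload = {"vehicle_state": vehicle_state, "environment": environment}
    try:
        resp = requests.post(CLOUD_ENDPOINT,
                             data=json.dumps(payload),
                             headers={"Content-Type": "application/json"},
                             timeout=timeout_s)
        resp.raise_for_status()
        return bool(resp.json().get("assist_enabled", False))
    except requests.RequestException:
        # Conservative fallback: driving must never block on the network,
        # so treat errors and timeouts as "do not enable".
        return False
```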
  • server 320 may include a server having multiple computers, such as a load balancing server farm, that exchange information with different nodes of the network for the purpose of receiving, processing, and transmitting data from computer system 312 .
  • the server may be configured similarly to computer system 312 , with processor 330 , memory 340 , instructions 350 , and data 360 .
  • An automated driving system may contain several assisted automated driving functions, such as pre-collision safety braking (pre-collision system, PCS), adaptive cruise control (ACC), lane keeping assist (LKA), cross traffic alert (CTA), rear cross traffic alert (RCTA), blind spot warning (BSW), off-vehicle warning, and traffic jam assist (TJA).
  • As mentioned above, an autonomous vehicle drives on the basis of a preset destination and the surrounding environment of the vehicle obtained by various sensors, and finally delivers the user to the destination along a planned route.
  • However, based on the visual information around the vehicle, the user may have some temporary intentions that differ from simply driving to the destination, for example, wanting to keep a greater distance when the vehicle is close to the car in front.
  • In view of this, the present application provides a method for controlling the driving of a vehicle, so that, while an autonomous vehicle is driving in the automatic driving mode, if the user has a temporary intention, multimodal understanding can be performed on the user's instructions and the environmental information around the vehicle to determine the user's driving intention, and the motion of the vehicle can then be controlled according to that driving intention. The user's temporary intention can thus be executed in the automatic driving mode, further improving the user's experience during automatic driving.
  • FIG. 4 is an example diagram of a method for controlling the driving of a vehicle provided by an embodiment of the present application. The method shown in FIG. 4 can be applied to the vehicle shown in FIG. 1 or the automatic driving system shown in FIG. 2, and it should be understood that the method is performed in the automatic driving mode.
  • the method 400 includes steps S410 to S440, which will be described in detail below.
  • S410: acquire a user instruction. The user instruction includes any one or more of a user natural voice instruction (i.e., a user voice instruction), a user text instruction, and a user air gesture instruction, which is not limited in this application.
  • Temporary intentions can be input to related in-vehicle devices by means of user instructions.
  • For example, the temporary intent may be input into a microphone by means of a natural voice instruction; input into a relevant user action acquisition device by means of an air gesture instruction; or entered directly into a relevant text input device by means of a text instruction, which is not limited in this application.
  • The user text instruction in the above step S410 may be entered directly by the user through a relevant text entry device, or the user's voice instruction or air gesture instruction may be acquired first and then converted into a text instruction through a related device.
  • the present application does not limit the acquisition method of the text command.
  • For example, if the user generates a temporary intention, the user can speak the intention in natural speech to the relevant in-vehicle device (eg, a microphone) in the car.
  • the conversion of natural speech instructions into text instructions may be implemented by automatic speech recognition (ASR).
  • The user's text instruction is then acquired; specifically, the text instruction may be acquired from the ASR module.
  • the air gesture instruction can be converted into a text instruction by the relevant gesture recognition device.
  • The environmental information around the vehicle can be acquired through a photographing device; specifically, an image or video is acquired through the photographing device, so that the environmental information is reflected by the information in the image or video. The environmental information can also be obtained through lidar, vehicle-mounted sensors and/or vehicle networking and the like, which is not limited in this application.
  • the solution will be described by taking the photographing device acquiring the environmental information as an example.
  • the photographing device may obtain video information or image information, or may first obtain video information around the vehicle, and then obtain image information from the video, which is not limited in this application.
  • the acquisition of image information captured by a photographing device is taken as an example for description, but it should be understood that this does not constitute a limitation to the present application.
  • In one implementation, a shooting activation signal may be sent to the photographing apparatus to activate it to photograph image information (ie, environmental information) around the vehicle; the photographing device then photographs the surrounding image information, and the photographed image information is acquired.
  • the photographing device may periodically photograph image information around the vehicle.
  • acquiring image information around the vehicle may include: acquiring image information around the vehicle periodically captured by a photographing device.
  • The suitable image information may be the image information newly captured by the photographing device, image information corresponding to a specific time interval estimated according to the recognition time of the natural voice command or air gesture command, or the image information corresponding to the moment the text instruction is acquired. The selection of the image information should be carried out according to the actual situation, which is not limited in this application. A frame-selection sketch is given below.
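  • As a non-limiting sketch of the frame-selection policies just described, the following Python snippet picks either the most recent frame or the frame closest to the estimated command time; the buffer layout and the function name are assumptions introduced here, not part of the present application.

```python
from bisect import bisect_left

def select_frame(frame_buffer, command_time=None):
    """Pick the image that best matches a user command.

    frame_buffer: list of (timestamp, frame) pairs sorted by timestamp.
    command_time: estimated time the command was issued; if None, the
    most recently captured frame is used. (Names and the selection
    policy are illustrative assumptions.)
    """
    if not frame_buffer:
        raise ValueError("no image information has been captured yet")
    if command_time is None:
        return frame_buffer[-1][1]                 # latest frame
    timestamps = [t for t, _ in frame_buffer]
    i = bisect_left(timestamps, command_time)
    # consider the frames just before and just after the command time
    candidates = frame_buffer[max(0, i - 1):i + 1]
    return min(candidates, key=lambda f: abs(f[0] - command_time))[1]
```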
  • S430: perform multimodal understanding on the user's instruction and the environmental information around the vehicle, and determine the user's driving intention. Alternatively, the above step S430 may be expressed as: determining the user's driving intention according to the user's instruction and the environmental information around the vehicle. This means that the solution of the present application does not limit the way the driving intention is determined from the user's instruction and the environmental information; it may also be determined by other means, which is not limited in this application.
  • In the following, performing multimodal understanding on the user's instruction and the environmental information around the vehicle to determine the user's driving intention is used as an example for description.
  • step S430 can be completed in a multi-modal processing module (ie, the multi-modal processing module 540 in FIG. 5 ).
  • the module will be described below with reference to FIG. 5 , and the process of multimodal processing will be described with reference to FIG. 8 and FIG. 9 , which will not be repeated here.
  • the driving intention includes at least one intention, each intention in the at least one intention includes n slots, and each slot in the n slots includes a slot name, a slot value, and a classification of the slot value, n is greater than or equal to 0, and n is an integer.
  • the intent may include at least one of: stop, overtake, slow down, follow, turn, and the like. It should be understood that other intentions may also be included in actual operations, which are not limited in this application.
  • the slot name may include at least one of: a parking position, a speed value, an overtaking or following object, a turning direction, and the like. It should be understood that in actual operation, other slot names may also be included, which are not limited in this application.
  • the classification of the slot value may be: an enumeration type slot value, a text type slot value or an environment type slot value.
  • the enumeration class slot value indicates that the slot value is a predefined enumeration value. For example: the user command is "turn right at the next intersection”. At this time, there is a slot corresponding to the steering orientation. Since the steering orientation can be enumerated, for example, there are only four options for the steering orientation: left, right, straight, U-turn. At this time, the slot value of the slot "turning orientation" is "right", and the slot value can be understood as an enumeration type slot value.
  • the text-type slot value indicates that the slot value is a substring in the user instruction or the text generated according to the user instruction. It should be understood that the slot value at this time is a non-enumerable value.
  • the user command is "stop next to the gas station”. At this time, there is a slot corresponding to the parking position. Since the parking position cannot be enumerated, at this time, the substring in the command can be used. "Beside the station” is used as a slot value, which can be understood as a text-based slot value.
  • the user's instruction is "park at the luxurious hotel in front”. At this time, there is a slot corresponding to the parking position.
  • the text generated according to the instruction can be used.
  • "High-level hotel” is used as a slot value, which can also be understood as a text-based slot value. It should be understood that the above-mentioned descriptions are all described below by taking a user text instruction as an example. Then, the text-type slot value indicates that the slot value may be a substring in the user text instruction or text generated according to the user text instruction, and the following embodiments take this as an example.
  • the environment class slot value indicates that the slot value is identified in the environment information according to the content mentioned in the user instruction.
  • the environmental information when the environmental information is acquired by the photographing device, the environmental information may be image information, then the environment-based slot value may also be an image-based slot value, and the image can reflect the environment around the vehicle. Therefore, the image-type slot value indicates that the slot value is identified in the image information according to the content mentioned in the user instruction.
  • the user command is "drive to the blue car position and pull over to the side”
  • there is a slot corresponding to the parking position Since the parking position is the "blue car position", you can use
  • the rectangular frame identifies the "blu
  • "The driving intention includes at least one intention" means that the driving intention may include one intention or multiple intentions at the same time. For example, when the user instruction is "turn right at the next intersection", it includes a steering intent; when the user instruction is "turn right at the next intersection and stop", it includes a steering intent and a parking intent.
  • "Each intent in the at least one intent includes n slots, each of the n slots includes a slot name, a slot value, and a classification of the slot value, n is greater than or equal to 0, and n is an integer" means that the intent may include one or more slots describing the intent, or may include no slot. If the intent includes slots describing the intent, each slot includes a slot name, a slot value, and a classification of the slot value.
  • For example, when the user instruction is simply "stop", the intent is to stop and there is no slot describing the intent; subsequent operations can be performed directly based on the intent.
  • If the intent includes slots describing the intent, the slot name, slot value, and classification of the slot value corresponding to each slot can be listed according to the user command. For example, the slot name, slot value, and classification of the slot value corresponding to the first slot of the parking intent may be the parking location, the gas station ahead, and the text-type slot value, respectively; those corresponding to the second slot may be the parking position, a rectangular frame (identifying the gas station ahead in the image information), and the image-type slot value. A data-structure sketch of such a driving intention is given below.
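  • For illustration only, the driving-intention structure described above (intents, slots, and the three slot-value classifications) could be represented as follows; all field names and the box format are assumptions introduced here, not definitions from the present application.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple

class SlotValueClass(Enum):          # classification of the slot value
    ENUMERATION = "enumeration"      # predefined enumerable value
    TEXT = "text"                    # substring of / text from the user instruction
    IMAGE = "image"                  # region identified in the image information

@dataclass
class Slot:
    name: str                        # e.g. "parking location"
    value: str
    value_class: SlotValueClass
    box: Optional[Tuple[int, int, int, int]] = None  # (x, y, w, h) for image slots

@dataclass
class Intent:
    name: str                        # e.g. "park"
    slots: List[Slot]                # n >= 0 slots

# The two-slot parking intent from the example above: a text-type slot
# and an image-type slot (the box coordinates here are made up).
driving_intention = [
    Intent("park", [
        Slot("parking location", "the gas station ahead", SlotValueClass.TEXT),
        Slot("parking position", "gas station ahead", SlotValueClass.IMAGE,
             box=(412, 180, 96, 64)),
    ])
]
```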
  • The user's driving intention can be presented to the user through an augmented reality head-up display (AR-HUD) or a central control screen, so that the user can judge the correctness of the multimodal understanding result in a timely manner.
  • For example, the AR-HUD can present the object mentioned by the user on the windshield (such as the rectangular box shown in (a) of FIG. 11), or the central control screen or the like can display the object mentioned by the user.
  • S440: an automatic driving control instruction for the vehicle may be generated according to the driving intention obtained above, so that the vehicle can be controlled according to the automatic driving control instruction in the automatic driving mode.
  • In one implementation, after the driving intention is determined, whether the driving intention is feasible is judged according to the driving intention, the surrounding environment and traffic regulations; if the driving intention is feasible, the automatic driving control instruction for the vehicle is then generated. Therefore, violations of traffic laws or other problems can be avoided when executing the user's driving intention in the automatic driving mode, ensuring both the user experience during automatic driving and the safety of automatic driving.
  • If the driving intention is not feasible, prompt information may be generated and sent to the user; the prompt information may also include the reason why the driving intention is not feasible. A sketch of this judgment step is given below.
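  • A minimal sketch of this judgment step under hypothetical interfaces follows; it only illustrates the control flow (feasible intention leads to control commands, infeasible intention leads to a prompt with a reason), not the actual decision planning algorithm.

```python
def plan_from_intention(intention, environment, traffic_rules):
    """Return (vehicle control commands, prompt) for a driving intention.

    Illustrative control flow only: `traffic_rules.first_violation`,
    `environment.is_achievable` and `environment.plan_motion` are
    hypothetical interfaces, not part of the present application.
    """
    for intent in intention:
        violation = traffic_rules.first_violation(intent, environment)
        if violation is not None:                  # e.g. a no-parking zone
            return None, f"Cannot execute '{intent.name}': {violation}"
        if not environment.is_achievable(intent):  # blocked in current scene
            return None, f"Cannot execute '{intent.name}' in the current surroundings"
    commands = [environment.plan_motion(intent) for intent in intention]
    return commands, "Executing your request"
```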
  • If the driving intention is feasible, the vehicle can also prompt the user through a voice broadcast, such as "parking for you"; it can also use the AR-HUD or the central control screen to display the target path and target position of the vehicle to the user (eg, the dynamic arrows and boxes shown in (b) of FIG. 11).
  • the above-mentioned method 400 may be executed on a cloud server or an edge cloud server, or may be executed in a computer system of a vehicle, which is not limited in this application.
  • In summary, in the automatic driving mode of the vehicle, the user's driving intention can be determined by acquiring the user instruction and the environmental information around the vehicle and performing multimodal understanding on them; then, according to the user's driving intention, an automatic driving control instruction for the vehicle is generated.
  • In this way, the user's temporary driving intention can be executed without the user manually taking over control, so that the user's experience during automatic driving can be improved.
  • FIG. 5 is an example diagram of a system architecture provided by an embodiment of the present application. It should be understood that the system architecture is only an example, and does not constitute a limitation to the present application. As shown in FIG. 5 , the system architecture 500 includes: a microphone 510, an automatic speech recognition (ASR) module 520, a camera 530 (ie, a photographing device), a multimodal processing module 540, a decision planning calculation module 550 and Vehicle motion control module 560 . These modules are described below.
  • Microphone 510: a microphone or microphone group deployed in the vehicle cockpit, used to collect the audio information of the user in the cockpit, that is, the user's voice command involved in this application, which may also be referred to as the user's natural voice command.
  • ASR module 520: used to recognize the user's natural voice instruction collected by the microphone 510 and convert it into a text instruction.
  • Camera 530: a camera or camera group deployed on the vehicle, used to collect image information around the vehicle.
  • Multimodal processing module 540: mainly includes a multimodal intent recognition engine. It receives the text instruction recognized by the ASR module 520 and the image information collected by the camera 530, and generates the corresponding driving intention according to the text instruction and the image information. In some cases, the multimodal processing module 540 can also control the camera 530 to collect image information, as shown in Embodiment 1 below.
  • Decision planning calculation module 550: used to judge the driving intention generated by the multimodal processing module 540 in combination with traffic regulations, the surrounding environment and other conditions to determine whether the driving intention is feasible, adjust the driving intention where necessary, and generate vehicle control commands.
  • Vehicle motion control module 560: used to control the vehicle motion according to the vehicle control command from the decision planning calculation module 550. A minimal sketch of how these modules might be wired together follows.
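  • For illustration only, the following Python sketch wires the modules of system architecture 500 together; all class and method names (transcribe, capture, understand, plan, execute) are assumptions introduced here, not interfaces defined by the present application.

```python
class DrivingAssistant:
    """Minimal wiring of the FIG. 5 modules (all interfaces hypothetical)."""

    def __init__(self, asr, camera, multimodal, planner, motion):
        self.asr = asr                # ASR module 520
        self.camera = camera          # camera 530
        self.multimodal = multimodal  # multimodal processing module 540
        self.planner = planner        # decision planning calculation module 550
        self.motion = motion          # vehicle motion control module 560

    def on_voice_command(self, audio):
        text = self.asr.transcribe(audio)         # voice -> text instruction
        image = self.camera.capture()             # or the latest periodic frame
        intention = self.multimodal.understand(text, image)
        commands, prompt = self.planner.plan(intention)
        if commands is not None:                  # intention judged feasible
            for cmd in commands:
                self.motion.execute(cmd)
        return prompt                             # fed back via voice / AR-HUD
```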
  • FIG. 6 is an example diagram of a specific implementation provided by an embodiment of the present application. As shown in FIG. 6 , the specific implementation includes steps 1 to 11, and these steps are described in detail below.
  • Step 1: The user issues a voice command.
  • Step 2: Send the natural voice command.
  • The microphone 510 sends the received natural voice instruction to the ASR module 520.
  • Step 3: Recognize the voice command.
  • The ASR module 520 performs voice recognition on the received voice command and identifies the text command corresponding to the voice command.
  • Step 4: Transmit the user text instruction.
  • The ASR module 520 transmits the recognized text instruction to the multimodal processing module 540.
  • Step 5: Send a shooting activation signal.
  • After receiving the text instruction, the multimodal processing module 540 sends a shooting activation signal to the camera 530 to activate the camera 530 to collect surrounding image information.
  • Step 6: Capture image information around the vehicle.
  • After the camera 530 receives the shooting activation signal, it captures image information around the vehicle.
  • Step 7: Send the image information around the vehicle.
  • The camera 530 sends the captured image information around the vehicle to the multimodal processing module 540.
  • Step 8: Perform multimodal understanding based on the text instruction and image information.
  • The multimodal processing module 540 performs multimodal understanding based on the text instruction and image information, and obtains the user's driving intention.
  • Step 9: Send the driving intention.
  • The multimodal processing module 540 sends the driving intention identified in step 8 to the decision planning calculation module 550.
  • Step 10: Determine whether the intention is feasible.
  • The user's driving intention may not comply with traffic laws (for example, the user requests to drive against the direction of a one-way street, or requests to stop at an intersection where parking is prohibited); the driving intention may not be realizable in the current surrounding environment; or other circumstances may prevent the driving intention from being realized.
  • Therefore, the decision planning calculation module 550 needs to judge whether the driving intention is feasible according to the driving intention combined with necessary information such as the surrounding environment and traffic regulations, generate prompt information according to the judgment result, and notify the user. If the judgment result is infeasible, the user's driving intention cannot be executed, and the user can be informed of the reason. If the judgment result is feasible, step 11 is executed.
  • Step 11: Adjust the driving parameters of the vehicle according to the driving intention, the surrounding environment, traffic regulations and other information.
  • The decision planning calculation module 550 determines the specific vehicle motion control instruction according to the driving intention, the surrounding environment, traffic regulations and other necessary information, and sends it to the vehicle motion control module 560.
  • The vehicle motion control module 560 performs the specific execution operation according to the vehicle motion control instruction.
  • control instruction of the vehicle motion may be modified according to the actual situation, so that the vehicle continues to drive in the automatic driving mode to the final destination to be reached by the user.
  • FIG. 7 is an example diagram of another specific implementation manner provided by an embodiment of the present application. As shown in FIG. 7 , the specific implementation includes steps 1 to 10, and these steps are described in detail below.
  • Steps 1 to 4: Reference may be made to steps 1 to 4 in the previous implementation (FIG. 6), which will not be repeated here.
  • Step 5: Periodically capture image information around the vehicle.
  • The camera 530 periodically captures image information around the vehicle.
  • Step 6: Send the image information around the vehicle.
  • The camera 530 periodically sends the captured image information around the vehicle to the multimodal processing module 540.
  • Step 7: Perform multimodal understanding based on the text instruction and image information.
  • The multimodal processing module 540 obtains the user's driving intention by performing multimodal understanding on the text instruction and the image information at an appropriate time.
  • The image information at the appropriate time may be the latest image information, or may be image information corresponding to a specific time interval estimated according to the recognition time of the natural voice instruction.
  • Steps 8 to 10: Reference may be made to steps 9 to 11 in the previous implementation (FIG. 6), which will not be repeated here.
  • FIG. 8 is an example diagram of a multimodal processing process provided by an embodiment of the present application.
  • As shown in FIG. 8, in multimodal processing, the user instruction and the environmental information are input into the multimodal processing module, multimodal understanding is carried out by the module, and finally the driving intention is output.
  • The multimodal processing module is obtained through pre-training. Specifically, in the training process, user instructions (such as user voice instructions, user text instructions or user air gesture instructions), environmental information (such as image information), and the corresponding driving intentions can be used as training data to train the multimodal processing module, as shown in FIG. 10, so that in the application stage, after the user instruction and the environmental information are input, the corresponding driving intention can be output.
  • FIG. 9 is an exemplary diagram of another multimodal processing process provided by an embodiment of the present application.
  • In FIG. 9, text instructions are used as the user instructions and image information is used as the environmental information.
  • FIG. 9 is only a structural example of the multimodal processing module shown in FIG. 8 and does not constitute a limitation to the present application. In practice, the structure of the multimodal processing module can take other forms and can be composed of other processing models, networks or modules, as long as it can take the input text instructions and image information and output the driving intention.
  • the multimodal processing process in this example will be described below with reference to FIG. 9 .
  • the multimodal processing module may include a text processing model, a convolutional neural network (CNN), an attention module att.1 and an attention module att.2.
  • the text processing model may be a BERT model commonly used in text processing, or may be other models that can be used for text processing, which is not limited in this application.
  • the CNN network can be a deep residual network (Deep residual network, ResNet), etc., which is not limited.
  • the process of the multimodal processing module for understanding the driving intent can be as follows:
  • the text instruction extracts the corresponding text features through the BERT model; the image information extracts the corresponding image features through the CNN network (eg: ResNet).
  • the attention module att.1 is used to synthesize the text features with the image features, so as to obtain at least one intent and n slots corresponding to each intent in the at least one intent, where n is greater than or equal to 0, and n is an integer.
  • Each of the n slots includes a slot name, a slot value and a classification of the slot value, where the classification of the slot value is an enumeration-type slot value, a text-type slot value or an image-type slot value (see the description of the driving intention in FIG. 4).
  • If the slot value of a certain slot corresponding to the intent obtained by the attention module att.1 is classified as an image-type slot value, the image features are further integrated with the text features through the attention module att.2, so as to obtain the slot value of that slot, that is, the rectangular frame of the object mentioned in the user text instruction, for example, the rectangular frame corresponding to the blue car in FIG. 11.
  • The information obtained by att.1 and att.2 together constitutes the driving intention. A structural sketch of this module is given below.
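  • For illustration only, the FIG. 9 structure could be sketched in PyTorch as follows; the stand-in text encoder (in place of BERT), the dimensions, the vocabularies and the output heads are assumptions introduced here, and the real module need not take this form.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class MultimodalIntentEngine(nn.Module):
    """Sketch of the FIG. 9 structure: a text encoder (stand-in for BERT),
    a CNN image encoder (ResNet), att.1 fusing text with image features to
    predict intents and slots, and att.2 grounding image-type slot values."""

    def __init__(self, vocab_size=30000, d=256, n_intents=8, n_slot_tags=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d)
        enc = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        self.text_enc = nn.TransformerEncoder(enc, num_layers=2)
        cnn = resnet18(weights=None)
        self.img_enc = nn.Sequential(*list(cnn.children())[:-2])  # (B,512,h,w)
        self.img_proj = nn.Linear(512, d)
        self.att1 = nn.MultiheadAttention(d, 4, batch_first=True)  # text -> image
        self.att2 = nn.MultiheadAttention(d, 4, batch_first=True)  # image -> text
        self.intent_head = nn.Linear(d, n_intents)   # intent classification
        self.slot_head = nn.Linear(d, n_slot_tags)   # per-token slot tagging
        self.box_head = nn.Linear(d, 4)              # (x, y, w, h) rectangle

    def forward(self, token_ids, image):
        txt = self.text_enc(self.embed(token_ids))           # (B, T, d)
        img = self.img_enc(image)                            # (B, 512, h, w)
        img = self.img_proj(img.flatten(2).transpose(1, 2))  # (B, h*w, d)
        fused_txt, _ = self.att1(txt, img, img)              # att.1
        intent_logits = self.intent_head(fused_txt[:, 0])    # first token summary
        slot_logits = self.slot_head(fused_txt)
        fused_img, _ = self.att2(img, txt, txt)              # att.2
        box = self.box_head(fused_img.mean(dim=1))           # grounded rectangle
        return intent_logits, slot_logits, box
```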
  • FIG. 10 is an example diagram of a training method for a multimodal processing module provided by an embodiment of the present application. As shown in FIG. 10, the training method 1000 includes steps S1010 and S1020, and the steps are described below.
  • S1010: Obtain training data. The training data includes training input data and training target data; the training input data includes user instructions and environmental information around the vehicle, and the training target data includes the driving intention corresponding to the training input data.
  • S1020: Train the multimodal processing module according to the training input data and the training target data.
  • the driving intent includes at least one intent, each intent in the at least one intent includes n slots, and each of the n slots includes a slot name, a slot value, and a classification of the slot value, where n is greater than or equal to 0, where n is an integer.
  • the intent includes at least one of: stop, overtake, slow down, follow, turn, and the like.
  • the slot name includes at least one of: a parking position, a speed value, an overtaking or following object, a turning direction, and the like.
  • Slot values are classified as: enumeration type slot value, text type slot value or environment type slot value.
  • The enumeration-type slot value indicates that the slot value is a predefined enumeration value; the text-type slot value indicates that the slot value is a substring of the user instruction or text generated according to the user instruction; and the environment-type slot value indicates that the slot value is identified in the environment information according to the content mentioned in the user instruction. A training-step sketch is given below.
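  • A training-step sketch under these definitions might look as follows, reusing the model sketched above; the loss composition and the batch fields are assumptions introduced here, since the training objective is not specified by the present application.

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, batch):
    """One training iteration for the multimodal processing module
    (steps S1010/S1020). The loss combines intent classification, slot
    tagging and, for image-type slots, bounding-box regression; all
    batch field names are illustrative assumptions.
    """
    intent_logits, slot_logits, box = model(batch["token_ids"], batch["image"])
    loss = F.cross_entropy(intent_logits, batch["intent_label"])
    loss = loss + F.cross_entropy(                 # per-token slot tags
        slot_logits.flatten(0, 1), batch["slot_tags"].flatten())
    mask = batch["has_image_slot"]                 # samples with image slots
    if mask.any():
        loss = loss + F.l1_loss(box[mask], batch["box_target"][mask])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```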
  • FIG. 11 is an example diagram of an application scenario provided by an embodiment of the present application. It should be understood that the application scenario shown in FIG. 11 is only an example, and does not constitute a limitation to the present application. The application scenario is described below with reference to FIG. 11 .
  • For example, the user of an autonomous driving vehicle temporarily generates a new driving intention while the vehicle is driving in the autonomous driving mode according to a preset destination, and issues a natural voice command, such as "drive to the blue car position and pull over", to the relevant device on the vehicle (for example, the microphone on top of the vehicle).
  • The relevant on-board device on the vehicle, such as the ASR module, recognizes the natural voice command and converts it into a text command.
  • The device or related module on the vehicle for controlling the driving of the vehicle determines the user's temporary intention (that is, the user needs to park on the roadside near the blue car ahead) through the above method 400, then generates the appropriate vehicle control command according to that temporary driving intention and issues it to the vehicle.
  • In addition, the vehicle can provide user feedback through voice announcements and/or the augmented reality head-up display (AR-HUD). As shown in (b) of FIG. 11, the vehicle can prompt the user through a voice broadcast, such as "stopping for you", and can also display the target path and target position of the vehicle to the user through the AR-HUD.
  • This application scenario can also be understood as a user display interface, which can present the driving intention to the user, such as the rectangular frame shown in (a) of FIG. 11, and can present the target path and target position for travel, shown as the arrows and boxes in (b) of FIG. 11.
  • FIG. 12 is an example diagram of a device for controlling the driving of a vehicle provided by an embodiment of the present application.
  • the apparatus 1200 includes an acquisition unit 1210 and a processing unit 1220 .
  • the obtaining unit 1210 is configured to obtain user instructions.
  • the acquiring unit 1210 is further configured to acquire environmental information around the vehicle.
  • the processing unit 1220 is configured to perform multimodal understanding on user instructions and environmental information around the vehicle to determine the user's driving intention.
  • the processing unit 1220 is further configured to generate an automatic driving control instruction for the vehicle according to the user's driving intention.
  • The driving intent may include at least one intent, each intent in the at least one intent includes n slots, and each of the n slots includes a slot name, a slot value, and a classification of the slot value, where n is greater than or equal to 0 and n is an integer.
  • the intent may include at least one of: stop, overtake, slow down, follow, turn, and the like.
  • the slot name may include at least one of: a parking position, a speed value, an overtaking or following object, a turning direction, and the like.
  • The classification of the slot value may be: an enumeration-type slot value, a text-type slot value or an environment-type slot value, where the enumeration-type slot value indicates that the slot value is a predefined enumeration value, the text-type slot value indicates that the slot value is a substring of the user instruction or text generated according to the user instruction, and the environment-type slot value indicates that the slot value is identified in the environment information according to the content mentioned in the user instruction.
  • the processing unit 1220 may also be used to: determine whether the driving intention is feasible according to the driving intention, the surrounding environment and traffic regulations; if the driving intention is feasible, generate an automatic driving control instruction for the vehicle.
  • the user instruction includes any one or more of a user voice instruction, a user text instruction, and a user air gesture instruction.
  • the apparatus 1200 may further include: a sending unit 1230, the sending unit 1230 may be configured to send a photographing activation signal to the photographing apparatus, so as to activate the photographing apparatus to photograph the environmental information around the vehicle;
  • the acquiring unit 1210 may also be configured to: acquire the environmental information around the vehicle photographed by the photographing device according to the photographing activation signal.
  • the acquiring unit 1210 may be further configured to: acquire environmental information around the vehicle periodically photographed by the photographing device.
  • the user's driving intention can be presented to the user through an augmented reality-head-up display AR-HUD or a central control screen.
  • FIG. 13 is a training device for a multimodal processing module provided by an embodiment of the present application.
  • the apparatus 1300 includes an acquisition unit 1310 and a processing unit 1320 .
  • the obtaining unit 1310 is configured to obtain training data, the training data includes training input data and training target data, the training input data includes user instructions and environmental information around the vehicle, and the training target data includes the driving intention corresponding to the training input data.
  • the processing unit 1320 is configured to train the multimodal processing module according to the training input data and the training target data.
  • The driving intention may include at least one intention, each intention in the at least one intention includes n slots, and each slot in the n slots includes a slot name, a slot value, and a classification of the slot value, where n is greater than or equal to 0 and n is an integer.
  • the intent may include at least one of: stop, overtake, slow down, follow, turn, and the like.
  • the slot name may include at least one of: a parking position, a speed value, an overtaking or following object, a turning direction, and the like.
  • The slot value can be classified as: an enumeration-type slot value, a text-type slot value or an environment-type slot value, where the enumeration-type slot value indicates that the slot value is a predefined enumeration value, the text-type slot value indicates that the slot value is a substring of the user instruction or text generated according to the user instruction, and the environment-type slot value indicates that the slot value is identified in the environment information according to the content mentioned in the user instruction.
  • FIG. 14 is a schematic structural diagram of an apparatus provided by an embodiment of the present application.
  • the apparatus 1400 includes a processor 1402 , a communication interface 1403 and a memory 1404 .
  • one example of the apparatus 1400 may be a chip.
  • Another example of apparatus 1400 may be a computing device.
  • the processor 1402, the memory 1404 and the communication interface 1403 can communicate through a bus.
  • Executable code is stored in the memory 1404, and the processor 1402 reads the executable code in the memory 1404 to execute the corresponding method.
  • the memory 1404 may also include other software modules required for running processes such as an operating system.
  • the operating system can be LINUX TM , UNIX TM , WINDOWS TM and the like.
  • the executable code in the memory 1404 is used to implement the method shown in FIG. 4 or FIG. 10
  • the processor 1402 reads the executable code in the memory 1404 to execute the method shown in FIG. 4 or FIG. 10 .
  • the processor 1402 may be a CPU.
  • Memory 1404 may include volatile memory, such as random access memory (RAM).
  • RAM random access memory
  • The memory 1404 may also include non-volatile memory (NVM), such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD) or a solid state drive (SSD).
  • An example computer program product 1500 is provided using a signal bearing medium 1501.
  • the signal bearing medium 1501 may include one or more program instructions 1502 that, when executed by one or more processors, may provide the functions or portions of the functions described above with respect to the methods shown in FIG. 4 or FIG. 10 .
  • one or more of the features of S410 to S440 may be undertaken by one or more instructions associated with the signal bearing medium 1501 .
  • The signal bearing medium 1501 may include a computer readable medium 1503, such as, but not limited to, a hard drive, a compact disc (CD), a digital video disc (DVD), a digital tape, a memory, a read-only memory (ROM) or a random access memory (RAM), etc.
  • the signal bearing medium 1501 may include a computer recordable medium 1504 such as, but not limited to, memory, read/write (R/W) CDs, R/W DVDs, and the like.
  • signal bearing medium 1501 may include communication medium 1505, such as, but not limited to, digital and/or analog communication media (eg, fiber optic cables, waveguides, wired communication links, wireless communication links, etc.).
  • the signal bearing medium 1501 may be conveyed by a wireless form of communication medium 1505 (eg, a wireless communication medium conforming to the IEEE 802.11 standard or other transmission protocol).
  • the one or more program instructions 1502 may be, for example, computer-executable instructions or logic-implemented instructions.
  • For example, the aforementioned computing device may be configured to provide various operations, functions, or actions in response to the program instructions 1502 conveyed to the computing device via one or more of the computer readable medium 1503, the computer recordable medium 1504, and/or the communication medium 1505. It should be understood that the arrangements described herein are for illustrative purposes only.
  • a component may be, but is not limited to, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer.
  • an application running on a computing device and the computing device may be components.
  • One or more components may reside within a process and/or thread of execution, and a component may be localized on one computer and/or distributed between two or more computers.
  • these components can execute from various computer readable media having various data structures stored thereon.
  • A component may, for example, communicate by way of local and/or remote processes, such as based on a signal having one or more data packets (eg, data from one component interacting with another component in a local system, in a distributed system, and/or across a network such as the Internet, interacting with other systems by way of the signal).
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • The division of the units is only a logical function division; in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium.
  • In essence, the technical solution of the present application, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Transportation (AREA)
  • Mechanical Engineering (AREA)
  • Human Computer Interaction (AREA)
  • Control Of Driving Devices And Active Controlling Of Vehicle (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention relates to a method and an apparatus for controlling the driving of a vehicle, and to a vehicle. The method comprises: in an automatic driving mode of a vehicle, obtaining a user instruction; obtaining environmental information around the vehicle; performing multimodal understanding on the user instruction and the environmental information around the vehicle and determining a driving intention of the user; and generating an automatic driving control instruction for the vehicle according to the driving intention of the user.
PCT/CN2021/084731 2021-03-31 2021-03-31 Procédé et appareil de commande de déplacement de véhicule et véhicule WO2022205211A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202180001475.0A CN113226886A (zh) 2021-03-31 2021-03-31 控制车辆行驶的方法、装置及车辆
PCT/CN2021/084731 WO2022205211A1 (fr) 2021-03-31 2021-03-31 Procédé et appareil de commande de déplacement de véhicule et véhicule

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/084731 WO2022205211A1 (fr) 2021-03-31 2021-03-31 Procédé et appareil de commande de déplacement de véhicule et véhicule

Publications (1)

Publication Number Publication Date
WO2022205211A1 true WO2022205211A1 (fr) 2022-10-06

Family

ID=77081297

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/084731 WO2022205211A1 (fr) 2021-03-31 2021-03-31 Procédé et appareil de commande de déplacement de véhicule et véhicule

Country Status (2)

Country Link
CN (1) CN113226886A (fr)
WO (1) WO2022205211A1 (fr)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113460092A (zh) * 2021-09-01 2021-10-01 国汽智控(北京)科技有限公司 车辆控制的方法、装置、设备、存储介质及产品
CN114043987A (zh) * 2021-10-13 2022-02-15 集度科技有限公司 指令处理方法、装置、终端和存储介质
CN114171025A (zh) * 2021-12-09 2022-03-11 阿维塔科技(重庆)有限公司 自动驾驶方法、装置、电子设备及计算机可读存储介质
CN114283601A (zh) * 2021-12-23 2022-04-05 深圳创维-Rgb电子有限公司 车辆驾驶方法、系统、电视机以及存储介质
CN114475632B (zh) * 2022-03-11 2022-11-01 阿波罗智能技术(北京)有限公司 自动驾驶控制数据确定方法、装置、设备及存储介质
CN115457959B (zh) * 2022-11-08 2023-02-10 广州小鹏汽车科技有限公司 语音交互方法、服务器及计算机可读存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140365228A1 (en) * 2013-03-15 2014-12-11 Honda Motor Co., Ltd. Interpretation of ambiguous vehicle instructions
US20150062168A1 (en) * 2013-03-15 2015-03-05 Honda Motor Co., Ltd. System and method for providing augmented reality based directions based on verbal and gestural cues
CN109426256A (zh) * 2017-09-05 2019-03-05 百度(美国)有限责任公司 自动驾驶车辆的基于驾驶员意图的车道辅助系统
CN110023178A (zh) * 2016-12-12 2019-07-16 苹果公司 使用意图信号指导目的地附近的自主车辆
CN111008532A (zh) * 2019-12-12 2020-04-14 广州小鹏汽车科技有限公司 语音交互方法、车辆和计算机可读存储介质
CN111026873A (zh) * 2019-10-24 2020-04-17 中国人民解放军军事科学院国防科技创新研究院 无人车及其导航方法、装置

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190163331A1 (en) * 2017-11-28 2019-05-30 International Business Machines Corporation Multi-Modal Dialog Broker
US11455982B2 (en) * 2019-01-07 2022-09-27 Cerence Operating Company Contextual utterance resolution in multimodal systems

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140365228A1 (en) * 2013-03-15 2014-12-11 Honda Motor Co., Ltd. Interpretation of ambiguous vehicle instructions
US20150062168A1 (en) * 2013-03-15 2015-03-05 Honda Motor Co., Ltd. System and method for providing augmented reality based directions based on verbal and gestural cues
CN110023178A (zh) * 2016-12-12 2019-07-16 苹果公司 使用意图信号指导目的地附近的自主车辆
CN109426256A (zh) * 2017-09-05 2019-03-05 百度(美国)有限责任公司 自动驾驶车辆的基于驾驶员意图的车道辅助系统
CN111026873A (zh) * 2019-10-24 2020-04-17 中国人民解放军军事科学院国防科技创新研究院 无人车及其导航方法、装置
CN111008532A (zh) * 2019-12-12 2020-04-14 广州小鹏汽车科技有限公司 语音交互方法、车辆和计算机可读存储介质

Also Published As

Publication number Publication date
CN113226886A (zh) 2021-08-06

Similar Documents

Publication Publication Date Title
WO2022205211A1 (fr) Procédé et appareil de commande de déplacement de véhicule et véhicule
CN110550029B (zh) 障碍物避让方法及装置
WO2022016457A1 (fr) Procédé et dispositif de commande de commutation de mode de conduite de véhicule
WO2021102955A1 (fr) Procédé et appareil de planification de trajet pour véhicule
WO2022027304A1 (fr) Procédé et appareil de test de véhicule autonome
WO2021212379A1 (fr) Procédé et appareil de détection de ligne de délimitation de voie
EP4234356A2 (fr) Vérification a distance du nombre de passagers dans un véhicule autonome
WO2022057737A1 (fr) Procédé de commande de stationnement et dispositif associé
WO2022062825A1 (fr) Procédé, dispositif de commande de véhicule et véhicule
US20230048680A1 (en) Method and apparatus for passing through barrier gate crossbar by vehicle
US20230350405A1 (en) Methods and Systems for Gradually Adjusting Vehicle Sensor Perspective using Remote Assistance
CN113954858A (zh) 一种规划车辆行驶路线的方法以及智能汽车
WO2022052872A1 (fr) Procédé et appareil de conduite autonome
WO2022022344A1 (fr) Procédé et appareil de commande de conduite automatique
WO2022062582A1 (fr) Procédé et appareil de commande de temps d'apport de lumière d'un module de caméra
WO2022017307A1 (fr) Procédé, appareil et système de génération de scénarios de conduite autonome
WO2022061702A1 (fr) Procédé, appareil et système pour des alertes de conduite
EP4159564A1 (fr) Procédé et dispositif de planification de paramètres de mouvement longitudinal de véhicule
EP4130921A1 (fr) Procédé d'optimisation de la régulation et de la commande de prise de décision, procédé de commande de déplacement de véhicule et dispositifs associés
WO2023015510A1 (fr) Procédé d'évitement de collision et appareil de commande
US20230195107A1 (en) Systems, Methods, and Apparatus for using Remote Assistance to Annotate Images of an Environment
WO2022001432A1 (fr) Procédé d'inférence de voie et procédé et appareil d'entraînement de modèle d'inférence de voie
WO2022127502A1 (fr) Procédé et dispositif de commande
WO2022061725A1 (fr) Procédé et appareil d'observation d'élément de circulation
US20230196784A1 (en) Systems, Methods, and Apparatus for using Remote Assistance to Classify Objects in an Environment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21933874

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21933874

Country of ref document: EP

Kind code of ref document: A1