US20210334652A1 - Method and device for on-vehicle active learning to be used for training perception network of autonomous vehicle - Google Patents


Info

Publication number
US20210334652A1
Authority
US
United States
Prior art keywords
frames
scene
codes
specific
autonomous vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US17/204,287
Other versions
US11157813B1 (en)
Inventor
Hongmo Je
Bongnam Kang
Yongjoong Kim
Sung An Gweon
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Stradvision Inc
Original Assignee
Stradvision Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US17/204,287 (US11157813B1)
Application filed by Stradvision Inc
Priority to CN202180020419.1A (CN115279643A)
Priority to KR1020217040053A (KR102589764B1)
Priority to JP2021576476A (JP7181654B2)
Priority to EP21168387.5A (EP3901822B1)
Priority to PCT/KR2021/004714 (WO2021215740A1)
Assigned to StradVision, Inc. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: GWEON, SUNG AN; JE, HONGMO; KANG, BONGNAM; KIM, YONGJOONG
Application granted granted Critical
Publication of US11157813B1
Publication of US20210334652A1
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06V 20/40 Scenes; scene-specific elements in video content
    • B60W 60/001 Drive control systems specially adapted for autonomous road vehicles: planning or execution of driving tasks
    • B60R 21/0134 Electrical circuits for triggering passive safety arrangements, e.g. airbags, safety belt tighteners, including means for detecting collisions, impending collisions or roll-over, responsive to imminent contact with an obstacle, e.g. using radar systems
    • B60W 30/08 Active safety systems predicting or avoiding probable or impending collision or attempting to minimise its consequences
    • B60W 40/02 Estimation or calculation of non-directly measurable driving parameters, related to ambient conditions
    • G06N 3/04 Neural networks: architecture, e.g. interconnection topology
    • G06N 3/08 Neural networks: learning methods
    • G06V 10/764 Image or video recognition or understanding using pattern recognition or machine learning: classification, e.g. of video objects
    • G06V 10/82 Image or video recognition or understanding using pattern recognition or machine learning: neural networks
    • B60W 2050/005 Details of the control system: sampling
    • B60W 2400/00 Indexing codes relating to detected, measured or calculated conditions or factors
    • B60W 2420/403 Image sensing, e.g. optical camera
    • B60Y 2300/08 Predicting or avoiding probable or impending collision
    • B60Y 2400/30 Sensors
    • G05D 1/0231 Control of position or course in two dimensions, specially adapted to land vehicles, using optical position detecting means
    • G06V 20/44 Event detection (scene-specific elements in video content)

Definitions

  • the present disclosure relates to a method for on-vehicle active learning to be used for training a perception network of an autonomous vehicle; and more particularly, to the method for selecting training data, to be used for training the perception network, from real-time data of the autonomous vehicle, and for training the perception network with the selected training data, and an on-vehicle active learning device using the same.
  • deep learning, which uses a neural network including multiple hidden layers between an input layer and an output layer, shows high performance on object identification.
  • the neural network is generally trained via backpropagation using one or more losses.
  • the autonomous vehicle is a vehicle driven without any action of a driver, in response to driving information and driving environments of the vehicle, and uses a perception network based on deep learning in order to detect driving environment information, e.g., objects, lanes, traffic signals, etc. near the vehicle.
  • Such an autonomous vehicle requires online learning, that is, training with the perception network installed, in order to update the perception network.
  • however, since the storage capacity of an embedded system for the autonomous vehicle is limited, the autonomous vehicle must perform data sampling on a database, e.g., cloud storage, in which the training data are stored, in order to acquire some part of the training data and update the perception network using that part of the training data.
  • conventionally, sampling methods such as random sampling, metadata sampling, manual curation sampling, etc. have been used for performing the data sampling.
  • however, such sampling methods are inappropriate for on-vehicle active learning, since they must store all data under an offline condition in order to perform the active learning.
  • a method for on-vehicle active learning to be used for training a perception network of an autonomous vehicle including steps of: (a) an on-vehicle active learning device, if a driving video and sensing information are acquired respectively from a camera and one or more sensors mounted on an autonomous vehicle while the autonomous vehicle is driven, performing or supporting another device to perform a process of inputting one or more consecutive frames of the driving video and the sensing information into a scene code assigning module, to thereby allow the scene code assigning module to generate each of one or more scene codes including information on each of scenes in each of the frames and information on one or more driving events by referring to the frames and the sensing information; and (b) the on-vehicle active learning device performing or supporting another device to perform at least one of (i) a process of selecting a first part of the frames, whose object detection information generated during the driving events satisfies a preset condition, as specific frames to be used for training the perception network of the autonomous vehicle, by using each of the scene codes of each of the frames and the object detection information, for each of the frames, detected by an object detector, and (ii) a process of storing the specific frames and their corresponding specific scene codes in a frame storing part such that the specific frames and their corresponding specific scene codes match with one another.
  • the method further includes a step of: (c) the on-vehicle active learning device performing or supporting another device to perform (c1) a process of sampling the specific frames stored in the frame storing part by using the specific scene codes to thereby generate training data and (c2) a process of executing on-vehicle learning of the perception network of the autonomous vehicle by using the training data.
  • the on-vehicle active learning device performs or supports another device to perform at least one of (i) a process of under-sampling the specific frames by referring to the scene codes or a process of over-sampling the specific frames by referring to the scene codes, to thereby generate the training data and thus train the perception network, at the step of (c1) and (ii) (ii-1) a process of calculating one or more weight-balanced losses on the training data, corresponding to the scene codes, by weight balancing and (ii-2) a process of training the perception network via backpropagation using the weight-balanced losses, at the step of (c2).
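The under-/over-sampling and weight balancing just described can be sketched in Python. The inverse-frequency weighting scheme, the function names, and the scene-code keys below are illustrative assumptions, not the implementation disclosed here.

```python
from collections import Counter
import random

def balance_by_scene_code(frames, scene_codes, target_per_code=None):
    """Under-/over-sample frames so every scene code is roughly equally
    represented in the training data (illustrative sketch)."""
    by_code = {}
    for frame, code in zip(frames, scene_codes):
        by_code.setdefault(code, []).append(frame)
    if target_per_code is None:
        target_per_code = max(len(group) for group in by_code.values())
    balanced = []
    for group in by_code.values():
        if len(group) >= target_per_code:          # under-sample majority codes
            balanced += random.sample(group, target_per_code)
        else:                                      # over-sample minority codes
            balanced += [random.choice(group) for _ in range(target_per_code)]
    return balanced

def weight_balanced_losses(losses, scene_codes):
    """Scale each per-frame loss by the inverse frequency of its scene code,
    so rare scenes contribute as much gradient as common ones."""
    counts = Counter(scene_codes)
    total = len(scene_codes)
    return [loss * (total / (len(counts) * counts[code]))
            for loss, code in zip(losses, scene_codes)]
```

With equal counts per scene code the weights reduce to 1, so common scenes are not penalized; only imbalance changes the effective loss.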
  • the on-vehicle active learning device performs or supports another device to perform a process of allowing the scene code assigning module to (i) apply a learning operation to each of the frames, to thereby classify each of the scenes of each of the frames into one of classes of driving environments and one of classes of driving roads and thus generate each of class codes of each of the frames, via a scene classifier based on deep learning, (ii) detect each of driving events, which occurs while the autonomous vehicle is driven, by referring to each of the frames and each piece of the sensing information on each of the frames, to thereby generate each of event codes, via a driving event detecting module, and (iii) generate each of the scene codes for each of the frames by using each of the class codes of each of the frames and each of the event codes of each of the frames.
  • the on-vehicle active learning device performs or supports another device to perform a process of allowing the scene code assigning module to (i) detect one or more scene changes in the frames via the driving event detecting module and thus generate one or more frame-based event codes and (ii) detect one or more operation states, corresponding to the sensing information, of the autonomous vehicle and thus generate one or more vehicle-based event codes, to thereby generate the event codes.
  • the on-vehicle active learning device performs or supports another device to perform a process of selecting a certain frame, on which no object is detected from its collision area, corresponding to a collision event, as one of the specific frames by referring to the scene codes, wherein the collision area is an area, in the certain frame, where an object is estimated as being located if the autonomous vehicle collides with the object or where the object is estimated to be located if the autonomous vehicle is estimated to collide with the object.
  • the on-vehicle active learning device performs or supports another device to perform a process of selecting a certain frame, on which an object is detected from its collision area, corresponding to a normal event, as one of the specific frames by referring to the scene codes, wherein the collision area is an area, in the certain frame, where an object is estimated as being located if the autonomous vehicle collides with the object or where the object is estimated to be located if the autonomous vehicle is estimated to collide with the object.
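The two collision-area selection rules above (no object detected in the collision area during a collision event, or an object detected there during a normal event) can be illustrated with a small sketch. The frame dictionaries, event labels, and axis-aligned overlap test are hypothetical details chosen for the example.

```python
def overlaps(box, area):
    """Axis-aligned overlap test; boxes and areas are (x1, y1, x2, y2)."""
    return not (box[2] <= area[0] or box[0] >= area[2] or
                box[3] <= area[1] or box[1] >= area[3])

def select_by_collision_area(frames, collision_area):
    """Flag frames whose detections contradict the driving event:
    a collision event with an empty collision area suggests a missed
    detection; a normal event with an occupied collision area suggests
    a false alarm. Both are useful training frames."""
    selected = []
    for frame in frames:
        hit = any(overlaps(box, collision_area) for box in frame["boxes"])
        if frame["event"] == "collision" and not hit:
            selected.append(frame)            # likely false negative
        elif frame["event"] == "normal" and hit:
            selected.append(frame)            # likely false positive
    return selected
```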
  • the on-vehicle active learning device performs or supports another device to perform a process of selecting a certain frame where an object, with its confidence score included in the object detection information equal to or lower than a preset value, is located as one of the specific frames.
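The confidence-based rule amounts to a simple filter over the object detection information; the frame representation and the preset value of 0.5 are assumptions for illustration.

```python
def select_low_confidence(frames, threshold=0.5):
    """Select frames containing at least one detection whose confidence
    score is equal to or lower than the preset threshold; such frames
    are uncertain cases worth labeling and training on."""
    return [frame for frame in frames
            if any(score <= threshold for score in frame["scores"])]
```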
  • the on-vehicle active learning device performs or supports another device to perform a process of selecting a certain frame, from which a pedestrian in a rare driving environment is detected, as one of the specific frames, by referring to the scene codes.
  • an on-vehicle active learning device for on-vehicle active learning to be used for training a perception network of an autonomous vehicle, including: at least one memory that stores instructions; and at least one processor configured to execute the instructions to perform or support another device to perform: (I) if a driving video and sensing information are acquired respectively from a camera and one or more sensors mounted on an autonomous vehicle while the autonomous vehicle is driven, a process of inputting one or more consecutive frames of the driving video and the sensing information into a scene code assigning module, to thereby allow the scene code assigning module to generate each of one or more scene codes including information on each of scenes in each of the frames and information on one or more driving events by referring to the frames and the sensing information and (II) at least one of (i) a process of selecting a first part of the frames, whose object detection information generated during the driving events satisfies a preset condition, as specific frames to be used for training the perception network of the autonomous vehicle by using each of the scene codes of each of the frames and the object detection information, for each of the frames, detected by an object detector, and (ii) a process of storing the specific frames and their corresponding specific scene codes in a frame storing part such that the specific frames and their corresponding specific scene codes match with one another.
  • the processor further performs or supports another device to perform: (III) (III-1) a process of sampling the specific frames stored in the frame storing part by using the specific scene codes to thereby generate training data and (III-2) a process of executing on-vehicle learning of the perception network of the autonomous vehicle by using the training data.
  • the processor performs or supports another device to perform at least one of (i) a process of under-sampling the specific frames by referring to the scene codes or a process of over-sampling the specific frames by referring to the scene codes, to thereby generate the training data and thus train the perception network, at the process of (III-1) and (ii) (ii-1) a process of calculating one or more weight-balanced losses on the training data, corresponding to the scene codes, by weight balancing and (ii-2) a process of training the perception network via backpropagation using the weight-balanced losses, at the process of (III-2).
  • the processor performs or supports another device to perform a process of allowing the scene code assigning module to (i) apply a learning operation to each of the frames, to thereby classify each of the scenes of each of the frames into one of classes of driving environments and one of classes of driving roads and thus generate each of class codes of each of the frames, via a scene classifier based on deep learning, (ii) detect each of driving events, which occurs while the autonomous vehicle is driven, by referring to each of the frames and each piece of the sensing information on each of the frames, to thereby generate each of event codes, via a driving event detecting module, and (iii) generate each of the scene codes for each of the frames by using each of the class codes of each of the frames and each of the event codes of each of the frames.
  • the processor performs or supports another device to perform a process of allowing the scene code assigning module to (i) detect one or more scene changes in the frames via the driving event detecting module and thus generate one or more frame-based event codes and (ii) detect one or more operation states, corresponding to the sensing information, of the autonomous vehicle and thus generate one or more vehicle-based event codes, to thereby generate the event codes.
  • the processor performs or supports another device to perform a process of selecting a certain frame, on which no object is detected from its collision area, corresponding to a collision event, as one of the specific frames by referring to the scene codes, wherein the collision area is an area, in the certain frame, where an object is estimated as being located if the autonomous vehicle collides with the object or where the object is estimated to be located if the autonomous vehicle is estimated to collide with the object.
  • the processor performs or supports another device to perform a process of selecting a certain frame, on which an object is detected from its collision area, corresponding to a normal event, as one of the specific frames by referring to the scene codes, wherein the collision area is an area, in the certain frame, where an object is estimated as being located if the autonomous vehicle collides with the object or where the object is estimated to be located if the autonomous vehicle is estimated to collide with the object.
  • the processor performs or supports another device to perform a process of selecting a certain frame where an object, with its confidence score included in the object detection information equal to or lower than a preset value, is located as one of the specific frames.
  • the processor performs or supports another device to perform a process of selecting a certain frame, from which a pedestrian in a rare driving environment is detected, as one of the specific frames, by referring to the scene codes.
  • in addition, recordable media that are readable by a computer and store a computer program to execute the method of the present disclosure are further provided.
  • FIG. 1 is a drawing schematically illustrating an on-vehicle active learning device for on-vehicle active learning to be used for training a perception network of an autonomous vehicle in accordance with one example embodiment of the present disclosure.
  • FIG. 2 is a drawing schematically illustrating a method for the on-vehicle active learning in accordance with one example embodiment of the present disclosure.
  • FIG. 3 is a drawing schematically illustrating a method for generating a scene code during processes of the on-vehicle active learning in accordance with one example embodiment of the present disclosure.
  • FIG. 4 is a drawing schematically illustrating a method for determining a useful frame, which has a degree of usefulness higher than a threshold usefulness value, for the on-vehicle active learning in accordance with one example embodiment of the present disclosure.
  • FIG. 5 is a drawing schematically illustrating another method for determining the useful frame for the on-vehicle active learning in accordance with one example embodiment of the present disclosure.
  • the on-vehicle active learning device 1000 may include a memory 1001 which stores one or more instructions for performing the on-vehicle active learning of one or more consecutive frames in a driving video acquired from the autonomous vehicle, and a processor 1002 which performs functions for the on-vehicle active learning in response to the instructions stored in the memory 1001 .
  • the on-vehicle active learning device 1000 may typically achieve a desired system performance by using combinations of at least one computing device and at least one computer software, e.g., a computer processor, a memory, a storage, an input device, an output device, or any other conventional computing components, an electronic communication device such as a router or a switch, and an electronic information storage system such as a network-attached storage (NAS) device or a storage area network (SAN) as the computing device, together with any instructions that allow the computing device to function in a specific way as the computer software.
  • the processor of the computing device may include a hardware configuration such as an MPU (micro processing unit) or a CPU (central processing unit), cache memory, a data bus, etc. Additionally, the computing device may further include a software configuration of an OS and applications that achieve specific purposes.
  • however, the description of the computing device does not exclude an integrated device including any combination of a processor, a memory, a medium, or any other computing components for implementing the present disclosure.
  • the on-vehicle active learning device 1000 may perform or support another device to perform a process of inputting one or more consecutive frames of the driving video and the sensing information into a scene code assigning module 1200 , to thereby allow the scene code assigning module 1200 to generate each of one or more scene codes including information on each of scenes in each of the frames and information on one or more driving events by referring to the frames and the sensing information.
  • each of the scene codes may be created by encoding, e.g., codifying, information on each of the scenes of each of the frames and information on the driving events.
  • the scene code assigning module 1200 may perform or support another device to perform a process of applying a learning operation to each of the frames and thus classifying each of the scenes of each of the frames into one of preset classes of driving environments and one of preset classes of driving roads, to thereby generate each of class codes of each of the frames, via a scene classifier 1210 based on deep learning. That is, the scene classifier 1210 may extract features of each of the frames and classify the extracted features into one of the classes of the driving environments and one of the classes of the driving roads, to thereby generate each of the class codes of each of the frames.
  • the driving environments may include information on weather and information on a time zone of an area where the autonomous vehicle is driven, but the scope of the present disclosure is not limited thereto, and may include various information on weather in a local area or region where the autonomous vehicle is driven.
  • the information on weather may include information on weather phenomena like sunshine, rain, snow, fog, etc. and the information on the time zone may include information like day, night, etc.
  • the driving roads may include types of roads, e.g., a highway, an urban road, a tunnel, etc., where the autonomous vehicle is driven, but the scope of the present disclosure is not limited thereto, and may include various road environments where the autonomous vehicle is driven.
  • the scene code assigning module 1200 may perform or support another device to perform a process of detecting each of the driving events, which occurs while the autonomous vehicle is driven, by referring to each of the frames and each piece of the sensing information on each of the frames, to thereby generate each of event codes, via a driving event detecting module 1220 .
  • the event codes may include (1) frame-based event codes detected by using the consecutive frames and (2) vehicle-based event codes detected by using the sensing information.
  • the scene code assigning module 1200 may perform or support another device to perform a process of inputting the consecutive frames into a scene change detector of the driving event detecting module 1220 , to thereby allow the scene change detector to detect whether each of the scenes of each of the consecutive frames is changed and thus generate each of the frame-based event codes corresponding to each of the frames.
  • the frame-based event codes may include codes respectively corresponding to a uniform sample, a scene change, etc. according to whether the scenes are changed.
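The disclosure does not specify how the scene change detector works; one plausible sketch compares coarse intensity histograms of consecutive frames and assigns the "uniform sample" or "scene change" frame-based event code accordingly. The histogram metric and the threshold are assumptions.

```python
def histogram(frame, bins=8):
    """Coarse intensity histogram of a frame given as a flat sequence
    of pixel values in [0, 255]."""
    hist = [0] * bins
    for px in frame:
        hist[min(px * bins // 256, bins - 1)] += 1
    return hist

def assign_frame_event_codes(frames, threshold=0.5):
    """Label each frame 'scene_change' when its histogram differs
    strongly from the previous frame's, else 'uniform_sample'."""
    codes = ["uniform_sample"]
    prev = histogram(frames[0])
    for frame in frames[1:]:
        cur = histogram(frame)
        total = sum(cur)
        # normalized L1 distance between consecutive histograms, in [0, 1]
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / (2 * total)
        codes.append("scene_change" if diff > threshold else "uniform_sample")
        prev = cur
    return codes
```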
  • the scene code assigning module 1200 may perform or support another device to perform a process of detecting operation states of the autonomous vehicle by using the sensing information and thus detecting events which occur while the autonomous vehicle is driven, to thereby generate vehicle-based event codes.
  • the vehicle-based event codes may include codes respectively corresponding to rapid steering, rapid brake slamming, a normal action, an AEB-activated action, etc.
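A minimal mapping from one sensing sample to the vehicle-based event codes listed above might look as follows; the sensor field names and thresholds are assumptions, not values from the disclosure.

```python
def vehicle_event_code(sensing, steer_rate_thresh=30.0, brake_thresh=0.7):
    """Map a sensing sample (dict) to a vehicle-based event code,
    checking the most safety-critical condition first."""
    if sensing.get("aeb_active"):
        return "aeb_activated"
    if abs(sensing.get("steering_rate_deg_s", 0.0)) > steer_rate_thresh:
        return "rapid_steering"
    if sensing.get("brake_pressure", 0.0) > brake_thresh:
        return "rapid_brake_slamming"
    return "normal_action"
```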
  • the scene code assigning module 1200 may perform or support another device to perform a process of generating each of the scene codes of each of the frames by using each of the class codes of each of the frames and each of the event codes of each of the frames.
  • the following table may indicate each of the scene codes assigned to each of the frames.
    Class code                                   Event code
    Driving environment      Driving road        Frame-based        Vehicle-based
    (weather/time)                               event code         event code
    sunshine, rain, snow,    highway/city/       uniform sample/    rapid steering/
    fog, etc.; day/night     tunnel              scene change       rapid brake slamming/
                                                                    normal action/
                                                                    AEB activated
  • scene codes listed in the above table are not to be taken in a limiting sense, and various types of the scene codes of the frames in the driving video can be generated.
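Composing a scene code from the class codes and event codes in the table can be sketched as a simple record type; the field names and the separator used for encoding are illustrative choices, not part of the disclosure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SceneCode:
    weather_time: str   # class code: driving environment, e.g. "rain/night"
    road: str           # class code: driving road, e.g. "highway"
    frame_event: str    # frame-based event code, e.g. "scene_change"
    vehicle_event: str  # vehicle-based event code, e.g. "rapid_brake_slamming"

    def encode(self) -> str:
        # codify the scene as a single string key, usable for grouping,
        # matching, and sampling frames by scene
        return "|".join((self.weather_time, self.road,
                         self.frame_event, self.vehicle_event))
```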
  • the driving video and the sensing information may be inputted into a driving video & driving information analyzing module 1110 of the autonomous vehicle.
  • the driving video & driving information analyzing module 1110 may perform or support another device to perform a process of applying a learning operation to the consecutive frames of the driving video, to thereby detect information on a nearby environment of the autonomous vehicle, for example, information on objects, such as vehicles, pedestrians, etc., information on lanes, information on traffic signal of the driving road, etc. via the perception network, and a process of detecting information on the operation states of the autonomous vehicle by referring to the sensing information.
  • the information on the nearby environment and the information on the operation states of the autonomous vehicle may be transmitted to an autonomous driving controlling part 1500 , and the autonomous driving controlling part 1500 may control operation of the autonomous vehicle by using the information on the nearby environment and the information on the operation states.
  • the driving video & driving information analyzing module 1110 may perform or support another device to perform a process of detecting objects from the frames of the driving video, to thereby generate object detection information of each of the frames, via an object detector based on deep learning, for example, the object detector based on a convolutional neural network (CNN), or a process of segmenting the frames of the driving video, to thereby generate the information on the lanes on each of the frames, via a segmentation network based on deep learning.
  • the driving video & driving information analyzing module 1110 may also perform or support another device to perform a process of outputting the information on the operation states of the autonomous vehicle.
  • the information on the operation states may include information on driving conditions of the autonomous vehicle respectively corresponding to an acceleration, a deceleration, a steering wheel operation, an activation of the autonomous emergency braking (AEB), etc. of the autonomous vehicle.
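As a sketch of how such operation states might be mapped to vehicle-based event codes (the thresholds and code names below are assumptions for illustration, not values from the disclosure):

```python
def vehicle_event_code(decel_mps2: float, steer_rate_dps: float,
                       aeb_active: bool) -> str:
    """Map operation-state readings to a vehicle-based event code.

    The thresholds (6 m/s^2 deceleration, 90 deg/s steering-wheel
    rate) are illustrative placeholders."""
    if aeb_active:
        return "aeb_activated"
    if decel_mps2 >= 6.0:            # hard brake slamming
        return "rapid_brake_slamming"
    if abs(steer_rate_dps) >= 90.0:  # fast steering-wheel movement
        return "rapid_steering"
    return "normal_action"
```

For example, a reading of 7.2 m/s² deceleration with AEB inactive would be coded as `rapid_brake_slamming` under these assumed thresholds.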
  • the on-vehicle active learning device 1000 may perform or support another device to perform a process of selecting frames useful for the training data, with which the perception network of the autonomous vehicle is to be trained, by using each of the scene codes of each of the frames and the object detection information on each of the frames detected by the object detector, via a frame selecting module 1300 and a process of storing the frames, selected as the training data, in a frame storing part 1400 .
  • the on-vehicle active learning device 1000 may perform or support another device to perform a process of allowing the frame selecting module 1300 to select frames, i.e., images, which are useful for training the perception network based on deep learning of the autonomous vehicle, among the consecutive frames acquired from the driving video.
  • the frame selecting module 1300 may select the frames useful for training the perception network in various ways.
  • the on-vehicle active learning device 1000 may perform or support another device to perform (i) a process of selecting a first part of the frames, whose object detection information generated during the driving events satisfies a preset condition, as specific frames to be used for training the perception network of the autonomous vehicle, via a frame selecting module 1300 , by using each of the scene codes of each of the frames and the object detection information, for each of the frames, detected by an object detector and (ii) a process of storing the specific frames and their corresponding specific scene codes in a frame storing part 1400 , i.e., a memory with limited capacity installed on the autonomous vehicle, such that the specific frames and their corresponding specific scene codes match with one another.
  • the on-vehicle active learning device 1000 may perform or support another device to perform a process of selecting a second part of the frames, matching with a training policy of the perception network of the autonomous vehicle, as the specific frames among the frames by using the scene codes and the object detection information, via the frame selecting module 1300 and a process of storing the specific frames and their corresponding specific scene codes in the frame storing part 1400 such that the specific frames and their corresponding specific scene codes match with one another.
  • the on-vehicle active learning device 1000 may perform or support another device to perform a process of selecting a certain frame, which has a collision area where no object is detected in a collision event, as one of the specific frames useful for training the perception network by referring to the scene codes.
  • the collision event may be a driving event performed in a situation, e.g., a sudden braking, a sudden right turn, a sudden left turn, etc., in which an operation state of the autonomous vehicle represents a traffic collision or an estimated traffic collision.
  • the collision event may include an event where braking of the autonomous vehicle occurs when a traffic collision is expected to be imminent, but the scope of the present disclosure is not limited thereto.
  • the collision area may be an area, in the certain frame, where an object is estimated as being located if the autonomous vehicle collides with the object or where the object is estimated to be located if the autonomous vehicle is estimated to collide with the object.
  • as an example, when an event code of the autonomous vehicle corresponds to a sudden braking, a sudden right turn, a sudden left turn, etc., an object should be detected in the collision area. However, if no object is detected in the collision area on one of the frames of the driving video, a false negative is suspected, and therefore said one of the frames may be selected as one of the specific frames useful for training the perception network.
  • the on-vehicle active learning device 1000 may perform or support another device to perform a process of selecting a certain frame, which has the collision area where an object is detected in a normal event, as one of the specific frames useful for training the perception network by referring to the scene codes.
  • the normal event may be an event where the autonomous vehicle is driven normally without any accidents or collisions.
  • in this case, since the autonomous vehicle is driven normally without any collision, an object detected in the collision area suggests a false positive, and therefore said one of the frames may be selected as one of the specific frames useful for training the perception network.
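The two collision-area rules above can be sketched together as a single check (the event-code names and the integer object count are hypothetical simplifications):

```python
def is_hard_example(event_code: str, num_objects_in_collision_area: int) -> bool:
    """Flag frames whose detections contradict the driving event.

    During a collision event an object is expected inside the collision
    area, so an empty area hints at a false negative; during a normal
    event an object inside the collision area hints at a false positive.
    Either way, the frame is a candidate for training the perception
    network."""
    if event_code == "collision" and num_objects_in_collision_area == 0:
        return True   # suspected missed detection
    if event_code == "normal" and num_objects_in_collision_area > 0:
        return True   # suspected spurious detection
    return False
```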
  • the on-vehicle active learning device 1000 may perform or support another device to perform a process of selecting a certain frame, where an object with its confidence score included in the object detection information equal to or lower than a preset value is located, as one of the specific frames which are useful for training the perception network.
  • on frames where every confidence score included in the object detection information is higher than the preset value, the perception network is determined as properly operating; therefore, such frames may be determined as not useful for training the perception network and be discarded.
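A minimal sketch of this confidence-based selection, assuming detections are available as (label, confidence) pairs per frame (the data layout is illustrative):

```python
def select_low_confidence_frames(detections_per_frame, preset_value=0.5):
    """Select frame ids containing at least one detection whose
    confidence score is equal to or lower than the preset value;
    frames where every detection is confident are discarded."""
    return [frame_id
            for frame_id, dets in detections_per_frame.items()
            if any(conf <= preset_value for _, conf in dets)]

dets = {
    "f1": [("car", 0.95), ("pedestrian", 0.35)],  # uncertain pedestrian
    "f2": [("car", 0.90)],                        # all confident
}
# select_low_confidence_frames(dets) keeps only "f1"
```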
  • the on-vehicle active learning device 1000 may perform or support another device to perform a process of selecting a certain frame, from which a pedestrian in a rare driving environment is detected, as one of the specific frames which are useful for training the perception network by referring to the scene codes.
  • a frame where a pedestrian is detected may be determined as a hard example, that is, an example which has the degree of usefulness higher than a threshold usefulness value, to be used for training the perception network and thus said frame may be determined as useful for training the perception network.
  • in contrast, if a pedestrian is detected in a common driving environment, the perception network may be determined as sufficiently trained on such scenes, and therefore said frame may be determined as not useful for training the perception network in order to avoid overfitting.
  • the method described above for determining whether the frames of the driving video are useful for training the perception network or not is just an example. That is, the scope of the present disclosure is not limited thereto and the method may vary by set conditions.
  • the frame selecting module 1300 may determine whether the frames of the driving video are useful for training the perception network or not by using a trained network, i.e., a trained deep learning network.
  • the frame selecting module 1300 may perform or support another device to perform a process of inputting the frames into an auto labeling network 1310 and the trained deep learning network 1320 , respectively. Thereafter, by performing an output comparison, that is, a process of comparing an output from the auto labeling network 1310 with an output from the trained deep learning network 1320 , the frames may be determined as useful or not for training the perception network. If the outputs are identical or similar to each other, the frames may be determined as not useful. And, if a difference between the outputs is equal to or greater than a predetermined value, the frames may be considered as hard examples and determined as useful for training the perception network.
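The output comparison might be sketched as follows; the set-based representation of detections is a deliberate simplification (a real system would match bounding boxes, e.g. by IoU), and the threshold is illustrative:

```python
def disagreement(auto_label_output, trained_net_output):
    """Count detections on which the auto labeling network and the
    trained deep learning network disagree. Outputs are simplified
    to sets of (label, grid_cell) pairs."""
    return len(set(auto_label_output) ^ set(trained_net_output))

def is_useful(auto_label_output, trained_net_output, predetermined_value=1):
    """A frame is a hard example (useful for training) when the two
    outputs differ by at least the predetermined value."""
    return disagreement(auto_label_output, trained_net_output) >= predetermined_value
```

Identical outputs yield a disagreement of zero, so the frame is discarded; any missed or extra detection pushes the frame into the hard-example pool.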
  • the frame selecting module 1300 may perform or support another device to perform a process of modifying the frames in various ways, to thereby create various modified frames.
  • the various ways of modifying the frames may include resizing the frames, changing aspect ratios of the frames, changing color tone of the frames, etc.
  • the frame selecting module 1300 may perform or support another device to perform a process of inputting each of the modified frames into the trained deep learning network 1320 .
  • thereafter, by computing a variance of outputs of the trained deep learning network 1320 over the modified frames, the frames may be determined as useful or not for training the perception network. If the computed variance is equal to or smaller than a preset threshold, the frames may be determined as not useful. And, if the computed variance is greater than the preset threshold, the frames may be considered as hard examples and thus determined as useful for training the perception network.
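A toy sketch of this variance test, with a stand-in "model" and scalar frames; the transforms, threshold, and scalar summary of the network output are all illustrative assumptions:

```python
def prediction_variance(model, frame, transforms):
    """Run the trained network on several modified versions of a frame
    and compute the variance of a scalar summary of its output (here,
    a single confidence score). `model` is any callable returning a
    float."""
    scores = [model(t(frame)) for t in transforms]
    mean = sum(scores) / len(scores)
    return sum((s - mean) ** 2 for s in scores) / len(scores)

# Toy illustration: a "model" whose confidence tracks frame brightness,
# probed with identity, darkening, and brightening transforms.
toy_model = lambda brightness: min(1.0, brightness / 255.0)
transforms = [lambda f: f, lambda f: f * 0.8, lambda f: f * 1.2]
variance = prediction_variance(toy_model, 100.0, transforms)
is_hard = variance > 1e-3  # preset threshold (illustrative)
```

A model that is unstable under small modifications of the input produces a high variance, which is exactly the hard-example signal described above.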
  • the on-vehicle active learning device 1000 may perform or support another device to perform (i) a process of sampling the specific frames stored in the frame storing part 1400 by using the specific scene codes to thereby generate training data and (ii) a process of executing on-vehicle learning of the perception network of the autonomous vehicle by using the training data.
  • the on-vehicle active learning device 1000 may perform or support another device to perform (i) a process of under-sampling through selecting a part of the specific frames in a majority class and as many as possible of the specific frames in a minority class by referring to the scene codes or (ii) a process of over-sampling through generating as many copies of the specific frames in the minority class as the number of the specific frames in the majority class, by referring to the scene codes, at the step of sampling the specific frames stored in the frame storing part 1400 , to thereby generate the training data and thus train the perception network with the sampled training data.
  • as an example, suppose that the number of frames corresponding to the majority class is 100 and that the number of frames corresponding to the minority class is 10. Then, if a desired number of frames to be sampled is 30, all ten frames corresponding to the minority class may be selected and twenty frames corresponding to the majority class may be selected.
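The under-sampling example above (10 of 10 minority frames plus 20 of 100 majority frames for a budget of 30) can be reproduced with a rarest-first budget policy, which is one possible reading of the description:

```python
def under_sample(groups, target):
    """Under-sample across scene-code classes: visit classes from the
    rarest to the most common, giving each an even share of whatever
    budget is left, so minority classes are kept whole while majority
    classes absorb the reduction."""
    order = sorted(groups, key=lambda c: len(groups[c]))
    sampled, budget = {}, target
    for i, code in enumerate(order):
        share = budget // (len(order) - i)   # even share of remaining budget
        take = min(len(groups[code]), share)
        sampled[code] = groups[code][:take]  # a real system would draw randomly
        budget -= take
    return sampled

picked = under_sample({"majority": list(range(100)),
                       "minority": list(range(10))}, target=30)
```

With these inputs the minority class contributes all 10 of its frames and the majority class the remaining 20, matching the example.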
  • the on-vehicle active learning device 1000 may perform or support another device to perform a process of calculating one or more weight-balanced losses on the training data, corresponding to the scene codes, by weight balancing, to thereby train the perception network via backpropagation by using the weight-balanced losses, at the step of executing the on-vehicle learning of the perception network by using the specific frames stored in the frame storing part 1400 .
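One common way to realize such weight balancing is inverse-frequency weighting per scene code; the formula below is an assumption for illustration, as the disclosure does not specify one:

```python
def scene_weights(counts):
    """Inverse-frequency weight per scene code, normalized so that the
    average weight over classes is 1. Rare scene codes get weights
    above 1, common ones below 1."""
    total, n = sum(counts.values()), len(counts)
    return {code: total / (n * c) for code, c in counts.items()}

def weight_balanced_loss(per_frame_losses, frame_codes, weights):
    """Scale each frame's loss by its scene-code weight and average;
    the result would be the loss used for backpropagation."""
    scaled = [loss * weights[code]
              for loss, code in zip(per_frame_losses, frame_codes)]
    return sum(scaled) / len(scaled)

w = scene_weights({"common": 90, "rare": 10})  # rare scenes weighted 5.0
loss = weight_balanced_loss([1.0, 1.0], ["common", "rare"], w)
```

This way, frames from rare scene codes contribute more to the gradient, compensating for their scarcity in the sampled training data.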
  • the present disclosure has an effect of providing the method for improving an efficiency of training the perception network with new training data by performing a process of assigning the scene code corresponding to a frame of a video, a process of determining the frame as useful for training or not, and then a process of storing the data in a storage of a vehicle.
  • the present disclosure has another effect of providing the method for performing the on-line active learning on the vehicle itself, through sampling balancing on the training data according to the scene code.
  • the present disclosure has still another effect of providing the method for performing the on-vehicle learning of the perception network of the autonomous vehicle by performing the sampling balancing on the training data according to its corresponding scene code.
  • the embodiments of the present disclosure as explained above can be implemented in a form of executable program command through a variety of computer means recordable to computer readable media.
  • the computer readable media may include solely or in combination, program commands, data files, and data structures.
  • the program commands recorded to the media may be components specially designed for the present disclosure or may be those known to and usable by those skilled in the art.
  • Computer readable media include magnetic media such as a hard disk, a floppy disk, and magnetic tape, optical media such as CD-ROM and DVD, magneto-optical media such as a floptical disk, and hardware devices, such as ROM, RAM, and flash memory, specially designed to store and carry out program commands.
  • Program commands include not only a machine language code produced by a compiler but also a high-level language code that can be executed by a computer using an interpreter, etc.
  • the aforementioned hardware device can work as one or more software modules to perform the operations of the present disclosure, and vice versa.


Abstract

A method of on-vehicle active learning for training a perception network of an autonomous vehicle is provided. The method includes steps of: an on-vehicle active learning device, (a) if a driving video and sensing information are acquired from a camera and sensors on an autonomous vehicle, inputting frames of the driving video and the sensing information into a scene code assigning module to generate scene codes including information on scenes in the frames and on driving events; and (b) at least one of selecting a part of the frames, whose object detection information satisfies a condition, as specific frames by using the scene codes and the object detection information and selecting a part of the frames, matching a training policy, as the specific frames by using the scene codes and the object detection information, and storing the specific frames and specific scene codes in a frame storing part.

Description

    CROSS REFERENCE TO RELATED APPLICATION
  • This application claims the benefit of priority to U.S. Provisional Patent Application No. 63/014,877, filed on Apr. 24, 2020, the entire contents of which are incorporated herein by reference.
  • FIELD OF THE DISCLOSURE
  • The present disclosure relates to a method for on-vehicle active learning to be used for training a perception network of an autonomous vehicle; and more particularly, to the method for selecting training data, to be used for training the perception network, from real-time data of the autonomous vehicle, and for training the perception network with the selected training data, and an on-vehicle active learning device using the same.
  • BACKGROUND OF THE DISCLOSURE
  • Recently, research has been conducted on methods of identifying objects via machine learning technologies.
  • As one of the machine learning technologies, deep learning, which uses a neural network including multiple hidden layers between an input layer and an output layer, has high performance on the object identification.
  • And, the neural network is generally trained via backpropagation using one or more losses.
  • Conventionally, in order to train a deep learning network, raw data were collected according to a data collection policy, and then human labelers performed annotation on the raw data, to thereby generate new training data. Thereafter, by using the new training data and existing training data, the deep learning network was trained, and then, by referring to a result of analysis conducted by human engineers, a training algorithm for the deep learning network was revised and improved. Moreover, by referring to the result of the analysis, the data collection policy and incorrect annotations were revised.
  • However, as a performance of the deep learning network is improved, hard examples useful for training become scarce in such conventional methods. Accordingly, an efficiency of training the deep learning network with new training data becomes less productive, and, therefore, a return on investment from a data annotation performed by the human labelers is reduced.
  • Meanwhile, the autonomous vehicle is a vehicle driven without any action of a driver in response to driving information and driving environments of the vehicle, and uses a perception network based on deep learning in order to detect driving environment information, e.g., objects, lanes, traffic signals, etc., near the vehicle.
  • Such an autonomous vehicle requires online learning, that is, training the perception network while it remains installed on the vehicle, in order to update the perception network. However, since a storage capacity of an embedded system for the autonomous vehicle is limited, the autonomous vehicle must perform data sampling on a database, e.g., cloud storage, in which the training data are stored, in order to acquire some part of the training data and update the perception network using said part of the training data.
  • Conventionally, sampling methods such as a random sampling method, a metadata sampling method, a manual curation sampling method, etc., have been used for performing the data sampling. However, such sampling methods are inappropriate for on-vehicle active learning since they must store all data under an offline condition in order to perform the active learning.
  • SUMMARY OF THE DISCLOSURE
  • It is an object of the present disclosure to solve all the aforementioned problems.
  • It is another object of the present disclosure to provide a method for allowing on-line active learning.
  • It is still another object of the present disclosure to provide a method for improving an efficiency of training a perception network with new training data.
  • It is still yet another object of the present disclosure to provide a method for performing on-vehicle learning of the perception network of an autonomous vehicle.
  • In accordance with one aspect of the present disclosure, there is provided a method for on-vehicle active learning to be used for training a perception network of an autonomous vehicle, including steps of: (a) an on-vehicle active learning device, if a driving video and sensing information are acquired respectively from a camera and one or more sensors mounted on an autonomous vehicle while the autonomous vehicle is driven, performing or supporting another device to perform a process of inputting one or more consecutive frames of the driving video and the sensing information into a scene code assigning module, to thereby allow the scene code assigning module to generate each of one or more scene codes including information on each of scenes in each of the frames and information on one or more driving events by referring to the frames and the sensing information; and (b) the on-vehicle active learning device performing or supporting another device to perform at least one of (i) a process of selecting a first part of the frames, whose object detection information generated during the driving events satisfies a preset condition, as specific frames to be used for training the perception network of the autonomous vehicle by using each of the scene codes of each of the frames and the object detection information, for each of the frames, detected by an object detector and a process of storing the specific frames and their corresponding specific scene codes in a frame storing part such that the specific frames and their corresponding specific scene codes match with one another and (ii) a process of selecting a second part of the frames, matching with a training policy of the perception network of the autonomous vehicle, as the specific frames by using the scene codes and the object detection information and a process of storing the specific frames and their corresponding specific scene codes in the frame storing part such that the specific frames and their 
corresponding specific scene codes match with one another.
  • As one example, the method further includes a step of: (c) the on-vehicle active learning device performing or supporting another device to perform (c1) a process of sampling the specific frames stored in the frame storing part by using the specific scene codes to thereby generate training data and (c2) a process of executing on-vehicle learning of the perception network of the autonomous vehicle by using the training data.
  • As one example, at the step of (c), the on-vehicle active learning device performs or supports another device to perform at least one of (i) a process of under-sampling the specific frames by referring to the scene codes or a process of over-sampling the specific frames by referring to the scene codes, to thereby generate the training data and thus train the perception network, at the step of (c1) and (ii) (ii-1) a process of calculating one or more weight-balanced losses on the training data, corresponding to the scene codes, by weight balancing and (ii-2) a process of training the perception network via backpropagation using the weight-balanced losses, at the step of (c2).
  • As one example, at the step of (a), the on-vehicle active learning device performs or supports another device to perform a process of allowing the scene code assigning module to (i) apply a learning operation to each of the frames, to thereby classify each of the scenes of each of the frames into one of classes of driving environments and one of classes of driving roads and thus generate each of class codes of each of the frames, via a scene classifier based on deep learning, (ii) detect each of driving events, which occurs while the autonomous vehicle is driven, by referring to each of the frames and each piece of the sensing information on each of the frames, to thereby generate each of event codes, via a driving event detecting module, and (iii) generate each of the scene codes for each of the frames by using each of the class codes of each of the frames and each of the event codes of each of the frames.
  • As one example, the on-vehicle active learning device performs or supports another device to perform a process of allowing the scene code assigning module to (i) detect one or more scene changes in the frames via the driving event detecting module and thus generate one or more frame-based event codes and (ii) detect one or more operation states, corresponding to the sensing information, of the autonomous vehicle and thus generate one or more vehicle-based event codes, to thereby generate the event codes.
  • As one example, at the step of (b), the on-vehicle active learning device performs or supports another device to perform a process of selecting a certain frame, on which no object is detected from its collision area, corresponding to a collision event, as one of the specific frames by referring to the scene codes, wherein the collision area is an area, in the certain frame, where an object is estimated as being located if the autonomous vehicle collides with the object or where the object is estimated to be located if the autonomous vehicle is estimated to collide with the object.
  • As one example, at the step of (b), the on-vehicle active learning device performs or supports another device to perform a process of selecting a certain frame, on which an object is detected from its collision area, corresponding to a normal event, as one of the specific frames by referring to the scene codes, wherein the collision area is an area, in the certain frame, where an object is estimated as being located if the autonomous vehicle collides with the object or where the object is estimated to be located if the autonomous vehicle is estimated to collide with the object.
  • As one example, at the step of (b), the on-vehicle active learning device performs or supports another device to perform a process of selecting a certain frame where an object, with its confidence score included in the object detection information equal to or lower than a preset value, is located as one of the specific frames.
  • As one example, at the step of (b), the on-vehicle active learning device performs or supports another device to perform a process of selecting a certain frame, from which a pedestrian in a rare driving environment is detected, as one of the specific frames, by referring to the scene codes.
  • In accordance with another aspect of the present disclosure, there is provided an on-vehicle active learning device for on-vehicle active learning to be used for training a perception network of an autonomous vehicle, including: at least one memory that stores instructions; and at least one processor configured to execute the instructions to perform or support another device to perform: (I) if a driving video and sensing information are acquired respectively from a camera and one or more sensors mounted on an autonomous vehicle while the autonomous vehicle is driven, a process of inputting one or more consecutive frames of the driving video and the sensing information into a scene code assigning module, to thereby allow the scene code assigning module to generate each of one or more scene codes including information on each of scenes in each of the frames and information on one or more driving events by referring to the frames and the sensing information and (II) at least one of (i) a process of selecting a first part of the frames, whose object detection information generated during the driving events satisfies a preset condition, as specific frames to be used for training the perception network of the autonomous vehicle by using each of the scene codes of each of the frames and the object detection information, for each of the frames, detected by an object detector and a process of storing the specific frames and their corresponding specific scene codes in a frame storing part such that the specific frames and their corresponding specific scene codes match with one another and (ii) a process of selecting a second part of the frames, matching with a training policy of the perception network of the autonomous vehicle, as the specific frames by using the scene codes and the object detection information and a process of storing the specific frames and their corresponding specific scene codes in the frame storing part such that the specific frames and their 
corresponding specific scene codes match with one another.
  • As one example, the processor further performs or supports another device to perform: (III) (III-1) a process of sampling the specific frames stored in the frame storing part by using the specific scene codes to thereby generate training data and (III-2) a process of executing on-vehicle learning of the perception network of the autonomous vehicle by using the training data.
  • As one example, at the process of (III), the processor performs or supports another device to perform at least one of (i) a process of under-sampling the specific frames by referring to the scene codes or a process of over-sampling the specific frames by referring to the scene codes, to thereby generate the training data and thus train the perception network, at the process of (III-1) and (ii) (ii-1) a process of calculating one or more weight-balanced losses on the training data, corresponding to the scene codes, by weight balancing and (ii-2) a process of training the perception network via backpropagation using the weight-balanced losses, at the process of (III-2).
  • As one example, at the process of (I), the processor performs or supports another device to perform a process of allowing the scene code assigning module to (i) apply a learning operation to each of the frames, to thereby classify each of the scenes of each of the frames into one of classes of driving environments and one of classes of driving roads and thus generate each of class codes of each of the frames, via a scene classifier based on deep learning, (ii) detect each of driving events, which occurs while the autonomous vehicle is driven, by referring to each of the frames and each piece of the sensing information on each of the frames, to thereby generate each of event codes, via a driving event detecting module, and (iii) generate each of the scene codes for each of the frames by using each of the class codes of each of the frames and each of the event codes of each of the frames.
  • As one example, the processor performs or supports another device to perform a process of allowing the scene code assigning module to (i) detect one or more scene changes in the frames via the driving event detecting module and thus generate one or more frame-based event codes and (ii) detect one or more operation states, corresponding to the sensing information, of the autonomous vehicle and thus generate one or more vehicle-based event codes, to thereby generate the event codes.
  • As one example, at the process of (II), the processor performs or supports another device to perform a process of selecting a certain frame, on which no object is detected from its collision area, corresponding to a collision event, as one of the specific frames by referring to the scene codes, wherein the collision area is an area, in the certain frame, where an object is estimated as being located if the autonomous vehicle collides with the object or where the object is estimated to be located if the autonomous vehicle is estimated to collide with the object.
  • As one example, at the process of (II), the processor performs or supports another device to perform a process of selecting a certain frame, on which an object is detected from its collision area, corresponding to a normal event, as one of the specific frames by referring to the scene codes, wherein the collision area is an area, in the certain frame, where an object is estimated as being located if the autonomous vehicle collides with the object or where the object is estimated to be located if the autonomous vehicle is estimated to collide with the object.
  • As one example, at the process of (II), the processor performs or supports another device to perform a process of selecting a certain frame where an object, with its confidence score included in the object detection information equal to or lower than a preset value, is located as one of the specific frames.
  • As one example, at the process of (II), the processor performs or supports another device to perform a process of selecting a certain frame, from which a pedestrian in a rare driving environment is detected, as one of the specific frames, by referring to the scene codes.
  • In addition, recordable media readable by a computer for storing a computer program to execute the method of the present disclosure are further provided.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The following drawings to be used to explain example embodiments of the present disclosure are only part of example embodiments of the present disclosure and other drawings can be obtained based on the drawings by those skilled in the art of the present disclosure without inventive work.
  • FIG. 1 is a drawing schematically illustrating an on-vehicle active learning device for on-vehicle active learning to be used for training a perception network of an autonomous vehicle in accordance with one example embodiment of the present disclosure.
  • FIG. 2 is a drawing schematically illustrating a method for the on-vehicle active learning in accordance with one example embodiment of the present disclosure.
  • FIG. 3 is a drawing schematically illustrating a method for generating a scene code during processes of the on-vehicle active learning in accordance with one example embodiment of the present disclosure.
  • FIG. 4 is a drawing schematically illustrating a method for determining a useful frame, which has a degree of usefulness higher than a threshold usefulness value, for the on-vehicle active learning in accordance with one example embodiment of the present disclosure.
  • FIG. 5 is a drawing schematically illustrating another method for determining the useful frame for the on-vehicle active learning in accordance with one example embodiment of the present disclosure.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • The detailed explanation of the present disclosure made below refers to the attached drawings and diagrams, illustrated as specific embodiment examples under which the present disclosure may be implemented, to make clear the purposes, technical solutions, and advantages of the present disclosure. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention.
  • Besides, in the detailed description and claims of the present disclosure, a term “include” and its variations are not intended to exclude other technical features, additions, components or steps. Other objects, benefits and features of the present disclosure will be revealed to one skilled in the art, partially from the specification and partially from the implementation of the present disclosure. The following examples and drawings will be provided as examples but they are not intended to limit the present disclosure.
  • Moreover, the present disclosure covers all possible combinations of example embodiments indicated in this specification. It is to be understood that the various embodiments of the present disclosure, although different, are not necessarily mutually exclusive. For example, a particular feature, structure, or characteristic described herein in connection with one embodiment may be implemented within other embodiments without departing from the spirit and scope of the present disclosure. In addition, it is to be understood that the position or arrangement of individual elements within each disclosed embodiment may be modified without departing from the spirit and scope of the present disclosure. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present disclosure is defined only by the appended claims, appropriately interpreted, along with the full range of equivalents to which the claims are entitled. In the drawings, similar reference numerals refer to the same or similar functionality throughout the several aspects.
  • To allow those skilled in the art to carry out the present disclosure easily, example embodiments of the present disclosure will be explained in detail below by referring to the attached diagrams.
  • FIG. 1 is a drawing schematically illustrating an on-vehicle active learning device for on-vehicle active learning, to be used for training a perception network of an autonomous vehicle, in accordance with one example embodiment of the present disclosure. By referring to FIG. 1, the on-vehicle active learning device 1000 may include a memory 1001 which stores one or more instructions for performing the on-vehicle active learning of one or more consecutive frames in a driving video acquired from the autonomous vehicle, and a processor 1002 which performs functions for the on-vehicle active learning in response to the instructions stored in the memory 1001.
  • Specifically, the on-vehicle active learning device 1000 may typically achieve a desired system performance by using combinations of at least one computing device and at least one computer software, e.g., a computer processor, a memory, a storage, an input device, an output device, or any other conventional computing components, an electronic communication device such as a router or a switch, an electronic information storage system such as a network-attached storage (NAS) device and a storage area network (SAN) as the computing device and any instructions that allow the computing device to function in a specific way as the computer software.
  • The processor of the computing device may include hardware configuration of MPU (Micro Processing Unit) or CPU (Central Processing Unit), cache memory, data bus, etc. Additionally, the computing device may further include software configuration of OS and applications that achieve specific purposes.
  • However, such description of the computing device does not exclude an integrated device including any combination of a processor, a memory, a medium, or any other computing components for implementing the present disclosure.
  • Meanwhile, a method for using the on-vehicle active learning device 1000 for the on-vehicle active learning, to be used for training the perception network of the autonomous vehicle, is explained below by referring to FIG. 2 in accordance with one example embodiment of the present disclosure.
  • First, if the driving video and sensing information are acquired respectively from a camera, e.g., an image sensor, and one or more sensors mounted on the autonomous vehicle while the autonomous vehicle is driven, the on-vehicle active learning device 1000 may perform or support another device to perform a process of inputting one or more consecutive frames of the driving video and the sensing information into a scene code assigning module 1200, to thereby allow the scene code assigning module 1200 to generate each of one or more scene codes including information on each of scenes in each of the frames and information on one or more driving events by referring to the frames and the sensing information.
  • Herein, each of the scene codes may be created by encoding, e.g., codifying, information on each of the scenes of each of the frames and information on the driving events.
  • For example, by referring to FIG. 3, the scene code assigning module 1200 may perform or support another device to perform a process of applying a learning operation to each of the frames and thus classifying each of the scenes of each of the frames into one of preset classes of driving environments and one of preset classes of driving roads, to thereby generate each of class codes of each of the frames, via a scene classifier 1210 based on deep learning. That is, the scene classifier 1210 may extract features of each of the frames and classify the extracted features into one of the classes of the driving environments and one of the classes of the driving roads, to thereby generate each of the class codes of each of the frames.
  • Herein, the driving environments may include information on weather and information on a time zone of an area where the autonomous vehicle is driven, but the scope of the present disclosure is not limited thereto, and may include various information on weather in a local area or region where the autonomous vehicle is driven. Also, the information on weather may include information on weather phenomena like sunshine, rain, snow, fog, etc. and the information on the time zone may include information like day, night, etc. Also, the driving roads may include types of roads, e.g., a highway, an urban road, a tunnel, etc., where the autonomous vehicle is driven, but the scope of the present disclosure is not limited thereto, and may include various road environments where the autonomous vehicle is driven.
  • Also, the scene code assigning module 1200 may perform or support another device to perform a process of detecting each of the driving events, which occurs while the autonomous vehicle is driven, by referring to each of the frames and each piece of the sensing information on each of the frames, to thereby generate each of event codes, via a driving event detecting module 1220.
  • Herein, the event codes may include (1) frame-based event codes detected by using the consecutive frames and (2) vehicle-based event codes detected by using the sensing information.
  • As one example, the scene code assigning module 1200 may perform or support another device to perform a process of inputting the consecutive frames into a scene change detector of the driving event detecting module 1220, to thereby allow the scene change detector to detect whether each of the scenes of each of the consecutive frames is changed and thus generate each of the frame-based event codes corresponding to each of the frames. Herein, the frame-based event codes may include codes respectively corresponding to a uniform sample, a scene change, etc. according to whether the scenes are changed. In addition, the scene code assigning module 1200 may perform or support another device to perform a process of detecting operation states of the autonomous vehicle by using the sensing information and thus detecting events which occur while the autonomous vehicle is driven, to thereby generate vehicle-based event codes. Herein, the vehicle-based event codes may include codes respectively corresponding to a rapid steering, rapid brake slamming, normal action, AEB activated action, etc. And, the scene code assigning module 1200 may perform or support another device to perform a process of generating each of the scene codes of each of the frames by using each of the class codes of each of the frames and each of the event codes of each of the frames.
  • The following table may indicate each of the scene codes assigned to each of the frames.
    Class code:
      Driving environment (weather/time): sunshine, rain, snow, fog, etc.; day/night
      Driving road: highway/city/tunnel
    Event code:
      Frame-based event code: uniform sample/scene change
      Vehicle-based event code: rapid steering/rapid brake slamming/normal action/AEB activated
  • However, it should be noted that the scene codes listed in the above table are not to be taken in a limiting sense, and various types of the scene codes of the frames in the driving video can be generated.
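The combination of class codes and event codes into a single scene code can be sketched as follows. This is a minimal illustration assuming a tuple-of-strings encoding; the `SceneCode` structure, the table values, and the function name are hypothetical, as the disclosure does not specify the actual code format.

```python
from dataclasses import dataclass

# Illustrative code tables mirroring the table above; the actual encoding
# used by the scene code assigning module 1200 is not specified.
ENVIRONMENTS = ["sunshine", "rain", "snow", "fog"]
TIMES = ["day", "night"]
ROADS = ["highway", "city", "tunnel"]
FRAME_EVENTS = ["uniform_sample", "scene_change"]
VEHICLE_EVENTS = ["rapid_steering", "rapid_brake", "normal", "aeb_activated"]

@dataclass(frozen=True)
class SceneCode:
    environment: str    # class code: weather
    time: str           # class code: time zone
    road: str           # class code: road type
    frame_event: str    # frame-based event code
    vehicle_event: str  # vehicle-based event code

def assign_scene_code(environment, time, road, frame_event, vehicle_event):
    """Combine class codes and event codes into one scene code per frame."""
    for value, table in [(environment, ENVIRONMENTS), (time, TIMES),
                         (road, ROADS), (frame_event, FRAME_EVENTS),
                         (vehicle_event, VEHICLE_EVENTS)]:
        if value not in table:
            raise ValueError(f"unknown code component: {value}")
    return SceneCode(environment, time, road, frame_event, vehicle_event)
```

Because the scene code is a frozen dataclass, it can serve directly as a dictionary key when frames are later grouped by scene code for sampling.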
  • Herein, by referring to FIG. 2 again, the driving video and the sensing information may be inputted into a driving video & driving information analyzing module 1110 of the autonomous vehicle.
  • Then, the driving video & driving information analyzing module 1110 may perform or support another device to perform a process of applying a learning operation to the consecutive frames of the driving video, to thereby detect information on a nearby environment of the autonomous vehicle, for example, information on objects, such as vehicles, pedestrians, etc., information on lanes, information on traffic signal of the driving road, etc. via the perception network, and a process of detecting information on the operation states of the autonomous vehicle by referring to the sensing information. And, the information on the nearby environment and the information on the operation states of the autonomous vehicle may be transmitted to an autonomous driving controlling part 1500, and the autonomous driving controlling part 1500 may control operation of the autonomous vehicle by using the information on the nearby environment and the information on the operation states.
  • As one example, the driving video & driving information analyzing module 1110 may perform or support another device to perform a process of detecting objects from the frames of the driving video, to thereby generate object detection information of each of the frames, via an object detector based on deep learning, for example, the object detector based on a convolutional neural network (CNN), or a process of segmenting the frames of the driving video, to thereby generate the information on the lanes on each of the frames, via a segmentation network based on deep learning. Also, the driving video & driving information analyzing module 1110 may perform or support another device to perform a process of outputting the information on the operation states of the autonomous vehicle. Herein, the information on the operation states may include information on driving conditions of the autonomous vehicle respectively corresponding to an acceleration, a deceleration, a steering wheel operation, an activation of the autonomous emergency brake (AEB), etc. of the autonomous vehicle.
  • Next, the on-vehicle active learning device 1000 may perform or support another device to perform a process of selecting frames useful for the training data, with which the perception network of the autonomous vehicle is to be trained, by using each of the scene codes of each of the frames and the object detection information on each of the frames detected by the object detector, via a frame selecting module 1300 and a process of storing the frames, selected as the training data, in a frame storing part 1400.
  • That is, the on-vehicle active learning device 1000 may perform or support another device to perform a process of allowing the frame selecting module 1300 to select frames, i.e., images, which are useful for training the perception network based on deep learning of the autonomous vehicle, among the consecutive frames acquired from the driving video.
  • Herein, the frame selecting module 1300 may select the frames useful for training the perception network in various ways.
  • That is, the on-vehicle active learning device 1000 may perform or support another device to perform (i) a process of selecting a first part of the frames, whose object detection information generated during the driving events satisfies a preset condition, as specific frames to be used for training the perception network of the autonomous vehicle, via the frame selecting module 1300, by using each of the scene codes of each of the frames and the object detection information, for each of the frames, detected by the object detector and (ii) a process of storing the specific frames and their corresponding specific scene codes in the frame storing part 1400, i.e., a memory with limited capacity installed on the autonomous vehicle, such that the specific frames and their corresponding specific scene codes match with one another.
  • Also, the on-vehicle active learning device 1000 may perform or support another device to perform a process of selecting a second part of the frames, matching with a training policy of the perception network of the autonomous vehicle, as the specific frames among the frames by using the scene codes and the object detection information, via the frame selecting module 1300 and a process of storing the specific frames and their corresponding specific scene codes in the frame storing part 1400 such that the specific frames and their corresponding specific scene codes match with one another.
  • As one example, the on-vehicle active learning device 1000 may perform or support another device to perform a process of selecting a certain frame, which has a collision area where no object is detected in a collision event, as one of the specific frames useful for training the perception network by referring to the scene codes. Herein, the collision event may be a driving event performed in a situation, e.g., a sudden braking, a sudden right turn, a sudden left turn, etc., in which an operation state of the autonomous vehicle represents a traffic collision or an estimated traffic collision. For example, the collision event may include an event where braking of the autonomous vehicle occurs when a traffic collision is expected to be imminent, but the scope of the present disclosure is not limited thereto. Herein, the collision area may be an area, in the certain frame, where an object is estimated as being located if the autonomous vehicle collides with the object or where the object is estimated to be located if the autonomous vehicle is estimated to collide with the object.
  • That is, if an event code of the autonomous vehicle corresponds to a sudden braking, a sudden right turn, a sudden left turn, etc., an object must be detected in the collision area. However, if no object is detected in the collision area in one of the frames of the driving video, a false negative is suspected, and therefore said frame may be selected as one of the specific frames useful for training the perception network.
  • Also, the on-vehicle active learning device 1000 may perform or support another device to perform a process of selecting a certain frame, which has the collision area where an object is detected in a normal event, as one of the specific frames useful for training the perception network by referring to the scene codes. Herein, the normal event may be an event where the autonomous vehicle is driven normally without any accidents or collisions.
  • That is, if the autonomous vehicle is driven normally without any accidents or collisions, etc., no object should be detected in the collision area. However, if an object is detected in the collision area in one of the frames of the driving video, a false positive, i.e., a false alarm, is suspected, and therefore said frame may be selected as one of the specific frames useful for training the perception network.
  • Also, the on-vehicle active learning device 1000 may perform or support another device to perform a process of selecting a certain frame, where an object with its confidence score included in the object detection information equal to or lower than a preset value is located, as one of the specific frames which are useful for training the perception network.
  • And, for frames corresponding to situations other than the specific situations described above, the perception network is determined as properly operating on such frames, therefore, such frames may be determined as frames not useful for training the perception network and be discarded.
  • Meanwhile, according to a training policy of the perception network, the on-vehicle active learning device 1000 may perform or support another device to perform a process of selecting a certain frame, from which a pedestrian in a rare driving environment is detected, as one of the specific frames which are useful for training the perception network by referring to the scene codes.
  • As one example, in case the scene code corresponds to a rainy night, a frame where a pedestrian is detected may be determined as a hard example, that is, an example which has the degree of usefulness higher than a threshold usefulness value, to be used for training the perception network and thus said frame may be determined as useful for training the perception network. As another example, in case the scene code corresponds to a sunny day, the perception network may be determined as sufficiently trained, and therefore, said frame may be determined as not useful for training the perception network in order to avoid overfitting.
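The selection rules above can be gathered into a single predicate, sketched below. This is a hedged illustration only: the confidence threshold, the set of rare scenes, and the shape of the input dictionaries are assumptions, not values taken from the disclosure.

```python
CONFIDENCE_THRESHOLD = 0.5  # assumed preset value
RARE_SCENES = {("rain", "night"), ("snow", "night"), ("fog", "day"), ("fog", "night")}

def is_useful_frame(scene_code, detections, collision_event):
    """Return True when a frame should be kept as training data.

    scene_code: dict with 'environment' and 'time' keys (illustrative).
    detections: list of dicts with 'class', 'confidence', 'in_collision_area'.
    collision_event: True for sudden braking/turning, False for normal driving.
    """
    object_in_collision_area = any(d["in_collision_area"] for d in detections)
    # Collision event but nothing detected in the collision area:
    # a false negative is suspected.
    if collision_event and not object_in_collision_area:
        return True
    # Normal driving but an object detected in the collision area:
    # a false positive is suspected.
    if not collision_event and object_in_collision_area:
        return True
    # Any detection with a confidence score at or below the preset value.
    if any(d["confidence"] <= CONFIDENCE_THRESHOLD for d in detections):
        return True
    # Training policy: pedestrians in rare driving environments are hard examples.
    scene = (scene_code["environment"], scene_code["time"])
    if scene in RARE_SCENES and any(d["class"] == "pedestrian" for d in detections):
        return True
    return False
```

Frames for which the predicate returns False would be discarded, matching the behavior described for situations where the perception network is deemed to operate properly.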
  • However, it should be noted that the method described above for determining whether the frames of the driving video are useful for training the perception network or not is just an example. That is, the scope of the present disclosure is not limited thereto and the method may vary by set conditions.
  • Meanwhile, the frame selecting module 1300 may determine whether the frames of the driving video are useful for training the perception network or not by using a trained network, i.e., a trained deep learning network.
  • For example, by referring to FIG. 4, the frame selecting module 1300 may perform or support another device to perform a process of inputting the frames into an auto labeling network 1310 and the trained deep learning network 1320, respectively. Thereafter, by performing an output comparison, i.e., a process of comparing an output from the auto labeling network 1310 with an output from the trained deep learning network 1320, the frames may be determined as useful or not for training the perception network. If the outputs are identical or similar to each other, the frames may be determined as not useful. And, if a difference between the outputs is equal to or greater than a predetermined value, the frames may be considered as hard examples and determined as useful for training the perception network.
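One way to realize the output comparison of FIG. 4 is to match the two sets of bounding boxes by intersection-over-union (IoU) and measure how many boxes have no counterpart in the other set. The IoU matching, the box format, and both thresholds are assumptions for illustration; the disclosure leaves the comparison metric open.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    if inter == 0.0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def output_difference(auto_boxes, net_boxes, match_iou=0.5):
    """Fraction of boxes in either output with no counterpart in the other."""
    unmatched = 0
    for box in auto_boxes:
        if not any(iou(box, other) >= match_iou for other in net_boxes):
            unmatched += 1
    for box in net_boxes:
        if not any(iou(box, other) >= match_iou for other in auto_boxes):
            unmatched += 1
    total = len(auto_boxes) + len(net_boxes)
    return unmatched / total if total else 0.0

def is_hard_example(auto_boxes, net_boxes, difference_threshold=0.3):
    """Keep the frame when the two outputs disagree by at least the threshold."""
    return output_difference(auto_boxes, net_boxes) >= difference_threshold
```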
  • As another example, by referring to FIG. 5, the frame selecting module 1300 may perform or support another device to perform a process of modifying the frames in various ways, to thereby create various modified frames. Herein, the various ways of modifying the frames may include resizing the frames, changing aspect ratios of the frames, changing color tone of the frames, etc. And then, the frame selecting module 1300 may perform or support another device to perform a process of inputting each of the modified frames into the trained deep learning network 1320. Thereafter, by computing a variance of output values of each of the modified frames from the trained deep learning network 1320, the frames may be determined as useful or not for training the perception network. If the computed variance is equal to or smaller than a preset threshold, the frames may be determined as not useful. And, if the computed variance is greater than the preset threshold, the frames may be considered as hard examples and thus determined as useful for training the perception network.
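The variance-based test of FIG. 5 can be sketched abstractly by treating the trained network as a callable that returns a scalar score per frame (e.g., a top detection confidence). The score definition, the augmentation callables, and the threshold value are illustrative assumptions.

```python
import statistics

def is_hard_example_by_variance(frame, model, augmentations, variance_threshold=0.01):
    """Run the trained network on several modified copies of one frame and keep
    the frame when the output scores vary more than the threshold.

    model: callable frame -> scalar score (assumed interface).
    augmentations: callables frame -> modified frame, standing in for
    resizing, aspect-ratio changes, color-tone changes, etc.
    """
    scores = [model(aug(frame)) for aug in augmentations]
    # Population variance of the scores across all modified frames:
    # a stable network should score near-identically on mild modifications.
    return statistics.pvariance(scores) > variance_threshold
```

Intuitively, a frame whose predictions are unstable under mild modifications lies near the network's decision boundary, which is exactly what makes it a hard example.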
  • Next, the on-vehicle active learning device 1000 may perform or support another device to perform (i) a process of sampling the specific frames stored in the frame storing part 1400 by using the specific scene codes to thereby generate training data and (ii) a process of executing on-vehicle learning of the perception network of the autonomous vehicle by using the training data.
  • Herein, the on-vehicle active learning device 1000 may perform or support another device to perform (i) a process of under-sampling through selecting a part of the specific frames in a majority class and as many as possible of the specific frames in a minority class by referring to the scene codes or (ii) a process of over-sampling through generating as many copies of the specific frames in the minority class as the number of the specific frames in the majority class, by referring to the scene codes, at the step of sampling the specific frames stored in the frame storing part 1400, to thereby generate the training data and thus train the perception network with the sampled training data. For example, in case that the number of frames corresponding to the majority class is 100 and that the number of frames corresponding to the minority class is 10, then if a desired number of frames to be sampled is 30, ten frames corresponding to the minority class may be selected and twenty frames corresponding to the majority class may be selected.
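The under-sampling arithmetic in the example above (ten minority frames plus twenty majority frames out of a desired thirty) can be reproduced with a small planning function. The even-share policy below is one plausible reading of the example, not the disclosure's prescribed algorithm.

```python
def balanced_sample_counts(class_counts, desired_total):
    """Decide how many frames to sample from each scene-code class.

    Classes are served smallest first, each taking at most an even share of
    what remains, so minority classes keep all of their frames and majority
    classes are cut down to fill the rest of the budget.
    """
    plan = {}
    remaining = desired_total
    for cls, available in sorted(class_counts.items(), key=lambda kv: kv[1]):
        share = remaining // (len(class_counts) - len(plan))
        plan[cls] = min(available, share)
        remaining -= plan[cls]
    return plan
```

With `class_counts = {"majority": 100, "minority": 10}` and `desired_total = 30`, the plan takes all 10 minority frames and 20 majority frames, matching the example.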
  • Also, the on-vehicle active learning device 1000 may perform or support another device to perform a process of calculating one or more weight-balanced losses on the training data, corresponding to the scene codes, by weight balancing, to thereby train the perception network via backpropagation by using the weight-balanced losses, at the step of executing the on-vehicle learning of the perception network by using the specific frames stored in the frame storing part 1400.
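One common form of the weight balancing mentioned above is inverse-frequency weighting of per-scene-code losses, so that rare scene codes contribute as much to the gradient as common ones. The formula below is a standard sketch under that assumption, not the disclosure's specific loss.

```python
def weight_balanced_loss(losses_by_scene_code):
    """Average the per-frame losses with inverse-frequency class weights.

    losses_by_scene_code: dict mapping scene code -> list of per-frame losses.
    """
    total_frames = sum(len(v) for v in losses_by_scene_code.values())
    num_codes = len(losses_by_scene_code)
    weighted = 0.0
    for code, losses in losses_by_scene_code.items():
        # Inverse-frequency weight: rarer scene codes get a larger weight,
        # and with equal frequencies every weight reduces to 1.
        weight = total_frames / (num_codes * len(losses))
        weighted += weight * sum(losses)
    return weighted / total_frames
```

The scalar this returns would then drive backpropagation in place of a plain mean loss.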
  • The present disclosure has an effect of providing the method for improving an efficiency of training the perception network with new training data by performing a process of assigning the scene code corresponding to a frame of a video, a process of determining the frame as useful for training or not, and then a process of storing the data in a storage of a vehicle.
  • The present disclosure has another effect of providing the method for performing the on-line active learning on the vehicle itself, through sampling balancing on the training data according to the scene code.
  • The present disclosure has still another effect of providing the method for performing the on-vehicle learning of the perception network of the autonomous vehicle by performing the sampling balancing on the training data according to its corresponding scene code.
  • The embodiments of the present disclosure as explained above can be implemented in a form of executable program command through a variety of computer means recordable to computer readable media. The computer readable media may include, solely or in combination, program commands, data files, and data structures. The program commands recorded to the media may be components specially designed for the present disclosure or may be usable to those skilled in the art. Computer readable media include magnetic media such as hard disk, floppy disk, and magnetic tape, optical media such as CD-ROM and DVD, magneto-optical media such as floptical disk, and hardware devices such as ROM, RAM, and flash memory specially designed to store and carry out program commands. Program commands include not only a machine language code made by a compiler but also a high-level code that can be used by an interpreter, etc., which is executed by a computer. The aforementioned hardware device can work as one or more software modules to perform the action of the present disclosure, and vice versa.
  • As seen above, the present disclosure has been explained by specific matters such as detailed components, limited embodiments, and drawings. They have been provided only to help more general understanding of the present disclosure. It, however, will be understood by those skilled in the art that various changes and modifications may be made from the description without departing from the spirit and scope of the disclosure as defined in the following claims.
  • Accordingly, the thought of the present disclosure must not be confined to the explained embodiments, and the following patent claims as well as everything including variations equal or equivalent to the patent claims pertain to the category of the thought of the present disclosure.

Claims (18)

1. A method for on-vehicle active learning to be used for training a perception network of an autonomous vehicle, comprising steps of:
(a) an on-vehicle active learning device, if a driving video and sensing information are acquired respectively from a camera and one or more sensors mounted on an autonomous vehicle while the autonomous vehicle is driven, performing or supporting another device to perform a process of inputting one or more consecutive frames of the driving video and the sensing information into a scene code assigning module, to thereby allow the scene code assigning module to generate each of one or more scene codes including information on each of scenes in each of the frames and information on one or more driving events by referring to the frames and the sensing information;
(b) the on-vehicle active learning device performing or supporting another device to perform at least one of (i) a process of selecting a first part of the frames, whose object detection information generated during the driving events satisfies a preset condition, as specific frames to be used for training the perception network of the autonomous vehicle by using each of the scene codes of each of the frames and the object detection information, for each of the frames, detected by an object detector and a process of storing the specific frames and their corresponding specific scene codes in a frame storing part such that the specific frames and their corresponding specific scene codes match with one another and (ii) a process of selecting a second part of the frames, matching with a training policy of the perception network of the autonomous vehicle, as the specific frames by using the scene codes and the object detection information and a process of storing the specific frames and their corresponding specific scene codes in the frame storing part such that the specific frames and their corresponding specific scene codes match with one another; and
(c) the on-vehicle active learning device performing or supporting another device to perform (c1) a process of sampling the specific frames stored in the frame storing part by using the specific scene codes to thereby generate training data and (c2) a process of executing on-vehicle learning of the perception network of the autonomous vehicle by using the training data.
2. (canceled)
3. The method of claim 1, wherein, at the step of (c), the on-vehicle active learning device performs or supports another device to perform at least one of (i) a process of under-sampling the specific frames by referring to the scene codes or a process of over-sampling the specific frames by referring to the scene codes, to thereby generate the training data and thus train the perception network, at the step of (c1) and (ii) (ii-1) a process of calculating one or more weight-balanced losses on the training data, corresponding to the scene codes, by weight balancing and (ii-2) a process of training the perception network via backpropagation using the weight-balanced losses, at the step of (c2).
4. A method for on-vehicle active learning to be used for training a perception network of an autonomous vehicle, comprising steps of:
(a) an on-vehicle active learning device, if a driving video and sensing information are acquired respectively from a camera and one or more sensors mounted on an autonomous vehicle while the autonomous vehicle is driven, performing or supporting another device to perform a process of inputting one or more consecutive frames of the driving video and the sensing information into a scene code assigning module, to thereby allow the scene code assigning module to generate each of one or more scene codes including information on each of scenes in each of the frames and information on one or more driving events by referring to the frames and the sensing information; and
(b) the on-vehicle active learning device performing or supporting another device to perform at least one of (i) a process of selecting a first part of the frames, whose object detection information generated during the driving events satisfies a preset condition, as specific frames to be used for training the perception network of the autonomous vehicle by using each of the scene codes of each of the frames and the object detection information, for each of the frames, detected by an object detector and a process of storing the specific frames and their corresponding specific scene codes in a frame storing part such that the specific frames and their corresponding specific scene codes match with one another and (ii) a process of selecting a second part of the frames, matching with a training policy of the perception network of the autonomous vehicle, as the specific frames by using the scene codes and the object detection information and a process of storing the specific frames and their corresponding specific scene codes in the frame storing part such that the specific frames and their corresponding specific scene codes match with one another; and
wherein, at the step of (a), the on-vehicle active learning device performs or supports another device to perform a process of allowing the scene code assigning module to (i) apply a learning operation to each of the frames, to thereby classify each of the scenes of each of the frames into one of classes of driving environments and one of classes of driving roads and thus generate each of class codes of each of the frames, via a scene classifier based on deep learning, (ii) detect each of driving events, which occurs while the autonomous vehicle is driven, by referring to each of the frames and each piece of the sensing information on each of the frames, to thereby generate each of event codes, via a driving event detecting module, and (iii) generate each of the scene codes for each of the frames by using each of the class codes of each of the frames and each of the event codes of each of the frames.
5. The method of claim 4, wherein the on-vehicle active learning device performs or supports another device to perform a process of allowing the scene code assigning module to (i) detect one or more scene changes in the frames via the driving event detecting module and thus generate one or more frame-based event codes and (ii) detect one or more operation states, corresponding to the sensing information, of the autonomous vehicle and thus generate one or more vehicle-based event codes, to thereby generate the event codes.
6. The method of claim 1, wherein, at the step of (b), the on-vehicle active learning device performs or supports another device to perform a process of selecting a certain frame, on which no object is detected from its collision area, corresponding to a collision event, as one of the specific frames by referring to the scene codes, wherein the collision area is an area, in the certain frame, where an object is estimated as being located if the autonomous vehicle collides with the object or where the object is estimated to be located if the autonomous vehicle is estimated to collide with the object.
7. A method for on-vehicle active learning to be used for training a perception network of an autonomous vehicle, comprising steps of:
(a) an on-vehicle active learning device, if a driving video and sensing information are acquired respectively from a camera and one or more sensors mounted on an autonomous vehicle while the autonomous vehicle is driven, performing or supporting another device to perform a process of inputting one or more consecutive frames of the driving video and the sensing information into a scene code assigning module, to thereby allow the scene code assigning module to generate each of one or more scene codes including information on each of scenes in each of the frames and information on one or more driving events by referring to the frames and the sensing information; and
(b) the on-vehicle active learning device performing or supporting another device to perform at least one of (i) a process of selecting a first part of the frames, whose object detection information generated during the driving events satisfies a preset condition, as specific frames to be used for training the perception network of the autonomous vehicle by using each of the scene codes of each of the frames and the object detection information, for each of the frames, detected by an object detector and a process of storing the specific frames and their corresponding specific scene codes in a frame storing part such that the specific frames and their corresponding specific scene codes match with one another and (ii) a process of selecting a second part of the frames, matching with a training policy of the perception network of the autonomous vehicle, as the specific frames by using the scene codes and the object detection information and a process of storing the specific frames and their corresponding specific scene codes in the frame storing part such that the specific frames and their corresponding specific scene codes match with one another; and
wherein, at the step of (b), the on-vehicle active learning device performs or supports another device to perform a process of selecting a certain frame, on which an object is detected from its collision area, corresponding to a normal event, as one of the specific frames by referring to the scene codes, wherein the collision area is an area, in the certain frame, where an object is estimated as being located if the autonomous vehicle collides with the object or where the object is estimated to be located if the autonomous vehicle is estimated to collide with the object.
8. The method of claim 1, wherein, at the step of (b), the on-vehicle active learning device performs or supports another device to perform a process of selecting a certain frame where an object, with its confidence score included in the object detection information equal to or lower than a preset value, is located as one of the specific frames.
9. The method of claim 1, wherein, at the step of (b), the on-vehicle active learning device performs or supports another device to perform a process of selecting a certain frame, from which a pedestrian in a rare driving environment is detected, as one of the specific frames, by referring to the scene codes.
10. An on-vehicle active learning device for on-vehicle active learning to be used for training a perception network of an autonomous vehicle, comprising:
at least one memory that stores instructions; and
at least one processor configured to execute the instructions to perform or support another device to perform: (I) if a driving video and sensing information are acquired respectively from a camera and one or more sensors mounted on an autonomous vehicle while the autonomous vehicle is driven, a process of inputting one or more consecutive frames of the driving video and the sensing information into a scene code assigning module, to thereby allow the scene code assigning module to generate each of one or more scene codes including information on each of scenes in each of the frames and information on one or more driving events by referring to the frames and the sensing information and (II) at least one of (i) a process of selecting a first part of the frames, whose object detection information generated during the driving events satisfies a preset condition, as specific frames to be used for training the perception network of the autonomous vehicle by using each of the scene codes of each of the frames and the object detection information, for each of the frames, detected by an object detector and a process of storing the specific frames and their corresponding specific scene codes in a frame storing part such that the specific frames and their corresponding specific scene codes match with one another and (ii) a process of selecting a second part of the frames, matching with a training policy of the perception network of the autonomous vehicle, as the specific frames by using the scene codes and the object detection information and a process of storing the specific frames and their corresponding specific scene codes in the frame storing part such that the specific frames and their corresponding specific scene codes match with one another; and
wherein the processor further performs or supports another device to perform:
(III) (III-1) a process of sampling the specific frames stored in the frame storing part by using the specific scene codes to thereby generate training data and (III-2) a process of executing on-vehicle learning of the perception network of the autonomous vehicle by using the training data.
11. (canceled)
12. The on-vehicle active learning device of claim 10, wherein, at the process of (III), the processor performs or supports another device to perform at least one of (i) a process of under-sampling the specific frames by referring to the scene codes or a process of over-sampling the specific frames by referring to the scene codes, to thereby generate the training data and thus train the perception network, at the process of (III-1) and (ii) (ii-1) a process of calculating one or more weight-balanced losses on the training data, corresponding to the scene codes, by weight balancing and (ii-2) a process of training the perception network via backpropagation using the weight-balanced losses, at the process of (III-2).
13. An on-vehicle active learning device for on-vehicle active learning to be used for training a perception network of an autonomous vehicle, comprising:
at least one memory that stores instructions; and
at least one processor configured to execute the instructions to perform or support another device to perform: (I) if a driving video and sensing information are acquired respectively from a camera and one or more sensors mounted on an autonomous vehicle while the autonomous vehicle is driven, a process of inputting one or more consecutive frames of the driving video and the sensing information into a scene code assigning module, to thereby allow the scene code assigning module to generate each of one or more scene codes including information on each of scenes in each of the frames and information on one or more driving events by referring to the frames and the sensing information and (II) at least one of (i) a process of selecting a first part of the frames, whose object detection information generated during the driving events satisfies a preset condition, as specific frames to be used for training the perception network of the autonomous vehicle by using each of the scene codes of each of the frames and the object detection information, for each of the frames, detected by an object detector and a process of storing the specific frames and their corresponding specific scene codes in a frame storing part such that the specific frames and their corresponding specific scene codes match with one another and (ii) a process of selecting a second part of the frames, matching with a training policy of the perception network of the autonomous vehicle, as the specific frames by using the scene codes and the object detection information and a process of storing the specific frames and their corresponding specific scene codes in the frame storing part such that the specific frames and their corresponding specific scene codes match with one another; and
wherein, at the process of (I), the processor performs or supports another device to perform a process of allowing the scene code assigning module to (i) apply a learning operation to each of the frames, to thereby classify each of the scenes of each of the frames into one of classes of driving environments and one of classes of driving roads and thus generate each of class codes of each of the frames, via a scene classifier based on deep learning, (ii) detect each of driving events, which occurs while the autonomous vehicle is driven, by referring to each of the frames and each piece of the sensing information on each of the frames, to thereby generate each of event codes, via a driving event detecting module, and (iii) generate each of the scene codes for each of the frames by using each of the class codes of each of the frames and each of the event codes of each of the frames.
14. The on-vehicle active learning device of claim 13, wherein the processor performs or supports another device to perform a process of allowing the scene code assigning module to (i) detect one or more scene changes in the frames via the driving event detecting module and thus generate one or more frame-based event codes and (ii) detect one or more operation states, corresponding to the sensing information, of the autonomous vehicle and thus generate one or more vehicle-based event codes, to thereby generate the event codes.
15. The on-vehicle active learning device of claim 10, wherein, at the process of (II), the processor performs or supports another device to perform a process of selecting a certain frame, on which no object is detected from its collision area, corresponding to a collision event, as one of the specific frames by referring to the scene codes, wherein the collision area is an area, in the certain frame, where an object is estimated as being located if the autonomous vehicle collides with the object or where the object is estimated to be located if the autonomous vehicle is estimated to collide with the object.
16. An on-vehicle active learning device for on-vehicle active learning to be used for training a perception network of an autonomous vehicle, comprising:
at least one memory that stores instructions; and
at least one processor configured to execute the instructions to perform or support another device to perform: (I) if a driving video and sensing information are acquired respectively from a camera and one or more sensors mounted on an autonomous vehicle while the autonomous vehicle is driven, a process of inputting one or more consecutive frames of the driving video and the sensing information into a scene code assigning module, to thereby allow the scene code assigning module to generate each of one or more scene codes including information on each of scenes in each of the frames and information on one or more driving events by referring to the frames and the sensing information and (II) at least one of (i) a process of selecting a first part of the frames, whose object detection information generated during the driving events satisfies a preset condition, as specific frames to be used for training the perception network of the autonomous vehicle by using each of the scene codes of each of the frames and the object detection information, for each of the frames, detected by an object detector and a process of storing the specific frames and their corresponding specific scene codes in a frame storing part such that the specific frames and their corresponding specific scene codes match with one another and (ii) a process of selecting a second part of the frames, matching with a training policy of the perception network of the autonomous vehicle, as the specific frames by using the scene codes and the object detection information and a process of storing the specific frames and their corresponding specific scene codes in the frame storing part such that the specific frames and their corresponding specific scene codes match with one another; and
wherein, at the process of (II), the processor performs or supports another device to perform a process of selecting a certain frame, on which an object is detected from its collision area, corresponding to a normal event, as one of the specific frames by referring to the scene codes, wherein the collision area is an area, in the certain frame, where an object is estimated as being located if the autonomous vehicle collides with the object or where the object is estimated to be located if the autonomous vehicle is estimated to collide with the object.
17. The on-vehicle active learning device of claim 10, wherein, at the process of (II), the processor performs or supports another device to perform a process of selecting a certain frame where an object, with its confidence score included in the object detection information equal to or lower than a preset value, is located as one of the specific frames.
18. The on-vehicle active learning device of claim 10, wherein, at the process of (II), the processor performs or supports another device to perform a process of selecting a certain frame, from which a pedestrian in a rare driving environment is detected, as one of the specific frames, by referring to the scene codes.
US17/204,287 2020-04-24 2021-03-17 Method and device for on-vehicle active learning to be used for training perception network of autonomous vehicle Active US11157813B1 (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
US17/204,287 US11157813B1 (en) 2020-04-24 2021-03-17 Method and device for on-vehicle active learning to be used for training perception network of autonomous vehicle
KR1020217040053A KR102589764B1 (en) 2020-04-24 2021-04-14 On-vehicle active learning method and device for learning the perception network of an autonomous vehicle
JP2021576476A JP7181654B2 (en) 2020-04-24 2021-04-14 On-vehicle active learning method and apparatus for learning the perception network of an autonomous driving vehicle
EP21168387.5A EP3901822B1 (en) 2020-04-24 2021-04-14 Method and device for on-vehicle active learning to be used for training perception network of autonomous vehicle
CN202180020419.1A CN115279643A (en) 2020-04-24 2021-04-14 On-board active learning method and apparatus for training a perception network of an autonomous vehicle
PCT/KR2021/004714 WO2021215740A1 (en) 2020-04-24 2021-04-14 Method and device for on-vehicle active learning to be used for training perception network of autonomous vehicle

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202063014877P 2020-04-24 2020-04-24
US17/204,287 US11157813B1 (en) 2020-04-24 2021-03-17 Method and device for on-vehicle active learning to be used for training perception network of autonomous vehicle

Publications (2)

Publication Number Publication Date
US11157813B1 US11157813B1 (en) 2021-10-26
US20210334652A1 true US20210334652A1 (en) 2021-10-28

Family

ID=75529884

Family Applications (1)

Application Number Title Priority Date Filing Date
US17/204,287 Active US11157813B1 (en) 2020-04-24 2021-03-17 Method and device for on-vehicle active learning to be used for training perception network of autonomous vehicle

Country Status (6)

Country Link
US (1) US11157813B1 (en)
EP (1) EP3901822B1 (en)
JP (1) JP7181654B2 (en)
KR (1) KR102589764B1 (en)
CN (1) CN115279643A (en)
WO (1) WO2021215740A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11475628B2 (en) * 2021-01-12 2022-10-18 Toyota Research Institute, Inc. Monocular 3D vehicle modeling and auto-labeling using semantic keypoints
GB202208488D0 (en) * 2022-06-09 2022-07-27 Aptiv Tech Ltd Computer-implemented method of training a neural network to select objects, and a vehicle, cloud server, and traffic infrastructure unit

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190035128A1 (en) * 2017-07-31 2019-01-31 Iain Matthew Russell Unmanned aerial vehicles
US20200033880A1 (en) * 2018-07-30 2020-01-30 Toyota Research Institute, Inc. System and method for 3d scene reconstruction of agent operation sequences using low-level/high-level reasoning and parametric models
US20200081435A1 (en) * 2018-09-06 2020-03-12 Baidu Online Network Technology (Beijing) Co., Ltd. Autonomous driving assistance method, driving device, assistance device and readable storage medium
US20200311961A1 (en) * 2019-03-29 2020-10-01 Denso Ten Limited Image processing device and image processing method
US20210001858A1 (en) * 2019-07-01 2021-01-07 Hyundai Motor Company Lane change control device and method for autonomous vehicle
US20210027103A1 (en) * 2019-07-24 2021-01-28 Nvidia Corporation Automatic generation of ground truth data for training or retraining machine learning models
US20210133497A1 (en) * 2019-11-02 2021-05-06 Perceptive Automata Inc. Adaptive Sampling of Stimuli for Training of Machine Learning Based Models for Predicting Hidden Context of Traffic Entities For Navigating Autonomous Vehicles
US20210133988A1 (en) * 2019-04-12 2021-05-06 Logitech Europe S.A. Video content activity context and regions
US20210142168A1 (en) * 2019-11-07 2021-05-13 Nokia Technologies Oy Methods and apparatuses for training neural networks
US20210142069A1 (en) * 2018-05-18 2021-05-13 Cambricon Technologies Corporation Limited Video retrieval method, and method and apparatus for generating video retrieval mapping relationship

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8111923B2 (en) 2008-08-14 2012-02-07 Xerox Corporation System and method for object class localization and semantic class based image segmentation
JP2014010633A (en) 2012-06-29 2014-01-20 Honda Elesys Co Ltd Image recognition device, image recognition method, and image recognition program
US20140005907A1 (en) 2012-06-29 2014-01-02 Magna Electronics Inc. Vision-based adaptive cruise control system
KR102190484B1 (en) * 2013-11-11 2020-12-11 삼성전자주식회사 Method and apparatus for training recognizer, method and apparatus for recognizing data
KR102147361B1 (en) * 2015-09-18 2020-08-24 삼성전자주식회사 Method and apparatus of object recognition, Method and apparatus of learning for object recognition
US10530991B2 (en) * 2017-01-28 2020-01-07 Microsoft Technology Licensing, Llc Real-time semantic-aware camera exposure control
KR102060662B1 (en) * 2017-05-16 2019-12-30 삼성전자주식회사 Electronic device and method for detecting a driving event of vehicle
US10452920B2 (en) * 2017-10-31 2019-10-22 Google Llc Systems and methods for generating a summary storyboard from a plurality of image frames
US10473788B2 (en) * 2017-12-13 2019-11-12 Luminar Technologies, Inc. Adjusting area of focus of vehicle sensors by controlling spatial distributions of scan lines
CN108133484B (en) * 2017-12-22 2022-01-28 北京奇虎科技有限公司 Automatic driving processing method and device based on scene segmentation and computing equipment
US11941719B2 (en) * 2018-01-23 2024-03-26 Nvidia Corporation Learning robotic tasks using one or more neural networks
US10318842B1 (en) * 2018-09-05 2019-06-11 StradVision, Inc. Learning method, learning device for optimizing parameters of CNN by using multiple video frames and testing method, testing device using the same
WO2020068784A1 (en) * 2018-09-24 2020-04-02 Schlumberger Technology Corporation Active learning framework for machine-assisted tasks
US10504027B1 (en) * 2018-10-26 2019-12-10 StradVision, Inc. CNN-based learning method, learning device for selecting useful training data and test method, test device using the same
CN109257622A (en) * 2018-11-01 2019-01-22 广州市百果园信息技术有限公司 A kind of audio/video processing method, device, equipment and medium
CN109993082B (en) * 2019-03-20 2021-11-05 上海理工大学 Convolutional neural network road scene classification and road segmentation method


Also Published As

Publication number Publication date
KR20210152025A (en) 2021-12-14
EP3901822A1 (en) 2021-10-27
CN115279643A (en) 2022-11-01
WO2021215740A1 (en) 2021-10-28
JP7181654B2 (en) 2022-12-01
EP3901822B1 (en) 2024-02-14
US11157813B1 (en) 2021-10-26
KR102589764B1 (en) 2023-10-17
JP2022539697A (en) 2022-09-13
EP3901822C0 (en) 2024-02-14

Similar Documents

Publication Publication Date Title
US11195030B2 (en) Scene classification
EP2995519B1 (en) Modifying autonomous vehicle driving by recognizing vehicle characteristics
US8520893B2 (en) Method and system for detecting object
US11157813B1 (en) Method and device for on-vehicle active learning to be used for training perception network of autonomous vehicle
JP6418574B2 (en) Risk estimation device, risk estimation method, and computer program for risk estimation
KR102043089B1 (en) Method for extracting driving lane, device and computer readable medium for performing the method
CN110222596B (en) Driver behavior analysis anti-cheating method based on vision
CN108960074B (en) Small-size pedestrian target detection method based on deep learning
CN110738081B (en) Abnormal road condition detection method and device
US11132607B1 (en) Method for explainable active learning, to be used for object detector, by using deep encoder and active learning device using the same
CN111976585A (en) Projection information recognition device and method based on artificial neural network
US20240020964A1 (en) Method and device for improving object recognition rate of self-driving car
KR20240011067A (en) Method for improving object detecting ratio of self-driving car and apparatus thereof
Kushwaha et al. Yolov7-based Brake Light Detection Model for Avoiding Rear-End Collisions
Zaman et al. Deep Learning Approaches for Vehicle and Pedestrian Detection in Adverse Weather
Kovačić et al. Measurement of road traffic parameters based on multi-vehicle tracking
KR102039814B1 (en) Method and apparatus for blind spot detection
CN112277958B (en) Driver braking behavior analysis method
Triwibowo et al. Analysis of Classification and Calculation of Vehicle Type at APILL Intersection Using YOLO Method and Kalman Filter
Sun et al. Predicting Traffic Hazardous Events Based on Naturalistic Driving Data
KR20240046382A (en) Method of creating a drivable path through recognition of event of road and computing device using the same
KR20160064931A (en) Apparatus and Method for detecting intelligent high-speed multi-object
KR20240037396A (en) Learning method and learning device for supporting autonomous driving by generating information on surrounding vehicles based on machine learning and testing method and testing device using the same
Ismail et al. On the Performance of Extended Real-time Object Detection and Attribute Estimation Within Urban Scene Understanding
KR20230117913A (en) OBJECT TRACKING DEVICE USING LiDAR SENSOR AND METHOD OF DETERMINING THE MOVING AND STATIONARY STATE OF THE OBJECT

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO SMALL (ORIGINAL EVENT CODE: SMAL); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

AS Assignment

Owner name: STRADVISION, INC., KOREA, REPUBLIC OF

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:JE, HONGMO;KANG, BONGNAM;KIM, YONGJOONG;AND OTHERS;REEL/FRAME:057243/0916

Effective date: 20210312

STCF Information on status: patent grant

Free format text: PATENTED CASE