US20230410486A1 - Information processing apparatus, information processing method, and program - Google Patents

Information processing apparatus, information processing method, and program

Info

Publication number
US20230410486A1
Authority
US
United States
Prior art keywords
image
recognition
recognition model
learning
reliability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US18/252,219
Other languages
English (en)
Inventor
Guifen TIAN
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Group Corp
Original Assignee
Sony Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Group Corp filed Critical Sony Group Corp
Assigned to Sony Group Corporation. Assignment of assignors interest (see document for details). Assignors: TIAN, GUIFEN
Publication of US20230410486A1 publication Critical patent/US20230410486A1/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/776 Validation; Performance evaluation
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W 40/00 Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
    • B60W 40/02 Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models, related to ambient conditions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/778 Active pattern-learning, e.g. online learning of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V 20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G 1/00 Traffic control systems for road vehicles
    • G08G 1/16 Anti-collision systems

Definitions

  • The present technology relates to an information processing apparatus, an information processing method, and a program, and particularly to an information processing apparatus, an information processing method, and a program suitable for use in a case of relearning a recognition model.
  • In a vehicle, a recognition model for recognizing various recognition targets around the vehicle is used. Furthermore, there is a case where the recognition model is updated in order to keep the accuracy of the recognition model favorable (see, for example, Patent Document 1).
  • The present technology has been made in view of such a situation, and aims to enable efficient relearning of a recognition model.
  • An information processing apparatus includes: a collection timing control unit configured to control a timing to collect a learning image candidate that is an image to be a candidate for a learning image to be used in relearning of a recognition model; and a learning image collection unit configured to select the learning image from among the learning image candidates that have been collected, on the basis of at least one of a feature of the learning image candidate or a similarity to the learning image that has been accumulated.
  • An information processing method includes, by the information processing apparatus: controlling a timing to collect a learning image candidate that is an image to be a candidate for a learning image to be used in relearning of a recognition model; and selecting the learning image from among the learning image candidates that have been collected, on the basis of at least one of a feature of the learning image candidate or a similarity to the learning image that has been accumulated.
  • A program causes a computer to execute processing including: controlling a timing to collect a learning image candidate that is an image to be a candidate for a learning image to be used in relearning of a recognition model; and selecting the learning image from among the learning image candidates that have been collected, on the basis of at least one of a feature of the learning image candidate or a similarity to the learning image that has been accumulated.
  • Control is performed on a timing to collect a learning image candidate that is an image to be a candidate for a learning image to be used in relearning of a recognition model, and the learning image is selected from among the learning image candidates that have been collected, on the basis of at least one of a feature of the learning image candidate or a similarity to the learning image that has been accumulated.
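  • As an illustration of the selection described above, the following is a minimal Python sketch; the callables feature_ok and similarity, and the threshold value, are hypothetical placeholders rather than elements of the patent.

```python
from typing import Any, Callable, Iterable, List

def select_learning_images(candidates: Iterable[Any],
                           accumulated: List[Any],
                           feature_ok: Callable[[Any], bool],
                           similarity: Callable[[Any, List[Any]], float],
                           similarity_threshold: float = 0.8) -> List[Any]:
    """Select learning images from collected learning image candidates.

    A candidate is kept when its feature passes a check and it is not too
    similar to the learning images already accumulated; both criteria mirror
    the selection basis described above. The helpers and the threshold are
    placeholders, not values taken from the patent.
    """
    selected = []
    for candidate in candidates:
        if not feature_ok(candidate):
            continue  # feature-based criterion
        if accumulated and similarity(candidate, accumulated) > similarity_threshold:
            continue  # too similar to an already accumulated learning image
        selected.append(candidate)
        accumulated.append(candidate)
    return selected
```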
  • FIG. 1 is a block diagram illustrating a configuration example of a vehicle control system.
  • FIG. 2 is a view illustrating an example of a sensing area.
  • FIG. 3 is a block diagram illustrating an embodiment of an information processing system to which the present technology is applied.
  • FIG. 4 is a block diagram illustrating a configuration example of an information processing unit of FIG. 3 .
  • FIG. 5 is a flowchart for explaining recognition model learning processing.
  • FIG. 6 is a diagram for explaining a specific example of recognition processing.
  • FIG. 7 is a flowchart for explaining a first embodiment of reliability threshold value setting processing.
  • FIG. 8 is a flowchart for explaining a second embodiment of the reliability threshold value setting processing.
  • FIG. 9 is a graph illustrating an example of a PR curve.
  • FIG. 10 is a flowchart for explaining verification image collection processing.
  • FIG. 11 is a view illustrating a format example of verification image data.
  • FIG. 12 is a flowchart for explaining dictionary data generation processing.
  • FIG. 13 is a flowchart for explaining verification image classification processing.
  • FIG. 14 is a flowchart for explaining learning image collection processing.
  • FIG. 15 is a view illustrating a format example of learning image data.
  • FIG. 16 is a flowchart for explaining recognition model update processing.
  • FIG. 17 is a flowchart for explaining details of recognition model verification processing using a high-reliability verification image.
  • FIG. 18 is a flowchart for explaining details of recognition model verification processing using a low-reliability verification image.
  • FIG. 19 is a block diagram illustrating a configuration example of a computer.
  • FIG. 1 is a block diagram illustrating a configuration example of a vehicle control system 11 , which is an example of a mobile device control system to which the present technology is applied.
  • the vehicle control system 11 is provided in a vehicle 1 and performs processing related to travel assistance and automated driving of the vehicle 1 .
  • the vehicle control system 11 includes a processor 21 , a communication unit 22 , a map information accumulation unit 23 , a global navigation satellite system (GNSS) reception unit 24 , an external recognition sensor 25 , an in-vehicle sensor 26 , a vehicle sensor 27 , a recording unit 28 , a travel assistance/automated driving control unit 29 , a driver monitoring system (DMS) 30 , a human machine interface (HMI) 31 , and a vehicle control unit 32 .
  • the processor 21 , the communication unit 22 , the map information accumulation unit 23 , the GNSS reception unit 24 , the external recognition sensor 25 , the in-vehicle sensor 26 , the vehicle sensor 27 , the recording unit 28 , the travel assistance/automated driving control unit 29 , the driver monitoring system (DMS) 30 , the human machine interface (HMI) 31 , and the vehicle control unit 32 are connected to each other via a communication network 41 .
  • the communication network 41 includes, for example, a bus, an in-vehicle communication network conforming to any standard such as a controller area network (CAN), a local interconnect network (LIN), a local area network (LAN), FlexRay, or Ethernet (registered trademark), and the like. Note that there is also a case where each unit of the vehicle control system 11 is directly connected by, for example, short-range wireless communication (near field communication (NFC)), Bluetooth (registered trademark), or the like without via the communication network 41 .
  • Note that, hereinafter, in a case where each unit of the vehicle control system 11 communicates via the communication network 41, the description of the communication network 41 is to be omitted.
  • For example, in a case where the processor 21 and the communication unit 22 perform communication via the communication network 41, it is simply described that the processor 21 and the communication unit 22 perform communication.
  • the processor 21 includes various processors such as, for example, a central processing unit (CPU), a micro processing unit (MPU), and an electronic control unit (ECU).
  • the processor 21 controls the entire vehicle control system 11 .
  • the communication unit 22 communicates with various types of equipment inside and outside the vehicle, other vehicles, servers, base stations, and the like, and transmits and receives various data.
  • the communication unit 22 receives, from the outside, a program for updating software for controlling an operation of the vehicle control system 11 , map information, traffic information, information around the vehicle 1 , and the like.
  • the communication unit 22 transmits information regarding the vehicle 1 (for example, data indicating a state of the vehicle 1 , a recognition result by a recognition unit 73 , and the like), information around the vehicle 1 , and the like to the outside.
  • the communication unit 22 performs communication corresponding to a vehicle emergency call system such as an eCall.
  • a communication method of the communication unit 22 is not particularly limited. Furthermore, a plurality of communication methods may be used.
  • the communication unit 22 performs wireless communication with in-vehicle equipment by a communication method such as wireless LAN, Bluetooth, NFC, or wireless USB (WUSB).
  • the communication unit 22 performs wired communication with in-vehicle equipment through a communication method such as a universal serial bus (USB), a high-definition multimedia interface (HDMI, registered trademark), or a mobile high-definition link (MHL), via a connection terminal (not illustrated) (and a cable if necessary).
  • the in-vehicle equipment is, for example, equipment that is not connected to the communication network 41 in the vehicle.
  • mobile equipment or wearable equipment carried by a passenger such as a driver, information equipment brought into the vehicle and temporarily installed, and the like are assumed.
  • the communication unit 22 uses a wireless communication method such as a fourth generation mobile communication system (4G), a fifth generation mobile communication system (5G), long term evolution (LTE), or dedicated short range communications (DSRC), to communicate with a server or the like existing on an external network (for example, the Internet, a cloud network, or a company-specific network) via a base station or an access point.
  • the communication unit 22 uses a peer to peer (P2P) technology to communicate with a terminal (for example, a terminal of a pedestrian or a store, or a machine type communication (MTC) terminal) existing near the own vehicle.
  • the communication unit 22 performs V2X communication.
  • the V2X communication is, for example, vehicle to vehicle communication with another vehicle, vehicle to infrastructure communication with a roadside device or the like, vehicle to home communication, vehicle to pedestrian communication with a terminal or the like possessed by a pedestrian, or the like.
  • the communication unit 22 receives an electromagnetic wave transmitted by a road traffic information communication system (vehicle information and communication system (VICS), registered trademark), such as a radio wave beacon, an optical beacon, or FM multiplex broadcasting.
  • the map information accumulation unit 23 accumulates a map acquired from the outside and a map created by the vehicle 1 .
  • the map information accumulation unit 23 accumulates a three-dimensional high-precision map, a global map having lower accuracy than the high-precision map and covering a wide area, and the like.
  • the high-precision map is, for example, a dynamic map, a point cloud map, a vector map (also referred to as an advanced driver assistance system (ADAS) map), or the like.
  • the dynamic map is, for example, a map including four layers of dynamic information, semi-dynamic information, semi-static information, and static information, and is supplied from an external server or the like.
  • the point cloud map is a map including a point cloud (point group data).
  • the vector map is a map in which information such as a lane and a position of a traffic light is associated with the point cloud map.
  • the point cloud map and the vector map may be supplied from, for example, an external server or the like, or may be created by the vehicle 1 as a map for performing matching with a local map to be described later on the basis of a sensing result by a radar 52 , a LiDAR 53 , or the like, and may be accumulated in the map information accumulation unit 23 . Furthermore, in a case where the high-precision map is supplied from an external server or the like, in order to reduce a communication capacity, for example, map data of several hundred meters square regarding a planned path on which the vehicle 1 will travel is acquired from a server or the like.
  • the GNSS reception unit 24 receives a GNSS signal from a GNSS satellite, and supplies the received GNSS signal to the travel assistance/automated driving control unit 29 .
  • the external recognition sensor 25 includes various sensors used for recognizing a situation outside the vehicle 1 , and supplies sensor data from each sensor to each unit of the vehicle control system 11 . Any type and number of sensors included in the external recognition sensor 25 may be adopted.
  • the external recognition sensor 25 includes a camera 51 , the radar 52 , the light detection and ranging or laser imaging detection and ranging (LiDAR) 53 , and an ultrasonic sensor 54 .
  • Any number of the camera 51 , the radar 52 , the LiDAR 53 , and the ultrasonic sensor 54 may be adopted, and an example of a sensing area of each sensor will be described later.
  • As the camera 51 , for example, a camera of any image capturing system such as a time of flight (ToF) camera, a stereo camera, a monocular camera, or an infrared camera is used as necessary.
  • the external recognition sensor 25 includes an environment sensor for detection of weather, a meteorological state, a brightness, and the like.
  • the environment sensor includes, for example, a raindrop sensor, a fog sensor, a sunshine sensor, a snow sensor, an illuminance sensor, and the like.
  • the external recognition sensor 25 includes a microphone to be used to detect sound around the vehicle 1 , a position of a sound source, and the like.
  • the in-vehicle sensor 26 includes various sensors for detection of information inside the vehicle, and supplies sensor data from each sensor to each unit of the vehicle control system 11 . Any type and number of sensors included in the in-vehicle sensor 26 may be adopted.
  • the in-vehicle sensor 26 includes a camera, a radar, a seating sensor, a steering wheel sensor, a microphone, a biological sensor, and the like.
  • As the camera, for example, a camera of any image capturing system such as a ToF camera, a stereo camera, a monocular camera, or an infrared camera can be used.
  • the biological sensor is provided, for example, in a seat, a steering wheel, or the like, and detects various kinds of biological information of a passenger such as the driver.
  • the vehicle sensor 27 includes various sensors for detection of a state of the vehicle 1 , and supplies sensor data from each sensor to each unit of the vehicle control system 11 . Any type and number of sensors included in the vehicle sensor 27 may be adopted.
  • the vehicle sensor 27 includes a speed sensor, an acceleration sensor, an angular velocity sensor (gyro sensor), and an inertial measurement unit (IMU).
  • the vehicle sensor 27 includes a steering angle sensor that detects a steering angle of a steering wheel, a yaw rate sensor, an accelerator sensor that detects an operation amount of an accelerator pedal, and a brake sensor that detects an operation amount of a brake pedal.
  • the vehicle sensor 27 includes a rotation sensor that detects a number of revolutions of an engine or a motor, an air pressure sensor that detects an air pressure of a tire, a slip rate sensor that detects a slip rate of a tire, and a wheel speed sensor that detects a rotation speed of a wheel.
  • the vehicle sensor 27 includes a battery sensor that detects a remaining amount and a temperature of a battery, and an impact sensor that detects an external impact.
  • the recording unit 28 includes, for example, a read only memory (ROM), a random access memory (RAM), a magnetic storage device such as a hard disc drive (HDD), a semiconductor storage device, an optical storage device, a magneto-optical storage device, and the like.
  • the recording unit 28 stores various programs, data, and the like used by each unit of the vehicle control system 11 .
  • the recording unit 28 records a rosbag file including a message transmitted and received by a Robot Operating System (ROS) in which an application program related to automated driving operates.
  • the recording unit 28 includes an Event Data Recorder (EDR) and a Data Storage System for Automated Driving (DSSAD), and records information of the vehicle 1 before and after an event such as an accident.
  • the travel assistance/automated driving control unit 29 controls travel assistance and automated driving of the vehicle 1 .
  • the travel assistance/automated driving control unit 29 includes an analysis unit 61 , an action planning unit 62 , and an operation control unit 63 .
  • the analysis unit 61 performs analysis processing on a situation of the vehicle 1 and surroundings.
  • the analysis unit 61 includes an own-position estimation unit 71 , a sensor fusion unit 72 , and the recognition unit 73 .
  • the own-position estimation unit 71 estimates an own-position of the vehicle 1 on the basis of sensor data from the external recognition sensor 25 and a high-precision map accumulated in the map information accumulation unit 23 .
  • the own-position estimation unit 71 generates a local map on the basis of sensor data from the external recognition sensor 25 , and estimates the own-position of the vehicle 1 by performing matching of the local map with the high-precision map.
  • the position of the vehicle 1 is based on, for example, a center of a rear wheel pair axle.
  • the local map is, for example, a three-dimensional high-precision map, an occupancy grid map, or the like created using a technique such as simultaneous localization and mapping (SLAM).
  • the three-dimensional high-precision map is, for example, the above-described point cloud map or the like.
  • the occupancy grid map is a map in which a three-dimensional or two-dimensional space around the vehicle 1 is segmented into grids of a predetermined size, and an occupancy state of an object is indicated in a unit of a grid.
  • the occupancy state of the object is indicated by, for example, a presence or absence or a presence probability of the object.
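  • A minimal sketch of such an occupancy grid (two-dimensional case) follows; the cell size and map extent are illustrative assumptions, not values from the patent.

```python
import numpy as np

GRID_SIZE_M = 0.2      # assumed edge length of one grid cell, in meters
GRID_EXTENT_M = 40.0   # assumed extent of the map around the vehicle, in meters

cells_per_side = int(GRID_EXTENT_M / GRID_SIZE_M)
# Each cell stores the presence probability of an object in that cell.
occupancy_grid = np.zeros((cells_per_side, cells_per_side), dtype=np.float32)

def mark_detection(x_m: float, y_m: float, probability: float) -> None:
    """Record the presence probability of an object detected at (x_m, y_m),
    expressed in meters relative to the map origin."""
    row = int(y_m / GRID_SIZE_M)
    col = int(x_m / GRID_SIZE_M)
    if 0 <= row < cells_per_side and 0 <= col < cells_per_side:
        occupancy_grid[row, col] = probability
```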
  • the local map is also used for detection processing and recognition processing of a situation outside the vehicle 1 by the recognition unit 73 , for example.
  • the own-position estimation unit 71 may estimate the own-position of the vehicle 1 on the basis of a GNSS signal and sensor data from the vehicle sensor 27 .
  • the sensor fusion unit 72 performs sensor fusion processing of combining a plurality of different types of sensor data (for example, image data supplied from the camera 51 and sensor data supplied from the radar 52 ) to obtain new information.
  • Methods for combining different types of sensor data include integration, fusion, association, and the like.
  • the recognition unit 73 performs detection processing and recognition processing of a situation outside the vehicle 1 .
  • the recognition unit 73 performs detection processing and recognition processing of a situation outside the vehicle 1 on the basis of information from the external recognition sensor 25 , information from the own-position estimation unit 71 , information from the sensor fusion unit 72 , and the like.
  • the recognition unit 73 performs detection processing, recognition processing, and the like of an object around the vehicle 1 .
  • the detection processing of the object is, for example, processing of detecting a presence or absence, a size, a shape, a position, a movement, and the like of the object.
  • the recognition processing of the object is, for example, processing of recognizing an attribute such as a type of the object or identifying a specific object.
  • the detection processing and the recognition processing are not necessarily clearly segmented, and may overlap.
  • For example, the recognition unit 73 detects an object around the vehicle 1 by performing clustering that classifies a point cloud based on sensor data from the LiDAR 53 , the radar 52 , or the like into clusters of point groups. As a result, a presence or absence, a size, a shape, and a position of the object around the vehicle 1 are detected.
  • the recognition unit 73 detects a movement of the object around the vehicle 1 by performing tracking that is following a movement of the cluster of point groups classified by clustering. As a result, a speed and a traveling direction (a movement vector) of the object around the vehicle 1 are detected.
  • the recognition unit 73 recognizes a type of the object around the vehicle 1 by performing object recognition processing such as semantic segmentation on image data supplied from the camera 51 .
  • As the object to be detected or recognized, for example, a vehicle, a person, a bicycle, an obstacle, a structure, a road, a traffic light, a traffic sign, a road sign, and the like are assumed.
  • the recognition unit 73 performs recognition processing of traffic rules around the vehicle 1 on the basis of a map accumulated in the map information accumulation unit 23 , an estimation result of the own-position, and a recognition result of the object around the vehicle 1 .
  • By this processing, for example, a position and a state of a traffic light, contents of a traffic sign and a road sign, contents of a traffic regulation, a travelable lane, and the like are recognized.
  • the recognition unit 73 performs recognition processing of a surrounding environment of the vehicle 1 .
  • As the surrounding environment to be recognized, for example, weather, a temperature, a humidity, a brightness, road surface conditions, and the like are assumed.
  • the action planning unit 62 creates an action plan of the vehicle 1 .
  • the action planning unit 62 creates an action plan by performing processing of path planning and path following.
  • path planning is processing of planning a rough path from a start to a goal.
  • This path planning is called track planning, and also includes processing of track generation (local path planning) that enables safe and smooth traveling in the vicinity of the vehicle 1 , in consideration of motion characteristics of the vehicle 1 in the path planned by the path planning.
  • Path following is processing of planning an operation for safely and accurately traveling a path planned by the path planning within a planned time. For example, a target speed and a target angular velocity of the vehicle 1 are calculated.
  • the operation control unit 63 controls an operation of the vehicle 1 in order to realize the action plan created by the action planning unit 62 .
  • the operation control unit 63 controls a steering control unit 81 , a brake control unit 82 , and a drive control unit 83 to perform acceleration/deceleration control and direction control such that the vehicle 1 travels on a track calculated by the track planning.
  • the operation control unit 63 performs cooperative control for the purpose of implementing functions of the ADAS, such as collision avoidance or impact mitigation, follow-up traveling, vehicle speed maintaining traveling, collision warning of the own vehicle, lane deviation warning of the own vehicle, and the like.
  • the operation control unit 63 performs cooperative control for the purpose of automated driving or the like of autonomously traveling without depending on an operation of the driver.
  • the DMS 30 performs driver authentication processing, recognition processing of a state of the driver, and the like on the basis of sensor data from the in-vehicle sensor 26 , input data inputted to the HMI 31 , and the like.
  • As the state of the driver to be recognized, for example, a physical condition, an awakening level, a concentration level, a fatigue level, a line-of-sight direction, a drunkenness level, a driving operation, a posture, and the like are assumed.
  • the DMS 30 may perform authentication processing of a passenger other than the driver and recognition processing of a state of the passenger. Furthermore, for example, the DMS 30 may perform recognition processing of a situation inside the vehicle on the basis of sensor data from the in-vehicle sensor 26 . As the situation inside the vehicle to be recognized, for example, a temperature, a humidity, a brightness, odor, and the like are assumed.
  • the HMI 31 is used for inputting various data, instructions, and the like, generates an input signal on the basis of the inputted data, instructions, and the like, and supplies the input signal to each unit of the vehicle control system 11 .
  • the HMI 31 includes: operation devices such as a touch panel, a button, a microphone, a switch, and a lever; an operation device that can be inputted by a method other than manual operation, such as with voice or a gesture; and the like.
  • the HMI 31 may be a remote control device using infrared ray or other radio waves, or external connection equipment such as mobile equipment or wearable equipment corresponding to an operation of the vehicle control system 11 .
  • the HMI 31 performs output control to control generation and output of visual information, auditory information, and tactile information to the passenger or the outside of the vehicle, and to control output contents, output timings, an output method, and the like.
  • the visual information is, for example, information indicated by an image or light such as an operation screen, a state display of the vehicle 1 , a warning display, or a monitor image indicating a situation around the vehicle 1 .
  • the auditory information is, for example, information indicated by sound such as guidance, warning sound, or a warning message.
  • the tactile information is, for example, information given to a tactile sense of the passenger by a force, a vibration, a movement, or the like.
  • As a device that outputs visual information, for example, a display device, a projector, a navigation device, an instrument panel, a camera monitoring system (CMS), an electronic mirror, a lamp, and the like are assumed.
  • the display device may be, for example, a device that displays visual information in a passenger's field of view, such as a head-up display, a transmissive display, or a wearable device having an augmented reality (AR) function, in addition to a device having a normal display.
  • As a device that outputs auditory information, for example, an audio speaker, a headphone, an earphone, or the like is assumed.
  • As a device that outputs tactile information, for example, a haptic element using haptic technology, or the like is assumed.
  • the haptic element is provided, for example, on the steering wheel, a seat, or the like.
  • the vehicle control unit 32 controls each unit of the vehicle 1 .
  • the vehicle control unit 32 includes the steering control unit 81 , the brake control unit 82 , the drive control unit 83 , a body system control unit 84 , a light control unit 85 , and a horn control unit 86 .
  • the steering control unit 81 performs detection, control, and the like of a state of a steering system of the vehicle 1 .
  • the steering system includes, for example, a steering mechanism including the steering wheel and the like, an electric power steering, and the like.
  • the steering control unit 81 includes, for example, a controlling unit such as an ECU that controls the steering system, an actuator that drives the steering system, and the like.
  • the brake control unit 82 performs detection, control, and the like of a state of a brake system of the vehicle 1 .
  • the brake system includes, for example, a brake mechanism including a brake pedal, an antilock brake system (ABS), and the like.
  • the brake control unit 82 includes, for example, a controlling unit such as an ECU that controls a brake system, an actuator that drives the brake system, and the like.
  • the drive control unit 83 performs detection, control, and the like of a state of a drive system of the vehicle 1 .
  • the drive system includes, for example, an accelerator pedal, a driving force generation device for generation of a driving force such as an internal combustion engine or a driving motor, a driving force transmission mechanism for transmission of the driving force to wheels, and the like.
  • the drive control unit 83 includes, for example, a controlling unit such as an ECU that controls the drive system, an actuator that drives the drive system, and the like.
  • the body system control unit 84 performs detection, control, and the like of a state of a body system of the vehicle 1 .
  • the body system includes, for example, a keyless entry system, a smart key system, a power window device, a power seat, an air conditioner, an airbag, a seat belt, a shift lever, and the like.
  • the body system control unit 84 includes, for example, a controlling unit such as an ECU that controls the body system, an actuator that drives the body system, and the like.
  • the light control unit 85 performs detection, control, and the like of a state of various lights of the vehicle 1 .
  • As the lights to be controlled, for example, a headlight, a backlight, a fog light, a turn signal, a brake light, a projection, a display of a bumper, and the like are assumed.
  • the light control unit 85 includes a controlling unit such as an ECU that controls lights, an actuator that drives lights, and the like.
  • the horn control unit 86 performs detection, control, and the like of a state of a car horn of the vehicle 1 .
  • the horn control unit 86 includes, for example, a controlling unit such as an ECU that controls the car horn, an actuator that drives the car horn, and the like.
  • FIG. 2 is a view illustrating an example of a sensing area by the camera 51 , the radar 52 , the LiDAR 53 , and the ultrasonic sensor 54 of the external recognition sensor 25 in FIG. 1 .
  • Sensing areas 101 F and 101 B illustrate examples of sensing areas of the ultrasonic sensor 54 .
  • the sensing area 101 F covers a periphery of a front end of the vehicle 1 .
  • the sensing area 101 B covers a periphery of a rear end of the vehicle 1 .
  • Sensing results in the sensing areas 101 F and 101 B are used, for example, for parking assistance and the like of the vehicle 1 .
  • Sensing areas 102 F to 102 B illustrate examples of sensing areas of the radar 52 for a short distance or a middle distance.
  • the sensing area 102 F covers a position farther than the sensing area 101 F in front of the vehicle 1 .
  • the sensing area 102 B covers a position farther than the sensing area 101 B behind the vehicle 1 .
  • the sensing area 102 L covers a rear periphery of a left side surface of the vehicle 1 .
  • the sensing area 102 R covers a rear periphery of a right side surface of the vehicle 1 .
  • a sensing result in the sensing area 102 F is used, for example, for detection of a vehicle, a pedestrian, or the like existing in front of the vehicle 1 , and the like.
  • a sensing result in the sensing area 102 B is used, for example, for a collision prevention function or the like behind the vehicle 1 .
  • Sensing results in the sensing areas 102 L and 102 R are used, for example, for detection of an object in a blind spot on a side of the vehicle 1 , and the like.
  • Sensing areas 103 F to 103 B illustrate examples of sensing areas by the camera 51 .
  • the sensing area 103 F covers a position farther than the sensing area 102 F in front of the vehicle 1 .
  • the sensing area 103 B covers a position farther than the sensing area 102 B behind the vehicle 1 .
  • the sensing area 103 L covers a periphery of a left side surface of the vehicle 1 .
  • the sensing area 103 R covers a periphery of a right side surface of the vehicle 1 .
  • a sensing result in the sensing area 103 F is used for, for example, recognition of a traffic light or a traffic sign, a lane departure prevention assist system, and the like.
  • a sensing result in the sensing area 103 B is used for, for example, parking assistance, a surround view system, and the like.
  • Sensing results in the sensing areas 103 L and 103 R are used, for example, in a surround view system or the like.
  • a sensing area 104 illustrates an example of a sensing area of the LiDAR 53 .
  • the sensing area 104 covers a position farther than the sensing area 103 F in front of the vehicle 1 . On the other hand, the sensing area 104 has a narrower range in the left-right direction than the sensing area 103 F.
  • a sensing result in the sensing area 104 is used for, for example, emergency braking, collision avoidance, pedestrian detection, and the like.
  • a sensing area 105 illustrates an example of a sensing area of the radar 52 for a long distance.
  • the sensing area 105 covers a position farther than the sensing area 104 in front of the vehicle 1 . On the other hand, the sensing area 105 has a narrower range in the left-right direction than the sensing area 104 .
  • a sensing result in the sensing area 105 is used for, for example, adaptive cruise control (ACC) and the like.
  • each sensor may have various configurations other than those in FIG. 2 .
  • For example, the ultrasonic sensor 54 may also perform sensing on a side of the vehicle 1 , and the LiDAR 53 may perform sensing behind the vehicle 1 .
  • FIG. 3 illustrates an embodiment of an information processing system 301 to which the present technology is applied.
  • the information processing system 301 is a system that learns and updates a recognition model for recognizing a specific recognition target in the vehicle 1 .
  • the recognition target of the recognition model is not particularly limited, but for example, the recognition model is assumed to perform depth recognition, semantic segmentation, optical flow recognition, and the like.
  • the information processing system 301 includes an information processing unit 311 and a server 312 .
  • the information processing unit 311 includes a recognition unit 331 , a learning unit 332 , a dictionary data generation unit 333 , and a communication unit 334 .
  • the recognition unit 331 constitutes, for example, a part of the recognition unit 73 in FIG. 1 .
  • the recognition unit 331 executes recognition processing of recognizing a predetermined recognition target by using a recognition model learned by the learning unit 332 and stored in a recognition model storage unit 338 ( FIG. 4 ).
  • the recognition unit 331 recognizes a predetermined recognition target for every pixel of an image (hereinafter, referred to as a captured image) captured by the camera 51 (an image sensor) in FIG. 1 , and estimates reliability of a recognition result.
  • the recognition unit 331 may recognize a plurality of recognition targets. In this case, for example, a different recognition model is used for every recognition target.
  • the learning unit 332 learns a recognition model used by the recognition unit 331 .
  • the learning unit 332 may be provided in the vehicle control system 11 of FIG. 1 or may be provided outside the vehicle control system 11 .
  • the learning unit 332 may constitute a part of the recognition unit 73 , or may be provided separately from the recognition unit 73 .
  • a part of the learning unit 332 may be provided in the vehicle control system 11 , and the rest may be provided outside the vehicle control system 11 .
  • the dictionary data generation unit 333 generates dictionary data for classifying types of images.
  • the dictionary data generation unit 333 causes a dictionary data storage unit 339 ( FIG. 4 ) to store the generated dictionary data.
  • the dictionary data includes a feature pattern corresponding to each type of images.
  • the communication unit 334 constitutes, for example, a part of the communication unit 22 in FIG. 1 .
  • the communication unit 334 communicates with the server 312 via a network 321 .
  • the server 312 performs recognition processing similar to that of the recognition unit 331 by using software for a benchmark test, and executes a benchmark test for verifying accuracy of the recognition processing.
  • the server 312 transmits data including a result of the benchmark test to the information processing unit 311 via the network 321 .
  • a plurality of servers 312 may be provided.
  • FIG. 4 illustrates a detailed configuration example of the information processing unit 311 in FIG. 3 .
  • the information processing unit 311 includes a high-reliability verification image data base (DB) 335 , a low-reliability verification image data base (DB) 336 , a learning image data base (DB) 337 , the recognition model storage unit 338 , and the dictionary data storage unit 339 , in addition to the recognition unit 331 , the learning unit 332 , the dictionary data generation unit 333 , and the communication unit 334 described above.
  • the recognition unit 331 , the learning unit 332 , the dictionary data generation unit 333 , the communication unit 334 , the high-reliability verification image DB 335 , the low-reliability verification image DB 336 , the learning image DB 337 , the recognition model storage unit 338 , and the dictionary data storage unit 339 are connected to each other via a communication network 351 .
  • the communication network 351 constitutes, for example, a part of the communication network 41 in FIG. 1 .
  • Note that, hereinafter, in a case where each unit of the information processing unit 311 performs communication via the communication network 351 , the description of the communication network 351 is to be omitted.
  • For example, in a case where the recognition unit 331 and the recognition model learning unit 366 perform communication via the communication network 351 , it is simply described that the recognition unit 331 and the recognition model learning unit 366 perform communication.
  • the learning unit 332 includes a threshold value setting unit 361 , a verification image collection unit 362 , a verification image classification unit 363 , a collection timing control unit 364 , a learning image collection unit 365 , the recognition model learning unit 366 , and a recognition model update control unit 367 .
  • the threshold value setting unit 361 sets a threshold value (hereinafter, referred to as a reliability threshold value) to be used for determination of reliability of a recognition result of a recognition model.
  • the verification image collection unit 362 collects a verification image by selecting a verification image from among images (hereinafter, referred to as verification image candidates) that are candidates for a verification image to be used for verification of a recognition model, on the basis of a predetermined condition.
  • the verification image collection unit 362 classifies the verification images into high-reliability verification images or low-reliability verification images, on the basis of reliability of a recognition result for a verification image of the currently used recognition model (hereinafter, referred to as a current recognition model) and the reliability threshold value set by the threshold value setting unit 361 .
  • the high-reliability verification image is a verification image in which the reliability of the recognition result is higher than the reliability threshold value and the recognition accuracy is favorable.
  • the low-reliability verification image is a verification image in which the reliability of the recognition result is lower than the reliability threshold value and improvement in recognition accuracy is required.
  • the verification image collection unit 362 accumulates the high-reliability verification images in the high-reliability verification image DB 335 and accumulates the low-reliability verification images in the low-reliability verification image DB 336 .
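  • A minimal sketch of this accumulation step follows, assuming the comparison uses an average per-pixel reliability (a simplification; the threshold value shown is a hypothetical number, not one from the patent).

```python
import numpy as np

RELIABILITY_THRESHOLD = 0.7  # hypothetical reliability threshold set by the threshold value setting unit 361

def route_verification_image(image, reliability_image: np.ndarray,
                             high_reliability_db: list, low_reliability_db: list) -> None:
    """Accumulate a verification image in the high- or low-reliability DB.

    reliability_image holds the per-pixel reliability produced by the current
    recognition model for this verification image; here its mean is compared
    with the reliability threshold.
    """
    if float(reliability_image.mean()) > RELIABILITY_THRESHOLD:
        high_reliability_db.append(image)  # recognition accuracy considered favorable
    else:
        low_reliability_db.append(image)   # improvement in recognition accuracy is required
```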
  • the verification image classification unit 363 classifies the low-reliability verification image into each type by using a feature pattern of the low-reliability verification image, on the basis of dictionary data accumulated in the dictionary data storage unit 339 .
  • the verification image classification unit 363 gives a label indicating a feature pattern of the low-reliability verification image to the verification image.
  • the collection timing control unit 364 controls a timing to collect images (hereinafter, referred to as learning image candidates) that are candidates for a learning image to be used for learning of a recognition model.
  • the learning image collection unit 365 collects the learning image by selecting the learning image from among the learning image candidates, on the basis of a predetermined condition.
  • the learning image collection unit 365 accumulates the learning images that have been collected in the learning image DB 337 .
  • the recognition model learning unit 366 learns the recognition model by using the learning images accumulated in the learning image DB 337 .
  • the recognition model update control unit 367 verifies a recognition model (hereinafter, referred to as a new recognition model) newly relearned by the recognition model learning unit 366 .
  • the recognition model update control unit 367 controls update of the recognition model on the basis of a verification result of the new recognition model.
  • the recognition model update control unit 367 updates the current recognition model stored in the recognition model storage unit 338 to the new recognition model.
  • Next, recognition model learning processing executed by the recognition model learning unit 366 will be described with reference to FIG. 5 .
  • This processing is executed, for example, when learning of the recognition model to be used for the recognition unit 331 is first performed.
  • In step S 101 , the recognition model learning unit 366 learns a recognition model.
  • the recognition model learning unit 366 learns the recognition model by using a loss function loss1 of the following Equation (1).
  • the loss function loss1 is, for example, a loss function disclosed in “Alex Kendall, Yarin Gal, “What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?”, NIPS 2017”.
  • In Equation (1), N indicates the number of pixels of the learning image, i indicates an identification number for identifying a pixel of the learning image, Pred_i indicates a recognition result (an estimation result) of the recognition target in the pixel i by the recognition model, GT_i indicates a correct value of the recognition target in the pixel i, and sigma_i indicates the reliability of the recognition result Pred_i of the pixel i.
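  • Equation (1) itself is not reproduced in this text. A plausible form, following the loss of the Kendall and Gal paper cited above (this reconstruction is an assumption, not the patent's verbatim equation), is:

    \mathrm{loss1} = \frac{1}{N} \sum_{i=1}^{N} \left( \frac{\lVert \mathrm{Pred}_i - \mathrm{GT}_i \rVert^{2}}{2\,\sigma_i^{2}} + \frac{1}{2} \log \sigma_i^{2} \right)

    Here the first term penalizes the recognition error weighted by sigma_i, and the log term keeps sigma_i from being inflated merely to suppress the error term.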
  • the recognition model learning unit 366 learns the recognition model so as to minimize a value of the loss function loss1. As a result, a recognition model capable of recognizing a predetermined recognition target and estimating reliability of the recognition result is generated.
  • the recognition model learning unit 366 learns the recognition model by using a loss function loss2 of the following Equation (2).
  • Note that the meaning of each symbol in Equation (2) is similar to that in Equation (1).
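  • Equation (2) is likewise not reproduced; a plausible form, obtained by dropping the reliability term from the form given above for Equation (1) (again an assumed reconstruction), is:

    \mathrm{loss2} = \frac{1}{N} \sum_{i=1}^{N} \lVert \mathrm{Pred}_i - \mathrm{GT}_i \rVert^{2}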
  • the recognition model learning unit 366 learns the recognition model so as to minimize a value of the loss function loss2. As a result, a recognition model capable of recognizing a predetermined recognition target is generated.
  • the vehicles 1 - 1 to 1 - n perform recognition processing by using recognition models 401 - 1 to 401 - n , respectively, and acquire a recognition result.
  • This recognition result is acquired, for example, as a recognition result image including a recognition value representing a recognition result in each pixel.
  • a statistics unit 402 calculates a final recognition result and reliability of the recognition result by taking statistics of the recognition results obtained by the recognition models 401 - 1 to 401 - n .
  • the final recognition result is represented by, for example, an image (a recognition result image) including an average value of recognition values for every pixel of the recognition result images obtained by the recognition models 401 - 1 to 401 - n .
  • the reliability is represented by, for example, an image (a reliability image) including a variance of the recognition value for every pixel of the recognition result images obtained by the recognition models 401 - 1 to 401 - n .
  • the statistics unit 402 is provided, for example, in the recognition units 331 of the vehicles 1 - 1 to 1 - n.
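  • The statistics taken by the statistics unit 402 can be sketched in Python as follows; this illustrates only the per-pixel mean and variance computation described above and is not the patent's implementation.

```python
import numpy as np

def aggregate_recognition_results(result_images: list[np.ndarray]):
    """Combine per-vehicle recognition result images.

    result_images: list of H x W arrays, one per recognition model 401-1 to 401-n,
    each holding the per-pixel recognition value.
    Returns the final recognition result (per-pixel mean) and the per-pixel
    variance; per the description above, the reliability is represented by the
    variance image.
    """
    stack = np.stack(result_images, axis=0)      # shape (n, H, W)
    recognition_image = stack.mean(axis=0)       # per-pixel average recognition value
    variance_image = stack.var(axis=0)           # per-pixel variance across models
    return recognition_image, variance_image
```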
  • the recognition model learning unit 366 causes the recognition model storage unit 338 to store the recognition model obtained by learning.
  • the recognition model learning processing of FIG. 5 is individually executed for each recognition model.
  • This processing is executed, for example, before a verification image is collected.
  • In step S 101 , the threshold value setting unit 361 performs learning processing of a reliability threshold value. Specifically, the threshold value setting unit 361 learns a reliability threshold value τ for the reliability of a recognition result of a recognition model, by using a loss function loss3 of the following Equation (3).
  • Mask_i(τ) is a function having a value of 1 in a case where the reliability sigma_i of the recognition result of a pixel i is equal to or larger than the reliability threshold value τ, and having a value of 0 in a case where the reliability sigma_i of the recognition result of the pixel i is smaller than the reliability threshold value τ.
  • the meanings of the other symbols are similar to those of the loss function loss1 of the above Equation (1).
  • the loss function loss3 is a loss function obtained by adding a loss component of the reliability threshold value τ to the loss function loss1 to be used for learning of a recognition model.
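  • Equation (3) is not reproduced in the text. One form consistent with the description, namely loss1 plus a loss component gated by Mask_i(τ) (an assumed reconstruction, not the patent's verbatim equation), is:

    \mathrm{loss3} = \mathrm{loss1} + \frac{1}{N} \sum_{i=1}^{N} \mathrm{Mask}_i(\tau)\, \lVert \mathrm{Pred}_i - \mathrm{GT}_i \rVert^{2}

    Intuitively, the added term is the recognition error over pixels whose reliability is at least τ, so minimizing it with respect to τ pushes τ toward a value above which the recognition results are accurate.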
  • the reliability threshold value setting processing of FIG. 7 is individually executed for each recognition model.
  • the reliability threshold value τ can be appropriately set for every recognition model, in accordance with a network structure of each recognition model and the learning images used for each recognition model.
  • the reliability threshold value can be dynamically updated to an appropriate value.
  • This processing is executed, for example, before a verification image is collected.
  • In step S 121 , the recognition unit 331 performs recognition processing on an input image and obtains reliability of a recognition result. For example, the recognition unit 331 performs recognition processing on m input images by using a learned recognition model, and calculates a recognition value representing a recognition result in each pixel of each input image and the reliability of the recognition value of each pixel.
  • In step S 122 , the threshold value setting unit 361 creates a precision-recall curve (PR curve) for the recognition result.
  • the threshold value setting unit 361 compares a recognition value of each pixel of each input image with a correct value, and determines whether the recognition result of each pixel of each input image is correct or incorrect. For example, the threshold value setting unit 361 determines that the recognition result of the pixel is correct when the recognition value and the correct value match, and determines that the recognition result of the pixel is incorrect when the recognition value and the correct value do not match. Alternatively, for example, the threshold value setting unit 361 determines that the recognition result of the pixel is correct when a difference between the recognition value and the correct value is smaller than a predetermined threshold value, and determines that the recognition result of the pixel is incorrect when the difference between the recognition value and the correct value is equal to or larger than the predetermined threshold value. As a result, the recognition result of each pixel of each input image is classified as correct or incorrect.
  • the threshold value setting unit 361 classifies individual pixels of each input image for every threshold value TH on the basis of correct/incorrect and reliability of the recognition result, while changing a threshold value TH for the reliability of the recognition value from 0 to 1 at a predetermined interval (for example, 0.01).
  • the threshold value setting unit 361 counts a number TP of pixels whose recognition result is correct and a number FP of pixels whose recognition result is incorrect, among pixels whose reliability is equal to or higher than the threshold value TH (reliability ≥ TH). Furthermore, the threshold value setting unit 361 counts a number TN of pixels whose recognition result is correct and a number FN of pixels whose recognition result is incorrect, among pixels whose reliability is smaller than the threshold value TH (reliability < TH).
  • the threshold value setting unit 361 calculates Precision and Recall of the recognition model by the following Equations (4) and (5) for every threshold value TH.
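  • Equations (4) and (5) are not reproduced in the text; with the counts TP, FP, and FN defined above, the standard definitions (assumed here) are:

    \mathrm{Precision} = \frac{TP}{TP + FP} \qquad (4)

    \mathrm{Recall} = \frac{TP}{TP + FN} \qquad (5)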
  • the threshold value setting unit 361 creates the PR curve illustrated in FIG. 9 on the basis of a combination of Precision and Recall at each threshold value TH. Note that a vertical axis of the PR curve in FIG. 9 is Precision, and a horizontal axis is Recall.
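  • A minimal Python sketch of this threshold sweep follows; it assumes flattened per-pixel arrays and the standard Precision and Recall definitions above, and is an illustration rather than the patent's implementation.

```python
import numpy as np

def build_pr_curve(reliability: np.ndarray, correct: np.ndarray, step: float = 0.01):
    """Sweep the reliability threshold TH from 0 to 1 and compute
    (Precision, Recall) at each TH.

    reliability: flattened per-pixel reliabilities over all m input images.
    correct:     boolean array, True where the per-pixel recognition result
                 matched the correct value.
    """
    curve = []
    for th in np.arange(0.0, 1.0 + step, step):
        high = reliability >= th
        tp = np.count_nonzero(high & correct)    # high reliability, correct
        fp = np.count_nonzero(high & ~correct)   # high reliability, incorrect
        fn = np.count_nonzero(~high & ~correct)  # low reliability, incorrect
        precision = tp / (tp + fp) if (tp + fp) else 0.0  # undefined cases reported as 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        curve.append((th, precision, recall))
    return curve
```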
  • In step S 123 , the threshold value setting unit 361 acquires a result of a benchmark test of recognition processing on the input images. Specifically, the threshold value setting unit 361 uploads the input image group used in the processing of step S 121 to the server 312 via the communication unit 334 and the network 321 .
  • the server 312 performs the benchmark test by a plurality of methods. On the basis of results of the individual benchmark tests, the server 312 obtains a combination of Precision and Recall when Precision is maximum. The server 312 transmits data indicating the obtained combination of Precision and Recall, to the information processing unit 311 via the network 321 .
  • the threshold value setting unit 361 receives data indicating a combination of Precision and Recall via the communication unit 334 .
  • the threshold value setting unit 361 sets a reliability threshold value on the basis of the result of the benchmark test. For example, the threshold value setting unit 361 obtains, on the PR curve created in the processing of step S 122 , the threshold value TH corresponding to the Precision acquired from the server 312 . The threshold value setting unit 361 sets the obtained threshold value TH as the reliability threshold value τ.
  • the reliability threshold value I can be set such that Precision is as large as possible.
  • the reliability threshold value setting processing of FIG. 8 is individually executed for each recognition model.
  • the reliability threshold value T can be appropriately set for every recognition model.
  • the reliability threshold value can be dynamically updated to an appropriate value.
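  • Continuing the sketch above, one hypothetical way to pick the reliability threshold value from the PR curve, given the Precision reported by the benchmark test, is the following; the selection rule (smallest TH that reaches the target Precision) is an assumption.

```python
def select_reliability_threshold(thresholds, precisions, target_precision):
    # return the smallest TH whose Precision reaches the Precision reported by the server
    for th, p in zip(thresholds, precisions):
        if p >= target_precision:
            return th
    return thresholds[-1]  # fallback: strictest threshold if the target is never reached
```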
  • This processing is started, for example, when the information processing unit 311 acquires a verification image candidate that is a candidate for the verification image.
  • the verification image candidate is captured by the camera 51 and supplied to the information processing unit 311 , received from outside via the communication unit 22 , or inputted from outside via the HMI 31 .
  • the verification image collection unit 362 calculates a hash value of the verification image candidate.
  • the verification image collection unit 362 calculates a 64 bit hash value representing a feature of luminance of the verification image candidate.
  • an algorithm called Perceptual Hash disclosed in “C. Zauner, “Implementation and Benchmarking of Perceptual Image Hash Functions,” Upper Austria University of Applied Sciences, Hagenberg Campus, 2010” is used.
  • In step S202, the verification image collection unit 362 calculates a minimum distance to the accumulated verification images. Specifically, the verification image collection unit 362 calculates a Hamming distance between the hash value of each verification image already accumulated in the high-reliability verification image DB 335 and the low-reliability verification image DB 336, and the hash value of the verification image candidate. Then, the verification image collection unit 362 sets the minimum value of the calculated Hamming distances as the minimum distance.
  • Note that, in a case where no verification image has been accumulated yet, the verification image collection unit 362 sets the minimum distance to a fixed value larger than a predetermined threshold value T1.
  • In step S203, the verification image collection unit 362 determines whether or not the minimum distance > the threshold value T1 is satisfied. When it is determined that the minimum distance > the threshold value T1 is satisfied, that is, in a case where a verification image similar to the verification image candidate has not been accumulated yet, the processing proceeds to step S204.
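  • As an illustration of this duplicate check, the following Python sketch computes a 64-bit luminance hash and the minimum Hamming distance to the accumulated images. The hash shown is a simple average-hash stand-in rather than the cited Perceptual Hash implementation, and the function names are illustrative.

```python
import numpy as np

def luminance_hash64(gray_image):
    """Stand-in 64-bit luminance hash (average hash); the cited Perceptual Hash differs."""
    img = np.asarray(gray_image, dtype=np.float32)
    h, w = img.shape
    # downsample to 8x8 by block averaging (image is cropped to multiples of 8 for brevity)
    small = img[: h - h % 8, : w - w % 8].reshape(8, (h - h % 8) // 8, 8, (w - w % 8) // 8).mean(axis=(1, 3))
    bits = (small > small.mean()).flatten()
    return int("".join("1" if b else "0" for b in bits), 2)

def hamming(a, b):
    return bin(a ^ b).count("1")

def min_distance(candidate_hash, accumulated_hashes, t1):
    if not accumulated_hashes:
        return t1 + 1  # fixed value larger than T1 when nothing is accumulated yet
    return min(hamming(candidate_hash, h) for h in accumulated_hashes)
```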
  • In step S204, the recognition unit 331 performs recognition processing on the verification image candidate. Specifically, the verification image collection unit 362 supplies the verification image candidate to the recognition unit 331.
  • the recognition unit 331 performs recognition processing on the verification image candidate by using a current recognition model stored in the recognition model storage unit 338 . As a result, the recognition value and the reliability of each pixel of the verification image candidate are calculated, and a recognition result image including the recognition value of each pixel and a reliability image including the reliability of each pixel are generated.
  • the recognition unit 331 supplies the recognition result image and the reliability image to the verification image collection unit 362 .
  • In step S205, the verification image collection unit 362 extracts a region to be used as the verification image from the verification image candidate.
  • Specifically, the verification image collection unit 362 calculates an average value (hereinafter, referred to as average reliability) of the reliability of each pixel of the reliability image.
  • For example, the verification image collection unit 362 sets the entire verification image candidate as the verification image.
  • Alternatively, for example, the verification image collection unit 362 compares the reliability of each pixel of the reliability image with the reliability threshold value.
  • The verification image collection unit 362 classifies the individual pixels of the reliability image into pixels (hereinafter, referred to as high-reliability pixels) whose reliability is higher than the reliability threshold value, and pixels (hereinafter, referred to as low-reliability pixels) whose reliability is equal to or lower than the reliability threshold value.
  • Then, the verification image collection unit 362 segments the reliability image into a region with high reliability (hereinafter, referred to as a high-reliability region) and a region with low reliability (hereinafter, referred to as a low-reliability region), by using a predetermined clustering method.
  • The verification image collection unit 362 extracts an image of a rectangular region including the high-reliability region from the verification image candidate, and updates the verification image candidate with the extracted image.
  • Alternatively, the verification image collection unit 362 updates the verification image candidate by extracting an image of a rectangular region including the low-reliability region from the verification image candidate.
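  • The following is a simplified Python sketch of extracting a high-reliability region. The text leaves the clustering method unspecified; this example simply takes the bounding rectangle of all pixels whose reliability exceeds the reliability threshold value, which is an assumption.

```python
import numpy as np

def extract_high_reliability_region(candidate, reliability_map, reliability_threshold):
    """Crop `candidate` to the rectangle enclosing all high-reliability pixels;
    the whole image is kept if no pixel exceeds the threshold."""
    high = reliability_map > reliability_threshold
    if not high.any():
        return candidate
    ys, xs = np.nonzero(high)
    return candidate[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
```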
  • In step S206, the verification image collection unit 362 calculates recognition accuracy of the verification image candidate. For example, the verification image collection unit 362 calculates Precision for the verification image candidate as the recognition accuracy, by using the reliability threshold value, by a method similar to the processing in step S121 in FIG. 8 described above.
  • In step S207, the verification image collection unit 362 determines whether or not the average reliability of the verification image candidate is larger than the reliability threshold value (whether or not the average reliability of the verification image candidate > the reliability threshold value is satisfied). In a case where it is determined that the average reliability of the verification image candidate is larger than the reliability threshold value, the processing proceeds to step S208.
  • In step S208, the verification image collection unit 362 accumulates the verification image candidate as a high-reliability verification image.
  • Specifically, the verification image collection unit 362 generates verification image data in a format illustrated in FIG. 11, and accumulates the verification image data in the high-reliability verification image DB 335.
  • The verification image data includes a number, a verification image, a hash value, reliability, and recognition accuracy.
  • The number is a number for identifying the verification image.
  • For the hash value, the hash value calculated in the processing of step S201 is set. However, in a case where a part of the verification image candidate is extracted in the processing of step S205, the hash value of the extracted image is calculated and set as the hash value of the verification image data.
  • For the reliability, the average reliability calculated in the processing of step S205 is set. However, in a case where a part of the verification image candidate is extracted in the processing of step S205, the average reliability of the extracted image is calculated and set as the reliability of the verification image data.
  • For the recognition accuracy, the recognition accuracy calculated in the processing of step S206 is set.
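  • As one possible representation of the verification image data of FIG. 11, the record could be modeled as follows (field names are illustrative):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class VerificationImageData:
    number: int                   # identifier of the verification image
    image: np.ndarray             # the verification image (possibly a cropped region)
    hash_value: int               # 64-bit hash from step S201 (or of the cropped region)
    reliability: float            # average reliability from step S205
    recognition_accuracy: float   # Precision from step S206
```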
  • In step S209, the verification image collection unit 362 determines whether or not the number of high-reliability verification images is larger than a threshold value N (whether or not the number of high-reliability verification images > the threshold value N is satisfied).
  • Specifically, the verification image collection unit 362 checks the number of high-reliability verification images accumulated in the high-reliability verification image DB 335, and the processing proceeds to step S210 when the verification image collection unit 362 determines that the number of high-reliability verification images is larger than the threshold value N.
  • In step S210, the verification image collection unit 362 deletes the high-reliability verification image having the closest distance to the new verification image. Specifically, the verification image collection unit 362 individually calculates the Hamming distance between the hash value of the verification image newly accumulated in the high-reliability verification image DB 335 and the hash value of each high-reliability verification image already accumulated in the high-reliability verification image DB 335. Then, the verification image collection unit 362 deletes, from the high-reliability verification image DB 335, the high-reliability verification image having the closest Hamming distance to the newly accumulated verification image. That is, the high-reliability verification image most similar to the new verification image is deleted.
  • On the other hand, in a case where it is determined in step S209 that the number of high-reliability verification images is equal to or less than the threshold value N (the number of high-reliability verification images ≤ the threshold value N is satisfied), the processing in step S210 is skipped, and the verification image collection processing ends.
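  • A minimal sketch of this capacity control, assuming records shaped like the data structure sketched above, could look as follows:

```python
def hamming(a, b):
    # bit-level Hamming distance between two integer hash values
    return bin(a ^ b).count("1")

def evict_most_similar(db, new_entry, max_size):
    """db: list of accumulated records with a .hash_value attribute.
    When the database holds more than max_size entries, delete the accumulated
    entry whose hash is closest to that of the newly added entry."""
    if len(db) <= max_size:
        return
    others = [e for e in db if e is not new_entry]
    closest = min(others, key=lambda e: hamming(e.hash_value, new_entry.hash_value))
    db.remove(closest)
```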
  • On the other hand, in a case where it is determined in step S207 that the average reliability of the verification image candidate is equal to or lower than the reliability threshold value (the average reliability of the verification image candidate ≤ the reliability threshold value is satisfied), the processing proceeds to step S211.
  • In step S211, the verification image collection unit 362 accumulates the verification image candidate as a low-reliability verification image in the low-reliability verification image DB 336 by processing similar to step S208.
  • In step S212, the verification image collection unit 362 determines whether or not the number of low-reliability verification images is larger than the threshold value N (whether or not the number of low-reliability verification images > the threshold value N is satisfied).
  • Specifically, the verification image collection unit 362 checks the number of low-reliability verification images accumulated in the low-reliability verification image DB 336, and the processing proceeds to step S213 when the verification image collection unit 362 determines that the number of low-reliability verification images is larger than the threshold value N.
  • In step S213, the verification image collection unit 362 deletes the low-reliability verification image having the closest distance to the new verification image. Specifically, the verification image collection unit 362 individually calculates the Hamming distance between the hash value of the verification image newly accumulated in the low-reliability verification image DB 336 and the hash value of each low-reliability verification image already accumulated in the low-reliability verification image DB 336. Then, the verification image collection unit 362 deletes, from the low-reliability verification image DB 336, the low-reliability verification image having the closest Hamming distance to the newly accumulated verification image. That is, the low-reliability verification image most similar to the new verification image is deleted.
  • On the other hand, in a case where it is determined in step S212 that the number of low-reliability verification images is equal to or less than the threshold value N (the number of low-reliability verification images ≤ the threshold value N is satisfied), the processing in step S213 is skipped, and the verification image collection processing ends.
  • Furthermore, when it is determined in step S203 that the minimum distance is equal to or less than the threshold value T1 (the minimum distance ≤ the threshold value T1 is satisfied), that is, in a case where a verification image similar to the verification image candidate has already been accumulated, the processing of steps S204 to S213 is skipped, and the verification image collection processing ends. In this case, the verification image candidate is not selected as the verification image and is discarded.
  • this verification image collection processing is repeated, and verification images of an amount necessary for determining whether or not to update the model after relearning of the recognition model are accumulated in the high-reliability verification image DB 335 and the low-reliability verification image DB 336 .
  • the verification image collection processing of FIG. 10 may be individually executed for each recognition model, and a different verification image group may be collected for every recognition model.
  • dictionary data generation processing executed by the dictionary data generation unit 333 will be described.
  • This processing is started, for example, when a learning image group including learning images for a plurality of pieces of dictionary data is inputted to the information processing unit 311 .
  • Each learning image included in the learning image group includes a feature that causes decrease in recognition accuracy, and a label indicating the feature is given. Specifically, images including the following features are used.
  • In step S231, the dictionary data generation unit 333 normalizes a learning image.
  • the dictionary data generation unit 333 normalizes each learning image such that vertical and horizontal resolutions (the number of pixels) have predetermined values.
  • the dictionary data generation unit 333 increases the number of learning images. Specifically, the dictionary data generation unit 333 increases the number of learning images by performing various types of image processing on each normalized learning image. For example, the dictionary data generation unit 333 generates a plurality of learning images from one learning image by individually performing image processing such as addition of Gaussian noise, horizontal inversion, vertical inversion, addition of image blur, and color change, on the learning image. Note that the generated learning image is given with a label same as the original learning image.
  • Next, the dictionary data generation unit 333 generates dictionary data on the basis of the learning images. Specifically, the dictionary data generation unit 333 performs machine learning using each normalized learning image and each learning image generated from each normalized learning image, and generates, as the dictionary data, a classifier that classifies labels of images. For the machine learning, for example, a support vector machine (SVM) is used, and the dictionary data (the classifier) is expressed by the following Equation (6):
  • label = W · X + b  (6)
  • Here, W represents a weight, X represents an input image, b represents a constant, and label represents a predicted value of the label of the input image.
  • the dictionary data generation unit 333 causes the dictionary data storage unit 339 to store dictionary data and a learning image group used to generate the dictionary data.
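  • The following Python sketch illustrates the dictionary data generation flow using scikit-learn's LinearSVC as the SVM; the normalization resolution, the concrete augmentations, and the flattened-pixel features are assumptions rather than the configuration specified in the embodiment.

```python
import numpy as np
import cv2
from sklearn.svm import LinearSVC

TARGET_SIZE = (128, 128)  # assumed normalization resolution

def normalize(image):
    return cv2.resize(image, TARGET_SIZE).astype(np.float32) / 255.0

def augment(image):
    # generate additional learning images; each keeps the label of the original image
    noisy = np.clip(image + np.random.normal(0.0, 0.05, image.shape), 0.0, 1.0)
    blurred = cv2.GaussianBlur(image, (5, 5), 0)
    return [image, np.fliplr(image), np.flipud(image), noisy, blurred]

def generate_dictionary_data(learning_images, labels):
    xs, ys = [], []
    for img, label in zip(learning_images, labels):
        for aug in augment(normalize(img)):
            xs.append(aug.reshape(-1))   # X: flattened input image
            ys.append(label)
    clf = LinearSVC()                    # learns the weight W and constant b of Equation (6)
    clf.fit(np.array(xs), np.array(ys))
    return clf
```

  • In this sketch, the verification image classification processing described next reduces to calling clf.predict(normalize(verification_image).reshape(1, -1)).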
  • verification image classification processing executed by the verification image classification unit 363 will be described.
  • In step S251, the verification image classification unit 363 normalizes a verification image.
  • the verification image classification unit 363 acquires a verification image having the largest number (most recently accumulated) among unclassified verification images accumulated in the low-reliability verification image DB 336 .
  • the verification image classification unit 363 normalizes the acquired verification image by processing similar to step S 231 in FIG. 12 .
  • In step S252, the verification image classification unit 363 classifies the verification image on the basis of the dictionary data stored in the dictionary data storage unit 339. That is, the verification image classification unit 363 supplies a label obtained by substituting the verification image into the above-described Equation (6), to the learning image collection unit 365.
  • This verification image classification processing is executed for all the verification images accumulated in the low-reliability verification image DB 336 .
  • This processing is started, for example, when an operation for activating the vehicle 1 and starting driving is performed, for example, when an ignition switch, a power switch, a start switch, or the like of the vehicle 1 is turned ON. Furthermore, this processing ends, for example, when an operation for ending driving of the vehicle 1 is performed, for example, when the ignition switch, the power switch, the start switch, or the like of the vehicle 1 is turned OFF.
  • In step S301, the collection timing control unit 364 determines whether or not it is a timing to collect the learning image candidates. This determination processing is repeatedly executed until it is determined that it is the timing to collect the learning image candidates. Then, in a case where a predetermined condition is satisfied, the collection timing control unit 364 determines that it is the timing to collect the learning image candidates, and the processing proceeds to step S302.
  • a timing is assumed at which an image having a feature different from that of a learning image used for learning of a recognition model in the past can be collected.
  • a timing is assumed at which it is possible to collect an image obtained by capturing a place where high recognition accuracy is required or a place where the recognition accuracy is likely to decrease.
  • As the place where high recognition accuracy is required, for example, a place where an accident is likely to occur, a place with a large traffic volume, or the like is assumed. Specifically, for example, the following cases are assumed.
  • a timing is assumed at which a factor that causes decrease in recognition accuracy of the recognition model has occurred. Specifically, for example, the following cases are assumed.
  • In step S302, the learning image collection unit 365 acquires a learning image candidate.
  • the learning image collection unit 365 acquires a captured image captured by the camera 51 as the learning image candidate.
  • the learning image collection unit 365 acquires an image received from outside via the communication unit 334 , as the learning image candidate.
  • In step S303, the learning image collection unit 365 performs pattern recognition of the learning image candidate.
  • Specifically, the learning image collection unit 365 performs the product-sum operation of the above-described Equation (6) on an image in each target region by using the dictionary data stored in the dictionary data storage unit 339, while scanning the target region to be subjected to pattern recognition over the learning image candidate in a predetermined direction. As a result, a label indicating a feature of each region of the learning image candidate is obtained.
  • In step S304, the learning image collection unit 365 determines whether or not the learning image candidate includes a feature to be a collection target. In a case where there is no label matching the label representing the recognition result of the low-reliability verification image described above among the labels given to the individual regions of the learning image candidate, the learning image collection unit 365 determines that the learning image candidate does not include a feature to be the collection target, and the processing returns to step S301. In this case, the learning image candidate is not selected as the learning image and is discarded.
  • Thereafter, the processing of steps S301 to S304 is repeatedly executed until it is determined in step S304 that the learning image candidate includes a feature to be a collection target.
  • On the other hand, in a case where there is a label matching the label representing the recognition result of the low-reliability verification image described above among the labels given to the individual regions of the learning image candidate, the learning image collection unit 365 determines in step S304 that the learning image candidate includes a feature to be the collection target, and the processing proceeds to step S305.
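  • A rough Python sketch of this scan, assuming the SVM classifier from the earlier dictionary data sketch and an illustrative window size and stride, is shown below:

```python
import numpy as np

def candidate_has_target_feature(candidate, clf, target_labels, window=128, stride=64):
    """Scan `candidate` with a sliding window, classify each window with the dictionary
    data `clf`, and report whether any window's label matches a collection-target label."""
    h, w = candidate.shape[:2]
    for y in range(0, h - window + 1, stride):
        for x in range(0, w - window + 1, stride):
            patch = candidate[y:y + window, x:x + window].astype(np.float32) / 255.0
            label = clf.predict(patch.reshape(1, -1))[0]   # product-sum of Equation (6)
            if label in target_labels:
                return True
    return False
```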
  • In step S305, the learning image collection unit 365 calculates a hash value of the learning image candidate by processing similar to that in step S201 in FIG. 10 described above.
  • In step S306, the learning image collection unit 365 calculates a minimum distance to the accumulated learning images. Specifically, the learning image collection unit 365 calculates a Hamming distance between the hash value of each learning image already accumulated in the learning image DB 337 and the hash value of the learning image candidate. Then, the learning image collection unit 365 sets the minimum value of the calculated Hamming distances as the minimum distance.
  • In step S307, the learning image collection unit 365 determines whether or not the minimum distance > a threshold value T2 is satisfied. In a case where it is determined that the minimum distance > the threshold value T2 is satisfied, that is, in a case where a learning image similar to the learning image candidate has not been accumulated yet, the processing proceeds to step S308.
  • In step S308, the learning image collection unit 365 accumulates the learning image candidate as the learning image.
  • Specifically, the learning image collection unit 365 generates learning image data in a format illustrated in FIG. 15, and accumulates the learning image data in the learning image DB 337.
  • The learning image data includes a number, a learning image, and a hash value.
  • The number is a number for identifying the learning image.
  • For the hash value, the hash value calculated in the processing of step S305 is set.
  • Thereafter, the processing in and after step S301 is executed.
  • On the other hand, when it is determined in step S307 that the minimum distance is equal to or less than the threshold value T2 (the minimum distance ≤ the threshold value T2 is satisfied), that is, in a case where a learning image similar to the learning image candidate has already been accumulated, the processing returns to step S301. That is, in this case, the learning image candidate is not selected as the learning image and is discarded.
  • Thereafter, the processing in and after step S301 is executed.
  • the learning image collection processing of FIG. 14 may be executed individually for each recognition model, and the learning image may be collected for every recognition model.
  • recognition model update processing executed by the information processing unit 311 will be described.
  • This processing is executed at a predetermined timing, for example, in a case where an accumulation amount of learning images in the learning image DB 337 exceeds a predetermined threshold value, or the like.
  • In step S401, the recognition model learning unit 366 learns a recognition model by using the learning images accumulated in the learning image DB 337, similarly to the processing in step S101 in FIG. 5.
  • the recognition model learning unit 366 supplies the generated recognition model to the recognition model update control unit 367 .
  • In step S402, the recognition model update control unit 367 executes recognition model verification processing using a high-reliability verification image.
  • In step S421, the recognition model update control unit 367 acquires a high-reliability verification image. Specifically, among the high-reliability verification images accumulated in the high-reliability verification image DB 335, the recognition model update control unit 367 acquires, from the high-reliability verification image DB 335, one high-reliability verification image that has not yet been used for verification of the recognition model.
  • In step S422, the recognition model update control unit 367 calculates recognition accuracy for the verification image. Specifically, the recognition model update control unit 367 performs recognition processing on the acquired high-reliability verification image by using the recognition model (a new recognition model) obtained in the processing of step S401. Furthermore, the recognition model update control unit 367 calculates the recognition accuracy for the high-reliability verification image by processing similar to step S206 in FIG. 10 described above.
  • In step S423, the recognition model update control unit 367 determines whether or not the recognition accuracy has decreased.
  • Specifically, the recognition model update control unit 367 compares the recognition accuracy calculated in the processing of step S422 with the recognition accuracy included in the verification image data including the target high-reliability verification image. That is, the recognition model update control unit 367 compares the recognition accuracy of the new recognition model for the high-reliability verification image with the recognition accuracy of the current recognition model for the high-reliability verification image. In a case where the recognition accuracy of the new recognition model is equal to or higher than the recognition accuracy of the current recognition model, the recognition model update control unit 367 determines that the recognition accuracy has not decreased, and the processing proceeds to step S424.
  • In step S424, the recognition model update control unit 367 determines whether or not verification of all the high-reliability verification images has ended. In a case where a high-reliability verification image that has not been verified yet remains in the high-reliability verification image DB 335, the recognition model update control unit 367 determines that the verification of all the high-reliability verification images has not ended yet, and the processing returns to step S421.
  • Thereafter, the processing of steps S421 to S424 is repeatedly executed until it is determined in step S423 that the recognition accuracy has decreased or it is determined in step S424 that the verification of all the high-reliability verification images has ended.
  • Then, when it is determined in step S424 that the verification of all the high-reliability verification images has ended, the recognition model verification processing ends. This is a case where the recognition accuracy of the new recognition model is equal to or higher than the recognition accuracy of the current recognition model for all the high-reliability verification images.
  • On the other hand, in a case where it is determined in step S423 that the recognition accuracy of the new recognition model is lower than the recognition accuracy of the current recognition model, the recognition model update control unit 367 determines that the recognition accuracy has decreased, and the recognition model verification processing ends. This is a case where there is a high-reliability verification image for which the recognition accuracy of the new recognition model is lower than the recognition accuracy of the current recognition model.
  • In step S403, the recognition model update control unit 367 determines whether or not there is a high-reliability verification image whose recognition accuracy has decreased. In a case where the recognition model update control unit 367 determines, on the basis of the result of the processing in step S402, that there is no high-reliability verification image for which the recognition accuracy of the new recognition model has decreased as compared with that of the current recognition model, the processing proceeds to step S404.
  • In step S404, the recognition model update control unit 367 executes recognition model verification processing using a low-reliability verification image.
  • In step S441, the recognition model update control unit 367 acquires a low-reliability verification image. Specifically, among the low-reliability verification images accumulated in the low-reliability verification image DB 336, the recognition model update control unit 367 acquires, from the low-reliability verification image DB 336, one low-reliability verification image that has not yet been used for verification of the recognition model.
  • In step S442, the recognition model update control unit 367 calculates recognition accuracy for the verification image. Specifically, the recognition model update control unit 367 performs recognition processing on the acquired low-reliability verification image by using the recognition model (a new recognition model) obtained in the processing of step S401. Furthermore, the recognition model update control unit 367 calculates the recognition accuracy for the low-reliability verification image by processing similar to step S206 in FIG. 10 described above.
  • In step S443, the recognition model update control unit 367 determines whether or not the recognition accuracy has been improved.
  • Specifically, the recognition model update control unit 367 compares the recognition accuracy calculated in the processing of step S442 with the recognition accuracy included in the verification image data including the target low-reliability verification image. That is, the recognition model update control unit 367 compares the recognition accuracy of the new recognition model for the low-reliability verification image with the recognition accuracy of the current recognition model for the low-reliability verification image. In a case where the recognition accuracy of the new recognition model exceeds the recognition accuracy of the current recognition model, the recognition model update control unit 367 determines that the recognition accuracy has been improved, and the processing proceeds to step S444.
  • In step S444, the recognition model update control unit 367 determines whether or not verification of all the low-reliability verification images has ended. In a case where a low-reliability verification image that has not been verified yet remains in the low-reliability verification image DB 336, the recognition model update control unit 367 determines that the verification of all the low-reliability verification images has not ended yet, and the processing returns to step S441.
  • Thereafter, the processing of steps S441 to S444 is repeatedly executed until it is determined in step S443 that the recognition accuracy is not improved or it is determined in step S444 that the verification of all the low-reliability verification images has ended.
  • Then, when it is determined in step S444 that the verification of all the low-reliability verification images has ended, the recognition model verification processing ends. This is a case where the recognition accuracy of the new recognition model exceeds the recognition accuracy of the current recognition model for all the low-reliability verification images.
  • On the other hand, in a case where it is determined in step S443 that the recognition accuracy of the new recognition model is equal to or lower than the recognition accuracy of the current recognition model, the recognition model update control unit 367 determines that the recognition accuracy is not improved, and the recognition model verification processing ends. This is a case where there is a low-reliability verification image for which the recognition accuracy of the new recognition model is equal to or lower than the recognition accuracy of the current recognition model.
  • In step S405, the recognition model update control unit 367 determines whether or not there is a low-reliability verification image whose recognition accuracy has not been improved. In a case where the recognition model update control unit 367 determines, on the basis of the result of the processing in step S404, that there is no low-reliability verification image for which the recognition accuracy of the new recognition model is not improved as compared with the current recognition model, the processing proceeds to step S406.
  • In step S406, the recognition model update control unit 367 updates the recognition model. Specifically, the recognition model update control unit 367 updates the current recognition model stored in the recognition model storage unit 338 to the new recognition model.
  • On the other hand, when the recognition model update control unit 367 determines, on the basis of the result of the processing in step S404, that there is a low-reliability verification image for which the recognition accuracy of the new recognition model is not improved as compared with the current recognition model, the processing in step S406 is skipped, and the recognition model update processing ends. In this case, the recognition model is not updated.
  • Furthermore, in a case where the recognition model update control unit 367 determines in step S403, on the basis of the result of the processing in step S402, that there is a high-reliability verification image for which the recognition accuracy of the new recognition model has decreased as compared with that of the current recognition model, the processing in steps S404 to S406 is skipped, and the recognition model update processing ends. In this case, the recognition model is not updated.
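  • The update decision of FIG. 16 can be condensed into the following Python sketch, where accuracy(model, image) stands in for the Precision computation of step S206 and the database entries are assumed to carry the recognition accuracy of the current recognition model:

```python
def should_update(new_model, high_reliability_db, low_reliability_db, accuracy):
    """Return True only if the new model does not lose accuracy on any high-reliability
    verification image and improves accuracy on every low-reliability verification image."""
    for entry in high_reliability_db:
        if accuracy(new_model, entry.image) < entry.recognition_accuracy:
            return False   # accuracy decreased on a high-reliability verification image
    for entry in low_reliability_db:
        if accuracy(new_model, entry.image) <= entry.recognition_accuracy:
            return False   # no improvement on a low-reliability verification image
    return True
```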
  • the recognition model update processing of FIG. 16 is individually executed for each recognition model, and the recognition models are individually updated.
  • The recognition model can thus be efficiently relearned, and the recognition accuracy of the recognition model can be improved. Furthermore, by dynamically setting the reliability threshold value for every recognition model, the verification accuracy of each recognition model is improved, and as a result, the recognition accuracy of each recognition model is improved.
  • the collection timing control unit 364 may control a timing to collect the learning image candidates on the basis of an environment in which the vehicle 1 is traveling. For example, the collection timing control unit 364 may control to collect the learning image candidates in a case where the vehicle 1 is traveling in rain, snow, smog, or haze, which causes decrease in recognition accuracy of the recognition model.
  • a machine learning method to which the present technology is applied is not particularly limited.
  • the present technology is applicable to both supervised learning and unsupervised learning.
  • a way of giving correct data is not particularly limited.
  • For example, in a case where the recognition unit 331 performs depth recognition of a captured image captured by the camera 51, correct data is generated on the basis of data acquired by the LiDAR 53.
  • the present technology can also be applied to a case of learning a recognition model for recognizing a predetermined recognition target using sensing data (for example, the radar 52 , the LiDAR 53 , the ultrasonic sensor 54 , and the like) other than an image.
  • In this case, for example, point cloud data, millimeter wave data, and the like are used as the learning data and the verification data.
  • the present technology can also be applied to a case of learning a recognition model for recognizing a predetermined recognition target by using two or more types of sensing data including an image.
  • the present technology can also be applied to, for example, a case of learning a recognition model for recognizing a recognition target in the vehicle 1 .
  • the present technology can also be applied to, for example, a case of learning a recognition model for recognizing a recognition target around or inside a mobile object other than a vehicle.
  • For example, a mobile object such as a motorcycle, a bicycle, a personal mobility device, an airplane, a ship, a construction machine, or an agricultural machine (a tractor) is assumed.
  • the mobile object to which the present technology can be applied also includes, for example, a mobile object that is remotely driven (operated) without being boarded by a user, such as a drone or a robot.
  • the present technology can also be applied to, for example, a case of learning a recognition model for recognizing a recognition target in a place other than a mobile object.
  • the series of processes described above can be executed by hardware or also executed by software.
  • a program that configures the software is installed in a computer.
  • examples of the computer include, for example, a computer that is built in dedicated hardware, a general-purpose personal computer that can perform various functions by being installed with various programs, and the like.
  • FIG. 19 is a block diagram illustrating a configuration example of hardware of a computer that executes the series of processes described above in accordance with a program.
  • In the computer 1000, a central processing unit (CPU) 1001, a read only memory (ROM) 1002, and a random access memory (RAM) 1003 are mutually connected by a bus 1004.
  • the bus 1004 is further connected with an input/output interface 1005 .
  • To the input/output interface 1005, an input unit 1006, an output unit 1007, a recording unit 1008, a communication unit 1009, and a drive 1010 are connected.
  • the input unit 1006 includes an input switch, a button, a microphone, an image sensor, and the like.
  • the output unit 1007 includes a display, a speaker, and the like.
  • the recording unit 1008 includes a hard disk, a non-volatile memory, and the like.
  • the communication unit 1009 includes a network interface or the like.
  • the drive 1010 drives a removable medium 1011 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
  • In the computer 1000 configured as described above, the series of processes described above is performed, for example, by the CPU 1001 loading a program recorded in the recording unit 1008 into the RAM 1003 via the input/output interface 1005 and the bus 1004, and executing the program.
  • the program executed by the computer 1000 can be provided by being recorded on, for example, the removable medium 1011 as a package medium or the like. Furthermore, the program can be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.
  • the program can be installed in the recording unit 1008 via the input/output interface 1005 . Furthermore, the program can be received by the communication unit 1009 via a wired or wireless transmission medium, and installed in the recording unit 1008 . Besides, the program can be installed in advance in the ROM 1002 and the recording unit 1008 .
  • the program executed by the computer may be a program that performs processing in time series according to an order described in this specification, or may be a program that performs processing in parallel or at necessary timing such as when a call is made.
  • the system means a set of a plurality of components (a device, a module (a part), and the like), and it does not matter whether or not all the components are in the same housing. Therefore, a plurality of devices housed in separate housings and connected via a network, and a single device with a plurality of modules housed in one housing are both systems.
  • the present technology can have a cloud computing configuration in which one function is shared and processed in cooperation by a plurality of devices via a network.
  • each step described in the above-described flowchart can be executed by one device, and also shared and executed by a plurality of devices.
  • Furthermore, in a case where one step includes a plurality of processes, the plurality of processes included in the one step can be executed by one device, and also shared and executed by a plurality of devices.
  • the present technology can also have the following configurations.
  • An information processing apparatus including:
  • a collection timing control unit configured to control a timing to collect a learning image candidate that is an image to be a candidate for a learning image to be used in relearning of a recognition model
  • a learning image collection unit configured to select the learning image from among the learning image candidates that have been collected, on the basis of at least one of a feature of the learning image candidate or a similarity to the learning image that has been accumulated.
  • the recognition model is used to recognize a predetermined recognition target around a vehicle
  • the learning image collection unit selects the learning image from among the learning image candidates including an image obtained by capturing an image of surroundings of the vehicle by an image sensor installed in the vehicle.
  • the collection timing control unit controls a timing to collect the learning image candidate on the basis of at least one of a place or an environment in which the vehicle is traveling.
  • the collection timing control unit performs control to collect the learning image candidate in at least one of a place where the learning image candidate has not been collected, a vicinity of a newly installed construction site, or a vicinity of a place where an accident of a vehicle including a system similar to a vehicle control system provided in the vehicle has occurred.
  • the collection timing control unit performs control to collect the learning image candidate when reliability of a recognition result by the recognition model has decreased while the vehicle is traveling.
  • the collection timing control unit performs control to collect the learning image candidate when at least one of a change of the image sensor installed in the vehicle or a change of an installation position of the image sensor occurs.
  • the collection timing control unit when the vehicle receives an image from outside, the collection timing control unit performs control to collect the received image as the learning image candidate.
  • the learning image collection unit selects the learning image from among the learning image candidates including at least one of a backlight region, a shadow, a reflector, a region in which a similar pattern is repeated, a construction site, an accident site, rain, snow, smog, or haze.
  • the information processing apparatus according to any one of (1) to (8) above, further including:
  • a verification image collection unit configured to select the verification image from among verification image candidates that are images to be a candidate for the verification image to be used for verification of the recognition model, on the basis of similarity to the verification image that has been accumulated.
  • the information processing apparatus further including:
  • a learning unit configured to relearn the recognition model by using the learning image that has been collected
  • a recognition model update control unit configured to control update of the recognition model on the basis of a result of comparison between: recognition accuracy of a first recognition model for the verification image, the first recognition model being the recognition model before relearning; and recognition accuracy of a second recognition model for the verification image, the second recognition model being the recognition model obtained by relearning.
  • the verification image collection unit classifies the verification image into a high-reliability verification image having high reliability or a low-reliability verification image having low reliability, and
  • the recognition model update control unit updates the first recognition model to the second recognition model in a case where recognition accuracy of the second recognition model for the high-reliability verification image has not decreased as compared with recognition accuracy of the first recognition model for the high-reliability verification image, and recognition accuracy of the second recognition model for the low-reliability verification image has been improved as compared with recognition accuracy of the first recognition model for the low-reliability verification image.
  • the recognition model recognizes a predetermined recognition target for every pixel of an input image and estimates reliability of a recognition result
  • the verification image collection unit extracts a region to be used for the verification image in the verification image candidate, on the basis of a result of comparison between: reliability of a recognition result for every pixel of the verification image candidate by the recognition model; and a threshold value that is dynamically set.
  • the information processing apparatus further including:
  • a threshold value setting unit configured to learn the threshold value by using a loss function obtained by adding a loss component of the threshold value to a loss function to be used for learning the recognition model.
  • the information processing apparatus further including:
  • a threshold value setting unit configured to set the threshold value, on the basis of a recognition result for an input image by the recognition model and a recognition result for the input image by software for a benchmark test for recognizing a recognition target same as a recognition target of the recognition model.
  • the information processing apparatus according to any one of (12) to (14), further including:
  • a recognition model learning unit configured to relearn the recognition model by using a loss function including the reliability.
  • the information processing apparatus according to any one of (1) to (15), further including:
  • a recognition unit configured to recognize a predetermined recognition target by using the recognition model and estimate reliability of a recognition result.
  • the recognition unit estimates the reliability by taking statistics with a recognition result by another recognition model.
  • the information processing apparatus further including:
  • a learning unit configured to relearn the recognition model by using the learning image that has been collected.
  • An information processing method including,
  • a program for causing a computer to execute processing including:

US18/252,219 2020-11-17 2021-11-04 Information processing apparatus, information processing method, and program Pending US20230410486A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2020190708 2020-11-17
JP2020-190708 2020-11-17
PCT/JP2021/040484 WO2022107595A1 (ja) 2021-11-04 Information processing apparatus, information processing method, and program

Publications (1)

Publication Number Publication Date
US20230410486A1 true US20230410486A1 (en) 2023-12-21

Family

ID=81708794

Family Applications (1)

Application Number Title Priority Date Filing Date
US18/252,219 Pending US20230410486A1 (en) 2020-11-17 2021-11-04 Information processing apparatus, information processing method, and program

Country Status (2)

Country Link
US (1) US20230410486A1 (ja)
WO (1) WO2022107595A1 (ja)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004363988A (ja) * 2003-06-05 2004-12-24 Daihatsu Motor Co Ltd Vehicle detection method and vehicle detection device
JP5333080B2 (ja) * 2009-09-07 2013-11-06 Nippon Soken, Inc. Image recognition system
JP6573193B2 (ja) * 2015-07-03 2019-09-11 Panasonic IP Management Co., Ltd. Determination device, determination method, and determination program
WO2019077685A1 (ja) * 2017-10-17 2019-04-25 Honda Motor Co., Ltd. Traveling model generation system, vehicle in traveling model generation system, processing method, and program
US11681294B2 (en) * 2018-12-12 2023-06-20 Here Global B.V. Method and system for prediction of roadwork zone
JP2020140644A (ja) 2019-03-01 2020-09-03 Hitachi, Ltd. Learning device and learning method

Also Published As

Publication number Publication date
WO2022107595A1 (ja) 2022-05-27


Legal Events

Date Code Title Description
AS Assignment

Owner name: SONY GROUP CORPORATION, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:TIAN, GUIFEN;REEL/FRAME:063577/0979

Effective date: 20230329

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION