WO2020026318A1 - Distracted driving predictive system - Google Patents

Distracted driving predictive system

Info

Publication number
WO2020026318A1
Authority
WO
WIPO (PCT)
Prior art keywords
driver
risk
data
vehicle
accident
Prior art date
Application number
PCT/JP2018/028514
Other languages
French (fr)
Inventor
Atsushi Suyama
Takuya Kudo
Motoaki Hayashi
Kazuhito Nomura
Kenichi Matsui
Congwei Dang
Original Assignee
Accenture Global Solutions Limited
Sompo Holdings, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Accenture Global Solutions Limited and Sompo Holdings, Inc.
Priority to JP2021504845A (patent JP7453209B2)
Priority to PCT/JP2018/028514 (publication WO2020026318A1)
Publication of WO2020026318A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08Insurance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217Validation; Performance evaluation; Active pattern learning techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133Distances to prototypes
    • G06F18/24137Distances to cluster centroïds
    • G06F18/2414Smoothing the distance, e.g. radial basis function networks [RBFN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computing arrangements based on specific mathematical models
    • G06N7/01Probabilistic graphical models, e.g. probabilistic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40Business processes related to the transportation industry
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/59Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • GPHYSICS
    • G08SIGNALLING
    • G08GTRAFFIC CONTROL SYSTEMS
    • G08G1/00Traffic control systems for road vehicles
    • G08G1/16Anti-collision systems
    • G08G1/164Centralised systems, e.g. external to vehicles
    • GPHYSICS
    • G07CHECKING-DEVICES
    • G07CTIME OR ATTENDANCE REGISTERS; REGISTERING OR INDICATING THE WORKING OF MACHINES; GENERATING RANDOM NUMBERS; VOTING OR LOTTERY APPARATUS; ARRANGEMENTS, SYSTEMS OR APPARATUS FOR CHECKING NOT PROVIDED FOR ELSEWHERE
    • G07C5/00Registering or indicating the working of vehicles
    • G07C5/008Registering or indicating the working of vehicles communicating information to a remotely located station

Definitions

  • This disclosure generally relates to methods, systems, apparatuses, and computer readable media for reducing driving risks.
  • Drivers of vehicles can benefit from information that helps them to avoid accidents and suggestions that assist them with safe driving.
  • Insurance companies and the like require driver data and driving environment data to accurately assess potential driving risks. If the insurance companies know the trends of risky driving patterns, then they can take proactive steps to lower such risks.
  • Public agencies, lawmakers, and the like need driver and driving environment data to optimize regulations and guidelines to increase driving safety. Conventionally, such data and information have not been provided in an accurate and speedy way.
  • a predictive system comprising: one or more sensors configured to: detect data related to at least one of: a driver, a vehicle of the driver, and an external environment of the vehicle, and transmit the detected data; and a server configured to receive the detected data.
  • the server may comprise: a receiver configured to receive the detected data; a processor coupled to a memory, the processor configured to: determine one or more near-accident indicators based on a plurality of different detected data and historical data related to the detected data, the near-accident indicators relating to a probability of an occurrence of an accident, calculate at least one of the probability that the accident will occur, a risk score, and information related to a reduction of risk, based on the determined one or more near-accident indicators and machine-learning data, and continually update the machine-learning data based on new detected data; and a transmitter configured to output the at least one of the probability that the accident will occur, the risk score, and the information related to the reduction of risk.
  • the processor may be configured to generate time-based predictions of driver risk based on the machine-learning data.
  • the calculation of the at least one of the probability that the accident will occur, the risk score, and the information related to a reduction of risk may be based on risk scores related to at least one of the driver, the vehicle of the driver, and the external environment of the vehicle.
  • the generating the time-based predictions of driver risk may comprise generating an initial prediction of driver risk by detecting the data related to at least one of a driver, a vehicle of the driver, and an external environment of the vehicle at one point in time and updating the prediction of driver risk by detecting the data related to at least one of a driver, the vehicle of the driver, and the external environment of the vehicle at a subsequent point in time.
  • the processor may be configured to automatically generate labels for the near-accident indicators, wherein the labels are to be used to build models for future near-accident indicator prediction.
  • a method comprising: receiving detected data related to at least one of a driver, a vehicle of the driver, and an external environment of the vehicle; determining one or more near-accident indicators based on a plurality of different detected data and historical data related to the detected data, the near-accident indicators relating to a probability of an occurrence of an accident; calculating at least one of the probability that the accident will occur, a risk score, and information related to a reduction of risk, based on the determined one or more near-accident indicators and machine-learning data; continually updating the machine-learning data based on new detected data; and outputting the at least one of the probability that the accident will occur, the risk score, and the information related to the reduction of risk.
  • an apparatus comprising: a receiver configured to receive detected data related to at least one of a driver, a vehicle of the driver, and an external environment of the vehicle; a processor coupled to a memory, the processor configured to: determine one or more near-accident indicators based on a plurality of different detected data and historical data related to the detected data, the near-accident indicators relating to a probability of an occurrence of an accident, calculate at least one of the probability that the accident will occur, a risk score, and information related to a reduction of risk, based on the determined one or more near-accident indicators and machine-learning data, and continually update the machine-learning data based on new detected data; and a transmitter configured to output the at least one of the probability that the accident will occur, the risk score, and the information related to the reduction of risk.
  • a non-transitory computer readable medium comprising a set of instructions, which when executed by one or more processors of a device, cause the one or more processors to: receive detected data related to at least one of a driver, a vehicle of the driver, and an external environment of the vehicle; determine one or more near-accident indicators based on a plurality of different detected data and historical data related to the detected data, the near-accident indicators relating to a probability of an occurrence of an accident; calculate at least one of the probability that the accident will occur, a risk score, and information related to a reduction of risk, based on the determined one or more near-accident indicators and machine-learning data; continually update the machine-learning data based on new detected data; and output the at least one of the probability that the accident will occur, the risk score, and the information related to the reduction of risk.
  • FIG. 1 illustrates a schematic diagram of an overall system architecture according to an exemplary embodiment
  • FIG. 2A illustrates a method of reducing driving risks according to an exemplary embodiment
  • FIG. 2B illustrates a method of reducing driving risks according to an exemplary embodiment
  • FIG. 2C illustrates a method of reducing driving risks according to an exemplary embodiment
  • FIG. 3 illustrates a block diagram of a computing platform according to an exemplary embodiment
  • FIG. 4 illustrates a schematic diagram of a risk scoring framework using near-accident indicators and machine-learning (ML)-based prediction, according to an exemplary embodiment
  • FIG. 5 illustrates near-accident indicators according to an exemplary embodiment
  • FIG. 6 illustrates a schematic diagram for combining data from various sensors to assess a distraction score, according to an exemplary embodiment
  • FIG. 7 illustrates a schematic diagram of a probabilistic machine-learning process according to an exemplary embodiment
  • FIG. 8 illustrates time-series prediction models according to an exemplary embodiment
  • FIG. 9 illustrates an indicator distribution to derive a distraction score via an empirical formula, according to an exemplary embodiment
  • FIG. 10 illustrates a schematic diagram for automatically generating ML frameworks, according to an exemplary embodiment
  • FIG. 11 illustrates a data frame segment, which is used as input data for model building and to transmit information, according to an exemplary embodiment
  • FIG. 12 illustrates an instance in which a Time-Before-Collision Indicator can be used, according to an exemplary embodiment
  • FIG. 13 illustrates a training data generation engine according to an exemplary embodiment
  • FIG. 14 illustrates an exemplary instance where linear interpolation is applied, according to an exemplary embodiment
  • FIG. 15A illustrates an instance where data augmentation for a specific data frame and indicator is applied, according to an exemplary embodiment
  • FIG. 15B illustrates an instance where data augmentation for a specific data frame and indicator is applied, according to an exemplary embodiment
  • FIG. 16 shows the combining of sampling techniques using data interpolation, augmentation, and robust sampling, according to an exemplary embodiment.
  • FIG. 17 illustrates a schematic diagram of a model building engine according to an exemplary embodiment
  • FIG. 18 shows pipeline configurable settings and descriptions, according to an exemplary embodiment
  • FIG. 19A illustrates different pipeline operation processes according to exemplary embodiments;
  • FIG. 19B illustrates different pipeline operation processes according to exemplary embodiments
  • FIG. 20A illustrates different pipeline operation processes according to exemplary embodiments
  • FIG. 20B illustrates different pipeline operation processes according to exemplary embodiments
  • FIG. 21A illustrates different pipeline operation processes according to exemplary embodiments
  • FIG. 21B illustrates different pipeline operation processes according to exemplary embodiments
  • FIG. 22 illustrates a vehicle system according to an exemplary embodiment
  • FIG. 23 illustrates another schematic diagram of a system according to an exemplary embodiment.
  • references in the specification to "one embodiment," "an embodiment," "an illustrative embodiment," etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may or may not necessarily include that particular feature, structure, or characteristic. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such a feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
  • the disclosed embodiments may be implemented, in some cases, in hardware, firmware, software, or any combination thereof.
  • the disclosed embodiments may also be implemented as instructions carried by or stored on a machine readable (e.g., computer-readable) medium or machine-readable storage medium, which may be read and executed by one or more processors.
  • a machine-readable storage medium may be embodied as any storage device, mechanism, or other physical structure for storing or transmitting information in a form readable by a machine (e.g., a volatile or non-volatile memory, a media disc, or other media device).
  • risk assessment systems have provided drivers with personal driving suggestions, insurance companies with risk reports, and public agencies with support services that rely on accident data.
  • accident data is relatively sparse in volume.
  • exemplary embodiments disclosed herein provide a driver’s risk assessment framework based on ML technologies and quantitative driving risk indicators.
  • Machine learning provides computers and/or systems the ability to learn without being explicitly programmed.
  • Exemplary embodiments make use of sensor data (e.g., in-vehicle video, out-of-vehicle video and physiological data) for risk predictions, so there is no need to rely on accident data that is too sparse in volume.
  • time-series data may prove very useful; however, such use of time-series data may be more technically difficult than conventional non-time-series data analytics.
  • Exemplary embodiments of the instant disclosure overcome many of the above-referenced technological challenges.
  • Exemplary embodiments may provide innovative and novel solutions by defining a set of rules for the purpose of quantitative assessment of driving risks using empirical knowledge, applying ML frameworks to fuse multi-modal data and optimizing the risk assessment system based on such ML frameworks.
  • the ML frameworks may dynamically evolve as more information and data are collected and integrated into the frameworks.
  • Exemplary embodiments not only predict current risk states but also utilize time-series data to predict risk trends.
  • a system 100 where nodes in the system 100 may be communicatively coupled via a wired and/or wireless connection.
  • the nodes within the system 100 may include, for example, one or more vehicles 101, one or more scoring systems 160, one or more predicting systems 170, and drivers 180, insurance firms 190, and government agencies 195.
  • a vehicle 101 may include sensors or detectors for sensing/detecting various types of sensor data 140.
  • the sensors/detectors may include, for example, in-vehicle video cameras, out-of-vehicle video cameras, geolocation sensors, digital tachographs, and/or sensors to detect physiological information of the driver.
  • the sensor data 140 may be collected, stored and analyzed.
  • the sensor data along with other additional data may be provided to a scoring system 160 to calculate and determine a risk assessment score.
  • the additional data in addition to the sensor data 140 may include, for example, vehicle information 110, driver profile data 120, location information (e.g., map information) 130, and/or historical data 150.
  • the historical data 150 may include, for example, historical accident data and driving log data.
  • the predicting system 170 may provide additional information to the scoring system 160 to assist the scoring system 160 in calculating and determining a risk assessment score.
  • the risk assessment score that is produced by the scoring system may be used to trigger an output, including, for example, an alert, a report, or the activation of a control to effectuate an action.
  • the generated risk assessment score may be transmitted to drivers 180, insurance firms 190, and governments 195.
  • a risk assessment score may be calculated according to an exemplary embodiment.
  • sensor data 140 may be obtained, sensed, detected and/or acquired via the various types of sensors described above, or other sensors that are not explicitly mentioned.
  • the sensors, which may include, but are not limited to, in-vehicle video cameras, out-of-vehicle video cameras, geolocation sensors, digital tachographs, and/or physiological data sensors, may be physically integrated into the vehicle 101, or may be separate from the vehicle 101 and may communicate with the vehicle 101 through wired and/or wireless communication means and associated protocols (e.g., Ethernet, Bluetooth(R), Wi-Fi(R), WiMAX, LTE, etc.) to effect such communication.
  • the sensors 140 may also be integrated into a mobile device such as, for example, a cell phone, personal digital assistant (PDA), tablet, laptop, or other portable computing device.
  • a computing device such as, for example, a server, may receive the above-described sensor data 140 related to a driver, a vehicle, and/or an external environment of the vehicle.
  • one or more near-accident indicators may be determined and/or generated based on the sensor data 140.
  • the probability that an event will occur, a risk score, and/or information related to reduction of risk may be calculated or generated based on the near-accident indicators and ML data.
  • ML algorithms may be used to train the ML data of the system 100 based on previous data that is sensed and/or obtained. The probability may be calculated by the computing device illustrated in FIG. 3.
  • the ML data may be continually updated based on newly detected data.
  • the calculated probability that an event will occur, a risk score, and/or information related to reduction of risk may be generated and transmitted to a separate device, where the probability and/or risk assessment score may be considered and/or analyzed by, for example, a driver, an insurance firm, and/or a government agency.
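The flow described above (receive detected data, determine near-accident indicators, calculate a risk score, update stored data, and output the result) can be sketched as follows. The indicator and scoring functions below are illustrative stand-ins, not the models actually used by the disclosed system:

```python
from dataclasses import dataclass, field

@dataclass
class RiskPipeline:
    """Minimal sketch of the FIG. 2A flow; the indicator and scoring
    functions are illustrative stand-ins, not the patent's models."""
    history: list = field(default_factory=list)

    def indicators(self, sample):
        # Determine near-accident indicators from the detected data.
        gap = sample["gap_m"]                       # distance to car ahead
        closing = max(sample["closing_mps"], 1e-6)  # closing speed
        return {"time_before_collision": gap / closing}

    def risk_score(self, ind):
        # Assumed mapping: a shorter time-before-collision yields a
        # higher risk score, capped at 1.0.
        return min(1.0, 3.0 / max(ind["time_before_collision"], 1e-6))

    def step(self, sample):
        ind = self.indicators(sample)
        score = self.risk_score(ind)
        self.history.append(sample)  # continually update stored data
        return score                 # output to driver / insurer / agency

pipeline = RiskPipeline()
score = pipeline.step({"gap_m": 30.0, "closing_mps": 5.0})  # TBC 6 s -> 0.5
```

The 3-second risk threshold and the single indicator are arbitrary choices made only to keep the sketch short; the disclosure fuses many indicators and data sources.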
  • FIG. 2B illustrates additional aspects of a method of reducing driving risks, according to an exemplary embodiment.
  • the operation of calculating a probability that an event will occur, a risk score, and/or information related to reduction of risk may include generating risk scores related to the driver, the driver vehicle, and/or an external environment of the vehicle.
  • Block 260 shows an additional operation of generating time-based predictions related to driver risk based on ML data.
  • FIG. 2C illustrates even more aspects of a method of reducing driving risks, according to an exemplary embodiment.
  • the operation of generating the time-based predictions of driver risk may include generating an initial prediction of driver risk by detecting the data related to a driver, a vehicle of the driver, and/or an external environment of the vehicle at one point in time and updating the prediction of driver risk by detecting the data related to the driver, the vehicle of the driver, and the external environment of the vehicle at a subsequent point in time.
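The disclosure does not give a concrete formula for blending the initial prediction with predictions from subsequently detected data; as one hedged illustration, an exponential-smoothing update might look like this (the smoothing factor is an assumption):

```python
def update_risk_prediction(prev_risk, new_risk, alpha=0.3):
    """Blend an earlier driver-risk prediction with a prediction made
    from data detected at a subsequent point in time (exponential
    smoothing; the factor alpha is an assumed choice)."""
    return (1 - alpha) * prev_risk + alpha * new_risk

risk = 0.2                              # initial prediction at time t0
for new_prediction in [0.4, 0.6, 0.6]:  # predictions from later detections
    risk = update_risk_prediction(risk, new_prediction)
```

Smoothing keeps the running prediction stable against single noisy detections while still tracking a genuine upward risk trend.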
  • Block 280 shows an additional operation of automatically generating labels for the near-accident indicators. These labels may be used to build models that can be used for near-accident indicator prediction.
  • the computing device 300 may include a processor 320, a memory 326, a data storage 328, a communication subsystem 330 (e.g., transmitter, receiver, transceiver, etc.), and an I/O subsystem 324. Additionally, in some embodiments, one or more of the illustrative components may be incorporated in, or otherwise form a portion of, another component. For example, the memory 326, or portions thereof, may be incorporated in the processor 320 in some embodiments.
  • the computing device 300 may be embodied as, without limitation, a mobile computing device, a smartphone, a wearable computing device, an Internet-of-Things device, a laptop computer, a tablet computer, a notebook computer, a computer, a workstation, a server, a multiprocessor system, and/or a consumer electronic device.
  • the processor 320 may be embodied as any type of processor capable of performing the functions described herein.
  • the processor 320 may be embodied as a single or multi-core processor(s), digital signal processor, microcontroller, or other processor or processing/controlling circuit.
  • the memory 326 may be embodied as any type of volatile or non-volatile memory or data storage capable of performing the functions described herein. In operation, the memory 326 may store various data and software used during operation of the computing device 300 such as operating systems, applications, programs, libraries, and drivers.
  • the memory 326 is communicatively coupled to the processor 320 via the I/O subsystem 324, which may be embodied as circuitry and/or components to facilitate input/output operations with the processor 320, the memory 326, and other components of the computing device 300.
  • the data storage device 328 may be embodied as any type of device or devices configured for short-term or long-term storage of data such as, for example, memory devices and circuits, memory cards, hard disk drives, solid-state drives, non-volatile flash memory, or other data storage devices. With respect to calculating and determining a risk assessment score, the data storage device 328 may store the above-discussed detected data and/or ML data.
  • the computing device 300 may also include a communications subsystem 330, which may be embodied as any communication circuit, device, or collection thereof, capable of enabling communications between the computing device 300 and other remote devices over a computer network (not shown).
  • the communications subsystem 330 may be configured to use any one or more communication technologies (e.g., wired or wireless communications) and associated protocols (e.g., Ethernet, Bluetooth(R), Wi-Fi(R), WiMAX, LTE, etc.) to effect such communication.
  • the computing device 300 may further include one or more peripheral devices 332.
  • the peripheral devices 332 may include any number of additional input/output devices, interface devices, and/or other peripheral devices.
  • the peripheral devices 332 may include a display, touch screen, graphics circuitry, keyboard, mouse, speaker system, microphone, network interface, and/or other input/output devices, interface devices, and/or peripheral devices.
  • the computing device 300 may also perform one or more of the functions described in detail below and/or may store any of the databases referred to below.
  • FIG. 4 illustrates a schematic diagram of a risk scoring framework 400 using near-accident indicators and ML-based prediction, according to an exemplary embodiment.
  • a set of near-accident indicators 410 is defined based on a specific purpose.
  • within an ML framework automation architecture 420, there may be provided a defined dataset combination 425.
  • the dataset combination may relate to time-series observations and new features (e.g., traffic signals).
  • the dataset combination 425 may include sensor data that is sensed or detected by various sensors/detectors.
  • the sensors/detectors may include, for example, in-vehicle video camera, out-of-vehicle video cameras, geolocation sensors, digital tachographs, and/or sensors to detect physiological data of a driver.
  • Machine learning-based prediction may be provided within the ML framework automation architecture 420.
  • the ML-based prediction component may provide short-term indicator prediction and/or long-term indicator prediction.
  • the risk scoring framework 400 may implement an empirical link formula 430 for risk scoring, where the empirical link formula 430 is based on empirical data.
  • the empirical link formula 430 may be combined with indicator distribution to calculate, generate, and/or determine distraction scores 440, which may be further applied for risk assessment for clients 450.
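The empirical link formula 430 itself is not reproduced in this text; purely as an illustration of linking an indicator's distribution to a distraction score, one could score a new observation by its empirical percentile among historical observations:

```python
def distraction_score(indicator_value, historical_values):
    """Illustrative link between an indicator's distribution and a
    0-100 distraction score: the score is the empirical percentile of
    the observed value among historical observations (this stands in
    for the patent's empirical link formula 430, which is not given)."""
    at_or_below = sum(1 for v in historical_values if v <= indicator_value)
    return 100.0 * at_or_below / len(historical_values)

history = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
score = distraction_score(0.55, history)  # 5 of 10 values at or below -> 50.0
```

A percentile-based link is one simple way to make scores comparable across indicators that have very different raw units and ranges.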
  • the ML framework automation may continuously refine models, according to an exemplary embodiment.
  • the ML framework automation architecture 420 may generate data models from a variety of different types of data, such as time-series data.
  • the time-series data may include a sequence of data points or data records for a data source over time, such as successive measurements taken by a sensor/detector at various intervals over a period of time.
  • Data sources may include any of a variety of appropriate devices, sensors, systems, components, and/or other appropriate systems that are capable of and configured to generate data records, such as time-series data.
  • the devices for detecting the data constituting the dataset combination may detect time-series data.
  • Such devices measure one or more aspects of their current state or the surrounding environment's current state, and communicate those measurements (via wired and/or wireless connections) to a computing device within a vehicle, over the internet and/or other communication networks, such as local area networks (LANs), wide area networks (WANs), virtual private networks (VPNs), wireless networks (e.g., Wi-Fi networks, BLUETOOTH networks, mobile data networks), or any combination thereof.
  • the time-series data may include measurements that are taken by engine sensors that take measurements related to the operation and performance of various components of an engine, such as measurements of the temperature, rate of rotation, pressure, flow rate, receipt of control signals, transmission of control signals, and/or other appropriate measurements.
  • Such exemplary time-series data may be used, for example, to generate data models for an engine, which may be used to prospectively identify and correct issues with the engine before they occur (e.g., identify indicators that belts are wearing and in need of replacement) and/or to reactively determine which component(s) of an engine caused a malfunction (e.g., which components are causing vibrations at low speeds).
  • near-accident indicators 510 include: “Time-Before-Collision”, “Distance to white lines”, “Distance to side walk”, “Response time to traffic signal change”, “Degree of drowsiness”, etc.
  • the near-accident indicators 510 may be factors that may be prospectively used to identify, for example, the risk of an accident occurring.
  • These example near-accident indicators 510 may be determined based on domain knowledge; however, other candidate domain-specific near-accident indicators may be identified via domain-specific surveys and/or investigation.
  • the relevance between near-accident indicators such as “Time-Before-Collision” and driving distraction indicators may be determined based on distraction scores, which in turn are determined at least based on a Loss Function. Distraction scores and Loss Functions are discussed in more detail below.
  • Some of the near-accident indicators 510 may be predicted based on machine-learning.
  • Exemplary formulas 520 for calculating or estimating a Time-Before-Collision indicator are also shown in FIG. 5 from the perspective of Car 1. The calculations may be performed via a computing device in Car 1 or via a remote computing device with which Car 1 communicates.
  • the formulas 520 may be based on a distance between two cars (e.g., Car 1 and Car 2) and their respective velocities.
  • the distance between Car 1 and Car 2 may be determined by, for example, one or more out-of-car video cameras.
  • the velocity of Car 2 may also be detected by the one or more out-of-car video cameras, or by other object velocity detectors, which may use ultrasonic, radar or light-based measurement.
  • the velocity of Car 1 may be detected by its speedometer, or determined from an analysis of geolocation information from sensor data 140.
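The exact formulas 520 appear only in FIG. 5; assuming the common closing-speed form (time before collision equals the gap divided by the speed at which Car 1 closes on Car 2), a minimal sketch is:

```python
def time_before_collision(gap_m, v1_mps, v2_mps):
    """Estimate Time-Before-Collision for Car 1 following Car 2,
    assuming the closing-speed form TBC = gap / (v1 - v2).
    This form is an assumption; FIG. 5's formulas are not reproduced here."""
    closing_speed = v1_mps - v2_mps  # positive when Car 1 is catching up
    if closing_speed <= 0:
        return float("inf")          # not closing: no collision predicted
    return gap_m / closing_speed

tbc = time_before_collision(30.0, 20.0, 15.0)  # 30 m gap, 5 m/s closing -> 6.0 s
```

The infinite result for a non-closing gap matches the intuition that the indicator should only flag situations where the following car is actually gaining on the car ahead.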
  • data related to the near-accident indicators may be collected and used to continually develop and refine a risk assessment system.
  • Machine learning may be used to predict trends in the near-accident indicators, which may lead to better identification of symptoms that contribute to increased driving risk. This is made possible by defining quantitative measurements (e.g., near-accident indicators) to assess the condition of a driver and the driver's vehicle with respect to driving risk.
  • the sensor data may include, for example in-vehicle video data 610 detected by in-vehicle cameras, outside video data 620, which may be detected by out-of-vehicle cameras and may be video of the environment outside of the vehicle, and physiological data 630, which may be detected by physiological sensors and which may detect physiological data of a driver.
  • the in-vehicle cameras may detect in-vehicle features 650 such as, for example, whether a driver is using a mobile device, whether a driver is yawning, and/or the head orientation of a driver.
  • the out-of-vehicle video cameras may detect out-of-vehicle features 660 such as, for example, the distance between one vehicle and another, the distance between a vehicle and pedestrians, and the state of traffic signals.
  • the physiological data sensors may detect physiological features 670 such as, for example, heart rate, body temperature, and brain waves of a driver.
  • the features 650, 660, 670 detected by the various sensors may be extracted and combined to predict different indicators which may be used to determine driver risk.
  • combining multiple sources of sensor data may provide higher confidence in predicted indicators. For example, in the case of a yawn event, analysis of multiple features may lead to a more accurate indicator of drowsiness.
  • the indicators may include, but are not limited to: “Degree or Change Rate of Drowsiness” 680, “Degree or Change Rate of Concentration” 685, “Degree or Change Rate of Stress” 690, and/or “Degree or Change Rate of Risky Driving Behaviors” 695.
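The fusion of multi-sensor features into an indicator such as "Degree of Drowsiness" 680 might be sketched as follows. This is illustrative only: the feature names and fixed weights are invented for the sketch, whereas the disclosed system learns such combinations with ML models rather than hand-set weights.

```python
# Naive weighted fusion of features from in-vehicle video (yawning,
# eye closure) and physiological sensors (heart-rate drop) into a
# 0..1 drowsiness degree. Weights here are invented placeholders.

def drowsiness_degree(yawning: bool, eyes_closed_ratio: float, heart_rate_drop: float) -> float:
    """Combine multi-sensor features; higher value means drowsier."""
    score = 0.4 * float(yawning) + 0.4 * eyes_closed_ratio + 0.2 * heart_rate_drop
    return min(1.0, max(0.0, score))

# A yawn alone gives a moderate score; corroborating features raise it,
# illustrating how multiple sources yield higher-confidence indicators.
print(drowsiness_degree(True, 0.0, 0.0))   # 0.4
print(drowsiness_degree(True, 0.5, 0.5))   # ~0.7
```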
  • ML algorithms such as, for example, deep neural networks may handle complicated combinations of unstructured data and perform predictions.
  • a risk assessment system model may be predefined before it is updated and/or trained.
  • Historical data 710 can be used along with a pre-defined model 720, and the pre-defined model 720 along with the historical data 710 can be trained/updated, via a training/updating operation S730, to create a trained/updated model 740.
  • the training/updating operation S730 may include, but is not limited to, one-pass batched training or multi-epoch mini-batched training.
  • the trained/updated model 740 may be continuously updated and personalized, via online training S750, using real-time sensor data 760.
  • trained model 740 may be an input model, and may be incrementally trained using real-time sensor data 760, which may be obtained online and may have the same data format as the historical data 710.
  • the output of the online training operation S750 may be an updated trained model 740, which will replace the older version.
  • the trained model 740 may then generate indicators (e.g., Near-Accident-Indicators, such as a current driver's level of drowsiness) via a prediction operation S770.
  • with such indicators, driver risk may be assessed much more accurately than with conventional models.
  • the prediction operation S770 may include the trained model 740 being applied to real-time sensor data 760 to produce indicators distribution 780.
  • a difference between S750 and S770 is that S750 requires labelled real-time data to be provided for online training, while in S770 labelled data is unavailable.
  • the indicators that have the same data format as the labelled data may be predicted during prediction operation S770.
  • Historical data 710 may contain previously measured sensor data including, but not limited to, sensor data 610, 620, 630.
  • Pre-defined model 720 may be, but is not limited to, a regression model such as a linear regression, non-linear regression, logistic regression, decision tree, random forest, support vector machine (SVM), neural network, or deep neural network model.
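The FIG. 7 flow — batch training on historical data (S730) followed by incremental online training on real-time sensor data (S750) — might be sketched with a toy one-feature linear model. The class name, learning rate, and data below are invented stand-ins for the regression and neural-network models the embodiment actually contemplates.

```python
# Sketch: batch-train a pre-defined model on historical data, then keep
# updating it online, sample by sample, so the model stays personalized.

class OnlineLinearModel:
    def __init__(self, w: float = 0.0, b: float = 0.0, lr: float = 0.05):
        self.w, self.b, self.lr = w, b, lr

    def predict(self, x: float) -> float:
        return self.w * x + self.b

    def update(self, x: float, y: float) -> None:
        """One SGD step on squared error -- the online training S750."""
        err = self.predict(x) - y
        self.w -= self.lr * err * x
        self.b -= self.lr * err

    def batch_train(self, data, epochs: int = 500) -> None:
        """Multi-epoch training on historical data -- operation S730."""
        for _ in range(epochs):
            for x, y in data:
                self.update(x, y)

historical = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]   # lies on y = 2x + 1
model = OnlineLinearModel()
model.batch_train(historical)   # pre-defined model 720 -> trained model 740
model.update(3.0, 7.0)          # personalize with a real-time sample
print(model.predict(4.0))       # converges toward 2*4 + 1 = 9
```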
  • Historical data 710 along with the pre-defined model 720, may be trained/updated periodically with newly accumulated data 730.
  • the predefined model 720 then becomes a trained model 740.
  • the trained model 740 may be continuously updated and personalized via online training 750 using sensor data 760.
  • the trained model 740 may then provide prediction distributions 770 over indicators (e.g., Near-Accident-Indicators) that are otherwise not directly observable and are uncertain (e.g., a current driver's level of drowsiness). Thus, driver risk may be assessed much better than with conventional models.
  • time-series prediction models generated in the embodiment of FIG. 7 are shown according to a further exemplary embodiment. Shown in FIG. 8 are time-series prediction models 810 and 820 reflecting the model behavior at different points of time T1 and T2, point T2 being after the time-series prediction model 810 has been updated with newly observed data.
  • Time-series prediction is another advantage of probabilistic-based ML approaches. Even when time-series prediction model 810 is not yet updated, it may take full advantage of sequential sensor data. Model 810, even though not updated, may provide future predictions of risk indicators using risk symptoms (e.g., yawn), so that corrective actions may be taken based on predictions indicated by the model 810.
  • Time-series prediction model 820 reflects a model that may continuously update its predicted distribution over indicators using newly observed data.
  • the updating of the time-series prediction model 820 may occur via an online updating scheme.
  • the trained time-series prediction models may continuously provide long-term prediction of near accident indicators.
  • A benefit of a continually updated time-series model is evident from time-series models 810 and 820.
  • In time-series model 810, the predicted driver assessment is based on data collected about the driver at a particular point in time T1.
  • Time-series model 820 illustrates the driver's state detected at a point in time T2, later than the time point T1 in time-series model 810. The time-series model 820 therefore reflects a more accurate driver risk assessment prediction, based on updated data, than one based on the driver state data detected at the earlier point in time T1 in time-series model 810.
  • the empirical link formula may integrate a loss function and indicator distribution to determine a distraction score Score(x).
  • the distraction score Score(x) may be an overall driving risk score of the target driver, and is computed by integrating the Indicator Distribution p(y|x) against the Loss Function of y: Score(x) = ∫ Loss(y) p(y|x) dy.
  • Indicator Distribution may be obtained as discussed in FIGs. 7 and 8.
  • the Loss Function of y may be determined according to empirical knowledge (e.g., empirical historical data) based on a target application (e.g., if Time-Before-Collision indicator y > 3 seconds, impose no penalty).
  • the Loss Function may be predefined and/or based on a threshold
  • the Loss Function may be more than a mere threshold and may take into consideration indicator values as well as distributions based on in-domain experience and knowledge (e.g., from driving safety experts).
  • the domain of the definition of Loss Function may be consistent with the Indicator Distribution function.
  • risk indicators may vary based on their application. For example, degree of drowsiness may be particularly crucial to truck drivers who usually have long drives.
  • in such applications, stronger penalties may be imposed on drowsiness indicators via the loss function, without changing the indicator prediction distributions.
  • Empirical loss functions may be designed according to each of the indicators and target applications. Thus, the indicator predictions may be reusable across various applications.
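The loss/distribution integration described above can be approximated over a discretized indicator as an expected loss. In this sketch, the loss shape (no penalty above 3 seconds, per the Time-Before-Collision example) follows the text, while the particular penalty curve and the example probability distributions are invented assumptions.

```python
# Sketch of the empirical link formula: the distraction score Score(x)
# integrates a loss function against the predicted indicator
# distribution p(y|x). Here y is a discretized Time-Before-Collision
# (TBC) value in seconds.

def loss(tbc_seconds: float) -> float:
    """Empirical loss: no penalty when TBC > 3 s, growing penalty as
    TBC shrinks (an assumed penalty shape, for illustration)."""
    return 0.0 if tbc_seconds > 3.0 else (3.0 - tbc_seconds) / 3.0

def distraction_score(indicator_distribution):
    """Expected loss under p(y|x): sum over y of loss(y) * p(y|x)."""
    return sum(loss(y) * p for y, p in indicator_distribution.items())

# A driver whose predicted TBC mass sits mostly above 3 s scores low;
# a distribution concentrated on short TBC values scores high.
safe = {4.0: 0.7, 3.5: 0.2, 2.0: 0.1}
risky = {1.0: 0.5, 2.0: 0.4, 4.0: 0.1}
print(distraction_score(safe))
print(distraction_score(risky))
```

Because only the loss function changes per application (e.g., heavier drowsiness penalties for truck drivers), the same indicator distributions are reusable across applications, as the text notes.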
  • In FIG. 10, a schematic diagram for automatically generating ML frameworks is illustrated.
  • the ML framework automation architecture 1000 shown in FIG. 10 may automatically generate machine-learning frameworks, which in turn may automatically build complex models from large-scale data in a short time span.
  • various sensors may detect and obtain information on the driver, the driver’s vehicle, and/or factors external to the vehicle.
  • the collected data may be referred to as sensor data 1010.
  • the sensor data 1010 may be analyzed and subjected to time-series modeling 1020.
  • the collected and analyzed sensor data may be received from one or more vehicles, and stored in a dataset storage 1030.
  • the above-described dataset storage 1030 may be embodied as any type of storage means, including memory.
  • the dataset storage 1030 may be embodied as any type of volatile or non-volatile memory or data storage capable of performing the functions described herein.
  • the dataset storage 1030 may store various data and software.
  • the dataset storage 1030 may be provided in a vehicle or may be part of a separate device and/or server communicatively coupled to the driver vehicle.
  • the stored data that has been collected from one or more vehicles may be transmitted to a training data generation engine 1040, where the transmitted data may be subject to interpolation, augmentation, and/or robust sampling.
  • the data that is manipulated by the training data generation engine may be transmitted to a model building engine 1050, in which the data may be subjected to loss function analysis based on how the data will be used.
  • the training data generation engine 1040 and model building engine 1050 may be separate computing devices, similar to that which is illustrated in FIG. 3, or may constitute a single computing device.
  • a data frame segment 1100 which may be used as input data for model building and for transmitting information according to an exemplary embodiment, is shown.
  • a data frame 1110 of the data segment 1100 is also shown.
  • a data frame 1110 may be a time point associated information set that contains useful data for model building.
  • a set of successive data frames 1110 in a defined time span may constitute a data frame segment 1100, which may be used as input data for model building and for communication between the aforementioned sensors and server computing device.
  • All sensor data, as well as features extracted from the raw sensor data, that are associated with a specific time point may be grouped into a data frame 1110, which may include video images, sensor observations, transformed time-specific variables, etc.
  • Some category or attribute data, such as driver age/gender and vehicle type, may also be integrated into the data frames 1110. By doing so, all category/profile-specific information may be moved from model building hyper-parameters to input data sets, so that the generality of the data frames may be enhanced.
  • the data frame segments 1100 may be used to build models for indicator prediction. Any possible observation or measurement methods, including, for example, using a beacon, light detection and ranging (LiDAR), or video taking via a camera, may be used to generate labels of indicators. Such labelled indicators may be asynchronously matched to relevant data frame segments.
  • a label of an indicator may correspond to ground truth data.
  • a beacon system may generate two measured distances from the beacon sensor to the successive cars running along the road, based on the estimated running speeds of the two cars. Therefore, the beacon system may be able to accurately calculate the distance between the two cars in the future, and such distance data may be considered ground truth data for calculating true indicator values such as the Time-Before-Collision Indicator.
  • a label may be automatically generated with the measured value.
  • the stored data frame segments may be backtracked and used to identify data frame segments corresponding to this label.
  • a measurement indicator 1220 may be generated at time T, where the measurement indicator is the Time-Before-Collision (i.e., time before collision between Car 1 and Car 2) indicator.
  • for the indicator 1220 generated at time T, two candidate data frame segments 1210 may be found, which are generated at previous times T1 and T2, respectively.
  • the pairs of matched data frame segments 1210 and indicator labels may be the dataset used for complex model building.
  • FIG. 13 provides a detailed illustration of a training data generation engine 1300, which may be similar to the training data generation engine 1040 illustrated in FIG. 10.
  • the training data generation engine 1300 may include a data interpolation module 1310, a data augmentation module 1320, and/or a robust sampling module 1330 for applying data interpolation, augmentation, and robust sampling technologies to automatically generate a training data set for complex model building.
  • the data interpolation module 1310 may perform data interpolation by increasing the matched pairs of data frame segments and indicators: it finds data frame segments similar to the actually labelled ones and assigns synthetic label values to them. Similarity algorithms for finding similar data frames may include, but are not limited to, Euclidean distance, Jaccard index, and Kolmogorov similarity algorithms.
  • the data augmentation module 1320 may perform data augmentation by increasing the number of unique data frame segments by adding one or more kinds of variances to original ones. Augmentation algorithms for image processing may include but are not limited to translation, rotation, and brightness / contrast adjustment.
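The augmentation operations listed above (translation, brightness/contrast adjustment) might be sketched on image-like data as follows. A tiny 2-D list of pixel intensities stands in for a real video frame; the function names are invented for this illustration, not taken from the disclosure.

```python
# Sketch of the data augmentation module 1320: derive extra variants of
# an image-like data frame segment without collecting new data.

def translate_right(img, shift=1, fill=0):
    """Shift every row right by `shift` (>= 1) pixels, padding with `fill`."""
    return [[fill] * shift + row[:-shift] for row in img]

def adjust_brightness(img, delta):
    """Add `delta` to every pixel, clamped to the 0..255 range."""
    return [[min(255, max(0, p + delta)) for p in row] for row in img]

frame = [[10, 20], [30, 40]]
augmented = [translate_right(frame), adjust_brightness(frame, 50)]
print(augmented)
```

Each variant keeps the original label, so the number of unique training data frame segments increases as described.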
  • the robust sampling module 1330 may perform robust sampling by increasing the combination instances of data frame segments, as well as balancing the distribution of the labels, so as to make the model building process more robust, and reduce risks of data distribution bias.
  • In FIG. 14, a scenario in which data interpolation is applied, and exemplary similarity algorithms, are shown according to an exemplary embodiment. That is, box 1410 lists several different similarity algorithms that may be used to find similar data frames according to an exemplary embodiment. Also shown is an exemplary scenario 1420 in which data interpolation is applied for automatic training data generation. When applying linear interpolation, the parameters to interpolate the corresponding data frame segment may first be determined; then those parameters may be used to interpolate the indicator.
  • indicators TBC_T1 and TBC_T3, measured at times T1 and T3 respectively, may be calculated, while at time T2 there is no measurement.
  • the corresponding collected data frame segments at times t1, t2, and t3 are DFS_t1, DFS_t2, and DFS_t3.
  • linear interpolation may then proceed in two steps: 1) parameters a and b may be calculated such that the linear combination a * DFS_t1 + b * DFS_t3 minimizes the similarity distance to DFS_t2; and 2) the interpolated indicator may be calculated with the same linear formula: TBC_T2 = a * TBC_T1 + b * TBC_T3.
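The linear-interpolation step described above might be sketched as a small least-squares fit: find weights a and b that best reconstruct the unlabelled segment from the two labelled segments (minimizing Euclidean distance), then reuse the same weights to synthesize the missing indicator label. Data frame segments are simplified to plain feature vectors here, and all names are illustrative.

```python
# Sketch of the training data generation engine's interpolation step:
# solve the 2-parameter least-squares problem a*dfs1 + b*dfs3 ~= dfs2
# via the normal equations, then return a*tbc1 + b*tbc3 as the
# synthetic label for the unlabelled segment dfs2.

def interpolate_label(dfs1, dfs3, dfs2, tbc1, tbc3):
    s11 = sum(x * x for x in dfs1)
    s33 = sum(x * x for x in dfs3)
    s13 = sum(x * y for x, y in zip(dfs1, dfs3))
    s12 = sum(x * y for x, y in zip(dfs1, dfs2))
    s32 = sum(x * y for x, y in zip(dfs3, dfs2))
    det = s11 * s33 - s13 * s13
    a = (s12 * s33 - s32 * s13) / det
    b = (s32 * s11 - s12 * s13) / det
    return a * tbc1 + b * tbc3

# DFS_t2 lies halfway between DFS_t1 and DFS_t3, so its synthetic
# label is the midpoint of the two measured indicators.
print(interpolate_label([2.0, 0.0], [0.0, 2.0], [1.0, 1.0], 4.0, 2.0))
```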
  • In FIGs. 15A and 15B, the application of data augmentation and various augmentation methods are shown, respectively, according to an exemplary embodiment.
  • FIG. 15A illustrates that, when applying data augmentation, a number of extra data frame segments 1520 associated with an initial data frame 1510 may be generated.
  • FIG. 15B lists several different selectable augmentation methods (e.g., related to image data).
  • the original distributions of input data often may be unbalanced and, therefore, cause risks of bias of the model.
  • the robustness of the model may be increased so that the prediction performance may be kept at a stable, satisfactory level.
  • the original distribution of data may be subjected to the interpolation and augmentation model 1620. Based on the data interpolation and augmentation model 1620, the number of samples across the entire range of the distribution may be increased.
  • Robust sampling 1630 may then be applied to the interpolated and augmented data, via down sampling and over sampling. By down sampling the dense range of data while over sampling the sparse range of data, better-balanced distribution of input data may be obtained for model building purposes.
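The down-sampling/over-sampling step described above might be sketched by balancing label bins to a common size. The bin names, target count, and use of simple repetition for over-sampling are assumptions introduced for this sketch.

```python
# Sketch of the robust sampling module 1330: down-sample the dense
# label bins and over-sample (repeat) the sparse ones until every bin
# holds the same number of samples, reducing data distribution bias.
import random

def balance(samples_by_bin, target, seed=0):
    """Return a copy where every bin holds exactly `target` samples."""
    rng = random.Random(seed)
    balanced = {}
    for label, samples in samples_by_bin.items():
        if len(samples) > target:                 # dense range: down-sample
            balanced[label] = rng.sample(samples, target)
        else:                                     # sparse range: over-sample
            balanced[label] = samples + rng.choices(samples, k=target - len(samples))
    return balanced

bins = {"tbc<3s": [1, 2], "tbc>3s": [3, 4, 5, 6, 7, 8]}
balanced = balance(bins, target=4)
print({label: len(s) for label, s in balanced.items()})
```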
  • a Machine Learning Automation Architecture may be used to create a model building engine 1700 that, in turn, builds a wide range of complex models.
  • Methods of pipeline, incrementation, boosting, and transferring, as shown in FIG. 17, may be used to facilitate model building processes to improve model performance.
  • a model building pipeline 1720 may organize a set of key configurations including data sources, target variable(s), model structure, cost function, regularization and optimization schemes, as well as other hyper parameters.
  • a model building pipeline may be a defined sequence of model building tasks, such as loading datasets, splitting datasets for training and validation, loading configured hyper-parameters, performing model training operations, or doing model validation after training.
  • Model building pipelines may be managed and run in parallel computation environments. Model building with an incrementation method may continually use an existing pipeline 1730 with newly obtained data. The model may be enhanced based on an enlarged training dataset. Model building with a ‘boosting’ method focuses on what an initial model does poorly and uses newly obtained data to correct the initial model. The original pipeline may be modified to incorporate this usage, typically by applying ensemble methods.
  • An ensemble method may include operations using multiple models to realize better performance.
  • An example of such includes, but is not limited to, a model that makes good predictions for a residential street but performs poorly for highway situations.
  • a new model may be built with the new highway datasets, then the two models may be combined together to work as a single model.
  • the combining methods include, but are not limited to, simply averaging the predicted results, voting on different situations, or building a simple wrapper model such as a linear regression to merge the separate predicted results.
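The ensemble combination described above — a street-specialist model merged with a highway-specialist model — might be sketched as follows. Both "models" are stubbed as functions, and the wrapper weights are invented placeholders for values a real wrapper model would learn.

```python
# Sketch of combining two models into a single ensemble: by simple
# averaging, or by a linear wrapper that weights each model's output.

def street_model(x: float) -> float:     # stand-in for the original model
    return 2.0 * x

def highway_model(x: float) -> float:    # stand-in for the boosted model
    return 2.0 * x + 1.0

def averaged(x: float) -> float:
    """Simplest merge: average the two predictions."""
    return (street_model(x) + highway_model(x)) / 2.0

def linear_wrapper(x: float, w1: float = 0.25, w2: float = 0.75) -> float:
    """Wrapper merge: weights (here assumed) favor the model that
    performs better in the current context."""
    return w1 * street_model(x) + w2 * highway_model(x)

print(averaged(2.0))          # (4 + 5) / 2 = 4.5
print(linear_wrapper(2.0))    # 0.25*4 + 0.75*5 = 4.75
```

Voting, the third combining method mentioned, would apply the same idea to categorical predictions rather than numeric ones.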
  • model building with the ‘transferring’ method may involve replicating the entire or partial structure of some well-built model as the baseline, and then building the new model for another application.
  • the new dataset may include the original data but usually contains different data to some extent.
  • FIGs. 18-21 respectively relate to the pipeline definition, a pipeline incrementation method, a pipeline boosting method, and a pipeline transferring method, each of which are described in FIG. 17.
  • FIG. 18, for example, illustrates well-defined configurations of a repeatable model building pipeline.
  • Pipelines may be cornerstones of ML framework automation for large scale model building.
  • FIGs. 19A and 19B illustrate exemplary scenarios of using a model building engine to build models with incrementation.
  • models built with a moderate dataset volume may be applied for early market application deployment, and, after accumulating more data, the models may be incrementally built-up, developed and refined with the existing pipelines to improve model performance.
  • In FIG. 19A, in the case of early deployment, an original pipeline may be re-run with newly accumulated data, and the early model may be updated to produce a new model.
  • a pre-set original pipeline may be reconnected to newly accumulated data. All other configuration settings may be kept the same as the original values.
  • the above-described incrementation method relates to a usage of a pipeline, where the pipeline is re-run with increased datasets, while all other configurations are kept in their original form.
  • the newly trained model may be expected to have better performance compared to the original model, and may be applied in the same context of the original context to replace the old model.
  • FIGs. 20A and 20B illustrate exemplary scenarios of using a model building engine to build models with boosting.
  • the performance of existing models may be continually monitored by comparing the prediction results of existing models to newly measured values. For example, when one model performs poorly under some specific conditions, such as when highway conditions or night-time driving is involved, the relevant data samples may be increased and the model structure may be modified, and then the pipeline may be re-run.
  • models built with moderate dataset volumes may be applied for early market application deployment, and, after accumulating more data, the models may be incrementally built with the existing pipelines to improve model performance. Such operations are performed in the Use Case of FIG. 20A and the pipeline operation process of FIG. 20B.
  • the ‘boosting’ method may involve boosting usage of a pipeline, where the pipeline is re-run with manipulated datasets, for example, to intentionally increase the datasets.
  • increasing the datasets may mean adding data related to the conditions under which the model performs poorly, such as when predicting highway conditions.
  • In the boosting method, there is an option of whether or not to use the above-mentioned ensemble method. If the ensemble method is not used, a newly trained model may have the same structure as the original model. In this case, boosting usage may be similar to incrementation usage, except that the datasets are manipulated with consideration of the model's poor performance aspects.
  • a newly trained model may be combined together with the original model; in other words, the new final model may be a combination model containing two or more models (according to the number of rounds the ensemble method is applied).
  • the newly trained model (or combined models) may be applied in the original context to replace the old model.
  • FIGs. 21A and 21B illustrate exemplary scenarios of using a model building engine to build models with a ‘transferring’ method.
  • the available data may be insufficient to build new models, such as when developing applications for a new type of vehicle, expanding to a new market, or introducing a new target indicator.
  • an existing model may be used as the base and the model may be further developed with a limited new dataset, to accelerate the building process and improve model performance.
  • models built with a moderate dataset volume may be applied for early market application deployment, and after accumulating more data the models may be incrementally built with the existing pipelines to improve model performance. Such operations may be performed in the Use Case of FIG. 21A and the pipeline operation process of FIG. 21B.
  • the ‘transferring’ method may include transferring usage of a pipeline, where the application purpose is different from the original model.
  • the purpose of transferring usage of a pipeline may be to train a model that predicts Distance-To-Side-Walk Indicator.
  • the original model is used as a base structure, with its current internal parameters kept as initial values, while the datasets and ground truth values are replaced with different videos and an actual distance to a sidewalk.
  • the newly trained model may be applied to a different context, for example, to predict a Distance-To-Side-Walk Indicator while the original model predicting Time-Before-Collision Indicator is kept in its original state without replacement.
  • exemplary embodiments of the instant disclosure exceed conventional simple ML models, even as dataset scales increase.
  • Conventional ML models are often “shallow” models containing no more than hundreds or thousands of internal parameters.
  • exemplary embodiments of the instant disclosure relate to deep learning models (i.e., complex models) that often contain millions of internal parameters. Deep models may require much more data for model building; therefore, it is very difficult to build such models from scratch, which is often the situation when launching a new business or creating new solutions.
  • the available datasets may be utilized to the maximum extent by taking advantage of data interpolation, augmentation, and robust sampling.
  • exemplary implementation of the exemplary embodiments may make model building tasks more manageable, replicable, and transferrable, so that, as a whole, such complex models may be built in the most efficient way and at the widest scale.
  • Exemplary embodiments set forth herein apply complex ML models (e.g., deep neural networks) to perform operations and calculations.
  • Exemplary embodiments reflect a framework automation architecture to automatically generate the inputs that support complex model building approaches. Based on the time-series prediction characteristics and the indicator-loss function mechanism, worldwide-scale data may be collected, and such big data may be utilized to build and deliver general complex models with efficiency.
  • Vehicle 2200 may include a front facing camera 2210, a rear facing camera 2220, a physiological sensor 2230, a digital tachograph module 2240, and a central computation module 2250.
  • the vehicle 2200 may wirelessly connect to a cloud service 2260.
  • the front facing camera 2210 may detect outside visual data and transmit the data to the central computation module 2250 via a USB interface.
  • the rear facing camera 2220 may detect the driver and other in-vehicle visual data and transmit the data to the central computation module 2250 via a USB interface.
  • the physiological sensor 2230 may obtain driver physiological data and transmit the data to the central computation module 2250 via a WLAN interface such as, for example, Bluetooth or Wi-Fi.
  • the digital tachograph module 2240 may obtain vehicle location and running data, and transmit the data to the central computation module 2250 via a USB interface.
  • the central computation module 2250 may collect sensor data and perform prediction.
  • the central computation module 2250 may communicate to the cloud service 2260 via a mobile data protocol.
  • the cloud service may provide remote functional service, such as, for example, map access or user profile management.
  • the central computation module 2250 may include a USB module 2251, a WLAN module 2252, a data communication module 2253, a GPU module 2254, a CPU module 2255, a memory module 2256, a storage module 2257, and a touch-panel module 2258.
  • the USB module 2251 may implement USB data transmission protocol
  • the WLAN module 2252 may implement WLAN data transmission protocols, such as, for example, Bluetooth and Wi-Fi
  • the data communication module 2253 may implement mobile wireless protocols
  • the GPU module 2254 may perform deep learning related processing
  • the CPU module 2255 may perform general purpose computation processing
  • the memory module 2256 may store data and provide high-speed accessibility
  • the storage module 2257 may permanently store data in media such as SSD
  • the touch-panel module 2258 may provide user input and display functionality.
  • FIG. 23 shows a networking architecture 2300 that includes a vehicle 2310, a network 2320, a server device 2360, and a vehicle and driver information server 2340.
  • the server device 2360 may include a scoring module 2370 and predicting module 2380, which may have similar functionality to the scoring system 160 and predicting system 170 of FIG. 1.
  • Sensors of the vehicle 2310 may transmit sensor data (e.g., sensor data 140 of FIG. 1) to server device 2360 over the network 2320.
  • the vehicle 2310 may receive historical data 150, vehicle info 110, driver profile data 120, and/or map information 130 from the vehicle and driver information server 2340 via a direct communication link.
  • the direct communication link may be a wireless link such as, for example, a Bluetooth, IR, Wi-Fi, NFC link, etc.
  • Communication between the vehicle 2310 and the vehicle and driver information server 2340 may also be indirectly provided over the network 2320.
  • the network 2320 may include any suitable combination of servers, access points, routers, base stations, mobile switching centers, public switching telephone network (PSTN) components, etc., to facilitate communication.
  • each block in the flowcharts or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the block(s) may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • the methods shown herein may generally be implemented in a computing device or system.
  • the computing device or system may be a user level device or system or a server-level device or system. More particularly, the methods may be implemented in one or more modules as a set of logic instructions stored in a machine or computer-readable storage medium such as random access memory (RAM), read only memory (ROM), programmable ROM (PROM), firmware, flash memory, etc., in configurable logic such as, for example, programmable logic arrays (PLAs), field programmable gate arrays (FPGAs), complex programmable logic devices (CPLDs), in fixed-functionality logic hardware using circuit technology such as, for example, application specific integrated circuit (ASIC), complementary metal oxide semiconductor (CMOS) or transistor-transistor logic (TTL) technology, or any combination thereof.
  • computer program code to carry out operations shown in the methods of FIGs. 2A and 2B may be written in any combination of one or more programming languages, including an object-oriented programming language such as JAVA, SMALLTALK, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages.
  • logic instructions might include assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, and/or other structural components that are native to hardware (e.g., host processor, central processing unit/CPU, microcontroller, etc.).


Abstract

Systems, apparatuses, and methods directed to receiving detected data related to at least one of a driver, a vehicle of the driver, and an external environment of the vehicle; determining one or more near-accident indicators based on a plurality of different detected data and historical data related to the detected data, the near-accident indicators relating to a probability of an occurrence of an accident; calculating at least one of the probability that the accident will occur, a risk score, and information related to a reduction of risk, based on the determined one or more near-accident indicators and machine-learning data; continually updating the machine-learning data based on new detected data; and outputting the at least one of the probability that the event will occur, the risk score, and the information related to the reduction of risk.

Description

DISTRACTED DRIVING PREDICTIVE SYSTEM

Field of the Disclosure
This disclosure generally relates to methods, systems, apparatuses, and computer readable media for reducing driving risks.
Background
Drivers of vehicles can benefit from information that helps them to avoid accidents and suggestions that assist them with safe driving. Insurance companies and the like require driver data and driving environment data to accurately assess potential driving risks. If insurance companies know the trends of risky driving patterns, they can take proactive steps to lower such risks. Public agencies, law makers, and the like need driver and driving environment data to optimize regulations and guidelines to increase driving safety. Conventionally, such data and information have not been provided accurately or in a timely manner.
Summary
Consistent with the disclosure, exemplary embodiments of systems, apparatuses, and methods thereof for reducing driving risks are disclosed.
According to an embodiment, there is provided a predictive system comprising: one or more sensors configured to: detect data related to at least one of:
a driver, a vehicle of the driver, and an external environment of the vehicle, and
transmit the detected data; and a server configured to receive the detected data. The server may comprise: a receiver configured to receive the detected data; a processor coupled to a memory, the processor configured to: determine one or more near-accident indicators based on a plurality of different detected data and historical data related to the detected data, the near-accident indicators relating to a probability of an occurrence of an accident, calculate at least one of the probability that the accident will occur, a risk score, and information related to a reduction of risk, based on the determined one or more near-accident indicators and machine-learning data, and continually update the machine-learning data based on new detected data; and a transmitter configured to output the at least one of the probability that the accident will occur, the risk score, and the information related to the reduction of risk.
The processor may be configured to generate time-based predictions of driver risk based on the machine-learning data.
The calculation of the at least one of the probability that the accident will occur, the risk score, and the information related to a reduction of risk may be based on risk scores related to at least one of the driver, the vehicle of the driver, and the external environment of the vehicle.
The generating the time-based predictions of driver risk may comprise generating an initial prediction of driver risk by detecting the data related to at least one of a driver, a vehicle of the driver, and an external environment of the vehicle at one point in time and updating the prediction of driver risk by detecting the data related to at least one of a driver, the vehicle of the driver, and the external environment of the vehicle at a subsequent point in time.
The processor may be configured to automatically generate labels for the near-accident indicators, wherein the labels are to be used to build models for future near-accident indicator prediction.
According to another embodiment, there is provided a method comprising: receiving detected data related to at least one of a driver, a vehicle of the driver, and an external environment of the vehicle; determining one or more near-accident indicators based on a plurality of different detected data and historical data related to the detected data, the near-accident indicators relating to a probability of an occurrence of an accident; calculating at least one of the probability that the accident will occur, a risk score, and information related to a reduction of risk, based on the determined one or more near-accident indicators and machine-learning data; continually updating the machine-learning data based on new detected data; and outputting the at least one of the probability that the accident will occur, the risk score, and the information related to the reduction of risk.
According to yet another embodiment, there is provided an apparatus comprising: a receiver configured to receive detected data related to at least one of a driver, a vehicle of the driver, and an external environment of the vehicle; a processor coupled to a memory, the processor configured to: determine one or more near-accident indicators based on a plurality of different detected data and historical data related to the detected data, the near-accident indicators relating to a probability of an occurrence of an accident, calculate at least one of the probability that the accident will occur, a risk score, and information related to a reduction of risk, based on the determined one or more near-accident indicators and machine-learning data, and continually update the machine-learning data based on new detected data; and a transmitter configured to output the at least one of the probability that the accident will occur, the risk score, and the information related to the reduction of risk.
According to yet another embodiment, there is provided a non-transitory computer readable medium comprising a set of instructions, which when executed by one or more processors of a device, cause the one or more processors to: receive detected data related to at least one of a driver, a vehicle of the driver, and an external environment of the vehicle; determine one or more near-accident indicators based on a plurality of different detected data and historical data related to the detected data, the near-accident indicators relating to a probability of an occurrence of an accident; calculate at least one of the probability that the accident will occur, a risk score, and information related to a reduction of risk, based on the determined one or more near-accident indicators and machine-learning data; continually update the machine-learning data based on new detected data; and output the at least one of the probability that the accident will occur, the risk score, and the information related to the reduction of risk.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
The various embodiments will become apparent to one skilled in the art by reading the following specification and appended claims, and by referencing the following drawings, in which:
FIG. 1 illustrates a schematic diagram of an overall system architecture according to an exemplary embodiment;
FIG. 2A illustrates a method of reducing driving risks according to an exemplary embodiment;
FIG. 2B illustrates a method of reducing driving risks according to an exemplary embodiment;
FIG. 2C illustrates a method of reducing driving risks according to an exemplary embodiment;
FIG. 3 illustrates a block diagram of a computing platform according to an exemplary embodiment;
FIG. 4 illustrates a schematic diagram of a risk scoring framework using near-accident indicators and machine-learning (ML)-based prediction, according to an exemplary embodiment;
FIG. 5 illustrates near-accident indicators according to an exemplary embodiment;
FIG. 6 illustrates a schematic diagram for combining data from various sensors to assess a distraction score, according to an exemplary embodiment;
FIG. 7 illustrates a schematic diagram of a probabilistic machine-learning process according to an exemplary embodiment;
FIG. 8 illustrates time-series prediction models according to an exemplary embodiment;
FIG. 9 illustrates an indicator distribution used to derive a distraction score via an empirical formula, according to an exemplary embodiment;
FIG. 10 illustrates a schematic diagram for automatically generating ML frameworks, according to an exemplary embodiment;
FIG. 11 illustrates a data frame segment, which is used as input data for model building and to transmit information, according to an exemplary embodiment;
FIG. 12 illustrates an instance in which a Time-Before-Collision indicator can be used, according to an exemplary embodiment;
FIG. 13 illustrates a training data generation engine according to an exemplary embodiment;
FIG. 14 illustrates an exemplary instance where linear interpolation is applied, according to an exemplary embodiment;
FIG. 15A illustrates an instance where data augmentation for a specific data frame and indicator is applied, according to an exemplary embodiment;
FIG. 15B illustrates an instance where data augmentation for a specific data frame and indicator is applied, according to an exemplary embodiment;
FIG. 16 shows the combining of sampling techniques using data interpolation, augmentation, and robust sampling, according to an exemplary embodiment;
FIG. 17 illustrates a schematic diagram of a model building engine according to an exemplary embodiment;
FIG. 18 shows pipeline configurable settings and descriptions, according to an exemplary embodiment;
FIG. 19A illustrates different pipeline operation processes according to exemplary embodiments;
FIG. 19B illustrates different pipeline operation processes according to exemplary embodiments;
FIG. 20A illustrates different pipeline operation processes according to exemplary embodiments;
FIG. 20B illustrates different pipeline operation processes according to exemplary embodiments;
FIG. 21A illustrates different pipeline operation processes according to exemplary embodiments;
FIG. 21B illustrates different pipeline operation processes according to exemplary embodiments;
FIG. 22 illustrates a vehicle system according to an exemplary embodiment; and
FIG. 23 illustrates another schematic diagram of a system according to an exemplary embodiment.
Description of the Embodiments
While the concepts of the present disclosure are susceptible to various modifications and alternative forms, specific embodiments have been shown by way of example in the drawings and will be described herein in detail. It should be understood, however, that there is no intent to limit the concepts of the present disclosure to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives consistent with the present disclosure and the appended claims.
References in the specification to "one embodiment," "an embodiment," "an illustrative embodiment," etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include that particular feature, structure, or characteristic. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
The disclosed embodiments may be implemented, in some cases, in hardware, firmware, software, or any combination thereof. The disclosed embodiments may also be implemented as instructions carried by or stored on a machine readable (e.g., computer-readable) medium or machine-readable storage medium, which may be read and executed by one or more processors. A machine-readable storage medium may be embodied as any storage device, mechanism, or other physical structure for storing or transmitting information in a form readable by a machine (e.g., a volatile or non-volatile memory, a media disc, or other media device).
In the related art, risk assessment systems have provided drivers with personal driving suggestions, insurance companies with risk reports, and public agencies with support services that rely on accident data. However, accident data is relatively sparse in volume. Accordingly, exemplary embodiments disclosed herein provide a driver’s risk assessment framework based on ML technologies and quantitative driving risk indicators. Machine learning provides computers and/or systems the ability to learn without being explicitly programmed. Exemplary embodiments make use of sensor data (e.g., in-vehicle video, out-of-vehicle video and physiological data) for risk predictions, so there is no need to rely on accident data that is too sparse in volume.
Individual drivers are aware of the importance of avoiding accidents. They want to utilize available data to obtain personalized suggestions for safe driving. Insurance companies need a more accurate and quickly-updated scoring method to assess potential driving risks. If insurance companies know the trends of risky driving patterns, they can take proactive steps to lower such risks. Public agencies and law makers also need to optimize the regulations and guidelines to enhance driving safety in a society influenced by socio-economic factors such as an aging population and increases in the number of foreign-born laborers and immigrants.
Technological challenges exist with regard to providing individual drivers, insurance companies, and public agencies/law makers with the proper amount of information/data in a timely manner to make informed decisions. For example, driving risk scores need to be quantitatively assessed; however, using accident data, as is the case in the related art, is too coarse in granularity and too sparse in volume. Also, in some situations the aspects of risk associated with drivers need to be separated based on empirical judgements, which is not effectively done in the related art. Additionally, when using various sources of sensor data for the purpose of assessing driving risk, it may be technically difficult to combine the data efficiently without losing any valuable information. It should also be noted that using time-series data to predict risk trends may prove very useful; however, such use of time-series data may be more technically difficult than conventional non-time-series data analytics. Exemplary embodiments of the instant disclosure overcome many of the above-referenced technological challenges.
Exemplary embodiments may provide innovative and novel solutions by defining a set of rules for the purpose of quantitative assessment of driving risks using empirical knowledge, applying ML frameworks to fuse multi-modal data and optimizing the risk assessment system based on such ML frameworks. The ML frameworks may dynamically evolve as more information and data are collected and integrated into the frameworks. Exemplary embodiments not only predict current risk states but also utilize time-series data to predict risk trends.
Referring to FIG. 1, a system 100 according to exemplary embodiment is shown, where nodes in the system 100 may be communicatively coupled via a wired and/or wireless connection. The nodes within the system 100 may include, for example, one or more vehicles 101, one or more scoring systems 160, one or more predicting systems 170, and drivers 180, insurance firms 190, and government agencies 195.
A vehicle 101 may include sensors or detectors for sensing/detecting various types of sensor data 140. The sensors/detectors may include, for example, in-vehicle video cameras, out-of-vehicle video cameras, geolocation sensors, digital tachographs, and/or sensors to detect physiological information of the driver. The sensor data 140 may be collected, stored and analyzed.
The sensor data along with other additional data may be provided to a scoring system 160 to calculate and determine a risk assessment score. The additional data in addition to the sensor data 140 may include, for example, vehicle information 110, driver profile data 120, location information (e.g., map information) 130, and/or historical data 150. The historical data 150 may include, for example, historical accident data and driving log data. The predicting system 170 may provide additional information to the scoring system 160 to assist the scoring system 160 in calculating and determining a risk assessment score. The risk assessment score that is produced by the scoring system may be used to trigger an output, including, for example, an alert, a report, or the activation of a control to effectuate an action. The generated risk assessment score may be transmitted to drivers 180, insurance firms 190, and governments 195.
As described above, based on the sensor data 140, a risk assessment score may be calculated according to an exemplary embodiment. A method 200 for calculating a risk assessment score, according to an exemplary embodiment, is shown in FIGS. 2A-2C.
In FIG. 2A, sensor data 140 may be obtained, sensed, detected and/or acquired via the various types of sensors described above, or other sensors that are not explicitly mentioned. The sensors, which may include, but are not limited to, in-vehicle video cameras, out-of-vehicle video cameras, geolocation sensors, digital tachographs, and/or physiological data sensors, may be physically integrated into the vehicle 101, or may be separate from the vehicle 101 and may communicate with the vehicle 101 through wired and/or wireless communication means and associated protocols (e.g., Ethernet, Bluetooth(R), Wi-Fi(R), WiMAX, LTE, etc.) to effect such communication. The sensors may also be integrated into a mobile device such as, for example, a cell phone, personal digital assistant (PDA), tablet, laptop, or other portable computing device.
In block 210, a computing device such as, for example, a server, may receive the above-described sensor data 140 related to a driver, a vehicle, and/or an external environment of the vehicle. In block 215, one or more near-accident indicators may be determined and/or generated based on the sensor data 140. In block 220, the probability that an event will occur, a risk score, and/or information related to reduction of risk may be calculated or generated based on the near-accident indicators and ML data. According to an exemplary embodiment, ML algorithms may be used to train the ML data of the system 100 based on previous data that is sensed and/or obtained. The probability may be calculated by the computing device illustrated in FIG. 3.
In block 230, the ML data may be continually updated based on newly detected data. Next, in block 240, the calculated probability that an event will occur, a risk score, and/or information related to reduction of risk, may be generated and transmitted to a separate device, where the probability and/or risk assessment score may be considered and/or analyzed by, for example, a driver, an insurance firm, and/or a government agency.
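The flow of blocks 210-240 might be sketched, in highly simplified form, as follows. This is an illustrative sketch only: the function names, the yawn-rate feature, and the weighting scheme are assumptions, not elements disclosed in the specification, and a real system would use the ML models described later.

```python
# Hypothetical sketch of the risk-scoring loop of FIG. 2A (blocks 210-240).
# All names and the simple weighting formula are illustrative assumptions.

def near_accident_indicators(sensor_data, historical_data):
    """Block 215: derive indicator values from detected and historical data."""
    # e.g., a drowsiness proxy from yawns per minute, scaled against history
    yawn_rate = sensor_data.get("yawns_per_min", 0.0)
    baseline = historical_data.get("mean_yawns_per_min", 0.5)
    drowsiness = min(1.0, yawn_rate / (baseline * 4 + 1e-9))
    return {"drowsiness": drowsiness}

def risk_score(indicators, ml_weights):
    """Block 220: combine indicators with (stub) machine-learning weights."""
    return sum(ml_weights.get(k, 0.0) * v for k, v in indicators.items())

sensor_data = {"yawns_per_min": 2.0}           # block 210: received data
historical_data = {"mean_yawns_per_min": 0.5}
ml_weights = {"drowsiness": 0.8}               # stand-in for learned ML data

indicators = near_accident_indicators(sensor_data, historical_data)
score = risk_score(indicators, ml_weights)     # block 240 would transmit this
print(round(score, 3))                          # ≈ 0.8
```

Blocks 230 and 240 (continual model updates and transmission to drivers, insurers, or agencies) are omitted; the sketch only shows how detected data could flow into indicators and then into a score.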
FIG. 2B illustrates additional aspects of a method of reducing driving risks, according to an exemplary embodiment. For example, in block 250, the operation of calculating a probability that an event will occur, a risk score, and/or information related to reduction of risk may include generating risk scores related to the driver, the driver vehicle, and/or an external environment of the vehicle.
Block 260 shows an additional operation of generating time-based predictions related to driver risk based on ML data.
FIG. 2C illustrates even more aspects of a method of reducing driving risks, according to an exemplary embodiment. For example, in block 270, the operation of generating the time-based predictions of driver risk may include generating an initial prediction of driver risk by detecting the data related to a driver, a vehicle of the driver, and/or an external environment of the vehicle at one point in time and updating the prediction of driver risk by detecting the data related to the driver, the vehicle of the driver, and the external environment of the vehicle at a subsequent point in time.
Block 280 shows an additional operation of automatically generating labels for the near-accident indicators. These labels may be used to build models that can be used for near-accident indicator prediction.
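One plausible reading of block 280 is threshold-based automatic labelling: frames whose computed indicator crosses an empirical limit are marked as near-accident examples, yielding training labels without manual annotation. The sketch below assumes a Time-Before-Collision threshold of 2 seconds; that value and the frame layout are illustrative, not taken from the specification.

```python
# Hypothetical sketch of automatic label generation (block 280): frames are
# labelled "near-accident" when a computed indicator crosses an empirical
# threshold. The 2-second Time-Before-Collision threshold is an assumption.

TBC_THRESHOLD_S = 2.0   # below this, the frame is treated as a near-accident

def auto_label(frames):
    """Attach a near_accident label to each sensor frame."""
    return [dict(frame, near_accident=frame["tbc_s"] < TBC_THRESHOLD_S)
            for frame in frames]

frames = [{"t": 0, "tbc_s": 6.0}, {"t": 1, "tbc_s": 1.4}, {"t": 2, "tbc_s": 3.2}]
labelled = auto_label(frames)
print([f["near_accident"] for f in labelled])  # [False, True, False]
```

The labelled frames could then serve as supervised training data for the indicator-prediction models described below.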
Referring now to FIG. 3, an exemplary computing device 300 (e.g., a server device) for performing the methods of FIGS. 2A-2C and calculating a risk assessment score is shown. The computing device 300 may include a processor 320, a memory 326, a data storage 328, a communication subsystem 330 (e.g., transmitter, receiver, transceiver, etc.), and an I/O subsystem 324. Additionally, in some embodiments, one or more of the illustrative components may be incorporated in, or otherwise form a portion of, another component. For example, the memory 326, or portions thereof, may be incorporated in the processor 320 in some embodiments. The computing device 300 may be embodied as, without limitation, a mobile computing device, a smartphone, a wearable computing device, an Internet-of-Things device, a laptop computer, a tablet computer, a notebook computer, a computer, a workstation, a server, a multiprocessor system, and/or a consumer electronic device.
The processor 320 may be embodied as any type of processor capable of performing the functions described herein. For example, the processor 320 may be embodied as a single or multi-core processor(s), digital signal processor, microcontroller, or other processor or processing/controlling circuit.
The memory 326 may be embodied as any type of volatile or non-volatile memory or data storage capable of performing the functions described herein. In operation, the memory 326 may store various data and software used during operation of the computing device 300 such as operating systems, applications, programs, libraries, and drivers. The memory 326 is communicatively coupled to the processor 320 via the I/O subsystem 324, which may be embodied as circuitry and/or components to facilitate input/output operations with the processor 320, the memory 326, and other components of the computing device 300.
The data storage device 328 may be embodied as any type of device or devices configured for short-term or long-term storage of data such as, for example, memory devices and circuits, memory cards, hard disk drives, solid-state drives, non-volatile flash memory, or other data storage devices. With respect to calculating and determining a risk assessment score, the data storage device 328 may store the above-discussed detected data and/or ML data.
The computing device 300 may also include a communications subsystem 330, which may be embodied as any communication circuit, device, or collection thereof, capable of enabling communications between the computing device 300 and other remote devices over a computer network (not shown). The communications subsystem 330 may be configured to use any one or more communication technologies (e.g., wired or wireless communications) and associated protocols (e.g., Ethernet, Bluetooth(R), Wi-Fi(R), WiMAX, LTE, etc.) to effect such communication.
As shown, the computing device 300 may further include one or more peripheral devices 332. The peripheral devices 332 may include any number of additional input/output devices, interface devices, and/or other peripheral devices. For example, in some embodiments, the peripheral devices 332 may include a display, touch screen, graphics circuitry, keyboard, mouse, speaker system, microphone, network interface, and/or other input/output devices, interface devices, and/or peripheral devices. The computing device 300 may also perform one or more of the functions described in detail below and/or may store any of the databases referred to below.
Referring now to FIG. 4, this figure illustrates a schematic diagram of a risk scoring framework 400 using near-accident indicators and ML-based prediction, according to an exemplary embodiment. In FIG. 4, a set of near-accident indicators 410 are defined based on a specific purpose. Within a ML framework automation architecture 420, there may be provided a defined dataset combination 425. According to an exemplary embodiment, the dataset combination may relate to time-series observations and new features (e.g., traffic signals).
The dataset combination 425 may include sensor data that is sensed or detected by various sensors/detectors. The sensors/detectors may include, for example, in-vehicle video camera, out-of-vehicle video cameras, geolocation sensors, digital tachographs, and/or sensors to detect physiological data of a driver.
Machine learning-based prediction may be provided within the ML framework automation architecture 420. The ML-based prediction component may provide short-term indicator prediction and/or long-term indicator prediction. The risk scoring framework 400 may implement an empirical link formula 430 for risk scoring, where the empirical link formula 430 is based on empirical data. The empirical link formula 430 may be combined with indicator distribution to calculate, generate, and/or determine distraction scores 440, which may be further applied for risk assessment for clients 450. The ML framework automation may continuously refine models, according to an exemplary embodiment.
The ML framework automation architecture 420 may generate data models from a variety of different types of data, such as time-series data. The time-series data may include a sequence of data points or data records for a data source over time, such as successive measurements taken by a sensor/detector at various intervals over a period of time. Data sources may include any of a variety of appropriate devices, sensors, systems, components, and/or other appropriate systems that are capable of and configured to generate data records, such as time-series data. For example, the devices for detecting the data constituting the dataset combination may detect time-series data. Such devices measure one or more aspects of their current state or the surrounding environment's current state, and communicate those measurements as data (via wired and/or wireless connections) to a computing device within a vehicle, over the internet and/or other communication networks, such as local area networks (LANs), wide area networks (WANs), virtual private networks (VPNs), wireless networks (e.g., Wi-Fi networks, BLUETOOTH networks, mobile data networks), or any combination thereof.
For example, the time-series data may include measurements that are taken by engine sensors that take measurements related to the operation and performance of various components of an engine, such as measurements of the temperature, rate of rotation, pressure, flow rate, receipt of control signals, transmission of control signals, and/or other appropriate measurements. Such exemplary time-series data may be used, for example, to generate data models for an engine, which may be used to prospectively identify and correct issues with the engine before they occur (e.g., identify indicators that belts are wearing and in need of replacement) and/or to reactively determine which component(s) of an engine caused a malfunction (e.g., which components are causing vibrations at low speeds).
Referring to FIG. 5, an exemplary list 510 of near-accident indicators is shown. Examples of near-accident indicators 510 include: “Time-Before-Collision”, “Distance to white lines”, “Distance to side walk”, “Response time to traffic signal change”, “Degree of drowsiness”, etc. The near-accident indicators 510 may be factors that can be used prospectively to identify, for example, the risk of an accident occurring. These example near-accident indicators 510 may be determined based on domain knowledge; however, other candidate domain-specific near-accident indicators may be identified via domain-specific surveys and/or investigation. According to an exemplary embodiment, the relevance between near-accident indicators such as “Time-Before-Collision” and driving distraction indicators (e.g., “Degree of drowsiness”) may be determined based on distraction scores, which in turn are determined at least based on a Loss Function. Distraction scores and Loss Functions are discussed in more detail below. Some of the near-accident indicators 510 may be predicted based on machine learning. Exemplary formulas 520 for calculating or estimating a Time-Before-Collision indicator are also shown in FIG. 5 from the perspective of Car 1. The calculations may be performed via a computing device in Car 1 or via a remote computing device with which Car 1 communicates. The formulas 520 may be based on the distance between two cars (e.g., Car 1 and Car 2) and their respective velocities. The distance between Car 1 and Car 2 may be determined by, for example, one or more out-of-car video cameras. The velocity of Car 2 may also be detected by the one or more out-of-car video cameras, or by other object velocity detectors, which may use ultrasonic, radar or light-based measurement. The velocity of Car 1 may be detected by its speedometer, or determined from an analysis of geolocation information from sensor data 140.
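Since the exact formulas 520 appear only in FIG. 5, the function below uses the standard closing-speed approximation as an illustrative stand-in: the gap to the lead car divided by the rate at which that gap is closing, assuming constant speeds.

```python
# Sketch of a Time-Before-Collision estimate from Car 1's perspective:
# distance to the lead car divided by the closing speed. This is the common
# closing-speed approximation, not a reproduction of the figure's formulas.

def time_before_collision(distance_m, v1_mps, v2_mps):
    """Seconds until Car 1 reaches Car 2, assuming constant speeds.

    distance_m: gap measured by the out-of-car camera, in meters
    v1_mps:     Car 1 speed (speedometer / geolocation), m/s
    v2_mps:     Car 2 speed (camera or radar estimate), m/s
    Returns float('inf') when the gap is not closing.
    """
    closing_speed = v1_mps - v2_mps
    if closing_speed <= 0:
        return float("inf")   # not approaching: no collision predicted
    return distance_m / closing_speed

print(time_before_collision(30.0, 25.0, 20.0))  # 30 m gap closing at 5 m/s -> 6.0
```

A short Time-Before-Collision (e.g., under a couple of seconds) is precisely the kind of value a risk system could flag as a near-accident event.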
According to an exemplary embodiment, data related to the Near-Accident-Indicators may be collected and used to continually develop and refine a risk assessment system. Machine-based learning may be used to predict trends in the Near-Accident-Indicators that may lead to better identification of symptoms that contribute to increased driving risk. Such is made possible by defining quantitative measurements (e.g., Near-Accident-Indicators) to assess a driver and a driver vehicle’s condition with respect to driving risk.
Referring to FIG. 6, a schematic diagram for combining data from various sensors to assess driver risk, according to an exemplary embodiment, is illustrated. The sensor data may include, for example, in-vehicle video data 610 detected by in-vehicle cameras; outside video data 620, which may be detected by out-of-vehicle cameras and may be video of the environment outside of the vehicle; and physiological data 630 of the driver, which may be detected by physiological sensors. The in-vehicle cameras may detect in-vehicle features 650 such as, for example, whether a driver is using a mobile device, whether a driver is yawning, and/or the head orientation of a driver. The out-of-vehicle video cameras may detect out-of-vehicle features 660 such as, for example, the distance between one vehicle and another, the distance between a vehicle and pedestrians, and the state of traffic signals. The physiological data sensors may detect physiological features 670 such as, for example, the heart rate, body temperature, and brain waves of a driver. The features 650, 660, 670 detected by the various sensors may be extracted and combined to predict different indicators, which may be used to determine driver risk. As opposed to using a single feature to predict indicators, combining multiple sources of sensor data may provide higher confidence in the predicted indicators. For example, in the case of a yawn event, analysis of multiple features may lead to a more accurate indicator of drowsiness. The indicators may include, but are not limited to: “Degree or Change Rate of Drowsiness” 680, “Degree or Change Rate of Concentration” 685, “Degree or Change Rate of Stress” 690, and/or “Degree or Change Rate of Risky Driving Behaviors” 695.
Referring to FIG. 7, a probabilistic machine-learning framework for training and continually updating the risk assessment system, according to an exemplary embodiment, is illustrated. Conventional statistical models (e.g., based on accident data), such as regression analysis, are not satisfactory because they do not deal with complicated combinations of highly unstructured sensor data, and predictions based on such models are not reliable. According to the probabilistic ML framework of the exemplary embodiment, ML algorithms such as, for example, deep neural networks may handle complicated combinations of unstructured data and perform predictions.
According to the exemplary embodiment as shown in FIG. 7, a risk assessment system model may be predefined before it is updated and/or trained. Historical data 710 can be used along with a pre-defined model 720, and the pre-defined model 720, together with the historical data 710, can be trained/updated, via a training/updating operation S730, to create a trained/updated model 740. The training/updating operation S730 may include, but is not limited to, one-pass batched training or multi-epoch mini-batched training. The trained/updated model 740 may be continuously updated and personalized, via online training S750, using real-time sensor data 760. In the process of online training S750, the trained model 740 may be the input model, and may be incrementally trained using real-time sensor data 760, which may be obtained online and may have the same data format as the historical data 710. The output of the online training operation S750 may be an updated trained model 740, which replaces the older version. The trained model 740 may then generate indicators (e.g., Near-Accident-Indicators) via a prediction operation S770. Such indicators (e.g., a driver's current level of drowsiness) are uncertain and cannot be directly observed. Thus, driver risk may be assessed much better than with conventional models. The prediction operation S770 may include the trained model 740 being applied to real-time sensor data 760 to produce an indicator distribution 780. A difference between S750 and S770 is that in S750 labelled data must also be provided in real time for online training, while in S770 labelled data is unavailable. Indicators having the same data format as the labelled data may be predicted during the prediction operation S770.
Historical data 710 may contain previously measured sensor data including, but not limited to, sensor data 610, 620, 630. Pre-defined model 720 may be, but is not limited to, a regression model (e.g., linear, non-linear, or logistic regression), a decision tree, a random forest, a support vector machine (SVM), a neural network, or a deep neural network model.
As noted above with respect to FIG. 7, the pre-defined model 720 may be trained/updated periodically, via operation S730, with newly accumulated historical data 710, whereupon the pre-defined model 720 becomes the trained model 740. The trained model 740 may be continuously updated and personalized via online training S750 using real-time sensor data 760, and may then provide prediction distributions 780 over indicators (e.g., Near-Accident-Indicators), which are typically not directly observable (e.g., a current driver’s level of drowsiness) and are uncertain. Thus, driver risk may be assessed much better than with any conventional model.
Referring to FIG. 8, time-series prediction models generated in the embodiment of FIG. 7 are shown according to a further exemplary embodiment. Shown in FIG. 8 are time-series prediction models 810 and 820 reflecting the model behavior at different points of time T1 and T2, point T2 being after the time-series prediction model 810 has been updated with newly observed data. Time-series prediction is another advantage of probabilistic-based ML approaches. Even when time-series prediction model 810 is not yet updated, it may take full advantage of sequential sensor data. Model 810, even though not updated, may provide future predictions of risk indicators using risk symptoms (e.g., a yawn), so that corrective actions may be taken based on predictions indicated by the model 810.
Time-series prediction model 820, on the other hand, reflects a model that may continuously update its predicted distribution over indicators using newly observed data. The updating of the time-series prediction model 820 may occur via an online updating scheme. The trained time-series prediction models may continuously provide long-term prediction of near accident indicators.
A benefit of a continually updated time-series model is evident from time-series models 810 and 820. In time-series model 810, the predicted driver assessment is based on data collected about the driver at a particular point in time T1. Time-series model 820 illustrates the driver’s state detected at a point in time T2 later than the time point T1 of time-series model 810; because it incorporates the updated data, time-series model 820 reflects a more accurate driver risk assessment prediction than one based on the driver state data detected at the earlier point in time T1 in time-series model 810.
Referring now to FIG. 9, an empirical link formula for risk scoring according to an exemplary embodiment is shown. The empirical link formula may integrate a loss function and an indicator distribution to determine a distraction score Score(x). The distraction score Score(x) may be an overall driving risk score of the target driver, and is computed by integrating the Indicator Distribution p(y | x, D) with the Loss Function L(y), where y is a Near-Accident Indicator, x is input data, and D is training data. The Indicator Distribution may be obtained as discussed with respect to FIGs. 7 and 8. The Loss Function of y may be determined according to empirical knowledge (e.g., empirical historical data) based on a target application (e.g., if the Time-Before-Collision indicator y > 3 seconds, impose no penalty). Although the Loss Function may be predefined and/or based on a threshold, the Loss Function, according to an exemplary embodiment, may be more than a mere threshold and may take into consideration indicator values as well as distributions based on in-domain experience and knowledge (e.g., from driving safety experts). The domain of definition of the Loss Function may be consistent with that of the Indicator Distribution function.
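Interpreted as an expectation, the link formula computes Score(x) as the expected loss under the predicted indicator distribution, i.e., Score(x) = Σ L(y) · p(y | x, D) over a discretized grid of indicator values. The grid, the probabilities, and the particular piecewise-linear loss below are illustrative assumptions, not values from the disclosure:

```python
# Hedged sketch: the distraction score as the expectation of an
# empirical loss L(y) under the predicted indicator distribution
# p(y | x, D), for a Time-Before-Collision indicator y in seconds.

def loss(y):
    """Assumed empirical loss: zero penalty above 3 s,
    growing linearly as y approaches 0."""
    return max(0.0, 3.0 - y)

def score(ys, probs):
    """Score(x) = sum_y L(y) * p(y | x, D) over a discretized grid."""
    assert abs(sum(probs) - 1.0) < 1e-9   # distribution must normalize
    return sum(loss(y) * p for y, p in zip(ys, probs))

# A predicted distribution concentrated on safe (> 3 s) values scores
# near zero; one concentrated on short times scores high.
grid = [0.5, 1.5, 2.5, 3.5, 4.5]
safe_driver = [0.0, 0.0, 0.1, 0.4, 0.5]
risky_driver = [0.4, 0.4, 0.2, 0.0, 0.0]
print(score(grid, safe_driver))    # small
print(score(grid, risky_driver))   # larger
```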
The importance of risk indicators may vary based on their application. For example, degree of drowsiness may be particularly crucial to truck drivers, who usually have long drives. In this exemplary scenario, the loss function may impose stronger penalties on drowsiness indicators, without changing the indicator prediction distributions. By combining the predictive distribution over Near-Accident Indicators and empirical loss functions, quantitative evaluations of driving risks may be achieved. Empirical loss functions may be designed according to each of the indicators and target applications. Thus, the indicator predictions may be reusable across various applications.
In FIG. 10, according to an exemplary embodiment, a schematic diagram for automatically generating ML frameworks is illustrated. The ML framework automation architecture 1000 shown in FIG. 10 may automatically generate machine-learning frameworks, which in turn may automatically build complex models from large-scale data in a short time span. In FIG. 10, various sensors may detect and obtain information on the driver, the driver’s vehicle, and/or factors external to the vehicle. The collected data may be referred to as sensor data 1010. The sensor data 1010 may be analyzed and subjected to time-series modeling 1020. The collected and analyzed sensor data may be received from one or more vehicles, and stored in a dataset storage 1030.
The above-described dataset storage 1030 may be embodied as any type of storage means, including memory. The dataset storage 1030 may be embodied as any type of volatile or non-volatile memory or data storage capable of performing the functions described herein. In operation, the dataset storage 1030 may store various data and software. The dataset storage 1030 may be provided in a vehicle or may be part of a separate device and/or server communicatively coupled to the driver vehicle.
The stored data that has been collected from one or more vehicles may be transmitted to a training data generation engine 1040, where the transmitted data may be subject to interpolation, augmentation, and/or robust sampling. The data that is manipulated by the training data generation engine may be transmitted to a model building engine 1050, in which the data may be subjected to loss function analysis based on how the data will be used. The training data generation engine 1040 and model building engine 1050 may be separate computing devices, similar to that which is illustrated in FIG. 3, or may constitute a single computing device.
Referring to FIG. 11, a data frame segment 1100, which may be used as input data for model building and for transmitting information according to an exemplary embodiment, is shown. A data frame 1110 of the data frame segment 1100 is also shown. A data frame 1110 may be a time-point-associated information set that contains useful data for model building. A set of successive data frames 1110 in a defined time span may constitute a data frame segment 1100, which may be used as input data for model building and for communication between the aforementioned sensors and server computing device.
All sensor data, as well as features extracted from the raw sensor data, that are associated with a specific time point may be grouped into a data frame 1110, which may include video images, sensor observations, transformed time-specific variables, etc. Some category or attribute data, such as driver age/gender and vehicle type, may also be integrated into the data frames 1110. By doing so, all category/profile-specific information may be moved from model building hyper-parameters to input data sets, so that the generality of the data frames may be enhanced.
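One possible way to organize a data frame 1110 and a data frame segment 1100 in code is sketched below. The field names (`sensors`, `features`, `profile`) and the time-span grouping rule are assumptions for illustration, not structures defined by the disclosure:

```python
# Illustrative grouping of time-point-associated readings into a data
# frame (FIG. 11, element 1110) and of successive frames into a data
# frame segment 1100.

from dataclasses import dataclass, field
from typing import Any

@dataclass
class DataFrame:
    timestamp: float                                        # associated time point
    sensors: dict[str, Any] = field(default_factory=dict)   # raw observations
    features: dict[str, Any] = field(default_factory=dict)  # extracted features
    profile: dict[str, Any] = field(default_factory=dict)   # driver/vehicle attributes

def make_segment(frames, span):
    """A segment is a set of successive frames within a defined time span."""
    start = frames[0].timestamp
    return [f for f in frames if f.timestamp - start <= span]

frames = [
    DataFrame(t, sensors={'speed_kmh': 60 + t},
              profile={'driver_age': 45, 'vehicle_type': 'truck'})
    for t in range(10)
]
segment = make_segment(frames, span=4)   # frames at t = 0..4
```

Folding the profile attributes into each frame, rather than into model hyper-parameters, mirrors the generality argument above.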
Now referring to FIG. 12, an exemplary scenario involving asynchronized automatic labelling of indicators is shown. The data frame segments 1100 may be used to build models for indicator prediction. Any possible observation or measurement methods, including, for example, using a beacon, light detection and ranging (LiDAR), or video taking via a camera, may be used to generate labels of indicators. Such labelled indicators may be asynchronously matched to relevant data frame segments.
According to an exemplary embodiment, a label of an indicator may correspond to ground truth data. In an exemplary scenario, a beacon system may generate two measured distances, from the beacon sensor to each of two successive cars running along the road, based on the estimated running speeds of the two cars. Therefore, the beacon system may be able to accurately calculate the future distance between the two cars, and such distance data may be considered ground truth data for calculating true indicator values such as the Time-Before-Collision Indicator.
In the case that a Time-Before-Collision Indicator may be used for labelling, when any observation or measurement is done by a beacon, LiDAR, or camera, a label may be automatically generated with the measured value. When an indicator label is automatically generated, the stored data frame segments may be backtracked to identify the data frame segments corresponding to this label. For example, in FIG. 12, a measurement indicator 1220 may be generated at time T, where the measurement indicator is the Time-Before-Collision (i.e., time before collision between Car 1 and Car 2) indicator. With respect to the indicator 1220 generated at time T, two candidate data frame segments 1210 may be found, which are generated at previous times T1 and T2, respectively. The pairs of matched data frame segments 1210 and indicator labels may form the dataset used for complex model building.
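The asynchronized backtracking described above may be sketched as follows. The five-second lookback window and the segment bookkeeping are assumptions chosen for illustration:

```python
# Sketch of the asynchronized automatic labelling of FIG. 12: a label
# measured at time T (e.g., Time-Before-Collision) is matched back to
# candidate data frame segments recorded earlier.

def match_label(label_time, label_value, stored_segments, lookback=5.0):
    """Return (segment, label) pairs for segments whose end time falls
    within `lookback` seconds before the label's measurement time."""
    pairs = []
    for seg_end_time, segment in stored_segments:
        if 0.0 <= label_time - seg_end_time <= lookback:
            pairs.append((segment, label_value))
    return pairs

# Segments ending at t = 92 and t = 96 become the two candidates for a
# Time-Before-Collision label of 2.4 s measured at t = 97; the segment
# ending at t = 80 is too old to match.
stored = [(80.0, 'segment_A'), (92.0, 'segment_B'), (96.0, 'segment_C')]
pairs = match_label(97.0, 2.4, stored)
```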
FIG. 13 provides a detailed illustration of a training data generation engine 1300, which may be similar to the training data generation engine 1040 illustrated in FIG. 10. The training data generation engine 1300 may include a data interpolation module 1310, a data augmentation module 1320, and/or a robust sampling module 1330 for applying data interpolation, augmentation, and robust sampling technologies to automatically generate a training data set for complex model building. The data interpolation module 1310 may perform data interpolation by increasing the matched pairs of data frame segments and indicators, finding data frame segments similar to the actually labelled ones and assigning synthetic label values to them. Similarity algorithms for finding similar data frames may include, but are not limited to, Euclidean distance, Jaccard index, and Kolmogorov similarity algorithms. The data augmentation module 1320 may perform data augmentation by increasing the number of unique data frame segments, adding one or more kinds of variation to the original ones. Augmentation algorithms for image processing may include, but are not limited to, translation, rotation, and brightness/contrast adjustment. The robust sampling module 1330 may perform robust sampling by increasing the combination instances of data frame segments, as well as balancing the distribution of the labels, so as to make the model building process more robust and reduce the risk of data distribution bias.
Data interpolation, data augmentation, and robust sampling are discussed in further detail below with respect to FIGs. 14, 15A, 15B, and 16.
Referring now to FIG. 14, a scenario in which data interpolation is applied, and exemplary similarity algorithms, are shown according to an exemplary embodiment. That is, box 1410 lists several different similarity algorithms that may be used to find similar data frames according to an exemplary embodiment. Also shown is an exemplary scenario 1420 in which data interpolation is applied to automatic training data generation. When applying linear interpolation, the parameters to interpolate the corresponding data frame segment may first be determined, and then those parameters may be used to interpolate the indicator.
For example, in exemplary scenario 1420, based on distance and velocity measurements from beacon A and beacon B, indicators TBCt1 and TBCt3 may be calculated at times t1 and t3, respectively. At t2, there is no measurement. The corresponding collected data frame segments at t1, t2, and t3 are DFSt1, DFSt2, and DFSt3. Subsequently:
1) parameters a and b may be calculated that produce an approximation DFS't2 by the linear formula DFS't2 = a * DFSt1 + b * DFSt3, such that the similarity distance between DFSt2 and DFS't2 is minimized; and
2) the interpolated indicator TBCt2 may be calculated with the linear formula TBCt2 = a * TBCt1 + b * TBCt3.
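The two interpolation steps of scenario 1420 may be sketched as an ordinary least-squares fit of the parameters a and b over the segment features, followed by the linear interpolation of the indicator. The toy feature vectors and indicator values below are assumptions:

```python
# Sketch of the interpolation of FIG. 14 (scenario 1420): fit a, b so
# that a*DFSt1 + b*DFSt3 best approximates DFSt2 in the least-squares
# sense, then interpolate the missing indicator as
# TBCt2 = a*TBCt1 + b*TBCt3.

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def fit_ab(d1, d2, d3):
    """Solve the 2x2 normal equations minimizing ||d2 - a*d1 - b*d3||."""
    a11, a12, a22 = dot(d1, d1), dot(d1, d3), dot(d3, d3)
    r1, r2 = dot(d1, d2), dot(d3, d2)
    det = a11 * a22 - a12 * a12
    a = (r1 * a22 - r2 * a12) / det
    b = (a11 * r2 - a12 * r1) / det
    return a, b

# Toy segment features at t1, t2, t3; DFSt2 happens to be the midpoint
# of DFSt1 and DFSt3, so a = b = 0.5 and TBCt2 is the indicator midpoint.
DFSt1, DFSt2, DFSt3 = [1.0, 2.0], [2.0, 3.0], [3.0, 4.0]
TBCt1, TBCt3 = 4.0, 2.0
a, b = fit_ab(DFSt1, DFSt2, DFSt3)
TBCt2 = a * TBCt1 + b * TBCt3
```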
Referring to FIGs. 15A and 15B, the application of data augmentation and various augmentation methods are shown, respectively, according to an exemplary embodiment. FIG. 15A illustrates that, when data augmentation is applied, a number of extra data frame segments 1520 associated with an initial data frame 1510 may be generated. FIG. 15B lists several different selectable augmentation methods (e.g., related to image data).
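Two of the image-oriented augmentation methods of FIG. 15B (brightness adjustment and translation) may be sketched in pure Python. The 2D list of pixel intensities is a stand-in for a real video frame:

```python
# Illustrative augmentation of a single frame's image data: each
# transform generates an extra variant of the original frame.

def adjust_brightness(img, delta):
    """Shift every pixel intensity by delta, clamped to [0, 255]."""
    return [[max(0, min(255, p + delta)) for p in row] for row in img]

def translate_right(img, shift, fill=0):
    """Shift each row right by `shift` (> 0) pixels, padding with `fill`."""
    return [[fill] * shift + row[:-shift] for row in img]

original = [[10, 20, 30],
            [40, 50, 60]]
augmented = [adjust_brightness(original, 100),
             translate_right(original, 1)]
```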
Referring now to FIG. 16, the original distributions of input data are often unbalanced and therefore introduce a risk of model bias. By combining data interpolation and augmentation with robust sampling, the robustness of the model may be increased so that prediction performance is kept at a stable, satisfactory level. In an original distribution of data model 1610, some value ranges of data may be sparse, so the samples there are few. Consequently, the distribution may be unbalanced, which may cause model performance issues such as biased prediction. The original distribution of data may be subjected to interpolation and augmentation model 1620. Based on the data interpolation and augmentation model 1620, the number of samples across the entire range of the distribution may be increased. Robust sampling 1630 may then be applied to the interpolated and augmented data, via down-sampling and over-sampling. By down-sampling the dense ranges of data while over-sampling the sparse ranges, a better-balanced distribution of input data may be obtained for model building purposes.
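The down-sampling/over-sampling step of FIG. 16 may be sketched as follows. The binning function and the per-bin target count are assumed knobs, not parameters from the disclosure:

```python
# Sketch of robust sampling (FIG. 16): down-sample dense label ranges
# and over-sample sparse ones so every bin contributes equally.

import random

def rebalance(samples, bin_of, target_per_bin, seed=0):
    """samples: list of (data, label); bin_of: label -> bin key.
    Dense bins are down-sampled without replacement, sparse bins
    over-sampled with replacement, to `target_per_bin` items each."""
    rng = random.Random(seed)
    bins = {}
    for s in samples:
        bins.setdefault(bin_of(s[1]), []).append(s)
    balanced = []
    for items in bins.values():
        if len(items) >= target_per_bin:
            balanced.extend(rng.sample(items, target_per_bin))     # down-sample
        else:
            balanced.extend(rng.choices(items, k=target_per_bin))  # over-sample
    return balanced

# Labels below 3.0 dominate the raw data (10 vs. 2); after
# rebalancing, both bins contribute 4 samples each.
raw = [('d%d' % i, 1.0) for i in range(10)] + [('s1', 4.0), ('s2', 5.0)]
balanced = rebalance(raw, bin_of=lambda y: y > 3.0, target_per_bin=4)
```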
In FIG. 17, a Machine Learning Automation Architecture according to an exemplary embodiment may be used to create a model building engine 1700 that, in turn, builds a wide range of complex models. Methods of pipeline, incrementation, boosting, and transferring, as shown in FIG. 17, may be used to facilitate model building processes to improve model performance.
In model building engine 1700, a model building pipeline 1720 may organize a set of key configurations including data sources, target variable(s), model structure, cost function, regularization and optimization schemes, as well as other hyper-parameters. A model building pipeline may be a defined sequence of model building tasks, such as loading datasets, splitting datasets for training and validation, loading configured hyper-parameters, performing model training operations, and performing model validation after training. Model building pipelines may be managed and run in parallel computation environments. Model building with an incrementation method may continually use an existing pipeline 1730 with newly obtained data. The model may be enhanced based on the enlarged training dataset. Model building with a ‘boosting’ method focuses on what an initial model does poorly and uses newly obtained data to correct the initial model. The original pipeline may be modified to incorporate this usage, typically by applying ensemble methods.
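A pipeline as a configured sequence of tasks may be sketched in miniature. The task list, configuration keys, and the trivial "model" below are assumptions for illustration only:

```python
# Minimal sketch of a model building pipeline: load -> split -> train
# -> validate, driven by a configuration dictionary (hypothetical keys).

def run_pipeline(config, dataset):
    """Run the task sequence and return the trained 'model' (here
    just a mean predictor) plus a validation score."""
    split = int(len(dataset) * config['train_fraction'])
    train, valid = dataset[:split], dataset[split:]
    # 'Training': predict the mean of the training targets.
    mean = sum(y for _, y in train) / len(train)
    model = {'predict': mean}
    # 'Validation': mean absolute error on the held-out part.
    mae = sum(abs(y - mean) for _, y in valid) / len(valid)
    return model, mae

config = {
    'data_source': 'sensor_segments_v1',   # hypothetical name
    'target': 'time_before_collision',
    'train_fraction': 0.75,
}
dataset = [('x%d' % i, float(i % 4)) for i in range(8)]
model, mae = run_pipeline(config, dataset)
```

Re-running the same pipeline with a larger `dataset` while keeping `config` fixed corresponds to the incrementation usage described above.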
An ensemble method may include operations using multiple models to realize better performance. One example, among others, is a model that makes good predictions for a residential street but performs poorly in highway situations. When new datasets subsequently become available for highway situations, instead of totally re-building the model, a new model may be built with the new highway datasets, and then the two models may be combined to work as a single model. Combination methods include, but are not limited to, simply averaging the predicted results, voting on different situations, or building a simple wrapper model, such as a linear regression, to merge the separate predicted results.
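The averaging and situation-voting combinations described above may be sketched as follows. Both component models are hypothetical stand-ins, not models from the disclosure:

```python
# Sketch of ensemble combination: a residential-street model and a
# highway model merged by simple averaging or by situation delegation.

def residential_model(x):
    return 0.8 * x          # hypothetical: trained on residential data

def highway_model(x):
    return 0.5 * x + 10.0   # hypothetical: trained on highway data

def average_ensemble(x):
    """Simply average the two predicted results."""
    return (residential_model(x) + highway_model(x)) / 2.0

def situation_ensemble(x, is_highway):
    """'Vote' by delegating to the model for the current situation."""
    return highway_model(x) if is_highway else residential_model(x)

p_avg = average_ensemble(20.0)
p_res = situation_ensemble(20.0, is_highway=False)
p_hwy = situation_ensemble(20.0, is_highway=True)
```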
Finally, model building with the ‘transferring’ method may involve replicating the entire or partial structure of a well-built model as the baseline, and then building the new model for another application. The new dataset may include the original data but usually differs from it to some extent.
Referring to FIGs. 18-21, these figures respectively relate to the pipeline definition, a pipeline incrementation method, a pipeline boosting method, and a pipeline transferring method, each of which is described in FIG. 17. FIG. 18, for example, illustrates well-defined configurations of a repeatable model building pipeline. Pipelines may be cornerstones of ML framework automation for large-scale model building.
FIGs. 19A and 19B illustrate exemplary scenarios of using a model building engine to build models with incrementation. Under some circumstances, models built with a moderate dataset volume may be applied for early market application deployment, and, after accumulating more data, the models may be incrementally built up, developed, and refined with the existing pipelines to improve model performance. In FIG. 19A, in the case of early deployment, an original pipeline may be re-run with newly accumulated data, and the model may be updated with data from early models, to produce a new model. With respect to the pipeline operation process of FIG. 19B, a pre-set original pipeline may be reconnected to newly accumulated data. All other configuration settings may be kept the same as the original values.
In general, the above-described incrementation method relates to a usage of a pipeline in which the pipeline is re-run with increased datasets, while all other configurations are kept in their original form. The newly trained model may be expected to have better performance than the original model, and may be applied in the same context as the original model to replace it.
FIGs. 20A and 20B illustrate exemplary scenarios of using a model building engine to build models with boosting. The performance of existing models may be continually monitored by comparing the prediction results of existing models to newly measured values. For example, when one model performs poorly under some specific conditions, such as when highway driving or night-time driving is involved, the relevant data samples may be increased and the model structure may be modified, and then the pipeline may be re-run.
Under some circumstances, models built with moderate dataset volumes may be applied for early market application deployment, and, after accumulating more data, the models may be incrementally built with the existing pipelines to improve model performance. Such operations are performed in the Use Case of FIG. 20A and the pipeline operation process of FIG. 20B.
In general, the ‘boosting’ method may involve boosting usage of a pipeline, where the pipeline is re-run with manipulated datasets, for example, datasets intentionally increased to cover the conditions under which the model performs poorly, such as predicting highway conditions. With the boosting method, there is an option of whether or not to use the above-mentioned ensemble method. If the ensemble method is not used, the newly trained model may have the same structure as the original model. In this case, boosting usage may be similar to incrementation usage, except that the datasets are manipulated with consideration of the model’s poor-performance aspects.
If the ensemble method is used, then a newly trained model may be combined together with the original model; in other words, the new final model may be a combination model containing two or more models (according to the number of rounds the ensemble method is applied). The newly trained model (or combined models) may be applied in the original context to replace the old model.
FIGs. 21A and 21B illustrate exemplary scenarios of using a model building engine to build models with a ‘transferring’ method. In some cases, the available data may be insufficient to build new models, such as when developing applications for a new type of vehicle, expanding to a new market, or introducing a new target indicator. In such case, an existing model may be used as the base and the model may be further developed with a limited new dataset, to accelerate the building process and improve model performance. Under some circumstances, models built with a moderate dataset volume may be applied for early market application deployment, and after accumulating more data the models may be incrementally built with the existing pipelines to improve model performance. Such operations may be performed in the Use Case of FIG. 21A and the pipeline operation process of FIG. 21B.
The ‘transferring’ method may include transferring usage of a pipeline, where the application purpose is different from the original model. For example, given an original model that predicts a Time-Before-Collision Indicator by using in-car camera videos, the purpose of transferring usage of a pipeline may be to train a model that predicts Distance-To-Side-Walk Indicator. The original model is used as a base structure, with its current internal parameters kept as initial values, while the datasets and ground truth values are replaced with different videos and an actual distance to a sidewalk. After running the pipeline, the newly trained model may be applied to a different context, for example, to predict a Distance-To-Side-Walk Indicator while the original model predicting Time-Before-Collision Indicator is kept in its original state without replacement.
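The transferring usage may be sketched with a toy one-parameter model: the original model's weight seeds the new model, which is then fine-tuned on a small dataset for the new indicator. The model shape, learning rate, and data are assumptions:

```python
# Sketch of the 'transferring' usage: the original model's internal
# parameters are kept as initial values for a new target indicator,
# and the new model is refined on a limited new dataset.

def fine_tune(params, dataset, lr=0.1, epochs=50):
    """SGD on a 1-parameter linear model y = w * x, starting from the
    transferred weight instead of a random initialization."""
    w = params['w']
    for _ in range(epochs):
        for x, y in dataset:
            w -= lr * (w * x - y) * x
    return {'w': w}

# Original Time-Before-Collision model, kept in its original state.
tbc_model = {'w': 2.0}

# Small Distance-To-Side-Walk dataset (consistent with w = 3);
# transfer the TBC weight as the initial value and fine-tune a copy.
sidewalk_data = [(1.0, 3.0), (2.0, 6.0)]
dsw_model = fine_tune(dict(tbc_model), sidewalk_data)
```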
The prediction power of exemplary embodiments of the instant disclosure exceeds that of conventional simple ML models, even as dataset scales increase. Conventional ML models are often “shallow” models containing no more than hundreds or thousands of internal parameters. On the other hand, exemplary embodiments of the instant disclosure relate to deep learning models (i.e., complex models) that often contain millions of internal parameters. Deep models may require much more data for model building; therefore it is very difficult to build such models from scratch, which is often the situation when launching a new business or creating new solutions.
By implementing the ML Framework Automation Architecture according to exemplary embodiments of the instant disclosure, the available datasets may be utilized to the maximum extent by taking advantage of data interpolation, augmentation, and robust sampling. In addition, exemplary implementations may make model building tasks more manageable, replicable, and transferrable, so that, as a whole, such complex models may be built in the most efficient way and at the widest scale.
Exemplary embodiments set forth herein apply complex ML models (e.g., deep neural networks) to perform operations and calculations. Exemplary embodiments reflect a framework automation architecture to automatically generate the inputs that support complex model building approaches. Based on the time-series prediction characteristics and the indicator-loss function mechanism, world-wide-scale data may be collected, and such big data may be utilized to build and deliver general complex models with efficiency.
Referring to FIG. 22, an exemplary implementation of the Car 1 shown in, for example, FIG. 5, is illustrated. Vehicle 2200 may include a front facing camera 2210, a rear facing camera 2220, a physiological sensor 2230, a digital tachograph module 2240, and a central computation module 2250. The vehicle 2200 may wirelessly connect to a cloud service 2260. The front facing camera 2210 may detect outside visual data and transmit the data to the central computation module 2250 via a USB interface. The rear facing camera 2220 may detect the driver and other in-vehicle visual data and transmit the data to the central computation module 2250 via a USB interface. The physiological sensor 2230 may obtain driver physiological data and transmit the data to the central computation module 2250 via a WLAN interface such as, for example, Bluetooth or Wi-Fi. The digital tachograph module 2240 may obtain vehicle location and running data, and transmit the data to the central computation module 2250 via a USB interface. The central computation module 2250 may collect sensor data and perform prediction. The central computation module 2250 may communicate to the cloud service 2260 via a mobile data protocol. The cloud service may provide remote functional service, such as, for example, map access or user profile management.
A detailed schematic of the central computation module 2250 is also shown in FIG. 22. The central computation module 2250 may include a USB module 2251, a WLAN module 2252, a data communication module 2253, a GPU module 2254, a CPU module 2255, a memory module 2256, a storage module 2257, and a touch-panel module 2258. The USB module 2251 may implement USB data transmission protocol, the WLAN module 2252 may implement WLAN data transmission protocols, such as, for example, Bluetooth and Wi-Fi, the data communication module 2253 may implement mobile wireless protocols, the GPU module 2254 may perform deep learning related processing, the CPU module 2255 may perform general purpose computation processing, the memory module 2256 may store data and provide high-speed accessibility, the storage module 2257 may permanently store data in media such as SSD, and the touch-panel module 2258 may provide user input and display functionality.
FIG. 23 shows a networking architecture 2300 that includes a vehicle 2310, a network 2320, a server device 2360, and a vehicle and driver information server 2340. The server device 2360 may include a scoring module 2370 and predicting module 2380, which may have similar functionality to the scoring system 160 and predicting system 170 of FIG. 1. Sensors of the vehicle 2310 may transmit sensor data (e.g., sensor data 140 of FIG. 1) to server device 2360 over the network 2320. In one example, the vehicle 2310 may receive historical data 150, vehicle info 110, driver profile data 120, and/or map information 130 from the vehicle and driver information server 2340 via a direct communication link. The direct communication link may be a wireless link such as, for example, a Bluetooth, IR, Wi-Fi, NFC link, etc. Communication between the vehicle 2310 and the vehicle and driver information server 2340 may also be indirectly provided over the network 2320. The network 2320 may include any suitable combination of servers, access points, routers, base stations, mobile switching centers, public switching telephone network (PSTN) components, etc., to facilitate communication.
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various exemplary embodiments. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block(s) may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, may be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The methods shown herein may generally be implemented in a computing device or system. The computing device or system may be a user level device or system or a server-level device or system. More particularly, the methods may be implemented in one or more modules as a set of logic instructions stored in a machine or computer-readable storage medium such as random access memory (RAM), read only memory (ROM), programmable ROM (PROM), firmware, flash memory, etc., in configurable logic such as, for example, programmable logic arrays (PLAs), field programmable gate arrays (FPGAs), complex programmable logic devices (CPLDs), in fixed-functionality logic hardware using circuit technology such as, for example, application specific integrated circuit (ASIC), complementary metal oxide semiconductor (CMOS) or transistor-transistor logic (TTL) technology, or any combination thereof.
For example, computer program code to carry out operations shown in the methods of FIGs. 2A and 2B may be written in any combination of one or more programming languages, including an object-oriented programming language such as JAVA, SMALLTALK, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. Additionally, logic instructions might include assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, and/or other structural components that are native to hardware (e.g., host processor, central processing unit/CPU, microcontroller, etc.).
Example sizes/models/values/ranges may have been given, although embodiments are not limited to the same. Where specific details are set forth in order to describe example embodiments, it should be apparent to one skilled in the art that embodiments can be practiced without, or with variation of, these specific details. The description is thus to be regarded as illustrative instead of limiting.
Those skilled in the art will appreciate from the foregoing description that the broad techniques of the one or more embodiments can be implemented in a variety of forms. Therefore, while the embodiments have been described in connection with particular examples thereof, the true scope of the embodiments should not be so limited since other modifications will become apparent to the skilled practitioner upon a study of the drawings, specification, and following claims.

Claims (20)

  1. A predictive system comprising:
    one or more sensors configured to:
    detect data related to at least one of:
    a driver,
    a vehicle of the driver, and
    an external environment of the vehicle, and
    transmit the detected data; and
    a server configured to receive the detected data, the server comprising:
    a receiver configured to receive the detected data;
    a processor coupled to a memory, the processor configured to:
    determine one or more near-accident indicators based on a plurality of different detected data and historical data related to the detected data, the near-accident indicators relating to a probability of an occurrence of an accident,
    calculate at least one of the probability that the accident will occur, a risk score, and information related to a reduction of risk, based on the determined one or more near-accident indicators and machine-learning data, and
    continually update the machine-learning data based on new detected data;
    and
    a transmitter configured to output the at least one of the probability that the accident will occur, the risk score, and the information related to the reduction of risk.
  2. The system of claim 1, wherein the processor is configured to generate time-based predictions of driver risk based on the machine-learning data.
  3. The system of claim 1, wherein the calculation of the at least one of the probability that the accident will occur, the risk score, and the information related to the reduction of risk is based on risk scores related to at least one of the driver, the vehicle of the driver, and the external environment of the vehicle.
  4. The system of claim 2, wherein the generating the time-based predictions of driver risk comprises generating an initial prediction of driver risk by detecting the data related to at least one of the driver, the vehicle of the driver, and the external environment of the vehicle at one point in time and updating the prediction of driver risk by detecting the data related to at least one of the driver, the vehicle of the driver, and the external environment of the vehicle at a subsequent point in time.
  5. The system of claim 1, wherein the processor is configured to automatically generate labels for the near-accident indicators, the labels to be used to build models for future near-accident indicator prediction.
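Purely as an illustration, and not as part of the claim language, the flow recited in claims 1–5 (deriving near-accident indicators by comparing detected data against historical data, scoring risk from those indicators, and continually updating the machine-learning data) might be sketched as follows. All field names, weights, and the update rule are hypothetical choices, not the disclosed implementation:

```python
from dataclasses import dataclass, field

@dataclass
class RiskModel:
    """Toy stand-in for the machine-learning data of claim 1: per-indicator
    weights that are continually updated as new detected data arrives."""
    weights: dict = field(default_factory=lambda: {
        "hard_brake": 0.5, "lane_drift": 0.3, "eyes_off_road": 0.8})
    learning_rate: float = 0.1

    def near_accident_indicators(self, detected: dict, history: list) -> dict:
        # An indicator fires when the current reading exceeds the driver's
        # historical average for that signal (one possible comparison rule).
        avgs = {k: sum(h[k] for h in history) / len(history) for k in detected}
        return {k: v for k, v in detected.items() if v > avgs[k]}

    def risk_score(self, indicators: dict) -> float:
        # Weighted sum of the fired indicators, clamped to [0, 1].
        raw = sum(self.weights.get(k, 0.0) * v for k, v in indicators.items())
        return min(1.0, raw)

    def update(self, indicators: dict, accident_occurred: bool):
        # Continual update: nudge indicator weights toward observed outcomes.
        target = 1.0 if accident_occurred else 0.0
        error = target - self.risk_score(indicators)
        for k in indicators:
            self.weights[k] = self.weights.get(k, 0.0) + self.learning_rate * error
```

In this sketch the server would call `near_accident_indicators` on each batch of detected data, output `risk_score`, and call `update` once the outcome of the trip is known.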
  6. A method comprising:
    receiving detected data related to at least one of a driver, a vehicle of the driver, and an external environment of the vehicle;
    determining one or more near-accident indicators based on a plurality of different detected data and historical data related to the detected data, the near-accident indicators relating to a probability of an occurrence of an accident;
    calculating at least one of the probability that the accident will occur, a risk score, and information related to a reduction of risk, based on the determined one or more near-accident indicators and machine-learning data;
    continually updating the machine-learning data based on new detected data; and
    outputting the at least one of the probability that the accident will occur, the risk score, and the information related to the reduction of risk.
  7. The method of claim 6, further comprising generating time-based predictions of driver risk based on the machine-learning data.
  8. The method of claim 6, wherein the calculating the at least one of the probability that the accident will occur, the risk score, and the information related to the reduction of risk includes generating risk scores related to at least one of the driver, the vehicle of the driver, and the external environment of the vehicle.
  9. The method of claim 7, wherein the generating the time-based predictions of driver risk comprises generating an initial prediction of driver risk by detecting the data related to at least one of the driver, the vehicle of the driver, and the external environment of the vehicle at one point in time and updating the prediction of driver risk by detecting the data related to at least one of the driver, the vehicle of the driver, and the external environment of the vehicle at a subsequent point in time.
  10. The method of claim 6, further comprising automatically generating labels for the near-accident indicators, the labels to be used to build models for future near-accident indicator prediction.
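The time-based prediction recited in claims 7 and 9, an initial driver-risk prediction made at one point in time and refined at each subsequent detection, could be approximated by something as simple as exponential smoothing. This sketch is illustrative only; the class name and `smoothing` factor are hypothetical, and the patent does not prescribe this update rule:

```python
class TimeBasedRiskPredictor:
    """Maintains a driver-risk prediction that is seeded at one point in
    time and refined at each subsequent detection (cf. claim 9)."""

    def __init__(self, smoothing: float = 0.3):
        self.smoothing = smoothing   # weight given to the newest observation
        self.prediction = None       # no prediction until the first detection

    def observe(self, instantaneous_risk: float) -> float:
        if self.prediction is None:
            # Initial prediction from data detected at one point in time.
            self.prediction = instantaneous_risk
        else:
            # Update at a subsequent point in time: blend the new reading
            # with the running prediction.
            self.prediction = (self.smoothing * instantaneous_risk
                               + (1 - self.smoothing) * self.prediction)
        return self.prediction
```

Any other sequential estimator (a Kalman filter, a recurrent network) would fit the same claim language; the point is only that the prediction is updated as new detected data arrives.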
  11. An apparatus comprising:
    a receiver configured to receive detected data related to at least one of a driver, a vehicle of the driver, and an external environment of the vehicle;
    a processor coupled to a memory, the processor configured to:
    determine one or more near-accident indicators based on a plurality of different detected data and historical data related to the detected data, the near-accident indicators relating to a probability of an occurrence of an accident,
    calculate at least one of the probability that the accident will occur, a risk score, and information related to a reduction of risk, based on the determined one or more near-accident indicators and machine-learning data, and
    continually update the machine-learning data based on new detected data;
    and
    a transmitter configured to output the at least one of the probability that the accident will occur, the risk score, and the information related to the reduction of risk.
  12. The apparatus of claim 11, wherein the processor is configured to generate time-based predictions of driver risk based on the machine-learning data.
  13. The apparatus of claim 11, wherein the calculation of the at least one of the probability that the accident will occur, the risk score, and the information related to the reduction of risk is based on risk scores related to at least one of the driver, the vehicle of the driver, and the external environment of the vehicle.
  14. The apparatus of claim 12, wherein the generating the time-based predictions of driver risk comprises generating an initial prediction of driver risk by detecting the data related to at least one of the driver, the vehicle of the driver, and the external environment of the vehicle at one point in time and updating the prediction of driver risk by detecting the data related to at least one of the driver, the vehicle of the driver, and the external environment of the vehicle at a subsequent point in time.
  15. The apparatus of claim 11, wherein the processor is configured to automatically generate labels for the near-accident indicators, the labels to be used to build models for future near-accident indicator prediction.
  16. A non-transitory computer readable medium comprising a set of instructions, which when executed by one or more processors of a device, cause the one or more processors to:
    receive detected data related to at least one of a driver, a vehicle of the driver, and an external environment of the vehicle;
    determine one or more near-accident indicators based on a plurality of different detected data and historical data related to the detected data, the near-accident indicators relating to a probability of an occurrence of an accident;
    calculate at least one of the probability that the accident will occur, a risk score, and information related to a reduction of risk, based on the determined one or more near-accident indicators and machine-learning data;
    continually update the machine-learning data based on new detected data; and
    output the at least one of the probability that the accident will occur, the risk score, and the information related to the reduction of risk.
  17. The non-transitory computer readable medium of claim 16, wherein the one or more processors generate time-based predictions of driver risk based on the machine-learning data.
  18. The non-transitory computer readable medium of claim 16, wherein the one or more processors calculate the at least one of the probability that the accident will occur, the risk score, and the information related to the reduction of risk, based on risk scores related to at least one of the driver, the vehicle of the driver, and the external environment of the vehicle.
  19. The non-transitory computer readable medium of claim 17, wherein the generating the time-based predictions of driver risk comprises generating an initial prediction of driver risk by detecting the data related to at least one of the driver, the vehicle of the driver, and the external environment of the vehicle at one point in time and updating the prediction of driver risk by detecting the data related to at least one of the driver, the vehicle of the driver, and the external environment of the vehicle at a subsequent point in time.
  20. The non-transitory computer readable medium of claim 16, wherein the one or more processors automatically generate labels for the near-accident indicators, the labels to be used to build models for future near-accident indicator prediction.
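Claims 5, 10, 15, and 20 recite automatically generating labels for near-accident indicators so that models for future indicator prediction can be built. A minimal, hypothetical labeling rule (the thresholds and field names are invented for illustration, not taken from the disclosure) might mark a telemetry window as a near-accident when it contains a hard deceleration or a short time-to-collision:

```python
def auto_label_near_accidents(windows, decel_threshold=-6.0, ttc_threshold=1.5):
    """Label each telemetry window 1 (near-accident) or 0, producing
    training labels for a future indicator-prediction model.

    Thresholds are illustrative: a hard deceleration (m/s^2 at or below
    decel_threshold) or a short time-to-collision (s at or below
    ttc_threshold) marks the window as a near-accident."""
    labeled = []
    for w in windows:
        is_near_accident = (w["max_decel"] <= decel_threshold
                            or w["min_ttc"] <= ttc_threshold)
        labeled.append({**w, "label": int(is_near_accident)})
    return labeled
```

Because the labels come from the sensor stream itself rather than from manual annotation, the labeled windows can be fed directly into supervised training of the next generation of near-accident indicator models.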
PCT/JP2018/028514 2018-07-30 2018-07-30 Distracted driving predictive system WO2020026318A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2021504845A JP7453209B2 (en) 2018-07-30 2018-07-30 Careless driving prediction system
PCT/JP2018/028514 WO2020026318A1 (en) 2018-07-30 2018-07-30 Distracted driving predictive system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2018/028514 WO2020026318A1 (en) 2018-07-30 2018-07-30 Distracted driving predictive system

Publications (1)

Publication Number Publication Date
WO2020026318A1 true WO2020026318A1 (en) 2020-02-06

Family

ID=69232410

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2018/028514 WO2020026318A1 (en) 2018-07-30 2018-07-30 Distracted driving predictive system

Country Status (2)

Country Link
JP (1) JP7453209B2 (en)
WO (1) WO2020026318A1 (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180170372A1 (en) * 2015-08-28 2018-06-21 Sony Corporation Information processing apparatus, information processing method, and program
WO2018135605A1 (en) * 2017-01-23 2018-07-26 パナソニックIpマネジメント株式会社 Event prediction system, event prediction method, program, and moving body


Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11368486B2 (en) * 2019-03-12 2022-06-21 Fortinet, Inc. Determining a risk probability of a URL using machine learning of URL segments
US11511737B2 (en) 2019-05-23 2022-11-29 Systomix, Inc. Apparatus and method for processing vehicle signals to compute a behavioral hazard measure
WO2022023694A1 (en) * 2020-07-29 2022-02-03 Sony Group Corporation Systems, devices and methods for operating a vehicle with sensors monitoring parameters
GB2597692A (en) * 2020-07-29 2022-02-09 Sony Europe Bv Systems, devices and methods for operating a vehicle with sensors monitoring parameters
EP4248394A4 (en) * 2020-11-18 2024-10-09 Vinli Inc Collaborative mobility risk assessment platform
CN112668786A (en) * 2020-12-30 2021-04-16 神华信息技术有限公司 Mine car safety assessment prediction method, terminal equipment and storage medium
CN112668786B (en) * 2020-12-30 2023-09-26 国能信息技术有限公司 Mine car safety assessment prediction method, terminal equipment and storage medium
EP4053737A1 (en) * 2021-03-04 2022-09-07 Zenseact AB Detecting and collecting accident related driving experience event data
CN115953858A (en) * 2022-11-29 2023-04-11 摩尔线程智能科技(北京)有限责任公司 Vehicle-mounted DMS-based driving scoring method and device and electronic equipment

Also Published As

Publication number Publication date
JP7453209B2 (en) 2024-03-19
JP2021532487A (en) 2021-11-25

Similar Documents

Publication Publication Date Title
WO2020026318A1 (en) Distracted driving predictive system
US20230036879A1 (en) Object movement behavior learning
US11861481B2 (en) Searching an autonomous vehicle sensor data repository
US20210117760A1 (en) Methods and apparatus to obtain well-calibrated uncertainty in deep neural networks
US11244402B2 (en) Prediction algorithm based attribute data processing
JP6394735B2 (en) Detection of limbs using hierarchical context-aware
US20210188290A1 (en) Driving model training method, driver identification method, apparatuses, device and medium
US11599563B2 (en) Programmatically identifying a personality of an autonomous vehicle
US11675641B2 (en) Failure prediction
JP2023514465A (en) machine learning platform
US11238369B2 (en) Interactive visualization evaluation for classification models
US20230169420A1 (en) Predicting a driver identity for unassigned driving time
Jagatheesaperumal et al. Artificial Intelligence for road quality assessment in smart cities: a machine learning approach to acoustic data analysis
US20230126842A1 (en) Model prediction confidence utilizing drift
WO2022025244A1 (en) Vehicle accident prediction system, vehicle accident prediction method, vehicle accident prediction program, and trained model generation system
AU2021251463B2 (en) Generating performance predictions with uncertainty intervals
Yoon et al. Who is delivering my food? Detecting food delivery abusers using variational reward inference networks
US20230058076A1 (en) Method and system for auto generating automotive data quality marker
US11145196B2 (en) Cognitive-based traffic incident snapshot triggering
Behboudi et al. Recent Advances in Traffic Accident Analysis and Prediction: A Comprehensive Review of Machine Learning Techniques
Verma et al. Impact of Driving Behavior on Commuter’s Comfort During Cab Rides: Towards a New Perspective of Driver Rating
US20230126323A1 (en) Unsupervised data characterization utilizing drift
US20230128081A1 (en) Automated identification of training datasets
US11887386B1 (en) Utilizing an intelligent in-cabin media capture device in conjunction with a transportation matching system
US20230126294A1 (en) Multi-observer, consensus-based ground truth

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18928382

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021504845

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18928382

Country of ref document: EP

Kind code of ref document: A1