EP4374335A1 - Electronic device and method - Google Patents

Electronic device and method

Info

Publication number
EP4374335A1
Authority
EP
European Patent Office
Prior art keywords
depth data
ann
criterion
intended functionality
electronic device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP22751033.6A
Other languages
German (de)
English (en)
Inventor
Stefaan VERSCHUERE
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Depthsensing Solutions NV SA
Sony Semiconductor Solutions Corp
Original Assignee
Sony Depthsensing Solutions NV SA
Sony Semiconductor Solutions Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Depthsensing Solutions NV SA, Sony Semiconductor Solutions Corp filed Critical Sony Depthsensing Solutions NV SA
Publication of EP4374335A1
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/09Supervised learning

Definitions

  • the present disclosure generally pertains to the field of artificial intelligence, in particular to operation and training of an artificial neural network, and in particular to production and storage of training data for artificial neural networks.
  • the present disclosure may for example be applied in an automotive context and with respect to depth images obtained by a Time-of-Flight camera.
  • a Time-of-Flight (ToF) sensor is a range imaging camera system that determines the distance of objects by measuring the time of flight of a light signal between the camera and the object for each point of the image.
  • Depth data obtained by a ToF camera may for example be used in the automotive industry in several applications such as in-cabin monitoring (ICM), face identification for access control or engine starting, gesture recognition for control of the car infotainment system, etc.
  • the depth data obtained by a ToF camera is typically analyzed by the application with the help of an artificial neural network (ANN).
  • the depth data is typically stored temporarily in a volatile memory to be processed by the ANN.
  • Depth data may occupy a significant amount of memory, for example the raw data image of a 1 Megapixel iToF sensor may occupy 128 Mbit. After the ANN processing, the depth data is typically released from memory to free up working memory.
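  • As a hedged back-of-the-envelope check of the 128 Mbit figure above, the sketch below assumes the eight raw values per pixel described further down (4 phases, 2 taps) and a 16-bit storage word per raw value; the bit width is an assumption, not a value stated in this description.

# Rough size estimate for one raw iToF frame (assumption: 16-bit word per raw value)
pixels = 1_000_000          # 1 Megapixel iToF sensor
values_per_pixel = 4 * 2    # 4 phases, 2 taps per pixel -> 8 raw values per pixel
bits_per_value = 16         # assumed storage width (10/12-bit ADC samples padded to 16 bits)

raw_frame_bits = pixels * values_per_pixel * bits_per_value
print(raw_frame_bits / 1e6, "Mbit")   # -> 128.0 Mbit, matching the figure above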
  • An ANN is trained e.g. by supervised learning, i.e. with a collection of labelled training examples. That means that each depth image (or sequence of depth images) is labelled with a specified classification result to which the depth image should lead.
  • An ANN which is trained with too few or wrongly labelled images may perform sub-optimally at initial deployment and may produce so-called false negative failures during operation in an application.
  • The disclosure provides an electronic device comprising circuitry configured to obtain depth data; perform classification of the depth data by an artificial neural network, ANN, to determine classification information for the depth data; perform an intended functionality if a primary criterion or a secondary criterion is fulfilled; determine that the primary criterion for performing the intended functionality is fulfilled based on the classification information obtained by the ANN; determine that the secondary criterion for performing the intended functionality is fulfilled based on a secondary mechanism; and determine, based on the secondary mechanism and the classification information obtained by the ANN, if the classification of the depth data is false negative.
  • The disclosure provides a method comprising: obtaining depth data; performing classification of the depth data by an artificial neural network, ANN, to determine classification information for the depth data; performing an intended functionality if a primary criterion or a secondary criterion is fulfilled; determining that the primary criterion for performing the intended functionality is fulfilled based on the classification information obtained by the ANN; determining that the secondary criterion for performing the intended functionality is fulfilled based on a secondary mechanism; and determining, based on the secondary mechanism and the classification information obtained by the ANN, if the classification of the depth data is false negative.
  • Fig. 1 schematically shows an embodiment of a system for obtaining and storing ToF sensor depth images and using it for training and operation of an ANN in an automotive environment;
  • Fig. 2 schematically shows a layered architecture of an automotive software platform;
  • Fig. 3 shows an exemplifying architecture of a convolutional neural network for image classification
  • Fig. 4 schematically shows a sliding window operation for determining depth data classified as false negative by an ANN
  • Fig. 5 shows a flowchart of capturing false negative depth data with respect to an intended functionality
  • Fig. 6 schematically shows a feedback loop of capturing false negative classified ToF depth data and a re-training of an ANN with respect to an intended functionality;
  • Fig. 7 is a block diagram depicting an example of schematic configuration of a vehicle control system as an example of a mobile body control system to which the capturing of false negative depth data with respect to an intended functionality can be applied;
  • Fig. 8 is a diagram explaining an example of installation positions of an outside-vehicle information detecting section and an imaging section.
  • The embodiments disclose an electronic device comprising circuitry configured to obtain depth data; perform classification of the depth data by an artificial neural network, ANN, to determine classification information for the depth data; perform an intended functionality if a primary criterion or secondary criterion is fulfilled; determine that the primary criterion for performing the intended functionality is fulfilled based on the classification information obtained by the ANN; determine that the secondary criterion for performing the intended functionality is fulfilled based on a secondary mechanism; and determine, based on the secondary mechanism and the classification information obtained by the ANN, if the classification of the depth data is false negative.
  • the circuitry of the electronic device may comprise one or more microprocessors, microcontrollers, one or more electronic control units (ECUs), communication buses (e.g. CAN, Flexray, Ethernet), system-on-chip (SoC) technology, FPGA technology, GPU technology (or a GPU optimized for ANN processing), and/or an imaging sensor of an imaging camera, in particular a sensor of a ToF camera.
  • The circuitry of the electronic device may also include electronic components such as switching elements (gates, transistors, etc.), resistors, memory elements (capacitors, volatile memory like RAM/SDRAM, non-volatile memory like ROM or the like), pixel circuitry, a storage, input means (mouse, keyboard, camera, etc.), output means (a display (e.g. liquid crystal, (organic) light emitting diode, etc.), loudspeakers, etc.), a (wireless) interface, etc., as is generally known for electronic devices (computers, smartphones, etc.).
  • The circuitry may further include or be connected with sensors for sensing still image or video image data (image sensor, camera sensor, video sensor, etc.), for sensing a fingerprint, or for sensing environmental parameters (e.g. radar, humidity, light, temperature), etc.
  • the circuitry may be configured to re-train the ANN if the depth data is classified as false negative.
  • the circuitry may be configured to store the obtained depth data on a volatile memory.
  • A volatile memory may be a computer memory that requires power to maintain the stored information; it retains its contents while powered on, but when the power is interrupted the stored data is lost.
  • A volatile memory may for example be a general-purpose random-access memory (RAM), a double data rate (DDR) synchronous dynamic random-access memory (SDRAM), or the like.
  • The circuitry may be configured to store the depth data on a non-volatile memory if the depth data is classified as false negative.
  • a non-volatile memory may be any type of computer memory that can retain stored information even after power is removed.
  • non-volatile memory examples include flash memory, read-only memory (ROM), ferroelectric RAM, most types of magnetic computer storage devices (e.g. hard disk drives, floppy disks, and magnetic tape), optical discs, and computer storage methods such as paper tape and punched cards.
  • The depth data may for example comprise one or more images comprising depth information, like raw depth data from a ToF sensor, a photon count value image, an amplitude/confidence map, a depth map or the like, and it may also comprise a sequence of depth images.
  • Each pixel of an iToF sensor may generate eight individual raw data values (4 phases, 2 taps/pixel) per image. If the iToF sensor comprises signal processing capabilities, it may convert this raw data into I/Q values (phase, amplitude information) and then into depth/confidence information for the image.
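  • As an illustration only, the following minimal Python sketch shows the usual 4-phase iToF conversion chain (raw correlation samples to I/Q values, then to phase/amplitude and depth); the function name, variable names and modulation frequency are assumptions and are not taken from this description.

import numpy as np

def itof_raw_to_depth(a0, a90, a180, a270, f_mod=20e6, c=299_792_458.0):
    """Sketch of a common 4-phase iToF pipeline (assumed, not prescribed here)."""
    i = a0.astype(float) - a180.astype(float)     # in-phase component
    q = a90.astype(float) - a270.astype(float)    # quadrature component
    phase = np.mod(np.arctan2(q, i), 2 * np.pi)   # phase offset in [0, 2*pi)
    amplitude = 0.5 * np.hypot(i, q)              # amplitude / confidence map
    depth = c * phase / (4 * np.pi * f_mod)       # depth within the unambiguous range c / (2 * f_mod)
    return depth, amplitude

# toy 2x2 "image": identical pixels with a 90 degree phase offset
a = np.full((2, 2), 100.0)
depth, amplitude = itof_raw_to_depth(a0=a, a90=a + 50.0, a180=a, a270=a - 50.0)
print(float(depth[0, 0]), float(amplitude[0, 0]))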
  • For a dToF sensor, the raw data may consist of photon count values. If the dToF sensor comprises signal processing capabilities, it may convert the photon count values into a depth map.
  • The intended functionality may relate to any application, for example to any application in an automotive context.
  • the intended functionality may relate to opening a door lock of a car.
  • the intended functionality may be performed if a primary or secondary criterion is fulfilled.
  • a primary criterion for performing the intended functionality may be based on the classification information obtained by the ANN. It may for example be determined that a primary criterion for performing the intended functionality is fulfilled based on classification information obtained by the ANN.
  • the ANN may be configured to perform face identification, and the intended functionality may relate to opening a door lock of a car.
  • A primary criterion for performing the intended functionality (e.g. opening the door lock) may then be fulfilled if the ANN classifies the depth data of the face as belonging to an authorized person.
  • The classification information obtained by the ANN is here referred to as a primary mechanism or a first mechanism to determine if the intended functionality is to be performed.
  • a secondary mechanism may be any other mechanism which is suitable to determine, if the intended functionality is to be performed.
  • a secondary criterion may be based on a secondary mechanism. It may for example be determined that the secondary criterion for performing the intended functionality is fulfilled based on a secondary mechanism.
  • A secondary criterion for performing the intended functionality may be the opening of the door lock with a key.
  • the circuitry may be configured to determine that the classification of the depth data is false negative if the circuitry determines that the secondary criterion for performing the intended functionality is fulfilled based on the secondary mechanism within a predetermined time span after the circuitry determined that the primary criterion for performing the intended functionality is not fulfilled based on the classification information.
  • The depth data may be considered false negative if the primary criterion (ANN-based) for opening the door lock by face recognition is not fulfilled (e.g. in case a car owner is not correctly recognized by the ANN as an authorized person despite being registered as an authorized person).
  • The circuitry may recognize the false negative classification if the car owner uses a key to unlock the door within a predetermined time after the false negative classification of his face by the ANN.
  • the classification information may comprise a confidence value related to the intended functionality, and the primary criterion for performing the intended functionality may be fulfilled if the confidence value is above a predetermined acceptance threshold.
  • An ANN may be trained to provide two output classes, namely authorized and not authorized. If a depth image is presented to this ANN, it will provide a first confidence value (or probability) for the depth data indicating an authorized person and a second confidence value (or probability) for the depth data indicating an unauthorized person.
  • the ANN may be trained with respect to a certain intended functionality.
  • The ANN may for example be of the classification type and may provide, by its output layer, probabilities or confidence values for a predefined number of classification results, here also referred to as classes.
  • A class which is linked to the intended functionality is here referred to as intended class. That is, if the depth data is classified by the ANN into this intended class (the class's respective confidence value exceeds a predetermined acceptance threshold), then it may be determined that a primary criterion for performing the intended functionality is fulfilled, and the intended functionality is performed.
  • the confidence value related to the intended functionality may refer to the confidence value of an intended class.
  • the circuitry may be configured to keep the obtained depth data in the volatile memory for a predetermined time span after performing classification on the depth data by the ANN if the confidence value of the depth data for the intended class is below the predetermined acceptance threshold and above a predetermined monitoring threshold.
  • the circuitry may be configured to store a label indicating a class related to the intended functionality together with the corresponding depth data as labeled depth data on the non-volatile memory.
  • the circuitry may be configured to re-train the ANN based on the labeled depth data stored on the non-volatile memory.
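  • A minimal sketch of the decision logic described in the preceding paragraphs is given below: a frame is either accepted (primary criterion fulfilled), kept as a candidate for later false negative capture, or simply released. The threshold values, names and in-memory stand-ins for the volatile and non-volatile memories are illustrative assumptions.

THR_ACC = 0.90   # acceptance threshold (illustrative value)
THR_MON = 0.85   # monitoring threshold, a small amount below THR_ACC (illustrative value)

volatile_buffer = []      # stand-in for the volatile memory: candidate frames
nvm_training_set = []     # stand-in for the non-volatile memory: labelled depth data

def perform_intended_functionality():
    print("intended functionality performed")

def handle_classification(depth_data, confidence_intended, timestamp):
    """Decide what to do with one frame after the ANN produced the confidence value."""
    if confidence_intended >= THR_ACC:
        perform_intended_functionality()                  # primary criterion fulfilled
    elif confidence_intended >= THR_MON:
        volatile_buffer.append((timestamp, depth_data))   # keep: possible false negative
    # else: confidence clearly too low, the frame is simply released

def capture_false_negative(depth_data, intended_class):
    """Store the frame with the label delivered by the secondary mechanism."""
    nvm_training_set.append((depth_data, intended_class))

handle_classification("frame", confidence_intended=0.87, timestamp=0.0)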
  • the depth data may be based on an image generated by a Time-of- Flight sensor.
  • the intended functionality may relate to an automotive environment.
  • Automotive environment means that the system may be used in any kind of automobile, electric vehicle, hybrid electric vehicle, motorcycle, bicycle, personal mobility vehicle, airplane, drone, ship, robot, construction machine, agricultural machine (tractor), truck, space vehicle, rail vehicle, water vehicle or the like.
  • the circuitry may be configured to determine that, with respect to the secondary mechanism, the (secondary) criterion for opening the door lock is fulfilled if the key is used to manually unlock the car door.
  • In this example, the secondary mechanism may be using a key to manually unlock the car door.
  • the ANN may be configured to perform gesture recognition, and wherein the intended functionality relates to increasing a volume of an audio system in a car, and wherein the secondary criterion for performing the intended functionality is a criterion for increasing the volume.
  • The circuitry may be configured to determine that, with respect to the secondary mechanism, the secondary criterion for increasing the volume is fulfilled if a knob or touch screen on the car’s console is operated manually to increase the volume.
  • In this example, the secondary mechanism may be operating a knob or touch screen on the car’s console manually to increase the volume.
  • the ANN may be a convolutional neural network.
  • The ANN may be implemented in software or in hardware.
  • The embodiments further disclose a method comprising: obtaining depth data; performing classification of the depth data by an artificial neural network, ANN, to determine classification information for the depth data; performing an intended functionality if a primary criterion or a secondary criterion is fulfilled; determining that the primary criterion for performing the intended functionality is fulfilled based on the classification information obtained by the ANN; determining that the secondary criterion for performing the intended functionality is fulfilled based on a secondary mechanism; and determining, based on the secondary mechanism and the classification information obtained by the ANN, if the classification of the depth data is false negative.
  • Fig. 1 schematically shows an embodiment of a system for obtaining and storing ToF sensor depth images and using it for training and operation of an ANN in an automotive environment.
  • the system 100 for obtaining and storing ToF sensor depth images and using it for training and operation of an ANN may be implemented in a car.
  • a ToF Sensor 101 may be an indirect ToF (iToF) sensor or a direct ToF (dToF) sensor or the like.
  • The ToF sensor 101 consists of an array of pixels, wherein each pixel outputs one or more analog voltage values that are converted via an Analog-to-Digital Converter to digital values with a certain resolution, which is expressed in a number of bits, for example 10 bits or 12 bits. This digital data is called raw data.
  • the ToF sensor 101 comprises processing capabilities and converts the raw data into a depth map of the recorded scene.
  • the depth map from the ToF sensor 101 is transmitted via the interface 102 to the host processor 103.
  • The interface 102 may be a MIPI standardized interface, for example a camera serial interface (CSI), or it may be an I2C interface.
  • The host processor 103 may be a CPU or a GPU or an FPGA or any other processor.
  • the host processor 103 is connected with a volatile memory 104 via the interface 105 which is a Direct Memory Access (DMA) interface.
  • the volatile memory 104 is a SDRAM memory and the host processor 103 stores the received depth image in the SDRAM 104.
  • On the host processor 103, a software algorithm executes the ANN algorithm, that is, the instructions are executed on the depth map by using its trained coefficients (stored in the SDRAM 104 or the NVM 106). That means the ANN may perform inference (for example a classification) on the depth map (or a sequence of depth images) within the car (for example face identification for opening a door of the car, gesture identification to change the volume of the audio system, driver drowsiness detection or the like) and output its result as a confidence value for further processing (for example send the result to an ECU 108).
  • The host processor 103 is further connected to a non-volatile memory (NVM) 106 via an interface 107, which is a Queued Serial Peripheral Interface (QSPI).
  • the host processor 103, the volatile memory 104 and the non-volatile memory 106 may be part of a microcontroller or a system-on-chip (SoC) 110.
  • The host processor 103 is still further connected to an engine control unit (ECU) 108 via an interface 109, which is a Controller Area Network (CAN) bus.
  • The ECU may be connected to different sensors and actuators within the car and controls several processes within the car (e.g. opening the lock of a door of the car, increasing the volume of the audio system, or the like).
  • The host processor 103 may receive information (for example about the execution/triggering of a secondary mechanism, see below) from the ECU 108 about the sensors/actuators and the like (e.g. the engine start button was pressed, the key was put into the lock to unlock it, the volume knob of the audio system was operated, etc.).
  • the ECU 108 may receive inference results of the ANN from the host processor 103.
  • a sequence of depth maps or the like may be processed by the ANN.
  • All the different types of depth information for a single depth image (raw data, photon count values, amplitude/confidence map, depth map, etc.) or for a sequence of depth images will be referred to as depth data.
  • The host processor 103 may receive the raw data from the ToF sensor and determine the depth data. Some applications may depend on motion, that is, multiple frames need to be processed simultaneously. In an automotive environment (for example a car), certain actions (which may be implemented as applications, see Fig. 2) with a respective intended functionality may be performed (by the host processor 103 or by a car ECU), like for example opening a lock of a car door, starting the engine of a car, increasing the volume of an audio system in a car, applying the brake of a car or waking up a drowsy driver of a car (by sending out a stimulus signal). These actions/intended functionalities may be performed based on the fulfillment of different criteria.
  • the fulfillment of a first criterion may be determined based on the classification information of the ANN (also referred to as a primary/ first mechanism).
  • the fulfillment of a secondary criterion may be determined based on a secondary mechanism.
  • The secondary mechanism may be the manual triggering of the intended function (see detailed description below).
  • The CSI is a specification of the Mobile Industry Processor Interface (MIPI) Alliance.
  • A CSI-2 or CSI-3 or CCS version may be used.
  • The volatile memory 104 may for example be a DRAM, SRAM or SGRAM.
  • The non-volatile memory (NVM) 106 may for example be a flash memory, a read-only memory (ROM), a ferroelectric RAM, a magnetic computer storage device (e.g. a hard disk drive, a floppy disk, or magnetic tape), or an optical disc.
  • the interface 109 between the host processor 103 and ECU 108 of the car may be an Ethernet interface, a Local Interconnect Network bus (LIN), a FlexRay interface, a Serial Peripheral Interface (SPI), an interrupt request (IRQ) or the like.
  • The host processor 103 and the volatile memory may be part of a microcontroller or system-on-chip (SoC) 110, and the non-volatile memory may be connected to the microcontroller or SoC 110 via an I/O interface of the microcontroller or SoC 110.
  • The ANN may for example be a convolutional neural network (CNN), a multi-layer perceptron (MLP), a recurrent neural network (RNN), a support vector machine (SVM), an autoencoder, a generative adversarial network (GAN), a deep neural network (DNN) or the like.
  • the ANN may be implemented as a hardware instantiation (e.g. GPU) which essentially executes the same algorithm as a software ANN but implemented in an IC.
  • Fig. 2 schematically shows a layered architecture of an automotive software platform.
  • An example of an automotive software platform such as schematically shown in Fig. 2 is for example described in “AUTOSAR, Layered Software Architecture, Classic Platform, Standard 4.3.1” in more detail.
  • An automotive software platform comprises a microcontroller/hardware layer 201, a basic software layer 202, a runtime environment layer 203 and an application layer 204.
  • The different hardware components of the microcontroller/hardware layer 201, like a ToF sensor (101 in Fig. 1), ECUs/processors (103 in Fig. 1), SDRAM (104 in Fig. 1) and NVM (106 in Fig. 1) or the like, may relate to a distributed system.
  • The basic software layer 202 provides several types of services, for example: Input/Output services, which standardize the access to sensors, actuators and ECU onboard peripherals; Memory services, which standardize the access to internal memory (e.g. SDRAM) and external memory (e.g. NVM);
  • Crypto services, which standardize the access to cryptographic primitives including internal/external hardware accelerators;
  • Communication services, which standardize the access to vehicle network systems, ECU onboard communication systems and ECU internal software;
  • Off-board communication services, which standardize the access to Vehicle-to-X communication, in-vehicle wireless network systems and ECU off-board communication systems;
  • System services, which provide standardizable services (operating system, timers, error memory), ECU-specific services (ECU state management, watchdog manager) and library functions;
  • Driver services, which contain the functionality to control and access an internal or an external device.
  • The runtime environment layer 203 realizes the communication between the basic software layer 202 and the different software components of the application layer 204. That is, the runtime environment layer 203 manages the inter- and intra-hardware (ECU) communication.
  • The application layer 204 comprises several different applications with different intended functionalities to support vehicle functions, like for example an application for opening a lock of a door of the car, an application for changing the volume of the audio system in the car, an application for sending out a stimulus signal to the driver, or the like. Each of the applications may be activated (perform their intended functionality) based on the fulfillment of different criteria.
  • The fulfillment of a first criterion may be determined based on classification information of an ANN for face identification (start engine, open lock of car door), of an ANN for gesture identification (changing the volume), of an ANN for driver drowsiness detection (sending out a stimulus signal) or of an ANN for a brake assistance (braking).
  • The different ANNs, which are trained for different intended functionalities, may be implemented as applications within the application layer.
  • The runtime environment layer 203 defines so-called ports, and from the point of view of the application layer 204, devices such as sensors, controllers, ECUs, and the like are accessible as software components which communicate with each other through respective ports. For example, viewed from the application layer 204, the ToF camera is accessible as a software component where the internal functionality of the operating system or the communication protocol stack is largely hidden.
  • the ToF depth data may be received by the ANN(s) through a port provided by the runtime environment layer 203.
  • the ANN may then perform inference (for example classification) on the depth data and output its results, for example the confidence value of the intended classes (see Figs. 3 and 4), through a port to another application with a certain intended functionality.
  • The depth data is sent from the ToF sensor 101 to an ANN instantiation via the MIPI interface 102, stored in the SDRAM 104 and processed by the ANN.
  • the ANN performs inference on the depth data and determines for example classification information.
  • Certain actions with a respective intended functionality may then be performed (for example by a car ECU or the host processor 103).
  • These intended functionalities may for example be opening a lock of a car door, starting the engine of a car, increasing the volume of an audio system in a car, applying the brake of a car or sending out a stimulus signal to a drowsy driver to wake him up.
  • These intended functionalities may be performed (for example by the ECU or the host processor 103) based on the fulfillment of one of a plurality of possible criteria.
  • the fulfillment of a first criterion may be determined based on the classification information determined by the ANN.
  • The performing of the intended functionality of sending out a stimulus signal may be based on the fulfillment of a first criterion based on classification information of an ANN for driver drowsiness detection.
  • the ANN for driver drowsiness detection may perform image analysis/ recognition of depth images of the face/ facial expression of the driver.
  • The performing of the intended functionality of braking may be based on the fulfillment of a first criterion based on classification information of an ANN for a brake assistance.
  • The ANN for a brake assistance may analyze depth images of preceding vehicles.
  • the ANNs which are trained for different intended functionalities may be implemented as an application within the application layer.
  • the ANN may for example perform a classification of the depth data (or a sequence of depth images) into one of a number of classes. That is the ANN performs classification of the depth data and determines classification information for the depth data.
  • the classification information for the depth data that is determined by the ANN may be a certain value for each class for the depth data (see for example the Softmax function for the CNN in Fig. 3).
  • the value of each class is also referred to as confidence value (or confidence level) of the class for the depth data. That is the classification information for the depth data may be a confidence value for each class for the depth data.
  • the class with the highest confidence value may indicate into which class the depth data is classified.
  • the class into which the depth data may be classified may be referred to as classification result.
  • Each ANN - trained with respect to a certain intended functionality - may provide an output layer with classes, where a specific class is linked to the intended functionality.
  • Such a class is referred to here as “class related to the intended functionality”, or “intended class”. This means that if the depth data is classified into an intended class (i.e. its respective confidence value exceeds a predetermined acceptance threshold Thr_Acc, see also Fig. 4), then it is determined that the first criterion is fulfilled and the intended functionality is performed.
  • An intended class may for example be the authorized-class (open lock of car door), the increase-volume-class (gesture recognition for increasing volume), the brake-class (brake assistance), or the drowsy-class (drowsiness detection).
  • the ANN may determine a confidence value for each class for the whole sequence of depth images as a whole.
  • An example of an ANN is a convolutional neural network (CNN) which is exemplarily described next in Fig. 3.
  • Fig. 3 shows an exemplifying architecture of a convolutional neural network for image classification.
  • An input image matrix 301 is input into the CNN, wherein each entry of the input image matrix 301 corresponds to one pixel of an image (for example the depth image of a face), which should be processed by the CNN.
  • the value of each entry of the input image matrix 301 may be a 32-bit value, wherein each of the colours red, green, and blue and the depth information occupies 8 bits.
  • the value of each entry of the input image matrix 301 may be a 16-bit value, wherein a grayscale value and the depth information respectively occupy 8 bits.
  • A filter, also called kernel or feature detector 302, which is a matrix (it may be symmetric or asymmetric; in audio applications it may be advantageous to use asymmetric kernels, as the audio waveform, and therefore also the data, may not be symmetric) with an uneven number of rows and columns (for example 3x3, 5x5, 7x7 etc.), is shifted from left to right and top to bottom over the input image matrix 301; at each position, the entries of the filter 302 are multiplied with the covered entries of the input image matrix 301 and summed up, which yields one entry of the first layer matrix 303, which has the same dimension as the input image matrix 301.
  • The position of the centre of the filter 302 in the input image matrix 301 is the same position where the generated result of the multiplication-summation as described above is placed in the first layer matrix 303.
  • All rows of the first layer matrix 303 are placed next to each other to form a first layer vector 304.
  • A nonlinearity (e.g. ReLU) may be applied to the entries of the first layer matrix 303.
  • the first layer vector 304 is multiplied with a last layer matrix 305, which yields the result z.
  • The last layer matrix 305 has as many rows as the first layer vector has columns, and the number S of columns of the last layer matrix 305 corresponds to the S different classes into which the CNN should classify the input image matrix 301.
  • the result z of the matrix multiplication between the first layer vector 304 and the last layer matrix 305 is input into a Softmax function.
  • the probability value for each class may also be called the confidence value of the class for the image (i.e. for the depth data).
  • Here S = 2, i.e. the depth data image corresponding to the input image matrix 301 should be classified into one of two classes, class_1 and class_2.
  • This yields the probability P_class_1 that the input image matrix 301 belongs to class class_1 and the probability P_class_2 that the input image matrix 301 belongs to class class_2 (for binary classification problems, i.e. S = 2, only one output neuron with a sigmoid nonlinearity may be used; if the output is below 0.5 the input may be labelled as class_1, and if it is above 0.5 it may be labelled as class_2).
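  • The forward pass described above can be sketched in a few lines of numpy. The image size, the filter size and the number of classes S = 2 are chosen only for illustration; this is a toy restatement of the description, not an implementation taken from the patent.

import numpy as np

def conv2d_same(image, kernel):
    """'Same' convolution: shift the filter over the zero-padded image so the
    first layer matrix has the same dimension as the input image matrix."""
    k = kernel.shape[0] // 2
    padded = np.pad(image, k)
    out = np.zeros_like(image, dtype=float)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = np.sum(padded[r:r + kernel.shape[0], c:c + kernel.shape[1]] * kernel)
    return out

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
image = rng.random((8, 8))             # toy stand-in for the input image matrix 301
kernel = rng.random((3, 3))            # filter / feature detector 302 (trainable weights)
first_layer = np.maximum(conv2d_same(image, kernel), 0.0)   # convolution followed by ReLU
first_vector = first_layer.reshape(1, -1)                   # rows placed next to each other (304)
last_layer_matrix = rng.random((first_vector.shape[1], 2))  # S = 2 classes (305)
z = first_vector @ last_layer_matrix
print(softmax(z.ravel()))              # confidence values P_class_1, P_class_2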
  • The entries of the filter 302 and the entries of the last layer matrix 305 are the weights (coefficients) of the CNN, which are trained during the training process.
  • the CNN can be trained in a supervised manner, by feeding an input image matrix, which is labelled as corresponding to a certain class, into the CNN.
  • the current output of the CNN in the training phase is input into a loss function and through a backpropagating algorithm the weights of the CNN are adapted.
  • Many variations of the general CNN architecture described above are possible. For example, multiple filters in one layer can be used and/or multiple layers can be used.
  • The ANN, e.g. a CNN, may perform face identification, that is, it classifies depth data of a face of a person into one of two different classes, either the authorized-class or the non-authorized-class.
  • the ANN was trained with respect to the intended functionality of opening the lock of a car door.
  • the ANN may perform driver drowsiness detection, that is it classifies the depth data (for example of a face of a person) into one of three classes - that is the sleeping-class (driver is sleeping), attentive-class (driver is attentive), and drowsy-class (driver is drowsy).
  • the ANN was trained with respect to the intended functionality of sending out a stimulus signal (e.g. flashing the interior lights in the car) to stop the driver from falling asleep/ waking the sleeping driver up.
  • The ANN, e.g. a CNN, may perform gesture recognition for increasing (or decreasing) the audio volume, that is, it classifies a sequence of depth images (for example of a hand gesture of a person) into one of two classes, the increase-volume-class and the decrease-volume-class.
  • the ANN may be trained with respect to the intended functionality of increasing (decreasing) the volume of the audio system in the car.
  • the host processor 103 may release the depth data (or sequence of depth images) from SDRAM 104 to free up working memory (for example one iToF raw depth frame from a 1 Megapixel sensor may occupy 128 Mbits in the SDRAM which is organized in multiples of 8-bits).
  • A respective intended functionality, like for example opening a lock of a car door, starting the engine of a car, increasing the volume of an audio system in a car, applying the brake of a car or waking up a drowsy driver of a car (by sending out a stimulus signal), may be performed by the host processor 103 or by a car ECU.
  • These actions/intended functionalities may be performed based on the fulfillment of a first criterion as described above. That is the first criterion may be determined as fulfilled based on the classification information of the ANN. However, these actions/intended functionalities may also be performed based on the fulfillment of a secondary criterion.
  • This secondary criterion may be determined as fulfilled based on a secondary mechanism.
  • the secondary mechanism may be the manually triggering of performing of the intended functionality or the triggering of the intended functionality by an alternative application.
  • the secondary mechanism may be the using of the key to unlock the car door.
  • the opening of the lock of the car door may be performed if the secondary criterion is determined as fulfilled based on using of the key to unlock the car door.
  • the secondary mechanism may be operating the knobs or touch screen on the car’s console to increase the volume.
  • The increasing of the volume of the audio system is performed if the secondary criterion is determined as fulfilled based on operating the knobs or touch screen on the car’s console.
  • The secondary mechanism may be operating the brake pedal (manually).
  • Braking is performed if the secondary criterion is determined as fulfilled based on the operation of the brake pedal.
  • the secondary mechanism may be triggering the intended functionality by an alternative application.
  • The secondary mechanism may be the output signal of a lane assistant (for example if a lane marking is crossed unexpectedly).
  • the sending out of a stimulus signal to the driver is performed if the secondary criterion is determined as fulfilled based on the output signal of the lane assistant.
  • the secondary criterion for performing the intended functionality is fulfilled based on a secondary mechanism if the secondary mechanism is activated/ triggered/ executed.
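  • The relation between intended functionality, secondary mechanism and the label it delivers can be pictured as a simple lookup table; the pairings below only restate the examples given in this description, and the identifiers themselves are illustrative assumptions.

# secondary mechanism event           -> (intended functionality, intended class / label)
SECONDARY_MECHANISMS = {
    "key_used_to_unlock_door":         ("open_door_lock",        "authorized"),
    "volume_knob_or_touchscreen_used": ("increase_audio_volume", "increase-volume"),
    "brake_pedal_operated":            ("apply_brake",           "brake"),
    "lane_assistant_warning":          ("send_stimulus_signal",  "drowsy"),
}

def on_secondary_mechanism(event):
    """When a secondary mechanism fires, the secondary criterion of the linked
    intended functionality is considered fulfilled and the label is returned."""
    functionality, label = SECONDARY_MECHANISMS[event]
    return functionality, label

print(on_secondary_mechanism("key_used_to_unlock_door"))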
  • The confidence value of the intended class of the depth data may be below the acceptance threshold Thr_Acc (such that the intended functionality is not performed) but still above a predetermined monitoring threshold Thr_Mon (denoted Mon_Acc in Fig. 4).
  • The predetermined monitoring threshold Mon_Acc may be a small predetermined amount below the acceptance threshold Thr_Acc, for example the monitoring threshold Mon_Acc may be 1% or 5% or 10% below the acceptance threshold Thr_Acc. This means that the depth data is classified as “just narrowly” not within the intended class. That means that the first criterion for performing the intended functionality is not fulfilled based on the classification (outside the intended class) for these depth data samples. If, however, the secondary criterion based on the secondary mechanism is determined as fulfilled (i.e. the secondary mechanism is triggered) within a short time span after it was determined that the first criterion is not fulfilled, there is a high probability that the depth data is a false negative. That means that the depth data is classified by the ANN as not belonging to the intended class (so that the intended functionality was not performed based on the ANN criterion), although it is a member of that intended class (that is, the intended functionality should have been performed based on the ANN).
  • With respect to Fig. 4, it is described how to determine whether depth data is classified as false negative based on the secondary mechanism and the confidence value of the depth data.
  • Such depth data samples together with their corresponding labels may be stored in the NVM as training data. Therefore, it is desirable, when operating an ANN trained with respect to an intended functionality, to flag interesting candidates of depth data upfront which may later turn out to be false negatives, and not to release them from the volatile memory 104 for a certain time span after processing.
  • The time span (see Fig. 4, “sliding window operation”) during which these depth data should not be released from the volatile memory 104 should be short enough not to use up more memory than necessary, but long enough that most likely (for example with 90% or 95% or 98% certainty) the secondary mechanism for a certain application is triggered/executed, if it is triggered/executed at all.
  • During this time span, the relevant depth data is kept in the volatile memory 104 (instead of being released immediately after the ANN has processed it). If the secondary mechanism is activated within this time span, the interesting depth data is still available in the volatile memory 104 and can be written from the volatile memory 104 to the non-volatile memory 106, together with its corresponding class label.
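  • A minimal sketch of such a sliding-window buffer, assuming simple in-memory stand-ins for the volatile and non-volatile memories: candidate frames are kept together with their timestamp, evicted once they are older than Tsw, and copied to the non-volatile store with the label delivered by the secondary mechanism if it fires in time.

from collections import deque

class SlidingWindowBuffer:
    """Keeps "interesting" depth frames in (simulated) volatile memory for Tsw seconds."""

    def __init__(self, tsw):
        self.tsw = tsw
        self.frames = deque()      # entries: (timestamp, depth_data)
        self.nvm = []              # stand-in for the non-volatile memory 106

    def add_candidate(self, t, depth_data):
        self.frames.append((t, depth_data))

    def evict_expired(self, now):
        while self.frames and now > self.frames[0][0] + self.tsw:
            self.frames.popleft()  # released from volatile memory, nothing is stored

    def secondary_mechanism_fired(self, now, label):
        """Every frame still inside its window is captured as a false negative."""
        self.evict_expired(now)
        while self.frames:
            t, depth_data = self.frames.popleft()
            self.nvm.append((depth_data, label))

buf = SlidingWindowBuffer(tsw=60.0)
buf.add_candidate(t=0.0, depth_data="frame_403_4")
buf.secondary_mechanism_fired(now=12.0, label="authorized")   # within Tsw -> stored with label
print(buf.nvm)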
  • Fig. 4 schematically shows a sliding window operation for determining depth data classified as false negative by the ANN.
  • The graph in Fig. 4 shows the time on the x-axis and, on the y-axis, the confidence value for the intended class of depth data which is classified by an ANN (for example performing face identification for opening a door of the car, performing gesture identification for increasing the volume of the audio system, performing driver drowsiness detection or the like).
  • The ToF depth data (or sequences of depth images) 403-1, ..., 403-6 are received by the ANN from a ToF sensor over time, and classified by determining a confidence value (for each available class) for each depth data sample (or by determining one confidence value for each sequence of depth images).
  • a predetermined monitoring threshold Mon_Acc 402 is defined as a confidence value for the intended class of the ANN which is a predetermined amount below the acceptance threshold Thr_Acc 401, for example 1% or 5% or 10% below the acceptance threshold Thr_Acc 401.
  • The depth data 403-4 is received at time ti and has a confidence value for the intended class below the acceptance threshold Thr_Acc 401 (i.e. the first criterion for performing the intended functionality is not fulfilled based on the classification result for the depth data) but above the monitoring threshold Mon_Acc 402. That means the depth data 403-4 is an interesting candidate which may later turn out to be false negative classified depth data. Therefore, the depth data 403-4 is not immediately released from the SDRAM 104. Instead, a sliding window operation 404 is activated with a time span of Tsw (sliding window time). That means that during this time span Tsw, i.e. from the time ti until the time ti+Tsw, the depth data 403-4 is kept in the SDRAM 104.
  • the depth data 403-4 is transferred from the SDRAM 104 to the non-volatile memory 106 together with its corresponding label stemming from the secondary mechanism.
  • The corresponding label is the class to which the depth data correctly belongs, which is the intended class indicated by the fulfilment of the secondary criterion based on the secondary mechanism.
  • In the example of Fig. 4, the secondary mechanism is executed for the depth data 403-4 at a time t which satisfies t ≤ ti + Tsw; therefore the depth data 403-4, together with its corresponding label from the secondary mechanism, is transferred to the non-volatile memory 106.
  • The time ti may be the time at which the depth data is generated at the ToF sensor 101, the time at which the inference on the depth data in the ANN is finished, or a time in between. However, since the time span between the generation of the depth data in the ToF sensor and the finishing of the inference of the ANN on the depth data is very small compared to the time span Tsw, this may not be important.
  • The monitoring threshold Mon_Acc 402 should be a small amount below the acceptance threshold Thr_Acc 401, for example between 1% and 15%, such that interesting candidates are identified without storing too many depth data samples in the SDRAM 104 which later do not turn out to be classified as false negatives.
  • The monitoring threshold Mon_Acc 402 is needed to limit the amount of SDRAM working memory 104 used for the sliding window, because it may not be possible to simply store every depth data sample for a longer time span after it is processed.
  • the time span Tsw of the sliding window during which a depth data should not be released may be a time span within which most likely (for example 95% certainty) the secondary criterion based on the secondary mechanism for the intended functionality is fulfilled, if it is fulfilled at all.
  • The time span of the sliding window varies depending on the respective use case (i.e. the intended functionality) and may vary between 1 s and 60 s.
  • For example, if the secondary mechanism is using the key to unlock the car door, the time span Tsw of the sliding window may be 60 s. If the secondary mechanism is operating the knobs or touch screen on the car’s console to increase the volume, the time span Tsw of the sliding window may be 5 s.
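  • In code, this use-case dependence could simply be a small configuration table; the two values are the ones given above, the keys are illustrative assumptions.

# Sliding window time span Tsw per use case (values from the text, keys are assumptions)
TSW_PER_USE_CASE = {
    "open_door_lock_by_face_identification": 60.0,   # seconds; the key may be used much later
    "increase_volume_by_gesture":             5.0,   # seconds; knob/touch screen follows quickly
}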
  • For example, the ANN for face identification for opening a lock of the door of the car or starting the engine of the car classified a person as not authorized (that is, the confidence value of the intended class is above the monitoring threshold Mon_Acc and below the acceptance threshold Thr_Acc), and then the key is used to manually trigger the unlocking a short time period later (see the sliding window in Fig. 4).
  • The ANN for gesture identification for increasing the volume of the audio system classified the gesture as not increasing the volume (that is, the confidence value of the intended class is above the monitoring threshold Mon_Acc and below the acceptance threshold Thr_Acc), and then the knobs or touch screen on the car’s console are operated to increase the volume a short time period later.
  • The ANN for the brake assistance classified depth data of a situation in front of the car as not requiring braking (that is, the confidence value of the intended class is above the monitoring threshold Mon_Acc and below the acceptance threshold Thr_Acc), and then the brake pedal is operated a short time period later.
  • The ANN of the driver drowsiness detection classified depth data of a driver as awake (that is, the confidence value of the intended class is above the monitoring threshold Mon_Acc and below the acceptance threshold Thr_Acc), and then the lane assistant outputs a signal (to output a stimulus signal) a short period of time later.
  • These depth data samples classified as false negative by the ANN, together with their corresponding correct labels, i.e. the intended class they belong to (e.g. the authorized-class, the increase-volume-class, the brake-class, the drowsy-class, etc.), which are delivered by the secondary mechanism, should be identified (captured) and stored. This labelled depth data may then be used as training data for the ANN to improve the operation of the ANN for the intended functionality for images/people/use cases not properly covered by the existing training database.
  • Fig. 5 shows a flowchart of capturing false negative depth data of an ANN with respect to an intended functionality.
  • In step 501, depth data is generated with a ToF sensor at time ti.
  • In step 502, the depth data is stored on the volatile memory 104.
  • In step 503, the depth data is read from the volatile memory into the host processor, which carries out the ANN processing.
  • In step 504, classification of the depth data is performed in an ANN of the depth-based ANN application and a confidence value of the intended class is obtained.
  • In step 505, it is asked whether the confidence value of the intended class is below the acceptance threshold but above the monitoring threshold (see Fig. 4). If the answer in step 505 is no, the process proceeds with step 506. In step 506, the depth data is dropped from volatile memory and the process ends (for example, the confidence value may exceed the acceptance threshold and thereby the intended functionality may be performed based on the fulfillment of the first criterion). If the answer in step 505 is yes, the process proceeds with step 507. In this case, the intended functionality is not performed because the first criterion is not fulfilled based on the classification confidence value of the depth data.
  • In step 507, a sliding window operation (see Fig. 4) is started with a time span of Tsw. That means that during the time span Tsw the depth data should not be released from the volatile memory.
  • In step 508, it is asked whether the current time t is below the generation time ti of the depth data plus the time span Tsw of the sliding window, i.e. t ≤ ti + Tsw. If the answer in step 508 is no, the process proceeds with step 509. In step 509, the depth data is released from volatile memory and the process ends. If the answer in step 508 is yes, the process proceeds with step 510. In step 510, it is asked whether the corresponding secondary mechanism for the application was executed (activated). That means it is determined whether the intended functionality is performed based on the fulfillment of the secondary criterion based on the secondary mechanism. If the answer in step 510 is no, the process proceeds again with step 508 (i.e. in step 510 the intended functionality is not performed because also the secondary criterion based on the secondary mechanism is not fulfilled). If the answer in step 510 is yes, the process proceeds with step 511. That means it is determined that the intended functionality is performed based on the fulfillment of the secondary criterion based on the secondary mechanism. In step 511, the depth data and the corresponding label from the secondary mechanism are stored in a non-volatile memory and the process ends.
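  • As a compact illustration, the flow of Fig. 5 can be written down as a single per-frame procedure; the step numbers from the figure are kept as comments, while the thresholds, the timing and the query of the secondary mechanism are placeholder assumptions rather than the patent's own implementation.

import time

def process_frame(tof_sensor, ann, volatile_mem, nvm, thr_acc, thr_mon, tsw,
                  secondary_mechanism_executed, intended_class):
    """One pass through the flowchart of Fig. 5 (illustrative sketch)."""
    t1 = time.monotonic()
    depth_data = tof_sensor()                         # 501: generate depth data at time ti
    volatile_mem.append(depth_data)                   # 502: store in volatile memory
    confidence = ann(depth_data)                      # 503/504: classify, confidence of intended class

    if not (thr_mon <= confidence < thr_acc):         # 505: interesting candidate?
        volatile_mem.remove(depth_data)               # 506: drop (accepted or clearly rejected)
        return

    # 507: start a sliding window of length tsw
    while time.monotonic() < t1 + tsw:                # 508: still inside the window?
        if secondary_mechanism_executed():            # 510: secondary criterion fulfilled?
            nvm.append((depth_data, intended_class))  # 511: store frame + label on the NVM
            volatile_mem.remove(depth_data)
            return
        time.sleep(0.05)

    volatile_mem.remove(depth_data)                   # 509: window expired, release the frame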
  • the depth data and corresponding label from the secondary mechanism can then be read out from the NVM during service and can be sent to an “ANN training-center”, that is for example a central server infrastructure, or a cloud-based service and the ANN for the application can be re-trained (see Fig. 6).
  • Fig. 6 schematically shows a feedback loop of capturing a false negative classified ToF depth data and re-training of the ANN with respect to an intended functionality.
  • a classification of the depth data is performed with the ANN.
  • the depth data and corresponding secondary mechanism label are captured as described above with Fig. 4.
  • The captured depth data and its corresponding secondary mechanism label are stored in the non-volatile memory.
  • The stored depth data and its corresponding secondary mechanism label are sent to an ANN training-center (for example a central server infrastructure or a cloud-based service).
  • In step 603, re-training of the ANN is performed (that is, its coefficients are adapted to the new data) with the new depth data and its corresponding secondary mechanism label in the ANN training-center. Between step 603 and step 601, the updated coefficients are loaded into the ANN of the application.
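  • The feedback loop of Fig. 6 could be sketched as follows; the transport to the training center and the training step itself are reduced to placeholders (a single logistic-regression-style update per captured sample), since this description does not prescribe a particular training procedure.

import numpy as np

def retrain(weights, captured_samples, lr=0.01):
    """Illustrative stand-in for step 603: one gradient update per captured
    (feature_vector, label) pair; a real system would re-train the full ANN."""
    for features, label in captured_samples:
        x = np.asarray(features, dtype=float)
        p = 1.0 / (1.0 + np.exp(-weights @ x))    # predicted confidence for the intended class
        weights = weights + lr * (label - p) * x  # push the prediction towards the captured label
    return weights

# captured false negatives read back from the NVM (toy data, label 1 = intended class)
captured = [(np.array([0.2, 0.7, 0.1]), 1), (np.array([0.3, 0.6, 0.2]), 1)]
weights = retrain(np.zeros(3), captured)          # updated coefficients loaded back into the ANN
print(weights)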
  • Fig. 7 is a block diagram depicting an example of schematic configuration of a vehicle control system 7000 as an example of a mobile body control system to which the capturing of false negative depth data of an ANN with respect to an intended functionality can be applied.
  • the vehicle control system 7000 includes a plurality of electronic control units connected to each other via a communication network 7010.
  • the vehicle control system 7000 includes a driving system control unit 7100, a body system control unit 7200, a battery control unit 7300, an outside-vehicle information detecting unit 7400, an in-vehicle information detecting unit 7500, and an integrated control unit 7600 (for example the ECU 108, or the host processor 103).
  • the communication network 7010 connecting the plurality of control units to each other may, for example, be a vehicle-mounted communication network compliant with an arbitrary standard (see interface 109) such as controller area network (CAN), local interconnect network (LIN), local area network (LAN), FlexRay (registered trademark), or the like.
  • Each of the control units may include: a microcomputer that performs arithmetic processing according to various kinds of programs; a storage section (for example the SDRAM 104, or the NVM 106) that stores the programs executed by the microcomputer, parameters used for various kinds of operations, or the like; and a driving circuit that drives various kinds of control target devices.
  • Each of the control units further includes: a network interface (I/F) (for example the interfaces 102, 105, 107, 109) for performing communication with other control units via the communication network 7010; and a communication I/F for performing communication with a device, a sensor (for example the ToF sensor 101), or the like within and without the vehicle by wire communication or radio communication.
  • The integrated control unit 7600 includes a microcomputer 7610, a general-purpose communication I/F 7620, a dedicated communication I/F 7630, a positioning section 7640, a beacon receiving section 7650, an in-vehicle device I/F 7660, a sound/image output section 7670, a vehicle-mounted network I/F 7680, and a storage section 7690 (for example the SDRAM 104, or the NVM 106).
  • the other control units similarly include a microcomputer, a communication I/F, a storage section, and the like.
  • the driving system control unit 7100 controls the operation of devices related to the driving system of the vehicle in accordance with various kinds of programs.
  • the driving system control unit 7100 functions as a control device for a driving force generating device for generating the driving force of the vehicle, such as an internal combustion engine, a driving motor, or the like, a driving force transmitting mechanism for transmitting the driving force to wheels, a steering mechanism for adjusting the steering angle of the vehicle, a braking device for generating the braking force of the vehicle, and the like.
  • the driving system control unit 7100 may have a function as a control device of an antilock brake system (ABS), electronic stability control (ESC), or the like.
  • the driving system control unit 7100 is connected with a vehicle state detecting section 7110.
  • the vehicle state detecting section 7110 includes at least one of a gyro sensor that detects the angular velocity of axial rotational movement of a vehicle body, an acceleration sensor that detects the acceleration of the vehicle, and sensors for detecting an amount of operation of an accelerator pedal, an amount of operation of a brake pedal, the steering angle of a steering wheel, an engine speed or the rotational speed of wheels, and the like.
  • the driving system control unit 7100 performs arithmetic processing using a signal input from the vehicle state detecting section 7110, and controls the internal combustion engine, the driving motor, an electric power steering device, the brake device, and the like.
  • the body system control unit 7200 controls the operation of various kinds of devices provided to the vehicle body in accordance with various kinds of programs.
  • the body system control unit 7200 functions as a control device for a keyless entry system (for example via face recognition with an ANN), a smart key system, a power window device, or various kinds of lamps such as a headlamp, a backup lamp, a brake lamp, a turn signal, a fog lamp, or the like.
  • radio waves transmitted from a mobile device as an alternative to a key, or signals of various kinds of switches, can be input to the body system control unit 7200.
  • the body system control unit 7200 receives these input radio waves or signals, and controls a door lock device, the power window device, the lamps, or the like of the vehicle.
  • the battery control unit 7300 controls a secondary battery 7310, which is a power supply source for the driving motor, in accordance with various kinds of programs.
  • the battery control unit 7300 is supplied with information about a battery temperature, a battery output voltage, an amount of charge remaining in the battery, or the like from a battery device including the secondary battery 7310.
  • the battery control unit 7300 performs arithmetic processing using these signals and performs control for regulating the temperature of the secondary battery 7310 or controls a cooling device provided to the battery device or the like.
  • the outside-vehicle information detecting unit 7400 detects information about the outside of the vehicle including the vehicle control system 7000 (for example used by the brake assistant).
  • the outside-vehicle information detecting unit 7400 is connected with at least one of an imaging section 7410 and an outside-vehicle information detecting section 7420.
  • the imaging section 7410 includes at least one of a time-of-flight (ToF) camera, a stereo camera, a monocular camera, an infrared camera, and other cameras.
  • the outside-vehicle information detecting section 7420 includes at least one of an environmental sensor for detecting current atmospheric conditions or weather conditions and a peripheral information detecting sensor for detecting another vehicle, an obstacle, a pedestrian, or the like on the periphery of the vehicle including the vehicle control system 7000.
  • the environmental sensor may be at least one of a rain drop sensor detecting rain, a fog sensor detecting a fog, a sunshine sensor detecting a degree of sunshine, and a snow sensor detecting a snowfall.
  • the peripheral information detecting sensor may be at least one of an ultrasonic sensor, a radar device, and a LIDAR device (Light detection and Ranging device, or Laser imaging detection and ranging device).
  • Each of the imaging section 7410 and the outside-vehicle information detecting section 7420 may be provided as an independent sensor or device or may be provided as a device in which a plurality of sensors or devices are integrated.
  • Fig. 8 is a diagram of assistance in explaining an example of installation positions of an outside-vehicle information detecting section and an imaging section.
  • Imaging sections 7910, 7912, 7914, 7916, and 7918 are, for example, disposed at at least one of the following positions: the front nose, the sideview mirrors, the rear bumper, the back door of the vehicle 7900, and an upper portion of the windshield within the interior of the vehicle.
  • the imaging section 7910 provided to the front nose and the imaging section 7918 provided to the upper portion of the windshield within the interior of the vehicle obtain mainly an image of the front of the vehicle 7900.
  • the imaging sections 7912 and 7914 provided to the sideview mirrors obtain mainly an image of the sides of the vehicle 7900.
  • the imaging section 7916 provided to the rear bumper or the back door obtains mainly an image of the rear of the vehicle 7900.
  • the imaging section 7918 provided to the upper portion of the windshield within the interior of the vehicle is used mainly to detect a preceding vehicle, a pedestrian, an obstacle, a signal, a traffic sign, a lane, or the like.
  • Fig. 8 depicts an example of photographing ranges of the respective imaging sections 7910, 7912, 7914, and 7916.
  • An imaging range a represents the imaging range of the imaging section 7910 provided to the front nose.
  • Imaging ranges b and c respectively represent the imaging ranges of the imaging sections 7912 and 7914 provided to the sideview mirrors.
  • An imaging range d represents the imaging range of the imaging section 7916 provided to the rear bumper or the back door.
  • a bird's-eye image of the vehicle 7900 as viewed from above can be obtained by superimposing image data imaged by the imaging sections 7910, 7912, 7914, and 7916, for example.
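One way to obtain such a bird's-eye composite is to warp each camera image onto a common ground plane with a pre-calibrated homography and blend the warped layers. The sketch below is only illustrative; the camera names, the homography matrices, and the output size are assumptions and are not taken from the embodiment.

```python
# Illustrative sketch: compose a bird's-eye view from the images of the imaging
# sections 7910, 7912, 7914 and 7916. The 3x3 homographies mapping each camera
# onto the ground plane are assumed to come from an offline calibration.
import cv2
import numpy as np

BEV_SIZE = (800, 800)  # (width, height) of the bird's-eye view in pixels, assumed

def birds_eye_view(images, homographies):
    """images: dict camera-name -> BGR image; homographies: dict camera-name -> 3x3 array."""
    bev = np.zeros((BEV_SIZE[1], BEV_SIZE[0], 3), dtype=np.uint8)
    for name, img in images.items():
        warped = cv2.warpPerspective(img, homographies[name], BEV_SIZE)
        bev = np.maximum(bev, warped)  # keep the non-black pixel where layers overlap
    return bev
```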
  • Outside-vehicle information detecting sections 7920, 7922, 7924, 7926, 7928, and 7930 provided to the front, rear, sides, and corners of the vehicle 7900 and the upper portion of the windshield within the interior of the vehicle may be, for example, an ultrasonic sensor or a radar device.
  • the outside-vehicle information detecting sections 7920, 7926, and 7930 provided to the front nose of the vehicle 7900, the rear bumper, the back door of the vehicle 7900, and the upper portion of the windshield within the interior of the vehicle may be a LIDAR device, for example.
  • These outside-vehicle information detecting sections 7920 to 7930 are used mainly to detect a preceding vehicle, a pedestrian, an obstacle, or the like.
  • the outside-vehicle information detecting unit 7400 makes the imaging section 7410 image an image of the outside of the vehicle and receives imaged image data (for example depth data).
  • the outside-vehicle information detecting unit 7400 receives detection information from the outside-vehicle information detecting section 7420 connected to the outside-vehicle information detecting unit 7400.
  • the outside-vehicle information detecting unit 7400 transmits an ultrasonic wave, an electromagnetic wave, or the like, and receives information of a received reflected wave.
  • the outside-vehicle information detecting unit 7400 may perform processing of detecting an object such as a human, a vehicle, an obstacle, a sign, a character on a road surface, or the like, or processing of detecting a distance thereto.
  • the outside-vehicle information detecting unit 7400 may perform environment recognition processing of recognizing a rainfall, a fog, road surface conditions, or the like on the basis of the received information.
  • the outside-vehicle information detecting unit 7400 may calculate a distance to an object outside the vehicle on the basis of the received information.
  • the outside-vehicle information detecting unit 7400 may perform image recognition processing of recognizing a human, a vehicle, an obstacle, a sign, a character on a road surface, or the like, or processing of detecting a distance thereto.
  • the outside-vehicle information detecting unit 7400 may subject the received image data to processing such as distortion correction, alignment, or the like, and combine the image data imaged by a plurality of different imaging sections 7410 to generate a bird's-eye image or a panoramic image.
  • the outside-vehicle information detecting unit 7400 may perform viewpoint conversion processing using the image data imaged by the imaging section 7410 including the different imaging parts.
  • the in-vehicle information detecting unit 7500 detects information about the inside of the vehicle.
  • the in-vehicle information detecting unit 7500 is, for example, connected with a driver state detecting section 7510 that detects the state of a driver (for example for driver drowsiness detection).
  • the driver state detecting section 7510 may include a camera that images the driver, a biosensor that detects biological information of the driver, a microphone that collects sound within the interior of the vehicle, or the like.
  • the biosensor is, for example, disposed in a seat surface, the steering wheel, or the like, and detects biological information of an occupant sitting in a seat or the driver holding the steering wheel.
  • the in-vehicle information detecting unit 7500 may calculate a degree of fatigue of the driver or a degree of concentration of the driver or may determine whether the driver is dozing.
  • the in-vehicle information detecting unit 7500 may subject an audio signal obtained by the collection of the sound to processing such as noise canceling processing or the like.
  • the integrated control unit 7600 controls general operation within the vehicle control system 7000 in accordance with various kinds of programs.
  • the integrated control unit 7600 is connected with an input section 7800.
  • the input section 7800 is implemented by a device capable of input operation by an occupant, such, for example, as a touch panel, a button, a microphone, a switch, a lever, or the like.
  • the integrated control unit 7600 may be supplied with data obtained by voice recognition of voice input through the microphone.
  • the input section 7800 may, for example, be a remote control device using infrared rays or other radio waves, or an external connecting device such as a mobile telephone, a personal digital assistant (PDA), or the like that supports operation of the vehicle control system 7000.
  • the input section 7800 may be, for example, a camera.
  • an occupant can input information by gesture.
  • data may be input which is obtained by detecting the movement of a wearable device that an occupant wears.
  • the input section 7800 may, for example, include an input control circuit or the like that generates an input signal on the basis of information input by an occupant or the like using the above-described input section 7800, and which outputs the generated input signal to the integrated control unit 7600.
  • An occupant or the like inputs various kinds of data or gives an instruction for processing operation to the vehicle control system 7000 by operating the input section 7800.
  • the storage section 7690 may include a read only memory (for example NVM 106) that stores various kinds of programs executed by the microcomputer and a random access memory (for example SDRAM 104) that stores various kinds of parameters, operation results, sensor values, or the like.
  • the storage section 7690 may be implemented by a magnetic storage device such as a hard disc drive (HDD) or the like, a semiconductor storage device, an optical storage device, a magneto-optical storage device, or the like.
  • the general-purpose communication I/F 7620 is a widely used communication I/F that mediates communication with various apparatuses present in an external environment 7750.
  • the general-purpose communication I/F 7620 may implement a cellular communication protocol such as global system for mobile communications (GSM (registered trademark)), worldwide interoperability for microwave access (WiMAX (registered trademark)), long term evolution (LTE (registered trademark)), LTE-advanced (LTE-A), or the like, or another wireless communication protocol such as wireless LAN (also referred to as wireless fidelity, Wi-Fi (registered trademark)), Bluetooth (registered trademark), or the like.
  • the general-purpose communication I/F 7620 may, for example, connect to an apparatus (for example, an application server or a control server) present on an external network (for example, the Internet, a cloud network, or a company-specific network) via a base station or an access point.
  • the general-purpose communication I/F 7620 may connect to a terminal present in the vicinity of the vehicle (which terminal is, for example, a terminal of the driver, a pedestrian, or a store, or a machine type communication (MTC) terminal) using a peer to peer (P2P) technology, for example.
  • the dedicated communication I/F 7630 is a communication I/F that supports a communication protocol developed for use in vehicles.
  • the dedicated communication I/F 7630 may implement a standard protocol such, for example, as wireless access in vehicle environment (WAVE), which is a combination of institute of electrical and electronics engineers (IEEE) 802.11p as a lower layer and IEEE 1609 as a higher layer, dedicated short range communications (DSRC), or a cellular communication protocol.
  • the dedicated communication I/F 7630 typically carries out V2X communication as a concept including one or more of communication between a vehicle and a vehicle (Vehicle to Vehicle), communication between a road and a vehicle (Vehicle to Infrastructure), communication between a vehicle and a home (Vehicle to Home), and communication between a pedestrian and a vehicle (Vehicle to Pedestrian).
  • the positioning section 7640 performs positioning by receiving a global navigation satellite system (GNSS) signal from a GNSS satellite (for example, a GPS signal from a global positioning system (GPS) satellite), and generates positional information including the latitude, longitude, and altitude of the vehicle.
  • the positioning section 7640 may identify a current position by exchanging signals with a wireless access point, or may obtain the positional information from a terminal such as a mobile telephone, a personal handyphone system (PHS), or a smart phone that has a positioning function.
  • the beacon receiving section 7650 receives a radio wave or an electromagnetic wave transmitted from a radio station installed on a road or the like, and thereby obtains information about the current position, congestion, a closed road, a necessary time, or the like.
  • the function of the beacon receiving section 7650 may be included in the dedicated communication I/F 7630 described above.
  • the in-vehicle device I/F 7660 is a communication interface that mediates connection between the microcomputer 7610 and various in-vehicle devices 7760 present within the vehicle.
  • the in-vehicle device I/F 7660 may establish wireless connection using a wireless communication protocol such as wireless LAN, Bluetooth (registered trademark), near field communication (NFC), or wireless universal serial bus (WUSB).
  • the in-vehicle device I/F 7660 may establish wired connection by universal serial bus (USB), high-definition multimedia interface (HDMI (registered trademark)), mobile high-definition link (MHL), or the like via a connection terminal (and a cable if necessary) not depicted in the figures.
  • the in-vehicle devices 7760 may, for example, include at least one of a mobile device and a wearable device possessed by an occupant and an information device carried into or attached to the vehicle.
  • the in-vehicle devices 7760 may also include a navigation device that searches for a path to an arbitrary destination.
  • the in-vehicle device I/F 7660 exchanges control signals or data signals with these in-vehicle devices 7760.
  • the vehicle-mounted network I/F 7680 is an interface that mediates communication between the microcomputer 7610 and the communication network 7010.
  • the vehicle-mounted network I/F 7680 transmits and receives signals or the like in conformity with a predetermined protocol supported by the communication network 7010.
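As a rough illustration of such protocol-conformant transmission, the sketch below sends a single frame on a CAN bus using the python-can library; the channel name "can0", the arbitration identifier, and the payload are assumptions chosen only for the example.

```python
# Minimal sketch (assumed SocketCAN setup): transmit one frame on a CAN-based
# vehicle-mounted network such as the communication network 7010.
import can

def send_example_frame():
    with can.interface.Bus(bustype="socketcan", channel="can0") as bus:
        msg = can.Message(arbitration_id=0x123,            # assumed identifier
                          data=[0x01, 0x00, 0x00, 0x00],   # assumed payload
                          is_extended_id=False)
        bus.send(msg)
```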
  • the microcomputer 7610 (for example the host processor 103) of the integrated control unit 7600 controls the vehicle control system 7000 in accordance with various kinds of programs on the basis of information obtained via at least one of the general-purpose communication I/F 7620, the dedicated communication I/F 7630, the positioning section 7640, the beacon receiving section 7650, the in-vehicle device I/F 7660, and the vehicle-mounted network I/F 7680.
  • the microcomputer 7610 may calculate a control target value for the driving force generating device, the steering mechanism, or the braking device on the basis of the obtained information about the inside and outside of the vehicle, and output a control command to the driving system control unit 7100.
  • the microcomputer 7610 may perform cooperative control intended to implement functions of an advanced driver assistance system (ADAS) which functions include collision avoidance or shock mitigation for the vehicle, following driving based on a following distance, vehicle speed maintaining driving, a warning of collision of the vehicle, a warning of deviation of the vehicle from a lane, or the like.
  • the microcomputer 7610 may perform cooperative control intended for automatic driving, which makes the vehicle travel autonomously without depending on the operation of the driver, or the like, by controlling the driving force generating device, the steering mechanism, the braking device, or the like on the basis of the obtained information about the surroundings of the vehicle.
  • the microcomputer 7610 may generate three-dimensional distance information between the vehicle and an object such as a surrounding structure, a person, or the like, and generate local map information including information about the surroundings of the current position of the vehicle, on the basis of information obtained via at least one of the general-purpose communication I/F 7620, the dedicated communication I/F 7630, the positioning section 7640, the beacon receiving section 7650, the in-vehicle device I/F 7660, and the vehicle-mounted network I/F 7680.
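Three-dimensional distance information of this kind can, for instance, be derived from a ToF depth image by back-projecting every pixel through a pinhole camera model; the sketch below shows this under assumed intrinsic parameters (fx, fy, cx, cy), which are not values taken from the embodiment.

```python
# Illustrative sketch: back-project an HxW ToF depth image (metres) into a
# point cloud in camera coordinates using an assumed pinhole camera model.
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# example usage with assumed intrinsics:
# cloud = depth_to_points(depth_image, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```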
  • the microcomputer 7610 may predict danger such as collision of the vehicle, approaching of a pedestrian or the like, an entry to a closed road, or the like on the basis of the obtained information, and generate a warning signal.
  • the warning signal may, for example, be a signal for producing a warning sound or lighting a warning lamp.
  • the sound/image output section 7670 transmits an output signal of at least one of a sound and an image to an output device capable of visually or auditorily notifying information to an occupant of the vehicle or the outside of the vehicle.
  • an audio speaker 7710, a display section 7720, and an instrument panel 7730 are illustrated as the output device.
  • the display section 7720 may, for example, include at least one of an on-board display and a head-up display.
  • the display section 7720 may have an augmented reality (AR) display function.
  • the output device may be other than these devices, and may be another device such as headphones, a wearable device such as an eyeglass type display worn by an occupant or the like, a projector, a lamp, or the like.
  • In the case where the output device is a display device, the display device visually displays results obtained by various kinds of processing performed by the microcomputer 7610 or information received from another control unit in various forms such as text, an image, a table, a graph, or the like.
  • In the case where the output device is an audio output device, the audio output device converts an audio signal constituted of reproduced audio data or sound data or the like into an analog signal, and auditorily outputs the analog signal.
  • at least two control units connected to each other via the communication network 7010 in the example depicted in FIG. 7 may be integrated into one control unit.
  • each individual control unit may include a plurality of control units.
  • the vehicle control system 7000 may include another control unit not depicted in the figures.
  • part or the whole of the functions performed by one of the control units in the above description may be assigned to another control unit. That is, predetermined arithmetic processing may be performed by any of the control units as long as information is transmitted and received via the communication network 7010. Similarly, a sensor or a device connected to one of the control units may be connected to another control unit, and a plurality of control units may mutually transmit and receive detection information via the communication network 7010.
  • An electronic device comprising circuitry configured to: obtain depth data; perform (504) classification of the depth data by an artificial neural network, ANN, to determine classification information for the depth data; perform an intended functionality if a primary criterion or a secondary criterion is fulfilled; determine that the primary criterion for performing the intended functionality is fulfilled based on the classification information obtained by the ANN; determine that the secondary criterion for performing the intended functionality is fulfilled based on a secondary mechanism; and determine (510), based on the secondary mechanism and the classification information obtained by the ANN, whether the classification of the depth data is a false negative.
  • the circuitry is configured to re-train the ANN if the classification of the depth data is determined to be a false negative.
  • the circuitry is configured to determine (510) that the classification of the depth data is a false negative if the circuitry determined that the secondary criterion for performing the intended functionality is fulfilled based on the secondary mechanism within a predetermined time span (Tsw) after the circuitry determined that the primary criterion for performing the intended functionality is not fulfilled based on the classification information.
  • the ANN is configured to perform face identification, and wherein the intended functionality relates to opening a door lock of a car, and wherein the secondary criterion for performing the intended functionality is a criterion for opening the door lock.
  • a method comprising: obtaining depth data and storing (502) the depth data on a volatile memory (104); performing (504) classification of the depth data by an artificial neural network, ANN, to determine classification information for the depth data; performing an intended functionality if a primary criterion or a secondary criterion is fulfilled; determining that the primary criterion for performing the intended functionality is fulfilled based on the classification information obtained by the ANN; determining that the secondary criterion for performing the intended functionality is fulfilled based on a secondary mechanism; and determining (510), based on the secondary mechanism and the classification information obtained by the ANN, whether the classification of the depth data is a false negative.
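The following sketch restates the device and method summarised above as plain control flow: the ANN classification gates the intended functionality through the primary criterion, the secondary mechanism gates it independently, and a rejection by the ANN that is followed within the predetermined time span Tsw by a successful secondary check is flagged as a false negative and stored for re-training. All identifiers and the concrete value of Tsw are assumptions chosen for illustration only.

```python
# Minimal, hypothetical sketch of the primary/secondary criterion logic; the ANN
# object, the secondary mechanism and T_SW are placeholders, not part of the claims.
import time
from collections import deque

T_SW = 5.0                   # predetermined time span Tsw in seconds (assumed)
retraining_buffer = []       # depth frames flagged as false negatives (for re-training)
recent_rejections = deque()  # (timestamp, depth_data) of primary-criterion failures

def on_depth_frame(depth_data, ann, secondary_ok, perform_intended_functionality):
    """Process one depth frame: primary criterion via the ANN, secondary criterion
    via the secondary mechanism, and false-negative detection within T_SW."""
    now = time.time()
    score = ann.classify(depth_data)        # classification information from the ANN
    if score >= ann.threshold:              # primary criterion fulfilled
        perform_intended_functionality()    # e.g. open the door lock
    else:
        recent_rejections.append((now, depth_data))

    if secondary_ok():                      # secondary criterion fulfilled (e.g. key fob)
        perform_intended_functionality()
        # any rejection within the last T_SW seconds is treated as a false negative
        while recent_rejections:
            t, frame = recent_rejections.popleft()
            if now - t <= T_SW:
                retraining_buffer.append(frame)  # keep for later re-training of the ANN
```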

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Lock And Its Accessories (AREA)

Abstract

The invention concerns an electronic device (103) comprising circuitry configured to obtain depth data; perform (504) classification of the depth data by an artificial neural network (ANN) to determine classification information for the depth data; perform an intended functionality if a primary criterion or a secondary criterion is fulfilled; determine that the primary criterion for performing the intended functionality is fulfilled based on the classification information obtained by the ANN; and determine that the secondary criterion for performing the intended functionality is fulfilled based on a secondary mechanism.
EP22751033.6A 2021-07-19 2022-07-12 Dispositif électronique et procédé Pending EP4374335A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP21186494 2021-07-19
PCT/EP2022/069457 WO2023001636A1 (fr) 2021-07-19 2022-07-12 Dispositif électronique et procédé

Publications (1)

Publication Number Publication Date
EP4374335A1 true EP4374335A1 (fr) 2024-05-29

Family

ID=76971783

Family Applications (1)

Application Number Title Priority Date Filing Date
EP22751033.6A Pending EP4374335A1 (fr) 2021-07-19 2022-07-12 Dispositif électronique et procédé

Country Status (2)

Country Link
EP (1) EP4374335A1 (fr)
WO (1) WO2023001636A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118675204A (zh) * 2024-08-26 2024-09-20 杭州锐见智行科技有限公司 Hush-gesture detection method and apparatus, electronic device and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110930547A (zh) * 2019-02-28 2020-03-27 上海商汤临港智能科技有限公司 Vehicle door unlocking method and apparatus, system, vehicle, electronic device and storage medium

Also Published As

Publication number Publication date
WO2023001636A1 (fr) 2023-01-26

Similar Documents

Publication Publication Date Title
US11941819B2 (en) Object detection using skewed polygons suitable for parking space detection
US11841458B2 (en) Domain restriction of neural networks through synthetic data pre-training
US20230418299A1 (en) Controlling autonomous vehicles using safe arrival times
US11636689B2 (en) Adaptive object tracking algorithm for autonomous machine applications
US11514293B2 (en) Future object trajectory predictions for autonomous machine applications
US11841987B2 (en) Gaze determination using glare as input
US11538231B2 (en) Projecting images captured using fisheye lenses for feature detection in autonomous machine applications
US11590929B2 (en) Systems and methods for performing commands in a vehicle using speech and image recognition
US20220391766A1 (en) Training perception models using synthetic data for autonomous systems and applications
US20230351807A1 (en) Data set generation and augmentation for machine learning models
US12073604B2 (en) Using temporal filters for automated real-time classification
US11886634B2 (en) Personalized calibration functions for user gaze detection in autonomous driving applications
US20220144304A1 (en) Safety decomposition for path determination in autonomous systems
US20230186640A1 (en) Single and across sensor object tracking using feature descriptor mapping in autonomous systems and applications
US20240017743A1 (en) Task-relevant failure detection for trajectory prediction in machines
CN114841336A (zh) 修补用于自主机器应用的部署的深度神经网络
EP4374335A1 (fr) Dispositif électronique et procédé
US20240020953A1 (en) Surround scene perception using multiple sensors for autonomous systems and applications
US20240010232A1 (en) Differentiable and modular prediction and planning for autonomous machines
CN117581117A (zh) 自主机器系统和应用中使用LiDAR数据的动态对象检测
US20230202489A1 (en) Configuration management system for autonomous vehicle software stack

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20240217

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

RAP3 Party data changed (applicant data changed or rights of an application transferred)

Owner name: SONY DEPTHSENSING SOLUTIONS SA/NV

Owner name: SONY SEMICONDUCTOR SOLUTIONS CORPORATION