WO2022175494A1 - Apparatus, method and computer program for analyzing a sensor signal - Google Patents


Info

Publication number
WO2022175494A1
Authority
WO
WIPO (PCT)
Prior art keywords
sensor signal
sensor
neural net
signal
memory
Prior art date
Application number
PCT/EP2022/054157
Other languages
French (fr)
Inventor
Peter Reichel
Marc REICHENBACH
Stefan PECHMANN
Dietmar FEY
Original Assignee
Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
Friedrich-Alexander-Universität Erlangen-Nürnberg
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V., Friedrich-Alexander-Universität Erlangen-Nürnberg filed Critical Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
Publication of WO2022175494A1 publication Critical patent/WO2022175494A1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B 5/24 Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
    • A61B 5/316 Modalities, i.e. specific diagnostic methods
    • A61B 5/318 Heart-related electrical modalities, e.g. electrocardiography [ECG]
    • A61B 5/346 Analysis of electrocardiograms
    • A61B 5/349 Detecting specific parameters of the electrocardiograph cycle
    • A61B 5/361 Detecting fibrillation
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B 5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B 5/7235 Details of waveform analysis
    • A61B 5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 15/00 Digital computers in general; Data processing equipment in general
    • G06F 15/76 Architectures of general purpose stored program computers
    • G06F 15/80 Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F 15/8046 Systolic arrays
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Definitions

  • Embodiments according to the invention are related to apparatuses, methods and computer programs for analyzing a sensor signal.
  • Atrial fibrillation may be one of the most common serious abnormal heart rhythms. Atrial fibrillation may comprise rapid and irregular beating of the atrial chambers of the heart and may lead to blood clots in the heart. Furthermore, AFIB may increase the risk of severe medical incidents, such as stroke and heart failure.
  • Embodiments according to the invention comprise an apparatus for analyzing, e.g. for classifying, a sensor signal, e.g. a signal of a wearable sensor, such as an electrocardiogram (ECG) signal, e.g. for a detection of an atrial fibrillation on the basis of an electrocardiogram.
  • The apparatus is configured to input the sensor signal, or a preprocessed version of the sensor signal, or sensor data derived from the sensor signal, into a neural net, coefficients, e.g. weights, of which are quantized to be n-ary weights, wherein, as an example, n is preferably a non-negative odd integer number, e.g. three (ternary weights) or five, in order to obtain an analysis result.
  • a respective sensor may provide information about a heart rhythm to be analyzed or classified, e.g. as an analog or digital signal, e.g. as a voltage or a current and/or as a bitstream.
  • the coefficients may be quantized or binned to n states.
  • n may be an odd number
  • the weights may, for example, be signed weights that are symmetric around zero, e.g. -1, 0, 1 or -2, -1, 0, 1, 2
  • the n-ary quantization may, for example, be a coarse quantization, saving energy because of a lower bit width, a smaller data path and/or less net activity.
  • good analysis results may be achieved despite the coarse quantization, hence providing a good trade-off.
  • a multiplication with the weights may be performed in an energy efficient manner.
  • the multiplication may, for example, be similarly energy efficient as a multiplication with similar unsigned weights, e.g., for the three-state example, as efficient as with the binary weights 0, 1, but with better classification results because of the additional signed weight (e.g. -1).
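The energy-efficient multiplication described above can be illustrated with a small sketch: with weights restricted to {-1, 0, +1}, a multiply-accumulate degenerates to an add, a subtract, or a skip, so no hardware multiplier is needed. The function name and pure-Python form are illustrative, not taken from the patent.

```python
def ternary_mac(acc, x, w):
    """Accumulate x * w for a weight w restricted to {-1, 0, +1}.

    No multiplier is needed: the weight merely selects between
    adding, subtracting, or skipping the input sample.
    """
    if w == 1:
        return acc + x
    if w == -1:
        return acc - x
    return acc  # w == 0: the sample contributes nothing

# dot product of a sample vector with ternary weights
samples = [3, -2, 5, 1]
weights = [1, 0, -1, 1]
acc = 0
for x, w in zip(samples, weights):
    acc = ternary_mac(acc, x, w)
print(acc)  # 3 - 5 + 1 = -1
```

In hardware, the same selection is a sign-controlled adder, which is the reason the signed weight -1 costs essentially no more than the unsigned weight 1.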
  • an inventive apparatus may be worn by a patient for a long period of time without the need for a changing of batteries or a recharging of an accumulator.
  • the inventive apparatus may be built very small, because the low energy consumption requires only small batteries or a small accumulator, hence allowing the device to be carried comfortably.
  • because of the n-ary coefficients, specialized memory elements, e.g. with multi-level cell capabilities, and processing architectures, e.g. systolic arrays, may be used for an energy efficient storing of the weights and processing of the sensor signal, e.g. without wasting hardware resources, e.g. to further increase the energy efficiency.
  • the inventive concept may allow a synergistic usage of dedicated hardware and/or hardware architectures to further increase energy efficiency, hence reducing energy costs for storing coefficients and for processing the sensor signal using the coefficients.
  • the coefficients, e.g. weights, of the neural net are quantized to take one of the three values -a, 0, and +a, e.g. one of the values -1, 0 and +1, where a is a real valued number, e.g., preferably, a power of 2, since this allows for a simple multiplication.
  • a may be a complex number.
  • the neural net is mapped onto a data flow of a systolic array, e.g. implemented using a two-dimensional array of processing elements.
  • a systolic array may, for example, be a homogeneous network of data processing units.
  • the data processing units may, for example, be tightly coupled and may, for example, be configured to compute a respective partial result using or based on a signal, e.g. data signal, from a neighboring data processing unit, and may further be configured to store a respective result and to provide the result to a neighboring data processing unit.
  • embodiments according to the invention are not limited to systolic arrays, in other words, usage of systolic arrays or apparatuses with systolic arrays may be one optional feature or embodiment.
  • a compiler, e.g. an architecture compiler, may be used in order to assemble the processing units and local blocks together.
  • local blocks of a non-volatile memory or of a non-volatile distributed memory, e.g. memory elements, and processing elements may be assembled together.
  • neural network inference may be well defined.
  • an architecture of the systolic array may allow to exploit characteristics of the neural network inference in order to provide a classification functionality with only low energy costs for control tasks, e.g. as in conventional approaches in the control path, cache control and/or branch prediction.
  • Usage of a systolic array architecture may further allow to achieve a synergistic reduction of energy consumption in combination with the n-ary weights.
  • the neural net is configured to perform a matrix multiplication, e.g. between a matrix defined by the ternary (or in general n-ary) neural net coefficients and a vector defined by the sensor signal, or by the preprocessed version of the sensor signal, or by the sensor data derived from the sensor signal.
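Such a matrix multiplication can be sketched as follows (hypothetical names, not from the patent): the ternary coefficient matrix is split into a sign matrix with entries in {-1, 0, +1} and a shared scale a = 2**shift, following the remark above that a is preferably a power of 2, so each row reduces to additions, subtractions and one final shift.

```python
def ternary_matvec(sign_matrix, shift, x):
    """Compute y = (a * S) @ x for S with entries in {-1, 0, +1}
    and a = 2**shift.

    Each row needs only additions/subtractions plus one left shift,
    since multiplying by a power of two is a shift on integers.
    """
    y = []
    for row in sign_matrix:
        acc = 0
        for s, xi in zip(row, x):
            if s == 1:
                acc += xi
            elif s == -1:
                acc -= xi
        y.append(acc << shift)  # multiply the partial sum by a = 2**shift
    return y

S = [[1, 0, -1],
     [-1, 1, 0]]
x = [2, 3, 1]                    # e.g. digitized sensor samples
print(ternary_matvec(S, 1, x))   # a = 2 -> [2, 2]
```

The vector x here stands for the sensor signal, its preprocessed version, or the derived sensor data mentioned above.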
  • the apparatus comprises a non-volatile memory, e.g. a RRAM (e.g. Resistive Random Access Memory) device, comprising multilevel cells, e.g. MLCs, or three-level cells, which may, for example, be configured to store three different values in a single memory cell.
  • the memory is configured to store the, e.g. ternary, or, generally, n-ary, coefficients, e.g. weights, of the neural net.
  • the inventors recognized that, using the n-ary coefficients, a synergistic reduction of the energy consumption may be achieved by using multi-level cells configured to store the coefficients.
  • the multi-level cells may be memory cells specifically adapted in order to save n-ary weight coefficients and to provide them for further processing. This way an odd number of coefficient states may, for example, be stored efficiently, e.g. without wasting hardware resources.
  • an inventive apparatus may further be configured to perform a training adapted to the characteristics of the multi-level cells, e.g. a memory cell aware training.
  • a quantization of the neural network may be performed directly to the capabilities of the memory weight cells, e.g. of the multi-level cells.
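A minimal sketch of such a quantization directly to the capabilities of three-level memory cells follows; the thresholding rule and all names are assumptions, since the patent does not specify a particular binning rule.

```python
def quantize_to_mlc(weights, a=1, threshold=0.3):
    """Bin trained real-valued weights onto the three storable
    levels of a multi-level cell: -a, 0, +a.

    Weights with magnitude at or below `threshold` map to 0.
    """
    levels = []
    for w in weights:
        if w > threshold:
            levels.append(a)
        elif w < -threshold:
            levels.append(-a)
        else:
            levels.append(0)
    return levels

print(quantize_to_mlc([0.8, -0.05, -1.3, 0.2]))  # [1, 0, -1, 0]
```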
  • non-volatile memory may allow to switch off the apparatus, e.g. in times wherein an analysis or classification is not needed, without losing neural network coefficients and/or internal programming.
  • the non-volatile memory may, for example, be a CMOS memory, and may hence be produced with low costs and integrated without increased effort with other circuitry elements.
  • the neural net is mapped onto a data flow of a systolic array.
  • the apparatus comprises a non-volatile distributed memory comprising multi-level cells or three-level cells.
  • the distributed memory is configured to store the coefficients of the neural net and the distributed memory is configured to provide the coefficients of the neural net to processing elements of the systolic array.
  • the distributed memory comprises a plurality of memory elements, wherein memory elements of the plurality of memory elements comprise the multi-level cells or three-level cells and the systolic array comprises the plurality of memory elements. Moreover, memory elements of the plurality of memory elements are configured to provide the coefficients of the neural net to processing elements of the systolic array.
  • the memory may comprise RRAM structures, that may, for example, be arranged within the systolic array.
  • the neural network coefficients e.g. the weights, may already be present at the place where they may be needed.
  • embodiments according to the invention may comprise the advantage that memory elements of a non-volatile distributed memory may be arranged or located or implemented within a systolic array, such that neural network weights may be provided directly within the array, e.g. in order to execute the neural network, which may allow to save energy costs and improve computational efficiency.
  • RRAMs may be configured to provide the distributed non-volatile memory.
  • the memory may be divided into several local blocks, e.g. memory elements, which may be connected in a dedicated manner to computation units, e.g. processing units of the systolic array.
  • the apparatus comprises a preprocessing unit, e.g. a preprocessing circuit, and/or a filter, and/or a microprocessor performing a preprocessing, and/or an ASIC performing a preprocessing, and/or a microcontroller performing the preprocessing, configured to preprocess the sensor signal, e.g. using one or more filtering operations, to obtain a preprocessed version of the sensor signal.
  • the apparatus may comprise a buffer, e.g. a double buffer, configured to store the preprocessed version of the sensor signal.
  • the preprocessed version of the sensor signal may, for example, be an analog signal that may be buffered or stored in the buffer.
  • the apparatus comprises a preprocessing unit, e.g. a preprocessing circuit or a microprocessor performing a preprocessing or a microcontroller performing the preprocessing, configured to preprocess the sensor signal, to obtain preprocessed sensor data.
  • the apparatus may comprise a buffer configured to store the preprocessed sensor data.
  • the sensor data may, for example, be a digital signal that may be buffered or stored in the buffer.
  • a holistic solution for sensor data evaluation may be provided.
  • the sensor signal may be preprocessed, such that an analysis and/or classification of the preprocessed sensor data may be performed efficiently, e.g. in an energy efficient manner.
  • the buffer may be configured to store said information.
  • a digitization may be performed before or after a preprocessing, or for example, after a buffering of the processed information.
  • the apparatus is configured to temporarily activate the neural net when a sufficient, e.g. predetermined, number of signal values of the input signal or a sufficient, e.g. predetermined, number of signal values of the preprocessed version of the input signal or a sufficient, e.g. predetermined, amount of sensor data has been accumulated in the buffer.
  • the preprocessed version of the sensor signal and/or the preprocessed sensor data may be stored or accumulated, before it is processed by a portion of the apparatus configured for the analysis and/or classification of the signal.
  • an analysis and/or classification portion may, for example, be turned off, hence reducing energy consumption, until a certain amount of sensor information, e.g. preprocessed version of the sensor signal and/or preprocessed sensor data, is buffered to be evaluated.
  • the apparatus is configured to constantly keep the preprocessing unit turned on, e.g. powered. Furthermore, the apparatus may be configured to intermittently activate, e.g. power up, the neural net, e.g. while keeping the neural net switched off at least 50% of the time, or at least 80% of the time, or at least 90% of the time, or at least 95% of the time.
  • an active evaluation unit of the apparatus may, for example, be sufficient for at most 50% of the time, or at most 20% of the time, or at most 10% of the time, or at most 5% of the time.
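The buffering and intermittent activation described above may be sketched as follows; the class, the window size and the stand-in classifier are hypothetical and only model the behaviour, not the patent's hardware.

```python
class BufferedClassifier:
    """Accumulate preprocessed samples and wake the (expensive)
    neural net only when a full analysis window is available,
    modelling the intermittent activation of the evaluation unit."""

    def __init__(self, window, classify):
        self.window = window      # samples required per inference
        self.classify = classify  # stands in for the neural net
        self.buffer = []
        self.results = []

    def push(self, sample):
        self.buffer.append(sample)
        if len(self.buffer) >= self.window:
            # the neural net would be powered up only for this call
            self.results.append(self.classify(self.buffer[:self.window]))
            del self.buffer[:self.window]

clf = BufferedClassifier(window=4, classify=lambda seg: sum(seg) > 0)
for s in [1, -2, 3, 1, -1, -1, -1, 0]:
    clf.push(s)
print(clf.results)  # [True, False]
```

Because the buffer retains all samples between inferences, no data is lost while the classifier is switched off.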
  • the method comprises inputting the sensor signal, or a preprocessed version of the sensor signal, or sensor data derived from the sensor signal, into a neural net, coefficients, e.g. weights, of which are quantized to be ternary weights, e.g. to take three possible values, in order to obtain an analysis result.
  • the method as described above is based on the same considerations as the above-described apparatus.
  • the method can, by the way, be completed with all features and functionalities, which are also described with regard to the apparatus.
  • the method comprises performing a memory cell aware training of the neural net, e.g. which may, for example, consider a quantization of the neural net coefficients and/or a number of available values for the coefficients.
  • a training of the neural network may, for example, be adapted to the specific usage of the n-ary coefficients. Therefore, training time may be reduced and/or classification quality may be improved.
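One common way to realize such memory cell aware training is a straight-through estimator, in which the forward pass uses the quantized ternary weights while updates are applied to real-valued shadow weights. This is a widely used technique for quantized networks, shown here for a single linear neuron with hypothetical names; the patent does not prescribe this particular method.

```python
def quantize(w, t=0.5):
    """Ternarize a real-valued weight to -1.0, 0.0 or +1.0."""
    return 1.0 if w > t else -1.0 if w < -t else 0.0

def train_step(shadow, x, target, lr=0.1):
    """One straight-through-estimator step for a linear neuron:
    forward with quantized weights, update the float shadow weights."""
    q = [quantize(w) for w in shadow]
    y = sum(wq * xi for wq, xi in zip(q, x))
    err = y - target
    # straight-through: the gradient w.r.t. the quantized weights
    # is applied directly to the shadow weights
    return [w - lr * err * xi for w, xi in zip(shadow, x)]

shadow = [0.6, -0.2, 0.1]
for _ in range(20):
    shadow = train_step(shadow, x=[1.0, 1.0, 1.0], target=2.0)
print([quantize(w) for w in shadow])  # [1.0, 0.0, 1.0]
```

The trained ternary weights reproduce the target (two weights of +1 on unit inputs give 2.0), even though only three weight values are available.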
  • Further embodiments according to the invention comprise a computer program for performing any of the above explained methods, when the computer program runs on a computer.
  • Fig. 1 shows a schematic view of an apparatus for analyzing a sensor signal according to embodiments of the invention
  • Fig. 2a shows a schematic view of an apparatus for analyzing a sensor signal with additional, optional features, according to embodiments of the invention
  • Fig. 2b shows a schematic view of an apparatus for analyzing a sensor signal with a distributed memory, according to embodiments of the invention
  • Fig. 3 shows a schematic block diagram of a method for analyzing a sensor signal according to embodiments of the invention
  • Fig. 4 shows an example of a processor architecture according to embodiments of the invention.
  • Fig. 5 shows a schematic system overview according to embodiments of the invention.
  • Equal or equivalent elements or elements with equal or equivalent functionality are denoted in the following description by equal or equivalent reference numerals even if occurring in different figures.
  • a plurality of details is set forth to provide a more thorough explanation of embodiments of the present invention.
  • embodiments of the present invention may be practiced without these specific details.
  • well-known structures and devices are shown in block diagram form rather than in detail in order to avoid obscuring embodiments of the present invention.
  • features of the different embodiments described hereinafter may be combined with each other, unless specifically noted otherwise.
  • Fig. 1 shows a schematic view of an apparatus for analyzing a sensor signal according to embodiments of the invention.
  • apparatus 100 comprising, as an optional feature, a neural network 110 (also referred to as neural net).
  • element 110 may be means to calculate or approximate an output of a neural network, e.g. a processing unit for executing the neural network.
  • apparatus 100 optionally comprises n-ary quantized neural network coefficients 120 (as an example, weights, which are coefficients of the neural network) and may hence be configured to provide the coefficients to the neural network (as indicated by arrow 122).
  • element 120 may be a memory comprising or storing the n-ary coefficients.
  • the neural network 110, e.g. the means to calculate and output a calculation result according to calculation rules determining or approximating the neural network, may comprise the coefficients 120.
  • Apparatus 100 may be configured to input a sensor information 102 into the neural network 110.
  • the sensor information may, for example, be at least one of a sensor signal, a preprocessed version of the sensor signal, and/or sensor data derived from the sensor signal.
  • the neural network 110 may be configured to obtain an analysis result 104.
  • the sensor information 102 may be a bitstream comprising digitized sensor measurements, e.g. preprocessed sensor measurements.
  • the sensor information 102 may be determined or measured by a wearable sensor, e.g. a sensor configured to provide measurement information for monitoring a state or health of a human, e.g. of a human’s heart.
  • the neural network 110 may be trained and/or configured to analyze and/or to classify and/or to evaluate the sensor information. Therefore, as an example, the neural network 110 may use the n-ary weights.
  • the inventors recognized that, using n-ary weights, an efficient, e.g. energy efficient, apparatus 100 for analyzing a sensor information 102, e.g. a sensor signal, may be provided.
  • apparatus 100 may, for example, be integrated in a wearable health monitoring device, e.g. comprising a wearable sensor and apparatus 100 for gathering and/or measuring and evaluating long term health data.
  • small batteries or accumulators may be used, which may allow such a device to be worn comfortably.
  • long-term health monitoring may be possible because of an ability of such a device to operate without recharging for a long period of time.
  • the n-ary coefficients 120 may optionally be integer coefficients of the neural network that are symmetric to zero, e.g. -a, 0, a, with a being an integer.
  • the coefficients may be ternary weights, e.g. weights -1, 0, 1.
  • an analysis result 104 may be provided by the neural network 110 based on the sensor information 102 and the n-ary coefficients 120.
  • the analysis result may comprise information about a body functionality, e.g. the functionality or state or health of an organ, e.g. of a human or an animal.
  • the result 104 may, for example, be provided to the human or a supervising unit (e.g. via wireless communication) in order to provide a warning for an abnormal state of the body functionality.
  • apparatus 100 may optionally comprise a memory in order to store the analysis result 104 and, for example, corresponding sensor information 102 of a corresponding period of time, e.g. a period of time in which the neural network detected unusual sensor information.
  • Fig. 2a shows a schematic view of an apparatus for analyzing a sensor signal with additional, optional features, according to embodiments of the invention.
  • Fig. 2a shows apparatus 200a comprising a neural network 210a, e.g. means to calculate or approximate an output of a neural network, e.g. a processing unit for executing the neural network.
  • the neural net 210a may be mapped onto a data flow of a systolic array 212a.
  • the neural network 210a may be implemented using a two-dimensional array of processing elements 214.
  • element 210a may be a processing unit comprising a systolic array architecture for approximating or calculating or evaluating an output of the neural network.
  • the systolic array 212a may, for example, be a homogeneous network of processing elements 214, e.g. of primitive computing nodes.
  • the processing elements 214 may, for example, be hardwired or software configured, e.g. using FPGAs.
  • processing elements 214 may comprise programmable interconnects, e.g. adapted according to the n-ary weights. Furthermore, processing elements 214 may, for example, be substantially identical. The processing elements 214 may, for example, be triggered by the arrival of new data, e.g. of new sensor information at a respective processing element 214.
  • One advantage of such an architecture may, for example, be that processing data and partial results may be stored within the systolic array, e.g. while passing through the array. Hence, usage of external buses, main memory and/or internal caches may, for example, be omitted. Furthermore, such an architecture may provide good classification results, e.g. for obtaining a robust and accurate analysis result 204 based on a sensor information 202 e.g. with low energy costs.
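The dataflow through such tightly coupled processing elements can be modelled as follows; PE and run_chain are hypothetical names, and this linear chain is only a 1-D illustration of the systolic principle, not a cycle-accurate model of the two-dimensional array.

```python
class PE:
    """Processing element of a systolic chain: holds one stationary
    ternary weight and, when data arrives, combines the incoming
    sample with the partial sum from its neighbour and passes the
    updated partial sum on."""

    def __init__(self, weight):
        self.weight = weight  # ternary coefficient, stored locally

    def step(self, sample, partial_in):
        return partial_in + self.weight * sample

def run_chain(weights, samples):
    """Stream samples through the chain; partial results stay inside
    the array instead of going through external buses or caches."""
    partial = 0
    for pe, x in zip([PE(w) for w in weights], samples):
        partial = pe.step(x, partial)
    return partial

print(run_chain([1, -1, 0, 1], [2, 3, 7, 4]))  # 2 - 3 + 4 = 3
```

Keeping the partial sums inside the array is what lets the architecture avoid external memory traffic, which dominates energy cost in conventional processors.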
  • apparatus 200a optionally comprises a non-volatile memory 220a, e.g. a RRAM device, configured to store the coefficients of the neural network, wherein the memory 220a comprises multi-level cells, e.g. MLCs.
  • non-volatile memory 220a comprises three-level cells 224 as multi-level cells, which are, as optionally shown, configured to store three different values (-a, 0, a) in a single memory cell.
  • the memory may, for example, be a central or centralized memory or a distributed memory, e.g. a memory comprising a plurality of local memory elements that may be coupled in a dedicated manner with processing elements of the apparatus 200a.
  • the n-ary weights may, e.g. as shown in Fig. 2a, be ternary weights.
  • a dedicated processing architecture may be used in order to achieve a low energy consumption.
  • the n-ary weights may provide a good trade-off between an analyzation quality of the sensor information 202 using the neural network 210a and computational and processing costs and hence energy consumption.
  • the storing of the n-ary weights may, as shown in Fig. 2a, be performed with low energy costs using the multi-level cells. This may, for example, mitigate a possible disadvantage of using an odd number of signed weights, since the multi-level cells may allow to store such weights efficiently.
  • the coefficients of the neural net 210a may be quantized to take one of the three values -a, 0, and +a, wherein a is a real valued number, e.g., preferably, a power of 2, since this may allow for a simple multiplication, or where a is a complex number.
  • the coefficients may be -1, 0, 1.
  • the inventors recognized that ternary weights may allow an accurate classification of sensor data with low energy costs.
  • the apparatus may, for example, further be configured to perform a matrix multiplication.
  • the ternary weights may, for example, be provided (e.g. as indicated by arrow 222a) by the memory 220a to the neural network 210a in the form of a matrix.
  • the neural network 210a may further receive the sensor information 202 in the form of a vector and may hence determine the analysis result 204 based on a multiplication of the matrix and the vector.
  • a multiplication may be performed with low energy costs.
  • apparatus 200a optionally comprises a preprocessing unit 230.
  • Preprocessing unit 230 may be configured to preprocess a sensor signal 206, provided to the apparatus 200a, e.g. by a sensor, e.g. a wearable sensor, to obtain the sensor information 202, e.g. a preprocessed version of the sensor signal 206 and/or preprocessed sensor data. Therefore, the preprocessing unit 230 may, for example, be or comprise at least one of a preprocessing circuit, and/or a filter (e.g. for performing one or more filter operations), and/or a microprocessor (e.g. for performing the preprocessing), and/or an ASIC (e.g. for performing the preprocessing), and/or a microcontroller (e.g. for performing the preprocessing).
  • the preprocessing unit 230 may receive an analog or digital sensor signal 206.
  • the preprocessing unit 230 may, for example, comprise an analog-to-digital converter in order to digitize the sensor signal.
  • preprocessing unit 230 may be configured to convert sensor signal 206 into sensor information 202, wherein sensor information 202 may be adapted to be easily, e.g. efficiently, processed by neural network 210a.
  • Buffer 240 e.g. a double buffer, may be configured to store the sensor information 202.
  • sensor signal 206 may be provided to the apparatus 200a significantly slower than a processing speed of the apparatus 200a (e.g. of neural network 210a).
  • the neural network 210a may be temporarily deactivated until a certain amount of sensor information 202 is accumulated in buffer 240.
  • the apparatus 200a may be configured to temporarily activate the neural net 210a when a sufficient, e.g. predetermined, number of signal values of the input signal 206 or a sufficient, e.g. predetermined, number of signal values of the preprocessed version 202 of the input signal or a sufficient, e.g. predetermined amount of sensor data 202 has been accumulated in the buffer 240.
  • apparatus 200a may optionally comprise an activation unit 250.
  • the activation unit may receive a buffer information 242, the buffer information indicating an amount of sensor information 202 stored in the buffer 240, e.g. a number of signal values, e.g. an amount of sensor data. Based thereon, the activation unit 250 may provide an activation signal 252 to the neural network 210a.
  • the activation signal may stimulate or start a processing of the sensor information 202 provided by the buffer 240 to the neural network 210a.
  • activation signal 252 may as well be used to deactivate, e.g. to turn off neural network 210a.
  • apparatus 200a may be configured to maintain the preprocessing unit 230 in an activated state for a longer period of time than the neural network 210a.
  • the preprocessing unit 230 may always be turned on, e.g. powered, e.g. in order to allow for a constant or streaming preprocessing of incoming sensor signal information 206.
  • the activation unit may activate the neural network 210a.
  • activation unit 250 may be configured to intermittently activate, e.g. power up, the neural net.
  • evaluation of a long-term ECG may be performed by a doctor several days after collecting the measurements.
  • the activation unit may, for example, deactivate the neural network, using activation signal 252, for longer periods of time. It is to be noted that in this time no data or information may be lost, because of the storing in the buffer 240.
  • the neural network 210a may, for example, be switched off at least 50% of the time, or at least 80% of the time, or at least 90% of the time, or at least 95% of the time, hence significantly reducing energy costs.
  • Fig. 2b shows a schematic view of an apparatus for analyzing a sensor signal with a distributed memory, according to embodiments of the invention.
  • apparatus 200b comprises a non-volatile distributed memory 220b.
  • the non-volatile distributed memory 220b may optionally comprise, as shown, a plurality of memory elements 226, wherein memory elements of the plurality of memory elements comprise multi-level cells or three-level cells.
  • memory elements 226 of the plurality of memory elements may be configured to provide 222b the coefficients of the neural net 210b to processing elements 214 of the systolic array 212b.
  • the neural net 210b may be mapped onto a data flow of a systolic array 212b, wherein the systolic array 212b may optionally comprise a plurality of memory elements 226 of a non-volatile distributed memory 220b, in order to provide the coefficients, e.g. the n-ary weights, of the neural net 210b to the processing elements 214, e.g. in order to approximate or execute the neural network.
  • one memory element 226 may provide n-ary weights to a plurality of processing elements 214.
  • each processing element 214 may receive weights from a single memory element.
  • one memory element may provide weights to an arbitrary number of processing elements.
  • Fig. 3 shows a schematic block diagram of a method for analyzing a sensor signal according to embodiments of the invention.
  • Method 300 may, for example, be a method for classifying a sensor signal, e.g. sensor signal 206 as explained in the context of Fig. 2a or sensor information 102, 202 as explained in the context of Fig. 1 and respectively Fig. 2a.
  • a signal may, for example, be a signal of a wearable sensor; e.g. an electrocardiogram signal or an ECG signal, e.g. for a detection of an atrial fibrillation on the basis of an electro-cardiogram.
  • Method 300 comprises inputting 310 the sensor signal, or a preprocessed version of the sensor signal, or sensor data derived from the sensor signal, into a neural net, coefficients, e.g. weights, of which are quantized to be ternary weights, e.g. to take three possible values, in order to obtain an analysis result.
  • method 300 further comprises performing 320 a memory cell aware training of the neural net, which may, for example, consider a quantization of the neural net coefficients and/or a number of available values for the coefficients. It is to be noted that, according to a specific application or goal, steps 310 and 320 may be performed in an arbitrary order. As an example, for a training of a neural network, a sensor signal may be input into the neural network and the weights of the neural network may be determined based on a subsequent training of the neural network.
  • a training of the neural network may be performed based on artificial data first, and for a subsequent application, the sensor signal may be input into the neural network in order to analyze the sensor signal.
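The patent does not spell out the memory cell aware training in code; as one plausible sketch of its quantization stage, real-valued weights could be snapped to the three states a three-level memory cell can hold. The threshold value and the function name below are illustrative assumptions, not taken from the patent.

```python
# Hypothetical sketch of the quantization step a "memory cell aware"
# training loop might apply: real-valued weights are snapped to the
# three states of a three-level memory cell (-1, 0, +1). The 0.5
# threshold is an illustrative choice, not a value from the patent.

def quantize_ternary(weights, threshold=0.5):
    """Map each real-valued weight to -1, 0 or +1."""
    out = []
    for w in weights:
        if w > threshold:
            out.append(1)
        elif w < -threshold:
            out.append(-1)
        else:
            out.append(0)
    return out

print(quantize_ternary([0.9, -0.2, -1.3, 0.4]))  # [1, 0, -1, 0]
```

In a full training loop, such a step would typically run after each weight update, so that the network learns under the same quantization constraints the memory cells impose at inference time.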
  • embodiments according to the invention may comprise devices and methods for energy-efficient neural network calculations using a processor field (e.g. a systolic array) based on non-volatile processor elements and memory elements and correspondingly optimized (e.g. optimized with regard to the beforementioned elements and/or architecture) n-ary weights.
  • Embodiments according to the invention may comprise a detection of AFIB via neuromorphic hardware. Embodiments may allow an extremely high energy efficiency due to a combination of n-ary weight memory elements, e.g. realized by RRAM. Furthermore, embodiments according to the invention may comprise data-flow oriented processing architectures and non-volatile memory elements for a realization of power cycles.
  • features and functionalities disclosed herein relating to a method can also be used in an apparatus (configured to perform such functionality).
  • any features and functionalities disclosed herein with respect to an apparatus can also be used in a corresponding method.
  • the methods disclosed herein can be supplemented by any of the features and functionalities described with respect to the apparatuses.
  • any of the features and functionalities described herein can be implemented in hardware or in software, or using a combination of hardware and software, as will be described in the section “implementation alternatives”.
  • aspects are described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus.
  • Some or all of the method steps may be executed by (or using) a hardware apparatus, like, for example, a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be executed by such an apparatus.
  • embodiments of the invention can be implemented in hardware or in software.
  • the implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a Blu-Ray, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer readable.
  • Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed.
  • embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer.
  • the program code may for example be stored on a machine readable carrier.
  • further embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
  • an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
  • a further embodiment of the inventive methods is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein.
  • the data carrier, the digital storage medium or the recorded medium are typically tangible and/or non-transitory.
  • a further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein.
  • the data stream or the sequence of signals may for example be configured to be transferred via a data communication connection, for example via the internet.
  • a further embodiment comprises a processing means, for example a computer, or a programmable logic device, configured to or adapted to perform one of the methods described herein.
  • a further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
  • a further embodiment according to the invention comprises an apparatus or a system configured to transfer (for example, electronically or optically) a computer program for performing one of the methods described herein to a receiver.
  • the receiver may, for example, be a computer, a mobile device, a memory device or the like.
  • the apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.
  • a programmable logic device (for example a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein.
  • a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein.
  • the methods are preferably performed by any hardware apparatus.
  • the apparatus described herein may be implemented using a hardware apparatus, or using a computer, or using a combination of a hardware apparatus and a computer.
  • the apparatus described herein, or any components of the apparatus described herein, may be implemented at least partially in hardware and/or in software.
  • the methods described herein may be performed using a hardware apparatus, or using a computer, or using a combination of a hardware apparatus and a computer.
  • the methods described herein, or any components of the apparatus described herein, may be performed at least partially by hardware and/or by software.
  • a wearable Neuronal Network Processor Architecture for AFIB Detection As an example and inter alia, embodiments according to the invention comprising or for a wearable neuronal network processor architecture for AFIB detection may be discussed in the following.
  • the detection of AFIB may be a very important task to diagnose heart disease.
  • Embodiments according to the invention, for example in order to perform such a task, are presented in the following.
  • Embodiments according to the invention for example therefore comprise or are related to an ultra-low power computer architecture for classification of ECG for wearable sensors.
  • a new architecture, or the new architecture, according to embodiments of the invention may combine three approaches, e.g. to gain significant, large or even maximum energy savings: (1) usage of systolic arrays, e.g. to avoid control overhead, (2) high quantization, e.g. by using ternary weights, and (3) novel non-volatile multi-level memory cells, e.g. for storing weights.
  • embodiments of the invention may comprise or use any of these approaches separately or in combination with any other approaches. With approaches according to embodiments of the invention up to 94.7% energy may be saved, compared to traditional processor architectures.
  • NN is used as an abbreviation for neuronal networks (neural networks).
  • the inference of NN may be mainly based on matrix multiplications.
  • the architecture may comprise, separately or in combination, or may combine, one or more of the following three features.
  • Standard CPUs may be generic compute cores for arbitrary applications but spend substantial energy in the control path, e.g. for cache control or branch prediction.
  • the inventors have realized that the domain of the inference of NN may be well-defined.
  • embodiments according to the invention may comprise, for example, a more specialized architecture, for example to reduce control logic as much as possible.
  • Devices according to the invention may comprise, or methods according to the invention may use, systolic arrays, a data flow driven architecture, for example organized in a 2D array fashion. In other words, embodiments of the invention may focus on the usage of systolic arrays. While data flows through this array, according to embodiments, a matrix multiplication may be inherently performed.
  • a high quantization of the weights may be able to save energy, for example due to at least one of a lower bit width, a smaller data path and less net activity. In most traditional approaches this was done to fit in existing register widths (e.g. 8 bits or, in extreme cases, only 1 bit).
  • quantization may be performed to three states. With three states (-1, 0, +1), the multiplication may be very energy efficient for example similar to binary NN, but may deliver better accuracy for example due to the presence of an additional state (-1).
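Because each weight takes only one of the three states -1, 0 and +1, the multiplications of a dot product reduce to additions, subtractions or skips. The following is a minimal sketch of this multiply-free accumulation; the function name and values are illustrative, not taken from the patent.

```python
# With weights restricted to {-1, 0, +1}, a multiply-accumulate needs
# no multiplier at all: +1 adds the input, -1 subtracts it, 0 skips it.

def ternary_dot(weights, inputs):
    acc = 0
    for w, x in zip(weights, inputs):
        if w == 1:
            acc += x     # add instead of multiply
        elif w == -1:
            acc -= x     # subtract instead of multiply
        # w == 0: no operation, saving the energy of a multiplication
    return acc

print(ternary_dot([1, -1, 0, 1], [5, 2, 7, 3]))  # 5 - 2 + 3 = 6
```

This also illustrates the accuracy argument from the bullet above: compared to binary weights {0, 1}, the extra -1 state lets a weight actively subtract an input rather than merely ignore it.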
  • embodiments according to the invention may comprise specialized RRAM devices, for example, with MLC capabilities.
  • embodiments of the invention comprise a new method, called memory cell aware training of the NN, which may perform the quantization of the NN directly to the capabilities of the memory weight cells.
  • RRAM may provide a non-volatile memory technology, which may make it possible to store weights permanently inside the systolic array, for example even if the power is switched off. For example, due to its CMOS compatibility, it could be embedded deep in the systolic array.
  • the device can be switched off without losing its internal programming. This may allow a huge energy saving, for example every time the chip processing the NN inference is not needed.
  • the RRAM devices can be reprogrammed, for example to address different (medical) applications.
  • FIG. 4 an example of a processor architecture according to embodiments of the invention, e.g. to detect AFIB in wearable ECG sensors, is shown.
  • apparatus 400 comprises a neural network (NN) architecture 410 and a preprocessing unit 420.
  • the NN architecture 410 may comprise means 412 to execute or process the neural network.
  • the NN architecture 410 may, for example, comprise RRAM blocks 414 to store the coefficients, e.g. weights of the neural network.
  • preprocessing unit 420 may comprise a RRAM Manager 428, wherein the RRAM Manager 428 may, for example, be configured to configure the neural network, e.g. means 412.
  • RRAM Manager 428 may, for example, be configured to adapt the weights stored in RRAM blocks 414. Therefore, the RRAM blocks 414 may, for example, be coupled to each other.
  • RRAM Manager 428 may be coupled with each of the RRAM blocks 414 individually. New coefficients to be stored in RRAM blocks 414 may, for example, be provided by an external RRAM interface 430.
  • weights stored in the RRAM blocks may, for example, be read out, e.g. provided to an external device, e.g. via the external RRAM interface 430.
  • Preprocessing unit 420 comprises, as an optional feature, means 422 to preprocess a sensor signal and a buffer 424.
  • buffer 424 may be a double buffer, e.g. comprising two SRAM (e.g. static random-access memory) blocks 426.
  • buffer 424 may, for example, receive a preprocessed sensor signal 401 (e.g. a preprocessed version of the sensor signal or preprocessed sensor data) from the preprocessing means 422. For example, when a certain amount of preprocessed information is gathered or stored in buffer 424, the buffer may provide the information to the NN architecture 410.
  • the sensor information 402 provided to the NN architecture may be the, or may be equal to the, preprocessed sensor signal 401.
  • the NN architecture 410 may, for example, be deactivated in order to save energy (e.g. until the information is provided from the buffer 424). Therefore, as optionally shown, apparatus 400 may comprise an activation unit 440 (Pwr switch, e.g. a power switch), which may be provided with a signal 442 from the preprocessing unit 420.
  • the signal 442 may be a command or an impulse to activate or to enable power (e.g. pwr_ena signal) of the NN architecture 410.
  • the signal 442 may, for example, be a buffer information indicating a sufficient amount of sensor information buffered, for a subsequent analyzation using the NN architecture 410.
  • a power switch may, for example, be configured to activate or to deactivate the NN architecture 410, e.g. by providing or disabling one or more supply voltages.
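The buffer-then-wake scheme of the bullets above could be modeled, for example, as follows. The class name, window length and return convention are illustrative assumptions; in hardware, the classification would of course run while the core is powered, before it is switched off again.

```python
# Illustrative sketch of the buffer-then-wake scheme: the NN core stays
# powered down while samples accumulate; once the buffer holds a full
# analysis window, the power switch enables the core (pwr_ena), the
# window is handed over for classification, and the core is switched
# off again. The window length is an assumed parameter.

class BufferedActivation:
    def __init__(self, window_size):
        self.window_size = window_size
        self.buffer = []          # corresponds to buffer 424
        self.core_active = False  # state of the NN architecture 410
        self.wakeups = 0

    def push_sample(self, sample):
        self.buffer.append(sample)
        if len(self.buffer) >= self.window_size:
            self.core_active = True        # pwr_ena asserted
            self.wakeups += 1
            window, self.buffer = self.buffer, []
            self.core_active = False       # core powered down afterwards
            return window                   # would be fed to the NN here
        return None                         # core stays off, keep buffering

ctrl = BufferedActivation(window_size=4)
windows = [w for s in range(10) if (w := ctrl.push_sample(s)) is not None]
print(len(windows), ctrl.wakeups)  # 2 2
```

The energy benefit comes from the core being off for all the `None` cases, i.e. for most samples.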
  • activation unit 440 may comprise or regulate or control a power supply, configured to provide a first supply voltage VDD33, e.g. of 3.3 V and a second supply voltage VDD12, e.g. of 1.2 V.
  • the NN architecture may be provided with more than one voltage level, e.g. for different functionalities.
  • means 412 and RRAM blocks 414 may be provided with a lower supply voltage than interfaces of the NN architecture 410.
  • the preprocessing unit 420 may as well be provided with a supply voltage, e.g. VDD12, e.g. of 1.2 V, for example, with a same voltage level as one of the voltage levels provided to the NN architecture 410.
  • the preprocessing unit 420 may, for example, be provided with a sensor signal 406, e.g. as shown via a recording interface 432.
  • the recording interface 432 may be an input/output, e.g. a bidirectional interface.
  • an analysis result 434 from means 412 may be provided to an external device via the interface 432.
  • preprocessing unit 420 may, for example, preprocess the sensor signal 406, e.g. raw measurement data, in order to provide the preprocessed sensor signal 401 for the NN architecture 410.
  • preprocessing unit 420 comprises a switch 436, that may be controlled by a bypass value 438 (e.g. cfg_enabel_bypass).
  • the processed sensor information from means 422 may be neglected and the sensor signal 406 may be provided to the NN architecture 410.
  • sensor information 402 may be equal to sensor signal 406.
  • means 422 may optionally comprise a window invalidation functionality.
  • means 422 may be configured to detect a disturbance of the sensor signal 406 and may therefore, as an example, mark certain sensor signal data as invalid, such that it is not used for a subsequent classification.
  • an electromagnetic disturbance may be detected that may be different from an abnormality of the sensor signal data, for example caused by an irregular heartbeat. This may allow a more robust analyzation of the sensor signal, e.g. with less false alarms.
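A window invalidation step of this kind could, as one hypothetical sketch, drop windows whose samples leave a plausibility range. The range and the function name below are illustrative assumptions, not values from the patent.

```python
# Hypothetical sketch of window invalidation: windows containing samples
# outside a plausibility range (e.g. clipped by an electromagnetic
# disturbance) are marked invalid and excluded from classification.
# The [-5.0, 5.0] range is an illustrative assumption.

def valid_windows(windows, lo=-5.0, hi=5.0):
    """Keep only windows in which every sample lies inside [lo, hi]."""
    return [w for w in windows if all(lo <= s <= hi for s in w)]

ws = [[0.1, 0.3], [0.2, 9.9], [-0.4, 0.0]]
print(len(valid_windows(ws)))  # 2 (the window containing 9.9 is dropped)
```

A practical disturbance detector would likely use more than a static range (e.g. spectral features), but the effect on the downstream classifier is the same: disturbed windows never reach it, reducing false alarms.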
  • apparatus 400 may be configured to be provided with a continuous mode signal 439 (e.g. cfg_enable_continous_mode_n). Based on such a signal, apparatus 400 may, for example, be configured to keep the NN architecture 410 activated, for example, irrespective of an amount of buffered data that is not fulfilling a threshold for an activation of the NN architecture 410.
  • the chip, e.g. apparatus 400, is divided, as an example, into two parts.
  • the preprocessing of ECG data and a buffer e.g. 424 is shown.
  • the incoming data may be processed by different filters and may be finally accumulated in the double buffers e.g. 424.
  • an example of an actual NN architecture, e.g. 410, is shown, which may be based on the aforementioned systolic array.
  • the explained RRAM blocks, e.g. 414, which may, for example, store the ternary weights, are shown or are recognizable.
  • the complete power of the right part, e.g. 410, can, for example, be switched off when it is not needed, which may be the case in a high percentage of time, for example in 99.8% of the time.
  • a classification using the NN may be executed, for example to detect if an AFIB occurred.
  • the real-time classification of ECG data of 12.66 seconds with a 512 Hz sample rate using the explained architecture may require 12.356 uJ of energy. This is about 94.7% less compared to a traditional architecture without non-volatile ternary weight RRAM.
  • the NN contains 1520 ternary weights, used in four convolution layers and two fully connected layers.
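The reported figures allow a quick back-of-the-envelope cross-check: if 12.356 uJ corresponds to a 94.7% reduction, the implied energy of the traditional baseline follows directly. The arithmetic below is a derived estimate, not a value stated in the patent.

```python
# Cross-check of the reported saving: if the proposed architecture needs
# 12.356 uJ per 12.66 s window and this is 94.7 % less than a traditional
# architecture, the baseline energy implied by these numbers is:
proposed_uj = 12.356
saving = 0.947
baseline_uj = proposed_uj / (1.0 - saving)
print(round(baseline_uj, 1))  # ~233.1 uJ implied for the baseline
```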
  • Fig. 5 shows a schematic system overview according to embodiments of the invention.
  • Fig. 5 shows apparatus 500 comprising an optional input/output (I/O) interface 510, a main unit 520 and a processing core 530.
  • the main unit 520 optionally comprises a power control unit (pwr control) 522, a preprocessing unit 524, a buffer 526 and a control logic 528.
  • the processing core 530 optionally comprises a plurality of RRAM memory blocks 532 and a neural network 534.
  • the I/O interface 510 may provide a sensor signal 512 to the preprocessing unit 524.
  • the pre-processing unit 524 may provide a sensor information 525, e.g. a preprocessed version of the sensor signal or preprocessed sensor data to the buffer 526 which may be configured to store the sensor information 525.
  • the power control unit 522 may be configured to temporarily activate the processing core 530 (or, for example, selectively the neural network 534 of the processing core), e.g. via an activation signal 523 or vice versa to deactivate the processing core. This may, for example, be performed based on a number of buffered sensor information data points in buffer 526.
  • the sensor information 525 may be provided by the buffer to the neural network 534, e.g. when the processing core 530 is activated using activation signal 523. Based on the sensor information 525 the neural net 534 may provide an analysis result 535 to the I/O interface 510.
  • the neural network 534 is provided with neural network coefficients 532 in the form of n-ary weights. It is to be noted that the RRAM memory blocks 532 may form a distributed non-volatile memory that may be part of a systolic array used to execute the neural network 534. The memory blocks 532 are optionally controlled by the control logic 528.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Cardiology (AREA)
  • Computing Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Surgery (AREA)
  • Medical Informatics (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computer Hardware Design (AREA)
  • Psychiatry (AREA)
  • Fuzzy Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physiology (AREA)
  • Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)
  • Signal Processing (AREA)
  • Neurology (AREA)

Abstract

Embodiments according to the invention comprise an apparatus for analyzing, e.g. classifying, a sensor signal, e.g. a signal of a wearable sensor; e.g. an electrocardiogram signal or an ECG signal, e.g. for a detection of an atrial fibrillation on the basis of an electro-cardiogram. Furthermore, the apparatus is configured to input the sensor signal, or a preprocessed version of the sensor signal, or sensor data derived from the sensor signal, into a neural net (e.g. a neural network), coefficients, e.g. weights, of which are quantized to be n-ary weights, wherein, as an example, n is, preferably, a non-negative odd integer number which is larger than or equal to 3, e.g. ternary weights, e.g. to take three possible values or, e.g. 5-ary weights, which may, for example, take values of -2, -1, 0, +1, +2, or, e.g. 7-ary weights, in order to obtain an analysis result.

Description

Apparatus, Method and Computer Program for Analyzing a Sensor Signal
Description
Technical Field
Embodiments according to the invention are related to apparatuses, methods and computer programs for analyzing a sensor signal.
Further embodiments according to the invention are related to a wearable neuronal network processor architecture for AFIB (e.g. atrial fibrillation) detection.
Further embodiments according to the invention are related to devices and methods for energy-efficient neural network calculations using a processor field (e.g. a systolic array) based on non-volatile processor elements and memory elements and correspondingly optimized n-ary weights.
Background of the Invention
Atrial fibrillation (AFIB, AF or A-fib) may be one of the most common serious abnormal heart rhythms. Atrial fibrillation may comprise rapid and irregular beating of atrial chambers of the heart and may lead to blood clots in the heart. Furthermore, AFIB may increase the risk of severe medical incidents, such as stroke and heart failure.
One challenge related to AFIB may be that many people do not have symptoms, and even episodes of AFIB may occur irregularly. Hence, it may be difficult to detect AFIB. Therefore, devices for AFIB detection may have to be worn by a patient over a long period of time in order to accumulate and classify measurement data.
For practical applications such devices should be of small dimensions and with long operating times. Therefore, it is desired to provide a concept which makes a better compromise between a quality and accuracy of a data classification, a complexity, a size and an operating time of a corresponding device for performing said concept.
This is achieved by the subject matter of the independent claims of the present application.
Further embodiments according to the invention are defined by the subject matter of the dependent claims of the present application.
Summary of the Invention
Embodiments according to the invention comprise an apparatus for analyzing, e.g. for classifying, a sensor signal, e.g. a signal of a wearable sensor; e.g. an electrocardiogram signal or an ECG signal, e.g. for a detection of an atrial fibrillation on the basis of an electro-cardiogram.
Furthermore, the apparatus is configured to input the sensor signal, or a preprocessed version of the sensor signal, or sensor data derived from the sensor signal, into a neural net (e.g. a neural network), coefficients, e.g. weights, of which are quantized to be n-ary weights, wherein, as an example, n is, preferably, a non-negative odd integer number which is larger than or equal to 3, e.g. ternary weights (for example for n=3, e.g. to take three possible values) or, e.g. 5-ary weights, which may, for example, take values of -2, -1, 0, +1, +2, or, e.g. 7-ary weights, in order to obtain an analysis result.
The inventors recognized that usage of a neural network (e.g. neural net) may allow an accurate and reliable analysis and/or classification of a sensor signal, in order to obtain an analysis result. As an example, a respective sensor may provide an information about a heart rhythm to be analyzed or classified, e.g. as an analog or digital signal, e.g. as a voltage or a current and/or as a bitstream.
Furthermore, the inventors recognized that using n-ary neural network coefficients, e.g. the weights -1, 0, 1 (e.g. for n=3), an energy consumption for the analyzation of the sensor signal may be kept at a low level. As an example, the coefficients may be quantized or binned to n states. As explained before, n may be an odd number, and the weights may, for example, be signed weights that are symmetric around zero, e.g. -1, 0, 1 or -2, -1, 0, 1, 2.
On the one hand, the inventors recognized that a good trade-off between an energy consumption and a classification robustness and/or classification accuracy of an inventive apparatus may be achieved due to the n-ary quantization. The n-ary quantization may, for example, be a high or coarse quantization, saving energy because of at least one of a lower bit width, a smaller data path and/or less net activity. In contrast to a coarse quantization with weights that are not n-ary quantized, good analyzation results may be achieved despite the coarse quantization, for example because of the usage of signed, symmetric (around zero) weights, hence providing the good trade-off.
The inventors recognized that with an n-ary quantization, e.g. with the three states -1, 0, 1 for the neural network weights, a multiplication with the weights may be performed in an energy efficient manner. The multiplication may, for example, be similarly energy efficient as a multiplication with similar unsigned weights, e.g. for the three state example, the binary weights 0, 1, but with better classification results because of the additional, signed weight (e.g. -1).
Consequently, with reduced energy consumption, an inventive apparatus may be worn by a patient for a long period of time without the need for a changing of batteries or a recharging of an accumulator. In addition, the inventive apparatus may be built very small, due to only small batteries or a small accumulator, because of the low energy consumption, hence allowing to carry the device comfortably.
Furthermore, the inventors recognized that using n-ary coefficients, specialized memory elements, e.g. with multi-level cell capabilities, and processing architectures, e.g. systolic arrays, may be used for an energy efficient storing of the weights and processing of the sensor signal, e.g. without wasting any or many hardware resources, e.g. to further increase the energy efficiency. In other words, the inventive concept may allow a synergistic usage of dedicated hardware and/or hardware architecture to further increase energy efficiency, hence reducing energy costs for storing coefficients and for processing the sensor signal using the coefficients.
According to further embodiments of the invention, the coefficients, e.g. weights, of the neural net are quantized to take one of the three values -a, 0, and +a, e.g. one of the values -1, 0 and +1, where a is a real valued number, e.g., preferably, a power of 2, since this allows for a simple multiplication. Alternatively, a may be a complex number.
The inventors recognized that usage of three values for the coefficients may allow a good trade-off between computational complexity, and therefore need of energy, and a quality and/or accuracy and/or robustness of the analysis and/or classification of the sensor signal.
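The remark above that a is preferably a power of 2 "since this allows for a simple multiplication" can be made concrete: multiplying by a = 2^k is a left shift, so a weight in {-a, 0, +a} costs at most a shift and a sign flip. The function below is an illustrative sketch; its name and the integer-only restriction are assumptions, not taken from the patent.

```python
# Sketch of why choosing a as a power of 2 keeps the multiplication
# simple: multiplying an integer by a = 2**k is a left shift, so a
# ternary weight in {-a, 0, +a} needs at most a shift plus a sign flip,
# never a full multiplier.

def mul_by_ternary_weight(x, sign, k):
    """Multiply integer x by w = sign * 2**k, with sign in {-1, 0, +1}."""
    if sign == 0:
        return 0              # zero weight: no operation at all
    shifted = x << k          # multiplication by 2**k as a bit shift
    return shifted if sign == 1 else -shifted

print(mul_by_ternary_weight(5, -1, 2))  # 5 * (-4) = -20
```

For a = 1 (k = 0) the shift degenerates to a pass-through, matching the plain -1, 0, +1 case discussed above.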
According to further embodiments of the invention, the neural net is mapped onto a data flow of a systolic array, e.g. implemented using a two-dimensional array of processing elements.
As an example, a systolic array may, for example, be a homogeneous network of data processing units. The data processing units may, for example, be tightly coupled and may, for example, be configured to compute a respective partial result using or based on a signal, e.g. data signal, from a neighboring data processing unit, and may further be configured to store a respective result and to provide the result to a neighboring data processing unit.
However, it is to be noted that embodiments according to the invention are not limited to systolic arrays; in other words, usage of systolic arrays or apparatuses with systolic arrays may be one optional feature or embodiment. According to embodiments, a compiler, e.g. an architecture compiler, may be used in order to assemble the processing units and local blocks together. As an example, local blocks of a non-volatile memory or of a non-volatile distributed memory, e.g. memory elements, and processing elements may be assembled together.
The inventors recognized that neural network inference may be well defined. Hence, the inventors recognized that by mapping the neural net onto a data flow of a systolic array, the architecture of the systolic array may allow to exploit characteristics of the neural network inference in order to provide a classification functionality with only low energy costs for control tasks, e.g. costs spent in conventional approaches in the control path, for cache control and/or branch prediction. Usage of a systolic array architecture may further allow to achieve a synergistic reduction of energy consumption in combination with the n-ary weights.
According to further embodiments of the invention, the neural net is configured to perform a matrix multiplication, e.g. between a matrix defined by the ternary (or in general n-ary) neural net coefficients and a vector defined by the sensor signal, or by the preprocessed version of the sensor signal, or by the sensor data derived from the sensor signal.
The inventors recognized that a sensor signal analysis may be performed based on an energy efficient matrix multiplication using the n-ary weights.
According to further embodiments of the invention, the apparatus comprises a non-volatile memory, e.g. a RRAM (e.g. Resistive Random Access Memory) device, comprising multilevel cells, e.g. MLCs, or three-level cells, which may, for example, be configured to store three different values in a single memory cell. Moreover, the memory is configured to store the, e.g. ternary, or, generally, n-ary, coefficients, e.g. weights, of the neural net.
The inventors recognized that using the n-ary coefficients a synergistic reduction of the energy consumption may be achieved by using multi-level cells configured to store the coefficients. As an example, the multi-level cells may be memory cells specifically adapted in order to save n-ary weight coefficients and to provide them for further processing. This way an odd number of coefficient states may, for example, be stored efficiently, e.g. without wasting hardware resources.
As another optional feature, an inventive apparatus may further be configured to perform a training adapted to the characteristics of the multi-level cells, e.g. a memory cell aware training. As an example, a quantization of the neural network may be performed directly to the capabilities of the memory weight cells, e.g. of the multi-level cells.
Furthermore, the non-volatile memory may allow to switch off the apparatus, e.g. in times wherein an analysis or classification is not needed, without losing neural network coefficients and/or internal programming. Moreover, the non-volatile memory may, for example, be a CMOS memory, and may hence be produced with low costs and integrated without increased effort with other circuitry elements.
According to further embodiments of the invention, the neural net is mapped onto a data flow of a systolic array and the apparatus comprises a non-volatile distributed memory comprising multi-level cells or three-level cells. Furthermore, the distributed memory is configured to store the coefficients of the neural net, and the distributed memory is configured to provide the coefficients of the neural net to processing elements of the systolic array.
According to further embodiments of the invention, the distributed memory comprises a plurality of memory elements, wherein memory elements of the plurality of memory elements comprise the multi-level cells or three-level cells and the systolic array comprises the plurality of memory elements. Moreover, memory elements of the plurality of memory elements are configured to provide the coefficients of the neural net to processing elements of the systolic array.
As an example, the memory may comprise RRAM structures that may, for example, be arranged within the systolic array. Hence, the neural network coefficients, e.g. the weights, may already be present at the place where they may be needed.
In other words, embodiments according to the invention may comprise the advantage that memory elements of a non-volatile distributed memory may be arranged or located or implemented within a systolic array, such that neural network weights may be provided directly within the array, e.g. in order to execute the neural network, which may allow to save energy costs and improve computational efficiency.
As another example, RRAMs may be configured to provide the distributed non-volatile memory. Hence, the memory may be divided into several local blocks, e.g. memory elements, which may be connected in a dedicated manner to computation units, e.g. processing units of the systolic array.
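The effect of such a weight-stationary arrangement can be sketched in software: each processing element keeps its own weight in a local memory element, and only input samples and partial sums travel through the array, so no central weight fetch over a shared bus is needed. All names and sizes below are illustrative.

```python
def systolic_matvec(weight_rows, x):
    """Simulate a weight-stationary matrix-vector product.

    Each processing element holds one weight locally (mimicking a
    dedicated non-volatile memory element inside the array); the
    partial sum is handed from one element to the next along the
    chain, one chain of PEs per output value.
    """
    outputs = []
    for row in weight_rows:
        partial = 0
        for pe_weight, sample in zip(row, x):
            partial += pe_weight * sample  # local multiply-accumulate
        outputs.append(partial)
    return outputs

print(systolic_matvec([[1, -1, 0], [0, 1, 1]], [2, 3, 5]))  # -> [-1, 8]
```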
According to further embodiments of the invention, the apparatus comprises a preprocessing unit, e.g. a preprocessing circuit, and/or a filter, and/or a microprocessor performing a preprocessing, and/or an ASIC performing a preprocessing, and/or a microcontroller performing the preprocessing, configured to preprocess the sensor signal, e.g. using one or more filtering operations, to obtain a preprocessed version of the sensor signal. Furthermore, the apparatus may comprise a buffer, e.g. a double buffer, configured to store the preprocessed version of the sensor signal. As an example, the preprocessed version of the sensor signal may, for example, be an analog signal, that may be buffered or stored in the buffer. According to further embodiments of the invention, the apparatus comprises a preprocessing unit, e.g. a preprocessing circuit or a microprocessor performing a preprocessing or a microcontroller performing the preprocessing, configured to preprocess the sensor signal, to obtain preprocessed sensor data. Furthermore, the apparatus may comprise a buffer configured to store the preprocessed sensor data. As an example, the sensor data may, for example, be a digital signal that may be buffered or stored in the buffer.
According to embodiments, wherein the apparatus comprises preprocessing capabilities, a holistic solution for sensor data evaluation may be provided. Hence, the sensor signal may be preprocessed, such that an analysis and/or classification of the preprocessed sensor data may be performed efficiently, e.g. in an energy efficient manner. According to a respective data format of the processed information, the buffer may be configured to store said information. As an example, a digitization may be performed before or after a preprocessing, or, for example, after a buffering of the processed information.
According to further embodiments of the invention, the apparatus is configured to temporarily activate the neural net when a sufficient, e.g. predetermined, number of signal values of the input signal or a sufficient, e.g. predetermined, number of signal values of the preprocessed version of the input signal or a sufficient, e.g. predetermined, amount of sensor data has been accumulated in the buffer.
The inventors recognized that using a buffer, the preprocessed version of the sensor signal and/or the preprocessed sensor data may be stored or accumulated, before it is processed by a portion of the apparatus configured for the analysis and/or classification of the signal. Hence, such an analysis and/or classification portion may, for example, be turned off, hence reducing energy consumption, until a certain amount of sensor information, e.g. preprocessed version of the sensor signal and/or preprocessed sensor data, is buffered to be evaluated.
This may, for example, be particularly advantageous in applications wherein the sensor data is only provided slowly, e.g. for slow ECG data (e.g. because of the slow operating speed of the human heart in contrast to the processing speed of an inventive apparatus).
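The accumulate-then-wake behavior described above can be sketched as follows. The window size and the `classify` callable are placeholders for whatever the neural net portion of the apparatus implements; `sum` merely stands in for it here.

```python
class BufferedClassifier:
    """Accumulate preprocessed samples and wake the (comparatively
    power-hungry) neural net only when a full window is available."""

    def __init__(self, window, classify):
        self.window = window      # samples needed before waking the net
        self.classify = classify  # stands in for the ternary neural net
        self.buffer = []
        self.results = []

    def push(self, sample):
        self.buffer.append(sample)
        while len(self.buffer) >= self.window:
            # the neural net is "activated" for one burst of work here
            self.results.append(self.classify(self.buffer[:self.window]))
            del self.buffer[:self.window]

bc = BufferedClassifier(window=3, classify=sum)  # 'sum' as a dummy net
for s in [1, 2, 3, 4]:
    bc.push(s)
print(bc.results, bc.buffer)  # -> [6] [4]
```

Between bursts the `classify` stage has nothing to do and could be powered down entirely; only the cheap buffering keeps running.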
According to further embodiments of the invention, the apparatus is configured to constantly keep the preprocessing unit turned on, e.g. powered. Furthermore, the apparatus may be configured to intermittently activate, e.g. power up, the neural net, e.g. while keeping the neural net switched off at least 50% of the time, or at least 80% of the time, or at least 90% of the time, or at least 95% of the time.
The inventors recognized that this way, e.g., all available measurement data may be processed, or, as an example, a patient may be provided with ECG data for the whole time period during which the apparatus was worn, while at the same time energy consumption is reduced by, e.g. only selectively, operating the evaluation functionality. The inventors recognized that an active evaluation unit of the apparatus may, for example, be sufficient for at most 50% of the time or at most 20% of the time or at most 10% of the time or at most 5% of the time.
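The resulting average power draw follows from simple duty-cycle arithmetic; the power figures below are invented for illustration and are not taken from this document.

```python
def average_power(p_preprocess, p_net, duty_cycle):
    """Average power when the preprocessing unit is always on and the
    neural net is powered only a fraction `duty_cycle` of the time."""
    return p_preprocess + duty_cycle * p_net

# e.g. 10 uW always-on preprocessing, 500 uW net active 5 % of the time
print(average_power(10e-6, 500e-6, 0.05))  # -> 3.5e-05 (watts)
```

With these illustrative numbers, the 5 % duty cycle brings the average down from 510 uW (net always on) to 35 uW.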
Further embodiments according to the invention comprise a method for analyzing, e.g. classifying, a sensor signal, e.g. a signal of a wearable sensor, e.g. an electrocardiogram signal or an ECG signal, e.g. for a detection of an atrial fibrillation on the basis of an electrocardiogram.
The method comprises inputting the sensor signal, or a preprocessed version of the sensor signal, or sensor data derived from the sensor signal, into a neural net, coefficients, e.g. weights, of which are quantized to be ternary weights, e.g. to take three possible values, in order to obtain an analysis result.
The method as described above is based on the same considerations as the above-described apparatus. The method can, moreover, be supplemented by all features and functionalities which are also described with regard to the apparatus.
According to further embodiments of the invention, the method comprises performing a memory cell aware training of the neural net, e.g. which may, for example, consider a quantization of the neural net coefficients and/or a number of available values for the coefficients.
The inventors recognized that a training of the neural network may, for example, be adapted to the specific usage of the n-ary coefficients. Therefore, training time may be reduced and/or classification quality may be improved. Further embodiments according to the invention comprise a computer program for performing any of the above explained methods, when the computer program runs on a computer.
Brief Description of the Drawings
The drawings are not necessarily to scale, emphasis instead generally being placed upon illustrating the principles of the invention. In the following description, various embodiments of the invention are described with reference to the following drawings, in which:
Fig. 1 shows a schematic view of an apparatus for analyzing a sensor signal according to embodiments of the invention;
Fig. 2a shows a schematic view of an apparatus for analyzing a sensor signal with additional, optional features, according to embodiments of the invention;
Fig. 2b shows a schematic view of an apparatus for analyzing a sensor signal with a distributed memory, according to embodiments of the invention;
Fig. 3 shows a schematic block diagram of a method for analyzing a sensor signal according to embodiments of the invention;
Fig. 4 shows an example of a processor architecture according to embodiments of the invention; and
Fig. 5. shows a schematic system overview according to embodiments of the invention.
Detailed Description of the Embodiments
Equal or equivalent elements or elements with equal or equivalent functionality are denoted in the following description by equal or equivalent reference numerals even if occurring in different figures. In the following description, a plurality of details is set forth to provide a more thorough explanation of embodiments of the present invention. However, it will be apparent to those skilled in the art that embodiments of the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form rather than in detail in order to avoid obscuring embodiments of the present invention. In addition, features of the different embodiments described hereinafter may be combined with each other, unless specifically noted otherwise.
Fig. 1 shows a schematic view of an apparatus for analyzing a sensor signal according to embodiments of the invention. Fig. 1 shows apparatus 100 comprising, as an optional feature, a neural network 110 (also referred to as neural net). As an example, element 110 may be means to calculate or approximate an output of a neural network, e.g. a processing unit for executing the neural network. Furthermore, apparatus 100 optionally comprises n-ary quantized neural network coefficients 120 (as an example, weights, which are coefficients of the neural network) and may hence be configured to provide the coefficients to the neural network (as indicated by arrow 122). As an example, element 120 may be a memory comprising or storing the n-ary coefficients. Optionally, the neural network 110, e.g. the means to calculate and output a calculation result according to calculation rules determining or approximating the neural network (e.g. element 110), may comprise the coefficients 120. In other words, a processing unit (e.g. 110) may comprise a memory (e.g. 120) in order to store the n-ary coefficients.
Apparatus 100 may be configured to input a sensor information 102 into the neural network 110. The sensor information may, for example, be at least one of a sensor signal, a preprocessed version of the sensor signal, and/or sensor data derived from the sensor signal. Using the sensor information 102 and the n-ary coefficients 120, the neural network 110 may be configured to obtain an analysis result 104.
As an example, the sensor information 102 may be a bitstream comprising digitized sensor measurements, e.g. preprocessed sensor measurements. The sensor information 102 may be determined or measured by a wearable sensor, e.g. a sensor configured to provide measurement information for monitoring a state or health of a human, e.g. of a human’s heart. The neural network 110 may be trained and/or configured to analyze and/or to classify and/or to evaluate the sensor information. Therefore, as an example, the neural network 110 may use the n-ary weights. The inventors recognized that using n-ary weights, an efficient, e.g. energy efficient, apparatus 100 for analyzing a sensor information 102, e.g. a sensor signal, may be provided. This way, apparatus 100 may, for example, be integrated in a wearable health monitoring device, e.g. comprising a wearable sensor and apparatus 100 for gathering and/or measuring and evaluating long term health data. With reduced energy consumption, small batteries or accumulators may be used, which may allow to wear such a device comfortably. Furthermore, long-term health monitoring may be possible because of an ability of such a device to operate without recharging for a long period of time.
The n-ary coefficients 120, may optionally be integer coefficients of the neural network that are symmetric to zero, e.g. -a, 0, a, with a being an integer. As an example, the coefficients may be ternary weights, e.g. weights -1, 0, 1.
Hence, an analysis result 104 may be provided by the neural network 110 based on the sensor information 102 and the n-ary coefficients 120. The analysis result may comprise an information about a body functionality, e.g. the functionality or state or health of an organ, e.g. of a human or an animal. The result 104 may, for example, be provided to the human or a supervising unit (e.g. via wireless communication) in order to provide a warning for an abnormal state of the body functionality.
Furthermore, apparatus 100 may optionally comprise a memory in order to store the analysis result 104 and, for example, corresponding sensor information 102 of a corresponding period of time, e.g. a period of time in which the neural network detected unusual sensor information.
Fig. 2a shows a schematic view of an apparatus for analyzing a sensor signal with additional, optional features, according to embodiments of the invention. Fig. 2a shows apparatus 200a comprising a neural network 210a, e.g. means to calculate or approximate an output of a neural network, e.g. a processing unit for executing the neural network.
As an optional feature, the neural net 210a may be mapped onto a data flow of a systolic array 212a. As an example, the neural network 210a may be implemented using a two-dimensional array of processing elements 214. Hence, as an example, element 210a may be a processing unit comprising a systolic array architecture for approximating or calculating or evaluating an output of the neural network. The systolic array 212a may, for example, be a monolithic network of processing elements 214, e.g. of primitive computing nodes. The processing elements 214 may, for example, be hardwired or software configured, e.g. using FPGAs.
As an example, the processing elements 214 may comprise programmable interconnects, e.g. adapted according to n-ary weights. Furthermore, processing elements 214 may, for example, be substantially identical. The processing elements 214 may, for example, be triggered by the arrival of new data, e.g. of new sensor information at a respective processing element 214.
One advantage of such an architecture may, for example, be that processing data and partial results may be stored within the systolic array, e.g. while passing through the array. Hence, usage of external buses, main memory and/or internal caches may, for example, be omitted. Furthermore, such an architecture may provide good classification results, e.g. for obtaining a robust and accurate analysis result 204 based on a sensor information 202 e.g. with low energy costs.
Furthermore, apparatus 200a optionally comprises a non-volatile memory 220a, e.g. a RRAM device, configured to store the coefficients of the neural network, wherein the memory 220a comprises multi-level cells, e.g. MLCs. As an optional example, non-volatile memory 220a comprises three-level cells 224 as multi-level cells, which are, as optionally shown, configured to store three different values (-a, 0, a) in a single memory cell.
The memory may, for example, be a central or centralized memory or a distributed memory, e.g. a memory comprising a plurality of local memory elements that may be coupled in a dedicated manner with processing elements of the apparatus 200a.
The inventors recognized that for n-ary weights, e.g. as shown in Fig. 2a ternary weights, a dedicated processing architecture may be used in order to achieve a low energy consumption. The n-ary weights may provide a good trade-off between an analysis quality of the sensor information 202 using the neural network 210a and computational and processing costs and hence energy consumption. The storing of the n-ary weights may, as shown in Fig. 2a, be performed with low energy costs using the multi-level cells. This may, for example, mitigate a possible disadvantage of using an odd number of signed weights, since the multi-level cells may allow to store such weights efficiently.
As optionally shown in Fig. 2a, the coefficients of the neural net 210a may be quantized to take one of the three values -a, 0, and +a, wherein a is a real valued number, e.g., preferably, a power of two, since this may allow for a simple multiplication, or wherein a is a complex number. As an example, the coefficients may be -1, 0, 1. The inventors recognized that ternary weights may allow an accurate classification of sensor data with low energy costs.
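When a is a power of two, say a = 2**k, a multiplication by a weight from {-a, 0, +a} reduces to a bit shift and a sign flip on fixed-point data, which is the "simple multiplication" alluded to above. A sketch, with illustrative operand values:

```python
def mul_by_ternary_pow2(x_fixed, trit, shift):
    """Multiply a fixed-point sample by a weight in {-a, 0, +a} with
    a = 2**shift, using only a shift and a sign flip, no multiplier."""
    if trit == 0:
        return 0
    shifted = x_fixed << shift      # multiply by a = 2**shift
    return shifted if trit > 0 else -shifted

print(mul_by_ternary_pow2(3, 1, 2))   # weight +4: -> 12
print(mul_by_ternary_pow2(3, -1, 2))  # weight -4: -> -12
print(mul_by_ternary_pow2(3, 0, 2))   # weight  0: -> 0
```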
For obtaining the analysis data, the apparatus may, for example, further be configured to perform a matrix multiplication. The ternary weights may, for example, be provided (e.g. as indicated by arrow 222a) by the memory 220a to the neural network 210a in the form of a matrix. The neural network 210a may further receive the sensor information 202 in the form of a vector and may hence determine the analysis result 204 based on a multiplication of the matrix and the vector. With an architecture as shown in Fig. 2a with any or all of the optional features as explained herein, such a multiplication may be performed with low energy costs.
As another optional feature, apparatus 200a optionally comprises a preprocessing unit 230 and a buffer 240. Preprocessing unit 230 may be configured to preprocess a sensor signal 206, provided to the apparatus 200a, e.g. by a sensor, e.g. a wearable sensor, to obtain the sensor information 202, e.g. a preprocessed version of the sensor signal 206 and/or preprocessed sensor data. Therefore, the preprocessing unit 230 may, for example, be or comprise at least one of a preprocessing circuit, and/or a filter (e.g. for performing one or more filter operations), and/or a microprocessor (e.g. for performing the preprocessing), and/or an ASIC (e.g. for performing the preprocessing), and/or a microcontroller (e.g. for performing the preprocessing).
The preprocessing unit 230 may receive an analog or digital sensor signal 206. In the case of an analog sensor signal, the preprocessing unit 230 may, for example, comprise an analog-to-digital converter in order to digitize the sensor signal. Furthermore, preprocessing unit 230 may be configured to convert sensor signal 206 into sensor information 202, wherein sensor information 202 may be adapted to be easily, e.g. efficiently, processed by neural network 210a. Buffer 240, e.g. a double buffer, may be configured to store the sensor information 202. In some applications, sensor signal 206 may be provided to the apparatus 200a significantly slower than a processing speed of the apparatus 200a (e.g. of neural network 210a). Hence, the inventors recognized that optionally, the neural network 210a may be temporarily deactivated until a certain amount of sensor information 202 is accumulated in buffer 240. In other words, the apparatus 200a may be configured to temporarily activate the neural net 210a when a sufficient, e.g. predetermined, number of signal values of the input signal 206 or a sufficient, e.g. predetermined, number of signal values of the preprocessed version 202 of the input signal or a sufficient, e.g. predetermined, amount of sensor data 202 has been accumulated in the buffer 240.
Therefore, apparatus 200a may optionally comprise an activation unit 250. The activation unit may receive a buffer information 242, the buffer information indicating an amount of sensor information 202 stored in the buffer 240, e.g. a number of signal values, e.g. an amount of sensor data. Based thereon, the activation unit 250 may provide an activation signal 252 to the neural network 210a. The activation signal may stimulate or start a processing of the sensor information 202 provided by the buffer 240 to the neural network 210a. Vice versa, activation signal 252 may as well be used to deactivate, e.g. to turn off neural network 210a.
As an example, apparatus 200a may be configured to maintain the preprocessing unit 230 in an activated state for a longer period of time than the neural network 210a. Optionally, the preprocessing unit 230 may always be turned on, e.g. powered, e.g. in order to allow for a constant or streaming preprocessing of incoming sensor signal information 206. Whenever suitable, e.g. when activation unit 250 detects an exceeding of a threshold of buffer information 242 (e.g. buffer 240 comprising a sufficient amount of sensor data points), the activation unit may activate the neural network 210a. Hence, activation unit 250 may be configured to intermittently activate or power up the neural net.
The inventors recognized that energy may be saved by deactivating or turning off the neural network 210a for certain periods of time, e.g. when data is accumulated in buffer 240. In many applications it may not be important to fulfill hard real-time constraints for the evaluation of the sensor information. As an example, evaluation of a long-term ECG may be performed by a doctor several days after collecting the measurements. However, even a delay of several seconds or minutes or hours may not be relevant for certain applications. Hence, activation unit may, for example, deactivate the neural network, using activation signal 252, for longer periods of time. It is to be noted that in this time no data or information may be lost, because of the storing in the buffer 240.
To sum up, the neural network 210a may, for example, be switched off at least 50% of the time, or at least 80% of the time, or at least 90% of the time, or at least 95% of the time, hence significantly reducing energy costs.
Fig. 2b shows a schematic view of an apparatus for analyzing a sensor signal with a distributed memory, according to embodiments of the invention. Apart from the elements explained in the context of Fig. 2a, apparatus 200b comprises a non-volatile distributed memory 220b. The non-volatile distributed memory 220b may optionally comprise, as shown, a plurality of memory elements 226, wherein memory elements of the plurality of memory elements comprise multi-level cells or three-level cells.
As optionally shown, memory elements 226 of the plurality of memory elements may be configured to provide 222b the coefficients of the neural net 210b to processing elements 214 of the systolic array 212b.
In other words, as shown, the neural net 210b may be mapped onto a data flow of a systolic array 212b, wherein the systolic array 212b may optionally comprise a plurality of memory elements 226 of a non-volatile distributed memory 220b, in order to provide the coefficients, e.g. the n-ary weights, of the neural net 210b to the processing elements 214, e.g. in order to approximate or execute the neural network. As shown, one memory element 226 may provide n-ary weights to a plurality of processing elements 214. However, it is to be noted that optionally each processing element 214 may receive weights from a single memory element. Furthermore, one memory element may provide weights to an arbitrary number of processing elements.
Fig. 3 shows a schematic block diagram of a method for analyzing a sensor signal according to embodiments of the invention. Method 300 may, for example, be a method for classifying a sensor signal, e.g. sensor signal 206 as explained in the context of Fig. 2a or sensor information 102, 202 as explained in the context of Fig. 1 and respectively Fig. 2a. As explained before, such a signal may, for example, be a signal of a wearable sensor, e.g. an electrocardiogram signal or an ECG signal, e.g. for a detection of an atrial fibrillation on the basis of an electrocardiogram. Method 300 comprises inputting 310 the sensor signal, or a preprocessed version of the sensor signal, or sensor data derived from the sensor signal, into a neural net, coefficients, e.g. weights, of which are quantized to be ternary weights, e.g. to take three possible values, in order to obtain an analysis result.
As an optional feature, method 300 further comprises performing 320 a memory cell aware training of the neural net, which may, for example, consider a quantization of the neural net coefficients and/or a number of available values for the coefficients. It is to be noted that, according to a specific application or goal, steps 310 and 320 may be performed in an arbitrary order. As an example, for a training of a neural network, a sensor signal may be input into the neural network and the weights of the neural network may be determined based on a subsequent training of the neural network.
As another example, a training of the neural network may be performed based on artificial data first, and for a subsequent application, the sensor signal may be input into the neural network in order to analyze the sensor signal.
In general, embodiments according to the invention may comprise devices and methods for energy-efficient neural network calculations using a processor field (e.g. a systolic array) based on non-volatile processor elements and memory elements and correspondingly optimized (e.g. optimized with regard to the beforementioned elements and/or architecture) n-ary weights.
Embodiments according to the invention may comprise a detection of AFIB via neuromorphic hardware. Embodiments may allow an extremely high energy efficiency due to a combination of n-ary weight memory elements, e.g. realized by RRAM; Furthermore, embodiments according to the invention may comprise data-flow oriented processing architectures and non-volatile memory elements for a realization of power cycles.
In the following, different inventive embodiments and aspects will be described in a chapter “Short summary”, in a chapter “Introduction to embodiments of the invention and methodology according to embodiments of the invention” and in a chapter “Architecture and results according to embodiments”.
Moreover, already explained embodiments will be discussed in other words, or with additional, optional details. It is to be noted that any features, functionalities and details as disclosed above may be combined or used with or incorporated in any of the following embodiments, taken individually or in combination. Furthermore, any features, functionalities and details as disclosed in the following may be combined or used with or incorporated in any of the above embodiments, taken individually or in combination.
Also, further embodiments will be defined by the enclosed claims.
It should be noted that any embodiments as defined by the claims can be supplemented by any of the details (features and functionalities) described in the above mentioned chapters and according to the before explained embodiments.
Also, the embodiments described in the above mentioned chapters and the before explained embodiments can be used individually, and can also be supplemented by any of the features in another chapter, or by any feature included in the claims or by any features as explained before.
Also, it should be noted that individual aspects described herein can be used individually or in combination. Thus, details can be added to each of said individual aspects without adding details to another one of said aspects.
Moreover, features and functionalities disclosed herein relating to a method can also be used in an apparatus (configured to perform such functionality). Furthermore, any features and functionalities disclosed herein with respect to an apparatus can also be used in a corresponding method. In other words, the methods disclosed herein can be supplemented by any of the features and functionalities described with respect to the apparatuses.
Also, any of the features and functionalities described herein can be implemented in hardware or in software, or using a combination of hardware and software, as will be described in the section “implementation alternatives”.
Implementation alternatives:
Although some aspects are described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus. Some or all of the method steps may be executed by (or using) a hardware apparatus, like, for example, a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be executed by such an apparatus.
Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a Blu-Ray, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer readable.
Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed.
Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may for example be stored on a machine readable carrier.
Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
In other words, an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
A further embodiment of the inventive methods is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein. The data carrier, the digital storage medium or the recorded medium are typically tangible and/or non-transitory. A further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may for example be configured to be transferred via a data communication connection, for example via the Internet.
A further embodiment comprises a processing means, for example a computer, or a programmable logic device, configured to or adapted to perform one of the methods described herein.
A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
A further embodiment according to the invention comprises an apparatus or a system configured to transfer (for example, electronically or optically) a computer program for performing one of the methods described herein to a receiver. The receiver may, for example, be a computer, a mobile device, a memory device or the like. The apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.
In some embodiments, a programmable logic device (for example a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are preferably performed by any hardware apparatus.
The apparatus described herein may be implemented using a hardware apparatus, or using a computer, or using a combination of a hardware apparatus and a computer.
The apparatus described herein, or any components of the apparatus described herein, may be implemented at least partially in hardware and/or in software.
The methods described herein may be performed using a hardware apparatus, or using a computer, or using a combination of a hardware apparatus and a computer. The methods described herein, or any components of the apparatus described herein, may be performed at least partially by hardware and/or by software.
The above described embodiments are merely illustrative of the principles of the present invention. It is understood that modifications and variations of the arrangements and the details described herein will be apparent to others skilled in the art. It is the intent, therefore, to be limited only by the scope of the appended patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.
The following section may be titled “A Wearable Neuronal Network Processor Architecture for AFIB Detection”. As an example and inter alia, embodiments according to the invention comprising or intended for a wearable neuronal network processor architecture for AFIB detection may be discussed in the following.
Short summary — The detection of atrial fibrillation (AFIB) may be a very important task to diagnose heart disease. Embodiments according to the invention, for example in order to perform such a task, are presented in the following. Embodiments according to the invention therefore comprise or are related, for example, to an ultra-low power computer architecture for the classification of ECG for wearable sensors. The new architecture according to embodiments of the invention may combine three approaches, e.g. to gain significant, large or even maximum energy savings: (1) usage of systolic arrays, e.g. to avoid control overhead, (2) high quantization, e.g. by using ternary weights, and (3) novel non-volatile multi-level memory cells, e.g. for storing weights. Yet, it is to be noted that embodiments of the invention may comprise or use any of these approaches separately or in combination with any other approaches. With approaches according to embodiments of the invention, up to 94.7% of energy may be saved compared to traditional processor architectures.
I. Introduction to the Embodiments of the Invention and Methodology according to Embodiments of the Invention.
The detection of atrial fibrillation using wearable sensors is highly recommended to diagnose heart diseases and prevent strokes. In contrast to state-of-the-art approaches, embodiments provide, for example, an ultra-low power architecture, for example based on neuronal networks (NN). The inference of NN may be mainly based on matrix multiplications. For example, to save as much energy as possible, the architecture according to embodiments of the invention may comprise, separately or in combination, or may combine, one or more of the following three features.
(1) Standard CPUs may be generic compute cores for arbitrary applications but spend huge energy in the control path, e.g. cache control or branch prediction. In contrast, the inventors have realized that the domain of the inference of NN may be well-defined. For example, based on this inventive finding, embodiments according to the invention may comprise a more specialized architecture, for example to reduce control logic as much as possible. Devices according to the invention may comprise, or methods according to the invention may use, systolic arrays, a data-flow-driven architecture, for example organized in a 2D array fashion. In other words, embodiments of the invention may focus on the usage of systolic arrays. While data flows through this array, according to embodiments, a matrix multiplication may be inherently performed.
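As an illustration of this data-flow principle, a minimal cycle-by-cycle sketch of an output-stationary systolic matrix-vector multiplication may look as follows; the array organization, scheduling and Python formulation are illustrative assumptions, not details taken from the disclosure:

```python
def systolic_matvec(W, x):
    """Sketch of an output-stationary 1-D systolic array computing y = W @ x.
    PE i holds row i of W; input values travel PE-to-PE, one hop per cycle,
    while each partial sum stays inside its PE (no central control logic)."""
    n_pe = len(W)
    acc = [0] * n_pe                  # partial sums, resident in the PEs
    pipe = [None] * n_pe              # input values in flight between PEs
    for t, xin in enumerate(list(x) + [None] * n_pe):  # feed, then drain
        pipe = [xin] + pipe[:-1]      # shift inputs one PE per cycle
        for i, xv in enumerate(pipe):
            if xv is not None:        # PE i sees x[t - i] in cycle t
                acc[i] += W[i][t - i] * xv
    return acc
```

For example, `systolic_matvec([[1, 2], [3, 4]], [5, 6])` yields `[17, 39]`, i.e. the ordinary matrix-vector product, obtained purely by shifting data through the array.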
(2) A high quantization of the weights may be able to save energy, for example due to a lower bit width, a smaller data path and less net activity. In most traditional approaches this was done to fit in existing register widths (e.g. 8 bits or, in extreme cases, only 1 bit). In an architecture according to embodiments of the invention, quantization may be performed to three states. With three states (-1, 0, +1), the multiplication may be very energy efficient, for example similar to binary NN, but may deliver better accuracy, for example due to the presence of the additional state (-1).
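As an illustrative sketch (not taken from the disclosure; the zero threshold is an assumption), ternary quantization and the resulting multiplier-free accumulation may look as follows:

```python
def quantize_ternary(weights, threshold=0.05):
    """Map real-valued weights onto the three states -1, 0, +1.
    Weights with a magnitude below the (assumed) threshold snap to 0."""
    return [0 if abs(w) < threshold else (1 if w > 0 else -1) for w in weights]

def ternary_mac(inputs, tweights):
    """With ternary weights, each 'multiplication' degenerates to
    add / skip / subtract, so no hardware multiplier is required."""
    acc = 0
    for x, w in zip(inputs, tweights):
        if w == 1:
            acc += x          # +1: add the input
        elif w == -1:
            acc -= x          # -1: subtract the input
        # w == 0: skip entirely, saving switching activity
    return acc
```

For example, `quantize_ternary([0.3, -0.01, -0.7])` gives `[1, 0, -1]`, and accumulating the inputs `[2, 5, 4]` against these weights yields `2 - 4 = -2` without any multiplication.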
(3) At first glance, using three-state, so-called ternary, weights might not make sense in binary computers, due to the need to store them with two independent bits, which could by definition store four states. However, the inventors have recognized that by using specialized RRAM devices, for example with MLC (Multi Level Cell) capabilities, a storage of ternary weights may become possible, for example without wasting hardware resources and therefore energy. In other words, embodiments according to the invention may comprise specialized RRAM devices, for example with MLC capabilities. Hence, embodiments of the invention comprise a new method, called memory cell aware training of the NN, which may perform the quantization of the NN directly to the capabilities of the memory weight cells. Moreover, RRAM is a non-volatile memory technology, which may make it possible to store weights permanently inside the systolic array, for example even if the power is switched off. For example, due to its CMOS compatibility, it could be embedded deep in the systolic array. Using RRAM technology according to the embodiments, the devices can be switched off without losing their internal programming. This may allow a huge energy saving, for example every time the chip processing the NN inference is not needed. The RRAM devices can be reprogrammed, for example to address different (medical) applications.
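A possible encoding of one ternary weight into one three-level cell may be sketched as follows; the concrete conductance levels are purely illustrative assumptions, not values from the disclosure:

```python
# Hypothetical nominal conductance levels (in uS) of a three-level RRAM cell.
LEVELS_US = {-1: 5.0, 0: 50.0, +1: 120.0}

def program_cell(weight):
    """Store one ternary weight as one multi-level cell state (a single
    cell, rather than two binary cells whose fourth state would be wasted)."""
    return LEVELS_US[weight]

def read_cell(conductance_us):
    """Read-out: snap a possibly drifted conductance back to the
    nearest nominal level and return the corresponding ternary weight."""
    return min(LEVELS_US, key=lambda w: abs(LEVELS_US[w] - conductance_us))
```

A memory-cell-aware training in this spirit would then quantize the network weights directly against the levels such a cell can actually hold.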
II. Architecture and Results according to Embodiments
In Fig. 4, an example of a processor architecture according to embodiments of the invention, e.g. to detect AFIB in wearable ECG sensors, is shown.
As an example, apparatus 400 comprises a neural network (NN) architecture 410 and a preprocessing unit 420. As an example, the NN architecture 410 may comprise means 412 to execute or process the neural network. Furthermore, the NN architecture 410 may, for example, comprise RRAM blocks 414 to store the coefficients, e.g. weights, of the neural network. Optionally, preprocessing unit 420 may comprise a RRAM Manager 428, wherein the RRAM Manager 428 may, for example, be configured to configure the neural network, e.g. means 412. Optionally, RRAM Manager 428 may, for example, be configured to adapt the weights stored in RRAM blocks 414. Therefore, the RRAM blocks 414 may, for example, be coupled to each other. Optionally, RRAM Manager 428 may be coupled with each of the RRAM blocks 414 individually. New coefficients to be stored in RRAM blocks 414 may, for example, be provided by an external RRAM interface 430. Optionally, weights stored in the RRAM blocks may, for example, be read out, e.g. provided to an external device, e.g. via the external RRAM interface 430.
Preprocessing unit 420 comprises, as an optional feature, means 422 to preprocess a sensor signal and a buffer 424. As shown, buffer 424 may be a double buffer, e.g. comprising two SRAM (e.g. static random-access memory) blocks 426. As explained before, buffer 424 may, for example, receive a preprocessed sensor signal 401 (e.g. a preprocessed version of the sensor signal or preprocessed sensor data) from the preprocessing means 422. For example, when a certain amount of preprocessed information is gathered or stored in buffer 424, the buffer may provide the information to the NN architecture 410. In this case, the sensor information 402 provided to the NN architecture may be, or may be equal to, the preprocessed sensor signal 401. In the meantime, as explained before, the NN architecture 410 may, for example, be deactivated in order to save energy (e.g. until the information is provided from the buffer 424). Therefore, as optionally shown, apparatus 400 may comprise an activation unit 440 (Pwr switch, e.g. a power switch), which may be provided with a signal 442 from the preprocessing unit 420. The signal 442 may be a command or an impulse to activate or to enable power (e.g. pwr_ena signal) of the NN architecture 410. The signal 442 may, for example, be a buffer information indicating that a sufficient amount of sensor information is buffered for a subsequent analysis using the NN architecture 410. Such a power switch may, for example, be configured to activate or to deactivate the NN architecture 410, e.g. by providing or disabling one or more supply voltages. As an example, activation unit 440 may comprise or regulate or control a power supply configured to provide a first supply voltage VDD33, e.g. of 3.3 V, and a second supply voltage VDD12, e.g. of 1.2 V. The NN architecture may be provided with more than one voltage level, e.g. for different functionalities.
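The interplay between the double buffer 424 and the power switch 440 may be sketched as follows; the window size and the Python formulation are assumptions for illustration only:

```python
class DoubleBuffer:
    """Sketch of the always-on buffering scheme: samples accumulate in one
    SRAM block while the other is handed to the NN core; the core is only
    powered up once a full window of samples is available."""
    def __init__(self, window=4):
        self.window = window
        self.active = []      # SRAM block currently being filled
        self.ready = None     # SRAM block handed to the NN core

    def push(self, sample):
        """Store one sample; return True when the NN core should be
        powered on (corresponding to the pwr_ena impulse, signal 442)."""
        self.active.append(sample)
        if len(self.active) == self.window:
            self.ready, self.active = self.active, []   # swap the two blocks
            return True
        return False
```

Because the swap clears the active block, the slow sensor stream can keep filling one block while the other is classified, and the NN core stays powered down between the rare `True` events.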
As an example, means 412 and RRAM blocks 414 may be provided with a lower supply voltage than interfaces of the NN architecture 410.
Correspondingly, as an example, the preprocessing unit 420 may as well be provided with a supply voltage, e.g. VDD12, e.g. of 1.2 V, for example, with a same voltage level as one of the voltage levels provided to the NN architecture 410.
As explained before, the preprocessing unit 420 may, for example, be provided with a sensor signal 406, e.g. as shown via a recording interface 432. As an example, the recording interface 432 may be an input/output, e.g. a bidirectional interface. Hence, optionally, an analysis result 434 from means 412 may be provided to an external device via the interface 432.
Moreover, preprocessing unit 420 may, for example, preprocess the sensor signal 406, e.g. raw measurement data, in order to provide the preprocessed sensor signal 401 for the NN architecture 410. As an optional feature, preprocessing unit 420 comprises a switch 436 that may be controlled by a bypass value 438 (e.g. cfg_enabel_bypass). Hence, the processed sensor information from means 422 may be neglected and the sensor signal 406 may be provided to the NN architecture 410. In this case, as an example, sensor information 402 may be equal to sensor signal 406.
In addition, means 422 may optionally comprise a window invalidation functionality. Hence, as an example, means 422 may be configured to detect a disturbance of the sensor signal 406 and may therefore, as an example, mark certain sensor signal data as invalid, such that it is not used for a subsequent classification. As an example, an electromagnetic disturbance may be detected that may be different from an abnormality of the sensor signal data caused, for example, by an irregular heartbeat. This may allow a more robust analysis of the sensor signal, e.g. with fewer false alarms.
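One conceivable form of such a window invalidation is a simple range check; both the criterion and the threshold below are assumptions for illustration, not the mechanism specified in the disclosure:

```python
def valid_windows(windows, max_abs=5.0):
    """Flag each window of samples as valid (True) or invalid (False).
    A window containing a sample outside the assumed plausible signal range
    (e.g. due to an electromagnetic disturbance) is marked invalid so the
    classifier skips it instead of raising a false alarm."""
    return [all(abs(s) <= max_abs for s in w) for w in windows]
```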
As another optional feature, apparatus 400 may be configured to be provided with a continuous mode signal 439 (e.g. cfg_enable_continous_mode_n). Based on such a signal, apparatus 400 may, for example, be configured to keep the NN architecture 410 activated, for example even if the amount of buffered data does not fulfill a threshold for an activation of the NN architecture 410.
In other words, the chip, e.g. apparatus 400, is, as an example, divided into two parts. The left part, e.g. 420, for example the always-on part, shows the preprocessing of ECG data and a buffer, e.g. 424. For example, due to the fact that the data may stream in very slowly, the incoming data may be processed by different filters and may finally be accumulated in the double buffer, e.g. 424. On the right-hand side, an example of the actual NN architecture, e.g. 410, is shown, which may be based on the aforementioned systolic array. Moreover, the explained RRAM blocks, e.g. 414, which, for example, store the ternary weights, are shown. As shown, the complete power of the right part, e.g. 410, can, for example, be switched off when it is not needed, which may be the case in a high percentage of the time, for example 99.8% of the time. For example, only if the buffer, e.g. 424, is full, a classification using the NN may be executed, for example to detect if an AFIB occurred.
The real-time classification of 12.66 seconds of ECG data with a 512 Hz sample rate using the explained architecture may require 12.356 uJ of energy. This is about 94.7% less compared to a traditional architecture without non-volatile ternary-weight RRAM. The NN contains 1520 ternary weights, used in four convolution layers and two fully connected layers.
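From the two reported figures, the energy of the traditional reference architecture implied by the 94.7% saving can be back-calculated (a derived estimate, not a number stated in the disclosure):

```python
proposed_uj = 12.356                     # reported energy per classification (uJ)
savings = 0.947                          # reported relative saving
baseline_uj = proposed_uj / (1.0 - savings)
# implied traditional-architecture energy: about 233 uJ per 12.66 s window
```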
Moreover, reference is made to Fig. 5. Fig. 5 shows a schematic system overview according to embodiments of the invention. Fig. 5 shows apparatus 500 comprising an optional input/output (I/O) interface 510, a main unit 520 and a processing core 530. The main unit 520 optionally comprises a power control unit (pwr control) 522, a preprocessing unit 524, a buffer 526 and a control logic 528. The processing core 530 optionally comprises a plurality of RRAM memory blocks 532 and a neural network 534.
As an example, the I/O interface 510 may provide a sensor signal 512 to the preprocessing unit 524. The preprocessing unit 524 may provide sensor information 525, e.g. a preprocessed version of the sensor signal or preprocessed sensor data, to the buffer 526, which may be configured to store the sensor information 525.
The power control unit 522 may be configured to temporarily activate the processing core 530 (or, for example, selectively the neural network 534 of the processing core), e.g. via an activation signal 523 or vice versa to deactivate the processing core. This may, for example, be performed based on a number of buffered sensor information data points in buffer 526.
The sensor information 525 may be provided by the buffer to the neural network 534, e.g. when the processing core 530 is activated using activation signal 523. Based on the sensor information 525 the neural net 534 may provide an analysis result 535 to the I/O interface 510.
For the processing of the sensor information 525, the neural network 534 is provided with neural network coefficients 533 in the form of n-ary weights. It is to be noted that the RRAM memory blocks 532 may form a distributed non-volatile memory that may be part of a systolic array used to execute the neural network 534. The memory blocks 532 are optionally controlled by the control logic 528.

Claims
1. An apparatus (100, 200a, 200b, 400, 500) for analyzing a sensor signal (102, 206, 402, 406, 512), wherein the apparatus is configured to input the sensor signal, or a preprocessed version of the sensor signal (102, 202, 401, 402, 512), or sensor data (102, 202, 401, 402, 512) derived from the sensor signal, into a neural net (110, 210a, 210b, 410, 534), coefficients (120, 533) of which are quantized to be n-ary weights, in order to obtain an analysis result (104, 204, 434, 535).
2. Apparatus (100, 200a, 200b, 400, 500) according to claim 1, wherein the coefficients (120, 533) of the neural net (110, 210a, 210b, 410, 534) are quantized to take one of the three values -a, 0, and +a, where a is a real valued number or where a is a complex number.
3. Apparatus (100, 200a, 200b, 400, 500) according to claim 1 or claim 2, wherein the neural net (110, 210a, 210b, 410, 534) is mapped onto a data flow of a systolic array (212a, 212b).
4. Apparatus (100, 200a, 200b, 400, 500) according to one of claims 1 to 3, wherein the neural net (110, 210a, 210b, 410, 534) is configured to perform a matrix multiplication.
5. Apparatus (100, 200a, 200b, 400, 500) according to one of claims 1 to 4, wherein the apparatus comprises a non-volatile memory (220a, 220b, 414, 532) comprising multi-level cells or three-level cells (224), wherein the memory is configured to store the coefficients (120, 533) of the neural net (110, 210a, 210b, 410, 534).
6. Apparatus (100, 200a, 200b, 400, 500) according to one of claims 1 to 5, wherein the neural net (110, 210a, 210b, 410, 534) is mapped onto a data flow of a systolic array (212a, 212b); and wherein the apparatus comprises a non-volatile distributed memory (220a, 220b, 414, 532) comprising multi-level cells or three-level cells (224); and wherein the distributed memory (220b) is configured to store the coefficients (120, 533) of the neural net (110, 210a, 210b, 410, 534); and wherein the distributed memory is configured to provide the coefficients of the neural net to processing elements (214) of the systolic array (212a, 212b).
7. Apparatus (100, 200a, 200b, 400, 500) according to claim 6, wherein the distributed memory (220a, 220b, 414, 532) comprises a plurality of memory elements (226), wherein memory elements of the plurality of memory elements comprise the multi-level cells or three-level cells (224); and wherein the systolic array comprises the plurality of memory elements (226); and wherein memory elements of the plurality of memory elements are configured to provide the coefficients of the neural net to processing elements (214) of the systolic array (212a, 212b).
8. Apparatus (100, 200a, 200b, 400, 500) according to one of claims 1 to 7, wherein the apparatus comprises a preprocessing unit (230, 420, 524) configured to preprocess the sensor signal (102, 206, 402, 406, 512), to obtain a preprocessed version (102, 202, 401, 402, 512) of the sensor signal, and wherein the apparatus comprises a buffer (240, 424, 526) configured to store the preprocessed version (102, 202, 401, 402, 512) of the sensor signal.
9. Apparatus (100, 200a, 200b, 400, 500) according to one of claims 1 to 8, wherein the apparatus comprises a preprocessing unit (230, 420, 524) configured to preprocess the sensor signal (102, 206, 402, 406, 512), to obtain preprocessed sensor data (102, 202, 401, 402, 512), and wherein the apparatus comprises a buffer (240, 424, 526) configured to store the preprocessed sensor data (102, 202, 401, 402, 512).
10. Apparatus (100, 200a, 200b, 400, 500) according to claim 8 or claim 9, wherein the apparatus is configured to temporarily activate the neural net (110, 210a, 210b, 410, 534) when a sufficient number of signal values of the input signal (102, 206, 402, 406, 512) or a sufficient number of signal values of the preprocessed version (102, 202, 401, 402, 512) of the input signal or a sufficient amount of sensor data (102, 202, 401, 402, 512) has been accumulated in the buffer (240, 424, 526).
11. Apparatus (100, 200a, 200b, 400, 500) according to one of claims 8 to 10, wherein the apparatus is configured to constantly keep the preprocessing unit (230, 420, 524) turned on, and wherein the apparatus is configured to intermittently activate the neural net (110, 210a, 210b, 410, 534).
12. A method (300) for analyzing a sensor signal (102, 206, 402, 406, 512), wherein the method comprises inputting (310) the sensor signal, or a preprocessed version (102, 202, 401, 402, 512) of the sensor signal, or sensor data (102, 202, 401, 402, 512) derived from the sensor signal, into a neural net (110, 210a, 210b, 410, 534), coefficients (120, 533) of which are quantized to be ternary weights, in order to obtain an analysis result (104, 204, 434, 535).
13. Method (300) according to claim 12, wherein the method comprises performing (320) a memory cell aware training of the neural net (110, 210a, 210b, 410, 534).
14. A computer program for performing the method (300) of claim 12 or 13 when the computer program runs on a computer.
PCT/EP2022/054157 2021-02-18 2022-02-18 Apparatus, method and computer program for analyzing a sensor signal WO2022175494A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP21158011 2021-02-18
EP21158011.3 2021-02-18

Publications (1)

Publication Number Publication Date
WO2022175494A1 true WO2022175494A1 (en) 2022-08-25

Family

ID=74844669

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2022/054157 WO2022175494A1 (en) 2021-02-18 2022-02-18 Apparatus, method and computer program for analyzing a sensor signal

Country Status (1)

Country Link
WO (1) WO2022175494A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200225996A1 (en) * 2019-01-15 2020-07-16 BigStream Solutions, Inc. Systems, apparatus, methods, and architectures for a neural network workflow to generate a hardware acceletator


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ANTONINO FARAONE ET AL: "Convolutional-Recurrent Neural Networks on Low-Power Wearable Platforms for Cardiac Arrhythmia Detection", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 8 January 2020 (2020-01-08), XP081576317 *
CAI YI ET AL: "Low Bit-Width Convolutional Neural Network on RRAM", IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, IEEE, USA, vol. 39, no. 7, 17 May 2019 (2019-05-17), pages 1414 - 1427, XP011793485, ISSN: 0278-0070, [retrieved on 20200617], DOI: 10.1109/TCAD.2019.2917852 *
WU QING ET AL: "ECG signal classification with binarized convolutional neural network", COMPUTERS IN BIOLOGY AND MEDICINE, NEW YORK, NY, US, vol. 121, 5 May 2020 (2020-05-05), XP086195287, ISSN: 0010-4825, [retrieved on 20200505], DOI: 10.1016/J.COMPBIOMED.2020.103800 *

Similar Documents

Publication Publication Date Title
Zhao et al. A 13.34 μW event-driven patient-specific ANN cardiac arrhythmia classifier for wearable ECG sensors
Shoaran et al. Energy-efficient classification for resource-constrained biomedical applications
Yan et al. Energy efficient ECG classification with spiking neural network
US11727279B2 (en) Method and apparatus for performing anomaly detection using neural network
Liu et al. 4.5 BioAIP: A reconfigurable biomedical AI processor with adaptive learning for versatile intelligent health monitoring
Lee et al. RISC-V CNN coprocessor for real-time epilepsy detection in wearable application
San et al. Evolvable rough-block-based neural network and its biomedical application to hypoglycemia detection system
KR20190114694A (en) Method for learning and analyzing time series data by using artificial intelligence
Chu et al. A neuromorphic processing system with spike-driven SNN processor for wearable ECG classification
Cherupally et al. ECG authentication hardware design with low-power signal processing and neural network optimization with low precision and structured compression
Andersson et al. A 290 mV Sub-$ V_ {\rm T} $ ASIC for Real-Time Atrial Fibrillation Detection
Mendez et al. A DSP for sensing the bladder volume through afferent neural pathways
CN111528832A (en) Arrhythmia classification method and validity verification method thereof
Ali et al. An efficient hybrid LSTM-ANN joint classification-regression model for PPG based blood pressure monitoring
Lammie et al. Towards memristive deep learning systems for real-time mobile epileptic seizure prediction
Marathe et al. Prediction of heart disease and diabetes using naive Bayes algorithm
WO2022175494A1 (en) Apparatus, method and computer program for analyzing a sensor signal
Zhao et al. A 0.99-to-4.38 uJ/class event-driven hybrid neural network processor for full-spectrum neural signal analyses
KR102469238B1 (en) Real-time neural spike detection
JP2022538417A (en) An Event-Driven Spiking Neural Network System for Physiological State Detection
CN114386479B (en) Medical data processing method and device, storage medium and electronic equipment
Hallgrímsson et al. Learning individualized cardiovascular responses from large-scale wearable sensors data
Hu et al. Energy Efficient Software-Hardware Co-Design of Quantized Recurrent Convolutional Neural Network for Continuous Cardiac Monitoring
US20180214038A1 (en) Determining blood pulse characteristics based on stethoscope data
Zheng et al. Using machine learning techniques to optimize fall detection algorithms in smart wristband

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22711176

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22711176

Country of ref document: EP

Kind code of ref document: A1