DE102020202633A1

DE102020202633A1 - Error-proof inference calculation for neural networks

Info

Publication number: DE102020202633A1
Application number: DE102020202633.5A
Authority: DE
Inventors: Jo Pletinckx; Andre GUNTORO; Christoph Schorn; Sebastian Vogel; Leonardo Luiz Ecco
Original assignee: Robert Bosch GmbH
Current assignee: Robert Bosch GmbH
Priority date: 2020-03-02
Filing date: 2020-03-02
Publication date: 2021-09-02

Abstract

Method (100) for operating a hardware platform for the inference calculation of a convolutional neural network, comprising the steps:

• an input matrix (1) with input data of the neural network is convolved (110), by means of the acceleration module, with a plurality of convolution kernels (2a-2c), so that a plurality of two-dimensional output matrices (3a-3c) is produced;
• the convolution kernels (2a-2c) are summed element-wise (120) to form a control kernel (4);
• the input matrix (1) is convolved (130) with the control kernel (4) by means of the acceleration module, so that a two-dimensional control matrix (5) is produced;
• each element (5*) of the control matrix (5) is compared (140) with the sum of the corresponding elements (3a*-3c*) in the output matrices (3a-3c);
• in response to this comparison (140) yielding a deviation (150) for an element (5*) of the control matrix (5), at least one additional control calculation is used to check (160) whether an element (3a*-3c*) of at least one output matrix (3a-3c) that corresponds to this element (5*) of the control matrix (5) has been calculated correctly.

Description

The present invention relates to safeguarding the calculations that arise in the inference operation of neural networks against transient errors on the hardware platform used.

State of the art

In the inference of neural networks, a very large number of neuron activations are calculated by forming weighted sums of the inputs fed to these neurons, using weights obtained during the training of the neural network. This involves a large number of multiplications whose results are subsequently added (multiply-and-accumulate, MAC). In mobile applications in particular, such as the at least partially automated driving of vehicles in road traffic, neural networks are implemented on hardware platforms that are specialized for such calculations. These platforms are particularly efficient in terms of hardware cost and energy consumption per unit of computing power.

As the integration density of these hardware platforms increases, so does the probability of transient, i.e. sporadically occurring, calculation errors. For example, a bit can be randomly flipped when a high-energy photon from the background radiation strikes a memory location or a processing unit of the hardware platform. Furthermore, in a vehicle in particular, the hardware platform shares the on-board electrical system with a large number of other loads that can couple disturbances, such as voltage spikes, into the hardware platform. The tolerances in this respect become tighter as the integration density of the hardware platform increases.

DE 10 2018 202 095 A1 discloses a method with which, when a neural network processes a tensor of input values into a tensor of output values, incorrectly calculated output values can be identified, and also corrected, by means of additional control calculations.

Disclosure of the invention

In the context of the invention, a method for operating a hardware platform for the inference calculation of a convolutional neural network was developed. The hardware platform has at least one acceleration module that is specialized in calculating a convolution of an input matrix with a convolution kernel by applying this convolution kernel at different positions within the input matrix, and in outputting the result of this convolution as a two-dimensional output matrix. Here, "specialized" means, for example, that the range of tasks this acceleration module can perform is significantly restricted compared with a CPU or GPU of a conventional computer, in favor of substantially higher performance on precisely these tasks. The input matrix and the convolution kernels can, for example, be three-dimensional, which is particularly advantageous for processing image data. However, they can also be generalized to higher dimensions. For example, in the case of video data or other time-varying data, three dimensions can represent spatial coordinates and a fourth dimension can represent time.

Quite generally, the neural network can thus be designed, for example, as a classifier that assigns observation data, such as camera images, thermal images, radar data, LIDAR data or ultrasound data, to one or more classes of a predefined classification. These classes can, for example, represent objects or states in the observed area that are to be detected. The observation data can originate, for example, from one or more sensors mounted on a vehicle. From the assignment to classes supplied by the neural network, actions of a driver assistance system, or of a system for the at least partially automated driving of the vehicle, that suit the specific traffic situation can then be derived. The neural network can be, for example, a convolutional neural network (CNN) divided into layers.

In the method, an input matrix with input data of the neural network is convolved with a plurality of convolution kernels by means of the acceleration module. This means that, for each position at which the convolution kernel is applied within the input matrix, a weighted sum of the elements of the input matrix covered by the convolution kernel is formed, the weights being given by the elements of the convolution kernel. As the convolution kernel "scans" the input matrix in two dimensions, a large number of such weighted sums are produced, which form an output matrix corresponding to the convolution kernel. Several convolution kernels accordingly yield several such output matrices.
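The weighted sums described above can be illustrated with a minimal pure-Python sketch. It is not the acceleration module itself: the function name `convolve`, the [z][y][x] index layout, stride 1 without padding, and the example data are all assumptions made for illustration.

```python
def convolve(inp, kernel):
    """Apply a kernel at every position of a 3-D input (stride 1, no padding).

    inp:    nested list indexed [z][y][x]
    kernel: nested list indexed [z][j][i], with the same depth as inp
    Returns a two-dimensional output matrix indexed [y][x].
    """
    depth, height, width = len(inp), len(inp[0]), len(inp[0][0])
    kh, kw = len(kernel[0]), len(kernel[0][0])
    out = []
    for y in range(height - kh + 1):
        row = []
        for x in range(width - kw + 1):
            # Weighted sum of the input elements covered by the kernel;
            # the kernel elements are the weights.
            acc = 0
            for z in range(depth):
                for j in range(kh):
                    for i in range(kw):
                        acc += kernel[z][j][i] * inp[z][y + j][x + i]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical example data: one input of depth 1 and two kernels.
input_matrix = [[[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]]]
kernels = [
    [[[1, 0],
      [0, 1]]],
    [[[1, 1],
      [0, 0]]],
]

# One two-dimensional output matrix per convolution kernel.
output_matrices = [convolve(input_matrix, k) for k in kernels]
```

Each kernel contributes one output matrix, so `output_matrices` can be read as a stack along the z-direction.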

The convolution kernels are summed element-wise to form a control kernel. By means of the acceleration module, the input matrix is convolved with the control kernel, so that, analogously to the application of the convolution kernels, a two-dimensional control matrix is produced.

The convolution kernels can, for example, all be of the same size. However, this is not mandatory. If the convolution kernels differ in size, they can, for example, be virtually padded with zeros at the edges to the size of the largest convolution kernel, so that all convolution kernels can then be summed element-wise to form the control kernel.
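The padding and element-wise summation can be sketched as follows. The text only says that zeros are added "at the edges"; padding at the bottom and right edges, so that the top-left element of each kernel stays in place, is an assumption of this sketch, as are the helper names `pad_kernel` and `build_control_kernel` and the use of two-dimensional kernels.

```python
def pad_kernel(kernel, height, width):
    """Zero-pad a 2-D kernel at the bottom and right edges (assumed layout)."""
    padded = []
    for j in range(height):
        row = kernel[j] if j < len(kernel) else []
        padded.append([row[i] if i < len(row) else 0 for i in range(width)])
    return padded


def build_control_kernel(kernels):
    """Sum all kernels element-wise after padding them to a common size."""
    height = max(len(k) for k in kernels)
    width = max(len(k[0]) for k in kernels)
    padded = [pad_kernel(k, height, width) for k in kernels]
    return [[sum(p[j][i] for p in padded) for i in range(width)]
            for j in range(height)]


# Hypothetical kernels of different sizes (3x3 and 2x2).
kernels = [
    [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]],
    [[1, 1],
     [1, 1]],
]
control_kernel = build_control_kernel(kernels)
```

For the check to remain valid, the individual convolutions would have to be evaluated with the padded kernels at the same positions as the control kernel.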

Each element of the control matrix is compared with the sum of the corresponding elements in the output matrices. If, for example, the convolution kernels and the control kernel each "scan" the input matrix in the dimensions x and y, and have the same depth as the input matrix in the third dimension z, then the output matrices corresponding to the convolution kernels, as well as the control matrix, also extend along the dimensions x and y, and the output matrices are "stacked" in the third dimension z. For each pair of coordinates (x, y), the sum of the elements of all output matrices with these coordinates (x, y), i.e. the sum formed along a "column" in the z-direction, should then be equal to the element of the control matrix with the same coordinates (x, y). This follows from the associative law of mathematics and can be illustrated by an analogy: when counting coins, the same amount of money should result regardless of whether the individual values of the coins are added directly, or whether the coins are first bundled into rolls by denomination and the values of the rolls are then added.
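The comparison can be sketched in a few lines of Python. The sketch uses two-dimensional inputs and kernels and exact integer arithmetic; the names `convolve2d` and `find_deviations`, the example data, and the simulated bit flip are assumptions for illustration. With floating-point values, a tolerance would be needed instead of an exact comparison.

```python
def convolve2d(inp, kernel):
    """Stride-1 convolution of a 2-D input with a 2-D kernel, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(kernel[j][i] * inp[y + j][x + i]
                 for j in range(kh) for i in range(kw))
             for x in range(len(inp[0]) - kw + 1)]
            for y in range(len(inp) - kh + 1)]


def find_deviations(outputs, control):
    """Return (x, y, deviation) for every control element that disagrees
    with the sum of the corresponding elements of all output matrices."""
    deviations = []
    for y, row in enumerate(control):
        for x, expected in enumerate(row):
            column_sum = sum(out[y][x] for out in outputs)
            if column_sum != expected:
                deviations.append((x, y, column_sum - expected))
    return deviations


# Hypothetical example data.
input_matrix = [[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]]
kernels = [[[1, 0], [0, 1]],
           [[1, 1], [0, 0]],
           [[2, 0], [1, -1]]]
control_kernel = [[sum(k[j][i] for k in kernels) for i in range(2)]
                  for j in range(2)]

outputs = [convolve2d(input_matrix, k) for k in kernels]
control = convolve2d(input_matrix, control_kernel)

clean = find_deviations(outputs, control)

# Simulate a transient error: flip one bit of one output element.
outputs[1][0][1] ^= 1 << 4
faulty = find_deviations(outputs, control)
```

In the fault-free case `clean` is empty; after the simulated bit flip, `faulty` names the coordinates (x, y) of the affected column and the size of the deviation, but not yet which output matrix is affected.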

In response to this comparison yielding a deviation for an element of the control matrix, at least one additional control calculation is used to check whether an element of at least one output matrix that corresponds to this element of the control matrix has been calculated correctly.

It was recognized that this organization of the error check, in combination with the specific hardware platform mentioned, significantly reduces the additional cost in terms of computing time and memory. Because the same acceleration module is used to calculate the control matrix as is used to calculate the output matrices, this calculation costs very little additional time. Since the aim is to find transient, and thus sporadically occurring, errors, it is to be expected in typical operating environments that the vast majority (over 99 %) of the comparisons will yield no deviation. If these cases are processed with maximum efficiency, then, in return, time can be invested in the additional control calculation in the event of a deviation in order to localize the error more precisely. In principle, neither the specific type of this additional control calculation nor the measures taken to remedy more precisely localized errors are restricted. Rather, the choice of control calculation, or of other measures, can sensibly depend in particular on how much effort the calculation or other measure costs and on how frequently transient errors are to be expected in the specific application.

If a deviation is found, it can in principle be caused by an incorrect calculation of one or more of the elements in the output matrices that correspond to the element of the control matrix, and/or by an incorrect calculation of the element of the control matrix itself. However, precisely for the transient errors that are to be detected in the context of the invention, the probability is very low that

• two transient errors occur with such timing that they affect elements in two output matrices that are spaced apart from one another in the z-direction but have the same coordinates (x, y); or
• two transient errors occur with such timing that they affect both at least one element of an output matrix with coordinates (x, y) and the element of the control matrix with the same coordinates (x, y).

Even if the complete inference calculation has to be repeated in such a case because the error cannot be narrowed down further, the low probability means that this does not cause any noticeable loss of performance in the specific application. Therefore, for the purposes of further narrowing down and correcting transient errors, it can be assumed that

• either exactly one element of an output matrix that has the same coordinates (x, y) as the element of the control matrix currently being examined was calculated incorrectly,
• or the element of the control matrix itself was calculated incorrectly.

On common hardware platforms used for at least partially automated driving, for example, such individual transient errors are to be expected so frequently that completely discarding and repeating the inference calculation would noticeably slow down the specific application, compared with further narrowing down and, where applicable, correcting these errors as described below.

The above and all of the following considerations apply regardless of whether the input matrix comprises the complete input data of the neural network or only a part of it. In many applications, the complete input data of the neural network, and also the complete output matrices generated from it, do not fit into the internal buffer ("on-chip memory") of the hardware platform, so that the hardware platform processes the data piece by piece (in so-called tiles). The results obtained for the individual tiles are then assembled in a larger external memory outside the accelerated hardware platform.

In the convolution with at least one convolution kernel, a bias value corresponding to this convolution kernel can additionally be added to the elements of the output matrix generated with this convolution kernel. The sum of these bias values can then also be added to all elements of the control matrix.
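A brief sketch shows why adding the sum of the bias values to the control matrix keeps the comparison consistent. The helper names and the example kernels and bias values are assumptions; the sketch uses two-dimensional data and integers for exact comparison.

```python
def convolve2d(inp, kernel):
    """Stride-1 convolution of a 2-D input with a 2-D kernel, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(kernel[j][i] * inp[y + j][x + i]
                 for j in range(kh) for i in range(kw))
             for x in range(len(inp[0]) - kw + 1)]
            for y in range(len(inp) - kh + 1)]


def convolve_with_bias(inp, kernel, bias):
    """Convolve and add the same bias value to every output element."""
    return [[value + bias for value in row] for row in convolve2d(inp, kernel)]


# Hypothetical example data.
input_matrix = [[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]]
kernels = [[[1, 0], [0, 1]],
           [[1, 1], [0, 0]]]
biases = [10, -3]

outputs = [convolve_with_bias(input_matrix, k, b)
           for k, b in zip(kernels, biases)]

# Control kernel: element-wise sum of the kernels.
# Control bias:   sum of the individual bias values.
control_kernel = [[sum(k[j][i] for k in kernels) for i in range(2)]
                  for j in range(2)]
control = convolve_with_bias(input_matrix, control_kernel, sum(biases))

# Every control element equals the sum of the corresponding output elements.
consistent = all(sum(out[y][x] for out in outputs) == control[y][x]
                 for y in range(len(control))
                 for x in range(len(control[0])))
```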

In a particularly advantageous embodiment, the additional control calculation is used to check whether a row or column of the at least one output matrix that contains the element to be checked has been calculated correctly. The acceleration module can likewise be used for such a check, although it is not primarily intended for this task. If this reveals that an element of a particular output matrix (i.e. an element with a particular z-coordinate) was not calculated correctly, two conclusions can be drawn at once. First, it is then established that there is actually an error in an output matrix, and that it is not merely the calculation of the element of the control matrix that is wrong. Second, the specific output matrix containing the error, i.e. the z-coordinate of the error, is then also known. Together with the coordinates (x, y) already determined by the first comparison, the error is thus localized to a specific element.

In order to repurpose the acceleration module, as it were, for this task, the input matrix is extended by check elements in a particularly advantageous embodiment. Each of these check elements can, for example, be a simple sum of elements from a particular region of the input matrix. By means of the acceleration module, the check elements are convolved with the convolution kernel that corresponds to the at least one output matrix currently being examined, in order to obtain a control value.

The sum of the elements in the examined row or column is compared with the control value. In response to this comparison yielding a deviation, it is determined that the row or column was not calculated correctly. This also establishes that the element of the output matrix originally to be checked was not calculated correctly.
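One possible construction of the check elements for a row check is sketched below. The text does not fix the exact regions that are summed; the layout used here is an assumption. For output row y, check element (j, i) is the sum of the input elements in input row y + j that kernel weight (j, i) touches while the kernel moves along that output row. Convolving this kernel-sized patch with the kernel at a single position then yields the sum of the output row. All names and data are hypothetical, and the sketch is two-dimensional.

```python
def convolve2d(inp, kernel):
    """Stride-1 convolution of a 2-D input with a 2-D kernel, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(kernel[j][i] * inp[y + j][x + i]
                 for j in range(kh) for i in range(kw))
             for x in range(len(inp[0]) - kw + 1)]
            for y in range(len(inp) - kh + 1)]


def row_check_elements(inp, y, kh, kw):
    """Kernel-sized patch of sums over the input region feeding output row y."""
    out_width = len(inp[0]) - kw + 1
    return [[sum(inp[y + j][x + i] for x in range(out_width))
             for i in range(kw)]
            for j in range(kh)]


def row_is_correct(inp, kernel, output, y):
    """Compare the sum of output row y with the control value."""
    patch = row_check_elements(inp, y, len(kernel), len(kernel[0]))
    control_value = convolve2d(patch, kernel)[0][0]  # a single convolution step
    return sum(output[y]) == control_value


# Hypothetical example data.
input_matrix = [[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16]]
kernel = [[1, 2],
          [3, 4]]
output = convolve2d(input_matrix, kernel)

rows_before = [row_is_correct(input_matrix, kernel, output, y) for y in range(3)]

output[1][2] += 8  # simulated transient error in row 1
rows_after = [row_is_correct(input_matrix, kernel, output, y) for y in range(3)]
```

The same construction works for a column check if the sums are formed along y instead of x.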

If it is determined that an element of an output matrix was indeed not calculated correctly, this element can be corrected by the deviation determined in the comparison. As explained above, it can be assumed that exactly one error is present. Therefore, the original comparison with the element of the control matrix and the comparison with the control value both yield the same deviation.

Since, in terms of probability, only a single error has to be expected, the search for further errors can be stopped as soon as a first error has been found.

However, it can also happen that the control calculations find all elements corresponding to the element of the control matrix (i.e. the elements with the same coordinates (x, y)) in all output matrices to be correct. It can then be determined that the element of the control matrix was not calculated correctly. That is, the original calculation of the output matrices was correct, and the single transient error to be expected occurred only during the subsequent calculation of the control matrix. The calculation can then continue normally with the output matrices in accordance with the intended application. Apart from that, the error in the calculation of the control matrix can be ignored.
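Under the single-error assumption, the localization and correction logic described above can be sketched as follows. The function names, the returned strings, and the example data are assumptions; the row control value is computed directly here rather than on an acceleration module, and the data are two-dimensional integers.

```python
import copy


def convolve2d(inp, kernel):
    """Stride-1 convolution of a 2-D input with a 2-D kernel, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(kernel[j][i] * inp[y + j][x + i]
                 for j in range(kh) for i in range(kw))
             for x in range(len(inp[0]) - kw + 1)]
            for y in range(len(inp) - kh + 1)]


def row_control_value(inp, kernel, y):
    """Expected sum of output row y, computed from sums of input elements."""
    kh, kw = len(kernel), len(kernel[0])
    out_width = len(inp[0]) - kw + 1
    return sum(kernel[j][i] * sum(inp[y + j][x + i] for x in range(out_width))
               for j in range(kh) for i in range(kw))


def localize_and_correct(inp, kernels, outputs, control, x, y):
    """Handle a possible deviation at control element (x, y)."""
    deviation = sum(out[y][x] for out in outputs) - control[y][x]
    if deviation == 0:
        return "no deviation"
    for z, kernel in enumerate(kernels):
        if sum(outputs[z][y]) != row_control_value(inp, kernel, y):
            outputs[z][y][x] -= deviation  # correct by the observed deviation
            return f"corrected output matrix {z}"  # stop after the first error
    # All output rows are consistent, so the control element itself is wrong.
    return "control matrix element faulty"


# Hypothetical example data.
input_matrix = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernels = [[[1, 0], [0, 1]], [[1, 1], [0, 0]], [[2, 0], [1, -1]]]
control_kernel = [[sum(k[j][i] for k in kernels) for i in range(2)]
                  for j in range(2)]
outputs = [convolve2d(input_matrix, k) for k in kernels]
control = convolve2d(input_matrix, control_kernel)
reference = copy.deepcopy(outputs)

outputs[2][1][0] += 32  # simulated transient error in output matrix 2
result_output_fault = localize_and_correct(
    input_matrix, kernels, outputs, control, x=0, y=1)

control[0][1] -= 4  # simulated transient error in the control matrix
result_control_fault = localize_and_correct(
    input_matrix, kernels, outputs, control, x=1, y=0)
```

In the first case the faulty element is restored to its correct value; in the second case the output matrices are left untouched and can be used as they are.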

The considerations so far assumed that only one transient error is present at any time. An accumulation of errors can, however, be a sign that the errors are no longer completely random transient errors, but that a hardware component or a memory location is beginning to fail. If, for example, interdiffusion takes place in a semiconductor at a pn junction between a hole-doped layer and an electron-doped layer as a result of overheating or aging, the amount of energy required to flip a bit in the memory can be lower than in the normal state, and gamma quanta or charged particles from the background radiation, for example, are then more likely to supply this amount of energy. The errors then still occur at random points in time, but they increasingly accumulate at the hardware component or memory cell with the damaged pn junction.

Therefore, in a further particularly advantageous embodiment, in response to one of the comparisons yielding a deviation, an error counter is incremented with respect to at least one hardware component or at least one memory area that comes into consideration as the cause of the deviation. The error counters of comparable components can then be compared with one another, for example as part of general maintenance. If one of several nominally identical hardware components stands out with a conspicuously elevated error counter, a defect in this hardware component may be developing.

In particular, for example, in response to the determination that the error counter exceeds a predetermined threshold value, the hardware component or the memory area can be recognized as defective. In response to this, the hardware platform can, for example, be reconfigured so that a reserve hardware component or a reserve memory area is used for further calculations in place of the hardware component or memory area recognized as defective. Providing such reserves can be useful in particular for fully automated driving of vehicles, in which a driver is no longer expected to take over control even in the event of a fault. In the event of a defect, the vehicle can then still reach a workshop ("limp-home mode") and does not have to be towed at great expense.
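The counter, threshold and reserve logic described above can be sketched as follows. This is a minimal illustration only: the class name `FaultMonitor`, the unit names and the threshold value are hypothetical and do not come from the application, which leaves open how components are identified and how the platform is reconfigured.

```python
from collections import Counter


class FaultMonitor:
    """Counts deviations per hardware unit and swaps in a reserve unit
    once a unit's counter exceeds the threshold (all names hypothetical)."""

    def __init__(self, threshold, spares):
        self.threshold = threshold   # assumed limit on tolerated deviations
        self.spares = list(spares)   # unused reserve units
        self.counts = Counter()      # error counter per unit
        self.remap = {}              # defective unit -> reserve unit

    def resolve(self, unit):
        """Return the unit that should be used for further calculations."""
        return self.remap.get(unit, unit)

    def report_deviation(self, unit):
        """Increment the counter; reconfigure if the threshold is exceeded."""
        self.counts[unit] += 1
        exceeded = self.counts[unit] > self.threshold
        if exceeded and unit not in self.remap and self.spares:
            self.remap[unit] = self.spares.pop(0)
        return self.resolve(unit)


monitor = FaultMonitor(threshold=2, spares=["mac_spare_0"])
for _ in range(3):
    active = monitor.report_deviation("mac_1")
print(active)  # mac_spare_0
```

Comparing `monitor.counts` across nominally identical units during maintenance would correspond to the counter comparison mentioned above.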

Advantageously, optical image data, thermal image data, video data, radar data, ultrasound data, and/or LIDAR data are provided as input data. These are the most important types of measurement data by which at least partially automated vehicles orient themselves in traffic. The measurement data can be obtained by a physical measurement process, and/or by a partial or complete simulation of such a measurement process, and/or by a partial or complete simulation of a technical system that can be observed with such a measurement process. For example, photorealistic images of situations can be generated by computational tracing of light rays ("ray tracing") or with neural generator networks (such as Generative Adversarial Networks, GAN). Findings from the simulation of a technical system, such as the positions of certain objects, can also be introduced as constraints. The generator network can be trained to specifically generate images that satisfy these constraints (e.g. Conditional GAN, cGAN).

The output matrices can be processed into a control signal. A vehicle, and/or a system for quality control of mass-produced products, and/or a system for medical imaging, and/or an access control system can then be controlled with this control signal. In this context, the error check described above has the effect of advantageously avoiding sporadic malfunctions that come "out of nowhere" without a specific cause and would therefore normally be extremely difficult to diagnose.

In particular, the methods can be wholly or partially computer-implemented. The invention therefore also relates to a computer program with machine-readable instructions which, when executed on one or more computers, cause the computer or computers to carry out one of the described methods. In this sense, control units for vehicles and embedded systems for technical devices that are likewise able to execute machine-readable instructions are also to be regarded as computers.

The invention likewise relates to a machine-readable data carrier and/or to a download product with the computer program. A download product is a digital product that can be transmitted via a data network, i.e. downloaded by a user of the data network, and that can be offered for immediate download in an online shop, for example.

Furthermore, a computer can be equipped with the computer program, with the machine-readable data carrier, or with the download product.

Further measures improving the invention are presented in more detail below, together with the description of the preferred exemplary embodiments of the invention, with reference to figures.

Embodiments

The figures show:

Figure 1: Embodiment of the method 100;
Figure 2: Fast determination of a control matrix 5 with a control kernel 4;
Figure 3: Exact localization of an error based on rows (Figure 3a) or columns (Figure 3b) 3a#-3c# of the output matrices 3a-3c.

Figure 1 is a schematic flow diagram of an embodiment of the method 100. According to step 105, those data types that are most important specifically for the orientation of an at least partially automated vehicle in road traffic can be provided as input data in the input matrix 1.

In step 110, the input matrix 1, which is three-dimensional in this example, is convolved with the convolution kernels 2a-2c, which are likewise three-dimensional in this example; each convolution produces a two-dimensional output matrix 3a-3c. In step 120, the convolution kernels 2a-2c are summed element-wise to form a control kernel 4. In step 130, the input matrix 1 is convolved with the control kernel 4, so that a two-dimensional control matrix 5 is created.
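Steps 110 to 130 can be illustrated with a small software sketch. It is only an assumption-laden stand-in for the acceleration module: the function names `conv` and `control_kernel`, the `[channel][row][column]` layout, the unpadded stride-1 convolution and the toy integer data are all chosen here for illustration and are not specified in the application.

```python
def conv(inp, kernel):
    """Unpadded stride-1 convolution (cross-correlation form) of a 3D input
    [c][y][x] with a 3D kernel [c][i][j]; returns a 2D matrix."""
    C, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    H, W = len(inp[0]), len(inp[0][0])
    return [[sum(kernel[c][i][j] * inp[c][y + i][x + j]
                 for c in range(C) for i in range(kh) for j in range(kw))
             for x in range(W - kw + 1)]
            for y in range(H - kh + 1)]


def control_kernel(kernels):
    """Element-wise sum of all convolution kernels (step 120)."""
    C, kh, kw = len(kernels[0]), len(kernels[0][0]), len(kernels[0][0][0])
    return [[[sum(k[c][i][j] for k in kernels) for j in range(kw)]
             for i in range(kh)] for c in range(C)]


# Toy data: 2 channels, 3x3 input, three 2x2 kernels.
# Integers keep the comparison exact.
inp = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
       [[0, 1, 0], [1, 0, 1], [0, 1, 0]]]
kernels = [
    [[[1, 0], [0, 1]], [[1, 1], [1, 1]]],
    [[[0, 2], [2, 0]], [[-1, 0], [0, -1]]],
    [[[1, -1], [1, -1]], [[0, 3], [0, 0]]],
]

outputs = [conv(inp, k) for k in kernels]        # step 110
control = conv(inp, control_kernel(kernels))     # steps 120 and 130
expected = [[sum(o[y][x] for o in outputs) for x in range(2)]
            for y in range(2)]
print(control == expected)  # True
```

The check costs one extra convolution regardless of how many kernels are used, which is where the efficiency of the approach comes from.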

In step 140, each element 5* of the control matrix 5 is compared with the sum of the corresponding elements 3a*-3c* in the output matrices 3a-3c. In step 150, it is checked whether this comparison 140 yields a deviation. If this is the case (truth value 1), it is checked in step 160 whether an element 3a*-3c* of at least one output matrix 3a-3c corresponding to this element 5* of the control matrix 5 was calculated correctly.
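A minimal sketch of the comparison in steps 140 and 150 is shown below. The function name `find_deviations` and the tolerance parameter `tol` are assumptions made for this illustration; the application itself does not say how equality is tested, and a nonzero tolerance would only matter for floating-point data.

```python
def find_deviations(control, outputs, tol=0):
    """Return (y, x, difference) for every position where the control
    element differs from the sum of the corresponding output elements."""
    deviations = []
    for y, row in enumerate(control):
        for x, c in enumerate(row):
            total = sum(o[y][x] for o in outputs)
            if abs(c - total) > tol:
                deviations.append((y, x, c - total))
    return deviations


# Hand-made example values: three 2x2 output matrices and their control matrix.
outputs = [[[8, 3], [1, 4]], [[2, 2], [0, 5]], [[-1, 6], [7, 1]]]
control = [[9, 11], [8, 10]]
print(find_deviations(control, outputs))   # []

outputs[1][0][1] += 4                      # simulate a transient error
print(find_deviations(control, outputs))   # [(0, 1, -4)]
```

The comparison alone reports where a deviation occurred, but not which of the output matrices (or the control matrix itself) is responsible; that is the job of the additional control calculation.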

If it is determined in step 170 that an element 3a*-3c* of an output matrix 3a-3c was not calculated correctly, this element can be corrected in step 180 by the deviation determined in the comparison.

However, it is also possible that, according to step 190, the elements of all output matrices 3a-3c corresponding to the element 5* of the control matrix 5 were checked as to whether they were calculated correctly, and that it was determined according to step 200 that all of these elements 3a*-3c* were calculated correctly (truth value 1). It is then determined in step 210 that the element 5* of the control matrix 5 was not calculated correctly, while the output matrices 3a-3c are all correct.
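The decision between correcting an output element (steps 170 and 180) and attributing the error to the control matrix (steps 190 to 210) might be sketched as follows. The function name, the returned labels and the `verified_ok` flags (standing in for the result of the additional control calculation) are hypothetical. The third branch for several faulty elements is an addition of this sketch, since the description assumes a single transient error.

```python
def resolve_deviation(control_value, output_values, verified_ok):
    """output_values: elements 3a*-3c* at one position (x, y).
    verified_ok: per output matrix, whether the additional check passed."""
    deviation = control_value - sum(output_values)
    faulty = [i for i, ok in enumerate(verified_ok) if not ok]
    if not faulty:
        # All output elements verified: the control element itself was wrong.
        return "control_element_faulty", list(output_values)
    if len(faulty) == 1:
        # Single faulty output element: correct it by the deviation.
        fixed = list(output_values)
        fixed[faulty[0]] += deviation
        return "output_element_corrected", fixed
    # Outside the single-error assumption; no correction attempted here.
    return "multiple_faults", list(output_values)


print(resolve_deviation(11, [3, 6, 6], [True, False, True]))
# ('output_element_corrected', [3, 2, 6])
print(resolve_deviation(7, [3, 2, 6], [True, True, True]))
# ('control_element_faulty', [3, 2, 6])
```

Correcting by the deviation only restores the right value if the control element was computed correctly, which is exactly what the single-error assumption provides.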

If this is the case, or if any error was corrected in step 180, the output matrices 3a-3c are ready for further evaluation. According to step 270, these output matrices 3a-3c can in particular be processed into a control signal 6. According to step 280, a vehicle 50, and/or a classification system 60, and/or a system 70 for quality control of mass-produced products, and/or a system 80 for medical imaging, and/or an access control system 90 can then be controlled with this control signal 6.

If, on the other hand, it is determined in step 220 that an output matrix 3a-3c was not calculated correctly, an error counter can be incremented according to step 230 with respect to at least one hardware component or at least one memory area that comes into consideration as the cause of the deviation. If it is then determined in step 240 that the error counter exceeds a predetermined threshold value (truth value 1), the hardware component or memory area can be recognized as defective in step 250. The hardware platform can then be reconfigured in step 260 so that a reserve hardware component or a reserve memory area is used for further calculations in place of the hardware component or memory area recognized as defective.

Box 110 shows one possible configuration of the convolution with the convolution kernels 2a-2c: according to block 111, during the convolution a first bias value 7a is added to the values of the first output matrix 3a, a second bias value 7b to the values of the second output matrix 3b, and a third bias value 7c to the values of the third output matrix 3c. According to block 112, the sum 7a+7b+7c of these bias values 7a, 7b, 7c is also added to all elements of the control matrix 5.
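The effect of blocks 111 and 112 can be checked with a small sketch. For brevity it assumes single-channel two-dimensional data rather than the three-dimensional example of the description; the function name `conv2d` and all numeric values are made up for illustration.

```python
def conv2d(inp, kernel, bias=0):
    """Unpadded stride-1 2D convolution (cross-correlation form) plus a bias."""
    kh, kw = len(kernel), len(kernel[0])
    return [[bias + sum(kernel[i][j] * inp[y + i][x + j]
                        for i in range(kh) for j in range(kw))
             for x in range(len(inp[0]) - kw + 1)]
            for y in range(len(inp) - kh + 1)]


inp = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernels = [[[1, 0], [0, 1]], [[0, 2], [2, 0]], [[1, -1], [1, -1]]]
biases = [5, -2, 7]                        # stand-ins for bias values 7a, 7b, 7c

# Block 111: each output matrix receives its own bias value.
outputs = [conv2d(inp, k, b) for k, b in zip(kernels, biases)]

# Block 112: the control matrix receives the sum of all bias values.
control_k = [[sum(k[i][j] for k in kernels) for j in range(2)]
             for i in range(2)]
control = conv2d(inp, control_k, sum(biases))

summed = [[sum(o[y][x] for o in outputs) for x in range(2)] for y in range(2)]
print(control == summed)  # True
```

Without block 112 the control matrix would differ from the summed outputs by the constant 7a+7b+7c at every position, so every comparison would report a false deviation.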

According to block 161, the additional control calculation 160 can in particular check whether a row or column 3a#-3c# of the at least one output matrix 3a-3c that contains the element 3a*-3c* to be checked was calculated correctly. This is illustrated in more detail in Figure 3.

For this check, the acceleration module of the hardware platform that is provided for the convolution can, for example, be repurposed. To this end, according to block 162, the input matrix 1 is extended by checking elements 11. According to block 163, the checking elements 11 are then convolved by means of the acceleration module with the convolution kernel 2a-2c that corresponds to the at least one output matrix 3a-3c in order to obtain a control value 31. According to block 164, the sum of the elements in the row or column 3a#-3c# is compared with the control value 31. If it is determined in block 165 that this comparison yields a deviation (truth value 1), it is determined in block 166 that the row or column 3a#-3c# was not calculated correctly and that the element 3a*-3c* of the output matrix 3a-3c to be checked was therefore also not calculated correctly.
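One way such a row check could work is sketched below. The construction of the checking elements is an assumption of this sketch, not taken from the passage above: for each kernel offset, the inputs that this offset touches while the kernel slides along the row are summed, so that a single additional kernel application yields the row sum. The sketch also uses single-channel two-dimensional data, and the names `conv2d`, `row_check_patch` and `row_is_correct` are hypothetical.

```python
def conv2d(inp, kernel):
    """Unpadded stride-1 2D convolution (cross-correlation form)."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(kernel[i][j] * inp[y + i][x + j]
                 for i in range(kh) for j in range(kw))
             for x in range(len(inp[0]) - kw + 1)]
            for y in range(len(inp) - kh + 1)]


def row_check_patch(inp, y, kh, kw):
    """Assumed checking elements: for each kernel offset (i, j), the sum of
    the inputs touched by that offset along output row y."""
    w_out = len(inp[0]) - kw + 1
    return [[sum(inp[y + i][x + j] for x in range(w_out)) for j in range(kw)]
            for i in range(kh)]


def row_is_correct(inp, kernel, out_row, y, tol=0):
    """Compare the row sum with the control value from one extra convolution."""
    patch = row_check_patch(inp, y, len(kernel), len(kernel[0]))
    control_value = conv2d(patch, kernel)[0][0]   # plays the role of value 31
    return abs(sum(out_row) - control_value) <= tol


inp = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
kernel = [[1, -1], [2, 0]]
out = conv2d(inp, kernel)
print(row_is_correct(inp, kernel, out[0], 0))   # True

out[0][1] += 8                                  # simulate a transient error
print(row_is_correct(inp, kernel, out[0], 0))   # False
```

A column check would follow the same pattern with the roles of `x` and `y` exchanged.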

Figure 2 illustrates how the first check for possible calculation errors can be made particularly efficient on the hardware platform with the acceleration module by using a control kernel 4. Convolving the input matrix 1 with each of the convolution kernels 2a-2c produces the output matrices 3a-3c. The control kernel 4 is formed by summing the convolution kernels 2a-2c element-wise. If the input matrix 1 is convolved with the control kernel 4, the result is a control matrix 5 that is exactly the same size as the output matrices 3a-3c. Each element 5* of the control matrix 5 should be equal to the sum of the corresponding elements 3a*-3c* of the output matrices 3a-3c with the same coordinates (x, y) in the plane of the respective output matrix 3a-3c.
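The expected equality follows from the linearity of the convolution. The symbols below are introduced here only for illustration and do not appear in the application: I stands for the input matrix 1, K_a, K_b, K_c for the convolution kernels 2a-2c, O_a, O_b, O_c for the output matrices 3a-3c, C for the control matrix 5, m for the channel index, and (i, j) for the position within the kernel.

```latex
\begin{aligned}
C(x,y) &= \sum_{m,i,j} \bigl(K_a + K_b + K_c\bigr)(m,i,j)\; I(m,\, y+i,\, x+j) \\
       &= \sum_{k \in \{a,b,c\}} \; \sum_{m,i,j} K_k(m,i,j)\; I(m,\, y+i,\, x+j) \\
       &= O_a(x,y) + O_b(x,y) + O_c(x,y)
\end{aligned}
```

In exact arithmetic the two sides agree at every position (x, y), so any difference points to a calculation error in one of the four convolutions.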

Figure 3 illustrates the further control calculation with which, according to block 161, a possible error can be narrowed down further.

Figure 3a assumes that the element 5* in the upper left corner of the control matrix 5 does not match the sum of the corresponding elements 3a*-3c* of the output matrices 3a-3c. For each of the output matrices 3a-3c, it is then checked whether the respective row 3a#-3c# that contains the corresponding element 3a*-3c* was calculated correctly. As explained above, this can be checked faster than each element 3a*-3c* could be recalculated individually.

In the example shown in Figure 3a, this control calculation reveals that row 3b# of the output matrix 3b was not calculated correctly. It is thus established that element 3b* was not calculated correctly, and a corresponding correction can be made.

As illustrated in Figure 3b, the process runs in a completely analogous manner when the columns 3a#-3c# of the output matrices 3a-3c that contain the respective element 3a*-3c* to be checked are checked for correct calculation.

CITATIONS INCLUDED IN THE DESCRIPTION

This list of the documents cited by the applicant was generated automatically and is included solely for the better information of the reader. The list is not part of the German patent or utility model application. The DPMA assumes no liability for any errors or omissions.

Patent literature cited

DE 102018202095 A1 [0004]

Claims

Method (100) for operating a hardware platform for the inference calculation of a convolutional neural network, this hardware platform having at least one acceleration module which is specialized in calculating the convolution of an input matrix (1) with a convolution kernel (2a-2c) by applying this convolution kernel (2a-2c) at different positions within the input matrix (1) and in outputting the result of this convolution as a two-dimensional output matrix (3a-3c), with the steps:
• an input matrix (1) with input data of the neural network is convolved (110) by means of the acceleration module with a plurality of convolution kernels (2a-2c), so that a plurality of two-dimensional output matrices (3a-3c) is produced;
• the convolution kernels (2a-2c) are summed (120) element-wise to form a control kernel (4);
• the input matrix (1) is convolved (130) with the control kernel (4) by means of the acceleration module, so that a two-dimensional control matrix (5) is created;
• each element (5*) of the control matrix (5) is compared (140) with the sum of the corresponding elements (3a*-3c*) in the output matrices (3a-3c);
• in response to this comparison (140) yielding (150) a deviation for an element (5*) of the control matrix (5), at least one additional control calculation is used to check (160) whether an element (3a*-3c*) of at least one output matrix (3a-3c) corresponding to this element (5*) of the control matrix (5) has been calculated correctly.

Method (100) according to Claim 1, wherein, in the convolution (110) with at least one convolution kernel (2a-2c), a bias value (7a-7c) associated with this convolution kernel (2a-2c) is added (111) to the elements of the output matrix (3a-3c) generated with this convolution kernel (2a-2c), and the sum of all bias values (7a-7c) is also added (112) to all elements of the control matrix (5).

Method (100) according to one of Claims 1 to 2, wherein the additional control calculation (160) checks (161) whether a row or column (3a#-3c#) of the at least one output matrix (3a-3c) containing the element (3a*-3c*) to be checked was calculated correctly.

Method (100) according to Claim 3, wherein, in the context of the control calculation:
• the input matrix (1) is extended (162) by checking elements (11);
• the checking elements (11) are convolved (163) by means of the acceleration module with the convolution kernel (2a-2c) that corresponds to the at least one output matrix (3a-3c), in order to obtain a control value (31);
• the sum of the elements in the row or column (3a#-3c#) is compared (164) with the control value (31); and
• in response to this comparison (164) yielding (165) a deviation, it is determined (166) that the row or column (3a#-3c#) was not calculated correctly and that the element (3a*-3c*) of the output matrix (3a-3c) to be checked was therefore also not calculated correctly.

Method (100) according to one of Claims 1 to 4, wherein, in response to the determination (166, 170) that an element (3a*-3c*) of an output matrix (3a-3c) was not calculated correctly, this element (3a*-3c*) is corrected (180) by the deviation determined in the comparison.

Method (100) according to one of Claims 1 to 5, wherein the elements (3a*-3c*) of all output matrices (3a-3c) corresponding to the element (5*) of the control matrix (5) are checked (190) as to whether they were calculated correctly, and wherein, in response to the determination (200) that all of these elements (3a*-3c*) were calculated correctly, it is determined (210) that the element (5*) of the control matrix (5) was not calculated correctly.

Method (100) according to one of Claims 1 to 6, wherein, in response to one of the comparisons (140, 164) yielding a deviation (220), an error counter is incremented (230) with respect to at least one hardware component or at least one memory area that comes into consideration as the cause of the deviation.

Method (100) according to Claim 7, wherein, in response to the determination (240) that the error counter exceeds a predetermined threshold value, the hardware component or the memory area is recognized (250) as defective.

Method (100) according to Claim 8, wherein the hardware platform is reconfigured (260) so that a reserve hardware component or a reserve memory area is used for further calculations in place of the hardware component or memory area recognized as defective.

Method (100) according to one of Claims 1 to 9, wherein optical image data, thermal image data, video data, radar data, ultrasound data, and/or LIDAR data that were obtained by a physical measurement process, and/or by a partial or complete simulation of such a measurement process, and/or by a partial or complete simulation of a technical system observable with such a measurement process, are provided (105) as input data.

Method (100) according to one of Claims 1 to 10, wherein the output matrices (3a-3c) are processed (270) into a control signal (6), and wherein a vehicle (50), and/or a system (70) for quality control of mass-produced products, and/or a system (80) for medical imaging, and/or an access control system (90) is controlled (280) with this control signal (6).

Computer program containing machine-readable instructions which, when executed on one or more computers, cause the computer or computers to carry out a method (100) according to one of Claims 1 to 11.

Machine-readable data carrier with the computer program according to Claim 12.

Computer equipped with the computer program according to Claim 12, and/or with the machine-readable data carrier according to Claim 13.