DE102017218914A1

DE102017218914A1 - Method for recognizing persons

Info

Publication number: DE102017218914A1
Application number: DE102017218914.2A
Authority: DE
Inventors: Gregor Blott
Original assignee: Robert Bosch GmbH
Current assignee: Robert Bosch GmbH
Priority date: 2017-10-24
Filing date: 2017-10-24
Publication date: 2019-04-25

Abstract

The present invention relates to a method for recognizing persons, comprising: capturing an image (1) of a person by means of a camera (2); extracting individual image features from the image (1) and describing the image features by means of descriptors (7); determining a descriptor distance (8) between the descriptors (7) of the image (1) and previously known descriptors (7) of at least one previously known reference image (3) of a previously known image collection (4) by means of a previously known metric; and outputting either a probability value, determined on the basis of the descriptor distance (8), that the person is depicted in the reference image (3), or an ordering of a plurality of reference images (3) that reflects an increasing descriptor distance (8) between the descriptor (7) of the image (1) and the respective previously known descriptor (7) of the reference image (3). At least one image acquisition parameter of the camera (2) during capture of the image (1) is compared with at least one corresponding image acquisition parameter during capture of the reference image (3) in order to determine a difference value between the image acquisition parameters, and, when the descriptor distance (8) is determined, those reference images (3) for which the difference value is greater than a predefined threshold value are disregarded.

Description

State of the art

The present invention relates to a method for recognizing persons, in particular for re-identifying persons of whom an image has already been captured. The invention also relates to a corresponding computer program product and to a system comprising a camera and a control unit for carrying out such a method.

Person re-identification algorithms are known from the prior art; they are also referred to as "person re-identification" (PRID). Persons at close range to a camera can be recognized with high accuracy by their faces, a technique known as "face recognition". In the extended near range or far range of the camera, these algorithms cannot work robustly because the faces are imaged too small for reliable recognition. Nor can the methods be used robustly when persons turn away from the camera, whether deliberately or by chance. Not only for this reason, increased research activity can be observed on person re-identification algorithms that attempt to recognize the whole person by their color and texture features (appearance) rather than by the face alone.

WO 2015/192997 A1 discloses a method for person identification in which a person is detected by camera and a positive identification is present only if the person's eye position is identified.

Disclosure of the invention

The method according to the invention for recognizing persons allows reliable recognition because, in particular, the image acquisition parameters are used. If an analysis of the image acquisition parameters shows that an image of a person was captured with entirely different image acquisition parameters than a previously known reference image of a previously known image collection, a comparison by means of descriptors is not meaningful: the images may show the same person, but this is not immediately apparent because of the differing image acquisition parameters. To minimize the risk of misrecognition, it is therefore provided either that previously known reference images whose image acquisition parameters deviate too much from those of the image of the person to be recognized are disregarded, or that the image acquisition parameters are explicitly taken into account in the image evaluation and the descriptor is determined with the image acquisition parameters included.

The method for recognizing persons comprises the following steps. First, an image of a person is captured by means of a camera. The person in the image is then to be re-identified from a previously known image collection. To this end, the next step is to extract individual image features from the image and to describe the image features by means of descriptors. This has already been explained at the outset. Subsequently, a descriptor distance between the descriptors of the image and previously known descriptors of at least one previously known reference image of a previously known image collection is determined in a descriptor space by means of a previously known metric. The metric is, in particular, learned and/or optimized by a self-learning system. A probability value that the person is depicted in the reference image, or a ranking or distance measure, determined on the basis of the descriptor distance, is then output.

According to the invention, at least one image acquisition parameter of the camera during capture of the image is compared with at least one corresponding image acquisition parameter during capture of the reference image. A difference value between the image acquisition parameters can thus be determined. According to the invention, it is further provided that, when the descriptor distance is determined, those reference images are disregarded for which the previously determined difference value between the image acquisition parameters is greater than a predefined threshold value. Particularly advantageously, the image is compared with several reference images of the image collection, so that the descriptor distances of a plurality of descriptors are determined. If the descriptor distance between two descriptors falls below a predefined limit distance and/or is the smallest descriptor distance, a high probability value is assumed that the person is depicted in the reference image. Since significantly differing image acquisition parameters can lead to an entirely different depiction of the same scene, there is a risk that, owing to entirely different image acquisition parameters, different features yield descriptors that are only a small descriptor distance apart. There is thus a risk of erroneously recognizing the person in a reference image that does not show the person at all. To prevent this, reference images whose image acquisition parameters deviate from those of the image are not taken into account.
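The filtering and ranking described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described in the application: the `Reference` record, the relative-deviation `param_difference`, and the Euclidean stand-in for the learned metric are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Reference:
    """Hypothetical record for one reference image of the collection."""
    name: str
    descriptor: list   # list of floats
    params: dict       # e.g. {"exposure_ms": 10.0, "gain_db": 6.0}


def param_difference(a, b):
    """Assumed difference value: largest relative deviation over shared parameters."""
    keys = a.keys() & b.keys()
    return max(
        (abs(a[k] - b[k]) / max(abs(a[k]), abs(b[k]), 1e-9) for k in keys),
        default=0.0,
    )


def euclidean(u, v):
    """Stand-in for the previously known (learned) metric."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def rank_references(query_descriptor, query_params, references, threshold,
                    metric=euclidean):
    """Drop references whose difference value exceeds the threshold,
    then order the rest by increasing descriptor distance."""
    kept = [r for r in references
            if param_difference(query_params, r.params) <= threshold]
    return sorted((metric(query_descriptor, r.descriptor), r.name) for r in kept)
```

For example, a reference captured at four times the exposure time of the query would be excluded with `threshold=0.25`, however close its descriptor happens to be.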

It is further proposed that, in deep-learning-based systems, the image acquisition parameters be integrated as additional input data. During training and execution of the artificial intelligence, the network can then decide and learn on its own whether a camera control adjustment has taken place and whether a possible re-identification must be ruled out or carried out differently than usual.

The dependent claims relate to preferred developments of the invention.

It is preferably provided that the metric is varied as a function of the difference value. This makes it possible, in particular, to use a different metric when the image acquisition parameters indicate that a deviation is present. In that case the white balance may have changed, for example, so that white colors in the image must not be compared directly with white colors in the reference image. This is accounted for, in particular, by a special metric. Varying the metric as a function of the difference value thus enables optimized re-identification of the persons in the reference images. The disregarding of reference images when the difference values are too large remains, in particular, unaffected. It should also be noted that the white balance is named above merely as an example standing for the image acquisition parameters in general. Preferably, the image acquisition parameters are used to establish whether the camera control has not yet reached a steady state. In this case, person re-identification should be refrained from, since the colors delivered by the camera do not, for example, correspond to those of the real world. As soon as the steady state is reached, colors can again be compared more reliably. Current prior-art methods do not take the state of the controller into account.
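The two ideas in this paragraph, varying the metric with the difference value and waiting for the camera control to settle, can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the text does not specify how the metric is varied or how a steady state is detected. The sketch down-weights the color part of a descriptor as the white-balance difference grows and treats a parameter as settled when it has barely changed over a recent window; `sensitivity`, `window`, and `tolerance` are hypothetical tuning values.

```python
from math import sqrt


def adapted_distance(query, reference, n_color, wb_difference, sensitivity=4.0):
    """Weighted Euclidean distance (assumed form of a 'special metric').

    The first n_color entries of each descriptor are taken to be color
    features; their weight shrinks as the white-balance difference grows.
    """
    weight = 1.0 / (1.0 + sensitivity * wb_difference)
    color = sum((a - b) ** 2 for a, b in zip(query[:n_color], reference[:n_color]))
    texture = sum((a - b) ** 2 for a, b in zip(query[n_color:], reference[n_color:]))
    return sqrt(weight * color + texture)


def is_settled(history, window=5, tolerance=0.02):
    """Return True if one parameter (newest value last) varied by at most
    `tolerance` (relative) over the last `window` samples."""
    if len(history) < window:
        return False
    recent = history[-window:]
    lo, hi = min(recent), max(recent)
    scale = max(abs(lo), abs(hi), 1e-9)
    return (hi - lo) / scale <= tolerance
```

A caller would skip re-identification while `is_settled` returns `False` for, say, the exposure-time history, and switch to `adapted_distance` once a stable but different white balance is reported.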

In particular, the description of the image features of the image is additionally varied on the basis of the difference value. This allows, in particular, the descriptors to be adapted to the image acquisition parameters of the image, which have changed compared with the reference image. It thereby again accounts for the fact that, owing to the differing image acquisition parameters, the same persons are depicted differently in the reference image and in the image. The same advantageously applies to other image-processing steps carried out during the person recognition algorithm, such as post-re-ranking or person matching.

In a further preferred embodiment, the image is output together with the reference image if the difference value exceeds the threshold value. This enables a person, for example a security guard, to compare the image and the reference image manually and thus decide whether or not they show the same person. A manual decision by a person can therefore be requested when automatic recognition is not possible.

The image acquisition parameters comprise, in particular, an exposure time and/or a white balance and/or a dynamic range compression (tone mapping) and/or a signal amplification (gain) and/or an aperture. In modern surveillance cameras, all of these image acquisition parameters are regulated continuously so that an optimal image can always be captured. The surveillance camera thus adapts to changing ambient conditions. This leads to different images being captured by the camera even though the same scene is always shown. The extreme case occurs when the camera is in the middle of an adjustment and has not yet reached the steady state. There is therefore also a risk that the image acquisition parameters of images of persons differ from the image acquisition parameters of reference images. To take this discrepancy into account, it has already been described above that the difference value between the image acquisition parameters of the image and of the reference image is calculated in order to determine, on the basis of the difference value, whether or not a comparison of reference image and image is meaningful.
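One possible way to represent the listed parameters and to compute a difference value is sketched below in Python. The text does not define how the difference value is calculated, so the choices here are assumptions: brightness-related parameters are combined into photographic stops (one stop is a factor of two in exposure time, about 6.02 dB of gain, or a factor of √2 in f-number), white balance is compared in mireds, and the tone-mapping strength uses a hypothetical 0 to 1 scale. The class and function names are likewise illustrative.

```python
from dataclasses import dataclass
from math import log2


@dataclass(frozen=True)
class AcquisitionParams:
    exposure_s: float     # exposure time in seconds
    gain_db: float        # signal amplification in decibels
    f_number: float       # aperture as f-number
    wb_kelvin: float      # white-balance color temperature in kelvin
    tone_strength: float  # tone-mapping strength, hypothetical 0..1 scale


def exposure_stops(a, b):
    """Net brightness difference between two settings in stops (assumed model)."""
    stops = log2(a.exposure_s / b.exposure_s)
    stops += (a.gain_db - b.gain_db) / 6.0206      # 20*log10(2) dB per stop
    stops -= 2.0 * log2(a.f_number / b.f_number)   # larger f-number, less light
    return stops


def difference_value(a, b):
    """Per-component difference values between image and reference image."""
    return {
        "exposure_stops": abs(exposure_stops(a, b)),
        "wb_mired": abs(1e6 / a.wb_kelvin - 1e6 / b.wb_kelvin),
        "tone": abs(a.tone_strength - b.tone_strength),
    }


def comparable(a, b, limits):
    """True if every component stays within its (hypothetical) threshold."""
    diff = difference_value(a, b)
    return all(diff[key] <= limit for key, limit in limits.items())
```

Combining exposure time, gain, and aperture into a single number hides the fact that they affect motion blur, noise, and depth of field differently; a stricter variant could threshold each parameter separately.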

The metric and/or the threshold value are advantageously optimized by a system using artificial intelligence methods. Self-learning algorithms are used for this purpose in particular; these can be, in particular, deep learning or machine learning. The systems can thus be trained to always find an optimal metric and/or an optimal threshold value. With each new comparison of an image with the reference images from the image collection, the metric and/or the threshold value can be optimized further. Reliable re-identification of persons is thereby made possible.

If the reference image and the image are output in order to request a manual decision, a manual input indicating whether the reference image and the image show the same person is advantageously read in. This input is made, in particular, by a person, for example a security guard. It is again provided that the metric and/or the threshold value are optimized by a system using artificial intelligence methods, in particular by self-learning algorithms. The manual input thus makes it possible to optimize the metric and/or the threshold value further. In this way, the metric and/or the threshold value are also trained for strongly deviating image acquisition parameters, so that the number of cases in which a comparison between image and reference image is not possible because of considerably different image acquisition parameters is reduced.
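How manual decisions could feed back into the threshold value is not detailed in the text. The following Python sketch shows one simple, non-learning stand-in for the self-learning optimization, purely as an assumption: the threshold is raised to the largest difference value up to which the automatic decision (descriptor distance at or below a limit distance) agreed with the operator often enough. The function names, the review tuple layout, and `min_agreement` are hypothetical.

```python
def needs_manual_review(difference, threshold):
    """Pairs above the threshold are shown to an operator instead of being matched."""
    return difference > threshold


def update_threshold(reviews, distance_limit, min_agreement=0.95):
    """Pick a new threshold from operator feedback (assumed heuristic).

    reviews: list of (difference_value, descriptor_distance, same_person)
    Returns the largest difference value d such that, among all reviews with
    difference value <= d, the automatic decision matched the operator in at
    least `min_agreement` of the cases, or None if no such value exists.
    """
    best = None
    agreed = 0
    for count, (diff, dist, same) in enumerate(sorted(reviews), start=1):
        automatic = dist <= distance_limit
        agreed += int(automatic == same)
        if agreed / count >= min_agreement:
            best = diff
    return best
```

With more reviewed pairs at large difference values, the returned threshold can grow, which reduces the number of comparisons that have to be skipped.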

The invention also relates to a computer program configured to carry out the method described above. In particular, the computer program is a computer program product comprising instructions that, when executed on a processor, cause the processor to carry out the method described above. The invention further relates to a machine-readable storage medium on which the computer program described above is stored. The machine-readable storage medium is, in particular, a magnetic storage medium and/or an optical storage medium and/or a flash memory.

Finally, the invention relates to a system. The system comprises a camera for capturing images of persons. The camera is connected for signal transmission to a control unit of the system. The control unit is configured to carry out the method described above.

The previously known reference images of the previously known image collection are, in particular, reference images that were previously captured by the camera as images of the person. Alternatively or additionally, the reference images are images of persons captured with a further camera. This makes it possible to re-identify persons previously captured by the same camera or by a different camera.

List of figures

Exemplary embodiments of the invention are described in detail below with reference to the accompanying drawing. In the drawing:

Fig. 1 is a schematic illustration of a system according to an exemplary embodiment of the invention, and
Fig. 2 is a schematic illustration of a sequence of the method according to an exemplary embodiment of the invention.

Embodiments of the invention

The method for person re-identification (PRID method) is preferably carried out in the following four steps:

1. A person is detected by a detection method. A correspondence for this image is sought in a previously known image collection. This image collection can contain, for example, all person images from another camera or from a previous time step. All images are then scaled down to a size that depends on the data set and the approach.
2. The image of the person is preferably divided into several smaller stripes or tiles (regions, patches). For each stripe or tile, discriminative features are extracted via descriptors. Such descriptors are extracted, for example, by means of Gaussian descriptors. Details are described in the following publication: Matsukawa, T.; et al.: "Hierarchical Gaussian Descriptor for Person Re-Identification", 2016, ISBN: 978-1-4673-8851-1 (2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR).
3. A metric is then learned that minimizes the distance between two descriptors in a descriptor space for the same person and increases it for different persons. This means that determining a specific descriptor distance with the learned metric yields minimal descriptor distances when the discriminative features of the image and of the reference image are identical or at least very similar. If these discriminative features differ considerably, the metric yields a large descriptor distance.
4. Finally, a probability value or a distance measure is determined that indicates whether the person in the image matches a person from the image collection. Alternatively or additionally, a ranking is performed that returns an ordered list arranged by decreasing similarity between the person sought and the known persons.
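Steps 2 to 4 can be sketched in Python as follows. This is a simplified illustration under stated assumptions: a mean color per horizontal stripe stands in for the hierarchical Gaussian descriptor cited in step 2, the metric is written in the common Mahalanobis form d(x, y) = sqrt((x − y)ᵀ M (x − y)) with a matrix `M` that is assumed to have been learned elsewhere, and all function names are hypothetical. Images are plain lists of rows of `(r, g, b)` tuples that have already been detected and rescaled as in step 1.

```python
from math import sqrt


def stripe_descriptor(image, n_stripes=4):
    """Step 2 (simplified): mean R, G, B per horizontal stripe.

    Assumes the image has at least n_stripes rows.
    """
    height = len(image)
    descriptor = []
    for s in range(n_stripes):
        rows = image[s * height // n_stripes:(s + 1) * height // n_stripes]
        pixels = [p for row in rows for p in row]
        for channel in range(3):
            descriptor.append(sum(p[channel] for p in pixels) / len(pixels))
    return descriptor


def learned_distance(x, y, M):
    """Step 3: distance under a learned matrix M (assumed positive semi-definite)."""
    d = [a - b for a, b in zip(x, y)]
    Md = [sum(m * v for m, v in zip(row, d)) for row in M]
    return sqrt(sum(a * b for a, b in zip(d, Md)))


def rank_gallery(query, gallery, M):
    """Step 4: gallery names ordered by decreasing similarity to the query."""
    q = stripe_descriptor(query)
    return sorted(
        gallery,
        key=lambda name: learned_distance(q, stripe_descriptor(gallery[name]), M),
    )
```

With the identity matrix as `M`, the distance reduces to the plain Euclidean distance; a learned `M` would stretch the directions that separate different persons and shrink those that vary for the same person.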

Starting from the descriptors determined in step 2 and the metric learned in step 3, it is determined which descriptors are the smallest distance apart, in order to find the reference image in the image collection that is most similar to the image (represented by the smallest descriptor distance under the learned metric).

Alternatively or additionally, deep-learning-based methods are used that perform end-to-end learning, have a Siamese structure, or use a triplet-loss architecture in order to learn, from a training data set, an artificial intelligence that can be used to re-identify persons. The deep learning network decides autonomously how it must best be parameterized in order to re-identify a person, in particular by learning weights. The network learns the re-identification task by itself.
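
For orientation, a triplet loss in its common hinge form can be sketched as follows. The application does not specify the loss, the distance or the margin; the Euclidean distance, the margin of 1.0 and the function names below are assumptions chosen for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-form triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative (a different person) is at least
    `margin` farther from the anchor than the positive (the same person).
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Negative far from the anchor: the triplet is already satisfied.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
# Negative as close as the positive: the full margin is still violated.
hard = triplet_loss([0.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

During training, a network would be adjusted so that this loss shrinks over many such triplets; the sketch only evaluates the loss for fixed embeddings.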

Cameras, in particular surveillance cameras, have control loops that regulate image acquisition parameters such as the exposure time and/or the white balance and/or the tone mapping (dynamic range compression) and/or the gain (signal amplification) and/or the aperture. If the scene captured by the camera changes, for example because of a setting sun or a vehicle driving through a parking garage with its headlights on, the image acquisition parameters change as well. As a result of such camera control, one and the same person can appear with completely different coloring, for example if the white balance changes considerably.
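
The effect of a changed white balance on color can be illustrated with a deliberately simplified sketch. It models white balance as one gain per RGB channel, which is an assumption; the gain values and the sample color are hypothetical.

```python
def apply_white_balance(rgb, gains):
    """Scale each channel by its gain and clip to the 8-bit range.

    Modelling white balance as per-channel gains is a simplification
    used only to show how the rendered color shifts.
    """
    return tuple(min(255, round(channel * gain)) for channel, gain in zip(rgb, gains))

# The same piece of clothing rendered under two white-balance settings.
jacket = (120, 100, 80)
neutral = apply_white_balance(jacket, (1.0, 1.0, 1.0))
warm = apply_white_balance(jacket, (1.5, 1.0, 0.5))
```

A color-based descriptor computed from `warm` can differ considerably from one computed from `neutral`, although both show the same garment.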

Fig. 1 schematically shows a system 5 according to an exemplary embodiment of the invention. The system 5 comprises a camera 2 for capturing images of persons. The camera 2 is connected to a control unit 6 for signal transmission. A memory 10 is likewise connected to the control unit 6 for signal transmission. Reference images of an image collection can be stored in the memory 10. The reference images were captured either with a different camera 2 or with the camera 2 shown in Fig. 1 at a different, earlier point in time.

Fig. 2 schematically shows the sequence of a method according to an exemplary embodiment of the invention. The method is carried out in particular by the control unit 6.

The camera 2 detects a person and captures an image 1 of that person. The image 1 is to be compared with reference images 3a, 3b of an image collection 4, the reference images 3a, 3b and the image collection 4 being previously known and, in particular, stored in the memory 10.

Image features are extracted from the image 1 and described by means of descriptors 7. Fig. 2 shows, by way of example, a single descriptor 7 in a descriptor space 9. The descriptor is, for example, a Gaussian descriptor (Gaussian of Gaussian, GOG) or one of various other methods. Particularly advantageously, the image 1 can first be divided into horizontal stripes so that each descriptor 7 is assigned to only one of the stripes 11.

As for the image 1, descriptors are also available for the reference images 3a and 3b. Fig. 2 shows, by way of example, a first reference image 3a and a second reference image 3b, the second reference image 3b showing the same person as the image 1. In the descriptor space 9, a descriptor distance 8 between the descriptors 7 of the image 1 and of the second reference image 3b is therefore minimized, and in particular is smaller than a descriptor distance 8 between the descriptors 7 of the image 1 and of the first reference image 3a.

Advantageously, a metric that determines the descriptor distances 8 is learned by a self-learning system. Based on the descriptor distances, a probability value can then be output for the person shown in the image 1 also being shown in the first reference image 3a and in the second reference image 3b, respectively. As a result, a low probability value is determined for the first reference image 3a, while a high probability value is determined for the second reference image 3b. The probability values can in particular be derived directly from the descriptor distances 8. Instead of a probability, an order or ranking of the reference images 3a, 3b can also be produced, in which an increasing rank indicates that the similarity between the image 1 and the respective reference image 3a, 3b decreases. The distance between two descriptors 7 is determined here either directly or via the metric.
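
The application leaves open how a probability value is derived from the descriptor distances. One possible mapping, shown here only as a sketch, is a softmax over negated distances; this choice, the `temperature` parameter and the toy distances are assumptions, and the softmax implicitly presumes that the person is contained in the image collection.

```python
import math

def match_probabilities(distances, temperature=1.0):
    """Map descriptor distances to probabilities via a softmax over -distance.

    Smaller distances give larger probabilities; the values sum to 1.
    """
    scores = {ref_id: -d / temperature for ref_id, d in distances.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {ref_id: math.exp(s - top) for ref_id, s in scores.items()}
    total = sum(weights.values())
    return {ref_id: w / total for ref_id, w in weights.items()}

# Toy distances: reference image 3b is much closer to image 1 than 3a.
probabilities = match_probabilities({"3a": 3.0, "3b": 1.0})
```

Sorting the same distances in increasing order yields the ranking mentioned as the alternative output.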

According to the invention, image acquisition parameters are additionally taken into account. In particular, this makes it possible to determine a difference value. For this purpose, the image acquisition parameters of the image 1 are compared with the image acquisition parameters of the reference images 3a, 3b. The image acquisition parameters are in particular an exposure time and/or a white balance and/or a dynamic range compression (tone mapping) and/or a signal amplification (gain) and/or an aperture and/or other image acquisition parameters. The difference value indicates how much the image acquisition parameters differ. If the differences are too large, or if the steady state has not been reached, an error can occur in person recognition. This case is also shown schematically in Fig. 2.
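
How the difference value is computed is not specified. The following sketch assumes one possible definition: the largest absolute parameter difference after dividing each parameter by a scale. The field names, units and scale values are hypothetical.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class CaptureParams:
    """Image acquisition parameters stored with an image (hypothetical fields)."""
    exposure_ms: float
    white_balance_k: float
    gain_db: float

# Assumed normalization scales so that different units become comparable.
SCALES = {"exposure_ms": 10.0, "white_balance_k": 1000.0, "gain_db": 6.0}

def difference_value(a, b):
    """Largest normalized absolute difference over all parameters."""
    return max(
        abs(getattr(a, f.name) - getattr(b, f.name)) / SCALES[f.name]
        for f in fields(CaptureParams)
    )

image_params = CaptureParams(exposure_ms=10.0, white_balance_k=5000.0, gain_db=6.0)
reference_params = CaptureParams(exposure_ms=12.0, white_balance_k=6500.0, gain_db=6.0)
diff = difference_value(image_params, reference_params)
```

In this toy case the white balance dominates the difference value; a weighted sum or a learned combination would be equally conceivable.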

Fig. 2 thus shows an alternative image 1a. The alternative image 1a shows the same person as the image 1, but a different white balance was used. As a result, a color feature of the person is rendered differently and therefore closely resembles the first reference image 3a. The method would thus erroneously recognize the first reference image 3a instead of the second reference image 3b, leading to the result that the person shown in the alternative image 1a is the same as in the first reference image 3a. This recognition is incorrect, however, and is due to the different white balance.

According to the invention, the difference value of the image acquisition parameters is therefore used to determine whether a comparison of individual reference images 3a, 3b with the image 1 or the alternative image 1a is meaningful. If the difference value lies above a predefined threshold value, there is the risk of erroneous recognition described above. Reference images 3a, 3b whose image acquisition parameters deviate too strongly from those of the image 1 or the alternative image 1a, i.e. for which the difference value between the image acquisition parameters of the reference image 3a, 3b and those of the image 1 or alternative image 1a exceeds the threshold value, are disregarded. In particular, the descriptor distance 8 between the first reference image 3a and the alternative image 1a is therefore not determined. The risk of erroneous recognition is thus reduced.
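
The gating step can be sketched as follows. For brevity the sketch uses only the white balance (as a color temperature in kelvin) as the image acquisition parameter and a plain Euclidean descriptor distance; both choices, the threshold and all toy values are assumptions.

```python
import math

def compare_with_gating(query_desc, query_wb_k, references, threshold_k):
    """Rank reference images, skipping those captured too differently.

    `references` maps an id to (descriptor, white_balance_k). For skipped
    references the descriptor distance is never computed.
    Returns (ranking, skipped_ids).
    """
    ranking, skipped = [], []
    for ref_id, (descriptor, wb_k) in references.items():
        if abs(query_wb_k - wb_k) > threshold_k:
            skipped.append(ref_id)
            continue
        ranking.append((ref_id, math.dist(query_desc, descriptor)))
    ranking.sort(key=lambda pair: pair[1])
    return ranking, skipped

# Alternative image 1a: its descriptor looks like 3a, but 3a was captured
# with a very different white balance and is therefore not compared.
references = {
    "3a": ([1.0, 0.0], 5500.0),
    "3b": ([0.7, 0.3], 3200.0),
}
ranking, skipped = compare_with_gating([0.9, 0.1], 3000.0, references, threshold_k=1000.0)
```

Without the gate, reference image 3a would have ranked first; with it, only 3b remains in the ranking and 3a is reported as skipped.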

The threshold value can in particular be adapted by a self-learning system, for example by means of artificial intelligence methods.

For this purpose, the self-learning system learns after each comparison and/or from each new image 1 in order to optimize the threshold value.

Preferably, the metric for determining the descriptor distance 8 is also varied if the image acquisition parameters deviate. The metric is thus optimized in particular on the basis of the difference value. This is advantageously done by self-learning systems, in particular by means of artificial intelligence methods. For example, transfer learning can be used to learn a special metric, which is then applied when the image acquisition parameters of the image 1 and the reference image 3a, 3b deviate from one another but the deviation remains below the threshold value. This special metric is trained in particular with reference images 3a, 3b and images 1 that already have different image acquisition parameters.
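
The choice between the standard metric and the special metric can be sketched as a simple dispatch on the difference value. The `tolerance` below which parameters count as not deviating, the weighted metric standing in for a transfer-learned one, and all names are assumptions.

```python
import math

def euclidean_metric(a, b):
    """Stand-in for the standard learned metric."""
    return math.dist(a, b)

def make_weighted_metric(weights):
    """Stand-in for a special metric, e.g. one down-weighting color-sensitive dimensions."""
    def metric(a, b):
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))
    return metric

def select_metric(diff_value, threshold, tolerance, standard_metric, special_metric):
    """Pick the metric for one image/reference pair.

    Returns None above the threshold (the pair is disregarded), the special
    metric for a noticeable deviation below the threshold, and the standard
    metric otherwise.
    """
    if diff_value > threshold:
        return None
    if diff_value > tolerance:
        return special_metric
    return standard_metric

special_metric = make_weighted_metric([0.25, 1.0])
chosen = select_metric(0.6, threshold=1.0, tolerance=0.3,
                       standard_metric=euclidean_metric,
                       special_metric=special_metric)
```

A difference value of 0.6 lies between the assumed tolerance and the threshold, so the special metric is selected for that pair.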

If a reference image 3a, 3b is disregarded because the image acquisition parameters deviate too much, the image 1 is advantageously output together with the disregarded reference image 3a, 3b to a person, for example a security guard. That person can then decide manually whether the image 1 and the output reference image 3a, 3b show the same person. If such a manual decision is made, the metric and/or the threshold value can in particular be trained again and thereby optimized.
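
One way to organize this manual feedback is a small review queue that collects the operator's decisions as labelled pairs for later training. The class and field names are hypothetical, and how the metric or threshold is actually retrained from these pairs is left open here, as it is in the description.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class ReviewItem:
    """A disregarded image/reference pair shown to an operator."""
    image_id: str
    reference_id: str
    difference_value: float
    same_person: Optional[bool] = None  # set once the operator has decided

class ReviewQueue:
    def __init__(self) -> None:
        self.items: List[ReviewItem] = []

    def submit(self, image_id: str, reference_id: str, difference_value: float) -> ReviewItem:
        """Queue a pair that was skipped because the threshold was exceeded."""
        item = ReviewItem(image_id, reference_id, difference_value)
        self.items.append(item)
        return item

    def record_decision(self, item: ReviewItem, same_person: bool) -> None:
        """Store the operator's manual decision for a queued pair."""
        item.same_person = same_person

    def training_pairs(self) -> List[Tuple[str, str, bool]]:
        """Labelled pairs that could feed a retraining step."""
        return [(i.image_id, i.reference_id, i.same_person)
                for i in self.items if i.same_person is not None]

queue = ReviewQueue()
item = queue.submit("1a", "3a", difference_value=2.5)
pending = queue.training_pairs()
queue.record_decision(item, same_person=False)
labelled = queue.training_pairs()
```

Only pairs with a recorded decision appear in `training_pairs()`, so undecided items never reach a training step.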

REFERENCES CITED IN THE DESCRIPTION

This list of the documents cited by the applicant was generated automatically and is included solely for the reader's information. The list is not part of the German patent or utility model application. The DPMA assumes no liability for any errors or omissions.

Cited patent literature

WO 2015/192997 A1 [0003]

Cited non-patent literature

Matsukawa, T.; et al.: "Hierarchical Gaussian Descriptor for Person Re-Identification", 2016, ISBN: 978-1-4673-8851-1 (2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR). [0018]

Claims

A method for recognizing persons, comprising: • capturing an image (1) of a person by means of a camera (2), • extracting individual image features from the image (1) and describing the image features using descriptors (7), • determining a descriptor distance (8) between the descriptors (7) of the image (1) and previously known descriptors (7) of at least one previously known reference image (3) of a previously known image collection (4) by means of a previously known metric, and • outputting a probability value that the person is depicted in the reference image (3), the probability value being determined from the descriptor distance (8), or an order of a plurality of reference images (3) that reflects an increasing descriptor distance (8) between the descriptor (7) of the image (1) and the respective previously known descriptor (7) of the reference image (3), characterized in that • at least one image acquisition parameter of the camera (2) during capture of the image (1) is compared with at least one corresponding image acquisition parameter during capture of the reference image (3) in order to determine a difference value between the image acquisition parameters, and • when determining the descriptor distance (8), those reference images (3) for which the difference value is greater than a predefined threshold value are disregarded.

Method according to claim 1, characterized in that the metric is varied as a function of the difference value.

Method according to one of the preceding claims, characterized in that it is recognized from the image acquisition parameters of the camera (2) that a camera control for adjusting the image acquisition parameters was active when the image (1) was captured, wherein, instead of the probability value or the order, a message is output that person recognition is not possible with the image (1).

Method according to one of the preceding claims, characterized in that the description of the image features of the image (1) is varied based on the difference value.

Method according to one of the preceding claims, characterized in that the image (1) is output together with the reference image (3) if the difference value exceeds the threshold value.

Method according to one of the preceding claims, characterized in that the image acquisition parameter comprises an exposure time and/or a white balance and/or a dynamic range compression and/or a signal amplification and/or an aperture.

Method according to one of the preceding claims, characterized in that the metric and/or the threshold value are optimized by a system using artificial intelligence methods, in particular by self-learning algorithms.

Method according to claim 5, characterized in that a manual input is read indicating whether the output reference image (3) and the output image (1) show the same person, wherein the metric and/or the threshold value are optimized on the basis of the manual input by a system using artificial intelligence methods, in particular by self-learning algorithms.

Computer program adapted to carry out the method according to one of the preceding claims.

Machine-readable storage medium on which the computer program according to claim 9 is stored.

System (5) comprising a camera (2) and a control unit (6) connected to the camera (2) for signal transmission, wherein the control unit (6) is configured to carry out the method according to one of claims 1 to 8.