CN111723610B - Image recognition method, device and equipment - Google Patents


Info

Publication number
CN111723610B
CN111723610B (granted publication of application CN201910213315.7A)
Authority
CN
China
Prior art keywords
image
person
region
body area
identified
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910213315.7A
Other languages
Chinese (zh)
Other versions
CN111723610A (en)
Inventor
刘通
李姣
刘朋樟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN201910213315.7A
Publication of CN111723610A
Application granted
Publication of CN111723610B
Legal status: Active


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06V40/103: Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00: Geometric image transformations in the plane of the image
    • G06T3/60: Rotation of whole images or parts thereof
    • G06T7/00: Image analysis
    • G06T7/10: Segmentation; edge detection
    • G06T7/11: Region-based segmentation
    • G06T11/00: 2D [two-dimensional] image generation
    • G06T11/40: Filling a planar surface by adding surface attributes, e.g. colour or texture
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10004: Still image; photographic image
    • G06T2207/30: Subject of image; context of image processing
    • G06T2207/30196: Human being; person

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

Embodiments of the invention provide an image recognition method, apparatus and device. The method includes: acquiring an image to be identified, the image containing at least one person; detecting the image to be identified and determining a first body area and a second body area of each person; and, for each person, rotating the image to be identified according to that person's first and second body areas to obtain a rotated image, and re-identifying the person according to the rotated image, wherein the posture direction of the person in the rotated image is consistent with a target direction. Because rotating the image adjusts the person's posture direction to the target direction, the pose differences between persons in different images are reduced, and the accuracy of person re-identification can be improved.

Description

Image recognition method, device and equipment
Technical Field
Embodiments of the invention relate to the field of computer technology, and in particular to an image recognition method, apparatus and device.
Background
With the development of computer and internet technologies, unmanned stores are gradually becoming part of daily life. An unmanned store, also called an unattended store or unmanned supermarket, is a store in which no salesperson or cashier sells goods or manages the premises. Typically, cameras are arranged at different positions in the store; they capture images of customers entering the store in real time, and the persons in the captured images are then identified with person re-identification technology.
Person re-identification, also called cross-camera tracking, is a computer-vision technique for judging whether a specific person appears in an image or video sequence: features are extracted from images captured by different cameras and compared to determine whether the persons in different images are the same person. In the prior art, when performing person re-identification across cameras, a person is first detected in the image with an image detection technique; features are then extracted from the detected person and compared, via a similarity measure, with the features of a target person to determine whether the person in the image is the target person.
However, an unmanned-store scene involves images captured from multiple angles by multiple cameras. The pose of the same person differs greatly between cameras, so the accuracy of person re-identification is low.
Disclosure of Invention
Embodiments of the invention provide an image recognition method, apparatus and device for improving the accuracy of person re-identification.
In a first aspect, an embodiment of the present invention provides an image recognition method, including:
acquiring an image to be identified, wherein the image to be identified comprises at least one person;
detecting the image to be identified, and determining a first body area and a second body area of each person, wherein the first body area and the second body area are two areas located at different positions along the height direction of the person;
and for each person, rotating the image to be identified according to the first body area and the second body area of the person to obtain a rotated image, and re-identifying the person according to the rotated image, wherein the posture direction of the person in the rotated image is consistent with a target direction.
Optionally, the rotating the image to be identified according to the first body area and the second body area of the person to obtain a rotated image includes:
acquiring the direction of the line connecting the center point of the first body area and the center point of the second body area;
determining the to-be-rotated angle corresponding to the person according to the line direction and the target direction;
and rotating the image to be identified by the to-be-rotated angle, taking the center point of the second body area as the rotation center, to obtain the rotated image.
Optionally, before the image to be identified is rotated according to the first body area and the second body area of the person, the method further includes:
determining the height and width of a target image corresponding to the person according to the first body area and the second body area of the person, wherein the target image is at least part of the rotated image and includes the upper body of the person;
the re-identifying the person according to the rotated image comprises the following steps:
cutting and/or filling the rotated image according to the height and the width to obtain a target image corresponding to the person;
and re-identifying the person according to the target image corresponding to the person.
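For illustration (not part of the original disclosure), the cutting and/or filling step can be sketched as follows; the NumPy representation and the top-left anchoring are assumptions of this sketch, and an actual embodiment would position the crop so that it contains the person's upper body:

```python
import numpy as np

def crop_or_pad(img, height, width):
    """Cut the rotated image down to (height, width), zero-padding
    any dimension in which the image is smaller than the target."""
    out = np.zeros((height, width) + img.shape[2:], dtype=img.dtype)
    h = min(height, img.shape[0])
    w = min(width, img.shape[1])
    out[:h, :w] = img[:h, :w]  # copy the overlapping region
    return out
```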
Optionally, the first body area is a head area, the second body area is an upper body area, and determining, according to the first body area and the second body area of the person, a height and a width of a target image corresponding to the person includes:
and determining the height and the width of the target image corresponding to the person according to the center point coordinates of the head area, the center point coordinates of the upper body area and the vertex coordinates of the upper body area of the person.
Optionally, the rotating the image to be identified by the angle to be rotated with the center point of the second body area as a rotation center, to obtain the rotated image includes:
converting each pixel point in the image to be identified from the image coordinate system to a Cartesian coordinate system whose origin is the center point of the second body area of the person;
in a Cartesian coordinate system, taking the origin as the center, and rotating the pixel points by the angle to be rotated;
and converting each pixel point after rotation into an image coordinate system from a Cartesian coordinate system to obtain a rotated image.
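For illustration (not part of the original disclosure), the three conversion steps above can be sketched for a single pixel coordinate; the function name, the use of degrees, and the rotation sign convention are all assumptions for this sketch:

```python
import math

def rotate_point(px, py, cx, cy, angle_deg):
    """Rotate one pixel coordinate about (cx, cy) by angle_deg.

    Step 1: shift to a Cartesian frame whose origin is the rotation
    center (the center point of the second body area); step 2: apply
    the plane rotation; step 3: shift back to image coordinates.
    """
    a = math.radians(angle_deg)
    x, y = px - cx, py - cy                 # image -> Cartesian
    xr = x * math.cos(a) - y * math.sin(a)  # plane rotation about origin
    yr = x * math.sin(a) + y * math.cos(a)
    return xr + cx, yr + cy                 # Cartesian -> image
```

In practice every pixel of the image is mapped this way (or, equivalently, a single affine warp is applied to the whole image).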
Optionally, the first body region is a head region and the second body region is an upper body region.
Optionally, the detecting the image to be identified, determining a first body area and a second body area of each person, includes:
inputting the image to be identified into a head detection model, and acquiring at least one head area in the image to be identified;
inputting the image to be identified into an upper body detection model, and acquiring at least one upper body region in the image to be identified;
and matching the at least one head area and the at least one upper body area to determine the head area and the upper body area of each person.
Optionally, the matching the at least one head region and the at least one upper body region includes:
for each head region of the at least one head region, acquiring the intersection-over-union (IoU) between that head region and each upper body region, and taking the upper body region having the largest IoU with the head region as the upper body region belonging to the same person as that head region.
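For illustration (not part of the original disclosure), the IoU-based matching of head regions to upper-body regions can be sketched as follows; the (x1, y1, x2, y2) box representation and the function names are assumptions of this sketch:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def match_heads_to_bodies(heads, bodies):
    """For each head box, pick the index of the upper-body box with
    the highest IoU, treating that pair as the same person."""
    return [max(range(len(bodies)), key=lambda j: iou(h, bodies[j]))
            for h in heads]
```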
In a second aspect, an embodiment of the present invention provides an image recognition apparatus, including:
the acquisition module is used for acquiring an image to be identified, wherein the image to be identified comprises at least one person;
the detection module is used for detecting the image to be identified and determining a first body area and a second body area of each person, wherein the first body area and the second body area are two areas which are different in position along the height direction;
the rotating module is used for rotating the image to be identified according to the first body area and the second body area of each person to obtain a rotated image, wherein the posture direction of the person in the rotated image is consistent with the target direction;
and the identification module is used for re-identifying the person according to the rotated image.
Optionally, the rotation module is specifically configured to:
acquiring the direction of the line connecting the center point of the first body area and the center point of the second body area;
determining the to-be-rotated angle corresponding to the person according to the line direction and the target direction;
and rotating the image to be identified by the to-be-rotated angle, taking the center point of the second body area as the rotation center, to obtain the rotated image.
Optionally, the rotation module is further configured to:
determining the height and width of a target image corresponding to the person according to the first body area and the second body area of the person, wherein the target image is at least part of the rotated image, and the target image comprises the upper body of the person;
the identification module is specifically used for:
cutting and/or filling the rotated image according to the height and the width to obtain a target image corresponding to the person;
and re-identifying the person according to the target image corresponding to the person.
Optionally, the first body area is a head area, the second body area is an upper body area, and the rotation module is specifically configured to:
and determining the height and the width of the target image corresponding to the person according to the center point coordinates of the head area, the center point coordinates of the upper body area and the vertex coordinates of the upper body area of the person.
Optionally, the rotation module is specifically configured to:
converting each pixel point in the image to be identified from the image coordinate system to a Cartesian coordinate system whose origin is the center point of the second body area of the person;
in a Cartesian coordinate system, taking the origin as the center, and rotating the pixel points by the angle to be rotated;
and converting each pixel point after rotation into an image coordinate system from a Cartesian coordinate system to obtain a rotated image.
Optionally, the first body region is a head region and the second body region is an upper body region.
Optionally, the detection module is specifically configured to:
inputting the image to be identified into a head detection model, and acquiring at least one head area in the image to be identified;
inputting the image to be identified into an upper body detection model, and acquiring at least one upper body region in the image to be identified;
and matching the at least one head area and the at least one upper body area to determine the head area and the upper body area of each person.
Optionally, the detection module is specifically configured to:
for each head region of the at least one head region, acquiring the intersection-over-union (IoU) between that head region and each upper body region, and taking the upper body region having the largest IoU with the head region as the upper body region belonging to the same person as that head region.
In a third aspect, an embodiment of the present invention provides an image recognition apparatus, including: at least one processor and memory;
the memory stores a computer program;
the at least one processor executing the computer program stored by the memory causes the at least one processor to perform the method of any one of the first aspects.
In a fourth aspect, an embodiment of the present invention provides a computer readable storage medium having a computer program stored therein, which when executed by a processor, implements a method according to any of the first aspects.
Embodiments of the invention provide an image recognition method, apparatus and device. The method includes: acquiring an image to be identified, the image containing at least one person; detecting the image to be identified and determining a first body area and a second body area of each person; and, for each person, rotating the image to be identified according to that person's first and second body areas to obtain a rotated image, and re-identifying the person according to the rotated image, wherein the posture direction of the person in the rotated image is consistent with a target direction. Because rotating the image adjusts the person's posture direction to the target direction, the pose differences between persons in different images are reduced, and the accuracy of person re-identification can be improved.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the description below are only some embodiments of the invention, and that other drawings can be obtained according to these drawings without inventive faculty for a person skilled in the art.
Fig. 1 is a schematic diagram of a network architecture to which an embodiment of the present invention is applicable;
fig. 2 is a schematic flow chart of an image recognition method according to an embodiment of the present invention;
fig. 3A and 3B are schematic views of a person's posture direction in an embodiment of the present invention;
FIGS. 4A and 4B are schematic views of a human body region according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of adjusting a person's posture direction to a target direction according to an embodiment of the present invention;
fig. 6 is a second schematic flow chart of an image recognition method according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of an image to be identified and a target image according to an embodiment of the present invention;
fig. 8 is a schematic diagram of a to-be-rotated angle in an image to be identified according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of an image recognition device according to an embodiment of the present invention;
Fig. 10 is a schematic hardware structure of an image recognition device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Fig. 1 is a schematic diagram of a network architecture to which an embodiment of the present invention is applicable, as shown in fig. 1, including: a plurality of cameras and an image recognition device. Each camera can be arranged at different preset positions and is used for collecting videos or images of people at the preset positions and sending the collected videos or images to the image recognition equipment; the image recognition device performs personnel re-recognition on the video or the image, so that the tracking of the moving track of the person is realized, or the rapid positioning of a certain target person is realized.
The image recognition device may be a server or a mobile terminal. In connection with the network architecture shown in fig. 1, several application scenarios to which the embodiments of the present invention are applicable are described below.
One possible application scenario is an unmanned store scenario. In the scene, cameras are required to be arranged at different positions in a store, and the moving behavior and/or purchasing behavior of a customer are tracked by utilizing videos or images shot by the cameras. The specific tracking process is as follows: when a customer A enters an unmanned store, the No. 1 camera collects an image of the customer and sends the image to the image recognition device. The image recognition device recognizes a person in the image, and judges that the person is a customer a newly entering the store, and adds the person to the tracking queue. And, the image recognition apparatus recognizes and records the purchasing behavior of the customer a from the image. In the moving process of the customer A in the unmanned store, the customer A may move from the coverage area of the No. 1 camera to the coverage area of the No. 2 camera, and after the No. 2 camera sends the acquired image to the image recognition device, the image recognition device recognizes the person in the image, and records the moving behavior and purchasing behavior of the customer A when recognizing that the person is the customer A in the tracking queue. Through the above process, the tracking of the customer behavior is realized.
Another possible application scenario is a security scenario. In the scene, cameras are arranged at a plurality of positions needing to be monitored, and are used for collecting videos or images of the scene and uploading the collected videos or images to image recognition equipment. When a certain target person needs to be quickly positioned, the image recognition device searches videos or images acquired by the camera according to the image of the target person so as to position the target person.
It should be noted that, besides the two application scenarios, other application scenarios are possible, and the embodiments of the present invention are not limited to the above. For convenience of description, the following embodiments will take an unmanned store scene as an example.
In the prior art, when performing person re-identification across cameras, a person is first detected in the image with an image detection technique; features are then extracted from the detected person and compared, via a similarity measure, with the features of a target person to determine whether the person in the image is the target person.
However, the unmanned-store scene involves images captured from multiple angles by multiple cameras, and the pose of the same person differs greatly between cameras. In addition, to obtain comprehensive information about customer behavior in the store, cameras are usually arranged in the top area, i.e. a plurality of cameras capture images looking down from above, which further increases the variation and uncertainty of person poses in the images, so the accuracy of person re-identification is low.
To solve the above problems, an embodiment of the invention provides an image recognition method in which, before person re-identification is performed on an image to be identified, the image is rotated so that the posture direction of the person in the rotated image is consistent with a target direction, and re-identification is then performed on the rotated image. Rotating the images makes the posture directions of the persons in them the same, i.e. consistent with the target direction, which reduces the differences in posture direction and improves the accuracy of person re-identification.
The technical scheme of the invention is described in detail below by specific examples. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.
Fig. 2 is a schematic flow chart of an image recognition method according to an embodiment of the present invention, where the method of the present embodiment may be performed by an image recognition apparatus, and the apparatus may be in the form of software and/or hardware, and the apparatus may be specifically provided in the image recognition device in fig. 1.
As shown in fig. 2, the method of the present embodiment includes:
s201: and acquiring an image to be identified, wherein the image to be identified comprises at least one person.
Specifically, the image to be identified is captured by a camera in the unmanned store and transmitted to the image recognition device. The image to be identified includes one or more persons. It is understood that when the image includes a plurality of persons, their posture directions may be the same or different.
Fig. 3A and 3B are schematic views of person posture directions according to an embodiment of the present invention. In an unmanned-store scene, for example, a camera at the top of the store shoots downward, and the captured image includes person A and person B. When the distance between A and B is small, their posture directions in the image to be identified are the same or similar, as shown in fig. 3A. When the distance between them is large, their posture directions differ considerably, as shown in fig. 3B.
It should be noted that, in the embodiments of the present invention, the posture direction refers to the direction of the person's height in the image plane (from head to feet), and can be identified by the angle between that direction and any coordinate axis of the image coordinate system. Taking the y-axis of the image coordinate system as an example, as shown in fig. 3B, the posture direction of person A can be represented by θ1 and that of person B by θ2.
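For illustration (not part of the original disclosure), the angle-based representation of the posture direction can be sketched from two body-area center points; the function name, the degree convention, and the assumption that image y grows downward are all assumptions of this sketch:

```python
import math

def pose_angle_deg(head_center, body_center):
    """Angle between the head->body line and the image y-axis, in degrees.

    In image coordinates (x rightward, y downward), an upright person
    has the head above the body (smaller y), giving an angle of 0.
    """
    dx = body_center[0] - head_center[0]
    dy = body_center[1] - head_center[1]
    # atan2 of the x-offset against the downward y-offset gives the
    # signed deviation from the vertical (y) axis.
    return math.degrees(math.atan2(dx, dy))

# Upright person: head directly above the upper body.
print(pose_angle_deg((50, 20), (50, 80)))   # 0.0
# Person lying sideways in the image.
print(pose_angle_deg((20, 50), (80, 50)))   # 90.0
```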
S202: detecting the image to be identified, and determining a first body area and a second body area of each person, wherein the first body area and the second body area are two areas located at different positions along the height direction of the person.
Both the first body area and the second body area are partial areas of the human body, arranged along the direction of the person's height. Illustratively, the first body area may be the head area and the second body area the upper body area; or the first body area may be the head area and the second body area the foot area; or the first body area may be the upper body area and the second body area the lower body area.
Fig. 4A and 4B are schematic views of a human body region according to an embodiment of the present invention, and exemplary cases in which a first body region is a head region and a second body region is an upper body region are illustrated in fig. 4A; in fig. 4B, the case where the first body area is the head area and the second body area is the lower body area is illustrated.
It will be appreciated that other combinations of the first body area and the second body area are possible in addition to those listed above, and embodiments of the present invention are not specifically limited thereto.
In step S202, a trained detection model may be used to detect the image to be identified and obtain the first body area and the second body area. Illustratively, after the image to be identified is input into a head detection model, the model marks a bounding box around each head region in the image; an upper body detection model likewise marks a bounding box around each upper body region; a lower body detection model marks a bounding box around each lower body region; and a foot detection model marks a bounding box around each foot region.
The detection model may be an existing machine-learning-based detection model, which the embodiments of the invention do not limit. In an alternative embodiment, an SSD detection model with a MobileNet backbone is used. MobileNet is an efficient model family proposed for mobile and embedded devices; it uses depthwise separable convolutions to build lightweight deep neural networks on a streamlined architecture. Being lightweight, MobileNet can run efficiently on mobile devices.
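For illustration (not part of the original disclosure), the parameter saving from depthwise separable convolutions can be shown with a quick count; bias terms are ignored and the layer sizes below are illustrative, not MobileNet's exact configuration:

```python
def conv_params(k, c_in, c_out):
    """Parameter count of a standard k x k convolution layer."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k conv (one filter per input channel)
    followed by a 1x1 pointwise conv."""
    return k * k * c_in + c_in * c_out

# A typical 3x3 layer with 128 input and 256 output channels:
print(conv_params(3, 128, 256))                 # 294912
print(depthwise_separable_params(3, 128, 256))  # 33920 (~8.7x fewer)
```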
S203: for each person, rotating the image to be identified according to the first body area and the second body area of the person to obtain a rotated image, and re-identifying the person according to the rotated image, wherein the posture direction of the person in the rotated image is consistent with the target direction.
After the first body area and the second body area of each person are determined in step S202, the current posture direction of the person can be determined from the relative positions of the two areas, since they are two areas along the height direction of the person. The image to be identified can then be rotated so that the posture direction of the person in the rotated image coincides with the target direction.
In this embodiment, the posture directions of all persons in the image to be identified are adjusted to the target direction. It will be appreciated that the target direction may be any direction: the y-axis direction of the image coordinate system, the x-axis direction, or another direction. Fig. 5 is a schematic diagram of adjusting a person's posture direction to the target direction according to an embodiment of the present invention. In fig. 5, the first image is the image to be identified and the second image is the rotated image; to match human visual habits, the posture directions of all persons in fig. 5 are adjusted to be consistent with the y-axis direction of the image coordinate system.
In addition, in one possible scenario, given a specific image, when it is necessary to determine whether a person in the image to be identified is the same person as the person in that specific image, the posture direction of the person in the specific image may be taken as the target direction; that is, the posture direction in the image to be identified is adjusted to be consistent with that in the specific image.
It is understood that when a plurality of persons are included in the image to be recognized, their posture directions may differ, as shown in fig. 3B. Therefore, in this embodiment, the image to be recognized is rotated separately for each person in it to obtain a rotated image, and the person is then re-identified according to that rotated image. For example, assuming that the image to be identified includes a person A and a person B, the image to be identified is first rotated so that the posture direction of person A is consistent with the target direction, and person A is re-identified according to the resulting rotated image; the image to be identified is then rotated a second time so that the posture direction of person B is consistent with the target direction, and person B is re-identified according to that rotated image.
In an alternative embodiment, the rotation of the image to be identified may be performed as follows: acquiring the direction of the line connecting the center point of the first body area and the center point of the second body area; determining the angle to be rotated for the person according to the line direction and the target direction; and rotating the image to be identified by the angle to be rotated, with the center point of the second body area of the person as the rotation center, to obtain the rotated image.
Further, the person is re-identified based on the rotated image, and specifically, the person may be identified using an existing re-identification model, which is not particularly limited in the embodiment of the present invention.
In the prior art, in order to avoid the poor person re-identification accuracy caused by differences in person posture, the method generally adopted is as follows: a large number of training images are generated using a generative network, and the generated images include persons with different posture directions, so as to ensure the diversity of person postures in the training images as much as possible. The recognition model is then trained on the large number of generated images, thereby reducing the influence of posture differences on recognition accuracy.
In this embodiment, rotating the image to be identified adjusts the posture direction of the person in the rotated image to the target direction, reducing the posture differences between persons in different images, so the accuracy of person re-identification can be improved. Compared with the above prior art based on a generative network, posture adjustment and recognition are performed on the original image, so the information in the original image is retained and no useful information is lost, which further improves the accuracy of person re-identification. In addition, since rotating an image does not involve a large amount of computation, the efficiency of person re-identification can also be improved compared with the generative-network-based technique described above.
The image recognition method provided in this embodiment includes: acquiring an image to be identified, wherein the image to be identified comprises at least one person; detecting the image to be identified and determining a first body area and a second body area of each person; and, for each person, rotating the image to be identified according to the first body area and the second body area of the person to obtain a rotated image, and re-identifying the person according to the rotated image, wherein the posture direction of the person in the rotated image is consistent with the target direction. Rotating the image to be identified thus adjusts the posture direction of the person in the rotated image to the target direction, reducing the posture differences between persons in different images, so the accuracy of person re-identification can be improved.
Fig. 6 is a second schematic flow chart of the image recognition method according to the embodiment of the present invention, and the embodiment further refines steps S202 and S203 based on the embodiment shown in fig. 2. In this embodiment, the first body region is the head region and the second body region is the upper body region.
As shown in fig. 6, the method of the present embodiment includes:
S601: acquiring an image to be identified, wherein the image to be identified comprises at least one person.
S602: inputting the image to be identified into a head detection model, acquiring at least one head region in the image to be identified, inputting the image to be identified into an upper body detection model, and acquiring at least one upper body region in the image to be identified.
In this embodiment, the implementation manners of S601 and S602 are similar to those of the embodiment shown in fig. 2, and will not be repeated here.
S603: matching the at least one head area with the at least one upper body area to determine the head area and the upper body area of each person.
It will be appreciated that when at least two persons are included in the image to be identified, a plurality of head regions and a plurality of upper body regions may be marked in the image to be identified after detection by the two models. In order to identify which head region and which upper body region correspond to the same person, the head regions and the upper body regions need to be matched.
In the specific implementation process, various matching modes can be adopted, and only one of the alternative implementation modes is described below as an example.
In the embodiment of the invention, the upper body refers to the part of the person above the waist or the hips; that is, the upper body includes the head. Therefore, the upper body region and the head region of the same person overlap to a certain extent, while those of different persons overlap little or not at all, so the matching can be performed according to the overlap between head regions and upper body regions.
Specifically, for any head region of the at least one head region, the intersection-over-union (IOU) index between that head region and each upper body region is acquired, and the head region and the upper body region with the maximum IOU index relative to it are taken as the head region and the upper body region of the same person.
The IOU index measures the overlap between two frames and is used to gauge the correlation between them: the higher the IOU index, the stronger the correlation between the two frames.
The specific calculation mode of the IOU index is as follows: assuming that the two frames are a and B, respectively, then:
IOU=(A∩B)/(A∪B)
that is, the IOU index indicates the ratio of the intersection area of frames A and B to their union area.
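As an illustrative sketch (not taken from the patent text), the IOU computation and the max-IOU head/upper-body matching described above can be written as follows; boxes are represented as (x1, y1, x2, y2) corner coordinates, and the function names and the zero-overlap handling are assumptions:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_heads_to_upper_bodies(heads, upper_bodies):
    """For each head box, pick the upper-body box with the highest IOU.

    Returns (head_index, upper_body_index) pairs; a head whose best IOU
    is zero is left unmatched (possibly an incomplete person, see below).
    """
    pairs = []
    for hi, head in enumerate(heads):
        scores = [iou(head, ub) for ub in upper_bodies]
        if scores and max(scores) > 0:
            pairs.append((hi, scores.index(max(scores))))
    return pairs
```

A head box nested inside its upper-body box scores a non-zero IOU against it and (near-)zero against other persons' boxes, which is exactly the property the matching relies on.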
It should be noted that, in the actual application process, there may be an area of the head or the upper body that is not successfully matched, which indicates that there may be an incomplete person in the image to be identified, and for the incomplete person, no subsequent identification is required.
After the head area and the upper body area of each person are determined through the above steps, the subsequent steps S604 to S607 are respectively performed for each person.
S604: and determining the height and the width of a target image corresponding to the person according to the head area and the upper body area of the person, wherein the target image is at least part of the rotated image, and the target image comprises the upper body of the person.
In this embodiment, when a plurality of persons are included in the image to be identified, or when a person occupies only a relatively small part of the image to be identified, in order to facilitate re-identification, after the image to be identified is rotated to obtain a rotated image, the portion containing the person is cut out of the rotated image as a target image, and the person is then re-identified on the target image.
Fig. 7 is a schematic diagram of an image to be identified and a target image according to an embodiment of the present invention. As shown in fig. 7, the image to be identified includes a person A. For person A, the image to be identified is rotated so that the posture direction of person A is vertical, and the target image corresponding to person A is then cropped from the rotated image, so that person A can be re-identified according to the target image, which can improve the accuracy of re-identification.
In this embodiment, the target image may be any region in the rotated image as long as the target image includes the upper body of the person. In an alternative embodiment, the target image is a minimal area comprising the upper body of the person. Specifically, the height and width of the target image corresponding to the person may be determined according to the center point coordinates of the head region, the center point coordinates of the upper body region, and the vertex coordinates of the upper body region of the person.
For example, if the posture direction in the image to be recognized is determined to be vertical (the y-axis direction of the image coordinate system) from the center point of the head region and the center point of the upper body region, the height and width of the upper body region are taken directly as the height and width of the target image. If the posture direction is determined to be horizontal (the x-axis direction of the image coordinate system), the width of the upper body region is taken as the height of the target image and the height of the upper body region as its width. If the posture direction is determined to be some other direction (neither vertical nor horizontal), the height and width of the target image are determined by calculation; various calculation methods are possible and are not detailed in this embodiment.
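The rule above can be sketched as follows. This is a hedged illustration, not the patent's mandated computation: the function name `target_size` is invented, and the bounding-square branch for oblique postures is just one of the "various calculation methods" the text leaves open.

```python
import math


def target_size(head_center, body_center, body_w, body_h):
    """Pick the target image's (height, width) from the posture direction:
    keep (h, w) for a vertical posture, swap them for a horizontal one.
    The oblique-posture branch is an assumption, not mandated by the text."""
    dx = head_center[0] - body_center[0]
    dy = head_center[1] - body_center[1]
    if dx == 0:   # head directly above/below the torso centre: vertical posture
        return body_h, body_w
    if dy == 0:   # head level with the torso centre: horizontal posture
        return body_w, body_h
    # Oblique posture: one simple option is a square whose side is the
    # upper-body diagonal, which bounds the upper body at any rotation angle.
    side = math.ceil(math.hypot(body_w, body_h))
    return side, side
```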
S605: acquiring the direction of the line connecting the center point of the head area and the center point of the upper body area of the person, determining the angle to be rotated for the person according to the line direction and the target direction, and rotating the image to be identified by that angle, with the center point of the upper body area of the person as the rotation center, to obtain the rotated image.
In an alternative embodiment, the angle to be rotated for the person may be calculated using an inverse trigonometric function. Fig. 8 is a schematic diagram of the angle to be rotated in an image to be identified according to an embodiment of the present invention. As shown in fig. 8, assuming that the center point coordinates of the upper body region of the person in the image to be identified are (body_x, body_y) and the center point coordinates of the head region are (head_x, head_y), the angle θ to be rotated may be calculated by the following formulas:
deltax = head_x - body_x
deltay = head_y - body_y
θ = tan⁻¹(deltax / deltay)
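A minimal sketch of this calculation (an illustration, not the patent's code): `math.atan2` replaces tan⁻¹ so that deltay == 0 does not divide by zero. The function name and the sign convention for image coordinates (where the y-axis points downward) are assumptions, since the text leaves them implicit.

```python
import math


def angle_to_rotate(head_center, body_center):
    """Angle (radians) between the body-to-head line and the vertical
    target direction. Coordinates are image coordinates (y grows downward),
    so deltay is taken as body_y - head_y to make an upright person yield
    theta == 0; this sign convention is an assumption. atan2 replaces
    tan^-1(deltax / deltay) so that deltay == 0 (a horizontal posture)
    does not divide by zero and all four quadrants are distinguished."""
    deltax = head_center[0] - body_center[0]
    deltay = body_center[1] - head_center[1]
    return math.atan2(deltax, deltay)
```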
In this embodiment, after the rotation angle θ is determined, the rotation is performed with the center point of the upper body region as the rotation center. The specific rotation process is as follows:
Each pixel point in the image to be identified is converted from the image coordinate system to a Cartesian coordinate system, wherein the origin of the Cartesian coordinate system is the center point of the upper body area of the person. Assuming that (x, y) are the coordinates of pixel P in the image coordinate system and (x', y') are its coordinates after conversion to the Cartesian coordinate system, the conversion formulas are:

x' = x - body_x
y' = body_y - y

(the sign of the y term is flipped because the y-axis of the image coordinate system points downward while the y-axis of the Cartesian coordinate system points upward). The conversion formula in matrix form is:

[x'; y'] = [1, 0; 0, -1] · [x - body_x; y - body_y]
In the Cartesian coordinate system, each pixel point is rotated by the angle θ to be rotated, with the origin (the center point of the upper body region) as the center.
Assuming that (x', y') are the coordinates of pixel P before rotation and (x'', y'') are its coordinates after rotation, the corresponding conversion formulas are:

x'' = x'·cos θ - y'·sin θ
y'' = x'·sin θ + y'·cos θ
and finally, converting each pixel point after rotation into an image coordinate system by a Cartesian coordinate system to obtain a rotated image.
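The three steps can be sketched in code as follows. This is an illustration under stated assumptions: NumPy is used, the function name is invented, and inverse mapping with nearest-neighbour rounding is chosen so the output has no holes — a detail the text does not prescribe.

```python
import numpy as np


def rotate_about(img, center, theta):
    """Rotate a 2-D image array by theta (radians, counter-clockwise) about
    `center` = (x, y), via the three steps above: convert each pixel to a
    Cartesian frame centred on the upper-body centre, rotate, convert back.
    Inverse mapping (finding the source pixel of every destination pixel)
    avoids holes in the output; out-of-bounds pixels are left at 0."""
    h, w = img.shape
    cx, cy = center
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    ys, xs = np.mgrid[0:h, 0:w]
    # Step 1: destination pixel in Cartesian coordinates (y flipped,
    # because the image y-axis points downward).
    xd = xs - cx
    yd = cy - ys
    # Step 2: rotate by -theta to locate each destination pixel's source.
    x_src = xd * cos_t + yd * sin_t
    y_src = -xd * sin_t + yd * cos_t
    # Step 3: back to image coordinates, nearest-neighbour rounding.
    src_x = np.rint(x_src + cx).astype(int)
    src_y = np.rint(cy - y_src).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(img)
    out[valid] = img[src_y[valid], src_x[valid]]
    return out
```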
S606: and cutting and/or filling the rotated image according to the height and the width to obtain a target image corresponding to the person.
It will be appreciated that, since images are rectangular, after a rectangular image is rotated by some angle there may be pixels that fall beyond the boundaries of the original image. In this embodiment, after the image to be identified is rotated, the target image is cut from the rotated image according to the height and width obtained in step S604; if the cut target image has missing pixel points, pixel filling is performed on it so that the final target image is rectangular.
When filling the missing pixel points, the filled pixel values should affect the subsequent person re-identification as little as possible.
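Step S606 can be sketched as follows (an illustrative NumPy snippet; the function name, the choice of centring the window on the upper-body centre, and the default fill value are assumptions — the text only requires that the fill disturb later re-identification as little as possible):

```python
import numpy as np


def crop_and_pad(rotated, center, height, width, fill=0):
    """Cut a height x width window centred on `center` = (x, y) out of the
    rotated image; destinations falling outside the image are padded with
    `fill` so the result is always a full rectangle."""
    h, w = rotated.shape
    cx, cy = center
    x0 = int(round(cx - width / 2))
    y0 = int(round(cy - height / 2))
    out = np.full((height, width), fill, dtype=rotated.dtype)
    # Overlap between the requested window and the actual image.
    sx0, sy0 = max(x0, 0), max(y0, 0)
    sx1, sy1 = min(x0 + width, w), min(y0 + height, h)
    if sx1 > sx0 and sy1 > sy0:
        out[sy0 - y0:sy1 - y0, sx0 - x0:sx1 - x0] = rotated[sy0:sy1, sx0:sx1]
    return out
```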
S607: and re-identifying the person according to the target image corresponding to the person.
In this embodiment, rotating the image to be identified adjusts the posture direction of the person in the rotated image to the target direction, reducing the posture differences between persons in different images, so the accuracy of person re-identification can be improved. Taking three person re-identification algorithms as examples for verification: using the embodiment shown in fig. 6, the person posture is adjusted first and the three algorithms then perform re-identification; the accuracy of the resulting re-identification is at least 2 percentage points higher than that of the results obtained by applying the three algorithms directly (i.e., without posture adjustment before re-identification).
Fig. 9 is a schematic structural diagram of an image recognition device according to an embodiment of the present invention, as shown in fig. 9, an image recognition device 900 according to the present embodiment includes: an acquisition module 901, a detection module 902, a rotation module 903, and an identification module 904.
The acquiring module 901 is configured to acquire an image to be identified, where the image to be identified includes at least one person;
the detection module 902 is configured to detect the image to be identified, and determine a first body area and a second body area of each person, where the first body area and the second body area are two areas with different positions along the height direction;
the rotation module 903 is configured to rotate, for each person, the image to be identified according to the first body area and the second body area of the person, to obtain a rotated image, where the posture direction of the person in the rotated image is consistent with the target direction;
and the identification module 904 is used for re-identifying the person according to the rotated image.
Optionally, the rotation module 903 is specifically configured to:
acquiring a connecting line direction between a central point of the first body area and a central point of the second body area;
determining a rotation angle to be generated corresponding to the person according to the connection line direction and the target direction;
and rotating the image to be identified by the angle to be rotated by taking the center point of the second body area as a rotation center, so as to obtain the rotated image.
Optionally, the rotation module 903 is further configured to:
determining the height and width of a target image corresponding to the person according to the first body area and the second body area of the person, wherein the target image is at least part of the rotated image, and the target image comprises the upper body of the person;
the identifying module 904 is specifically configured to:
cutting and/or filling the rotated image according to the height and the width to obtain a target image corresponding to the person;
and re-identifying the person according to the target image corresponding to the person.
Optionally, the first body area is a head area, the second body area is an upper body area, and the rotation module is specifically configured to:
and determining the height and the width of the target image corresponding to the person according to the center point coordinates of the head area, the center point coordinates of the upper body area and the vertex coordinates of the upper body area of the person.
Optionally, the rotation module 903 is specifically configured to:
converting each pixel point in the image to be identified from an image coordinate system to a Cartesian coordinate system, wherein the origin of the Cartesian coordinate system is the center point of the second body area of the person;
In a Cartesian coordinate system, taking the origin as the center, and rotating the pixel points by the angle to be rotated;
and converting each pixel point after rotation into an image coordinate system from a Cartesian coordinate system to obtain a rotated image.
Optionally, the first body region is a head region and the second body region is an upper body region.
Optionally, the detection module 902 is specifically configured to:
inputting the image to be identified into a head detection model, and acquiring at least one head area in the image to be identified;
inputting the image to be identified into an upper body detection model, and acquiring at least one upper body region in the image to be identified;
and matching the at least one head area and the at least one upper body area to determine the head area and the upper body area of each person.
Optionally, the detection module 902 is specifically configured to:
for any head region of the at least one head region, acquiring the intersection-over-union (IOU) index between that head region and each upper body region, and taking the head region and the upper body region with the maximum IOU index relative to it as the head region and the upper body region of the same person.
The device of the present embodiment may be used to implement the technical solution of any of the above method embodiments, and its implementation principle and technical effects are similar, and are not described here again.
Fig. 10 is a schematic hardware structure of an image recognition device according to an embodiment of the present invention, as shown in fig. 10, an image recognition device 1000 of the present embodiment includes: at least one processor 1001 and memory 1002. The processor 1001 and the memory 1002 are connected by a bus 1003.
In a specific implementation, at least one processor 1001 executes a computer program stored in the memory 1002, so that at least one processor 1001 executes the image recognition method in any of the above method embodiments.
The specific implementation process of the processor 1001 may refer to the above method embodiment, and its implementation principle and technical effects are similar, and this embodiment will not be described herein again.
In the embodiment shown in fig. 10 described above, it should be understood that the processor may be a central processing unit (Central Processing Unit, CPU for short), another general purpose processor, a digital signal processor (Digital Signal Processor, DSP for short), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC for short), or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the present invention may be executed directly by a hardware processor, or by a combination of hardware and software modules in a processor.
The memory may comprise high speed RAM memory or may further comprise non-volatile storage NVM, such as at least one disk memory.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. Buses may be divided into address buses, data buses, control buses, etc. For ease of illustration, the buses in the drawings of the present application are not limited to only one bus or one type of bus.
The embodiment of the invention also provides a computer readable storage medium, wherein a computer program is stored in the computer readable storage medium, and when a processor executes the computer program, the image recognition method in any method embodiment is realized.
The computer readable storage medium described above may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk. A readable storage medium can be any available medium that can be accessed by a general purpose or special purpose computer.
An exemplary readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. Alternatively, the readable storage medium may be integral to the processor. The processor and the readable storage medium may reside in an application specific integrated circuit (Application Specific Integrated Circuit, ASIC for short), or may reside as discrete components in a device.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the method embodiments described above may be performed by hardware associated with program instructions. The foregoing program may be stored in a computer readable storage medium. The program, when executed, performs steps including the method embodiments described above; and the aforementioned storage medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (16)

1. An image recognition method, comprising:
acquiring an image to be identified, wherein the image to be identified comprises at least one person;
detecting the image to be identified, and determining a first body area and a second body area of each person, wherein the first body area and the second body area are two areas which are different in position along the height direction; wherein the first body region is a head region and the second body region is an upper body region; or, the first body region is a head region and the second body region is a foot region; or, the first body region is an upper body region and the second body region is a lower body region;
for each person, rotating the image to be identified according to a first body area and a second body area of the person to obtain a rotated image, and re-identifying the person according to the rotated image, wherein the gesture direction of the person in the rotated image is consistent with the target direction;
the step of rotating the image to be identified according to the first body area and the second body area of the person to obtain a rotated image, including:
Acquiring a connecting line direction between a central point of the first body area and a central point of the second body area;
determining a rotation angle to be generated corresponding to the person according to the connection line direction and the target direction;
and rotating the image to be identified by the angle to be rotated by taking the center point of the second body area as a rotation center, so as to obtain the rotated image.
2. The method of claim 1, wherein prior to rotating the image to be identified based on the first and second body areas of the person, further comprising:
determining the height and width of a target image corresponding to the person according to the first body area and the second body area of the person, wherein the target image is at least part of the rotated image, and the target image comprises the upper body of the person;
the re-identifying the person according to the rotated image comprises the following steps:
cutting and/or filling the rotated image according to the height and the width to obtain a target image corresponding to the person;
and re-identifying the person according to the target image corresponding to the person.
3. The method of claim 2, wherein the first body area is a head area and the second body area is an upper body area, wherein determining the height and width of the target image corresponding to the person from the first body area and the second body area of the person comprises:
and determining the height and the width of the target image corresponding to the person according to the center point coordinates of the head area, the center point coordinates of the upper body area and the vertex coordinates of the upper body area of the person.
4. The method according to claim 1, wherein the rotating the image to be identified by the angle to be rotated with the center point of the second body area as the rotation center, to obtain the rotated image, includes:
converting each pixel point in the image to be identified from an image coordinate system to a Cartesian coordinate system, wherein the origin of the Cartesian coordinate system is the center point of the second body area of the person;
in a Cartesian coordinate system, taking the origin as the center, and rotating the pixel points by the angle to be rotated;
and converting each pixel point after rotation into an image coordinate system from a Cartesian coordinate system to obtain a rotated image.
5. The method of claim 4, wherein the first body region is a head region and the second body region is an upper body region.
6. The method of claim 5, wherein detecting the image to be identified to determine a first body area and a second body area for each of the persons comprises:
inputting the image to be identified into a head detection model, and acquiring at least one head area in the image to be identified;
inputting the image to be identified into an upper body detection model, and acquiring at least one upper body region in the image to be identified;
and matching the at least one head area and the at least one upper body area to determine the head area and the upper body area of each person.
7. The method of claim 6, wherein said matching said at least one head region and said at least one upper body region comprises:
and acquiring an intersection ratio IOU index between the head region and each upper body region aiming at any head region in the at least one head region, and taking the upper body region corresponding to the head region and the maximum IOU index as the head region and the upper body region of the same person.
8. An image recognition apparatus, comprising:
the acquisition module is used for acquiring an image to be identified, wherein the image to be identified comprises at least one person;
the detection module is used for detecting the image to be identified and determining a first body area and a second body area of each person, wherein the first body area and the second body area are two areas which are different in position along the height direction; wherein the first body region is a head region and the second body region is an upper body region; or, the first body region is a head region and the second body region is a foot region; or, the first body region is an upper body region and the second body region is a lower body region;
the rotating module is used for rotating the image to be identified according to the first body area and the second body area of each person to obtain a rotated image, and the gesture direction of the person in the rotated image is consistent with the target direction;
the identification module is used for re-identifying the person according to the rotated image;
the rotation module is specifically used for:
Acquiring a connecting line direction between a central point of the first body area and a central point of the second body area;
determining a rotation angle to be generated corresponding to the person according to the connection line direction and the target direction;
and rotating the image to be identified by the angle to be rotated by taking the center point of the second body area as a rotation center, so as to obtain the rotated image.
9. The apparatus of claim 8, wherein the rotation module is further to:
determining the height and width of a target image corresponding to the person according to the first body area and the second body area of the person, wherein the target image is at least part of the rotated image, and the target image comprises the upper body of the person;
the identification module is specifically used for:
cutting and/or filling the rotated image according to the height and the width to obtain a target image corresponding to the person;
and re-identifying the person according to the target image corresponding to the person.
10. The device according to claim 9, wherein the first body area is a head area and the second body area is an upper body area, the rotation module being specifically configured to:
And determining the height and the width of the target image corresponding to the person according to the center point coordinates of the head area, the center point coordinates of the upper body area and the vertex coordinates of the upper body area of the person.
11. The device according to claim 8, wherein the rotation module is specifically configured to:
converting each pixel point in the image to be identified from an image coordinate system to a Cartesian coordinate system, wherein the origin of the Cartesian coordinate system is the center point of the second body area of the person;
in a Cartesian coordinate system, taking the origin as the center, and rotating the pixel points by the angle to be rotated;
and converting each pixel point after rotation into an image coordinate system from a Cartesian coordinate system to obtain a rotated image.
12. The device of claim 11, wherein the first body region is a head region and the second body region is an upper body region.
13. The apparatus of claim 12, wherein the detection module is specifically configured to:
inputting the image to be identified into a head detection model, and acquiring at least one head area in the image to be identified;
inputting the image to be identified into an upper body detection model, and acquiring at least one upper body region in the image to be identified;
and matching the at least one head area and the at least one upper body area to determine the head area and the upper body area of each person.
14. The apparatus of claim 13, wherein the detection module is specifically configured to:
for any head region in the at least one head region, acquiring an intersection-over-union (IOU) index between the head region and each upper body region, and taking the head region and the upper body region corresponding to the maximum IOU index as the head region and the upper body region of the same person.
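The IOU matching of claim 14 admits a minimal sketch, assuming axis-aligned boxes given as `(x1, y1, x2, y2)` tuples; the greedy per-head maximum used here is one plausible reading of the claim, and the function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_heads_to_upper_bodies(heads, upper_bodies):
    """For each head box, pair it with the upper-body box of maximum IOU."""
    return [(head, max(upper_bodies, key=lambda ub: iou(head, ub)))
            for head in heads]
```

A head box overlapping the top of an upper-body box scores a positive IOU with it and zero with distant boxes, so the maximum picks out the same person's upper body.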
15. An image recognition apparatus, characterized by comprising: at least one processor and memory;
the memory stores a computer program;
the at least one processor executes the computer program stored in the memory, causing the at least one processor to perform the method of any one of claims 1 to 7.
16. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 7.
CN201910213315.7A 2019-03-20 2019-03-20 Image recognition method, device and equipment Active CN111723610B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910213315.7A CN111723610B (en) 2019-03-20 2019-03-20 Image recognition method, device and equipment

Publications (2)

Publication Number Publication Date
CN111723610A CN111723610A (en) 2020-09-29
CN111723610B true CN111723610B (en) 2024-03-08

Family

ID=72562936

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910213315.7A Active CN111723610B (en) 2019-03-20 2019-03-20 Image recognition method, device and equipment

Country Status (1)

Country Link
CN (1) CN111723610B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113191354A (en) * 2021-04-21 2021-07-30 青岛海尔电冰箱有限公司 Method and equipment for improving image recognition accuracy rate and refrigerator

Citations (1)

Publication number Priority date Publication date Assignee Title
CN107992869A (en) * 2016-10-26 2018-05-04 深圳超多维科技有限公司 Method, apparatus and electronic device for correcting tilted text

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
WO2005038700A1 (en) * 2003-10-09 2005-04-28 University Of York Image recognition
JP3962803B2 (en) * 2005-12-16 2007-08-22 インターナショナル・ビジネス・マシーンズ・コーポレーション Head detection device, head detection method, and head detection program


Non-Patent Citations (1)

Title
A fast skew correction method for text images; Zeng Fanfeng; Wu Feifei; Xiao Ke; Wang Xiao; Computer Applications and Software (Issue 04); full text *

Similar Documents

Publication Publication Date Title
US11205276B2 (en) Object tracking method, object tracking device, electronic device and storage medium
CN108875524B (en) Sight estimation method, device, system and storage medium
KR101645722B1 (en) Unmanned aerial vehicle having Automatic Tracking and Method of the same
CN111862296B (en) Three-dimensional reconstruction method, three-dimensional reconstruction device, three-dimensional reconstruction system, model training method and storage medium
CN110287772B (en) Method and device for extracting palm and palm center area of plane palm
WO2022170844A1 (en) Video annotation method, apparatus and device, and computer readable storage medium
JP5715833B2 (en) Posture state estimation apparatus and posture state estimation method
KR101769601B1 (en) Unmanned aerial vehicle having Automatic Tracking
JP4951498B2 (en) Face image recognition device, face image recognition method, face image recognition program, and recording medium recording the program
US20130028517A1 (en) Apparatus, method, and medium detecting object pose
CN110926330B (en) Image processing apparatus, image processing method, and program
CN111488775B (en) Device and method for judging degree of visibility
CN112257696B (en) Sight estimation method and computing equipment
CN105205459B (en) A kind of recognition methods of characteristics of image vertex type and device
JP5001930B2 (en) Motion recognition apparatus and method
CN111027481A (en) Behavior analysis method and device based on human body key point detection
WO2016070300A1 (en) System and method for detecting genuine user
CN112749655B (en) Sight line tracking method, device, computer equipment and storage medium
CN112017212A (en) Training and tracking method and system of face key point tracking model
CN111723610B (en) Image recognition method, device and equipment
JP2019012497A (en) Portion recognition method, device, program, and imaging control system
CN112884804A (en) Action object tracking method and related equipment
EP2128820A1 (en) Information extracting method, registering device, collating device and program
CN113673288A (en) Idle parking space detection method and device, computer equipment and storage medium
CN116883981A (en) License plate positioning and identifying method, system, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant