CN112183412A - Personnel identity identification method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN112183412A
CN112183412A CN202011067605.4A
Authority
CN
China
Prior art keywords
personnel
person
identity
information
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011067605.4A
Other languages
Chinese (zh)
Inventor
朱晓宁
吴喆峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingying Digital Technology Co Ltd
Original Assignee
Jingying Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingying Digital Technology Co Ltd filed Critical Jingying Digital Technology Co Ltd
Priority to CN202011067605.4A priority Critical patent/CN112183412A/en
Publication of CN112183412A publication Critical patent/CN112183412A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

In the technical solution, person motion images are first extracted for each person contained in each to-be-detected frame among the multiple frames of video data collected in a target area; identity prediction is then performed on each person motion image with a time-series prediction model, yielding predicted identity information for each person and a plurality of person identity prediction values corresponding to that predicted identity information; the plurality of prediction values for each person are then aggregated into a person identity prediction summary value for that person; finally, each person's identity information is determined according to the matching relationship among the predicted identity information, each summary value, and a preset matching degree. Labor cost can thus be reduced while the timeliness and accuracy of identifying person identity information are improved.

Description

Personnel identity identification method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of intelligent monitoring technologies, and in particular, to a method and an apparatus for identifying a person, an electronic device, and a storage medium.
Background
During production operations, the identity information of on-site personnel needs to be identified in order to ensure production safety and production scheduling.
In the related art, the motion state and behavior patterns of personnel are learned by watching surveillance video of production operators, so that personnel identity information can be tracked and identified in time, personnel can be better managed, and production safety and production scheduling can be ensured.
However, in the related art, tracking and identification of personnel identity information is mostly performed manually and by experience; this approach is tedious, error-prone, and costly in labor while yielding a low return.
Disclosure of Invention
To solve the above problems in the related art, the present application provides a person identity identification method and apparatus, an electronic device, and a storage medium.
A first aspect of the present application provides a person identity identification method, comprising the following steps:
acquiring video data collected by a video collection device, the video data comprising at least multiple frames of images;
extracting the person motion image corresponding to each person contained in each to-be-detected frame of the multiple frames;
performing identity prediction on each person motion image according to a pre-trained time-series prediction model, to obtain predicted identity information for each person and a plurality of person identity prediction values corresponding to the predicted identity information;
aggregating the plurality of person identity prediction values corresponding to each person, to obtain a person identity prediction summary value for each person;
and determining the identity information of each person according to the matching relationship among the predicted identity information, each person identity prediction summary value, and a preset matching degree.
In some embodiments, before extracting the person motion image corresponding to each person contained in each to-be-detected frame of the multiple frames, the method further comprises:
detecting whether each of the multiple frames of images meets a preset requirement, the preset requirement indicating that a person's actions and/or position can be extracted from the image;
and taking each image that meets the preset requirement as a to-be-detected image.
In some embodiments, extracting the person motion image corresponding to each person contained in each to-be-detected frame of the multiple frames comprises:
calling a human body image segmentation model to perform human body image segmentation on each to-be-detected frame, to obtain the person motion image and the corresponding person position information of each person; the human body image segmentation model is a machine learning model trained with basic data and training samples, the basic data being the person portion of the COCO dataset, and the training samples being image samples in which the segmented images of persons to be identified are annotated in video data of the target area.
In some embodiments, the method further comprises:
determining, according to the person position information of each person, the plurality of person motion images corresponding to that person;
and dividing the plurality of person motion images corresponding to each person into that person's person category, wherein each person corresponds to one person category.
In some embodiments, the person position information includes a target frame range, and determining the plurality of person motion images respectively corresponding to each person from the person position information of each person comprises:
determining the center-point coordinate of the target frame range of a first person motion image as a first coordinate;
determining, as a second coordinate, the center-point coordinate of the target frame range of an Nth person motion image in an image frame adjacent to the first person motion image;
and if the difference between the first coordinate and the second coordinate falls within a preset variation threshold range, determining that the first person motion image and the Nth person motion image are motion images of the same person.
In some embodiments, aggregating the plurality of person identity prediction values corresponding to each person to obtain the person identity prediction summary value corresponding to each person comprises:
performing a weighted-average calculation on the plurality of person identity prediction values corresponding to any one person, and taking the weighted-average result as that person's identity prediction summary value;
alternatively,
taking, as that person's identity prediction summary value, the ratio of the number of successful identity matches among the person's identity prediction values to the total number of that person's identity prediction values.
In some embodiments, determining the identity information of each person according to the matching relationship among the predicted identity information, each person identity prediction summary value, and the preset matching degree comprises:
judging whether any one person identity prediction summary value meets a preset matching degree set in a database;
if the preset matching degree is met, determining the person's identity information to be the identity information corresponding to the predicted identity information in the database;
and if the preset matching degree is not met, determining that the person's identity information is unknown.
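The aggregation alternatives and the threshold decision above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the score format, and the example matching degree of 0.8 are assumptions.

```python
# Hypothetical sketch of the aggregation and matching steps described above.
# Names and the 0.8 example matching degree are illustrative assumptions.

def summarize_weighted(scores, weights=None):
    """Weighted average of a person's per-frame identity prediction values."""
    if weights is None:
        weights = [1.0] * len(scores)
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

def summarize_ratio(matched_flags):
    """Ratio of successfully matched predictions to all predictions."""
    return sum(1 for m in matched_flags if m) / len(matched_flags)

def decide_identity(summary_value, predicted_identity, matching_degree=0.8):
    """Return the database identity if the summary value meets the preset
    matching degree; otherwise the person's identity is unknown."""
    return predicted_identity if summary_value >= matching_degree else "unknown"
```

For example, `summarize_ratio([True, True, True, False])` yields 0.75, which falls below a 0.8 matching degree, so `decide_identity` would report the person as unknown.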
A second aspect of the present application provides a person identity identification apparatus, comprising:
an acquisition module, configured to acquire video data collected by a video collection device, the video data comprising at least multiple frames of images;
an extraction module, configured to extract the person motion image corresponding to each person contained in each to-be-detected frame of the multiple frames;
a prediction module, configured to perform identity prediction on each person motion image according to a pre-trained time-series prediction model, to obtain predicted identity information for each person and a plurality of person identity prediction values corresponding to the predicted identity information;
an aggregation module, configured to aggregate the plurality of person identity prediction values corresponding to each person, to obtain a person identity prediction summary value for each person;
and a determining module, configured to determine the identity information of each person according to the matching relationship among the predicted identity information, each person identity prediction summary value, and a preset matching degree.
A third aspect of the present application provides an electronic device comprising:
a processor; and
a memory having executable code stored thereon, which when executed by the processor, causes the processor to perform the method as described above.
A fourth aspect of the present application provides a non-transitory machine-readable storage medium having stored thereon executable code, which when executed by a processor of an electronic device, causes the processor to perform a method as described above.
In the technical solution, person motion images are first extracted for each person contained in each to-be-detected frame among the multiple frames of video data collected in a target area; identity prediction is then performed on each person motion image with a time-series prediction model, yielding predicted identity information for each person and a plurality of person identity prediction values corresponding to that predicted identity information; the plurality of prediction values for each person are then aggregated into a person identity prediction summary value for that person; finally, each person's identity information is determined according to the matching relationship among the predicted identity information, each summary value, and a preset matching degree.
Thus, based on a deep-learning computer vision method, the technical solution provided by the present application can grasp personnel action postures in real time and automatically identify on-site personnel, reducing labor cost while improving the timeliness and accuracy of identifying person identity information and improving working efficiency, thereby ensuring production safety and production scheduling.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The foregoing and other objects, features and advantages of the application will be apparent from the following more particular descriptions of exemplary embodiments of the application, as illustrated in the accompanying drawings wherein like reference numbers generally represent like parts throughout the exemplary embodiments of the application.
Fig. 1 is a schematic flow chart of a method for identifying a person according to an embodiment of the present application;
fig. 2 is a schematic diagram illustrating human motion image recognition according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a human motion image shown in an embodiment of the present application;
FIG. 4 is a schematic diagram illustrating classification of human motion images according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a person identification apparatus according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device shown in an embodiment of the present application.
Detailed Description
Preferred embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms "first," "second," "third," etc. may be used herein to describe various information, these information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
The embodiments of the present application are mainly applied to person identity recognition in on-site operations. In the related art, person identity information is identified manually: video data collected by monitoring equipment must be watched continuously to ensure production safety, production scheduling, and the like. This approach is time-consuming and labor-intensive, with poor timeliness and low working efficiency.
To solve this problem, the embodiments of the present application provide a person identity identification method and related devices, which can automatically identify on-site personnel, reduce labor cost, improve the timeliness and accuracy of identifying person identity information, improve working efficiency, and thereby ensure production safety and production scheduling.
Referring to fig. 1, fig. 1 is a schematic flow chart of a method for identifying a person according to an embodiment of the present application.
The personnel identity identification method provided by the embodiment of the application comprises the following steps:
s100, obtaining video data collected by a video collecting device; the video data at least comprises a plurality of frames of images;
in the embodiment of the application, the person identification method can be applied to an analysis server, and the front end of the analysis server is in communication connection with the video acquisition equipment. The video data collected by the video collecting device in real time can be received. Of course, the analysis server may also be other forms of devices that can handle the person identification method, such as a mobile terminal, a handheld terminal, etc. The method for identifying a person provided in the embodiment of the present application is not particularly limited as long as the method can be operated and processed.
The video data may be video data of a target area. The target area is preferably an area at the production site, but may of course be another environmental area. The video collection device in the embodiments of the present application may be disposed above the target area, for example above an operator console, as long as video data of the target area can be captured; no specific limitation is made here.
It is understood that the video data in the embodiments of the present application may be in the form of a video stream or in other forms. The video data at least comprises a plurality of frames of images.
In the embodiment of the present application, after the video data of a certain time period is collected by the video collecting device, the analysis server obtains the video data, and step S200 is triggered.
S200, extracting the person motion image corresponding to each person contained in each to-be-detected frame of the multiple frames;
in the embodiment of the application, the person motion image may be an image collected by a specific motion performed by a person during performing work, and may include image information related to the person's body shape, motion, and the like of the person.
In the embodiment of the application, a plurality of frames of images can be screened, and the images meeting the preset requirements are used as the images to be detected.
Optionally, before extracting the person moving image corresponding to each person included in each frame of the image to be detected in the multiple frames of images, the method further includes:
detecting whether each image in the multi-frame images meets preset requirements or not; the preset requirements are used for representing actions of people and/or positions of people which can be extracted from the images;
and taking the image meeting the preset requirement as the image to be detected.
In the embodiment of the application, in order to improve the identification efficiency, an image from which the actions of the person and/or the positions of the person can be extracted in the multi-frame image may be used as the image to be detected.
In the embodiments of the present application, both step S200 and the detection of whether an image meets the preset requirement may be implemented with a machine learning model.
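The pre-screening step can be sketched as a simple filter: a frame counts as a to-be-detected image only if the detector can extract at least one person from it. The `detect_persons` callable below stands in for the machine learning model and is an assumption for illustration.

```python
# Hypothetical sketch of frame pre-screening: keep only frames from which
# person actions and/or positions can be extracted. `detect_persons` is a
# stand-in for the machine learning model (an illustrative assumption).

def select_frames_to_detect(frames, detect_persons):
    """Return frames meeting the preset requirement, i.e. frames in which
    the detector finds at least one person."""
    return [f for f in frames if len(detect_persons(f)) > 0]
```

In practice `detect_persons` would be the segmentation model described below; any frame where it returns no detections is skipped, which improves processing efficiency.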
Preferably, extracting the person motion image corresponding to each person contained in each to-be-detected frame of the multiple frames comprises:
calling a human body image segmentation model to perform human body image segmentation on each to-be-detected frame, to obtain the person motion image and the corresponding person position information of each person; the human body image segmentation model is a machine learning model trained with basic data and training samples, the basic data being the person portion of the COCO dataset, and the training samples being image samples in which the segmented images of persons to be identified are annotated in video data of the target area.
The human body image segmentation model may be a machine learning model obtained by taking the open COCO dataset, extracting only the person-image portion of the data, and pre-training on the reference YOLACT++ instance segmentation model. This model is chosen because it provides:
1) Target detection: a target frame (bounding box) is drawn directly on the video image frame, so the center-point positions of different persons are easily obtained, providing conditions for the subsequent person tracking;
2) Target classification: according to the type of detected target, irrelevant categories are screened out and only person actions are focused on and extracted, which speeds up person identity recognition;
3) Pixel-level person segmentation: in each video frame image, persons at different positions are accurately distinguished at the pixel level, meeting the segmentation requirement.
The goal of this human body image segmentation model is to add a mask branch to an existing one-stage object detection model, in the same way that Mask R-CNN extends Faster R-CNN, but without an explicit localization step (e.g., feature repooling). To this end, the complex instance segmentation task is decomposed into two simpler parallel tasks whose outputs can be combined to form the final mask.
1) The first branch uses an FCN to generate a set of image-sized "prototype masks" that do not depend on any single instance.
2) The second branch adds an extra head to the object detection branch to predict a vector of "mask coefficients" that encode an instance's representation in the prototype space.
YOLACT++ thus decomposes the problem into two parallel parts, generating the "prototype masks" with convolutional layers (good at producing spatially coherent masks) and the "mask coefficients" with fully connected layers (good at producing semantic vectors), and constructs the mask for each person in the video image frame by linearly combining the results of these two branches. Through this instance segmentation model, both the judgment requirement of person detection and the task of person action extraction are completed.
It can be understood that the human body image segmentation model in the embodiments of the present application may be used to judge, in real time, the persons contained in video captured by the camera, segment the persons meeting the requirements, generate each such person's motion image, and obtain each person's position information.
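The per-person output of this segmentation step can be pictured with a small data structure holding the person motion image and the position information (including the target frame). The field and function names below are illustrative assumptions, not the patent's or YOLACT++'s actual API.

```python
# Hypothetical shape of the per-person output of the segmentation step:
# a person motion image plus position information with the target frame
# (bounding box). Field names are illustrative assumptions.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class PersonDetection:
    motion_image: object             # segmented/cropped person image
    bbox: Tuple[int, int, int, int]  # target frame (x1, y1, x2, y2)

    @property
    def center(self) -> Tuple[float, float]:
        """Center point of the target frame, used later for tracking."""
        x1, y1, x2, y2 = self.bbox
        return ((x1 + x2) / 2, (y1 + y2) / 2)

def segment_frame(frame, model) -> List[PersonDetection]:
    """Wrap each (image, bbox) pair returned by an assumed segmentation
    model into a PersonDetection."""
    return [PersonDetection(img, box) for img, box in model(frame)]
```

The `center` property is what makes the subsequent inter-frame tracking straightforward, as noted in point 1) above.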
For example, Fig. 2 shows a schematic diagram of person motion image recognition: if 3 persons are recognized in the same video frame, a person motion image and position information are generated for each of the 3 persons. As shown in Fig. 3, Fig. 3 is a schematic diagram of a person motion image according to an embodiment of the present application.
In a training sample, the annotated content may be, for example, the person's name, height, employee number, sex, age, post, and so on; the specific annotated content may be defined by the user according to actual needs. The annotated content affects the output result.
In the embodiments of the present application, only images meeting the preset requirement are processed, which improves processing efficiency. Because the adopted machine learning model focuses only on recognizing and extracting person actions, recognition efficiency and precision are improved compared with other machine learning models. In addition, because the adopted model can perform several functions such as screening and segmentation simultaneously, system integration is greatly improved.
In the embodiments of the present application, after the person motion image of each person is obtained, the method further comprises:
determining, according to the person position information of each person, the plurality of person motion images corresponding to that person;
and dividing the plurality of person motion images corresponding to each person into that person's person category, wherein each person corresponds to one person category.
Because multiple frames of images are processed and each person follows its own route of action, classification processing is performed for each person: each person corresponds to one person category, and all person categories are stored in a cache.
Fig. 4 is a schematic diagram illustrating classification of human motion images according to an embodiment of the present application.
Persons in a video image can be classified into person categories according to the different coordinate position areas in which they appear.
It is to be understood that this sorting process may also be understood as a process of tracking each person.
The person position information includes a target frame range, and determining the plurality of person motion images corresponding to each person according to the person position information of each person includes:
determining the center-point coordinate of the target frame range of a first person motion image as a first coordinate;
determining, as a second coordinate, the center-point coordinate of the target frame range of an Nth person motion image in an image frame adjacent to the first person motion image;
and if the difference between the first coordinate and the second coordinate falls within a preset variation threshold range, determining that the first person motion image and the Nth person motion image are motion images of the same person.
In this embodiment, the person position information may include a target frame range, obtained with the human body image segmentation model during position extraction, and the target frame range may include a center-point coordinate. Inter-frame consistency means that the position of the same person differs little between adjacent frame images: taking any person motion image as the first motion image, the position of each person in the first motion image changes little relative to the adjacent frame images, so the change between the center-point coordinate in the first person motion image and that in the adjacent frame should be small. For example, if the first coordinate in the first person motion image is (222, 222) and the second coordinate in the Nth person motion image is (220, 220), their difference is (2, 2); with a preset variation threshold range of (5, 5), the difference falls within the range, so the two center points are determined to belong to the same person. Of course, the preset variation range may also be (-5, -5) and so on, and may be set according to actual needs, which is not repeated here.
It should be noted that the center-point coordinate of the target frame range may serve as the center-point coordinate of the human body; of course, to improve accuracy, the first coordinate may instead be the human-body center-point coordinate of the first person motion image.
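The inter-frame matching rule above can be sketched as follows, using the (222, 222) / (220, 220) example with a (5, 5) threshold. The greedy category-assignment helper is an illustrative assumption about how the rule would be applied frame by frame.

```python
# Hypothetical sketch of the tracking rule described above: two center
# points in adjacent frames belong to the same person if the coordinate
# difference stays within the preset variation threshold range.

def same_person(first_center, second_center, threshold=(5, 5)):
    """Compare |dx| and |dy| against the preset variation threshold range."""
    dx = abs(first_center[0] - second_center[0])
    dy = abs(first_center[1] - second_center[1])
    return dx <= threshold[0] and dy <= threshold[1]

def assign_to_categories(tracks, centers, threshold=(5, 5)):
    """Greedily append each new center to an existing person category whose
    last center is close enough; otherwise start a new category."""
    for c in centers:
        for track in tracks:
            if same_person(track[-1], c, threshold):
                track.append(c)
                break
        else:
            tracks.append([c])
    return tracks
```

Run over successive frames, each list in `tracks` accumulates one person's path movement information, which is the per-person classification described above.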
In practical use, classifying the persons in the video image according to their different coordinate position areas may include:
using the extracted person motion images of the same frame, the human-body center-point coordinates within those person motion images, and each person's target frame (bounding box) to classify the persons.
the detected person is taken as an example for explanation, and the coordinates of the upper left corner of the target frame are (x) in the same manner as the other persons1,y1) The coordinate of the lower right corner is (x)2,y2) Then the coordinates of the center point of the person are:
Pcenter=[(x1+x2)÷2,(y1+y2)÷2])
considering that the height is larger than the width in the upright state of the human body, but the height is uncertain in the state of lying, squatting or sitting, therefore if the action image is cut into a data image which can be judged, the action image is cut, and the posture of the human body can not be stretched, so the action image is cut according to the position of the center point of the person according to the larger one of the length and the width, wherein (x)2-x1) The expression width (y)2-y1) Denotes high, LmaxThe larger of the two is represented by the following calculation:
Lmax=max((x2-x1),(y2-y1))
then, according to the center point coordinate PcenterAnd a longer side LmaxCan be obtained as PcenterIs a center with a side length of LmaxThen the image is scaled to a size that meets the input conditions.
It is understood that, in the embodiments of the present application, the input of the subsequent step S300 is a 224 × 224 matrix image; therefore, the person motion image cropped in this process is scaled so that its height and width are both 224 pixels.
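The P_center and L_max formulas and the square crop can be sketched directly. Clipping the crop to the image bounds is an added practical detail (an assumption), and the final scaling to 224 × 224 is left as a comment since it would use an image library such as OpenCV's `cv2.resize`.

```python
# Hypothetical sketch of the cropping rule above: compute P_center and
# L_max from the target frame, then take a square crop of side L_max
# centered at P_center, clipped to the image bounds. The resulting crop
# would then be scaled to the 224 x 224 input size (e.g. with cv2.resize;
# the resize itself is omitted here).

def p_center(x1, y1, x2, y2):
    """P_center = [(x1 + x2) / 2, (y1 + y2) / 2]."""
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def l_max(x1, y1, x2, y2):
    """L_max = max(width, height) of the target frame."""
    return max(x2 - x1, y2 - y1)

def square_crop_box(x1, y1, x2, y2, img_w, img_h):
    """Square region of side L_max centered at P_center, clipped to image."""
    cx, cy = p_center(x1, y1, x2, y2)
    half = l_max(x1, y1, x2, y2) / 2
    left = max(0, cx - half)
    top = max(0, cy - half)
    right = min(img_w, cx + half)
    bottom = min(img_h, cy + half)
    return (left, top, right, bottom)
```

Cropping a square of the longer side, rather than resizing the raw box, avoids stretching the human posture exactly as the text explains.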
Finally, persons are classified by the center points of their different areas through inter-frame consistency, that is, the positions of the same person in adjacent frames differ little; if occlusion between persons occurs, they are distinguished according to their front-to-back (depth) order. The action images generated for the different person categories are then saved to different places, such as cache folders.
Thus, in the embodiments of the present application, persons in the images are tracked frame by frame according to inter-frame consistency, achieving tracking of different persons and accurately acquiring the path movement information of different persons. Because the path a person moves between two adjacent frames shows no large deviation, different persons in different frames can be accurately classified using this path movement information.
S300, respectively carrying out identity prediction on each personnel moving image according to a pre-trained time sequence prediction model to obtain the predicted identity information of each personnel and a plurality of personnel identity prediction information values corresponding to the predicted identity information;
S400, summarizing the plurality of personnel identity prediction information values corresponding to each person respectively to obtain a personnel identity prediction information summary value corresponding to each person;
the identity prediction of each person moving image according to the time series prediction model to obtain the predicted identity information of each person and the plurality of person identity prediction information values corresponding to the predicted identity information specifically includes: and inputting a plurality of personnel moving images under any personnel category into the time series prediction model as an image sequence to obtain personnel identity information output by the time series prediction model and a plurality of personnel identity prediction information values corresponding to the personnel identity information.
In the embodiment of the application, a time series prediction model is provided, which is obtained by training a long short-term memory (LSTM) network.
In practical use, the training process of the time series prediction model in the embodiment of the present application may include:
making a video frame image serialization sample containing personnel from the image of the target area;
segmenting and labeling the serialized samples containing personnel, and then making them into detection samples;
marking a tag of personnel identity information by using the marked serialized sample of the personnel, and expanding a personnel action database;
training and improving the time series prediction model (a Long Short-Term Memory network, LSTM, which memorizes the movement of the person), so that it becomes more stable and robust over time.
Preferably, the time-series prediction model in the embodiment of the present application is trained by using the human motion image in the human motion database.
Wherein the respectively performing identity prediction on each of the human moving images according to the time-series prediction model may include:
acquiring a classified personnel action image;
the moving images of all the persons appearing in the current frame are respectively put into a trained time sequence prediction model to predict the identity information of the persons;
and respectively recording and storing all the personnel identity prediction information values of the frame.
The person identity prediction information value may be the predicted person identity information together with a specific prediction value, or a prediction result and other related specific parameter values.
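The per-frame prediction bookkeeping described in the steps above can be sketched as follows; `model.predict` is a stand-in for the real LSTM inference call and is an assumption for illustration.

```python
def record_predictions(model, person_images):
    """Run the trained model on each classified person's action images.

    person_images: {person_id: [frame_image, ...]} — the classified images.
    Returns {person_id: [(predicted_identity, prediction_value), ...]},
    i.e. one person identity prediction information value per frame.
    """
    records = {}
    for person_id, frames in person_images.items():
        # One prediction per frame; all values for this frame are stored.
        records[person_id] = [model.predict(frame) for frame in frames]
    return records
```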
In this embodiment of the application, summarizing the plurality of person identity prediction information values corresponding to each person respectively to obtain a person identity prediction information summarized value corresponding to each person includes:
carrying out weighted average calculation on a plurality of personnel identity prediction information values corresponding to any one person, and taking a weighted average calculation result as a personnel identity prediction information summary value of the person;
alternatively,
and taking the ratio of the number of the successfully matched identity of the person in the identity prediction information values corresponding to any person to the number of the identity prediction information values of the person as a person identity prediction information summary value of the person.
In actual use, the method can comprise the following steps:
acquiring all the person identity prediction information values obtained so far and the corresponding person identity prediction information.
Taking the 40 frames of images corresponding to a person as an example, model prediction is performed on the 40 frames to obtain 40 pieces of person prediction information. These 40 pieces of predicted information are matched against the person identity information stored in the person action database; if the predicted information of 30 of the frames matches the identity information of person A in the database, the matching degree of the predicted person being person A is 30/40 = 0.75.
Or carrying out weighted average calculation on each predicted value to obtain a total value of the personnel identity predicted information of the personnel to be 0.75.
Of course, other summarization manners, such as counting the number of matching results, may also be used; the specific summarization process is not limited, as long as a summary result is obtained.
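The two summarization strategies described above can be sketched as follows; the function names are assumptions for illustration.

```python
def match_ratio(predicted_ids, database_id):
    """Strategy 1: fraction of frame-level predictions matching one identity."""
    matches = sum(1 for pid in predicted_ids if pid == database_id)
    return matches / len(predicted_ids)


def weighted_average(values, weights=None):
    """Strategy 2: weighted average of per-frame prediction values."""
    if weights is None:
        weights = [1.0] * len(values)  # uniform weights by default
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)
```

With 30 of 40 frames matching one database identity, `match_ratio` returns 0.75, consistent with the example above.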
S500, determining the personnel identity information of each personnel according to the matching relation between the predicted identity information, the total value of each personnel identity predicted information and a preset matching degree.
Finally, the embodiment of the application needs to determine the personal identity information of each person.
Wherein, the determining the personnel identity information of each personnel according to the matching relationship between the predicted identity information, the total value of each personnel identity predicted information and the preset matching degree comprises:
judging whether any one of the personnel identity prediction information summary values accords with a preset matching degree set in a personnel action database;
if the preset matching degree is met, determining the identity information of the person as the identity information corresponding to the predicted identity information in the person action database;
and if the matching degree does not accord with the preset matching degree, determining that the identity information of the person is unknown.
If the preset matching degree is not met, the method further comprises the following steps:
and storing the personnel moving images corresponding to the personnel identity prediction information summary value which does not accord with the preset matching degree as the data to be trained of the time series prediction model.
Because the person identity prediction information summary value has been obtained in the previous step, if the matching degree is high, the person is a certain person in the person action database, and the identity information of the person at that specific position in the video stream is obtained; otherwise, the person at that position has an identity not recorded in the person action database, and the person moving images of this person (for example, those in the above-described cache folder) may be saved in the database.
These images are then used as serialized samples of the person to be labeled, and the time series prediction model is trained and improved with them, so that an accurate result can be output in subsequent recognition.
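The final decision step can be sketched as a threshold comparison; the value 0.7 is an assumed placeholder for the preset matching degree, not a value given by the patent.

```python
MATCH_THRESHOLD = 0.7  # assumed preset matching degree


def decide_identity(predicted_identity, summary_value,
                    threshold=MATCH_THRESHOLD):
    """Return the confirmed identity, or None for 'unknown'.

    An unknown result means the person's moving images should be saved
    as data to be labeled for retraining the time-series model.
    """
    if summary_value >= threshold:
        return predicted_identity
    return None
```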
According to the above embodiment, people can be shot in real time through the camera; from the obtained video data, all people in the video are detected and their action information is extracted to obtain person action images; inter-frame consistency is fully utilized to track people at different positions throughout; the identities of different people in the same frame are predicted and recognized; and the person identity prediction summary values of different people in the video data are collected and counted to finally obtain the identity recognition information of each person. Compared with the prior art, the technical scheme shown in the embodiment of the application is based on a deep-learning computer vision method, grasps the action posture and position state of personnel in real time, realizes intelligent monitoring of the person identity recognition process, and fundamentally ensures the quality and speed of identity recognition.
Corresponding to the embodiment of the application function implementation method, the application also provides a personnel identity recognition device, electronic equipment and a corresponding embodiment.
Fig. 5 is a schematic structural diagram of a person identification apparatus according to an embodiment of the present application.
Referring to fig. 5, a person identification apparatus disclosed in an embodiment of the present application includes:
the acquisition module 1 is used for acquiring video data acquired by the video acquisition device; the video data at least comprises a plurality of frames of images;
the extraction module 2 is used for extracting a personnel moving image corresponding to each personnel contained in each frame of image to be detected in the multi-frame images;
the prediction module 3 is used for respectively predicting the identity of each personnel moving image according to a pre-trained time sequence prediction model to obtain the predicted identity information of each personnel and a plurality of personnel identity prediction information values corresponding to the predicted identity information;
the summarizing module 4 is used for summarizing the plurality of personnel identity prediction information values corresponding to each personnel respectively to obtain a personnel identity prediction information summarizing value corresponding to each personnel;
and the determining module is used for determining the personnel identity information of each personnel according to the matching relation between the predicted identity information, each personnel identity predicted information aggregate value and a preset matching degree.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 6 is a schematic structural diagram of an electronic device shown in an embodiment of the present application.
Referring to fig. 6, the electronic device 1000 includes a memory 1010 and a processor 1020.
The Processor 1020 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 1010 may include various types of storage units, such as system memory, read-only memory (ROM), and permanent storage. The ROM may store static data or instructions needed by the processor 1020 or other modules of the computer. The persistent storage device may be a read-write storage device, and may be a non-volatile storage device that does not lose the stored instructions and data even after the computer is powered off. In some embodiments, a mass storage device (e.g., a magnetic or optical disk, or flash memory) is employed as the persistent storage device. In other embodiments, the permanent storage may be a removable storage device (e.g., a floppy disk or optical drive). The system memory may be a read-write memory device or a volatile read-write memory device, such as dynamic random access memory. The system memory may store instructions and data that some or all of the processors require at runtime. Furthermore, the memory 1010 may include any combination of computer-readable storage media, including various types of semiconductor memory chips (DRAM, SRAM, SDRAM, flash memory, programmable read-only memory), and magnetic and/or optical disks. In some embodiments, the memory 1010 may include a readable and/or writable removable storage device, such as a compact disc (CD), a read-only digital versatile disc (e.g., DVD-ROM, dual-layer DVD-ROM), a read-only Blu-ray disc, an ultra-density optical disc, a flash memory card (e.g., SD card, mini SD card, Micro-SD card, etc.), a magnetic floppy disk, or the like. Computer-readable storage media do not contain carrier waves or transitory electronic signals transmitted by wireless or wired means.
The memory 1010 has stored thereon executable code that, when processed by the processor 1020, may cause the processor 1020 to perform some or all of the methods described above.
The aspects of the present application have been described in detail hereinabove with reference to the accompanying drawings. In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments. Those skilled in the art should also appreciate that the acts and modules referred to in the specification are not necessarily required in the present application. In addition, it can be understood that the steps in the method of the embodiment of the present application may be sequentially adjusted, combined, and deleted according to actual needs, and the modules in the device of the embodiment of the present application may be combined, divided, and deleted according to actual needs.
Furthermore, the method according to the present application may also be implemented as a computer program or computer program product comprising computer program code instructions for performing some or all of the steps of the above-described method of the present application.
Alternatively, the present application may also be embodied as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) having stored thereon executable code (or a computer program, or computer instruction code) which, when executed by a processor of an electronic device (or electronic device, server, etc.), causes the processor to perform part or all of the various steps of the above-described method according to the present application.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the applications disclosed herein may be implemented as electronic hardware, computer software, or combinations of both.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems and methods according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present application, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A personnel identification method is characterized by comprising the following steps:
acquiring video data acquired by a video acquisition device; the video data at least comprises a plurality of frames of images;
extracting personnel moving images corresponding to personnel contained in each frame of image to be detected in the multi-frame images;
respectively carrying out identity prediction on each personnel moving image according to a pre-trained time sequence prediction model to obtain the predicted identity information of each personnel and a plurality of personnel identity prediction information values corresponding to the predicted identity information;
respectively summarizing a plurality of personnel identity prediction information values corresponding to each personnel to obtain a personnel identity prediction information summary value corresponding to each personnel;
and determining the personnel identity information of each personnel according to the matching relationship among the predicted identity information, the total value of each personnel identity predicted information and the preset matching degree.
2. The method according to claim 1, wherein before extracting the human motion image corresponding to each of the human contained in each of the to-be-detected images in the plurality of frames of images, the method further comprises:
detecting whether each image in the multi-frame images meets preset requirements or not; the preset requirements are used for representing actions of people and/or positions of people which can be extracted from the images;
and taking the image meeting the preset requirement as the image to be detected.
3. The method according to claim 1, wherein the extracting of the human motion image corresponding to each of the human contained in each of the to-be-detected images in the plurality of frames of images comprises:
calling a human body image segmentation model to segment the human body image of each frame of image to be detected to obtain a personnel motion image of each personnel and corresponding personnel position information; the human body image segmentation model is a machine learning model which is trained by using basic data and training samples, the basic data is data of a human figure part in a COCO data set, and the training samples are image samples which mark segmented images of a person to be identified in video data of the target area.
4. The method of claim 3, further comprising:
determining a plurality of personal moving images corresponding to each person according to the personal position information of each person;
dividing a plurality of person moving images corresponding to each person into respective person categories; wherein each person corresponds to a category of persons.
5. The method according to claim 4, wherein the person position information includes a target frame range, and the determining a plurality of person moving images respectively corresponding to each person from the person position information of each person includes:
determining the center point coordinate of the target frame range of any one person moving image as a first coordinate;
determining the center point coordinate of the target frame of the nth human moving image as an adjacent image frame to the human moving image as a second coordinate;
and if the difference value of the first coordinate and the second coordinate meets a preset variation threshold range, determining that the first person motion image and the Nth person motion image are the person motion images of the same person.
6. The method according to claim 1, wherein the collecting the plurality of predicted personal identity information values corresponding to each person to obtain a collected personal identity information value corresponding to each person comprises:
carrying out weighted average calculation on a plurality of personnel identity prediction information values corresponding to any one person, and taking a weighted average calculation result as a personnel identity prediction information summary value of the person;
alternatively,
and taking the ratio of the number of the successfully matched identity of the person in the identity prediction information values corresponding to any person to the number of the identity prediction information values of the person as a person identity prediction information summary value of the person.
7. The method according to claim 1, wherein the determining the person identity information of each person according to the matching relationship between the predicted identity information, the total value of each person identity predicted information and a preset matching degree comprises:
judging whether any one of the personnel identity prediction information summary values meets a preset matching degree set in a database;
if the preset matching degree is met, determining the identity information of the person as the identity information corresponding to the predicted identity information in the database;
and if the matching degree does not accord with the preset matching degree, determining that the identity information of the person is unknown.
8. A person identification apparatus, comprising:
the acquisition module is used for acquiring video data acquired by the video acquisition device; the video data at least comprises a plurality of frames of images;
the extraction module is used for extracting a personnel moving image which corresponds to each personnel contained in each frame of image to be detected in the multi-frame images;
the prediction module is used for respectively predicting the identity of each personnel moving image according to a pre-trained time sequence prediction model to obtain the predicted identity information of each personnel and a plurality of personnel identity prediction information values corresponding to the predicted identity information;
the summarizing module is used for summarizing the plurality of personnel identity prediction information values corresponding to each personnel respectively to obtain a personnel identity prediction information summarizing value corresponding to each personnel;
and the determining module is used for determining the personnel identity information of each personnel according to the matching relation between the predicted identity information, each personnel identity predicted information aggregate value and a preset matching degree.
9. An electronic device, comprising:
a processor; and
a memory having executable code stored thereon, which when executed by the processor, causes the processor to perform the method of any one of claims 1-7.
10. A non-transitory machine-readable storage medium having stored thereon executable code, which when executed by a processor of an electronic device, causes the processor to perform the method of any one of claims 1-6.
CN202011067605.4A 2020-10-06 2020-10-06 Personnel identity identification method and device, electronic equipment and storage medium Pending CN112183412A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011067605.4A CN112183412A (en) 2020-10-06 2020-10-06 Personnel identity identification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011067605.4A CN112183412A (en) 2020-10-06 2020-10-06 Personnel identity identification method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112183412A true CN112183412A (en) 2021-01-05

Family

ID=73947791

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011067605.4A Pending CN112183412A (en) 2020-10-06 2020-10-06 Personnel identity identification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112183412A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113158813A (en) * 2021-03-26 2021-07-23 精英数智科技股份有限公司 Real-time statistical method and device for flow target
CN113887373A (en) * 2021-09-27 2022-01-04 中关村科学城城市大脑股份有限公司 Attitude identification method and system based on urban intelligent sports parallel fusion network
CN117495063A (en) * 2024-01-03 2024-02-02 中关村科学城城市大脑股份有限公司 Police resource scheduling method, apparatus, electronic device and computer readable medium

CN117495063B (en) * 2024-01-03 2024-04-12 中关村科学城城市大脑股份有限公司 Police resource scheduling method, apparatus, electronic device and computer readable medium

Similar Documents

Publication Publication Date Title
CN107784282B (en) Object attribute identification method, device and system
WO2021047232A1 (en) Interaction behavior recognition method, apparatus, computer device, and storage medium
CN112183412A (en) Personnel identity identification method and device, electronic equipment and storage medium
CN110738101A (en) Behavior recognition method and device and computer readable storage medium
CN110751022A (en) Urban pet activity track monitoring method based on image recognition and related equipment
CN108171112A (en) Vehicle identification and tracking based on convolutional neural networks
CN107992819B (en) Method and device for determining vehicle attribute structural features
CN105512683A (en) Target positioning method and device based on convolution neural network
CN111639653B (en) False detection image determining method, device, equipment and medium
CN111061898A (en) Image processing method, image processing device, computer equipment and storage medium
CN111126393A (en) Vehicle appearance refitting judgment method and device, computer equipment and storage medium
CN111814733A (en) Concentration degree detection method and device based on head posture
CN115620393A (en) Fine-grained pedestrian behavior recognition method and system oriented to automatic driving
CN114870384A (en) Taijiquan training method and system based on dynamic recognition
CN110956157A (en) Deep learning remote sensing image target detection method and device based on candidate frame selection
CN114331961A (en) Method for defect detection of an object
CN110472608A (en) Image recognition tracking processing method and system
CN114511589A (en) Human body tracking method and system
CN113780145A (en) Sperm morphology detection method, sperm morphology detection device, computer equipment and storage medium
CN113313098A (en) Video processing method, device, system and storage medium
CN115482425A (en) Key point identification method, model training method, device and storage medium
Harish et al. New features for webcam proctoring using python and opencv
CN110728316A (en) Classroom behavior detection method, system, device and storage medium
CN112581495A (en) Image processing method, device, equipment and storage medium
CN114743257A (en) Method for detecting and identifying image target behaviors

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination