WO2022121498A1 - Identity recognition method, model training method, apparatus, device and storage medium - Google Patents
- Publication number
- WO2022121498A1 (PCT/CN2021/124112)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- attributes
- network
- target person
- classification model
- standard
- Prior art date
Classifications
- G06F18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06N3/08 — Computing arrangements based on biological models; neural networks; learning methods
- G06V10/764 — Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
- G06V10/774 — Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06V10/82 — Image or video recognition or understanding using neural networks
- G06V20/40 — Scenes; scene-specific elements in video content
Definitions
- The embodiments of the present application relate to the technical field of security monitoring, and in particular to an identity recognition method, a model training method, an apparatus, a device, and a storage medium.
- An embodiment of the present application provides an identity recognition method, including: acquiring a video image in a monitoring scene; if a target person is detected in the video image, determining multiple attributes of the target person according to a pre-trained multi-attribute classification model, where the multi-attribute classification model is trained on a pre-built sample set that includes several images annotated with attributes; determining the standard attributes of identities that meet the entry conditions of the monitoring scene; and identifying, from the target person's attributes and the standard attributes, whether the target person's identity meets the entry conditions.
- An embodiment of the present application also provides a method for training a multi-attribute classification model, including: obtaining a public image data set; labeling multiple attributes of the persons in images that meet preset labeling conditions, thereby constructing the sample set; determining the network structure and configuring the network's hyperparameters; and training the configured network on the sample set to obtain the multi-attribute classification model.
- An embodiment of the present application also provides a training apparatus for a multi-attribute classification model, including: an acquisition module for obtaining a public image data set; an annotation module for labeling multiple attributes of the persons in qualifying images and constructing the sample set; a configuration module for determining the network structure and configuring the network's hyperparameters; and a training module for training the configured network on the sample set to obtain the multi-attribute classification model.
- An embodiment of the present application further provides an electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable it to perform the above identity recognition method.
- Embodiments of the present application further provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above identity recognition method.
- Fig. 1 is a flowchart of the identity recognition method mentioned in the first embodiment of the present application;
- Fig. 2 is a schematic diagram of the multi-task classification model and the single-task classification model mentioned in the first embodiment;
- Fig. 3 is a schematic diagram of introducing an attention mechanism into the multi-attribute classification model mentioned in the second embodiment;
- Fig. 4 is the original, unannotated image mentioned in the second embodiment;
- Fig. 5 is a schematic diagram of different regions marked with different colors, mentioned in the second embodiment;
- Fig. 6 is a flowchart of the implementation of determining the target person's attributes with the pre-trained multi-attribute classification model, mentioned in the second embodiment;
- Fig. 7 is a schematic diagram of the mask image corresponding to the upper-garment region, mentioned in the second embodiment;
- Fig. 9 is a schematic diagram of the training apparatus for the multi-attribute classification model, mentioned in the fourth embodiment;
- Fig. 10 is a schematic structural diagram of the electronic device mentioned in the fifth embodiment.
- The main purpose of the embodiments of the present application is to propose an identity recognition method, a model training method, an apparatus, a device, and a storage medium that simplify acquisition of the sample set, reduce the risk of model overfitting, and improve the model's generalization ability so it can serve the needs of more monitoring scenarios.
- A model tailored to a specified scene achieves higher accuracy in that scene, but may fail completely when switched to another, similar scene. For example, if a model built for the data room of hospital A is migrated to the data room of hospital B, the style and color of hospital B's staff uniforms may differ from hospital A's. Because the model learned only the characteristics of hospital A's uniforms, it may fail completely in hospital B. Applying the model to hospital B would then require collecting data in hospital B's data room and retraining, which limits large-scale deployment; in other words, the model's generalization ability is poor.
- The embodiments of the present application therefore provide the following identity recognition method, which aims to simplify acquisition of the sample set, reduce the risk of model overfitting, and improve the generalization ability of the model.
- the first embodiment of the present application relates to an identification method, which is applied to an electronic device; wherein, the electronic device may be a server.
- the application scenarios of this embodiment may include, but are not limited to, scenarios with security monitoring requirements, such as hospital data rooms, police station data rooms, bank data rooms, military jurisdictions, prisons, and factory workshops.
- the flowchart of the identity recognition method of this embodiment includes:
- Step 101 Acquire a video image in the monitoring scene.
- the monitoring scene may be the above-mentioned hospital data room, police station data room, bank data room, military jurisdiction, prison, factory workshop, and so on.
- Several surveillance cameras can be deployed in the monitoring scene to collect video images in the monitoring scene, and transmit the collected video images to the server, so that the server can obtain the video images in the monitoring scene.
- several surveillance cameras can collect video images in the surveillance scene in real time, so that the server can acquire the video images in the surveillance scene in real time, so as to improve the reliability of the surveillance.
- Step 102 If it is detected that the target person appears in the video image, according to the pre-trained multi-attribute classification model, various attributes of the target person are determined.
- the target person can be understood as any person appearing in the video image. That is, if any one person is detected in the video image, it can be determined that the target person is detected in the video image.
- the server performs target detection on the video image, and when the detected target is a person, it is determined that the target person appears in the video image.
- One way to determine whether a target person appears in the video image is to use a pre-trained pedestrian detection model to detect persons in the video image.
- The target detection network structure may be a one-stage or a two-stage target detection network structure.
- One-stage structures include, but are not limited to, the Single Shot Detector (SSD), You Only Look Once (YOLO), and Fully Convolutional One-Stage Object Detection (FCOS).
- A two-stage structure may be the Faster Region-based CNN (Faster RCNN).
- After the pedestrian detection model is obtained by training, the method may further include:
- Performance evaluation of the pedestrian detection model: evaluate the performance of the trained pedestrian detection model. If the performance does not meet the application's needs, return to step (2) above to reselect the target detection network structure, or reconfigure the network hyperparameters, and retrain the pedestrian detection model.
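The train-evaluate-retune loop described above can be sketched in a few lines. Everything here is illustrative: `train_and_eval` is a stand-in that returns a made-up accuracy per hyperparameter configuration, not a real training routine, and the target accuracy is an assumed value.

```python
# Sketch of the evaluate/retune loop: keep retraining with new hyperparameters
# until the detector meets the application's accuracy target.
TARGET_ACCURACY = 0.90  # assumed application requirement

def train_and_eval(hparams):
    # Stand-in for "train the network, then evaluate it": returns a fake
    # accuracy per configuration instead of actually training anything.
    return {"lr=0.1": 0.72, "lr=0.01": 0.85, "lr=0.001": 0.93}[hparams]

candidates = ["lr=0.1", "lr=0.01", "lr=0.001"]
chosen = None
for hparams in candidates:              # step (2): (re)configure and train
    accuracy = train_and_eval(hparams)  # step (3): evaluate performance
    if accuracy >= TARGET_ACCURACY:     # good enough for the application
        chosen = hparams
        break
print(chosen)
```

In a real pipeline the candidate list would also cover different network structures (e.g. SSD vs. Faster RCNN), not just hyperparameters.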
- Step (3): if the performance meets the application's requirements, the following steps may also be performed:
- The data processed by the pedestrian detection model is video data. Because hardware computing power is limited, the trained pedestrian detection model can be quantized and compressed to keep inference efficient; quantization and compression can effectively improve the model's operating efficiency.
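As a rough illustration of why quantization shrinks a model, here is a minimal sketch of symmetric int8 weight quantization: float32 weights become one int8 code each plus a shared scale, roughly a 4x size reduction. Real pipelines (e.g. TensorRT) also calibrate activations and fuse layers, which this toy omits.

```python
# Minimal post-training weight quantization sketch: map float weights to
# int8 codes in [-127, 127] plus one float scale, then reconstruct them.
def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127.0
    codes = [round(w / scale) for w in weights]  # int8 codes
    return codes, scale

def dequantize(codes, scale):
    return [c * scale for c in codes]

weights = [0.52, -1.27, 0.003, 0.98]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
# Rounding bounds the per-weight error by scale / 2.
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, round(max_err, 4))
```

The tradeoff is visible directly: smaller codes, but a bounded reconstruction error that the classifier must tolerate.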
- Various attributes of the target person are determined according to a pre-trained multi-attribute classification model, where the multi-attribute classification model is trained on a pre-built sample set that includes several images annotated with attributes.
- the above-mentioned multi-attribute classification model can be understood as a multi-task classification model, each classification task can be understood as the classification of one attribute, and multiple classification tasks can be understood as the classification of multiple attributes. Compared with the single-task classification model, multiple classification tasks share the same backbone network, and multi-task learning can promote the model to learn shared feature representations and improve the generalization ability of the model.
- the above-mentioned various attributes may include, but are not limited to: whether to wear a hat, whether to wear an epaulet, the color of the clothes, the texture of the clothes, and the style of the clothes.
- the single-task classification model is the classification model 1, classification model 2...classification model n in the figure
- the classification task of classification model 1 is to classify the attribute of clothing style
- the classification task of classification model 2 is to classify the attribute of clothing color
- the classification task of the classification model n is to classify the attribute of whether to wear a hat.
- the classification task of the multi-task classification model is: classification of various attributes such as clothing style, clothing color, whether to wear a hat, etc.
- Each single-task classification model requires its own backbone network, so completing multiple classification tasks requires multiple backbones; in this embodiment, the multi-task classification model needs only one backbone network, shared by all classification tasks, which improves the network's operating efficiency.
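The shared-backbone idea can be sketched in plain Python. The backbone and heads below are random stand-ins for a real CNN such as MobileNet, meant only to show the structure: the (expensive) feature extraction runs once per image, while each attribute gets its own lightweight head.

```python
# Multi-task classification sketch: n heads share one backbone, so features
# are computed once instead of n times. All components are toy stand-ins.
import random

random.seed(0)
FEAT_DIM = 8

def backbone(image_pixels):
    # Stand-in for a shared CNN backbone: pixels -> feature vector.
    return [(sum(image_pixels) * (i + 1)) % 7 for i in range(FEAT_DIM)]

def make_head(num_classes):
    # One classification head per attribute (hat / clothing color / style...).
    weights = [[random.random() for _ in range(FEAT_DIM)]
               for _ in range(num_classes)]
    def head(features):
        scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
        return scores.index(max(scores))  # argmax -> predicted class id
    return head

heads = {
    "hat": make_head(2),            # wearing a hat: yes / no
    "clothes_color": make_head(5),  # e.g. white / blue / green / pink / other
    "clothes_style": make_head(3),  # e.g. long coat / short top / vest
}

image = [0.1, 0.5, 0.9, 0.3]  # toy "image"
features = backbone(image)    # computed ONCE, shared by every head
attributes = {name: head(features) for name, head in heads.items()}
print(attributes)
```

In a real model the heads would be trained jointly, which is what lets multi-task learning push the backbone toward shared feature representations.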
- a multi-attribute classification model can be trained as follows:
- the image data set may be an image data set constructed when training the above pedestrian detection model.
- The image data set can draw on a large number of public data sets; by contrast, collecting data in the actual deployment scenario involves a heavy workload and yields limited data diversity.
- a large number of public data sets can be used when training a model, and it is not necessary to collect data in an actual deployment scenario, which simplifies the complicated image data set production process, and can use more data to train the model.
- The preset annotation conditions can be set according to actual needs, for example: the person in the image is not occluded, the area occupied by the person in the image is larger than a preset area, the number of body parts visible in the image exceeds a preset number, and so on.
- the above-mentioned preset area and preset number can be set according to actual needs, which are not specifically limited in this embodiment.
- The attributes annotated for a person in an image include, but are not limited to: the style, color, and texture of the clothes worn, whether the person wears a hat, whether the person wears epaulets, and the like. That is, the attributes of some of the people in the image data set can be annotated to construct a sample set of person attributes.
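Applying such preset labeling conditions to a public data set amounts to a filter over candidate samples. A minimal sketch follows; the field names and thresholds are illustrative assumptions, not values from the application.

```python
# Filter public-dataset samples by assumed "preset labeling conditions":
# the person is unoccluded, large enough in frame, and shows enough parts.
MIN_AREA_RATIO = 0.05    # person must cover at least 5% of the image (assumed)
MIN_VISIBLE_PARTS = 3    # e.g. head + torso + legs (assumed)

def meets_labeling_conditions(sample):
    return (not sample["occluded"]
            and sample["person_area"] / sample["image_area"] >= MIN_AREA_RATIO
            and sample["visible_parts"] >= MIN_VISIBLE_PARTS)

public_dataset = [
    {"occluded": False, "person_area": 120, "image_area": 1000, "visible_parts": 4},
    {"occluded": True,  "person_area": 300, "image_area": 1000, "visible_parts": 4},
    {"occluded": False, "person_area": 10,  "image_area": 1000, "visible_parts": 4},
]
# Only qualifying images go on to attribute annotation.
sample_set = [s for s in public_dataset if meets_labeling_conditions(s)]
print(len(sample_set))
```

Only samples passing the filter would then be annotated with clothing style, color, hat, epaulets, and so on.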
- the structure of the network includes a backbone network, and MobileNet can be selected for the backbone network.
- MobileNet is a lightweight network with high operating efficiency.
- After the multi-attribute classification model is obtained by training, the method may further include:
- Step (5): if the performance meets the application's requirements, the following steps may also be performed:
- Model quantization and compression: for example, TensorRT can be used to quantize and compress the trained multi-attribute classification model; model acceleration and quantization compression can effectively improve inference efficiency. TensorRT is a high-performance deep learning inference optimizer that provides low-latency, high-throughput deployment inference for deep learning applications, and can be used to accelerate inference in hyperscale data centers, on embedded platforms, or on autonomous driving platforms.
- Step 103 Determine the standard attributes of the identities that meet the entry conditions of the monitoring scenario.
- the identities of persons allowed to enter different monitoring scenarios may be different, and therefore, different monitoring scenarios may correspond to different standard attributes.
- the monitoring scene is the data room of hospital A
- the identities of the people who are allowed to enter the data room of hospital A are doctors, nurses, and hospital logistics personnel.
- the doctors and nurses are all wearing long white work clothes
- The logistics personnel wear blue short tops and blue pants.
- The standard attributes of identities meeting the entry conditions, as set for the data room of hospital A, include: white long work clothes (the standard attribute for doctors and nurses) and blue short tops with blue pants (the standard attribute for logistics personnel).
- The monitoring scene is the production workshop of factory A.
- the production workshop of the factory is a dangerous area, and non-factory staff are strictly prohibited from entering.
- The factory's production workshop has three job types: job type A wears a blue top and gray pants, job type B wears a red top and red pants, and job type C wears an orange vest and orange pants.
- The standard attributes of identities meeting the entry conditions, as set for the production workshop of factory A, include: blue top and gray pants (the standard attribute of job type A), red top and red pants (job type B), and orange vest and orange pants (job type C).
- the server may pre-store the standard attributes of the identities that meet the entry conditions of the monitoring scenario.
- the server may be the monitoring server of the data room of hospital A, and the monitoring server may pre-store the standard attributes of the identities that meet the entry conditions of the data room of hospital A.
- The server may be the monitoring server of the production workshop of factory A, and may pre-store the standard attributes of identities that meet the workshop's entry conditions.
- Step 104 Identify whether the identity of the target person meets the entry conditions according to various attributes and standard attributes of the target person.
- the server can match various attributes of the target person with standard attributes. If the matching is successful, it recognizes that the identity of the target person meets the entry conditions; otherwise, it recognizes that the identity of the target person does not meet the entry conditions.
- the matching method may be as follows: the server compares multiple attributes of the target person with standard attributes, and if the multiple attributes of the target person have the same attributes as the standard attributes, it can be considered that the identity of the target person meets the entry conditions.
- the standard attributes of the identities that meet the entry conditions of the monitoring scene include multiple standard attributes corresponding to multiple identities.
- One way to identify whether the target person's identity meets the entry conditions is: the server matches the target person's attributes against each standard attribute in turn, and if they match the standard attribute corresponding to any identity, the target person's identity is recognized as meeting the entry conditions. That is, the server matches the target person's attributes against each standard attribute in sequence, either until a match succeeds and the identity is determined to meet the entry conditions, or until all matches fail and the identity is determined not to meet them.
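The matching rule just described (entry is permitted if the target person shares at least one attribute with some identity's standard attributes) can be sketched as follows; the attribute strings are illustrative, taken from the hospital example.

```python
# Match a detected person's attributes against each identity's standard
# attributes in turn; the first identity sharing any attribute grants entry.
STANDARD_ATTRIBUTES = {
    "doctor/nurse": {"white long work clothes"},
    "logistics":    {"blue short top", "blue pants"},
}

def identity_meets_entry(person_attributes, standards=STANDARD_ATTRIBUTES):
    person = set(person_attributes)
    for identity, attrs in standards.items():
        if person & attrs:          # at least one attribute in common
            return True, identity   # matched -> entry condition met
    return False, None              # no identity matched -> trigger alarm

print(identity_meets_entry({"white long work clothes", "no hat"}))
print(identity_meets_entry({"black coat", "red hat"}))
```

A failed match is exactly the case where the alarm mechanism described below would be triggered.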
- The standard attributes set for the production workshop of factory A in the above example include the standard attributes of job types A, B, and C; that is, the standard attributes of identities meeting the entry conditions comprise three standard attributes corresponding to three identities.
- The server can first match the target person's attributes against the standard attributes of job type A, i.e. determine whether any of the target person's attributes is the same as a standard attribute of job type A; if so, the match with job type A succeeds. If not, the target person's attributes are matched against the standard attributes of job type B in the same way, and, failing that, against the standard attributes of job type C. If none of the three standard attributes matches, it is recognized that the target person's identity does not meet the entry conditions.
- The matching may also be ordered: determine the priority of each standard attribute, and match the target person's attributes against each standard attribute in priority order. The priorities can be preset according to actual needs and stored in the server. For example, the standard attributes of job types A, B, and C may be prioritized, in descending order, as: job type C, job type B, job type A. The server then first matches the target person's attributes against the standard attributes of job type C; if that match fails, against job type B; and if that still fails, against job type A. Setting priorities for the standard attributes lets the target person's attributes be matched against each standard attribute in a sensible order.
- The priority may be determined by the actual number of people holding each identity in the monitoring scenario: the larger an identity's headcount, the higher the priority of its standard attribute.
- the actual number of employees in job type A is 50
- the actual number of employees in job type B is 60
- the actual number of employees in job type C is 70. That is, in the production workshop of factory A there are, in principle, 50 workers of job type A, 60 of job type B, and 70 of job type C.
- The priorities of the three standard attributes are therefore, in descending order: job type C, job type B, job type A. Since job type C has the most workers in factory A's production workshop, a worker entering the workshop is most likely to belong to job type C. Matching the target person's attributes against the highest-priority standard attribute first therefore makes an early successful match more likely, avoiding comparisons with lower-priority standard attributes and speeding up identification.
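The headcount-based priority rule can be sketched as follows, using the factory example's numbers (job type C: 70, B: 60, A: 50); identities are tried in descending headcount order so the common case matches early.

```python
# Priority matching: sort identities by headcount (descending), then match
# the target person's attributes against each identity's standard attributes.
HEADCOUNT = {"job A": 50, "job B": 60, "job C": 70}
STANDARD = {
    "job A": {"blue top", "gray pants"},
    "job B": {"red top", "red pants"},
    "job C": {"orange vest", "orange pants"},
}

def match_by_priority(person_attributes):
    # Larger headcount -> higher priority -> tried first.
    order = sorted(STANDARD, key=lambda ident: HEADCOUNT[ident], reverse=True)
    person = set(person_attributes)
    for identity in order:
        if person & STANDARD[identity]:
            return identity
    return None  # no match: identity does not meet the entry conditions

print(match_by_priority({"orange vest", "no hat"}))
print(match_by_priority({"black coat"}))
```

A job-type-C worker matches on the very first comparison, while the baseline A-then-B-then-C order would need three.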
- an alarm mechanism can be triggered to remind relevant personnel that there may be illegal intrusions in the monitoring scene, so as to conduct timely verification.
- the alarm mechanism may be set according to actual needs, which is not specifically limited in this embodiment.
- Monitoring scenario 1: the data room of hospital A. Only doctors, nurses, and hospital logistics personnel are allowed to enter; no one else is admitted. The doctors and nurses wear white long work clothes, and the logistics staff all wear blue short work shirts and blue pants. Therefore, the standard attributes of identities meeting the entry conditions of hospital A's data room can be preset to include: white long work clothes (the standard attribute corresponding to the doctor and nurse identities) and blue short work shirts with blue pants (the standard attribute corresponding to logistics personnel). The standard attributes corresponding to the above three identities can be pre-stored in the monitoring server of hospital A's data room, and the monitoring process can be as follows:
- The monitoring server in the data room of hospital A uses the pedestrian detection model to detect the presence of a human target in the video image.
- The monitoring server in the data room of hospital A uses the multi-attribute classification model to classify the attributes of the person target detected in the previous step, obtaining the person's attributes.
- The person's attributes include whether a hat is worn, the color, texture, and style of the clothes, whether there are epaulets, and so on.
- Whitelist identity settings: add doctors, nurses, and hospital logistics staff to the whitelist.
- doctors and nurses are defined as white long work clothes
- hospital logistics staff are defined as blue short tops and blue pants.
- the standard attributes of the identities that meet the entry conditions of the data room of A hospital will be added to the whitelist.
- a blacklist prohibiting entry into the data room of hospital A may also be set according to actual needs, which is not specifically limited in this embodiment.
- Monitoring scenario 2: the data room of hospital B.
- The data room of hospital B also admits only doctors, nurses, and logistics personnel. Doctors wear only long white work clothes, but nurses wear short white or pink work clothes, and logistics personnel wear green short tops and green pants. Therefore, the standard attributes of identities meeting the entry conditions of hospital B's data room can be preset to include: white long work clothes (the standard attribute corresponding to doctors), white or pink short work clothes (nurses), and green short tops with green pants (logistics personnel).
- the standard attributes corresponding to the above three identities can be pre-stored in the monitoring server of the data room of hospital B, and the monitoring process can be as follows:
- S2: the monitoring server in the data room of hospital B uses the pedestrian detection model to detect the presence of human targets in the video images. Notably, once the pedestrian detection model has been trained for hospital A, it can be applied directly to hospital B without retraining.
- The monitoring server in the data room of hospital B uses the multi-attribute classification model to classify the attributes of the person target detected in the previous step, obtaining the person's attributes.
- The person's attributes include whether a hat is worn, the color, texture, and style of the clothes, whether there are epaulets, and so on.
- the multi-attribute classification model deployed in the reference room of hospital A is trained, the multi-attribute classification model can be directly applied to the reference room of hospital B without retraining the multi-attribute classification model.
- S4: Whitelist identity setting: doctors, nurses, and hospital logistics staff are added to the whitelist.
- Doctors are defined by white long work clothes.
- Nurses are defined by white or pink short work clothes.
- Hospital logistics staff are defined by green short tops and green pants.
- That is, the standard attributes of the identities that meet the entry conditions of the data room of hospital B are added to the whitelist.
- a blacklist for prohibiting entry into the data room of hospital B may also be set according to actual needs, which is not specifically limited in this embodiment.
- The beneficial effects of this embodiment are: strong generalization performance, good flexibility, and high efficiency; it enables effective identity verification, improves the emergency response capability for illegal intrusion events, and facilitates timely early warning and prevention. These benefits are mainly reflected in the following aspects:
- This embodiment defines the identities that meet the entry conditions of a monitoring scene through standard attributes, and different monitoring scenes can define different standard attributes, so a single multi-attribute classification model can be trained to meet the monitoring needs of different scenes. The multi-attribute classification model in this embodiment therefore does not need to be retrained when migrated to other monitoring scenes, has stronger generalization ability, can be flexibly applied to various monitoring scenes, and is conducive to large-scale deployment of the model.
- A large number of public image datasets can be used to train the network model.
- Collecting data in an actual deployment scenario involves a heavy workload, and the diversity of such data is limited.
- A large number of public image datasets can be used when training the multi-attribute classification model, without collecting data in the actual deployment scenario, which simplifies the complicated dataset acquisition process and allows more data to be used to train the multi-attribute classification model.
- The multi-attribute classification model used in this embodiment, that is, the multi-task classification network, adopts a shared backbone network, which allows the network to learn more shared feature representations and improves the generalization of the network. Compared with training one model per task as shown in FIG. 2, only one multi-attribute classification model is used in this embodiment, which effectively improves the running efficiency of the network.
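As a hedged illustration of the shared-backbone idea, the following numpy sketch stands in for the multi-task classification network of FIG. 2; the layer sizes, attribute names, and class counts are illustrative assumptions, not the patent's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared backbone: a single linear layer + ReLU standing in for a real
# CNN backbone (e.g. MobileNet). All sizes are illustrative assumptions.
W_backbone = rng.normal(size=(128, 64))

def backbone(x):
    # x: (batch, 128) flattened inputs -> (batch, 64) shared features
    return np.maximum(x @ W_backbone, 0.0)

# One lightweight classification head per attribute task; every head
# reuses the same backbone output instead of owning its own backbone.
heads = {
    "clothes_style": rng.normal(size=(64, 4)),  # hypothetical 4 styles
    "clothes_color": rng.normal(size=(64, 8)),  # hypothetical 8 colors
    "wears_hat": rng.normal(size=(64, 2)),      # yes / no
}

def multi_attribute_forward(x):
    shared = backbone(x)  # computed once, shared by all tasks
    return {task: shared @ w for task, w in heads.items()}

logits = multi_attribute_forward(rng.normal(size=(5, 128)))
```

The point of the sketch is that `backbone` runs once per image regardless of how many attribute heads are attached, which is the efficiency and shared-representation argument made above.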
- The second embodiment of the present application relates to an identity recognition method.
- This embodiment is a further improvement of the first embodiment.
- The main improvement lies in that an attention mechanism is introduced into the multi-attribute classification model, as shown in FIG. 3. After the shared backbone network extracts features to obtain an intermediate feature map, when classifying the attributes of a certain region of the target person, the mask image corresponding to that region can be predicted first; the mask images corresponding to the different regions are then applied to the intermediate feature map to obtain the target area feature maps corresponding to the different regions; finally, the multiple attributes of the target object are determined from these target area feature maps.
- This embodiment is effectively a further improvement on "determining multiple attributes of a target person according to a pre-trained multi-attribute classification model" in the first embodiment.
- the difference between the multi-attribute classification model in this embodiment and the multi-attribute classification model in the first embodiment is that the sample sets constructed during the training of the model are different.
- In the first embodiment, the multiple attributes of the persons in the images of the dataset that meet the preset annotation conditions are annotated to construct the sample set; in this embodiment, both the multiple attributes and the different regions of those persons are annotated to construct the sample set.
- That is to say, in the first embodiment, the multiple attributes of the person are annotated, while in this embodiment, in addition to the multiple attributes, the different regions of the person are also annotated.
- See FIG. 4 and FIG. 5 for the annotation of the different regions of the person.
- FIG. 4 is the original image without annotation.
- In FIG. 5, the top area, the pants area, and the hat area of the head are marked in different colors.
- In this embodiment, "determining multiple attributes of a target person according to a pre-trained multi-attribute classification model" may be implemented as shown in FIG. 6, including:
- Step 501 Input the video image into the backbone network in the multi-attribute classification model to obtain an intermediate feature map.
- The backbone network of the multi-attribute classification model in this embodiment may be a Residual Neural Network (ResNet), and may specifically be ResNet18.
- ResNet18 has fewer parameters and can achieve higher speed and accuracy.
- ResNet18 can extract the features of the video image and obtain the intermediate feature map corresponding to the video image.
- Step 502 Determine the mask images corresponding to different regions of the target person in the intermediate feature map.
- Specifically, after the intermediate feature map passes through several convolutional layers of the multi-attribute classification model, the mask images corresponding to different regions of the target person in the intermediate feature map can be obtained.
- The mask image can be understood as a binary image; for example, the mask image corresponding to the top area of the intermediate feature map can be seen in FIG. 7: the values in the top area are all 1, and the values in the remaining areas are all 0.
- Step 503 Apply the mask images corresponding to different regions to the intermediate feature map, and obtain the target region feature maps corresponding to different regions in the intermediate feature map respectively.
- Step 504 Determine various attributes of the target object according to the target area feature maps corresponding to different areas.
- the intermediate feature maps may be multiplied by mask images corresponding to different regions, respectively, to obtain target region feature maps corresponding to different regions in the intermediate feature map. According to the target area feature maps corresponding to different areas, various attributes of the target object are determined. By multiplying the intermediate feature maps with the mask images corresponding to different regions, information irrelevant to the current region of interest can be removed, so that the network's attention can be focused on the target region that needs to be focused.
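The mask-multiplication step described above can be sketched as follows; the feature-map size and the location of the region are arbitrary assumptions chosen only to make the element-wise effect visible.

```python
import numpy as np

# Intermediate feature map: (channels, height, width), arbitrary values.
feature_map = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)

# Binary mask for the top region (cf. FIG. 7): 1 inside the region,
# 0 everywhere else. The region location here is an assumption.
top_mask = np.zeros((4, 4))
top_mask[1:3, 1:3] = 1.0

# Element-wise multiplication zeroes out everything outside the region,
# so the layers that follow only "see" the top area.
top_region_features = feature_map * top_mask  # broadcasts over channels
```

Because multiplication by zero erases all activations outside the masked region, the subsequent classification head is forced to attend only to the target area.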
- When attending to the relevant attributes of the top area, information in the image that does not belong to the top area may affect the network's judgment. Therefore, the intermediate feature map can be multiplied by the mask image corresponding to the top area to remove the information unrelated to the top area. In this way, the attention of the network can be focused on the top area that needs attention, that is, the target area feature map corresponding to the top area is obtained. Then, according to the target area feature map corresponding to the top area, the relevant attributes of the top area of the target object are determined, for example the top color and/or top style of the target object.
- the information in the image that does not belong to the trousers area may affect the judgment of the network. Therefore, the intermediate feature map can be multiplied by the mask image corresponding to the trousers area to remove the information unrelated to the trousers area. In this way, the attention of the network can be focused on the pants area that needs to be focused, that is, the feature map of the target area corresponding to the pants area can be obtained. Then, according to the feature map of the target area corresponding to the pants area, the relevant attributes of the pants area of the target object are determined. For example, according to the target area feature map corresponding to the pants area, the pants color and/or pants style of the target object are determined.
- In a specific implementation, determining the multiple attributes of the target object according to the target area feature maps corresponding to different areas may include: determining the top color and/or top style of the target object according to the target area feature map corresponding to the top area; determining the pants color and/or pants style of the target object according to the target area feature map corresponding to the pants area; and determining whether the target object wears a hat and/or glasses according to the target area feature map corresponding to the head area.
- In this embodiment, by adding the attention mechanism, that is, when determining the attributes of a certain region of the target person, the mask image of that region is first determined and applied to the intermediate feature map to remove irrelevant background information before the attributes of the region are classified, the accuracy of the determined attributes of the target object can be effectively improved.
- the third embodiment of the present application relates to a method for training a multi-attribute classification model, as shown in FIG. 8 , including:
- Step 701 Acquire a public image dataset.
- Step 702 Annotate various attributes of persons in the images that satisfy the preset labeling conditions in the image dataset to construct a sample set.
- Step 703 Determine the structure of the network, and configure network hyperparameters of the network.
- Step 704 Train a network configured with network hyperparameters according to the sample set to obtain a multi-attribute classification model.
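Steps 701 to 704 can be sketched end to end on synthetic data; the "sample set", the single-linear-layer network, and the hyperparameter values below are all stand-in assumptions, since the patent leaves the concrete network and dataset open.

```python
import numpy as np

rng = np.random.default_rng(0)

# Steps 701-702 stand-in: a toy sample set of 200 feature vectors, each
# annotated with two binary attributes (e.g. wears_hat, has_epaulets).
X = rng.normal(size=(200, 16))
Y = (X @ rng.normal(size=(16, 2)) > 0).astype(float)  # synthetic labels

# Step 703: choose the network structure (one linear layer here) and
# configure hyperparameters -- values are illustrative, not prescribed.
learning_rate, epochs = 0.5, 300
W = np.zeros((16, 2))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(p, y):
    eps = 1e-9  # avoid log(0)
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# Step 704: train the configured network on the sample set.
losses = []
for _ in range(epochs):
    P = sigmoid(X @ W)
    losses.append(bce_loss(P, Y))
    W -= learning_rate * X.T @ (P - Y) / len(X)  # gradient descent step
```

A real implementation would swap the linear layer for a CNN backbone with per-attribute heads and train with a deep learning framework, but the dataset/configure/train sequence is the same.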
- A large number of public image datasets can be used to train the network model.
- Collecting data in an actual deployment scenario involves a heavy workload, and the diversity of such data is limited.
- A large number of public image datasets can be used when training the multi-attribute classification model, without collecting data in the actual deployment scenario, which simplifies the complicated dataset acquisition process and allows more data to be used to train the multi-attribute classification model.
- Moreover, the multi-attribute classification model used in this embodiment, that is, the multi-task classification network, adopts a shared backbone network, which allows the network to learn more shared feature representations and improves the generalization of the network.
- the fourth embodiment of the present application relates to a training device for a multi-attribute classification model, as shown in FIG. 9 , including:
- an acquisition module 801 configured to acquire a public image dataset;
- an annotation module 802 configured to annotate multiple attributes of the persons in the images of the dataset that meet preset annotation conditions to construct a sample set;
- a configuration module 803 configured to determine the structure of the network and configure the network hyperparameters of the network;
- a training module 804 configured to train the network configured with the network hyperparameters on the sample set to obtain the multi-attribute classification model.
- this embodiment is an apparatus embodiment corresponding to the third embodiment, and the related technical details and technical effects mentioned in the third embodiment are still valid in this embodiment, and are not repeated here in order to reduce repetition.
- the relevant technical details mentioned in this embodiment can also be applied in the third embodiment.
- The fifth embodiment of the present application relates to an electronic device. As shown in FIG. 10, it includes at least one processor 901; and a memory 902 communicatively connected to the at least one processor 901; wherein the memory 902 stores instructions executable by the at least one processor 901, and the instructions are executed by the at least one processor 901 so that the at least one processor 901 can execute the identity recognition method in the first or second embodiment.
- the memory 902 and the processor 901 are connected by a bus, and the bus may include any number of interconnected buses and bridges, and the bus connects one or more processors 901 and various circuits of the memory 902 together.
- the bus may also connect together various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore will not be described further herein.
- the bus interface provides the interface between the bus and the transceiver.
- a transceiver may be a single element or multiple elements, such as multiple receivers and transmitters, providing a means for communicating with various other devices over a transmission medium.
- the data processed by the processor 901 is transmitted on the wireless medium through the antenna, and further, the antenna also receives the data and transmits the data to the processor 901 .
- Processor 901 is responsible for managing the bus and general processing, and may also provide various functions including timing, peripheral interface, voltage regulation, power management, and other control functions.
- the memory 902 may be used to store data used by the processor 901 when performing operations.
- the sixth embodiment of the present application relates to a computer-readable storage medium storing a computer program.
- the above method embodiments are implemented when the computer program is executed by the processor.
- The storage medium includes several instructions to make a device (which may be a single-chip microcomputer, a chip, etc.) or a processor execute all or part of the steps of the methods described in the various embodiments of the present application.
- The aforementioned storage media include: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.
Abstract
An identity recognition method, a model training method, an apparatus, a device, and a storage medium. The identity recognition method includes: acquiring a video image within a monitoring scene (101); if a target person is detected in the video image, determining multiple attributes of the target person according to a pre-trained multi-attribute classification model (102), wherein the multi-attribute classification model is trained on a pre-constructed sample set comprising a number of images annotated with attributes; determining the standard attributes of the identities that meet the entry conditions of the monitoring scene (103); and identifying, according to the multiple attributes of the target person and the standard attributes, whether the identity of the target person meets the entry conditions (104).
Description
CROSS-REFERENCE TO RELATED APPLICATIONS

This application is based on, and claims priority to, Chinese patent application No. 202011448967.8 filed on December 9, 2020, the entire contents of which are incorporated herein by reference.

The embodiments of the present application relate to the technical field of security monitoring, and in particular to an identity recognition method, a model training method, an apparatus, a device, and a storage medium.

In recent years, technology in the field of security monitoring has developed rapidly, and person recognition is a typical application in this field. Some places only allow staff with specific identities and wearing specific clothing to enter, and do not allow unauthorized persons to enter. If a person who does not meet the dress requirements appears in such an area, an alarm must be raised. For example, in a military jurisdiction only soldiers wearing designated uniforms are allowed to appear; when the system detects a person whose clothing does not meet the requirements, a suspicious person has been detected, and the system needs to raise an alarm and ask the staff to verify the suspicious person's identity. Identity recognition systems based on traditional image processing methods have low accuracy, so existing identity recognition systems mainly use deep learning methods.

At present, most deep learning systems need to collect a large amount of data in each application scenario as a training set and train a model suitable for that specific scenario. However, such models have the following shortcomings: acquiring a sample set for a specific scenario is very difficult, the trained model is prone to overfitting, and the model generalizes poorly, making it hard to meet the monitoring needs of more monitoring scenarios.
SUMMARY

An embodiment of the present application provides an identity recognition method, including: acquiring a video image within a monitoring scene; if a target person is detected in the video image, determining multiple attributes of the target person according to a pre-trained multi-attribute classification model, wherein the multi-attribute classification model is trained on a pre-constructed sample set comprising a number of images annotated with attributes; determining the standard attributes of the identities that meet the entry conditions of the monitoring scene; and identifying, according to the multiple attributes of the target person and the standard attributes, whether the identity of the target person meets the entry conditions.

An embodiment of the present application further provides a method for training a multi-attribute classification model, including: acquiring a public image dataset; annotating multiple attributes of the persons in the images of the dataset that meet preset annotation conditions to construct a sample set; determining the structure of a network and configuring the network hyperparameters of the network; and training the network configured with the network hyperparameters on the sample set to obtain the multi-attribute classification model.

An embodiment of the present application further provides an apparatus for training a multi-attribute classification model, including: an acquisition module configured to acquire a public image dataset; an annotation module configured to annotate multiple attributes of the persons in the images of the dataset that meet preset annotation conditions to construct a sample set; a configuration module configured to determine the structure of a network and configure the network hyperparameters of the network; and a training module configured to train the network configured with the network hyperparameters on the sample set to obtain the multi-attribute classification model.

An embodiment of the present application further provides an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the above identity recognition method.

An embodiment of the present application further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above identity recognition method.
FIG. 1 is a flowchart of the identity recognition method of the first embodiment of the present application;

FIG. 2 is a schematic diagram of the multi-task classification model and the single-task classification models mentioned in the first embodiment of the present application;

FIG. 3 is a schematic diagram of introducing an attention mechanism into the multi-attribute classification model, mentioned in the second embodiment of the present application;

FIG. 4 is the unannotated original image mentioned in the second embodiment of the present application;

FIG. 5 is a schematic diagram of the different regions annotated with different colors, mentioned in the second embodiment of the present application;

FIG. 6 is a flowchart of an implementation of determining multiple attributes of a target person according to a pre-trained multi-attribute classification model, mentioned in the second embodiment of the present application;

FIG. 7 is a schematic diagram of the mask image corresponding to the top region, mentioned in the second embodiment of the present application;

FIG. 8 is a flowchart of the method for training a multi-attribute classification model of the third embodiment of the present application;

FIG. 9 is a schematic diagram of the apparatus for training a multi-attribute classification model of the fourth embodiment of the present application;

FIG. 10 is a schematic structural diagram of the electronic device of the fifth embodiment of the present application.
The main purpose of the embodiments of the present application is to propose an identity recognition method, a model training method, an apparatus, a device, and a storage medium, aiming to simplify the acquisition of the sample set, reduce the risk of model overfitting, and improve the generalization ability of the model so as to meet the monitoring needs of more monitoring scenarios.

To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the embodiments of the present application are described in detail below with reference to the accompanying drawings. However, those of ordinary skill in the art can understand that many technical details are presented in the embodiments to help the reader better understand the present application; even without these technical details and the various changes and modifications based on the following embodiments, the technical solutions claimed in the present application can still be implemented. The division into the following embodiments is for convenience of description and should not constitute any limitation on the specific implementation of the present application; the embodiments can be combined with and refer to each other without contradiction.
Most deep learning systems need to collect a large amount of data in each application scenario as a training set and train a model suitable for the specified scenario. However, the inventors of the present application found that such models have the following shortcomings:

(1) Acquiring a high-quality sample set for a specific scenario is very difficult; training a deep learning network model requires massive and diverse data. Some places are classified, and the amount of data that can be obtained from them is limited. Moreover, data collected in a specific scenario follows a relatively uniform pattern with limited diversity, which is not conducive to training a deep learning network model and easily leads to overfitting of the network model.

(2) A model built for a specified scenario achieves high accuracy in that scenario, but may fail completely when switched to another, similar scenario. For example, when a model applied to the data room of hospital A is migrated to the data room of hospital B, the styles and colors of hospital B's staff uniforms may differ from hospital A's. Since the model only attends to the features of hospital A's staff uniforms, it may fail completely when applied to the data room of hospital B. Applying the model to hospital B then requires collecting data in hospital B's data room and retraining the model. This limits the large-scale deployment of the model, and the generalization ability of the model is poor.

To solve the above technical problems, namely that sample sets for specific scenarios are very difficult to acquire, the trained model is prone to overfitting, and the model generalizes poorly, the embodiments of the present application provide the following identity recognition method, aiming to simplify the acquisition of the sample set, reduce the risk of model overfitting, and improve the generalization ability of the model.

The first embodiment of the present application relates to an identity recognition method applied to an electronic device, where the electronic device may be a server. The application scenarios of this embodiment may include, but are not limited to, scenarios with security monitoring needs such as hospital data rooms, police-station data rooms, bank data rooms, military jurisdictions, prisons, and factory production workshops. The implementation details of the identity recognition method of this embodiment are described below; they are provided only for ease of understanding and are not necessary for implementing this solution.
Refer to FIG. 1 for the flowchart of the identity recognition method of this embodiment, which includes:

Step 101: Acquire a video image within the monitoring scene.

In one example, the monitoring scene may be one of the above-mentioned hospital data room, police-station data room, bank data room, military jurisdiction, prison, factory production workshop, and so on. Several surveillance cameras may be deployed in the monitoring scene to capture video images of the scene and transmit them to the server, so that the server can acquire the video images of the monitoring scene.

In a specific implementation, the surveillance cameras may capture video images of the monitoring scene in real time, so that the server acquires them in real time, improving the reliability of monitoring.

Step 102: If a target person is detected in the video image, determine multiple attributes of the target person according to a pre-trained multi-attribute classification model.

In one example, the target person can be understood as any person appearing in the video image. That is, if any person is detected in the video image, it is determined that a target person has been detected in the video image.

In a specific implementation, this can also be understood as the server performing object detection on the video image; when the detected object is a person, it is determined that a target person appears in the video image.

In one example, whether a target person appears in the video image may be determined by using a pre-trained pedestrian detection model to detect whether an object in the video image is a person. The training of the pedestrian detection model is described below:

(1) Building the image dataset: acquire public image datasets. That is, a large number of public datasets can be used. Collecting data in an actual deployment scenario involves a heavy workload, and the diversity of such data is limited; using public image datasets instead of collecting data in the actual deployment scenario simplifies the complicated dataset construction process and allows more data to be used to train the model. However, in a specific implementation, images from various monitoring scenarios may also be collected to build the image dataset.

(2) Training the pedestrian detection model: select an object detection network structure, configure the network hyperparameters, and train the pedestrian detection model on the constructed image dataset. The object detection network structure may be a one-stage or a two-stage structure; one-stage structures include, but are not limited to, Single Shot Detector (SSD), You Only Look Once (YOLO), and Fully Convolutional One-Stage Object Detection (FCOS), and a two-stage structure may be Faster Region CNN (Faster RCNN).

Optionally, to improve the reliability of the trained pedestrian detection model, the following may also be performed after training:

(3) Performance evaluation of the pedestrian detection model: evaluate the performance of the trained pedestrian detection model. If the performance does not meet the application requirements, return to step (2) above, reselect the object detection network structure or reconfigure the network hyperparameters, and retrain the pedestrian detection model.

Optionally, to improve the running efficiency of the trained pedestrian detection model, if the performance in step (3) meets the application requirements, the following step may also be performed:

(4) Quantization and compression of the pedestrian detection model: the data processed by the pedestrian detection model is video data, and hardware computing power is limited; to guarantee the running efficiency of the model, the trained pedestrian detection model can be quantized and compressed. Model acceleration, quantization, and compression can effectively improve the running efficiency of the model.
In this embodiment, if a target person is detected in the video image, multiple attributes of the target person are determined according to a pre-trained multi-attribute classification model, where the multi-attribute classification model is trained on a pre-constructed sample set comprising a number of images annotated with attributes. The multi-attribute classification model can be understood as a multi-task classification model: each classification task corresponds to one attribute, and multiple classification tasks correspond to multiple attributes. Compared with single-task classification models, the multiple classification tasks share the same backbone network; multi-task learning encourages the model to learn shared feature representations and improves the generalization ability of the model. The multiple attributes may include, but are not limited to: whether a hat is worn, whether epaulets are worn, and the color, texture, and style of the clothes.

To understand the difference between the multi-task classification model and the single-task classification models in this embodiment, refer to FIG. 2. The single-task classification models are classification model 1, classification model 2, ..., classification model n in the figure: the classification task of model 1 is to classify the clothes-style attribute, the task of model 2 is to classify the clothes-color attribute, and the task of model n is to classify whether a hat is worn. The classification task of the multi-task classification model is to classify multiple attributes such as clothes style, clothes color, and whether a hat is worn. In other words, each single-task classification model needs its own backbone network, so completing multiple classification tasks needs multiple backbone networks, whereas in this embodiment the multi-task classification model needs only one backbone network, shared by all the classification tasks, which helps improve the running efficiency of the network.

In one example, the multi-attribute classification model may be trained as follows:

(1) Acquire a public image dataset; this may be the image dataset built when training the above pedestrian detection model. In a specific implementation, a large number of public datasets can be used; collecting data in an actual deployment scenario involves a heavy workload, and the diversity of such data is limited. This embodiment can use many public datasets when training the model, without collecting data in the actual deployment scenario, which simplifies the complicated dataset construction process and allows more data to be used to train the model.
(2) Annotate multiple attributes of the persons in the images of the dataset that meet preset annotation conditions to construct the sample set. The preset annotation conditions can be set according to actual needs, for example: the person in the image is not occluded, the area occupied by the person in the image exceeds a preset area, or the number of body parts the person shows in the image exceeds a preset number. The preset area and the preset number can both be set according to actual needs and are not specifically limited in this embodiment. In a specific implementation, the multiple attributes annotated for a person in an image include, but are not limited to: the style, color, and texture of the clothes the person wears, whether the person wears a hat, and whether epaulets are worn. That is, the attributes of some of the persons in the image dataset can be annotated to build the person-attribute sample set.
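The preset annotation conditions can be pictured as a simple filter over candidate images. A minimal sketch follows; the metadata field names and the threshold values are invented for illustration, since the embodiment leaves them open.

```python
# Hypothetical thresholds for the preset annotation conditions; the
# actual values would be chosen per deployment.
MIN_PERSON_AREA = 5000   # pixels occupied by the person
MIN_VISIBLE_PARTS = 3    # e.g. head, torso, legs

def meets_annotation_conditions(image_info):
    """Keep an image for annotation only if the person is unoccluded,
    occupies enough area, and shows enough body parts."""
    return (not image_info["occluded"]
            and image_info["person_area"] >= MIN_PERSON_AREA
            and image_info["visible_parts"] >= MIN_VISIBLE_PARTS)

candidates = [
    {"occluded": False, "person_area": 9000, "visible_parts": 4},
    {"occluded": True, "person_area": 9000, "visible_parts": 4},
    {"occluded": False, "person_area": 1200, "visible_parts": 4},
]
sample_set = [c for c in candidates if meets_annotation_conditions(c)]
```

Only the first candidate survives the filter; the occluded image and the too-small person are excluded from the sample set before attribute annotation.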
(3) Determine the structure of the network and configure the network hyperparameters. The network structure includes a backbone network; MobileNet may be chosen as the backbone, since MobileNet is a lightweight network with high running efficiency.

(4) Train the network configured with the network hyperparameters on the sample set to obtain the multi-attribute classification model.

Optionally, to improve the reliability of the trained multi-attribute classification model, the following may also be performed after training:

(5) Evaluate the performance of the trained multi-attribute classification model. If the model's performance does not meet the application requirements, redesign the backbone network of the multi-attribute classification model or reconfigure the network hyperparameters, and retrain the model.

Optionally, to improve the running efficiency of the trained multi-attribute classification model, if the performance in step (5) meets the application requirements, the following step may also be performed:

(6) Model quantization and compression, for example using TensorRT to quantize and compress the trained multi-attribute classification model. Model acceleration, quantization, and compression can effectively improve the running efficiency of the model. TensorRT is a high-performance deep learning inference optimizer that provides low-latency, high-throughput deployment inference for deep learning applications, and can be used to accelerate inference for hyperscale data centers, embedded platforms, or autonomous driving platforms.
Step 103: Determine the standard attributes of the identities that meet the entry conditions of the monitoring scene.

In one example, the identities of the persons allowed to enter may differ across monitoring scenes according to their monitoring needs; therefore, different monitoring scenes may correspond to different standard attributes.

In one example, the monitoring scene is the data room of hospital A, and the identities of the persons allowed to enter are doctors, nurses, and hospital logistics staff, where doctors and nurses wear white long work clothes and logistics staff wear blue short tops and blue pants. The standard attributes of the identities that meet the entry conditions of the data room of hospital A thus include: white long work clothes (the standard attribute of doctors and nurses), and blue short tops and blue pants (the standard attribute of logistics staff).

In another example, the monitoring scene is the production workshop of factory a, which is a hazardous area that non-staff are strictly prohibited from entering. The workshop has three kinds of workers: worker type A wearing blue tops and grey pants, worker type B wearing red tops and red pants, and worker type C wearing orange vests and orange pants. The standard attributes of the identities that meet the entry conditions of the workshop thus include: blue tops and grey pants (the standard attribute of worker type A), red tops and red pants (the standard attribute of worker type B), and orange vests and orange pants (the standard attribute of worker type C).

In a specific implementation, the standard attributes of the identities that meet the entry conditions of the monitoring scene may be pre-stored in the server. For example, if the monitoring scene is the data room of hospital A, the server may be the monitoring server of that data room, in which the corresponding standard attributes are pre-stored; similarly, if the monitoring scene is the production workshop of factory a, the server may be that workshop's monitoring server, in which the workshop's standard attributes are pre-stored.

Step 104: Identify, according to the multiple attributes of the target person and the standard attributes, whether the identity of the target person meets the entry conditions.

Specifically, the server may match the multiple attributes of the target person against the standard attributes; if the matching succeeds, the identity of the target person is identified as meeting the entry conditions, otherwise it is identified as not meeting them. The matching may work as follows: the server compares the multiple attributes of the target person with the standard attributes, and if among the target person's attributes there exist attributes identical to the standard attributes, the identity of the target person can be considered to meet the entry conditions.

In one example, the standard attributes of the identities that meet the entry conditions of the monitoring scene include multiple standard attributes corresponding to multiple identities. Identifying whether the target person's identity meets the entry conditions may then proceed as follows: the server matches the target person's multiple attributes against each kind of standard attribute; if the target person's attributes successfully match the standard attributes corresponding to any one identity, the identity of the target person is identified as meeting the entry conditions. In other words, the server matches the target person's attributes against each standard attribute in turn, until a match succeeds and the target person's identity is determined to meet the entry conditions, or until matching fails and the identity is determined not to meet them.

For example, the standard attributes set for the production workshop of factory a in the above example include the standard attributes of worker types A, B, and C, i.e. three standard attributes corresponding to three identities. The server may first match the target person's attributes against the standard attributes of worker type A, i.e. determine whether the target person's attributes include attributes identical to worker type A's standard attributes; if so, the match with worker type A succeeds. If not, the server matches the target person's attributes against worker type B's standard attributes in the same way, and if that also fails, against worker type C's. If the target person's attributes match none of the three standard attributes, the identity of the target person can be identified as not meeting the entry conditions.

In one example, matching the target person's attributes against each standard attribute may proceed as follows: determine the priorities of the multiple standard attributes, and match the target person's attributes against each standard attribute in turn according to those priorities. The priorities can be preset according to actual needs and stored in the server. For example, suppose the priorities of the standard attributes of worker types A, B, and C, from high to low, are: worker type C, worker type B, worker type A. When matching, the server first matches the target person's attributes against worker type C's standard attributes; if unsuccessful, against worker type B's; and if still unsuccessful, against worker type A's. Setting priorities for the multiple standard attributes helps match the target person's attributes against each standard attribute in a reasonable order.
In one example, the priorities may be determined based on the actual headcounts of the multiple identities in the monitoring scene, where identities with larger headcounts correspond to higher-priority standard attributes. For instance, suppose worker type A has 50 workers, worker type B has 60, and worker type C has 70; that is, in the production workshop of factory a there are, in principle, 50 workers of type A, 60 of type B, and 70 of type C. The priorities of the three standard attributes from high to low are then: worker type C, worker type B, worker type A. Since worker type C has the most workers in the workshop, a worker entering the workshop is most likely to belong to worker type C; matching the target person's attributes against the highest-priority standard attributes first therefore succeeds more easily, avoiding matching against the next-priority standard attributes and improving the speed of identity recognition.
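The headcount-based priority matching described above can be sketched in a few lines of Python. The identities, attribute sets, and headcounts below are the factory example's illustrative values, and the subset test is one hedged reading of "attributes identical to the standard attributes".

```python
# Standard attributes for the factory workshop example: each identity
# maps to (required attribute set, actual headcount). Illustrative only.
standard_attributes = {
    "worker_type_A": ({"blue top", "grey pants"}, 50),
    "worker_type_B": ({"red top", "red pants"}, 60),
    "worker_type_C": ({"orange vest", "orange pants"}, 70),
}

def identify(person_attributes):
    """Match the person's attributes against each identity's standard
    attributes, trying larger-headcount identities first; return the
    matched identity, or None if no identity matches."""
    by_priority = sorted(standard_attributes.items(),
                         key=lambda item: item[1][1], reverse=True)
    for identity, (required, _headcount) in by_priority:
        if required <= set(person_attributes):  # subset test
            return identity
    return None
```

In a deployment the headcounts would come from the scene's configuration on the monitoring server, and a `None` result would feed the alarm mechanism.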
In a specific implementation, if the target person's identity is identified as not meeting the entry conditions, an alarm mechanism can be triggered to remind relevant personnel that an illegal intrusion may be occurring in the monitoring scene, so that verification can be carried out in time. The alarm mechanism can be set according to actual needs and is not specifically limited in this embodiment.

To facilitate understanding of this embodiment, two specific monitoring scenes are described below:
Monitoring scene 1: the data room of hospital A, which only doctors, nurses, and hospital logistics staff are allowed to enter; no one else is allowed in. Doctors and nurses wear white long work clothes, and logistics staff wear blue short work tops and blue pants. The standard attributes of the identities that meet the entry conditions of hospital A's data room can therefore be preset to include: white long work clothes (the standard attribute of the doctor and nurse identities) and blue short work tops and blue pants (the standard attribute of logistics staff). The standard attributes corresponding to these three identities can be pre-stored in the monitoring server of hospital A's data room, and the monitoring process may be as follows:

S1: Deploy several surveillance cameras at key positions in the data room of hospital A, capture images of the monitored area in real time, and transmit the captured video images to the monitoring server of hospital A's data room.

S2: The monitoring server of hospital A's data room uses the pedestrian detection model to detect that a person target appears in the video image.

S3: The monitoring server of hospital A's data room uses the multi-attribute classification model to classify the relevant attributes of the person target detected in the previous step, obtaining the person's multiple attributes, which include whether a hat is worn, the color, texture, and style of the clothes, whether there are epaulets, and so on.

S4: Whitelist identity setting: doctors, nurses, and hospital logistics staff are added to the whitelist, where doctors and nurses are defined by white long work clothes, and hospital logistics staff by blue short tops and blue pants. That is, the standard attributes of the identities that meet the entry conditions of hospital A's data room are added to the whitelist. In a specific implementation, a blacklist prohibiting entry into hospital A's data room may also be set according to actual needs, which is not specifically limited in this embodiment.

S5: Person identity matching: when the system finds a target that does not match any identity in the whitelist, it records an illegal intrusion event and raises an alarm, notifying relevant staff to verify the intruder's identity. That is, according to the person's multiple attributes obtained in S3 and the standard attributes in the whitelist, the system identifies whether the person entering hospital A's data room is one of hospital A's doctors, nurses, or hospital logistics staff.
Monitoring scene 2: the data room of hospital B, which likewise only allows doctors, nurses, and logistics staff to enter. Doctors only wear white long work clothes, but nurses wear white or pink short work clothes, and logistics staff wear green short tops and green pants. The standard attributes of the identities that meet the entry conditions of hospital B's data room can therefore be preset to include: white long work clothes (the standard attribute of doctors), white or pink short work clothes (the standard attribute of nurses), and green short tops and green pants (the standard attribute of logistics staff). The standard attributes corresponding to these three identities can be pre-stored in the monitoring server of hospital B's data room, and the monitoring process may be as follows:

S1: Deploy several surveillance cameras at key positions in the data room of hospital B, capture images of the monitored area in real time, and transmit the captured video images to the monitoring server of hospital B's data room.

S2: The monitoring server of hospital B's data room uses the pedestrian detection model to detect that a person target appears in the video image. After the pedestrian detection model for hospital A has been trained, it can be applied directly to hospital B without retraining.

S3: The monitoring server of hospital B's data room uses the multi-attribute classification model to classify the relevant attributes of the person target detected in the previous step, obtaining the person's multiple attributes, including whether a hat is worn, the color, texture, and style of the clothes, whether there are epaulets, and so on. In a specific implementation, once the multi-attribute classification model deployed in hospital A's data room has been trained, it can be applied directly to hospital B's data room without retraining.

S4: Whitelist identity setting: doctors, nurses, and hospital logistics staff are added to the whitelist, where doctors are defined by white long work clothes, nurses by white or pink short work clothes, and hospital logistics staff by green short tops and green pants. That is, the standard attributes of the identities that meet the entry conditions of hospital B's data room are added to the whitelist. In a specific implementation, a blacklist prohibiting entry into hospital B's data room may also be set according to actual needs, which is not specifically limited in this embodiment.

S5: Person identity matching: when the system finds a target that does not match any identity in the whitelist, it records an illegal intrusion event and raises an alarm, notifying relevant staff to verify the intruder's identity. That is, according to the person's multiple attributes obtained in S3 and the standard attributes in the whitelist, the system identifies whether the person entering hospital B's data room is one of hospital B's doctors, nurses, or hospital logistics staff.
It should be noted that the above examples in this embodiment are illustrations provided for ease of understanding and do not limit the technical solution of the present application.

The beneficial effects of this embodiment are: strong generalization performance, good flexibility, and high efficiency; it enables effective identity verification, improves the emergency response capability for illegal intrusion events, and facilitates timely early warning and prevention. These benefits are mainly reflected in the following aspects:

1. Compared with a model built for a specified scenario, this embodiment defines the identities that meet the entry conditions of a monitoring scene through standard attributes, and different monitoring scenes can define different standard attributes, so that a single multi-attribute classification model can be trained to meet the monitoring needs of different scenes. The multi-attribute classification model in this embodiment therefore does not need to be retrained when migrated to other monitoring scenes, has stronger generalization ability, can be flexibly applied to various monitoring scenes, and is conducive to large-scale deployment of the model.

2. This embodiment can use a large number of public image datasets to train the network model. Collecting data in an actual deployment scenario involves a heavy workload, and the diversity of such data is limited. When training the multi-attribute classification model, this embodiment can use many public image datasets instead of collecting data in the actual deployment scenario, which simplifies the complicated dataset acquisition process and allows more data to be used to train the multi-attribute classification model.

3. The multi-attribute classification model used in this embodiment, that is, the multi-task classification network, adopts a shared backbone network, which lets the network learn more shared feature representations and improves the generalization of the network. Compared with training one model per task as shown in FIG. 2, this embodiment uses only one multi-attribute classification model, which effectively improves the running efficiency of the network.
The second embodiment of the present application relates to an identity recognition method. This embodiment is a further improvement of the first embodiment, the main improvement being that an attention mechanism is introduced into the multi-attribute classification model, as shown in FIG. 3. After the shared backbone network extracts features to obtain an intermediate feature map, when classifying the attributes of a certain region of the target person, the mask image corresponding to that region is first predicted; the mask images corresponding to the different regions are then applied to the intermediate feature map to obtain the target-region feature maps corresponding to the different regions; finally, the multiple attributes of the target object are determined from these target-region feature maps. For example, when predicting the color of a person's top, after the shared backbone has produced the intermediate feature map, the mask image of the top region is first predicted and applied to the intermediate feature map to remove the regions unrelated to the top, and the color of the top is then predicted. The main improvements of the present application are described below:

This embodiment is effectively a further improvement of "determining multiple attributes of the target person according to a pre-trained multi-attribute classification model" in the first embodiment. The difference between the multi-attribute classification model in this embodiment and that in the first embodiment lies in the sample sets constructed during training. In the first embodiment, the multiple attributes of the persons in the images of the dataset that meet the preset annotation conditions are annotated to construct the sample set; in this embodiment, both the multiple attributes and the different regions of those persons are annotated to construct the sample set. That is, the first embodiment annotates the persons' multiple attributes, while this embodiment additionally annotates the persons' different regions.

In one example, the annotation of a person's different regions can be seen in FIG. 4 and FIG. 5: FIG. 4 is the unannotated original image, and in FIG. 5 the top region, the pants region, and the hat region of the head are marked in different colors. In this embodiment, "determining multiple attributes of the target person according to a pre-trained multi-attribute classification model" may be implemented as shown in FIG. 6, including:
Step 501: Input the video image into the backbone network of the multi-attribute classification model to obtain an intermediate feature map.

The backbone network of the multi-attribute classification model in this embodiment may be a Residual Neural Network (ResNet), and specifically ResNet18, which has relatively few parameters and achieves high speed and accuracy. ResNet18 extracts the features of the video image to obtain the intermediate feature map corresponding to the video image.

Step 502: Determine the mask images corresponding to the different regions of the target person in the intermediate feature map.

Specifically, after the intermediate feature map passes through several convolutional layers of the multi-attribute classification model, the mask images corresponding to the different regions of the target person in the intermediate feature map can be obtained. A mask image can be understood as a binary image; for example, the mask image corresponding to the top region of the intermediate feature map can be seen in FIG. 7: the values inside the top region are all 1, and the values elsewhere are all 0.

Step 503: Apply the mask images corresponding to the different regions to the intermediate feature map to obtain the target-region feature maps corresponding to the different regions of the intermediate feature map.

Step 504: Determine multiple attributes of the target object from the target-region feature maps corresponding to the different regions.

In one example, the intermediate feature map may be multiplied by the mask image of each region to obtain the target-region feature map of that region, and the multiple attributes of the target object are then determined from the target-region feature maps corresponding to the different regions. Multiplying the intermediate feature map by the mask images of the different regions removes information irrelevant to the region of current interest, so that the network's attention can be focused on the target region that needs attention.

For example, when attending to the attributes of the top region, information in the image outside the top region may affect the network's judgment; multiplying the intermediate feature map by the mask image of the top region removes the information unrelated to the top region, focusing the network's attention on the top region and yielding the top region's target-region feature map. The relevant attributes of the target object's top region, such as the top color and/or top style, are then determined from this feature map.

Likewise, when attending to the attributes of the pants region, information outside that region may affect the network's judgment; multiplying the intermediate feature map by the pants region's mask image removes the unrelated information, focusing the network's attention on the pants region and yielding its target-region feature map, from which the pants color and/or pants style of the target object are determined.

In a specific implementation, determining the multiple attributes of the target object from the target-region feature maps of the different regions may include: determining the top color and/or top style of the target object from the top region's target-region feature map; determining the pants color and/or pants style from the pants region's target-region feature map; and determining whether the target object wears a hat and/or glasses from the head region's target-region feature map.

In this embodiment, by adding the attention mechanism, that is, when determining the attributes of a certain region of the target person, the region's mask image is first determined and applied to the intermediate feature map to remove irrelevant background information before the region's attributes are classified, the accuracy of the determined attributes of the target object can be effectively improved.

The division of the steps of the above methods is only for clarity of description; during implementation, steps may be merged into one step, or a step may be split into multiple steps, all of which fall within the scope of this patent as long as the same logical relationship is included. Adding insignificant modifications to an algorithm or a flow, or introducing insignificant designs, without changing the core design of the algorithm and flow, also falls within the scope of this patent.
The third embodiment of the present application relates to a method for training a multi-attribute classification model, as shown in FIG. 8, including:

Step 701: Acquire a public image dataset.

Step 702: Annotate multiple attributes of the persons in the images of the dataset that meet the preset annotation conditions to construct a sample set.

Step 703: Determine the structure of the network and configure the network hyperparameters of the network.

Step 704: Train the network configured with the network hyperparameters on the sample set to obtain the multi-attribute classification model.

It is not difficult to see that the implementation of the training method of this embodiment has already been introduced in the first and second embodiments. The related technical details mentioned in the first and second embodiments remain valid in this embodiment and are not repeated here to reduce repetition; correspondingly, the related technical details mentioned in this embodiment can also be applied in the first and second embodiments.

In this embodiment, a large number of public image datasets can be used to train the network model. Collecting data in an actual deployment scenario involves a heavy workload, and the diversity of such data is limited; using many public image datasets when training the multi-attribute classification model, instead of collecting data in the actual deployment scenario, simplifies the complicated dataset acquisition process and allows more data to be used to train the multi-attribute classification model. Moreover, the multi-attribute classification model used in this embodiment, that is, the multi-task classification network, adopts a shared backbone network, which lets the network learn more shared feature representations and improves the generalization of the network.
The fourth embodiment of the present application relates to an apparatus for training a multi-attribute classification model, as shown in FIG. 9, including:

an acquisition module 801 configured to acquire a public image dataset;

an annotation module 802 configured to annotate multiple attributes of the persons in the images of the dataset that meet the preset annotation conditions to construct a sample set;

a configuration module 803 configured to determine the structure of the network and configure the network hyperparameters of the network;

a training module 804 configured to train the network configured with the network hyperparameters on the sample set to obtain the multi-attribute classification model.

It is not difficult to see that this embodiment is the apparatus embodiment corresponding to the third embodiment; the related technical details and technical effects mentioned in the third embodiment remain valid in this embodiment and are not repeated here to reduce repetition. Correspondingly, the related technical details mentioned in this embodiment can also be applied in the third embodiment.
The fifth embodiment of the present application relates to an electronic device, as shown in FIG. 10, including at least one processor 901; and a memory 902 communicatively connected to the at least one processor 901; wherein the memory 902 stores instructions executable by the at least one processor 901, and the instructions are executed by the at least one processor 901 so that the at least one processor 901 can perform the identity recognition method of the first or second embodiment.

The memory 902 and the processor 901 are connected by a bus, which may include any number of interconnected buses and bridges connecting together one or more processors 901 and the various circuits of the memory 902. The bus may also connect together various other circuits such as peripherals, voltage regulators, and power management circuits, which are well known in the art and are therefore not described further herein. A bus interface provides the interface between the bus and the transceiver. The transceiver may be one element or multiple elements, such as multiple receivers and transmitters, providing a unit for communicating with various other apparatuses over a transmission medium. Data processed by the processor 901 is transmitted over a wireless medium through an antenna; further, the antenna also receives data and transmits the data to the processor 901.

The processor 901 is responsible for managing the bus and general processing, and may also provide various functions including timing, peripheral interfacing, voltage regulation, power management, and other control functions, while the memory 902 may be used to store data used by the processor 901 when performing operations.

The sixth embodiment of the present application relates to a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above method embodiments.

That is, those skilled in the art can understand that all or some of the steps of the methods of the above embodiments can be completed by instructing relevant hardware through a program stored in a storage medium, which includes several instructions to make a device (which may be a single-chip microcomputer, a chip, etc.) or a processor execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage media include: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Those of ordinary skill in the art can understand that the above embodiments are specific embodiments for implementing the present application, and in practical applications various changes in form and detail may be made to them without departing from the spirit and scope of the present application.
Claims (10)
- 1. An identity recognition method, comprising: acquiring a video image within a monitoring scene; if a target person is detected in the video image, determining multiple attributes of the target person according to a pre-trained multi-attribute classification model, wherein the multi-attribute classification model is trained on a pre-constructed sample set comprising a number of images annotated with attributes; determining the standard attributes of the identities that meet the entry conditions of the monitoring scene; and identifying, according to the multiple attributes of the target person and the standard attributes, whether the identity of the target person meets the entry conditions.
- 2. The identity recognition method according to claim 1, wherein the multi-attribute classification model is trained as follows: acquiring a public image dataset; annotating multiple attributes of the persons in the images of the dataset that meet preset annotation conditions to construct the sample set; determining the structure of a network and configuring the network hyperparameters of the network; and training the network configured with the network hyperparameters on the sample set to obtain the multi-attribute classification model.
- 3. The identity recognition method according to claim 2, wherein annotating multiple attributes of the persons in the images of the dataset that meet preset annotation conditions to construct the sample set comprises: annotating both the multiple attributes and the different regions of the persons in the images of the dataset that meet the preset annotation conditions to construct the sample set; and determining the multiple attributes of the target object according to the pre-trained multi-attribute classification model comprises: inputting the video image into the backbone network of the multi-attribute classification model to obtain an intermediate feature map; determining the mask images corresponding to the different regions of the target person in the intermediate feature map; applying the mask images corresponding to the different regions to the intermediate feature map to obtain the target-region feature maps corresponding to the different regions of the intermediate feature map; and determining the multiple attributes of the target object from the target-region feature maps corresponding to the different regions.
- 4. The identity recognition method according to claim 3, wherein applying the mask images corresponding to the different regions to the intermediate feature map to obtain the target-region feature maps corresponding to the different regions of the intermediate feature map comprises: multiplying the intermediate feature map by the mask images corresponding to the different regions, respectively, to obtain the target-region feature maps corresponding to the different regions of the intermediate feature map.
- 5. The identity recognition method according to any one of claims 1-4, wherein the standard attributes comprise multiple standard attributes corresponding to multiple identities, and identifying, according to the multiple attributes of the target person and the standard attributes, whether the identity of the target person meets the entry conditions comprises: matching the multiple attributes of the target person against each of the standard attributes; and if the multiple attributes of the target person successfully match the standard attributes corresponding to any one of the identities, identifying the identity of the target person as meeting the entry conditions.
- 6. The identity recognition method according to claim 5, wherein matching the multiple attributes of the target person against each of the standard attributes comprises: determining the priorities of the multiple standard attributes; and matching the multiple attributes of the target person against each of the standard attributes in turn according to the priorities of the multiple standard attributes.
- 7. A method for training a multi-attribute classification model, comprising: acquiring a public image dataset; annotating multiple attributes of the persons in the images of the dataset that meet preset annotation conditions to construct a sample set; determining the structure of a network and configuring the network hyperparameters of the network; and training the network configured with the network hyperparameters on the sample set to obtain the multi-attribute classification model.
- 8. An apparatus for training a multi-attribute classification model, comprising: an acquisition module configured to acquire a public image dataset; an annotation module configured to annotate multiple attributes of the persons in the images of the dataset that meet preset annotation conditions to construct a sample set; a configuration module configured to determine the structure of a network and configure the network hyperparameters of the network; and a training module configured to train the network configured with the network hyperparameters on the sample set to obtain the multi-attribute classification model.
- 9. An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the identity recognition method according to any one of claims 1 to 6.
- 10. A computer-readable storage medium storing a computer program that, when executed by a processor, implements the identity recognition method according to any one of claims 1 to 6.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011448967.8 | 2020-12-09 | ||
CN202011448967.8A CN114612813A (zh) | 2020-12-09 | 2020-12-09 | 身份识别方法、模型训练方法、装置、设备和存储介质 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022121498A1 true WO2022121498A1 (zh) | 2022-06-16 |
Family
ID=81856424
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/124112 WO2022121498A1 (zh) | 2020-12-09 | 2021-10-15 | 身份识别方法、模型训练方法、装置、设备和存储介质 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114612813A (zh) |
WO (1) | WO2022121498A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116994338A (zh) * | 2023-09-25 | 2023-11-03 | 四川中交信通网络科技有限公司 | 一种基于行为识别的站点无纸化稽查管理系统 |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114821486B (zh) * | 2022-06-29 | 2022-10-11 | 武汉纺织大学 | 一种电力作业场景下人员识别方法 |
CN115937784A (zh) * | 2022-12-27 | 2023-04-07 | 正大农业科学研究有限公司 | 养殖场监控方法、装置、电子设备及存储介质 |
- 2020-12-09: CN application CN202011448967.8A filed; published as CN114612813A (status: Pending)
- 2021-10-15: PCT application PCT/CN2021/124112 filed; published as WO2022121498A1 (Application Filing)
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8959082B2 (en) * | 2011-10-31 | 2015-02-17 | Elwha Llc | Context-sensitive query enrichment |
CN106845373A (zh) * | 2017-01-04 | 2017-06-13 | Tianjin University | Pedestrian attribute prediction method for surveillance video |
US10834363B1 (en) * | 2017-06-22 | 2020-11-10 | Insight, Inc. | Multi-channel sensing system with embedded processing |
AU2018379393A1 (en) * | 2017-12-06 | 2020-07-02 | Downer Edi Rail Pty Ltd | Monitoring systems, and computer implemented methods for processing data in monitoring systems, programmed to enable identification and tracking of human targets in crowded environments |
CN109446936A (zh) * | 2018-10-12 | 2019-03-08 | 银河水滴科技(北京)有限公司 | Identity recognition method and apparatus for surveillance scenarios |
CN110796079A (zh) * | 2019-10-29 | 2020-02-14 | 深圳龙岗智能视听研究院 | Multi-camera visitor recognition method and system based on deep face features and local deep human-body features |
CN111488804A (zh) * | 2020-03-19 | 2020-08-04 | Shanxi University | Method for detecting the wearing of labor protection equipment and recognizing identity based on deep learning |
Non-Patent Citations (1)
Title |
---|
WANG JUNXI: "RESEARCH ON PERSON RE-IDENTIFICATION BASED ON MULTI-TASK JOINT SUPERVISED LEARNING", CHINA MASTER'S THESES FULL-TEXT DATABASE, INFORMATION TECHNOLOGY, no. 1, 15 January 2019 (2019-01-15), XP055941013 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116994338A (zh) * | 2023-09-25 | 2023-11-03 | 四川中交信通网络科技有限公司 | Paperless station inspection management system based on behavior recognition |
CN116994338B (zh) * | 2023-09-25 | 2024-01-12 | 四川中交信通网络科技有限公司 | Paperless station inspection management system based on behavior recognition |
Also Published As
Publication number | Publication date |
---|---|
CN114612813A (zh) | 2022-06-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022121498A1 (zh) | Identity recognition method, model training method, apparatus, device and storage medium | |
CN107330396B (zh) | Pedestrian re-identification method based on multi-attribute and multi-strategy fusion learning | |
WO2018116488A1 (ja) | Analysis server, monitoring system, monitoring method, and program | |
WO2021063056A1 (zh) | Face attribute recognition method and apparatus, electronic device, and storage medium | |
CN110442742A (zh) | Image retrieval method and apparatus, processor, electronic device, and storage medium | |
CN110619277A (zh) | Multi-community intelligent surveillance deployment method and system | |
CN103886283A (zh) | Multi-biometric image information fusion method for mobile users and application thereof | |
CN107909683A (zh) | Method for implementing boarding, terminal device, and computer-readable storage medium | |
CN108108711B (zh) | Face surveillance deployment method, electronic device, and storage medium | |
CN105844245A (zh) | Disguised face detection method and system | |
CN107169458A (zh) | Data processing method, apparatus, and storage medium | |
CA3055600C (en) | Method and system for enhancing a vms by intelligently employing access control information therein | |
CN114997279A (zh) | Construction worker hazardous-area intrusion detection method based on an improved Yolov5 model | |
CN106886771A (zh) | Modular-PCA-based image principal information extraction method and face recognition method | |
CN109919968A (zh) | Target detection and component recognition system for monitoring unmanned aerial vehicles | |
CN111767880B (zh) | Living-body identity recognition method and apparatus based on facial features, and storage medium | |
Wang et al. | An Intelligent Vision‐Based Method of Worker Identification for Industrial Internet of Things (IoT) | |
CN116311082B (zh) | Wearing detection method and system based on key body parts and image matching | |
WO2023093241A1 (zh) | Pedestrian re-identification method and apparatus, and storage medium | |
Peng et al. | [Retracted] Helmet Wearing Recognition of Construction Workers Using Convolutional Neural Network | |
KR102617756B1 (ko) | Attribute-based missing person tracking apparatus and method | |
El Gemayel et al. | Automated face detection and control system using computer vision based video analytics to avoid the spreading of Covid-19 | |
Jamini et al. | Face mask and temperature detection for covid safety using IoT and deep learning | |
CN111832451A (zh) | Airworthiness supervision process monitoring system and method based on video data processing | |
Babu et al. | IoT based crowd estimation and stranger recognition in closed public areas |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21902206 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03.11.2023) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21902206 Country of ref document: EP Kind code of ref document: A1 |