WO2022213540A1 - Method and system for object detection, object attribute identification and object tracking - Google Patents

Method and system for object detection, object attribute identification and object tracking

Info

Publication number
WO2022213540A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
tracking
feature map
attribute
network
Prior art date
Application number
PCT/CN2021/117025
Other languages
English (en)
Chinese (zh)
Inventor
于鹏
高朋
刘辰飞
陈英鹏
许野平
刘明顺
Original Assignee
神思电子技术股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 神思电子技术股份有限公司
Publication of WO2022213540A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Definitions

  • The invention relates to the technical field of target detection, and in particular to a method and system for target detection, attribute identification and tracking.
  • Patent CN 110188596 A, "Real-time detection, attribute recognition and tracking method and system for pedestrians in surveillance video based on deep learning", relates to a method and system for real-time detection, attribute recognition and tracking of pedestrians in surveillance video based on deep learning. It mainly proposes efficient pedestrian detection, attribute recognition and tracking methods, and designs an efficient scheduling method that schedules modules in series and in parallel, so that real-time pedestrian detection on multi-channel video can be performed, as far as possible, on limited computing resources.
  • Its attribute identification stage first uses a deep learning model to extract features, and then trains 11 classifiers to classify the 11 attributes; superimposing this additional deep learning feature-extraction model increases the parameter count of the overall framework.
  • For tracking, the Kalman filter algorithm predicts each target's next position and then performs matching. Multiple trajectory histories must be stored, and the position loss caused by frame skipping in video transmission is not solved: a missing key frame makes the matching fail. A minimal sketch of this predict-then-match step follows.
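For illustration only (not part of the cited patent), a textbook constant-velocity Kalman predict step of the kind criticized above; the state layout and transition matrix are standard assumptions:

```python
import numpy as np

# Constant-velocity Kalman predict step: state is (x, y, vx, vy).
F = np.array([[1.0, 0.0, 1.0, 0.0],   # x' = x + vx
              [0.0, 1.0, 0.0, 1.0],   # y' = y + vy
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])

state = np.array([10.0, 20.0, 2.0, -1.0])  # current position and velocity
predicted = F @ state                      # expected position in the next frame
print(predicted[:2])                       # [12. 19.]
# If the next frame is skipped in transmission, this prediction goes
# unmatched and the trajectory can be lost -- the failure mode noted above.
```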
  • Patent CN 111274945 A, "A recognition method, device, electronic device and storage medium for pedestrian attributes", discloses a pedestrian attribute identification method, device, electronic device and storage medium in the technical field of machine vision.
  • Its specific implementation scheme is as follows: in the target monitoring picture, obtain at least one human-body recognition-area image; in each human-body recognition-area image, obtain the pedestrian attribute recognition result and the small-object detection result; then correct the recognition result, obtaining a corrected pedestrian attribute recognition result matching each human-body recognition-area image.
  • That patent first performs attribute recognition and then uses a target detection algorithm to correct the attribute analysis results, superimposing two deep learning models; this increases the parameter count and raises the hardware requirements.
  • Patent CN 112232173 A discloses a pedestrian attribute identification method, deep learning model, equipment and medium: obtain the target image for pedestrian attribute identification; extract attribute color features and attribute position features from the target image; fuse the attribute color features and attribute position features to obtain pixel-level features, and predict color information and position information based on the pixel-level features; perform splicing and segmentation to obtain the pedestrian feature map; fuse the pedestrian feature map to obtain the target feature; and determine the pedestrian attribute recognition result based on the target feature.
  • Because the pedestrians' location information, color information and pictures are re-spliced and segmented, the spatial correlation of the original image is destroyed, and the target tracking problem remains unsolved.
  • The existing methods are thus essentially spliced together from separate algorithm architectures.
  • Splicing multiple deep learning frameworks not only increases the difficulty of deep learning training, making convergence harder and unknown behavior more likely; in addition, the more architectures are stacked, the larger the deep learning parameter count, the slower the running speed, and the higher the hardware requirements.
  • The purpose of the present invention is to address the above deficiencies by providing a target detection, attribute recognition and tracking method that uses only one deep learning framework, thereby reducing the number of deep learning models used, lowering the training difficulty, converging more easily, improving the running speed and reducing the hardware requirements. A target detection, attribute recognition and tracking system based on this method is also provided.
  • The present invention provides a method for target detection, attribute identification and tracking, comprising the following steps.
  • The original image of the target is obtained in the present invention using a high-definition video camera or a digital camera.
  • In step S2 of the present invention, feature analysis is performed on the acquired original image through a trained feature extraction network; the target feature map includes target category information, target location information and target attribute information.
  • In step S3 of the present invention, target detection is performed on the target feature map through a target detection network; the target detection result includes target type information and target position information, and the target position information is used for attribute identification and target tracking.
  • The target detection network adopts a deep learning network.
  • In step S4 of the present invention, the target feature map and the target detection result are input into the attribute recognition network for target attribute recognition; a target attribute is a particular manifestation of a common attribute of the target.
  • In step S5, the target feature map, the target detection result and the target tracking information of the previous frame are input into the target tracking network for target tracking analysis. The target tracking information of the previous frame is that frame's target detection, attribute recognition and tracking result, including the target location information, feature map information and the ID given to each target. The target feature map of this frame is matched against all target feature maps stored in the target tracking information of the previous frame, and the target position information is matched against the position information of all targets in the previous frame; the matching value determines whether the current target is a target from the previous frame. When the current target and a target in the previous frame are judged to be the same target, the ID of that target is assigned to the corresponding target of this frame; when it is judged that there is no matching target, the target is assigned a new ID.
  • The feature extraction network of the present invention adopts a convolutional neural network. Its input is the original three-dimensional RGB image, and the original image is annotated; the labeling information includes a target category label, a target area label and a target attribute label. The network is trained jointly with the target detection network and the attribute recognition network to obtain the optimal training model, as in the sketch below.
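As an illustrative sketch only, one way such joint training of a single shared network could look in PyTorch; the layer sizes, class count (5), attribute count (11), input resolution and loss weighting are all assumptions, not values from the disclosure:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# One shared convolutional backbone with detection (class + box) and
# attribute heads trained jointly on the three label types.
backbone = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
cls_head = nn.Linear(64, 5)    # target category label
box_head = nn.Linear(64, 4)    # target area label (x, y, w, h)
attr_head = nn.Linear(64, 11)  # target attribute labels (binary)

params = [*backbone.parameters(), *cls_head.parameters(),
          *box_head.parameters(), *attr_head.parameters()]
opt = torch.optim.Adam(params, lr=1e-3)

# One joint training step on a dummy annotated batch of 8 RGB images.
img = torch.randn(8, 3, 128, 128)
cls_y = torch.randint(0, 5, (8,))
box_y = torch.rand(8, 4)
attr_y = torch.randint(0, 2, (8, 11)).float()

feat = backbone(img)
loss = (F.cross_entropy(cls_head(feat), cls_y)
        + F.smooth_l1_loss(box_head(feat), box_y)
        + F.binary_cross_entropy_with_logits(attr_head(feat), attr_y))
opt.zero_grad()
loss.backward()
opt.step()
```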
  • The present invention also provides a target detection, attribute recognition and tracking system, including an image acquisition device and an image processing component, wherein:
  • the image acquisition device is used to acquire the original image of the target;
  • the image processing component is used to process the collected original image of the target so as to track the target;
  • the image processing component includes a feature map extraction module, a target detection module, an attribute recognition module and a target tracking module, wherein:
  • the feature map extraction module is used to process the collected original image of the target into a target feature map;
  • the target detection module is used to perform target detection on the processed target feature map and obtain the target detection result;
  • the attribute recognition module is used to identify the attributes of the target from the target feature map and the target detection result, and obtain the target attribute identification result;
  • the target tracking module performs target tracking from the target detection result, the target tracking information of the previous frame and the feature map.
  • The feature map extraction module of the present invention includes a target feature extraction network;
  • the target detection module includes a target detection network;
  • the attribute recognition module includes an attribute recognition network;
  • the target tracking module includes a target tracking network;
  • the target feature extraction network, the attribute recognition network and the target tracking network all use convolutional neural networks, and the target tracking network uses a deep neural network;
  • the three-dimensional RGB image is used as the input, the image is annotated, and the networks are trained jointly with the feature extraction network to obtain the optimal model.
  • Each target corresponds to a unique ID.
  • In the target tracking module, the target tracking information of the previous frame is compared and judged: when a target is judged to be the same target, this frame's target inherits the ID of the corresponding target in the previous frame, and when it is judged that there is no corresponding target, a new ID is given to the target.
  • The present invention performs feature extraction on the acquired original image through a feature extraction network, and then performs subsequent target detection, attribute recognition and target tracking according to the extracted feature map, so that target attribute processing is more reasonable, the effect of target detection, attribute identification and tracking is more accurate, and the loss of targets is avoided;
  • the present invention uses only one deep learning framework, which reduces the training difficulty of the neural network, correspondingly avoids convergence difficulties, reduces the deep learning parameter count, improves operating efficiency, and reduces hardware requirements.
  • FIG. 1 is a schematic flow chart of the present invention.
  • This embodiment provides a method for target detection, attribute identification and tracking, as shown in FIG. 1, including the following steps:
  • The original image of the target is generally acquired by an image acquisition device.
  • The image acquisition device used is a high-definition surveillance camera or a digital camera.
  • High-resolution surveillance cameras and digital cameras can be used, such as 40-megapixel imaging, or higher resolutions including 60-megapixel, 80-megapixel and 100-megapixel imaging; in actual use, the cost of the image acquisition equipment should also be taken into account;
  • The target feature map is obtained by performing feature analysis on the acquired original image through a trained feature extraction network. The feature extraction network must be trained on samples: the sample input of the feature extraction network is three-dimensional RGB original images, and supervised learning is used.
  • The sample images are labeled, and the labeling information includes the target category label, the target position label and the attribute information of the target; the network is jointly trained with the target detection network and the attribute recognition network used for subsequent processing to obtain the optimal model;
  • the target feature map contains target category information, target location information and target attribute information. It is a three-dimensional feature map of shape (a, b, c), where a is the number of targets detected in the image and b*c is the target area feature map.
  • The area feature map containing a portrait is used for the subsequent pedestrian target detection, behavior attribute analysis and pedestrian target tracking, as in the toy example below;
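A toy illustration of the (a, b, c) feature-map convention just described; the sizes are arbitrary, not values from the disclosure:

```python
import numpy as np

# (a, b, c) target feature map: a targets, each with a b*c region map.
a, b, c = 3, 8, 8
target_feature_map = np.zeros((a, b, c), dtype=np.float32)

# The region map of one detected target (e.g. a pedestrian) is what the
# detection, attribute-analysis and tracking stages consume next.
pedestrian_region = target_feature_map[0]
print(pedestrian_region.shape)  # (8, 8) == (b, c)
```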
  • Target detection is performed on the target feature map through a target detection network; the target detection result includes target type information and target position information, and the target position information is used for attribute identification and target tracking;
  • the target detection network is a deep learning network comprising a classification part and a positioning part, which separate the target category information and target location information from the above target feature map; the target positioning information is used for the subsequent attribute identification and target tracking, as sketched below;
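A hedged sketch of a detection network with the two parts named above, a classification branch and a positioning branch reading the same per-target features; the feature width and class count are assumptions:

```python
import torch
import torch.nn as nn

class DetectionNetwork(nn.Module):
    """Classification part + positioning part over per-target features."""
    def __init__(self, feat_dim=64, num_classes=5):
        super().__init__()
        self.classify = nn.Linear(feat_dim, num_classes)  # target category
        self.localize = nn.Linear(feat_dim, 4)            # position (x, y, w, h)

    def forward(self, per_target_feats):  # shape (a, feat_dim)
        return self.classify(per_target_feats), self.localize(per_target_feats)

net = DetectionNetwork()
feats = torch.randn(3, 64)       # features for a = 3 detected targets
cls_logits, boxes = net(feats)   # boxes feed attribute recognition and tracking
```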
  • A target attribute is a particular manifestation of a common attribute of the target, such as a person's age, clothing style, or hair and clothing color; a minimal multi-label sketch follows;
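A minimal multi-label attribute head, assuming an illustrative attribute list and that the detected box coordinates are simply concatenated with the per-target features (the disclosure does not specify the fusion):

```python
import torch
import torch.nn as nn

# Illustrative attribute vocabulary, not the patent's.
ATTRIBUTES = ["child", "adult", "long_coat", "short_sleeve",
              "dark_hair", "light_clothing"]

attr_net = nn.Linear(64 + 4, len(ATTRIBUTES))  # features + box coordinates

feat = torch.randn(3, 64)                      # 3 detected targets
box = torch.rand(3, 4)                         # their positions from detection
logits = attr_net(torch.cat([feat, box], dim=1))
present = torch.sigmoid(logits) > 0.5          # attributes each target exhibits
```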
  • The target tracking information of the previous frame is the target detection, attribute identification and tracking result of the previous frame, including the target position information, feature map information and the ID given to each target. The target feature map of this frame is matched against all target feature maps stored in the target tracking information of the previous frame, and the target position information of this frame is matched against the position information of all targets in the previous frame; whether the current target is a target from the previous frame is judged by the matching value. When the current target and a target in the previous frame are judged to be the same target, the ID of the target in the previous frame is assigned to the corresponding target in this frame; when it is judged that there is no matching target, the target is assigned a new ID;
  • the processing information is saved as the judgment basis for the target tracking processing of the next frame, as in the matching sketch below;
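A minimal sketch of the frame-to-frame matching described above, combining feature-map similarity and position overlap; the 0.5 threshold and the equal weighting of the two matching terms are assumptions, not values from the disclosure:

```python
import numpy as np

def iou(b1, b2):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(b1[0], b2[0]), max(b1[1], b2[1])
    x2, y2 = min(b1[2], b2[2]), min(b1[3], b2[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    a1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    a2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    return inter / (a1 + a2 - inter + 1e-9)

def cosine(f1, f2):
    """Cosine similarity of two flattened feature maps."""
    return float(np.dot(f1, f2) /
                 (np.linalg.norm(f1) * np.linalg.norm(f2) + 1e-9))

def match_targets(current, previous, next_id, thresh=0.5):
    """Inherit the ID of the best-matching previous-frame target, or assign
    a new ID. Each target is a dict with 'feat' and 'box' ('id' too for the
    previous frame)."""
    for tgt in current:
        best, best_score = None, thresh
        for prev in previous:
            score = (0.5 * cosine(tgt["feat"], prev["feat"])
                     + 0.5 * iou(tgt["box"], prev["box"]))
            if score > best_score:
                best, best_score = prev, score
        if best is not None:
            tgt["id"] = best["id"]       # judged the same target
        else:
            tgt["id"] = next_id          # no match: new ID
            next_id += 1
    return current, next_id              # saved for the next frame's matching

prev = [{"feat": np.ones(4), "box": (0, 0, 10, 10), "id": 7}]
curr = [{"feat": np.ones(4), "box": (1, 1, 11, 11)}]
curr, next_id = match_targets(curr, prev, next_id=8)
print(curr[0]["id"])  # 7: inherited from the previous frame
```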
  • This embodiment also provides a target detection, attribute recognition and tracking system, including an image acquisition device and an image processing component, wherein:
  • the image acquisition device is used to acquire the original image of the target;
  • the image processing component is used to process the collected original image of the target so as to track the target;
  • the image processing component includes a feature map extraction module, a target detection module, an attribute recognition module and a target tracking module, wherein:
  • the feature map extraction module is used to process the collected original image of the target into a target feature map;
  • the target detection module is used to perform target detection on the processed target feature map and obtain the target detection result;
  • the attribute recognition module is used to identify the attributes of the target from the target feature map and the target detection result, and obtain the target attribute identification result;
  • the target tracking module performs target tracking from the target detection result, the target tracking information of the previous frame and the feature map.
  • The feature map extraction module includes a target feature extraction network;
  • the target detection module includes a target detection network;
  • the attribute recognition module includes an attribute recognition network;
  • the target tracking module includes a target tracking network;
  • the target feature extraction network, the attribute recognition network and the target tracking network all use convolutional neural networks, and the target tracking network uses a deep neural network;
  • the three-dimensional RGB image is used as the input, the image is annotated, and the networks are trained jointly with the feature extraction network to obtain the optimal model.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses a method and system for object detection, object attribute identification and object tracking. The method of the present invention comprises: S1, obtaining an original image of an object; S2, obtaining an object feature map by performing feature analysis on the obtained original image of the object; S3, obtaining an object detection result by performing object detection on the obtained object feature map; S4, obtaining an object attribute identification result by performing attribute identification on the obtained object feature map and object detection result; and S5, obtaining an object tracking result by performing object tracking analysis on the obtained object feature map, the object detection result, and the object tracking information of a previous frame. According to the present invention, feature extraction is first performed on an obtained original image by means of a feature extraction network, and subsequent object detection, attribute identification and object tracking are performed according to the extracted feature map, so that object attribute processing is more reasonable, the effects of object detection, attribute identification and object tracking are more accurate, and the occurrence of object loss is avoided.
PCT/CN2021/117025 2021-04-09 2021-09-07 Method and system for object detection, object attribute identification and object tracking WO2022213540A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110380619.X 2021-04-09
CN202110380619.XA CN113065568A (zh) 2021-04-09 2021-04-09 Target detection, attribute recognition and tracking method and system

Publications (1)

Publication Number Publication Date
WO2022213540A1 (fr)

Family

ID=76566244

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/117025 WO2022213540A1 (fr) 2021-04-09 2021-09-07 Method and system for object detection, object attribute identification and object tracking

Country Status (2)

Country Link
CN (1) CN113065568A (fr)
WO (1) WO2022213540A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113065568A (zh) * 2021-04-09 2021-07-02 神思电子技术股份有限公司 Target detection, attribute recognition and tracking method and system
CN113610362B (zh) * 2021-07-20 2023-08-08 苏州超集信息科技有限公司 Deep-learning-pipeline-based product tracing method and system
CN114898108B (zh) * 2022-03-30 2023-01-06 哈尔滨工业大学 FPGA-based CNN model lightweighting method, target detection method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109711320A (zh) * 2018-12-24 2019-05-03 兴唐通信科技有限公司 Method and system for detecting rule-violating behavior of on-duty personnel
CN110188596A (zh) * 2019-01-04 2019-08-30 北京大学 Deep-learning-based real-time pedestrian detection, attribute recognition and tracking method and system for surveillance video
CN111460926A (zh) * 2020-03-16 2020-07-28 华中科技大学 Video pedestrian detection method fusing multi-target tracking cues
US20200272902A1 * 2017-09-04 2020-08-27 Huawei Technologies Co., Ltd. Pedestrian attribute identification and positioning method and convolutional neural network system
CN113065568A (zh) * 2021-04-09 2021-07-02 神思电子技术股份有限公司 Target detection, attribute recognition and tracking method and system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066953B (zh) * 2017-03-22 2019-06-07 北京邮电大学 Surveillance-video-oriented vehicle model recognition, tracking and correction method and apparatus
CN108875456B (zh) * 2017-05-12 2022-02-18 北京旷视科技有限公司 Target detection method, target detection apparatus and computer-readable storage medium
KR101941994B1 (ko) * 2018-08-24 2019-01-24 전북대학교산학협력단 Pedestrian recognition and attribute extraction system based on a combined deep network
CN112507835B (zh) * 2020-12-01 2022-09-20 燕山大学 Method and system for analyzing the behavior of multiple target objects based on deep learning technology
CN112528977B (zh) * 2021-02-10 2021-07-02 北京优幕科技有限责任公司 Target detection method and apparatus, electronic device, and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200272902A1 (en) * 2017-09-04 2020-08-27 Huawei Technologies Co., Ltd. Pedestrian attribute identification and positioning method and convolutional neural network system
CN109711320A (zh) * 2018-12-24 2019-05-03 兴唐通信科技有限公司 一种值班人员违规行为检测方法及系统
CN110188596A (zh) * 2019-01-04 2019-08-30 北京大学 基于深度学习的监控视频行人实时检测、属性识别与跟踪方法及系统
CN111460926A (zh) * 2020-03-16 2020-07-28 华中科技大学 一种融合多目标跟踪线索的视频行人检测方法
CN113065568A (zh) * 2021-04-09 2021-07-02 神思电子技术股份有限公司 目标检测、属性识别与跟踪方法及系统

Also Published As

Publication number Publication date
CN113065568A (zh) 2021-07-02

Similar Documents

Publication Publication Date Title
WO2022213540A1 (fr) Method and system for object detection, object attribute identification and object tracking
CN108596277B (zh) Vehicle identity recognition method, apparatus and storage medium
US10089556B1 Self-attention deep neural network for action recognition in surveillance videos
Fernandes et al. Predicting heart rate variations of deepfake videos using neural ode
CN109644255B (zh) Method and apparatus for annotating a video stream comprising a set of frames
CN109145759B (zh) Vehicle attribute recognition method, apparatus, server and storage medium
Bonomi et al. Dynamic texture analysis for detecting fake faces in video sequences
US9560323B2 Method and system for metadata extraction from master-slave cameras tracking system
US8175333B2 Estimator identifier component for behavioral recognition system
Liu et al. Weakly-supervised salient object detection with saliency bounding boxes
CN111797653A (zh) Image annotation method and apparatus based on high-dimensional images
CN107025420A (zh) Method and apparatus for human behavior recognition in video
WO2012139228A1 (fr) Video-based detection of multiple object types under various poses
US20220301317A1 Method and device for constructing object motion trajectory, and computer storage medium
CN111209818A (zh) Video individual recognition method, system, device and readable storage medium
CN110096945B (zh) Machine-learning-based real-time key frame extraction method for indoor surveillance video
CN109068145A (zh) Distributed intelligent video analysis system, method, apparatus, device and storage medium
CN111815528A (zh) Severe weather image classification and enhancement method based on convolution model and feature fusion
CN109002776B (zh) Face recognition method, system, computer device and computer-readable storage medium
CN109977875A (zh) Gesture recognition method and device based on deep learning
Zhang et al. Facial action unit detection with local key facial sub-region based multi-label classification for micro-expression analysis
CN110969173B (zh) Target classification method and apparatus
CN112488165A (zh) Infrared pedestrian recognition method and system based on a deep learning model
WO2022228325A1 (fr) Behavior detection method, electronic device and computer-readable storage medium
CN117058568A (zh) Face image selection method and apparatus, and electronic device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21935759

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21935759

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14-03-2024)
