WO2022213540A1 - Method and system for object detection, object attribute identification and object tracking
- Publication number: WO2022213540A1
- Application number: PCT/CN2021/117025
- Authority: WO — WIPO (PCT)
- Prior art keywords: target, tracking, feature map, attribute, network
- Prior art date: 2021-04-09
Classifications
- G06V10/40 — Image or video recognition or understanding; extraction of image or video features
- G06F18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06N3/045 — Neural networks; combinations of networks
- G06N3/08 — Neural networks; learning methods
- G06V2201/07 — Indexing scheme relating to image or video recognition or understanding; target detection
Definitions
- the invention relates to the technical field of target detection, and in particular to a method and system for target detection, attribute identification and tracking.
- Patent CN 110188596 A, "Real-time detection, attribute recognition and tracking method and system for pedestrians in surveillance video based on deep learning", relates to a deep-learning-based method and system for real-time pedestrian detection, attribute recognition and tracking in surveillance video. It proposes efficient pedestrian detection, attribute recognition and tracking methods together with an efficient scheduling scheme that runs the modules in series and in parallel, so that multi-channel pedestrian video can be processed in real time, as far as possible, on limited computing resources.
- its attribute recognition first uses a deep learning model to extract features and then trains 11 classifiers, one for each of 11 attributes; stacking this extra deep-learning feature extractor on the framework increases the overall parameter count.
- the Kalman filter algorithm is used to predict each target's next position, which is then matched against the detections. Multiple trajectories must be stored, and the position loss caused by frame skipping during video transmission cannot be handled: when key frames are missing, the matching fails.
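The predict-then-match step that such a Kalman-filter tracker relies on can be sketched as a constant-velocity predict step (a minimal 1-D illustration; the state layout, process-noise level and time step are assumptions for illustration, not taken from the cited patent):

```python
import numpy as np

def kalman_predict(x, P, dt=1.0, q=1e-2):
    """Constant-velocity Kalman predict step for a 1-D track.

    x: state vector [position, velocity]; P: 2x2 state covariance.
    Returns the predicted state and covariance for the next frame.
    """
    F = np.array([[1.0, dt],   # position += velocity * dt
                  [0.0, 1.0]])
    Q = q * np.eye(2)          # process noise (assumed, for illustration)
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    return x_pred, P_pred

# A target at position 10.0 moving at 2.0 units/frame is
# predicted at 12.0 in the next frame; the detection nearest
# to 12.0 would then be matched to this track.
x0 = np.array([10.0, 2.0])
P0 = np.eye(2)
x1, P1 = kalman_predict(x0, P0)
```

If a key frame is dropped, no detection lies near the prediction and the match fails, which is the weakness the description above points out.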
- the patent CN 111274945 A, "A recognition method, device, electronic device and storage medium for pedestrian attributes", discloses a pedestrian attribute identification method, device, electronic device and storage medium in the technical field of machine vision.
- its specific implementation scheme is as follows: in the target monitoring picture, obtain at least one human-body recognition-area image; in each such image, obtain a pedestrian attribute recognition result and a small-object detection result; the attribute recognition result is then corrected, yielding a corrected pedestrian attribute recognition result matched to each human-body recognition-area image.
- this patent first performs attribute recognition and then uses a target detection algorithm to correct the attribute analysis results; stacking two deep learning models increases the parameter count and raises the hardware requirements.
- Patent CN 112232173 A discloses a pedestrian attribute identification method, deep learning model, equipment and medium: obtain the target image for pedestrian attribute identification; extract attribute color features and attribute position features from the target image; fuse the attribute color features and attribute position features to obtain pixel-level features, and predict color information and position information from those features; splice and segment to obtain the pedestrian feature map; fuse the pedestrian feature map to obtain the target feature; and determine the pedestrian attribute recognition result from the target feature.
- because the pedestrians' position information, color information and pictures are re-spliced and re-segmented, the spatial correlation of the original image is destroyed, and the target tracking problem is not solved.
- the existing methods are thus essentially spliced together from separate algorithm architectures.
- splicing multiple deep learning frameworks increases the difficulty of deep learning training and makes convergence harder; the more architectures are stacked, the larger the deep learning parameter count, the slower the running speed, and the higher the hardware requirements.
- the purpose of the present invention is to address the above deficiencies: to provide a target detection, attribute recognition and tracking method that uses only one deep learning framework, thereby reducing the training difficulty, easing convergence, improving the running speed and lowering the hardware requirements; a target detection, attribute recognition and tracking system based on the above method is also provided.
- the present invention provides a method for target detection, attribute identification and tracking, comprising the following steps:
- the original image of the target described in the present invention is acquired with a high-definition video camera or a digital camera.
- in step S2 of the present invention, feature analysis is performed on the acquired original image through a trained feature extraction network; the resulting target feature map includes target category information, target location information and target attribute information.
- in step S3 of the present invention, target detection is performed on the target feature map through a target detection network; the target detection result includes target type information and target position information, and the target position information is used for attribute identification and target tracking;
- the target detection network adopts a deep learning network.
- in step S4 of the present invention, the target feature map and the target detection result are input into the attribute recognition network for target attribute recognition; a target attribute is a particular manifestation of an attribute common to targets of that class.
- the target feature map, the target detection result and the target tracking information of the previous frame are input into the target tracking network for target tracking analysis;
- the target tracking information of the previous frame comprises that frame's target detection, attribute recognition and tracking results, including target location information, feature map information and the ID given to each target; the target feature map of the current frame is matched against all target feature maps stored in the previous frame's tracking information,
- and the target position information of the current frame is matched against the position information of all targets in the previous frame; the matching value determines whether the current target is a target from the previous frame:
- when it is judged to be the same target, the ID of the previous frame's target is assigned to the corresponding target of the current frame; when no matching target is found, the target is assigned a new ID.
- the feature extraction network of the present invention adopts the convolutional neural network algorithm.
- the input information is the original RGB three-dimensional image, and the original image is labeled;
- the labeling information includes the target category label, the target area label and the target attribute label; the feature extraction network is trained jointly with the target detection network and the attribute recognition network to obtain the optimal training model.
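A minimal sketch of what such joint training could look like: one labeled sample carrying all three label types, and a combined objective in which the detection and attribute heads, sharing a single feature extractor, are optimized together rather than as stacked separate models (the label fields, loss values and weights below are illustrative assumptions, not values from the patent):

```python
# Hypothetical annotation for one training image, carrying the three
# label types named above: category, area (box) and attributes.
sample = {
    "category": "pedestrian",
    "box": (120, 60, 80, 200),           # (x, y, w, h) target area label
    "attributes": {"hat": 1, "backpack": 0},
}

def joint_loss(det_loss, attr_loss, w_det=1.0, w_attr=0.5):
    """Single-framework joint objective: because the detection and
    attribute heads share one feature extractor, their losses are
    summed and back-propagated together in a single training step
    (the weights are illustrative assumptions)."""
    return w_det * det_loss + w_attr * attr_loss

# One combined training signal: 1.0 * 0.8 + 0.5 * 0.4 = 1.0
total = joint_loss(det_loss=0.8, attr_loss=0.4)
```

The key design point is that there is one loss and one backward pass, which is what lets a single framework replace the stacked models criticized in the background section.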
- the present invention also provides a target detection, attribute recognition and tracking system, including an image acquisition device and an image processing component, wherein:
- the image acquisition device is used to acquire the original image of the target
- the image processing component is used to process the collected target original image to track the target;
- the image processing component includes a feature map extraction module, a target detection module, an attribute recognition module and a target tracking module, wherein:
- the feature map extraction module is used to process the collected target original image into a target feature map
- the target detection module is used to perform target detection from the processed target feature map, and obtain the target detection result
- the attribute identification module is used to identify the attribute of the target through the target feature map and the target detection result, and obtain the target attribute identification result;
- the target tracking module performs target tracking through the target detection result, the target tracking information of the previous frame and the feature map.
- the feature map extraction module of the present invention includes a target feature extraction network
- the target detection module includes a target detection network
- the attribute identification module includes an attribute identification network
- the target tracking module includes target tracking network
- the target feature extraction network, the target detection network and the attribute recognition network all use convolutional neural networks, and the target tracking network uses a deep neural network;
- the RGB three-dimensional image is used as the input, the image is annotated, and these networks are jointly trained with the feature extraction network to obtain the optimal model.
- each target corresponds to a unique ID.
- in the target tracking module, the current frame is compared against the target tracking information of the previous frame: when a target is judged to be the same target, it inherits the ID of the corresponding target in the previous frame; when no corresponding target is found, a new ID is given to the target.
- the present invention performs feature extraction on the acquired original image through a feature extraction network, and then performs the subsequent target detection, attribute recognition and target tracking on the extracted feature map, so that target attribute processing is more reasonable and the results of detection, recognition and tracking are more accurate, avoiding target loss;
- the present invention uses only one deep learning framework, which reduces the training difficulty of the neural network, correspondingly avoids convergence difficulties, reduces the deep learning parameter count, improves operating efficiency, and reduces hardware requirements.
- FIG. 1 is a schematic flow chart of the present invention.
- This embodiment provides a method for target detection, attribute identification and tracking, as shown in FIG. 1 , including the following steps:
- the target original image is generally acquired by an image acquisition device.
- the used image acquisition device is a high-definition surveillance camera or a digital camera.
- high-resolution equipment can be used for the surveillance camera or digital camera, such as 40-megapixel imaging, or higher still: 60-megapixel, 80-megapixel and 100-megapixel imaging; in actual use, the cost of the image acquisition equipment should also be taken into account;
- the target feature map is obtained by performing feature analysis on the acquired original image through a trained feature extraction network; the feature extraction network is trained on samples in a supervised manner, its sample input information being RGB three-dimensional original images;
- each sample image is labeled, the labeling information including the target category label, the target position label and the attribute information of the target; the feature extraction network is jointly trained with the target detection network and attribute recognition network used for subsequent processing, to obtain the optimal model;
- the target feature map contains target category information, target location information and target attribute information
- the target feature map is a (a, b, c) three-dimensional feature map
- a is the number of targets detected in the image
- b×c is the target area feature map
- the area feature map containing a person is used for the subsequent pedestrian target detection, behavior attribute analysis and pedestrian target tracking;
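The (a, b, c) layout described above can be illustrated with a small array (the sizes are illustrative assumptions; the patent does not fix them):

```python
import numpy as np

# Shape (a, b, c): a detected targets, each with a b x c area feature map.
a, b, c = 3, 7, 7                        # illustrative sizes
feature_map = np.zeros((a, b, c), dtype=np.float32)

# The slice feature_map[i] is the b*c area feature map of target i,
# consumed downstream by detection, attribute analysis and tracking.
target_0 = feature_map[0]
```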
- target detection is performed on the target feature map through a target detection network; the target detection result includes target type information and target position information, and the target position information is used for attribute identification and target tracking;
- the target detection network is a deep learning network comprising a classification part and a positioning part, which separate the target category information and target location information from the above-mentioned target feature map; the target positioning information is used for the subsequent attribute identification and target tracking;
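A toy sketch of the classification part and the positioning part operating on one shared per-target feature (the feature size, category count and random weights are illustrative assumptions, not the patent's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
feat = rng.standard_normal(64)          # flattened per-target feature (assumed size)

# Two heads on the shared feature: classification and positioning.
W_cls = rng.standard_normal((3, 64))    # 3 example target categories
W_loc = rng.standard_normal((4, 64))    # (x, y, w, h) position regression

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

class_probs = softmax(W_cls @ feat)     # target category information
box = W_loc @ feat                      # target position information
category = int(np.argmax(class_probs))  # predicted category index
```

The box output is what the later attribute-identification and tracking stages consume as "target position information".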
- a target attribute is a particular manifestation of an attribute common to targets of that class, such as a person's age, clothing style, and hair and clothing color;
- the target tracking information of the previous frame comprises the previous frame's target detection, attribute identification and tracking results, including target position information, feature map information and the ID given to each target; the target feature map of the current frame is matched against all target feature maps stored in the previous frame's tracking information, and the current frame's target position information is matched against the position information of all targets in the previous frame; the matching value determines whether the current target is a target from the previous frame: when it is judged to be the same target as a target in the previous frame, the ID of that target is assigned to the corresponding target in the current frame; when no matching target is found, the target is assigned a new ID;
- the processing information is saved as the judgment basis for the target tracking processing of the next frame;
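The matching logic described above — compare feature maps and positions against the previous frame, inherit the ID on a match, otherwise issue a new one — can be sketched as follows (cosine similarity, IoU, the 0.5/0.5 weights and the 0.5 threshold are illustrative assumptions, not values from the patent):

```python
import numpy as np

def iou(b1, b2):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    x1, y1 = max(b1[0], b2[0]), max(b1[1], b2[1])
    x2 = min(b1[0] + b1[2], b2[0] + b2[2])
    y2 = min(b1[1] + b1[3], b2[1] + b2[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = b1[2] * b1[3] + b2[2] * b2[3] - inter
    return inter / union if union else 0.0

def match_ids(curr, prev, next_id, thresh=0.5):
    """Assign IDs to current-frame targets: inherit the ID of the
    best-matching previous-frame target, otherwise issue a new ID.
    The matching value combines feature similarity and box IoU."""
    out = {}
    for name, (feat, box) in curr.items():
        best_id, best_score = None, thresh
        for pid, (pfeat, pbox) in prev.items():
            sim = float(np.dot(feat, pfeat) /
                        (np.linalg.norm(feat) * np.linalg.norm(pfeat)))
            score = 0.5 * sim + 0.5 * iou(box, pbox)
            if score > best_score:
                best_id, best_score = pid, score
        if best_id is None:            # no previous target matched
            best_id, next_id = next_id, next_id + 1
        out[name] = best_id
    return out, next_id

# Previous frame holds one target with ID 1; the current frame has a
# nearby look-alike (inherits ID 1) and a distant newcomer (gets ID 2).
prev = {1: (np.array([1.0, 0.0]), (10, 10, 20, 40))}
curr = {"t0": (np.array([0.9, 0.1]), (12, 11, 20, 40)),
        "t1": (np.array([0.0, 1.0]), (200, 50, 20, 40))}
ids, next_id = match_ids(curr, prev, next_id=2)
```

The returned dictionary, together with the current boxes and feature maps, is exactly the "processing information" saved as the judgment basis for the next frame.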
- this embodiment also provides a target detection, attribute recognition and tracking system, including an image acquisition device and an image processing component, wherein:
- the image acquisition device is used to acquire the original image of the target
- the image processing component is used to process the collected target original image to track the target;
- the image processing component includes a feature map extraction module, a target detection module, an attribute recognition module and a target tracking module, wherein:
- the feature map extraction module is used to process the collected target original image into a target feature map
- the target detection module is used to perform target detection from the processed target feature map, and obtain the target detection result
- the attribute identification module is used to identify the attribute of the target through the target feature map and the target detection result, and obtain the target attribute identification result;
- the target tracking module performs target tracking through the target detection result, the target tracking information of the previous frame and the feature map.
- the feature map extraction module includes a target feature extraction network
- the target detection module includes a target detection network
- the attribute identification module includes an attribute identification network
- the target tracking module includes a target tracking network
- the target feature extraction network, the target detection network and the attribute recognition network all use convolutional neural networks, and the target tracking network uses a deep neural network;
- the RGB three-dimensional image is used as the input, the image is annotated, and these networks are jointly trained with the feature extraction network to obtain the optimal model.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110380619.X | 2021-04-09 | ||
CN202110380619.XA CN113065568A (zh) | 2021-04-09 | 2021-04-09 | Target detection, attribute recognition and tracking method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022213540A1 true WO2022213540A1 (fr) | 2022-10-13 |
Family
ID=76566244
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/117025 WO2022213540A1 (fr) | 2021-09-07 | Method and system for object detection, object attribute identification and object tracking |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113065568A (fr) |
WO (1) | WO2022213540A1 (fr) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113065568A (zh) * | 2021-04-09 | 2021-07-02 | 神思电子技术股份有限公司 | Target detection, attribute recognition and tracking method and system |
CN113610362B (zh) * | 2021-07-20 | 2023-08-08 | 苏州超集信息科技有限公司 | Deep-learning-based pipeline product tracing method and system |
CN114898108B (zh) * | 2022-03-30 | 2023-01-06 | 哈尔滨工业大学 | FPGA-based CNN model lightweighting method, target detection method and system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109711320A (zh) * | 2018-12-24 | 2019-05-03 | 兴唐通信科技有限公司 | Method and system for detecting violation behaviors of on-duty personnel |
CN110188596A (zh) * | 2019-01-04 | 2019-08-30 | 北京大学 | Deep-learning-based real-time pedestrian detection, attribute recognition and tracking method and system for surveillance video |
CN111460926A (zh) * | 2020-03-16 | 2020-07-28 | 华中科技大学 | Video pedestrian detection method fusing multi-target tracking cues |
US20200272902A1 (en) * | 2017-09-04 | 2020-08-27 | Huawei Technologies Co., Ltd. | Pedestrian attribute identification and positioning method and convolutional neural network system |
CN113065568A (zh) * | 2021-04-09 | 2021-07-02 | 神思电子技术股份有限公司 | Target detection, attribute recognition and tracking method and system |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107066953B (zh) * | 2017-03-22 | 2019-06-07 | 北京邮电大学 | Vehicle type recognition, tracking and correction method and device for surveillance video |
CN108875456B (zh) * | 2017-05-12 | 2022-02-18 | 北京旷视科技有限公司 | Target detection method, target detection device and computer-readable storage medium |
KR101941994B1 (ko) * | 2018-08-24 | 2019-01-24 | 전북대학교산학협력단 | Pedestrian recognition and attribute extraction system based on a combined deep network |
CN112507835B (zh) * | 2020-12-01 | 2022-09-20 | 燕山大学 | Method and system for analysing multi-target object behavior based on deep learning technology |
CN112528977B (zh) * | 2021-02-10 | 2021-07-02 | 北京优幕科技有限责任公司 | Target detection method and device, electronic equipment and storage medium |
- 2021-04-09: CN — priority application CN202110380619.XA filed (published as CN113065568A, status: active, pending)
- 2021-09-07: WO — PCT application PCT/CN2021/117025 filed (published as WO2022213540A1, active application filing)
Also Published As
Publication number | Publication date |
---|---|
CN113065568A (zh) | 2021-07-02 |
Legal Events
- 121 — Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21935759; Country: EP; Kind code: A1)
- NENP — Non-entry into the national phase (Ref country code: DE)
- 122 — Ep: PCT application non-entry in European phase (Ref document number: 21935759; Country: EP; Kind code: A1)
- 32PN — Ep: public notification in the EP bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14-03-2024))