CN110472552A - Method for counting physical objects in camera video based on image object detection - Google Patents

Method for counting physical objects in camera video based on image object detection

Publication number
CN110472552A
CN110472552A CN201910736425.1A CN201910736425A CN110472552A CN 110472552 A CN110472552 A CN 110472552A CN 201910736425 A CN201910736425 A CN 201910736425A CN 110472552 A CN110472552 A CN 110472552A
Authority
CN
China
Prior art keywords
object detection
model
image
counting
material object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910736425.1A
Other languages
Chinese (zh)
Inventor
于长斌
颜力琦
李相清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Yishun Technology Co Ltd
Original Assignee
Hangzhou Yishun Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Yishun Technology Co Ltd
Priority to CN201910736425.1A
Publication of CN110472552A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30242 Counting objects in image

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Quality & Reliability (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A method for counting physical objects in video captured by a camera, based on image object detection, comprising the following steps. Step 1) An image object detection model is trained on a common image object detection dataset. Step 2) A number of frames containing the objects to be counted are randomly extracted from part of the video captured by the camera, and the objects to be counted are annotated in these images; starting from the checkpoint of the pre-trained model, training continues on the newly annotated dataset so that the model can recognize and label these objects. Step 3) All frames of each video to be counted are extracted, the objects to be counted are recognized and labeled by the model trained in the preceding steps, and the objects are matched between frames; when the center of an object has moved a certain distance away from the reference line, the counter for that object's category is incremented by one. The advantages of the invention are that counting accuracy is improved and counting can be performed in real time.

Description

Method for counting physical objects in camera video based on image object detection
Technical field
The invention belongs to the fields of computer vision and artificial intelligence, and in particular relates to a method for counting physical objects in camera video based on image object detection.
Background
As artificial intelligence is increasingly applied in smart cities, smart factories, smart ranches, and smart farms, unattended, automated management of traffic, production, planting, and breeding has become an inevitable trend. Cameras are ubiquitous monitoring tools, and the video data they collect has great value still to be mined. Using camera video to complete these management tasks efficiently and intelligently can save a great deal of manpower and material resources. Within these tasks, recognizing and counting the physical objects in the video (such as people and vehicles) is especially important.
General methods for counting moving objects distinguish objects by changes in the moving parts of the video; they cannot identify the type of object, and their accuracy is low. Traditional methods that can detect objects, and general methods that count using machine learning, can only count one specific, designated kind of object, as in people counting or vehicle counting; they cannot count several kinds of objects at once, and they cannot count in real time.
Summary of the invention
In view of the deficiencies of the prior art, the object of the invention is to provide a method for counting physical objects in camera video based on image object detection. For the various objects that specifically need to be recognized, the method takes an image object detection model trained on an image object detection dataset and, through training in a fine-tuning stage, enables it to detect and label these objects; it then uses inter-frame object matching and tracking together with a reference line to record the objects that pass the reference line.
The technical solution of the invention is as follows:
A method for counting physical objects in camera video based on image object detection, characterized in that it comprises the following steps:
Step 1) Pre-training stage: train an image object detection model on a common image object detection dataset so that it can recognize and label the types of objects in an image, and save the model parameters to a checkpoint;
Step 2) Fine-tuning stage: if the objects to be counted are already included in the dataset and the recognition accuracy meets the requirements, proceed directly to step 3); otherwise, randomly extract a number of frames containing these objects from part of the video captured by the camera, annotate the objects to be counted in these images, and, starting from the checkpoint of the pre-trained model, continue training on the newly annotated dataset so that the model can recognize and label these objects;
Step 3) Video counting stage: extract all frames of each video to be counted, recognize and label the objects to be counted using the model trained in the preceding steps, and match and track the objects between frames; fix the position of a reference line, and record an object whenever its center moves from one side of the reference line to the other; when the center of that object has moved a certain distance away from the reference line, the counter for that object's category is incremented by one.
The method for counting physical objects in camera video based on image object detection, characterized in that the image object detection dataset in step 1) is the MS COCO dataset, the ImageNet dataset, or the CIFAR dataset.
The method for counting physical objects in camera video based on image object detection, characterized in that the image object detection model in step 1) is a Mask RCNN model, a Fast/Faster RCNN model, a VGG model, a ResNet model, or an Inception model.
The method for counting physical objects in camera video based on image object detection, characterized in that the image object detection model in step 1) accepts an image of any size as input and outputs the position, size, and name of every object detected in the image.
The method for counting physical objects in camera video based on image object detection, characterized in that the newly annotated dataset in step 2) is built by choosing 5 or more videos from all the videos shot by all the cameras to be monitored and choosing no more than 10 frames from each video, and the ratio of validation set to training set in the resulting dataset is 1:5 to 1:2.
The method for counting physical objects in camera video based on image object detection, characterized in that the annotation format of the newly annotated dataset in step 2) must meet the format requirements of the image object detection model: for object detection, an annotation consists of the coordinates of the top-left vertex of the bounding rectangle of the object to be detected, the size of the rectangle, and the object type; for image segmentation, an annotation consists of an ordered two-dimensional array of all vertex coordinates of the polygonal boundary of the object to be detected, together with the object type.
The method for counting physical objects in camera video based on image object detection, characterized in that, during training in the fine-tuning stage of step 2), the number of steps per epoch is set to 20 to 100, so that the model's loss function on the newly annotated dataset converges to a minimum and its accuracy converges to a maximum.
The method for counting physical objects in camera video based on image object detection, characterized in that the position of the reference line in step 3) lies within 10% to 90% of the width or height of the picture.
The method for counting physical objects in camera video based on image object detection, characterized in that the matching and tracking of objects between frames in step 3) first records the position of each detected object; on the next frame, all detected objects are traversed and, for each, the object in the previous frame whose center has the smallest weighted distance is taken as the corresponding object; to guard against objects that go unrecognized during tracking, a forgetting value in the range 1 to 10 is set, and if the number of frames in which an object is not detected exceeds the forgetting value, the object is deemed invalid.
The method for counting physical objects in camera video based on image object detection, characterized in that the distance by which the object's center leaves the reference line in step 3) is 10% to 50% of the object's size.
Compared with the prior art, the invention has the following beneficial effects:
1) Different objects shot under different conditions can be counted; it is only necessary to collect a small number of frames from part of the video of the cameras to be monitored, annotate the objects to be counted in them, and fine-tune the parameters of the previously trained model.
2) The names of the counted objects can be recognized, and multiple kinds of objects are counted simultaneously.
3) Counting accuracy is improved, and counting can be performed in real time.
Brief description of the drawings
Fig. 1 is a schematic diagram of the overall flow of the invention;
Fig. 2 is a flow diagram of the object tracking part of the invention;
Fig. 3 is a schematic diagram of counting against the reference line (the initial state of an object is 0; when the object's center moves from one side of the reference line to the other, the state changes from 0 to 1);
Fig. 4 is a schematic diagram of counting against the reference line (when 3/4 of the object has crossed the reference line, the state changes from 1 to 2, the counter for the object's category is incremented by 1, and the object is no longer considered).
Detailed description of the embodiments
Specific embodiments of the invention are further described below with reference to the drawings.
Referring to Fig. 1, the overall steps of the invention are as follows:
1) Pre-training stage: train an image object detection model on a common image object detection dataset so that it can recognize and label the types of objects in an image (the objects in the dataset, such as cats, dogs, people, vehicles, and aircraft), and save the model parameters to a checkpoint.
2) Fine-tuning stage: if the objects to be counted are already included in the dataset and the recognition accuracy meets the requirements, proceed directly to step 3; otherwise, randomly extract a number of frames containing these objects from part of the video captured by the camera, annotate the objects to be counted in these images, and, starting from the checkpoint of the pre-trained model, continue training on the newly annotated dataset so that the model can recognize and label these objects.
3) Video counting stage: extract all frames of each video to be counted, recognize and label the objects to be counted using the model trained in the preceding steps, and match the objects between frames. Fix the position of a reference line, and record an object whenever its center moves from one side of the reference line to the other; when the center of that object has moved a certain distance away from the reference line, the counter for that object's category is incremented by one.
Specifically, the image object detection datasets used by the invention include the MS COCO dataset, the ImageNet dataset, and the CIFAR dataset. The image recognition models used by the invention include the Mask RCNN model, the Fast/Faster RCNN model, the VGG model, the ResNet model, and the Inception model. An implementation only needs to select one of these datasets and one of these models. These image recognition models accept an image of any size as input and output the position, size, and name of every object detected in the image.
When randomly extracting frames containing these objects from part of the video captured by the camera, suitably representative frames should be chosen: 5 or more videos are chosen from all the videos shot by all the cameras to be monitored, no more than 10 frames are chosen from each video, and the ratio of validation set to training set in the resulting dataset is 1:5 to 1:2.
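The sampling rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `sample_frames`, the video identifiers, and the default ratio of 1:3 are hypothetical, and the sketch samples uniformly at random without checking that a frame actually contains the objects of interest or is representative, which would still need a manual pass.

```python
import random


def sample_frames(frame_counts, max_per_video=10, val_to_train=1 / 3, seed=0):
    """Pick frames for annotation and split them into training and validation sets.

    frame_counts maps a (hypothetical) video id to its number of frames.
    Returns (train, val), each a list of (video_id, frame_index) pairs.
    """
    if len(frame_counts) < 5:
        raise ValueError("the text calls for 5 or more videos")
    if not 1 / 5 <= val_to_train <= 1 / 2:
        raise ValueError("validation:training ratio should lie in 1:5 to 1:2")

    rng = random.Random(seed)
    samples = []
    for video_id in sorted(frame_counts):
        n_frames = frame_counts[video_id]
        k = min(max_per_video, n_frames)  # no more than 10 frames per video
        for index in sorted(rng.sample(range(n_frames), k)):
            samples.append((video_id, index))

    rng.shuffle(samples)
    # val / train = r  implies  val = total * r / (1 + r)
    n_val = round(len(samples) * val_to_train / (1 + val_to_train))
    return samples[n_val:], samples[:n_val]


train, val = sample_frames({f"cam{i}.mp4": 300 for i in range(6)})
print(len(train), len(val))
```

Fixing the seed keeps the split reproducible, which helps when annotations are added incrementally.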
When annotating a new dataset for a task, the custom dataset must meet the dataset format requirements for training the image recognition model. For object detection, an annotation consists of the coordinates of the top-left vertex of the bounding rectangle of the object to be detected, the size of the rectangle, and the object type. For image segmentation, an annotation is the polygonal boundary of the object to be detected; the polygon should cover the entire object as far as possible, and it is recorded as an ordered two-dimensional array of all the polygon's vertex coordinates, together with the object type.
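The two annotation formats described above might be represented as follows. This is only a sketch: the class and field names (`BoxAnnotation`, `PolygonAnnotation`, `label`) are hypothetical, and a real detection framework will define its own on-disk format that these records would have to be converted to.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BoxAnnotation:
    """Object-detection annotation: top-left corner, rectangle size, object type."""
    x: float
    y: float
    width: float
    height: float
    label: str


@dataclass
class PolygonAnnotation:
    """Segmentation annotation: ordered polygon vertices and object type."""
    vertices: List[Tuple[float, float]]  # ordered two-dimensional array of (x, y)
    label: str

    def bounding_box(self) -> BoxAnnotation:
        """Derive the enclosing rectangle from the polygon vertices."""
        xs = [x for x, _ in self.vertices]
        ys = [y for _, y in self.vertices]
        return BoxAnnotation(min(xs), min(ys),
                             max(xs) - min(xs), max(ys) - min(ys), self.label)


poly = PolygonAnnotation([(10, 20), (50, 20), (50, 80), (10, 80)], "sheep")
print(poly.bounding_box())
```

Deriving the rectangle from the polygon means a single segmentation annotation can serve both formats.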
During training in the fine-tuning stage, the number of steps per epoch should be set to 20 to 100, so that the model's loss function on the newly annotated dataset converges to a minimum and its accuracy converges to a maximum. Through training in the fine-tuning stage, the model's parameters are adjusted so that, in addition to detecting the categories of the original dataset, the model can also detect the object categories in the custom new dataset.
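A framework-agnostic sketch of such a fine-tuning loop is shown below. Everything here is an assumption made for illustration: `train_step` and `validate` are hypothetical callables standing in for one optimizer step and one validation pass of whichever detection framework is used, loading the pre-trained checkpoint is assumed to have happened before the loop starts, and stopping after `patience` epochs without improvement is one common way to decide that the loss has converged, not something the text prescribes.

```python
def fine_tune(train_step, validate, steps_per_epoch=50, max_epochs=30, patience=3):
    """Run epochs of training until the validation loss stops improving.

    train_step(): performs one optimization step on the newly annotated data.
    validate():   returns the current loss on the validation set.
    Returns the list of validation losses, one per completed epoch.
    """
    if not 20 <= steps_per_epoch <= 100:
        raise ValueError("steps per epoch should lie in 20 to 100")

    best, stale, history = float("inf"), 0, []
    for _ in range(max_epochs):
        for _ in range(steps_per_epoch):
            train_step()
        loss = validate()
        history.append(loss)
        if loss < best - 1e-6:      # meaningful improvement
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:   # treat the loss as converged
                break
    return history


# Toy stand-in for a model: minimize (w - 3)^2 by gradient descent.
state = {"w": 0.0}

def toy_step():
    state["w"] -= 0.1 * 2 * (state["w"] - 3)

def toy_loss():
    return (state["w"] - 3) ** 2

losses = fine_tune(toy_step, toy_loss)
print(len(losses), losses[-1])
```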
Referring to Fig. 2, matching and tracking objects between frames works as follows. The position of each detected object is recorded first; on the next frame, all detected objects are traversed and, for each, the object in the previous frame whose center has the smallest weighted distance is taken as the corresponding object. The weighted distance is computed between two centers, with the weight $\alpha$ applied to the offset along the x axis and the weight $\beta$ applied to the offset along the y axis.
Here $c_1 = (x_1, y_1)$ is the center of the object in the current frame, and $c_0 = (x_0, y_0)$ is the center of the object being compared, as last detected within the preceding $n$ frames. If mainly movement along the x axis is to be detected, one can set $\alpha = 0.8$ and $\beta = 0.2$. To guard against objects that go unrecognized during tracking, a forgetting value $n$ in the range 1 to 10 is set; if the number of frames in which an object is not detected exceeds the forgetting value, the object is deemed invalid.
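The matching step can be sketched as a small greedy tracker. Several choices here are assumptions rather than details taken from the text: the weighted distance is taken to be a weighted Euclidean distance, $\sqrt{\alpha (x_1 - x_0)^2 + \beta (y_1 - y_0)^2}$, since the exact formula is not reproduced above; detections are only matched to tracks of the same category; and `max_distance` is a hypothetical gating threshold so that a distant new object starts its own track instead of being attached to an unrelated one.

```python
import math
from itertools import count


def weighted_distance(c1, c0, alpha=0.8, beta=0.2):
    """Assumed form: weighted Euclidean distance between two (x, y) centers."""
    dx, dy = c1[0] - c0[0], c1[1] - c0[1]
    return math.sqrt(alpha * dx * dx + beta * dy * dy)


class Track:
    def __init__(self, track_id, center, label):
        self.id = track_id
        self.center = center
        self.label = label
        self.missed = 0  # consecutive frames without a matching detection


class Tracker:
    def __init__(self, forget=5, alpha=0.8, beta=0.2, max_distance=math.inf):
        if not 1 <= forget <= 10:
            raise ValueError("forgetting value should lie in 1 to 10")
        self.forget = forget
        self.alpha, self.beta = alpha, beta
        self.max_distance = max_distance  # gating threshold (assumption)
        self.tracks = {}
        self._ids = count()

    def update(self, detections):
        """detections: list of (center, label). Returns one Track per detection."""
        unmatched = dict(self.tracks)  # tracks not yet claimed in this frame
        assigned = []
        for center, label in detections:
            best, best_d = None, self.max_distance
            for track in unmatched.values():
                if track.label != label:
                    continue
                d = weighted_distance(center, track.center, self.alpha, self.beta)
                if d <= best_d:
                    best, best_d = track, d
            if best is None:
                best = Track(next(self._ids), center, label)
                self.tracks[best.id] = best
            else:
                del unmatched[best.id]
                best.center, best.missed = center, 0
            assigned.append(best)
        # Age the tracks that were not detected and forget the stale ones.
        for track in unmatched.values():
            track.missed += 1
            if track.missed > self.forget:
                del self.tracks[track.id]
        return assigned
```

Greedy nearest-center matching is simple and fast; when objects cross paths frequently, an optimal assignment (for example the Hungarian algorithm) may be a better fit.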
A reference line is then determined for counting the objects that pass it. Its position lies within 10% to 90% of the picture width or height, depending on the specific counting task; a part of the video where objects are recognized more reliably should be selected, and under normal circumstances the middle of the picture is chosen.
Referring to Figs. 3 and 4, the initial state of an object is 0. When the object's center moves from one side of the reference line to the other, the state changes from 0 to 1. When the object's center has moved away from the reference line by 10% to 50% of the object's size (25% is chosen under normal circumstances, that is, when 3/4 of the object has crossed the reference line), the state changes from 1 to 2, the counter for the object's category is incremented by 1, and the object is no longer considered.
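The three-state counting rule can be sketched as a small state machine keyed by a track identifier. The names `reference_line` and `LineCounter` are hypothetical, the object's size is taken to be its extent along the axis perpendicular to the line, and requiring the center to stay on the far side while it moves away is an assumption about how an object that turns back should be handled.

```python
from collections import Counter


def reference_line(frame_extent, fraction=0.5):
    """Place the line at a fraction (10% to 90%) of the frame width or height."""
    if not 0.10 <= fraction <= 0.90:
        raise ValueError("reference line should lie in 10% to 90% of the frame")
    return fraction * frame_extent


class LineCounter:
    def __init__(self, line_pos, axis=0, leave_fraction=0.25):
        if not 0.10 <= leave_fraction <= 0.50:
            raise ValueError("leave distance should be 10% to 50% of object size")
        self.line_pos = line_pos
        self.axis = axis                  # 0: vertical line (x), 1: horizontal (y)
        self.leave_fraction = leave_fraction
        self.state = {}                   # track id -> 0, 1 or 2
        self.start_side = {}              # track id -> side at first sighting
        self.counts = Counter()           # category -> number counted

    def update(self, track_id, label, center, size):
        """Feed one observation of a tracked object; size is its extent along axis."""
        state = self.state.setdefault(track_id, 0)
        if state == 2:                    # already counted, ignore from now on
            return
        offset = center[self.axis] - self.line_pos
        side = 1 if offset >= 0 else -1
        start = self.start_side.setdefault(track_id, side)
        if state == 0 and side != start:  # center crossed the line
            state = self.state[track_id] = 1
        if state == 1 and side != start and abs(offset) >= self.leave_fraction * size:
            self.state[track_id] = 2
            self.counts[label] += 1


counter = LineCounter(reference_line(640), axis=0, leave_fraction=0.25)
for x in (300, 322, 331, 350):            # a 40-pixel-wide car moving right
    counter.update(track_id=7, label="car", center=(x, 200), size=40)
print(dict(counter.counts))
```

Waiting until the center is a fraction of the object's size past the line, rather than counting at the crossing itself, avoids double counting when a detection jitters around the line.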
Finally, the counts of objects of all categories are obtained.
The foregoing is only a preferred embodiment of the invention and is not intended to limit its scope of protection; any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (10)

1. A method for counting physical objects in camera video based on image object detection, characterized in that it comprises the following steps:
Step 1) Pre-training stage: train an image object detection model on a common image object detection dataset so that it can recognize and label the types of objects in an image, and save the model parameters to a checkpoint;
Step 2) Fine-tuning stage: if the objects to be counted are already included in the dataset and the recognition accuracy meets the requirements, proceed directly to step 3); otherwise, randomly extract a number of frames containing these objects from part of the video captured by the camera, annotate the objects to be counted in these images, and, starting from the checkpoint of the pre-trained model, continue training on the newly annotated dataset so that the model can recognize and label these objects;
Step 3) Video counting stage: extract all frames of each video to be counted, recognize and label the objects to be counted using the model trained in the preceding steps, and match and track the objects between frames; fix the position of a reference line, and record an object whenever its center moves from one side of the reference line to the other; when the center of that object has moved a certain distance away from the reference line, the counter for that object's category is incremented by one.
2. The method for counting physical objects in camera video based on image object detection according to claim 1, characterized in that the image object detection dataset in step 1) is the MS COCO dataset, the ImageNet dataset, or the CIFAR dataset.
3. The method for counting physical objects in camera video based on image object detection according to claim 1, characterized in that the image object detection model in step 1) is a Mask RCNN model, a Fast/Faster RCNN model, a VGG model, a ResNet model, or an Inception model.
4. The method for counting physical objects in camera video based on image object detection according to claim 1, characterized in that the image object detection model in step 1) accepts an image of any size as input and outputs the position, size, and name of every object detected in the image.
5. The method for counting physical objects in camera video based on image object detection according to claim 1, characterized in that the newly annotated dataset in step 2) is built by choosing 5 or more videos from all the videos shot by all the cameras to be monitored and choosing no more than 10 frames from each video, and the ratio of validation set to training set in the resulting dataset is 1:5 to 1:2.
6. The method for counting physical objects in camera video based on image object detection according to claim 1, characterized in that the annotation format of the newly annotated dataset in step 2) must meet the format requirements of the image object detection model: for object detection, an annotation consists of the coordinates of the top-left vertex of the bounding rectangle of the object to be detected, the size of the rectangle, and the object type; for image segmentation, an annotation consists of an ordered two-dimensional array of all vertex coordinates of the polygonal boundary of the object to be detected, together with the object type.
7. The method for counting physical objects in camera video based on image object detection according to claim 1, characterized in that, during training in the fine-tuning stage of step 2), the number of steps per epoch is set to 20 to 100, so that the model's loss function on the newly annotated dataset converges to a minimum and its accuracy converges to a maximum.
8. The method for counting physical objects in camera video based on image object detection according to claim 1, characterized in that the position of the reference line in step 3) lies within 10% to 90% of the width or height of the picture.
9. The method for counting physical objects in camera video based on image object detection according to claim 1, characterized in that the matching and tracking of objects between frames in step 3) first records the position of each detected object; on the next frame, all detected objects are traversed and, for each, the object in the previous frame whose center has the smallest weighted distance is taken as the corresponding object; to guard against objects that go unrecognized during tracking, a forgetting value in the range 1 to 10 is set, and if the number of frames in which an object is not detected exceeds the forgetting value, the object is deemed invalid.
10. The method for counting physical objects in camera video based on image object detection according to claim 1, characterized in that the distance by which the object's center leaves the reference line in step 3) is 10% to 50% of the object's size.
CN201910736425.1A 2019-08-09 2019-08-09 Method for counting physical objects in camera video based on image object detection Pending CN110472552A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910736425.1A CN110472552A (en) 2019-08-09 2019-08-09 Method for counting physical objects in camera video based on image object detection

Publications (1)

Publication Number Publication Date
CN110472552A true CN110472552A (en) 2019-11-19

Family

ID=68510075

Country Status (1)

Country Link
CN (1) CN110472552A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112153461A (en) * 2020-09-25 2020-12-29 北京百度网讯科技有限公司 Method and device for positioning sound production object, electronic equipment and readable storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160321507A1 (en) * 2014-02-24 2016-11-03 Sk Telecom Co., Ltd. Person counting method and device for same
CN106446782A (en) * 2016-08-29 2017-02-22 北京小米移动软件有限公司 Image identification method and device
CN106384345A (en) * 2016-08-31 2017-02-08 上海交通大学 RCNN based image detecting and flow calculating method
CN107818343A (en) * 2017-10-30 2018-03-20 中国科学院计算技术研究所 Method of counting and device
CN109101929A (en) * 2018-08-16 2018-12-28 新智数字科技有限公司 A kind of pedestrian counting method and device
WO2020101472A1 (en) * 2018-11-14 2020-05-22 Mimos Berhad Method and system for counting and determining direction of movement of moving objects

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KRZYSZTOF GRAJEK: "Counting Objects with Faster RCNN", 《HTTPS://SOFTWAREMILL.COM/COUNTING-OBJECTS-WITH-FASTER-RCNN》 *
段萌等: "基于卷积神经网络的小样本图像识别方法", 《计算机工程与设计》 *
蒋维娜: "基于多特征的行人计数算法研究", 《中国优秀博硕士学位论文全文数据库(硕士) 信息科技辑》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination