CN110472552A

CN110472552A - Camera-based method for counting physical objects in video using image object detection technology

Info

Publication number: CN110472552A
Application number: CN201910736425.1A
Authority: CN
Inventors: 于长斌; 颜力琦; 李相清
Original assignee: Hangzhou Yishun Technology Co Ltd
Current assignee: Hangzhou Yishun Technology Co Ltd
Priority date: 2019-08-09
Filing date: 2019-08-09
Publication date: 2019-11-19

Abstract

A camera-based method for counting physical objects in video using image object detection technology, comprising the following steps: step 1) training an image object detection model on a common image object detection dataset; step 2) randomly extracting a number of frames containing the objects of interest from a portion of the video captured by the camera, labeling the objects to be counted in these images, and, starting from the checkpoint of the pre-trained model, continuing to train the model on the newly labeled dataset so that it can recognize and label these objects; step 3) extracting all frames of each video to be counted, recognizing and labeling the objects to be counted with the model trained in the preceding steps, and matching the objects between frames; when the center of an object has moved a certain distance away from the reference line, the counter of the object's category is incremented by one. The advantages of the invention are that counting accuracy is improved and counting can be performed in real time.

Description

Camera-based method for counting physical objects in video using image object detection technology

Technical field

The invention belongs to the fields of computer vision and artificial intelligence, and in particular relates to a camera-based method for counting physical objects in video using image object detection technology.

Background art

As artificial intelligence becomes widespread in smart cities, smart factories, smart pastures, and smart farms, unattended, automated management of traffic, production, planting, and breeding has become an inevitable trend of the times. As ubiquitous monitoring tools, cameras collect video data with enormous potential value. Using camera footage to complete these management tasks efficiently and intelligently can save a great deal of manpower and material resources. In particular, recognizing and counting the physical objects (such as people and vehicles) in the video is especially important.

General moving-object counting methods distinguish objects by the changes in the moving parts of the video; they cannot identify the type of object, and their accuracy is low. Traditional methods capable of detecting objects, and general methods that count using machine learning, can only count one specified kind of object, for example people counting or vehicle counting; they cannot count multiple kinds of objects simultaneously, and they cannot count in real time.

Summary of the invention

In view of the deficiencies of the prior art, the object of the invention is to provide a camera-based method for counting physical objects in video using image object detection technology. For the specific kinds of objects that need to be recognized, the method takes an image object detection model trained on an image object detection dataset and, through training in a fine-tuning stage, enables it to detect and label these objects; it then uses inter-frame object matching and tracking together with a reference line to record the objects that pass the reference line.

The technical solution of the invention is as follows:

A camera-based method for counting physical objects in video using image object detection technology, characterized by comprising the following steps:

Step 1) pre-training stage: an image object detection model is trained on a common image object detection dataset so that it can recognize and label the types of objects in an image, and the model parameters are saved to a checkpoint;

Step 2) fine-tuning stage: if the objects to be counted are already included in the dataset and the recognition accuracy meets the requirements, proceed directly to step 3); otherwise, randomly extract a number of frames containing these objects from a portion of the video captured by the camera, label the objects to be counted in these images, and, starting from the checkpoint of the pre-trained model, continue training the model on the newly labeled dataset so that it can recognize and label these objects;

Step 3) video counting stage: all frames of each video to be counted are extracted, the objects to be counted are recognized and labeled by the model trained in the preceding steps, and the objects are matched and tracked between frames; the position of a reference line is fixed, and whenever the center of an object moves from one side of the reference line to the other, the object is recorded; when the center of the object has then moved a certain distance away from the reference line, the counter of the object's category is incremented by one.

The method is further characterized in that the image object detection dataset in step 1) is the MS COCO dataset, the ImageNet dataset, or the CIFAR dataset.

The method is further characterized in that the image object detection model in step 1) is a Mask RCNN model, a Fast/Faster RCNN model, a VGG model, a ResNet model, or an Inception model.

The method is further characterized in that the image object detection model in step 1) can accept an input image of arbitrary size and outputs the position in the image, size, and name of every object detected in it.

The method is further characterized in that the newly labeled dataset in step 2) is built by selecting 5 or more videos from all videos shot by all cameras to be monitored, with no more than 10 frames chosen from each video; in the resulting dataset, the ratio of validation set to training set is 1:5 to 1:2.

The method is further characterized in that the annotation format of the newly labeled dataset in step 2) must meet the format requirements of the image object detection model. For object detection, an annotation consists of the coordinates of the top-left vertex of the bounding rectangle of the object to be detected, the size of the rectangle, and the object's category; for image segmentation, an annotation consists of an ordered two-dimensional array of all vertex coordinates of the polygonal boundary of the object to be detected, together with the object's category.

The method is further characterized in that, during training in the fine-tuning stage of step 2), the number of steps per epoch is set to 20 to 100, so that the model's loss function on the newly labeled dataset converges to a minimum and its accuracy converges to a maximum.

The method is further characterized in that the position of the reference line in step 3) lies within 10% to 90% of the frame width or height.

The method is further characterized in that the matching and tracking of objects between frames in step 3) proceeds as follows: first, the position of each detected object is recorded; in the next frame, all detected objects are traversed, and for each one the object from the previous frame whose center has the smallest weighted distance to it is taken as its counterpart; to prevent objects from being lost when they go unrecognized during tracking, a forgetting value in the range 1 to 10 is set, and if the number of frames in which an object is not detected exceeds the forgetting value, the object is determined to be lost.

The method is further characterized in that the distance by which the object's center leaves the reference line in step 3) is 10% to 50% of the object's size.

Compared with the prior art, the invention has the following beneficial effects:

1) It can count different objects filmed under different conditions; it is only necessary to collect a small number of frames from part of the video of the cameras to be monitored, label the objects to be counted in them, and fine-tune the parameters of the pre-trained model.

2) It can identify the names of the counted objects and count multiple kinds of objects simultaneously.

3) Counting accuracy is improved, and counting can be performed in real time.

Brief description of the drawings

Fig. 1 is a schematic diagram of the overall flow of the invention;

Fig. 2 is a flow diagram of the object tracking part of the invention;

Fig. 3 is a schematic diagram of counting against the reference line according to the invention (an object's initial state is 0; when the object's center moves from one side of the reference line to the other, the state changes from 0 to 1);

Fig. 4 is a schematic diagram of counting against the reference line according to the invention (when 3/4 of the object has crossed the reference line, the state changes from 1 to 2, the counter of the object's category is incremented by 1, and the object is no longer considered).

Detailed description of the embodiments

Specific embodiments of the invention are further described below with reference to the accompanying drawings.

Referring to Fig. 1, the overall steps of the invention are as follows:

1) Pre-training stage: an image object detection model is trained on a common image object detection dataset so that it can recognize and label the types of objects in an image (the objects in the dataset, such as cats, dogs, people, vehicles, and aircraft), and the model parameters are saved to a checkpoint.

2) Fine-tuning stage: if the objects to be counted are already included in the dataset and the recognition accuracy meets the requirements, proceed directly to step 3; otherwise, randomly extract a number of frames containing these objects from a portion of the video captured by the camera, label the objects to be counted in these images, and, starting from the checkpoint of the pre-trained model, continue training the model on the newly labeled dataset so that it can recognize and label these objects.

3) Video counting stage: all frames of each video to be counted are extracted, the objects to be counted are recognized and labeled by the model trained in the preceding steps, and the objects are matched between frames. The position of a reference line is fixed, and whenever the center of an object moves from one side of the reference line to the other, the object is recorded; when the center of the object has then moved a certain distance away from the reference line, the counter of the object's category is incremented by one.

Specifically, the image object detection datasets used by the invention include the MS COCO dataset, the ImageNet dataset, and the CIFAR dataset. The image recognition models used by the invention include the Mask RCNN model, the Fast/Faster RCNN model, the VGG model, the ResNet model, and the Inception model. Only one dataset and one model need to be selected in an implementation. These image recognition models can accept an input image of arbitrary size and output the position in the image, size, and name of every object detected in it.

When randomly extracting frames containing these objects from a portion of the video captured by the camera, suitably representative frames should be chosen: 5 or more videos are selected from all videos shot by all cameras to be monitored, and no more than 10 frames are chosen from each video; in the resulting dataset, the ratio of validation set to training set is 1:5 to 1:2.
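The sampling rule above can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation: the function name `sample_fine_tuning_frames`, its parameters, and the use of a seeded random generator are all assumptions introduced here, and frames are represented only as hypothetical `(video, frame_index)` pairs.

```python
import random


def sample_fine_tuning_frames(frame_counts, n_videos=5, max_frames_per_video=10,
                              val_to_train=0.25, seed=0):
    """Pick (video, frame_index) pairs and split them into train and validation lists.

    frame_counts maps a video name to its number of frames (hypothetical input).
    val_to_train is the validation:training ratio and must lie in 1:5 .. 1:2.
    """
    if n_videos < 5 or len(frame_counts) < n_videos:
        raise ValueError("at least 5 videos are required")
    if max_frames_per_video > 10:
        raise ValueError("no more than 10 frames may be taken from each video")
    if not 1 / 5 <= val_to_train <= 1 / 2:
        raise ValueError("validation:training ratio must be between 1:5 and 1:2")

    rng = random.Random(seed)
    videos = rng.sample(sorted(frame_counts), n_videos)

    picked = []
    for video in videos:
        k = min(max_frames_per_video, frame_counts[video])
        indices = sorted(rng.sample(range(frame_counts[video]), k))
        picked.extend((video, i) for i in indices)

    rng.shuffle(picked)
    # val / train = r  implies  val = total * r / (1 + r)
    n_val = round(len(picked) * val_to_train / (1 + val_to_train))
    return picked[n_val:], picked[:n_val]
```

With eight hypothetical videos of 300 frames each and the default ratio of 1:4, the sketch yields 50 frames in total, 40 for training and 10 for validation.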

When labeling a new dataset for the task, the custom dataset must meet the dataset format requirements of the image recognition model being trained. For object detection, an annotation consists of the coordinates of the top-left vertex of the bounding rectangle of the object to be detected, the size of the rectangle, and the object's category. For image segmentation, an annotation describes the polygonal boundary of the object to be detected; this polygon should cover the entire object as fully as possible, and it is recorded as an ordered two-dimensional array of all the polygon's vertex coordinates together with the object's category.
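As an illustration of the two annotation formats just described, the following Python sketch models them as plain records. The class names `BoxAnnotation` and `PolygonAnnotation`, their field names, and the helper `polygon_to_box` are hypothetical; the actual on-disk format depends on the chosen model and is not specified in the text.

```python
from dataclasses import dataclass


@dataclass
class BoxAnnotation:
    """Object detection label: top-left vertex, rectangle size, category."""
    x: float        # x coordinate of the top-left vertex
    y: float        # y coordinate of the top-left vertex
    width: float
    height: float
    category: str


@dataclass
class PolygonAnnotation:
    """Segmentation label: ordered 2-D array of polygon vertices, category."""
    vertices: list  # [[x, y], [x, y], ...] in boundary order
    category: str


def polygon_to_box(poly):
    """Derive the enclosing rectangle of a polygon label (convenience helper)."""
    xs = [v[0] for v in poly.vertices]
    ys = [v[1] for v in poly.vertices]
    return BoxAnnotation(min(xs), min(ys), max(xs) - min(xs),
                         max(ys) - min(ys), poly.category)
```

A polygon label can thus be reduced to a detection label when only bounding rectangles are needed, while the reverse is not possible without the original outline.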

During training in the fine-tuning stage, the number of steps per epoch should be set to 20 to 100, so that the model's loss function on the newly labeled dataset converges to a minimum and its accuracy converges to a maximum. Through fine-tuning, the model retains its ability to detect the original dataset while its parameters are adjusted so that it can also detect the object categories in the custom new dataset.

Referring to Fig. 2, objects are matched and tracked between frames as follows: first, the position of each detected object is recorded; in the next frame, all detected objects are traversed, and for each one the object from the previous frame whose center has the smallest weighted distance to it is taken as its counterpart. The weighted distance is calculated as follows:

[The formula is not reproduced in this text.]

where c₁ is the center of the object in the current frame, given by its two coordinate values; c₀ is the most recently detected center, within the preceding n frames, of the object being compared, likewise given by its two coordinate values; and α and β are weights. If movement along the x axis is the main concern, α = 0.8 and β = 0.2 may be used. To prevent objects from being lost when they go unrecognized during tracking, a forgetting value n in the range 1 to 10 is set; if the number of frames in which an object is not detected exceeds the forgetting value, the object is determined to be lost.
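The matching rule and the forgetting value can be sketched in Python as below. Several points are assumptions made for illustration only: the text does not give the exact form of the weighted distance here, so a weighted Euclidean distance with α on the x difference and β on the y difference is assumed; the class `CenterTracker`, the `max_distance` gate that decides when a detection starts a new track, and the greedy matching order are also hypothetical and not taken from the patent.

```python
import math
from dataclasses import dataclass


def weighted_distance(c1, c0, alpha=0.8, beta=0.2):
    """ASSUMED form: sqrt(alpha * dx^2 + beta * dy^2) between two centers."""
    return math.sqrt(alpha * (c1[0] - c0[0]) ** 2 + beta * (c1[1] - c0[1]) ** 2)


@dataclass
class Track:
    track_id: int
    center: tuple    # last detected center (c0)
    missed: int = 0  # consecutive frames without a detection


class CenterTracker:
    def __init__(self, forget_after=5, max_distance=50.0, alpha=0.8, beta=0.2):
        self.forget_after = forget_after  # forgetting value n (1..10 in the text)
        self.max_distance = max_distance  # hypothetical gate for new tracks
        self.alpha, self.beta = alpha, beta
        self.tracks = {}
        self._next_id = 0

    def _dist(self, center, track_id):
        return weighted_distance(center, self.tracks[track_id].center,
                                 self.alpha, self.beta)

    def update(self, centers):
        """Match this frame's centers to tracks; return {track_id: center}."""
        unmatched = set(self.tracks)
        assigned = {}
        for c in centers:
            best = min(unmatched, key=lambda t: self._dist(c, t), default=None)
            if best is not None and self._dist(c, best) <= self.max_distance:
                unmatched.discard(best)
                track = self.tracks[best]
                track.center, track.missed = c, 0
            else:
                track = Track(self._next_id, c)
                self.tracks[track.track_id] = track
                self._next_id += 1
            assigned[track.track_id] = c
        # Tracks with no detection in this frame age; drop them past the limit.
        for t in unmatched:
            self.tracks[t].missed += 1
            if self.tracks[t].missed > self.forget_after:
                del self.tracks[t]
        return assigned
```

Keeping the last detected center while a track is temporarily unmatched lets an object that is missed for a few frames be picked up again under the same identifier, which is the purpose of the forgetting value.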

A reference line is determined for counting the objects that pass it. Its position lies within 10% to 90% of the frame width or height; depending on the specific counting task, a part of the video with a higher recognition rate is chosen, and normally the middle of the frame is selected.
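A minimal Python sketch of placing the reference line under this constraint follows. The function name `reference_line_position` and its parameters are hypothetical; the default of 0.5 corresponds to the middle of the frame mentioned above.

```python
def reference_line_position(frame_extent, fraction=0.5):
    """Return the pixel coordinate of the reference line along one frame axis.

    frame_extent is the frame width (vertical line) or height (horizontal line);
    fraction must lie within the 10%..90% range given in the text.
    """
    if not 0.10 <= fraction <= 0.90:
        raise ValueError("reference line must lie within 10%-90% of the frame")
    return frame_extent * fraction
```

For a 1920-pixel-wide frame the default places a vertical reference line at x = 960.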

Referring to Figs. 3 and 4, an object's initial state is 0. When the object's center moves from one side of the reference line to the other, its state changes from 0 to 1. When the object's center has moved away from the reference line by 10% to 50% of the object's size (normally 25%, i.e., when 3/4 of the object has crossed the reference line), its state changes from 1 to 2, the counter of the object's category is incremented by 1, and the object is no longer considered.
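The three-state counting rule can be sketched in Python as follows. This is an illustration under stated assumptions: the reference line is taken to be vertical at `line_x`, the object's size is taken to be its width along the x axis, and the class `LineCounter` with its method and field names is hypothetical. How an object that turns back after reaching state 1 should be handled is not specified in the text; this sketch simply leaves it in state 1 until it is far enough on the far side.

```python
from collections import Counter


class LineCounter:
    def __init__(self, line_x, leave_fraction=0.25):
        self.line_x = line_x
        self.leave_fraction = leave_fraction  # 10%..50% of object size in the text
        self.counts = Counter()               # category -> number of objects counted
        self._state = {}                      # track_id -> 0, 1 or 2
        self._start_side = {}                 # track_id -> side where first seen

    def update(self, track_id, center_x, width, category):
        """Feed one observation of a tracked object; return its new state."""
        side = 1 if center_x >= self.line_x else -1
        state = self._state.setdefault(track_id, 0)
        start = self._start_side.setdefault(track_id, side)

        # State 0 -> 1: the center has moved to the other side of the line.
        if state == 0 and side != start:
            state = 1
        # State 1 -> 2: the center has left the line by the required distance.
        if (state == 1 and side != start
                and abs(center_x - self.line_x) >= self.leave_fraction * width):
            state = 2
            self.counts[category] += 1

        self._state[track_id] = state
        return state
```

Because an object in state 2 never changes state again, each tracked object contributes at most one increment to the counter of its category.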

Finally, the counts of objects of all categories are obtained.

The foregoing is merely a preferred embodiment of the invention and is not intended to limit its scope; any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims

1. A camera-based method for counting physical objects in video using image object detection technology, characterized by comprising the following steps:

Step 1) pre-training stage: an image object detection model is trained on a common image object detection dataset so that it can recognize and label the types of objects in an image, and the model parameters are saved to a checkpoint;

Step 2) fine-tuning stage: if the objects to be counted are already included in the dataset and the recognition accuracy meets the requirements, proceed directly to step 3); otherwise, randomly extract a number of frames containing these objects from a portion of the video captured by the camera, label the objects to be counted in these images, and, starting from the checkpoint of the pre-trained model, continue training the model on the newly labeled dataset so that it can recognize and label these objects;

Step 3) video counting stage: all frames of each video to be counted are extracted, the objects to be counted are recognized and labeled by the model trained in the preceding steps, and the objects are matched and tracked between frames; the position of a reference line is fixed, and whenever the center of an object moves from one side of the reference line to the other, the object is recorded; when the center of the object has then moved a certain distance away from the reference line, the counter of the object's category is incremented by one.

2. The method according to claim 1, characterized in that the image object detection dataset in step 1) is the MS COCO dataset, the ImageNet dataset, or the CIFAR dataset.

3. The method according to claim 1, characterized in that the image object detection model in step 1) is a Mask RCNN model, a Fast/Faster RCNN model, a VGG model, a ResNet model, or an Inception model.

4. The method according to claim 1, characterized in that the image object detection model in step 1) can accept an input image of arbitrary size and outputs the position in the image, size, and name of every object detected in it.

5. The method according to claim 1, characterized in that the newly labeled dataset in step 2) is built by selecting 5 or more videos from all videos shot by all cameras to be monitored, with no more than 10 frames chosen from each video, and the ratio of validation set to training set in the resulting dataset is 1:5 to 1:2.

6. The method according to claim 1, characterized in that the annotation format of the newly labeled dataset in step 2) must meet the format requirements of the image object detection model; for object detection, an annotation consists of the coordinates of the top-left vertex of the bounding rectangle of the object to be detected, the size of the rectangle, and the object's category; for image segmentation, an annotation consists of an ordered two-dimensional array of all vertex coordinates of the polygonal boundary of the object to be detected, together with the object's category.

7. The method according to claim 1, characterized in that, during training in the fine-tuning stage of step 2), the number of steps per epoch is set to 20 to 100, so that the model's loss function on the newly labeled dataset converges to a minimum and its accuracy converges to a maximum.

8. The method according to claim 1, characterized in that the position of the reference line in step 3) lies within 10% to 90% of the frame width or height.

9. The method according to claim 1, characterized in that the matching and tracking of objects between frames in step 3) comprises: first recording the position of each detected object; in the next frame, traversing all detected objects and, for each one, taking the object from the previous frame whose center has the smallest weighted distance to it as its counterpart; and, to prevent objects from being lost when they go unrecognized during tracking, setting a forgetting value in the range 1 to 10, wherein if the number of frames in which an object is not detected exceeds the forgetting value, the object is determined to be lost.

10. The method according to claim 1, characterized in that the distance by which the object's center leaves the reference line in step 3) is 10% to 50% of the object's size.