CN110287876A - A kind of content identification method based on video image - Google Patents
- Publication number
- CN110287876A (application CN201910556426.8A)
- Authority
- CN
- China
- Prior art keywords
- layer
- video image
- model
- content
- trained
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
Abstract
A content recognition method based on video images. The invention belongs to the field of artificial intelligence and specifically relates to a video image recognition method. Its purpose is to solve the poor real-time performance of existing recognition based on video content. The invention first constructs an image recognition network model and then, for a video, extracts key frame images. The key frame images are processed with the image recognition network model to determine the content objects in the image; at the same time, the optical flow field between two frames is computed by the optical flow method, and the features of the key frame are propagated to the other frames. The model is then trained to obtain the trained final recognition model, which is used to recognize the content of video images. The invention is used for content recognition of video images.
Description
Technical field
The invention belongs to the field of artificial intelligence and specifically relates to a video image recognition method.
Background
With the steady progress of science and technology, fields such as autonomous driving and robotics are developing ever faster, and the corresponding technologies are becoming more mature. Whether in autonomous driving, robotics, or similar fields, autonomous recognition and autonomous judgment are generally realized on the basis of image processing; in particular, autonomous recognition in autonomous driving and robotics (for example, collision avoidance during motion) is mostly based on video image processing.
However, current video image processing has certain drawbacks. The data volume of video is huge, which places high demands not only on the hardware for image acquisition and image processing but also on the software environment used for processing, so existing hardware or software processes video slowly and cannot meet real-time requirements. Autonomous driving in particular places very high demands on real-time judgment: if the real-time requirement is not met, driving safety cannot be guaranteed; if image precision has to be sacrificed in order to meet it, the accuracy of content recognition drops or the false-alarm rate rises, which likewise poses a serious hazard to driving safety. This also restricts the development of other fields with real-time requirements, such as robotics.
Summary of the invention
The purpose of the present invention is to solve the poor real-time performance of existing recognition based on video content.
A content recognition method based on video images, comprising the following steps:
Step 1, construct an image recognition network model:
The structure of the image recognition network model is: an input layer, a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, a third pooling layer, a first feature concatenation layer, a second feature concatenation layer, and an output layer. The first feature concatenation layer concatenates the feature map of the third pooling layer with the feature map of the second pooling layer; the result is fused by convolution, batch normalization, and ReLU activation, then processed by an attention mechanism, and the resulting feature information is input to the second feature concatenation layer. The second feature concatenation layer concatenates the feature map input from the first feature concatenation layer with the feature map of the first pooling layer; the result is again fused by convolution, batch normalization, and ReLU activation, then processed by an attention mechanism, and the resulting deep feature information is input to the output layer;
Step 2, for the video, extract key frame images;
The key frame images are processed with the image recognition network model to determine the content objects in the image;
At the same time, the optical flow field between two frames is computed by the optical flow method, and the features of the key frame are propagated to the other frames;
Step 3, train the model of Step 2 to obtain the trained final recognition model;
Step 4, recognize the content of video images using the trained final recognition model.
The beneficial effects of the invention are:
The parameter count of the image recognition network model constructed by the invention can be kept within a reasonable range, and the invention processes key frames and non-key frames differently, which ensures real-time recognition of video content. At the same time, the content recognition accuracy of the invention can reach 90 percent, giving a good video image content recognition effect.
Brief description of the drawings
Fig. 1 is a schematic diagram of the constructed image recognition network model.
Detailed description of the embodiments
Embodiment 1:
A content recognition method based on video images, comprising the following steps:
Step 1, as shown in Fig. 1, construct an image recognition network model:
The structure of the image recognition network model is: an input layer, a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, a third pooling layer, a first feature concatenation layer, a second feature concatenation layer, and an output layer. The first feature concatenation layer concatenates the feature map of the third pooling layer with the feature map of the second pooling layer; the result is fused by convolution, batch normalization, and ReLU activation, then processed by an attention mechanism, and the resulting feature information is input to the second feature concatenation layer. The second feature concatenation layer concatenates the feature map input from the first feature concatenation layer with the feature map of the first pooling layer; the result is again fused by convolution, batch normalization, and ReLU activation, then processed by an attention mechanism, and the resulting deep feature information is input to the output layer;
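The layer order in Step 1 can be checked by tracing feature-map shapes through the network. The following is only a minimal sketch, not the patented implementation: the 3x3 "same" convolutions, the 2x2 pooling, the channel counts (16, 32, 64), and the resizing step used to align spatial sizes before concatenation are all assumptions, since the text does not state them. All function names are hypothetical.

```python
def conv_same(shape, out_channels):
    """Convolution with 'same' padding (assumed): only the channel count changes."""
    _, h, w = shape
    return (out_channels, h, w)


def pool2(shape):
    """2x2 pooling with stride 2 (assumed): halves height and width."""
    c, h, w = shape
    return (c, h // 2, w // 2)


def resize_to(shape, target):
    """Resize a feature map to the spatial size of `target` (assumed alignment step)."""
    return (shape[0], target[1], target[2])


def concat(a, b):
    """Channel-wise concatenation; spatial sizes must already match."""
    assert a[1:] == b[1:], "spatial sizes must match before concatenation"
    return (a[0] + b[0], a[1], a[2])


def fuse(shape, out_channels):
    """Convolution + batch normalization + ReLU + attention.

    Batch normalization, ReLU and attention reweight values but keep the
    shape, so only the convolution affects the traced shape.
    """
    return conv_same(shape, out_channels)


def trace_shapes(input_shape=(3, 64, 64), channels=(16, 32, 64)):
    """Return the (channels, height, width) shape after each named stage."""
    c1, c2, c3 = channels
    p1 = pool2(conv_same(input_shape, c1))  # first conv + first pooling
    p2 = pool2(conv_same(p1, c2))           # second conv + second pooling
    p3 = pool2(conv_same(p2, c3))           # third conv + third pooling
    # First feature concatenation layer: third pooling with second pooling.
    cat1 = fuse(concat(resize_to(p3, p2), p2), c2)
    # Second feature concatenation layer: first concatenation with first pooling.
    cat2 = fuse(concat(resize_to(cat1, p1), p1), c1)
    return {"pool1": p1, "pool2": p2, "pool3": p3,
            "concat1": cat1, "concat2": cat2}


if __name__ == "__main__":
    for name, shape in trace_shapes().items():
        print(name, shape)
```

The trace makes one point explicit: the third pooling layer has a smaller spatial size than the second, so some resizing is needed before the two maps can be concatenated along the channel axis.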
Step 2, for the video, extract key frame images. Key frame extraction uses an existing method; in this embodiment, key frames are extracted with a content-analysis-based method. This approach is simple and convenient and helps the overall algorithm meet the real-time requirement. In addition, with this method the content of the other frames is closer to the content objects recognized in the key frame image, which helps ensure the accuracy of the algorithm. The content-analysis-based method extracts key frames on the basis of the color, texture, and similar properties of each frame, and determines key frames according to the difference between frames and a set threshold.
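The threshold rule described above can be illustrated with a small sketch. It assumes that frame content is summarized by a normalized intensity histogram and that the difference between frames is the L1 distance between histograms; the text only says that color and texture are used, so the feature, the bin count, and the threshold value are assumptions, and the function names are hypothetical.

```python
def histogram(frame, bins=8, max_value=256):
    """Normalized intensity histogram of a frame given as a flat list of pixels."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // max_value, bins - 1)] += 1
    return [c / len(frame) for c in counts]


def frame_difference(frame_a, frame_b, bins=8):
    """L1 distance between the histograms of two frames (0.0 to 2.0)."""
    hist_a, hist_b = histogram(frame_a, bins), histogram(frame_b, bins)
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


def extract_key_frames(frames, threshold=0.5):
    """Return indices of key frames.

    The first frame is always a key frame; a later frame becomes a key
    frame when it differs from the most recent key frame by more than
    `threshold`.
    """
    if not frames:
        return []
    keys = [0]
    for index in range(1, len(frames)):
        if frame_difference(frames[keys[-1]], frames[index]) > threshold:
            keys.append(index)
    return keys


if __name__ == "__main__":
    dark, bright = [10] * 16, [200] * 16
    print(extract_key_frames([dark, dark, bright, bright, dark]))
```

Comparing each frame with the most recent key frame, rather than with its immediate predecessor, keeps slow drifts from going undetected; either choice fits the description.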
The key frame images are processed with the image recognition network model to determine the content objects in the image;
At the same time, the optical flow field between two frames is computed by the optical flow (Optical Flow) method, and the features of the key frame are propagated to the other frames;
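Propagating key frame features along the flow field can be sketched as a warp. The example below assumes that the flow stores, for every position in the current frame, the displacement (dx, dy) to the matching position in the key frame, and it uses nearest-neighbour sampling for brevity; the text does not specify the sampling scheme, and bilinear interpolation is a common alternative. The function name is hypothetical.

```python
def warp_features(key_features, flow):
    """Warp a key-frame feature map to another frame using a dense flow field.

    key_features: H x W nested list of feature values from the key frame.
    flow: H x W nested list of (dx, dy) displacements from each position in
          the current frame to its source position in the key frame.
    """
    height, width = len(key_features), len(key_features[0])
    warped = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            # Nearest-neighbour lookup, clamped to the feature map borders.
            src_x = min(max(int(round(x + dx)), 0), width - 1)
            src_y = min(max(int(round(y + dy)), 0), height - 1)
            warped[y][x] = key_features[src_y][src_x]
    return warped


if __name__ == "__main__":
    features = [[1, 2, 3], [4, 5, 6]]
    shift_right = [[(-1, 0)] * 3 for _ in range(2)]
    print(warp_features(features, shift_right))
```

Because only the key frames pass through the recognition network and the other frames reuse warped features, the per-frame cost for non-key frames is reduced to the flow computation and the warp.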
The optical flow in this embodiment is dense optical flow; its visualization is described as follows:
For optical flow visualization, hue H is measured as an angle in the range 0° to 360°, counted counterclockwise starting from red: red is 0°, green is 120°, and blue is 240°. Saturation S ranges from 0.0 to 1.0. Value (brightness) V ranges from 0.0 (black) to 1.0 (white). FlowNet assigns V the value 255; this function follows FlowNet, and saturation S represents the magnitude of the pixel displacement.
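The HSV mapping described above can be sketched with the Python standard library. This is an illustrative reading of the description, not the original listing: the flow direction sets the hue, the displacement magnitude (normalized by the largest magnitude in the field, which is an assumption) sets the saturation, and V is fixed at its maximum, which corresponds to 255 on an 8-bit scale. The function names are hypothetical.

```python
import colorsys
import math


def flow_to_hsv(dx, dy, max_magnitude):
    """Map one flow vector to (hue in degrees, saturation, value)."""
    hue = math.degrees(math.atan2(dy, dx)) % 360.0  # direction -> hue
    magnitude = math.hypot(dx, dy)
    if max_magnitude > 0:
        saturation = min(magnitude / max_magnitude, 1.0)  # size -> saturation
    else:
        saturation = 0.0
    return hue, saturation, 1.0  # V fixed at maximum (255 in 8-bit terms)


def flow_to_rgb(dx, dy, max_magnitude):
    """Map one flow vector to an 8-bit (R, G, B) tuple."""
    hue, saturation, value = flow_to_hsv(dx, dy, max_magnitude)
    rgb = colorsys.hsv_to_rgb(hue / 360.0, saturation, value)
    return tuple(round(channel * 255) for channel in rgb)


def visualize_flow(flow):
    """Convert an H x W nested list of (dx, dy) vectors to RGB pixels."""
    max_magnitude = max(
        (math.hypot(dx, dy) for row in flow for dx, dy in row), default=0.0
    )
    return [[flow_to_rgb(dx, dy, max_magnitude) for dx, dy in row]
            for row in flow]


if __name__ == "__main__":
    print(visualize_flow([[(1.0, 0.0), (0.0, 0.0)]]))
```

With V fixed at its maximum, a pixel with no motion has zero saturation and appears white, while the fastest-moving pixels appear as fully saturated colors. Whether an angle reads as counterclockwise on screen depends on the direction of the image y axis, which the text does not fix.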
Step 3, train the model of Step 2 to obtain a final recognition model, and test it with a test set. If the final recognition model meets the recognition-rate requirement, it is taken as the trained final recognition model; otherwise, return to Step 1 and readjust the model parameters.
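The acceptance test in Step 3 amounts to comparing the recognition rate on the test set with a required value. A minimal sketch follows; the default requirement of 0.9 is an assumption borrowed from the accuracy figure quoted in the beneficial effects, and the function names are hypothetical.

```python
def recognition_rate(predictions, labels):
    """Fraction of test samples whose predicted class equals the true class."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def accept_model(predictions, labels, required_rate=0.9):
    """True if the model meets the recognition-rate requirement.

    A False result corresponds to returning to Step 1 and readjusting
    the model parameters.
    """
    return recognition_rate(predictions, labels) >= required_rate


if __name__ == "__main__":
    labels = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]
    predictions = [0, 1, 2, 0, 1, 2, 0, 1, 2, 1]
    print(recognition_rate(predictions, labels), accept_model(predictions, labels))
```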
The cross-entropy loss function is used throughout training, as given by the following formula:
where N is the total number of training samples selected, k denotes the k-th sample selected during training, and j is the class index in the data set; p_k denotes the probability of the k-th sample, and p_k denotes the probability of the j-th class.
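For reference, the usual multi-class cross-entropy averaged over N samples can be computed as below. This is a sketch of the standard form, L = -(1/N) Σ_k Σ_j y_kj log p_kj with one-hot labels y_kj and predicted probabilities p_kj; these two symbols are assumptions, since the formula itself is not reproduced in this text, and the patent's exact expression may differ.

```python
import math


def cross_entropy(probabilities, labels, eps=1e-12):
    """Mean cross-entropy over N samples.

    probabilities: N rows, each a list of predicted class probabilities.
    labels: N true class indices (equivalent to one-hot targets, so only
            the probability of the true class contributes to each term).
    eps: lower bound on probabilities to avoid log(0).
    """
    if not labels or len(probabilities) != len(labels):
        raise ValueError("need one probability row per label")
    total = 0.0
    for row, label in zip(probabilities, labels):
        total -= math.log(max(row[label], eps))
    return total / len(labels)


if __name__ == "__main__":
    print(cross_entropy([[0.5, 0.5], [0.5, 0.5]], [0, 1]))
```

The loss is zero when every true class receives probability 1 and grows as probability mass moves to the wrong classes.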
Step 4, recognize the content of video images using the trained final recognition model.
Claims (5)
1. A content recognition method based on video images, characterized by comprising the following steps:
Step 1, construct an image recognition network model:
The structure of the image recognition network model is: an input layer, a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, a third pooling layer, a first feature concatenation layer, a second feature concatenation layer, and an output layer. The first feature concatenation layer concatenates the feature map of the third pooling layer with the feature map of the second pooling layer; the result is fused by convolution, batch normalization, and ReLU activation, then processed by an attention mechanism, and the resulting feature information is input to the second feature concatenation layer. The second feature concatenation layer concatenates the feature map input from the first feature concatenation layer with the feature map of the first pooling layer; the result is again fused by convolution, batch normalization, and ReLU activation, then processed by an attention mechanism, and the resulting deep feature information is input to the output layer;
Step 2, for the video, extract key frame images;
The key frame images are processed with the image recognition network model to determine the content objects in the image;
At the same time, the optical flow field between two frames is computed by the optical flow method, and the features of the key frame are propagated to the other frames;
Step 3, train the model of Step 2 to obtain the trained final recognition model;
Step 4, recognize the content of video images using the trained final recognition model.
2. The content recognition method based on video images according to claim 1, characterized in that the activation function of the first convolutional layer, the second convolutional layer, and the third convolutional layer is ReLU.
3. The content recognition method based on video images according to claim 1, characterized in that the key frame images are extracted with a content-analysis-based method.
4. The content recognition method based on video images according to claim 1, 2 or 3, characterized in that the cross-entropy loss function is used throughout the training of the model of Step 2, as given by the following formula:
where N is the total number of training samples selected, k denotes the k-th sample selected during training, and j is the class index in the data set; p_k denotes the probability of the k-th sample, and p_k denotes the probability of the j-th class.
5. The content recognition method based on video images according to claim 4, characterized in that, after the model of Step 2 is trained to obtain the final recognition model, the model is tested with a test set; if the final recognition model meets the recognition-rate requirement, it is taken as the trained final recognition model; otherwise, return to Step 1 and readjust the model parameters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910556426.8A CN110287876A (en) | 2019-06-25 | 2019-06-25 | A kind of content identification method based on video image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910556426.8A CN110287876A (en) | 2019-06-25 | 2019-06-25 | A kind of content identification method based on video image |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110287876A true CN110287876A (en) | 2019-09-27 |
Family
ID=68005684
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910556426.8A Pending CN110287876A (en) | 2019-06-25 | 2019-06-25 | A kind of content identification method based on video image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110287876A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110672343A (en) * | 2019-09-29 | 2020-01-10 | 电子科技大学 | Rotary machine fault diagnosis method based on multi-attention convolutional neural network |
CN111652081A (en) * | 2020-05-13 | 2020-09-11 | 电子科技大学 | Video semantic segmentation method based on optical flow feature fusion |
CN112446342A (en) * | 2020-12-07 | 2021-03-05 | 北京邮电大学 | Key frame recognition model training method, recognition method and device |
CN115115822A (en) * | 2022-06-30 | 2022-09-27 | 小米汽车科技有限公司 | Vehicle-end image processing method and device, vehicle, storage medium and chip |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107092883A (en) * | 2017-04-20 | 2017-08-25 | 上海极链网络科技有限公司 | Object identification method for tracing |
CN109740419A (en) * | 2018-11-22 | 2019-05-10 | 东南大学 | A kind of video behavior recognition methods based on Attention-LSTM network |
CN109871781A (en) * | 2019-01-28 | 2019-06-11 | 山东大学 | Dynamic gesture identification method and system based on multi-modal 3D convolutional neural networks |
-
2019
- 2019-06-25 CN CN201910556426.8A patent/CN110287876A/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107092883A (en) * | 2017-04-20 | 2017-08-25 | 上海极链网络科技有限公司 | Object identification method for tracing |
CN109740419A (en) * | 2018-11-22 | 2019-05-10 | 东南大学 | A kind of video behavior recognition methods based on Attention-LSTM network |
CN109871781A (en) * | 2019-01-28 | 2019-06-11 | 山东大学 | Dynamic gesture identification method and system based on multi-modal 3D convolutional neural networks |
Non-Patent Citations (1)
Title |
---|
Yu Huangyue et al.: "Video key frame extraction based on user interest semantics", 《计算机应用》 (Journal of Computer Applications) * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110672343A (en) * | 2019-09-29 | 2020-01-10 | 电子科技大学 | Rotary machine fault diagnosis method based on multi-attention convolutional neural network |
CN110672343B (en) * | 2019-09-29 | 2021-01-26 | 电子科技大学 | Rotary machine fault diagnosis method based on multi-attention convolutional neural network |
CN111652081A (en) * | 2020-05-13 | 2020-09-11 | 电子科技大学 | Video semantic segmentation method based on optical flow feature fusion |
CN111652081B (en) * | 2020-05-13 | 2022-08-05 | 电子科技大学 | Video semantic segmentation method based on optical flow feature fusion |
CN112446342A (en) * | 2020-12-07 | 2021-03-05 | 北京邮电大学 | Key frame recognition model training method, recognition method and device |
CN115115822A (en) * | 2022-06-30 | 2022-09-27 | 小米汽车科技有限公司 | Vehicle-end image processing method and device, vehicle, storage medium and chip |
CN115115822B (en) * | 2022-06-30 | 2023-10-31 | 小米汽车科技有限公司 | Vehicle-end image processing method and device, vehicle, storage medium and chip |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110287876A (en) | A kind of content identification method based on video image | |
CN106952269B (en) | The reversible video foreground object sequence detection dividing method of neighbour and system | |
CN109190475B (en) | Face recognition network and pedestrian re-recognition network collaborative training method | |
CN110796018B (en) | Hand motion recognition method based on depth image and color image | |
CN110033040B (en) | Flame identification method, system, medium and equipment | |
CN109635728B (en) | Heterogeneous pedestrian re-identification method based on asymmetric metric learning | |
CN110298297A (en) | Flame identification method and device | |
CN104240256A (en) | Image salient detecting method based on layering sparse modeling | |
CN107067015A (en) | A kind of vehicle checking method and device based on multiple features deep learning | |
CN112487981A (en) | MA-YOLO dynamic gesture rapid recognition method based on two-way segmentation | |
CN113780132A (en) | Lane line detection method based on convolutional neural network | |
CN112712052A (en) | Method for detecting and identifying weak target in airport panoramic video | |
CN113298024A (en) | Unmanned aerial vehicle ground small target identification method based on lightweight neural network | |
CN110991412A (en) | Face recognition method and device, storage medium and electronic equipment | |
CN113034506A (en) | Remote sensing image semantic segmentation method and device, computer equipment and storage medium | |
CN115359406A (en) | Post office scene figure interaction behavior recognition method and system | |
CN114202775A (en) | Transformer substation dangerous area pedestrian intrusion detection method and system based on infrared image | |
CN106960188B (en) | Weather image classification method and device | |
WO2022222036A1 (en) | Method and apparatus for determining parking space | |
CN109977738A (en) | A kind of video scene segmentation judgment method, intelligent terminal and storage medium | |
CN112686122A (en) | Human body and shadow detection method, device, electronic device and storage medium | |
CN116912648A (en) | Method, device, equipment and storage medium for generating material parameter identification model | |
CN110796008A (en) | Early fire detection method based on video image | |
CN112200840B (en) | Moving object detection system in visible light and infrared image combination | |
CN114998801A (en) | Forest fire smoke video detection method based on contrast self-supervision learning network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20190927 |