CN103400138A - Video signal preprocessing method for artificial intelligent multimode behavior recognition and description - Google Patents
- Publication number
- CN103400138A (application number CN2013103342699A)
- Authority
- CN
- China
- Prior art keywords
- histogram
- lbp
- hog
- video signal
- behavior recognition
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention relates to a video signal preprocessing method for artificial-intelligence multimode behavior recognition and description. Moving objects detected at an earlier stage are classified so that their types can be distinguished, which enables artificial-intelligence multimode behavior recognition and description. The method is characterized in that a HOG-LBP (histogram of oriented gradients and local binary patterns) foreground extraction approach is adopted, together with a corresponding preprocessing method and algorithm. The beneficial effects are that object types can be distinguished well, which benefits object tracking, and that useful information is retained while useless information is discarded.
Description
Technical field
The present invention relates to a method for processing video signals, and in particular to a video signal preprocessing method for artificial-intelligence multimode behavior recognition and description.
Background technology
Current intelligent video detection technology can detect moving targets well, but it still has shortcomings in classifying them: it is relatively difficult to distinguish target types such as vehicles and pedestrians.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art: moving targets detected at an earlier stage are classified so that target types can be distinguished, which enables artificial-intelligence multimode behavior recognition and description.
The technical solution adopted to solve the above technical problem is a video signal preprocessing method for artificial-intelligence multimode behavior recognition and description, characterized in that a HOG-LBP (histogram of oriented gradients and local binary patterns) foreground extraction approach is adopted to separate people from vehicles. The preprocessing method and algorithm are as follows:
One: implementation of HOG. The image is first divided into small connected regions called cells. A histogram of the gradient directions (or edge orientations) of the pixels in each cell is then collected. Finally, these histograms are combined to form the feature descriptor. The local histograms are contrast-normalized over a larger region of the image (a block): the density of the histograms within the block is computed first, and every cell in the block is then normalized by this density value. After this normalization the descriptor is more stable under illumination changes and shadows.
The histogram of oriented gradients (HOG) descriptor remains largely invariant to geometric and photometric deformations of the image, because such deformations only show up at larger spatial scales. In addition, with coarse spatial sampling, fine orientation sampling and strong local photometric normalization, a pedestrian who stays roughly upright may make small limb movements, and these movements can be ignored without affecting detection. The HOG method is therefore particularly well suited to pedestrian detection in images.
(1) Gradient computation. The first step of the HOG descriptor is to compute the gradient values. The method is simply to apply a one-dimensional discrete derivative mask in the horizontal and in the vertical direction; Fig. 1 shows the horizontal and vertical gradients. The following kernels can be used for the convolution:
$[-1, 0, 1]$ and $[-1, 0, 1]^{T}$.
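As an illustration of the gradient step, here is a minimal NumPy sketch. It is not taken from the patent: the function name `gradients`, the zero-valued border handling, and the sign convention (right minus left, lower minus upper) are assumptions made for the example. The orientation is folded into the 0-180 degree range used later for binning.

```python
import numpy as np

def gradients(image):
    """Apply the masks [-1, 0, 1] and [-1, 0, 1]^T to a grayscale image."""
    img = np.asarray(image, dtype=float)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    # Horizontal mask: right neighbor minus left neighbor (borders stay 0).
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    # Vertical mask: lower neighbor minus upper neighbor (borders stay 0).
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation folded into [0, 180) degrees.
    orientation = np.degrees(np.arctan2(gy, gx)) % 180.0
    return gx, gy, magnitude, orientation

# Toy image whose intensity increases by 1 per column.
ramp = np.tile(np.arange(5.0), (5, 1))
gx, gy, mag, ori = gradients(ramp)
```

On this ramp image the interior horizontal gradient is 2 (two columns apart) and the vertical gradient is 0 everywhere.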
(2) Orientation binning. The second step is to build the cell histograms. Each pixel in a cell casts a vote into the orientation histogram. The cells can be rectangular or circular. The orientations of the histogram span 0-180 degrees, and dividing this range into 9 channels gives the best results. Fig. 2 shows the orientation binning.
The weight of each vote can be the gradient magnitude itself or a function of it; in practical tests, the gradient magnitude itself gave the best results.
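The voting scheme can be sketched as follows. This is a simplified illustration, not the patent's implementation: each pixel votes only for the single bin that contains its orientation (no interpolation between neighboring bins), and the function name `cell_histogram` is hypothetical.

```python
import numpy as np

def cell_histogram(magnitude, orientation, n_bins=9):
    """Magnitude-weighted orientation histogram of one cell over [0, 180)."""
    mag = np.asarray(magnitude, dtype=float).ravel()
    ori = np.asarray(orientation, dtype=float).ravel() % 180.0
    bin_width = 180.0 / n_bins  # 20 degrees for 9 channels
    bins = np.minimum((ori // bin_width).astype(int), n_bins - 1)
    # Each pixel votes for its bin with a weight equal to its gradient magnitude.
    return np.bincount(bins, weights=mag, minlength=n_bins)

# Toy 2x2 cell: magnitudes and orientations in degrees.
mag = np.array([[1.0, 2.0], [3.0, 4.0]])
ori = np.array([[0.0, 10.0], [90.0, 179.0]])
hist = cell_histogram(mag, ori)
```

Here the pixels at 0 and 10 degrees share the first bin (weight 1 + 2), the pixel at 90 degrees falls in the fifth bin, and the pixel at 179 degrees falls in the last bin.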
(3) Descriptor blocks. To account for changes in illumination and contrast, the gradient strengths must be normalized locally, which requires grouping the cells into larger, spatially connected blocks. The HOG descriptor is then the vector of the components of the normalized cell histograms from all of the block regions. The blocks usually overlap, which means that each cell contributes more than once to the final descriptor. Two main block geometries exist: rectangular R-HOG blocks and circular C-HOG blocks. An R-HOG block is generally a grid of cells described by three parameters: the number of cells per block, the number of pixels per cell, and the number of channels per cell histogram. The present invention adopts R-HOG blocks; the optimal cell and block division found by experiment is 3x3 or 6x6 pixels, with a 9-channel histogram.
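A minimal sketch of overlapping block normalization is given below. Several choices are assumptions that the text does not fix: the L2 norm is used for normalization, blocks advance by one cell at a time, and the 3x3 figure is read as cells per block. The function name `hog_descriptor` is hypothetical.

```python
import numpy as np

def hog_descriptor(cell_hists, block=3, eps=1e-6):
    """Concatenate L2-normalized, overlapping blocks of cell histograms.

    cell_hists has shape (rows, cols, n_bins): one histogram per cell.
    """
    cells = np.asarray(cell_hists, dtype=float)
    rows, cols, _ = cells.shape
    parts = []
    for r in range(rows - block + 1):      # stride of one cell, so blocks overlap
        for c in range(cols - block + 1):
            v = cells[r:r + block, c:c + block].ravel()
            # Normalize the whole block; eps avoids division by zero.
            parts.append(v / np.sqrt(np.sum(v ** 2) + eps ** 2))
    return np.concatenate(parts)

# Toy grid of 4x4 cells with 9-channel histograms.
cells = np.ones((4, 4, 9))
desc = hog_descriptor(cells)
```

With a 4x4 grid of cells and 3x3 blocks there are four overlapping blocks of 81 values each, so the inner cells appear in the descriptor more than once.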
Two: implementation of LBP. The principle of LBP is to compare the center pixel with each of the pixels around it, which yields a binary sequence. The binary sequence is converted to a decimal value that represents the pattern. The patterns of all pixels are then tallied into a histogram in which each pattern corresponds to one component. This histogram serves as an effective description of the original image for the subsequent recognition task.
The basic LBP operator is described as follows:
$LBP_{P,R}$ denotes an LBP operator of arbitrary size, where (P, R) refers to P sample points on a circle of radius R around the center point. The values of the surrounding sample points can be computed for any P and R by bilinear interpolation. From the number of occurrences of the $2^{P}$ different patterns, the LBP histogram is obtained. Fig. 3 shows the LBP computation process.
Take any pixel in the image and compare each of its neighbors with it, using the value of the pixel as the threshold: a neighbor greater than or equal to the pixel is marked 1, otherwise 0. Arranging these bits clockwise gives a binary value, which is converted to a decimal number; the occurrences of the different decimal patterns form the histogram.
To reduce the dimension of the histogram, the present invention adopts the uniform pattern, and the LBP pattern is redefined as $LBP_{n,r}^{u}$: n pixels are chosen at radius r, and the number of transitions between 0 and 1 must not exceed u. Such a pattern is a uniform pattern.
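The basic operator and the uniform-pattern test can be sketched in plain Python. This is an illustrative simplification: it uses the square 3x3 neighborhood in place of a true circle with bilinear interpolation, and the starting neighbor (top-left) and the helper names `lbp_code`, `transitions` and `is_uniform` are assumptions.

```python
# Clockwise 8-neighbor offsets (row, col), starting at the top-left neighbor.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(image, r, c):
    """Basic 8-neighbor LBP code of the interior pixel at (r, c)."""
    center = image[r][c]
    code = 0
    for dr, dc in OFFSETS:
        # A neighbor >= center contributes a 1 bit, otherwise a 0 bit.
        code = (code << 1) | int(image[r + dr][c + dc] >= center)
    return code

def transitions(code, n_bits=8):
    """Number of 0/1 changes in the circular bit string of code."""
    mask = (1 << n_bits) - 1
    rotated = ((code << 1) | (code >> (n_bits - 1))) & mask
    return bin(code ^ rotated).count("1")

def is_uniform(code, n_bits=8, u=2):
    """True if the pattern has at most u transitions."""
    return transitions(code, n_bits) <= u

patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = lbp_code(patch, 1, 1)
uniform_count = sum(is_uniform(k) for k in range(256))
```

For this patch the clockwise bit string is 10001111, which is decimal 143 and has two transitions, so it is a uniform pattern. With 8 neighbors and u = 2, only 58 of the 256 possible patterns are uniform, which is where the reduction in histogram dimension comes from.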
Because a foreground extraction approach is used, training must be performed first; the information extracted from an image is then compared with the training result to determine whether the target is a person or a vehicle.
Three: training based on HOG-LBP. The training procedure for the histogram of oriented gradients and local binary patterns (HOG-LBP) method is as follows:
Step 1: obtain the training images (including positive and negative samples);
Step 2: compute the histogram of oriented gradients (HOG) of the training image:
1. gradient computation,
2. orientation binning,
3. descriptor blocks;
Step 3: compute the local binary patterns (LBP) of the training image:
1. choose the uniform pattern $LBP_{8,1}^{2}$,
2. compute the LBP pattern of each pixel's neighborhood and convert it to a decimal pattern,
3. accumulate the LBP histogram;
Step 4: concatenate the HOG and LBP histograms to form a training histogram;
Step 5: feed the computed training histogram into an SVM classifier for training, thereby obtaining the classification data.
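Steps 4 and 5 can be sketched as below. The patent does not specify which SVM implementation or kernel is used, so this example substitutes a small linear SVM trained by subgradient descent on the hinge loss; in practice an established SVM library would normally be used. The function names, hyperparameters and toy histogram values are all hypothetical, and the sketch covers only a two-class (positive versus negative) case.

```python
import numpy as np

def combine(hog_hist, lbp_hist):
    """Step 4: join the HOG and LBP histograms into one feature vector."""
    return np.concatenate([np.ravel(hog_hist), np.ravel(lbp_hist)])

def train_linear_svm(features, labels, lam=0.01, epochs=200, lr=0.1):
    """Step 5 stand-in: linear SVM via subgradient descent on the hinge loss.

    features has shape (n_samples, n_dims); labels are +1 or -1.
    """
    x = np.asarray(features, dtype=float)
    y = np.asarray(labels, dtype=float)
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(epochs):
        active = y * (x @ w + b) < 1.0  # samples that violate the margin
        grad_w = lam * w - (y[active, None] * x[active]).sum(axis=0) / len(y)
        grad_b = -y[active].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy stand-ins for real HOG and LBP histograms (hypothetical values).
positives = [combine([1.0, 0.0], [0.9, 0.1]), combine([0.9, 0.1], [1.0, 0.0])]
negatives = [combine([0.0, 1.0], [0.1, 0.9]), combine([0.1, 0.9], [0.0, 1.0])]
X = np.array(positives + negatives)
y = np.array([1, 1, -1, -1])
w, b = train_linear_svm(X, y)
predictions = np.sign(X @ w + b)
```

The learned weight vector `w` and bias `b` play the role of the classification data that the recognition stage later applies to new feature vectors.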
Four: recognition based on HOG-LBP. The recognition procedure for the histogram of oriented gradients and local binary patterns (HOG-LBP) method is as follows:
Step 1: obtain the image to be recognized;
Step 2: compute the histogram of oriented gradients (HOG) of this image:
1. gradient computation,
2. orientation binning,
3. descriptor blocks;
Step 3: compute the local binary patterns (LBP) of this image:
1. choose the uniform pattern $LBP_{8,1}^{2}$,
2. compute the LBP pattern of each pixel's neighborhood and convert it to a decimal pattern,
3. accumulate the LBP histogram;
Step 4: concatenate the HOG and LBP histograms to form a recognition histogram;
Step 5: convolve the computed recognition histogram with the classification data obtained in the training stage to produce an output;
Step 6: if the output is 1, the target is a person; if the output is 0, the target is a vehicle; if the output is -1, the target is some other object, defined here as an "unrestrained object".
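The output mapping in Step 6 can be sketched as a small lookup. How the classifier produces the three values 1, 0 and -1 is not described in the text, so `toy_classifier` below is a purely hypothetical stand-in for the trained classifier, and the function names are assumptions.

```python
LABELS = {1: "person", 0: "vehicle", -1: "other object"}

def interpret(output):
    """Map the classifier output of Step 6 to a target type."""
    try:
        return LABELS[int(output)]
    except KeyError:
        raise ValueError(f"unexpected classifier output: {output!r}")

def recognize(feature_vector, classifier):
    """Run a trained classifier on one combined HOG-LBP feature vector."""
    return interpret(classifier(feature_vector))

def toy_classifier(vector):
    # Hypothetical rule, used only so that the example runs end to end.
    return 1 if vector[0] > vector[1] else 0

result = recognize([0.8, 0.2], toy_classifier)
```

Keeping the mapping in one place makes it easy to reject any output other than the three values the method defines.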
In this way, the above method classifies the detected moving targets so that target types can be distinguished (person, vehicle, or "unrestrained object"), which enables artificial-intelligence multimode behavior recognition and description.
The beneficial effects of the invention are that the type of each target can be distinguished well, which benefits target tracking, and that useful information is retained while useless information is discarded.
Description of drawings
Fig. 1 shows the vertical and horizontal gradients;
Fig. 2 shows the orientation binning of the histogram;
Fig. 3 shows the LBP computation process.
Embodiment
The embodiment is described with reference to Fig. 1, Fig. 2 and Fig. 3. It carries out the video signal preprocessing method exactly as set out in the Summary of the invention: HOG feature extraction (gradient computation, orientation binning, descriptor blocks), LBP feature extraction with uniform patterns, HOG-LBP training with an SVM classifier, and HOG-LBP recognition that labels each detected moving target as a person, a vehicle or another object.
Claims (4)
1. A video signal preprocessing method for artificial-intelligence multimode behavior recognition and description, characterized in that a HOG-LBP (histogram of oriented gradients and local binary patterns) foreground extraction approach is adopted, providing a preprocessing method and algorithm that separate people from vehicles.
2. The video signal preprocessing method for artificial-intelligence multimode behavior recognition and description according to claim 1, characterized in that HOG is implemented by first dividing the image into small connected regions called cells; then collecting a histogram of the gradient directions or edge orientations of the pixels in each cell; and finally combining these histograms to form the feature descriptor, wherein the local histograms are contrast-normalized over a larger region of the image (a block) by first computing the density of the histograms within the block and then normalizing every cell in the block by this density value, after which the descriptor is more stable under illumination changes and shadows.
3. The video signal preprocessing method for artificial-intelligence multimode behavior recognition and description according to claim 1, characterized in that LBP is implemented by comparing the center pixel with each of the pixels around it to obtain a binary sequence; converting the binary sequence to a decimal value that represents the pattern; and tallying the patterns of all pixels into a histogram in which each pattern corresponds to one component, the histogram serving as an effective description of the original image for the subsequent recognition task.
4. The video signal preprocessing method for artificial-intelligence multimode behavior recognition and description according to claim 1, characterized in that recognition is implemented based on the HOG-LBP (histogram of oriented gradients and local binary patterns) method.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2013103342699A CN103400138A (en) | 2013-07-29 | 2013-07-29 | Video signal preprocessing method for artificial intelligent multimode behavior recognition and description |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103400138A | 2013-11-20 |
Family
ID=49563756
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2013103342699A Pending CN103400138A (en) | 2013-07-29 | 2013-07-29 | Video signal preprocessing method for artificial intelligent multimode behavior recognition and description |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103400138A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100067742A1 (en) * | 2008-09-12 | 2010-03-18 | Sony Corporation | Object detecting device, imaging apparatus, object detecting method, and program |
CN102663409A (en) * | 2012-02-28 | 2012-09-12 | 西安电子科技大学 | Pedestrian tracking method based on HOG-LBP |
CN102663366A (en) * | 2012-04-13 | 2012-09-12 | 中国科学院深圳先进技术研究院 | Method and system for identifying pedestrian target |
CN103150375A (en) * | 2013-03-11 | 2013-06-12 | 浙江捷尚视觉科技有限公司 | Quick video retrieval system and quick video retrieval method for video detection |
Non-Patent Citations (2)
Title |
---|
覃远霞: "基于数据挖掘工具的人脸识别LBP计算", 《制造业自动化》 * |
陈健斌: "图像特征提取及其相似度的研究和实现", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104978567A (en) * | 2015-06-11 | 2015-10-14 | 武汉大千信息技术有限公司 | Vehicle detection method based on scenario classification |
CN104978567B (en) * | 2015-06-11 | 2018-11-20 | 武汉大千信息技术有限公司 | Vehicle checking method based on scene classification |
CN110135254A (en) * | 2019-04-12 | 2019-08-16 | 华南理工大学 | A kind of fatigue expression recognition method |
CN112887765A (en) * | 2021-01-08 | 2021-06-01 | 武汉兴图新科电子股份有限公司 | Code rate self-adaptive adjustment system and method applied to cloud fusion platform |
CN112887765B (en) * | 2021-01-08 | 2022-07-26 | 武汉兴图新科电子股份有限公司 | Code rate self-adaptive adjustment system and method applied to cloud fusion platform |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 2013-11-20