III. Summary of the invention:
Technical problem to be solved by the invention: to address the shortcomings of the existing Internet pornographic image detection and filtering systems described in the background art, the invention proposes a content-based, multi-level system for detecting Internet pornographic images and bad images, and builds its own feature model together with a standard feature library of 100,000 pornographic images.
The technical solution adopted in the present invention:
A content-based system for detecting network pornographic images and bad images comprises a skin color detection subsystem and an attitude detection subsystem. The detection system builds a mathematical model based on fast algorithms for skin color detection and attitude detection. The skin color detection subsystem is derived by analyzing the skin colors found in network images and by experimentally comparing image color spaces: the HSV color space is adopted to build the skin color model, the distribution of human skin color in the selected HSV color space is determined, and the skin exposure degree of the image is then computed. By setting thresholds on the skin exposure degree, the skin color detection subsystem distinguishes normal images, pornographic images and suspect images; a suspect image, on which the skin color detection subsystem cannot reach a final decision, is sent to the attitude detection subsystem for feature-based similarity matching. The attitude detection subsystem first selects a number of representative standard pornographic images, analyzes them, extracts their features, and builds by training a posture feature library, that is, the standard pornographic image feature library, which serves as the basis for judging by matching similarity whether an image is pornographic. Wavelet edge detection is applied to the suspect network image to obtain a wavelet edge image; the wavelet edge image is analyzed, edge points are extracted, and the bounding rectangle of the object is determined; the pixels inside the rectangle are segmented according to the skin color model to give a preliminarily segmented skin-region image, which is processed by morphological erosion filtering and then converted to a grayscale image. Through shape description and posture analysis of the skin-region image a matching similarity is defined, and the current image is matched against the images in the standard pornographic image feature library: if the features of the current image are similar to features in the library, the current image is judged to be a pornographic image and is intercepted; otherwise the current image is judged to be a normal image.
In the described network pornographic image and bad image detection system, the skin color detection subsystem operates as follows:
(1) The pixels of the network image are first converted to the HSV color space and quantized, dividing the space into L color subspaces. Statistical analysis then gives the total number of sample skin pixels, skin_count, and the number of sample skin pixels falling in each of the L subspaces, sub_count_i, i=1, ..., L. The normalized frequency is taken as the likelihood that skin pixels fall in subspace i:
$v_i = \text{sub\_count}_i / \text{skin\_count}$
A likelihood threshold T_vi is set for the skin color distribution probability: if $v_i \ge T_{vi}$, then $w_i = v_i$; otherwise $w_i = 0$. This finally yields
$A = \{A_1, A_2, \dots, A_L\}$
$W = \{w_1, w_2, \dots, w_L\}$
where $w_i$ is the membership degree of the corresponding subspace $A_i$, that is, the likelihood that a color in $A_i$ is a skin color, i=1, 2, ..., L, with the parameter L=72. Clustering thus yields the set A of subspaces over which skin color is distributed and the membership set W of A;
(2) Computing the skin exposure degree of the image:
For an arbitrary image F(x, y), x=1, ..., M; y=1, ..., N, each pixel (x, y) is converted to the HSV color space and quantized to obtain the label of the color subspace it belongs to, so that the whole image F(x, y) becomes an M × N label matrix G(m, n), m=1, ..., M; n=1, ..., N. The normalized histogram Hue[k], k=1, ..., L, of G(m, n) is computed, and the skin exposure degree of the image is calculated by the following formula:
where $w_k$ is the membership degree of the corresponding subspace $A_k$;
(3) The skin exposure degree Ratio is then used to distinguish normal images from pornographic images by soft decision: a low threshold T_Low and a high threshold T_High are set, and Ratio is compared with both. If an image satisfies Ratio ≥ T_High, it is judged to be a pornographic image; if it satisfies Ratio ≤ T_Low, it is judged to be a normal image; in all other cases it is regarded as a suspect image, on which the skin color detection subsystem makes no decision and which is forwarded to the attitude detection subsystem for detection.
In the described attitude detection subsystem, the core attitude detection algorithm consists mainly of wavelet edge detection, image segmentation, morphological filtering, shape description and similarity matching:
(1) Wavelet edge detection: the suspect image sent by the skin color detection subsystem is decomposed by a pyramid wavelet decomposition using the Daubechies-4 wavelet basis, giving the LL low-frequency subband and the three high-frequency subbands LH, HL and HH. The following formula
is used to combine the edges contained in the high-frequency subbands LH, HL and HH into a single edge map E(i, j), where $E_1[i, j]$ is the edge sub-image of the high-frequency subband LH, $E_2[i, j]$ is the edge sub-image of the high-frequency subband HL, and $E_3[i, j]$ is the edge sub-image of the high-frequency subband HH;
(2) Image segmentation: first, the wavelet edge image is analyzed and the leftmost, rightmost, topmost and bottommost edge points are extracted to determine the bounding rectangle of the object. The pixels of the original color image lying outside the bounding rectangle are then erased, and the pixels inside the rectangle are segmented according to the skin color model: any pixel p(x, y) is converted to the HSV space and quantized to obtain its quantization label k ∈ [1, ..., L]. If $w_k \ne 0$, the pixel is kept; otherwise it is erased. This gives the preliminarily segmented skin-region image, where $w_k$ is the membership degree of the corresponding subspace $A_k$;
(3) Morphological filtering: mathematical morphology is used to process the preliminarily segmented image and filter out the noise pixels that do not belong to the object region;
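(4) Shape description: once the region image of the object has been obtained, the seven Hu invariant moments of the image are derived from its second-order and third-order normalized central moments:
$\varphi_1 = \eta_{20} + \eta_{02}$
$\varphi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2$
$\varphi_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2$
$\varphi_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2$
$\varphi_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{03} + \eta_{21})^2] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{03} + \eta_{21})^2]$
$\varphi_6 = (\eta_{20} - \eta_{02})[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})$
$\varphi_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{03} + \eta_{21})^2] + (3\eta_{12} - \eta_{30})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{03} + \eta_{21})^2]$
The 18 values of the second-order to fifth-order normalized central moments of the image, together with the 7 Hu moment values, are used to describe the shape of a segmented skin-region image;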
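The shape features can be computed as in the following sketch, which assumes the standard definitions of the normalized central moments, $\eta_{pq} = \mu_{pq} / \mu_{00}^{1 + (p+q)/2}$, and of the seven Hu moments. The function names and the list-of-rows image representation are illustrative choices, not taken from the text.

```python
def normalized_central_moments(img, max_order=3):
    """img: 2-D list of non-negative intensities. Returns {(p, q): eta_pq}
    for every order 2 <= p + q <= max_order."""
    m00 = m10 = m01 = 0.0
    for y, row in enumerate(img):
        for x, f in enumerate(row):
            m00 += f
            m10 += x * f
            m01 += y * f
    xc, yc = m10 / m00, m01 / m00  # centroid of the region
    eta = {}
    for p in range(max_order + 1):
        for q in range(max_order + 1 - p):
            if p + q < 2:
                continue
            mu = sum(((x - xc) ** p) * ((y - yc) ** q) * f
                     for y, row in enumerate(img) for x, f in enumerate(row))
            eta[(p, q)] = mu / m00 ** (1 + (p + q) / 2.0)
    return eta


def hu_moments(img):
    """Seven Hu invariant moments from second- and third-order moments."""
    n = normalized_central_moments(img, 3)
    n20, n02, n11 = n[(2, 0)], n[(0, 2)], n[(1, 1)]
    n30, n03, n21, n12 = n[(3, 0)], n[(0, 3)], n[(2, 1)], n[(1, 2)]
    a, b = n30 + n12, n21 + n03
    return [
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2,
        a ** 2 + b ** 2,
        (n30 - 3 * n12) * a * (a ** 2 - 3 * b ** 2)
        + (3 * n21 - n03) * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        (3 * n21 - n03) * a * (a ** 2 - 3 * b ** 2)
        + (3 * n12 - n30) * b * (3 * a ** 2 - b ** 2),
    ]
```

Calling `normalized_central_moments(img, 5)` yields 3 + 4 + 5 + 6 = 18 values for orders two to five; appending the 7 Hu moments gives the 25-component feature vector used in the matching step.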
(5) Similarity matching decision: a weighted Euclidean distance is used as the similarity measure. Let the weight vector be $W_j$ and the features of the image currently being judged be $\varphi_j$, j=1, 2, ..., 25; let the features in the standard pornographic image feature library be $\varphi'_{ij}$, i=1, 2, ..., N; j=1, 2, ..., 25, where N is the number of features in the standard pornographic image feature library. The similarity $d_i$ is defined as
After the N feature similarities $d_i$ have been obtained, a threshold T_shape is set. If a feature similarity falls within the interval [T_shape, 1], the features of the image being judged are considered similar to that feature in the standard pornographic image feature library, and the number Num of similar features is counted. If Num satisfies Num > T_num, where T_num is the threshold on how many of the N library features must be similar to the features of the image being judged, the image is judged to be a pornographic image; otherwise it is judged to be a normal image.
The described network pornographic image and bad image detection system further comprises an icon detection subsystem, which discriminates network images by their size and aspect ratio. First, a threshold T_size is set on the width and height of the image and the network image is judged by its size: if the smaller of the width and height of the network image is less than the threshold T_size, the image is judged to be a bad network image; if the smaller of the width and height is greater than T_size, the image is judged to be a normal image. Second, the image is judged by the ratio of its height and width: a threshold T_logo is set on this ratio, and network images that form a narrow horizontal or vertical strip are filtered out. T_size is set to 32 and T_logo to 10.
The described network pornographic image and bad image detection system further comprises a text detection subsystem, which relies on the difference in color composition between text images and ordinary continuous-tone images. The input image is first converted to a grayscale image and its histogram H[i], i ∈ [0, 255], is computed. By analyzing the image histogram, a suitable gray value $\theta_{eg}$ is chosen as the threshold that divides the grayscale histogram into a low-gray-value region and a high-gray-value region; $\theta_{eg} \ge 200$ is chosen, and the energy proportion of the high-gray-value region is calculated by the following formula:
An image satisfying $P_{eg} \ge P_{EG}$ is judged to be a text image; $P_{EG}$ may take different values depending on the recognition requirements, and $P_{EG} \ge 0.7$ is chosen. Alternatively, based on the difference in information entropy between text images and ordinary continuous-tone images, a gray value range $\theta_{ep1} \le i \le \theta_{ep2}$ is chosen and the histogram information entropy is computed, with $\theta_{ep1} = 127$ and $\theta_{ep2} = 255$. The histogram is normalized:
The local information entropy of the histogram is computed:
An image satisfying $ep_l \ge EP_L$ is judged to be a text image; $EP_L$ may take different values depending on the recognition requirements, and $EP_L \le 2$ is taken for text images. Alternatively, the two methods above are fused according to the result of discriminating text images by color information: thresholds $P_{EG1}$ and $P_{EG2}$ are chosen for $P_{eg}$, and thresholds $EP_{L1}$ and $EP_{L2}$ for $ep_l$, and the following is defined:
The color-based text image identification parameter is defined:
$C_H \in [0, 1]$; the image is judged to be a text image when $C_H$ is greater than a threshold parameter T_ch.
The described network pornographic image and bad image detection system comprises a subsystem for detecting other bad images. The feature samples of other specific bad images are subjected to the PCA transform in RGB color space to build a PCA color space, and a neural network is trained on the skin color samples in the PCA color space to obtain a stable feature detector. The suspect images passed on by the icon detection subsystem and the text detection subsystem are compared against this feature detector to detect bad network images, and the detected bad network images are input to the skin color detection subsystem for further processing.
In the described network pornographic image and bad image detection system, pornographic image levels are set according to the proportion of feature images in the standard pornographic image feature library that a web page image is found to match.
Beneficial effects of the invention:
1. The invention is the first domestically to apply the theory and techniques of "content-based image recognition and retrieval" to the detection and filtering of Internet pornographic images. It creates a content-based bad image detection model which, combining clustering and neural network methods, integrates multi-level intelligent detection techniques such as icon detection, text detection and pornographic image detection. It moves from the passive URL filtering of the past to active content filtering and markedly improves the filtering effect. It can filter images in the JPEG, GIF, BMP and TIF formats; the overall recognition and filtering success rate for Internet pornographic images is greater than 99%, the misjudgment rate is lower than 5%, the filtering rate for other bad information is greater than 80%, and the average recognition time for a pornographic image is less than 0.5 second, so that network speed is not affected.
2. For the pornographic image detection model of the invention, a standard pornographic image feature library of 100,000 images has been built through repeated comparison and screening, and serves as the basis for the similarity judgment of whether a network image is pornographic. Content-based filtering and detection of bad information is thereby achieved: pornographic image information can be intercepted directly, pornographic URLs are added to the blacklist automatically in real time, and the URL database is updated in real time so that it is always kept current. The system is intelligent and its interception efficiency is high.
V. Embodiments:
Embodiment one: referring to Fig. 1 and Fig. 3, the network pornographic image and bad image detection system comprises an icon detection subsystem, which discriminates network images by their size and aspect ratio. Its purpose is to detect images of the website-advertisement kind and, at the same time, to filter out images that are too small. Such images mostly appear as a very narrow strip, or are small overall, and their content generally does no harm.
(1) Discrimination by image size: a threshold is set on the width and height of the image, and images smaller than this threshold are considered to belong to the icon class.
if min(image_width, image_height) < T_size, the image is judged to be a bad image.
(2) Discrimination by the ratio of image height and width: a threshold is set on the ratio of height and width so that narrow horizontal or vertical strip images, which are mostly website advertisements and the like, can be screened out.
if (image_width > image_height) Rs = image_width / image_height;
else Rs = image_height / image_width;
if (Rs > T_logo), the image is judged to be a normal image.
In practice, the thresholds are chosen empirically as T_size = 32 and T_logo = 10.
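The two icon tests can be combined into a single predicate, sketched below. The function names are hypothetical, and the code treats both tests as grounds for excluding the image from further content analysis, which is one reading of the stated purpose of this subsystem rather than a definitive statement of its verdict labels.

```python
T_SIZE = 32  # size threshold given in the text
T_LOGO = 10  # aspect-ratio threshold given in the text


def aspect_ratio(width, height):
    """Ratio Rs of the longer side to the shorter side (always >= 1)."""
    return max(width, height) / min(width, height)


def is_icon_or_banner(width, height, t_size=T_SIZE, t_logo=T_LOGO):
    """True when the image is icon-sized or a narrow strip."""
    if min(width, height) < t_size:
        return True  # too small: icon class
    return aspect_ratio(width, height) > t_logo  # narrow strip: banner class
```

For example, a 16 × 16 icon and a 468 × 40 banner are both caught, while a 640 × 480 photograph passes on to the later subsystems.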
The described network pornographic image and bad image detection system comprises a text detection subsystem. The text detector performs text/image discrimination on network images in order to detect images consisting largely of text, for example network faxes and network text advertisements that exist in image form.
(1) Histogram division
Analysis of image histograms shows that text images differ greatly from continuous-tone images and have the following characteristic: most of the energy is concentrated in the region of higher gray values, while the distribution over the remaining gray levels is approximately uniform. On this basis, a suitable gray value is chosen as the threshold for dividing the histogram, and text images are identified by comparing the energy in the gray ranges on either side of it.
The input image is first converted to a grayscale image; a simple way to do so is to take the luminance value of each pixel. The histogram H[i], i ∈ [0, 255], of this grayscale image is computed, and by analyzing the image histogram a suitable gray value is chosen as the threshold for dividing the histogram. On the basis of a large number of experiments, $\theta_{eg}$ is taken as the threshold that divides the grayscale histogram into a low-gray-value region and a high-gray-value region; $\theta_{eg} \ge 200$ is chosen, and the energy proportion of the high-gray-value region is calculated by the following formula:
An image satisfying $P_{eg} \ge P_{EG}$ is judged to be a text image; $P_{EG}$ may take different values depending on the recognition requirements.
For text images, experiments show that choosing $P_{EG} \ge 0.7$ is suitable.
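The histogram-division test can be sketched as follows. The formula for $P_{eg}$ does not appear in this text, so the sketch assumes the simplest reading: the share of pixels whose gray value is at least $\theta_{eg}$. The function names and that definition of "energy proportion" are assumptions.

```python
def high_gray_energy(gray_pixels, theta_eg=200):
    """Assumed P_eg: fraction of histogram mass at gray values >= theta_eg.

    gray_pixels: non-empty iterable of integers in 0..255.
    """
    hist = [0] * 256
    for g in gray_pixels:
        hist[g] += 1
    return sum(hist[theta_eg:]) / sum(hist)


def is_text_by_energy(gray_pixels, theta_eg=200, p_eg_threshold=0.7):
    """Text image when the high-gray share reaches the threshold P_EG."""
    return high_gray_energy(gray_pixels, theta_eg) >= p_eg_threshold
```

A scanned page that is 80% white background passes the test under these assumptions, while a mid-gray photograph does not.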
(2) Local information entropy: because continuous-tone images are rich in color whereas text images are comparatively monotonous, the two show different information entropy, and the local information entropy of the histogram brings out the difference between them more clearly. A gray value range $\theta_{ep1} \le i \le \theta_{ep2}$ is chosen and the information entropy of the histogram over that range is computed, with $\theta_{ep1} = 127$ and $\theta_{ep2} = 255$. The histogram is normalized:
The local information entropy of the histogram is computed:
An image satisfying $ep_l \ge EP_L$ is judged to be a text image; $EP_L$ may take different values depending on the recognition requirements, and for text images $EP_L \le 2$ is suitable.
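A sketch of the local entropy computation follows. The normalization and entropy formulas do not appear in this text; the code assumes the histogram is normalized over the chosen gray range only and that the entropy is the Shannon entropy in bits. Both choices, and the function name, are assumptions.

```python
import math


def local_entropy(gray_pixels, theta_ep1=127, theta_ep2=255):
    """Shannon entropy (base 2) of the histogram restricted to
    theta_ep1 <= i <= theta_ep2, normalised within that range."""
    hist = [0] * 256
    for g in gray_pixels:
        hist[g] += 1
    window = hist[theta_ep1:theta_ep2 + 1]
    total = sum(window)
    if total == 0:
        return 0.0  # no pixels in the range: entropy taken as zero
    probs = [c / total for c in window if c > 0]
    return -sum(p * math.log2(p) for p in probs)
```

The returned value is then compared with the threshold $EP_L$ as described in the text; an image using only a handful of bright gray levels yields a small value, a richly shaded one a large value.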
(3) Fusion: the two methods above can be fused according to the result of discriminating text images by color information, as follows. Thresholds $P_{EG1}$ and $P_{EG2}$ are chosen for $P_{eg}$, and thresholds $EP_{L1}$ and $EP_{L2}$ for $ep_l$, and the following is defined:
The color-based text image identification parameter is defined:
$C_H \in [0, 1]$; the image is judged to be a text image when $C_H$ is greater than a threshold parameter T_ch.
Skin color detection subsystem: this subsystem is derived by analyzing the color composition of network images and by experimentally comparing image color spaces. The HSV color space is adopted to build the skin color model, and the distribution of human skin color in the selected HSV color space is determined. The pixels of the network image are first converted to the HSV color space and quantized, dividing the space into L color subspaces. Statistical analysis then gives the total number of sample skin pixels, skin_count, and the number of sample skin pixels falling in each of the L subspaces, sub_count_i, i=1, ..., L. The normalized frequency is taken as the likelihood that skin pixels fall in subspace i:
$v_i = \text{sub\_count}_i / \text{skin\_count}$
A likelihood threshold T_vi is set for the skin color distribution probability: if $v_i \ge T_{vi}$, then $w_i = v_i$; otherwise $w_i = 0$. This finally yields
$A = \{A_1, A_2, \dots, A_L\}$
$W = \{w_1, w_2, \dots, w_L\}$
where $w_i$ is the membership degree of the corresponding subspace $A_i$, that is, the likelihood that a color in $A_i$ is a skin color, i=1, 2, ..., L, with the parameter L=72. Clustering thus yields the set A of subspaces over which skin color is distributed and the membership set W of A;
Computing the skin exposure degree of the image: for an arbitrary image F(x, y), x=1, ..., M; y=1, ..., N, each pixel (x, y) is converted to the HSV color space and quantized to obtain the label of the color subspace it belongs to, so that the whole image F(x, y) becomes an M × N label matrix G(m, n), m=1, ..., M; n=1, ..., N. The normalized histogram Hue[k], k=1, ..., L, of G(m, n) is computed, and the skin exposure degree of the browsed image is calculated by the following formula,
The skin exposure degree Ratio is then used to distinguish normal images from pornographic images, by either hard decision or soft decision. (1) Hard decision: a threshold T_Valve is set and Ratio is compared with it. If an image satisfies Ratio ≥ T_Valve, it is judged to be a pornographic image; otherwise it is a normal image. The value of T_Valve is taken in the range [0.10, 0.15]. (2) Soft decision: a low threshold T_Low and a high threshold T_High are set and Ratio is compared with both. If an image satisfies Ratio ≥ T_High, it is judged to be a pornographic image; if it satisfies Ratio ≤ T_Low, it is judged to be a normal image; in all other cases it is regarded as a suspect image, on which the skin color detection subsystem makes no decision and which is forwarded to the attitude detection subsystem for detection;
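The exposure computation and the two decision modes can be sketched as follows. The formula for Ratio does not appear in this text, so the membership-weighted sum of the normalized histogram, Ratio = Σ Hue[k]·w_k, is an assumption consistent with the surrounding definitions; the function names and any threshold values other than the stated range of T_Valve are likewise hypothetical.

```python
def exposure_ratio(labels, w):
    """labels: flat list of 0-based subspace labels, one per pixel;
    w: membership degrees, one per subspace."""
    hist = [0] * len(w)
    for k in labels:
        hist[k] += 1
    total = len(labels)
    hue = [c / total for c in hist]  # normalised histogram Hue[k]
    # Assumed form of the missing formula: membership-weighted sum.
    return sum(hue[k] * w[k] for k in range(len(w)))


def hard_decision(ratio, t_valve):
    """Single-threshold decision (T_Valve in [0.10, 0.15] per the text)."""
    return "pornographic" if ratio >= t_valve else "normal"


def soft_decision(ratio, t_low, t_high):
    """Two-threshold decision; the middle band is deferred as 'suspect'."""
    if ratio >= t_high:
        return "pornographic"
    if ratio <= t_low:
        return "normal"
    return "suspect"
```

The soft mode trades speed for accuracy: only images in the band between T_Low and T_High incur the cost of the attitude detection subsystem.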
The attitude detection subsystem first builds the posture feature library by training, then performs posture analysis and similarity matching on the suspect images passed on by the skin color detection subsystem, distinguishing normal images from pornographic images. The attitude detection algorithm consists mainly of wavelet edge detection, image segmentation, morphological filtering, shape description and similarity matching, each of which is described in detail below:
(1) Wavelet edge detection
The principle of conventional wavelet edge detection is as follows. Let $C_{j+1}$ denote the original image and let $C_j$, $D_j^1$, $D_j^2$, $D_j^3$ be the four sub-images obtained from the original image by the wavelet transform. Let $(\{h_k\}_{k \in Z}, \{g_k\}_{k \in Z})$ and
be a pair of dual filters derived from a biorthogonal wavelet; the biorthogonal wavelet decomposition and reconstruction formulas of the image are then as follows:
Detecting the edge points of the image then amounts to searching, within a certain neighborhood along the gradient vector direction, for the points at which the gradient vector magnitude is a maximum. The gradient vector magnitude is proportional to:
and the direction of the gradient vector is $\mathrm{Arg}(D_j^1 + iD_j^2)$.
In application, a point (x, y) is considered an edge point if its gradient vector magnitude $D_j$ is a local maximum within the neighborhood along the gradient direction and also satisfies $D_j > T$, where T is a threshold.
The Daubechies-4 wavelet basis is used to apply a pyramid wavelet decomposition to the original image, giving the LL low-frequency subband and the three high-frequency subbands LH, HL and HH. The LH subband contains the edges in the horizontal direction of the original image, the HL subband the edges in the vertical direction, and the HH subband the edges in the diagonal direction. These three types of edge are detected separately and then combined into a single edge map. For the LH subband, the points of maximum gradient vector magnitude are sought within a certain neighborhood in the horizontal direction, and the inverse wavelet transform is applied keeping only the wavelet coefficients of the LH subband, giving the edge sub-image $E_1(i, j)$. The HL and HH subbands are processed similarly, giving the edge sub-images $E_2(i, j)$ and $E_3(i, j)$ respectively. The following formula is used to combine the three types of edge into one edge map E(i, j).
The image passed on by the skin color detection subsystem is a color image, whereas wavelet edge detection usually operates on a grayscale image; the color image can therefore first be converted to a grayscale image, or the red channel of the color image can be processed directly.
(2) Image segmentation: in order to describe the shape of the object in the image, the image is segmented by combining the wavelet edge image with the skin color model, chiefly to extract the regions of exposed human skin.
First, the wavelet edge image is analyzed and the leftmost, rightmost, topmost and bottommost edge points are extracted to determine the bounding rectangle of the object. Then the pixels of the original color image lying outside the bounding rectangle are erased. The pixels inside the rectangle are segmented according to the skin color model: any pixel p(x, y) is converted to the HSV space and quantized to obtain its quantization label k ∈ [1, ..., L]. If $w_k \ne 0$, the pixel is kept; otherwise it is erased. This gives the preliminarily segmented skin-region image.
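The segmentation step can be sketched as follows, assuming the edge map and the per-pixel quantization labels are already available as 2-D lists of the same size. The function names, the 0-based labels and the binary mask output are illustrative choices rather than details given in the text.

```python
def bounding_rectangle(edge):
    """edge: 2-D list, non-zero = edge point.
    Returns (left, top, right, bottom), or None if there are no edge points."""
    pts = [(x, y) for y, row in enumerate(edge) for x, e in enumerate(row) if e]
    if not pts:
        return None
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return min(xs), min(ys), max(xs), max(ys)


def segment_skin(labels, edge, memberships):
    """Keep pixels inside the bounding rectangle whose subspace has a
    non-zero membership degree; everything else is erased (mask value 0)."""
    height, width = len(labels), len(labels[0])
    mask = [[0] * width for _ in range(height)]
    rect = bounding_rectangle(edge)
    if rect is None:
        return mask
    left, top, right, bottom = rect
    for y in range(top, bottom + 1):
        for x in range(left, right + 1):
            if memberships[labels[y][x]] != 0:
                mask[y][x] = 1
    return mask
```

Restricting the skin test to the bounding rectangle discards skin-colored background outside the detected object before any shape analysis is attempted.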
(3) Morphological filtering
The preliminarily segmented skin-region image produced above often contains a great deal of granular and speckle noise of very small area. It must be filtered so as to remove the noise pixels that do not belong to the object region while effectively keeping the pixels that do. Commonly used filtering methods include low-pass, high-pass and smoothing filters; here, mathematical morphology is used to process the preliminarily segmented image.
Morphology defines four basic operations: dilation, erosion, opening and closing, of which opening and closing are compositions of dilation and erosion. For an input image f with structuring element b, both of which are in essence images, the dilation of f by b is defined as
$(f \oplus b)(s) = \max\{f(s - x) + b(x) \mid x \in D_b,\ (s - x) \in D_f\}$
The erosion of f by b is defined as
$(f \ominus b)(s) = \min\{f(s + x) - b(x) \mid x \in D_b,\ (s + x) \in D_f\}$
The opening of f by b is defined as
$f \circ b = (f \ominus b) \oplus b$
The closing of f by b is defined as
$f \bullet b = (f \oplus b) \ominus b$
where $D_f$ and $D_b$ are the domains of f and b respectively, and s and x are vectors in the integer space $Z^2$. For dilation, the image is dilated as long as the structuring element b and the input image f intersect in one pixel. Conversely, for erosion, erosion takes place only where the structuring element b lies entirely within f. Geometrically, dilation enlarges the shapes in the image and erosion shrinks them. Opening removes the convex regions of the image that do not fit the structuring element while keeping those that do. Closing fills the concave regions of the image that do not fit the structuring element while keeping those that do. The preliminarily segmented skin-region image is processed with the morphological erosion operator. The eroded image is first converted to a grayscale image, and region description is then performed.
(4) Shape description: once the region image of the object has been obtained, its shape can be described in many ways, such as numerical measures of region shape, Fourier descriptors, moment descriptors and topological descriptors. Because Hu moments are invariant to translation, rotation and scale change of the image, they are very useful for describing image shape. A total of 25 feature values, namely the 18 values of the second-order to fifth-order normalized central moments of the image and the 7 Hu moment values, are used to describe the shape of a segmented skin-region image.
(5) Similarity matching: a weighted Euclidean distance is used as the similarity measure. Let the weight vector be $W_j$ and the features of the current image be $\varphi_j$, j=1, 2, ..., 25; let the features in the feature library be $\varphi'_{ij}$, i=1, 2, ..., N, j=1, 2, ..., 25, where N is the number of features in the feature library. The similarity $d_i$ is defined as
After the N feature similarities $d_i$ have been obtained, a threshold T_shape is set. If a feature similarity falls within the interval [T_shape, 1], the features of the current image are considered similar to that feature in the feature library, and the number Num of similar features is counted. If Num satisfies Num > T_num, where T_num is the threshold on how many of the N library features must be similar to the features of the current image, the image is judged to be a pornographic image. Otherwise the image is judged to be a normal image.
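The matching decision can be sketched as follows. The definition of $d_i$ does not appear in this text; the code assumes, purely for illustration, that the weighted Euclidean distance is mapped to a similarity in (0, 1] by d = 1 / (1 + distance), so that identical feature vectors score 1. That mapping and the function names are assumptions, not the patent's formula.

```python
import math


def similarity(current, reference, weights):
    """Assumed similarity: 1 / (1 + weighted Euclidean distance)."""
    dist = math.sqrt(sum(w * (c - r) ** 2
                         for w, c, r in zip(weights, current, reference)))
    return 1.0 / (1.0 + dist)


def count_similar(current, library, weights, t_shape):
    """Num: how many library entries have similarity in [T_shape, 1]."""
    return sum(1 for ref in library
               if t_shape <= similarity(current, ref, weights) <= 1.0)


def is_pornographic(current, library, weights, t_shape, t_num):
    """Decision rule from the text: pornographic when Num > T_num."""
    return count_similar(current, library, weights, t_shape) > t_num
```

In the system described, `current` and each library entry would hold the 25 shape features; the code works for vectors of any common length.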
The network pornographic image and bad image detection system of the invention comprises a subsystem for detecting other bad images; see Fig. 2. This subsystem applies the PCA transform in RGB color space to the feature samples of other specific bad images to build a PCA color space, and trains a neural network on the skin color samples in the PCA color space to obtain a stable feature detector. The suspect images passed on by the icon detector and the text detector are compared against this feature detector; the bad network images detected are input to the skin color detection subsystem for further processing. The detector for other bad images is similar in concept to the pornographic image detector, but whereas the latter recognizes images by human body features, other bad images lack common features, so only a train-and-compare approach can be used for the decision. In many cases the RGB color space is converted to the HSI space or the YCbCr space so as to separate luminance information from chrominance information, and the skin color model is built on the HS two-dimensional subspace of the HSI space or the CbCr two-dimensional subspace of the YCbCr space. When illumination varies strongly, however, the color distributions built in the HS and CbCr subspaces change considerably, which is very unfavorable for feature detection. This part therefore uses the PCA transform to build a PCA color space and trains a neural network on the skin color samples in the PCA color space to obtain a stable feature detector.
Image feature detection based on a neural network and the PCA transform: the invention proposes an image feature detection algorithm based on a neural network and the PCA transform, which examines the pixels of the input image one by one. In training mode, the PCA transform is applied in RGB space to the feature samples of the training set, giving a linear projection matrix. The second and third column vectors of the projection matrix span a new two-dimensional feature detection space and are called the axis vectors of the PCA feature space; these two vectors correspond to the directions in RGB space along which the feature pixels vary least. The feature samples of the original training set, after projection by the matrix formed from the second and third column vectors, therefore give new feature samples that cluster tightly in the PCA feature space. Finally, the feature samples in the PCA feature space are passed to the neural network for training until the network converges. In detection mode, each pixel of the image to be examined is projected by the matrix formed from the second and third column vectors obtained in training mode and is then passed to the neural network for detection; when all pixels have been examined, the detection result for the whole image is obtained.
The PCA feature space: a good feature detection space must satisfy the following conditions:
1. the color information of the image is concentrated in two of its components;
2. these two components carry sufficiently little non-color information (such as luminance information);
3. the mean square deviation of these two components is sufficiently small.
The PCA transform is the optimal transform in the mean-square-error sense and is also commonly called the KL transform. In matrix form: $A = O^T B$, where A is the transformed vector, B is the vector to be transformed, and O is the transformation matrix, which is closely related to B and is usually composed of the eigenvectors of the autocorrelation matrix of B. Mathematically, therefore, the core of the PCA transform is solving for the eigenvalues and eigenvectors of a matrix.
The PCA feature space is built by the PCA transform. Let X be the set of feature samples in RGB space used for training, $X = [X_1, X_2, \dots, X_T]$, where T is the number of feature samples. First the mean vector M of the feature samples is computed.
Subtracting the mean from the RGB-space feature samples gives the zero-mean sample set $\Phi = [\Phi_1, \Phi_2, \dots, \Phi_T]$, $\Phi_i = X_i - M$, $1 \le i \le T$. Then the autocorrelation matrix $S_T$ is computed.
Finally the eigenvalues and eigenvectors of the autocorrelation matrix $S_T$ are found: $S_T \Psi = \Psi \Lambda$, where $\Psi = [\Psi_1, \Psi_2, \Psi_3]$ denotes the eigenvectors of the matrix and $\Lambda$ is the diagonal matrix formed by the eigenvalues $\lambda_1, \lambda_2, \lambda_3$ ($\lambda_1 \ge \lambda_2 \ge \lambda_3$). The two vectors $\Psi_2$, $\Psi_3$ associated with the eigenvalues $\lambda_2$, $\lambda_3$ correspond to the directions in RGB space along which the feature pixels vary least. $\Psi_2$ and $\Psi_3$ are therefore taken as the two principal axes of the new color space and constitute the PCA feature space; $\Psi_2$ and $\Psi_3$ also form the linear projection matrix by which the feature samples in the original RGB space are transformed into the PCA feature space.
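The construction of the PCA feature space can be sketched in pure Python as follows. The sketch assumes the autocorrelation matrix is the mean-removed scatter matrix scaled by 1/T (the scaling does not affect the eigenvectors) and uses cyclic Jacobi rotations for the 3 × 3 eigenproblem; both choices and all function names are illustrative, not taken from the text.

```python
import math


def autocorrelation(samples):
    """3x3 matrix S_T of the mean-removed RGB samples (1/T scaling assumed)."""
    t = len(samples)
    mean = [sum(s[c] for s in samples) / t for c in range(3)]
    phi = [[s[c] - mean[c] for c in range(3)] for s in samples]
    return [[sum(p[a] * p[b] for p in phi) / t for b in range(3)]
            for a in range(3)]


def jacobi_eigen(s, sweeps=50):
    """Eigenvalues (descending) and matching eigenvectors of a symmetric 3x3."""
    a = [list(row) for row in s]
    v = [[float(i == j) for j in range(3)] for i in range(3)]
    for _ in range(sweeps):
        for p, q in ((0, 1), (0, 2), (1, 2)):
            if abs(a[p][q]) < 1e-15:
                continue
            diff = a[q][q] - a[p][p]
            theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
            c, sn = math.cos(theta), math.sin(theta)
            for m in (a, v):  # rotate columns p, q of A and V
                for k in range(3):
                    mp, mq = m[k][p], m[k][q]
                    m[k][p], m[k][q] = c * mp - sn * mq, sn * mp + c * mq
            for k in range(3):  # rotate rows p, q of A
                ap, aq = a[p][k], a[q][k]
                a[p][k], a[q][k] = c * ap - sn * aq, sn * ap + c * aq
    order = sorted(range(3), key=lambda i: a[i][i], reverse=True)
    return [a[i][i] for i in order], [[v[k][i] for k in range(3)] for i in order]


def pca_feature_axes(samples):
    """Return (Psi_2, Psi_3), the axes of least variation."""
    _, vecs = jacobi_eigen(autocorrelation(samples))
    return vecs[1], vecs[2]


def project(pixel, psi2, psi3):
    """Map an RGB pixel into the two-dimensional PCA feature space."""
    return [sum(p * c for p, c in zip(pixel, psi2)),
            sum(p * c for p, c in zip(pixel, psi3))]
```

If the training samples differ mainly in brightness, the dominant axis $\Psi_1$ aligns with the brightness direction, so projecting onto $\Psi_2$ and $\Psi_3$ largely removes the illumination change, which is the robustness property the text is after.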
The BP neural network: neural network methods have good parallel processing performance and good generalization ability, and do not require the prior probability distribution of the data; they have therefore shown great advantages in the field of pattern recognition. The BP neural network is the most thoroughly studied and most widely used of the feed-forward neural networks; a BP neural network with one hidden layer is adopted here. The network has three layers: i denotes an input-layer node, j a hidden-layer node and k an output-layer node. The learning error function of the network is defined as
where $d_k$ is the desired output of the network and $y_k$ is the actual output of the network. The weight update formulas for each layer then follow:
Between the hidden layer and the output layer:
$w_{jk}(t+1) = w_{jk}(t) + \eta\,\delta_k\,y_j$
$\delta_k = y_k(1 - y_k)(d_k - y_k)$
Between the input layer and the hidden layer:
$w_{ij}(t+1) = w_{ij}(t) + \eta\,\delta_j\,y_i$
In the above, $\eta$ is the learning rate and $\delta_k$, $\delta_j$ are the correction terms of the respective layers.
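One training step of such a network can be sketched as follows. The two weight updates and the expression for $\delta_k$ follow the text; the error function E = ½ Σ (d_k − y_k)², the sigmoid activation and the hidden-layer term δ_j = y_j(1 − y_j) Σ_k δ_k w_jk are the standard back-propagation forms and are assumed here, since they do not appear in this text. Bias terms are omitted for brevity, and the function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_step(w_ij, w_jk, x, d, eta):
    """One back-propagation update for a single sample.

    w_ij[i][j]: input-to-hidden weights; w_jk[j][k]: hidden-to-output weights.
    Both are modified in place. Returns the error E measured before the update.
    """
    n_in, n_hid, n_out = len(x), len(w_jk), len(d)
    y_j = [sigmoid(sum(w_ij[i][j] * x[i] for i in range(n_in)))
           for j in range(n_hid)]
    y_k = [sigmoid(sum(w_jk[j][k] * y_j[j] for j in range(n_hid)))
           for k in range(n_out)]
    delta_k = [y_k[k] * (1 - y_k[k]) * (d[k] - y_k[k]) for k in range(n_out)]
    # Assumed standard hidden-layer term, computed with the old w_jk.
    delta_j = [y_j[j] * (1 - y_j[j])
               * sum(delta_k[k] * w_jk[j][k] for k in range(n_out))
               for j in range(n_hid)]
    for j in range(n_hid):
        for k in range(n_out):
            w_jk[j][k] += eta * delta_k[k] * y_j[j]
    for i in range(n_in):
        for j in range(n_hid):
            w_ij[i][j] += eta * delta_j[j] * x[i]  # y_i is the input x_i
    return 0.5 * sum((d[k] - y_k[k]) ** 2 for k in range(n_out))
```

In the detector described, x would be the two PCA feature-space coordinates of a pixel and d the label saying whether the pixel belongs to the target feature class; the step is repeated over the training set until the error stops decreasing.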
In the network pornographic image and bad image detection system of the invention, pornographic image levels can be set according to the proportion of feature images in the standard pornographic image feature library that a web page image is found to match, and interception is applied per level, so that the content that can be browsed differs for adults and for children.
Embodiment two: referring to Fig. 1 and Fig. 3, this embodiment is essentially the same as embodiment one, except that the system does not contain the subsystem for detecting other bad images. After icon detection and text detection, normal web page images are separated out and the suspect images are passed to the skin color detection subsystem, which again separates out normal web page images. The suspect images on which skin color detection cannot decide are sent to the attitude detection subsystem for similarity matching against the standard pornographic images, and the pornographic images are filtered out.
Embodiment three: referring to Fig. 1 and Fig. 3, the network pornographic image and bad image detection system of this embodiment contains only the skin color detection subsystem and the attitude detection subsystem, and intercepts only those network pornographic images with strong visual stimulus.