CN100361450C - System for blocking off erotic images and unhealthy information in internet - Google Patents

Publication number
CN100361450C
Authority
CN
China
Prior art keywords
image
pornographic
detector
data
color
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CNB2005100485766A
Other languages
Chinese (zh)
Other versions
CN1761204A (en)
Inventor
赵慧琴
周翬
汤怀礼
李弼程
彭天强
曹闻
张晨民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Jinhui Computer System Engineering Co., Ltd.
Original Assignee
Zhengzhou Jinhui Computer System Engineering Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Jinhui Computer System Engineering Co Ltd
Priority to CNB2005100485766A
Publication of CN1761204A
Application granted
Publication of CN100361450C
Legal status: Active
Anticipated expiration

Landscapes

  • Image Analysis (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present invention relates to a system for intercepting pornographic images and harmful information on the Internet. The system combines IP address filtering, keyword filtering and pornographic image detection. A mathematical model of pornographic images is established through repeated decision-directed feedback; a feature library of standard pornographic images is built as the basis for judging whether a network image is pornographic; and a similarity-matching decision model is established so that network information that has passed the keyword comparison is judged on the basis of image content. The invention filters information content at the application layer and also filters website addresses at the IP layer, so it can intercept pornographic image information directly and update the URL database in real time. It thereby moves from the passive website-address filtering of the past to active content filtering. A purpose-built multifunctional management platform integrates the complex relationships among the operating system, the browser, the Internet protocols and the image detector, solves the problems of process interaction between client and server, division of pornographic image detection tasks and data reassembly, and makes the system independent of the browser.

Description

System for intercepting pornographic images and harmful information on the Internet
I. Technical field: The present invention relates to the field of Internet application technology, and in particular to a system for intercepting pornographic images and harmful information on the Internet.
II. Background: Today's society is in an era of rapid information development. As a wide-open modern communication technology, the Internet has quickly become popular worldwide, and its channels reach every corner of the globe. Because the network world is a virtual space, all real-life information such as data, sound and images can be converted into bits and travel around the world as streams of computer information. More and more people engage in entertainment, research and commercial activity online, forming a virtual society on the network. Users can roam it without disclosing their true identity, and the moral and ethical constraints of everyday society are largely absent. The network world is therefore more complicated and more dangerous than real society: all kinds of people mingle there, each with different purposes, and good and evil are hard to tell apart. Driven by exorbitant profits, pornographic sites and pornographic web pages have multiplied wildly in recent years. Harmful information with strong sensory stimulation, pornographic images in particular, has spread unchecked, induces juvenile delinquency and seriously affects the healthy growth of schoolchildren. This has caused great indignation and worry among parents and has drawn the attention of society and government, even prompting calls to "save the children". As a result, campus network gateways built at great expense in primary and secondary schools have been closed, cutting off access to online education, and home computers bought for children are left unused. Large investments are wasted and the excellent educational resources available online have to be given up.
To filter harmful online information such as pornography, a large number of filtering programs and systems have appeared on the market in recent years. They can be collectively called "blacklist software": known pornographic URLs or domain names are entered manually into a "blacklist" address database, and by address comparison and keyword comparison the software blocks the blacklisted URLs and related information that a viewer tries to visit. The shortcomings of this method are that it is powerless against the large number of undiscovered, newly added or disguised pornographic sites, that it cannot discover them intelligently in real time and add them to the blacklist, and that text comparison is limited by the differences among national languages. Such software is always in a passive filtering state. Chinese invention patent ZL0112132.7, a pornographic file judging system and method, first inputs a marked file into the system, separates the text and picture parts of the web page under test and sends them to a text comparison engine and a pornographic picture identification engine respectively, and judges pornographic files by calculating pornography indices for the text and pictures and comparing them with a pornography discrimination index. ZL0112127.0, a pornographic picture checking system, filters the picture under examination through dual engines and introduces a pornographic picture database and a database comparison engine, improving the accuracy of pornographic picture identification. Patent application 200410053683.3, a Web content filtering system, consists of a content filtering proxy, a query server and a content analysis and management server. The content filtering proxy stores a blacklist and a whitelist, the query server holds a URL library with classification and rating information, and the content analysis and management server classifies and rates resources on the Internet. That system has self-learning capability, can improve its classification precision, and can actively filter the various kinds of media data on the Internet. However, the main filtering approach of all of these is still based on automatic updating and interception of URLs, and they lack deep, thorough filtering based on actual content. So far no product has been found on the market that is fully content-based in its identification and filtering, combines software and hardware, integrates with the computer operating system, makes the "anti-pornography" application independent of the web browser, cannot be uninstalled, and is tamper-resistant.
III. Summary of the invention:
Technical problem to be solved by the invention: in view of the defects of the existing Internet pornographic image and harmful information detection and filtering systems described in the background, a content-based, multi-level, comprehensive system for intercepting pornographic images and harmful information on the Internet is proposed.
Technical solution adopted by the invention:
A system for intercepting pornographic images and harmful information on the Internet comprises IP address filtering and keyword filtering, a pornographic image detector, a multifunctional management platform, a standard pornographic image feature library, and an image processing card for high-speed parallel computation. The pornographic image detector establishes the core algorithm of a mathematical model that performs feature analysis and feature extraction on skin color and posture and makes a similarity-matching decision on the features. The core algorithm is embedded in the image processing card, which is inserted into an expansion slot of the network server. The standard pornographic image feature library holds the features of 100,000 standard images as the basis for decisions. The multifunctional management platform has a server side and a client side and manages the communication and interaction between the server and the multiple client processes. On the server side the platform runs the detection process for the images being browsed; on the client side it parses, restores and reassembles HTTP data and performs URL filtering and keyword filtering. The platform integrates the relationships among the operating system, the browser, the HTTP protocol and the image processing card, so that detection and filtering of pornographic images and harmful information are independent of the browser. Data sent and received are obtained through the Winsock2 SPI interface (Windows 2000 or XP), and these data are analyzed to obtain the HTTP data. After the HTTP data are decoded, trusted-URL detection, bad-URL detection and keyword filtering are carried out on the client side, and the results determine whether the pornographic image detector is needed. If it is, the client asks the server to call the image processing card to check for bad images. The server collects the image detection results, intercepts pornographic and bad images, and passes normal images back to the client. Newly discovered bad URLs are added to the blacklist URL database automatically, and URLs in the blacklist database that have not been used for a period of time are deleted automatically, so the blacklist database is always changing dynamically.
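The dynamic blacklist described above (automatic addition of newly found bad URLs, automatic deletion of entries unused for a period) can be sketched roughly as follows. This is only an illustration: the class name `DynamicBlacklist`, the `max_idle_seconds` parameter and the choice to treat "unused" as "not matched by any request" are assumptions, not details given in the text.

```python
import time


class DynamicBlacklist:
    """Hypothetical in-memory blacklist with idle-time expiry."""

    def __init__(self, max_idle_seconds):
        self.max_idle = max_idle_seconds
        self._last_hit = {}  # URL -> time of the last add or match

    def add(self, url, now=None):
        """Record a newly discovered bad URL."""
        self._last_hit[url] = time.time() if now is None else now

    def is_blocked(self, url, now=None):
        """Return True if the URL is listed; a match refreshes its timestamp."""
        now = time.time() if now is None else now
        if url in self._last_hit:
            self._last_hit[url] = now
            return True
        return False

    def purge(self, now=None):
        """Delete entries idle for longer than max_idle; return what was removed."""
        now = time.time() if now is None else now
        stale = [u for u, t in self._last_hit.items() if now - t > self.max_idle]
        for u in stale:
            del self._last_hit[u]
        return stale
```

A periodic call to `purge()` would keep the database "always changing", as the text puts it; how long the idle period should be is not stated in the source.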
The pornographic image detector comprises a skin color detector and a posture detector.
The skin color detector analyzes the color composition of network images and, after experimental comparison of image color spaces, adopts the HSV color space to build a skin color model and to determine how human skin colors are distributed in that space.
First, the pixels of the network image are converted to the HSV color space and quantized into L color subspaces. Statistical analysis then determines the total number of sample skin color pixels, skin_count, and the frequency sub_count_i of sample skin color pixels in each of the L subspaces, i = 1, …, L, which satisfy
$$\sum_{i=1}^{L} \text{sub\_count}_i = \text{skin\_count}$$
The normalized frequency is taken as the possibility that skin color pixels are distributed in the subspace:
$$v_i = \text{sub\_count}_i / \text{skin\_count}$$
A possibility threshold T_vi is set for the skin color distribution probability: if $v_i \ge T\_v_i$, then $w_i = v_i$; otherwise $w_i = 0$. This finally yields
$$A = \{A_1, A_2, \ldots, A_L\}$$
$$W = \{w_1, w_2, \ldots, w_L\}$$
where $w_i$ is the degree of membership of subspace $A_i$, that is, the possibility that a color in $A_i$ is a skin color, i = 1, 2, …, L. With the parameter L = 72, clustering yields the set A of skin color distribution subspaces and the membership set W of A. The skin exposure of an image is then computed from this model.
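The model-building step above can be illustrated with a short sketch. The source states only that L = 72; the uniform 8 × 3 × 3 split of (H, S, V), the single threshold value `t_v` and the function names are assumptions made for illustration. Labels are 0-based here, whereas the text counts subspaces from 1.

```python
import colorsys

L = 72  # number of color subspaces, as stated in the text


def quantize_hsv(r, g, b):
    """Map an 8-bit RGB pixel to a subspace label in 0..L-1.

    Assumption: uniform 8 x 3 x 3 quantization of (H, S, V); the source
    does not say how the 72 subspaces are laid out.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hq = min(int(h * 8), 7)
    sq = min(int(s * 3), 2)
    vq = min(int(v * 3), 2)
    return hq * 9 + sq * 3 + vq


def build_skin_model(skin_pixels, t_v=0.01):
    """Return the membership list w[0..L-1] from sample skin pixels.

    t_v is a hypothetical single threshold standing in for T_vi.
    """
    sub_count = [0] * L
    for r, g, b in skin_pixels:
        sub_count[quantize_hsv(r, g, b)] += 1
    skin_count = sum(sub_count)
    if skin_count == 0:
        return [0.0] * L
    v = [c / skin_count for c in sub_count]          # normalized frequency v_i
    return [vi if vi >= t_v else 0.0 for vi in v]    # w_i = v_i or 0
```

Subspaces that rarely contain sample skin pixels get membership 0, so they contribute nothing to later scoring.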
For any image F(x, y), x = 1, …, M; y = 1, …, N, each pixel (x, y) is converted to the HSV color space and quantized to obtain the label of its color subspace, so that the whole image F(x, y) becomes an M × N label matrix G(m, n), m = 1, …, M; n = 1, …, N. The normalized histogram Hue[k], k = 1, …, L, of G(m, n) is computed, and the skin exposure of the image is calculated by the following formula, where $w_k$ is the degree of membership of subspace $A_k$:
$$\text{Ratio} = \sum_{k=1}^{L} \text{Hue}[k] \times w_k$$
The skin exposure Ratio of the image is then used to distinguish normal images from pornographic images, in either a hard-decision mode or a soft-decision mode.
(1) Hard decision: a threshold T_Valve is chosen and Ratio is compared with it. If an image satisfies Ratio ≥ T_Valve, it is judged to be a pornographic image; otherwise it is a normal image. The value of T_Valve is taken in the interval [0.10, 0.15].
(2) Soft decision: a low threshold T_Low and a high threshold T_High are chosen and Ratio is compared with both. If an image satisfies Ratio ≥ T_High, it is judged to be a pornographic image; if it satisfies Ratio ≤ T_Low, it is judged to be a normal image. In all other cases the image is considered suspect; the skin color detector makes no decision and passes it on to the posture detector.
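As a rough sketch of the exposure score and the soft decision, assuming the image has already been converted to a matrix of subspace labels (0-based here) and that a membership list `w` is available. The function names and the string return values are illustrative choices; the text gives no numeric values for T_Low or T_High, so any thresholds passed in are assumptions.

```python
def exposure_ratio(labels, w):
    """Ratio = sum over k of Hue[k] * w[k] for a 2-D list of subspace labels."""
    total = sum(len(row) for row in labels)
    if total == 0:
        return 0.0
    hist = [0] * len(w)
    for row in labels:
        for k in row:
            hist[k] += 1
    # hist[k] / total is the normalized histogram Hue[k]
    return sum((hist[k] / total) * w[k] for k in range(len(w)))


def soft_decision(ratio, t_low, t_high):
    """Three-way outcome of the soft-decision mode."""
    if ratio >= t_high:
        return "pornographic"
    if ratio <= t_low:
        return "normal"
    return "suspect"  # left to the next detector stage
```

The hard-decision mode is the special case in which both thresholds are equal to T_Valve.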
The posture detector performs posture analysis and recognition on the suspect network images that the skin color detector left undecided. First, 100,000 representative standard pornographic images are selected and analyzed to build a standard pornographic image feature library characterized by an accurate mathematical model of pornographic images. This library is the basis for the similarity-matching decision on whether a network image is pornographic. The core algorithm of the posture detector consists mainly of wavelet edge detection, image segmentation, morphological filtering, shape description and similarity-matching decision:
(1) Wavelet edge detection. The Daubechies-4 wavelet basis is used to perform a pyramid wavelet decomposition of the original suspect image from the network, giving the low-frequency subband LL and the three high-frequency subbands LH, HL and HH. The formula
$$E[i,j] = \left(E_1[i,j]^2 + E_2[i,j]^2 + E_3[i,j]^2\right)^{1/2}$$
combines the edges contained in the high-frequency subbands LH, HL and HH into a single edge map E[i, j], where $E_1[i,j]$ is the edge sub-image of subband LH, $E_2[i,j]$ is the edge sub-image of subband HL, and $E_3[i,j]$ is the edge sub-image of subband HH;
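The edge-fusion formula can be written out as a small helper. The sketch covers only the fusion step and assumes the three subband edge images are already available as equally sized 2-D lists; the wavelet decomposition itself is not shown, and the name `fuse_edges` is illustrative.

```python
import math


def fuse_edges(e1, e2, e3):
    """Combine LH, HL and HH edge sub-images into one edge map.

    Each output value is sqrt(E1^2 + E2^2 + E3^2) at that position.
    """
    return [
        [math.sqrt(a * a + b * b + c * c) for a, b, c in zip(r1, r2, r3)]
        for r1, r2, r3 in zip(e1, e2, e3)
    ]
```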
(2) Image segmentation. To describe the shape of the object in the image, the object must first be segmented from the image. The wavelet edge image is analyzed to extract the leftmost, rightmost, topmost and bottommost edge points, which determine the bounding rectangle of the object. The pixels of the original color image that lie outside the bounding rectangle are erased, and the pixels inside the rectangle are segmented according to the skin color model: any pixel p(x, y) is converted to the HSV space and quantized to obtain its label k ∈ {1, …, L}; if $w_k \ne 0$ the pixel is kept, otherwise it is erased. This gives the preliminarily segmented skin region image, where $w_k$ is the degree of membership of subspace $A_k$;
(3) Morphological filtering. Mathematical morphology is applied to the preliminarily segmented image to filter out noise pixels that do not belong to the object region;
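(4) Shape description. After the region image of the object is obtained, the second- and third-order moments of the image are used to derive the 7 invariant Hu moments of the image:
$$\phi_1 = \eta_{20} + \eta_{02}$$
$$\phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2$$
$$\phi_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2$$
$$\phi_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2$$
$$\phi_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{03} + \eta_{21})^2\right] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{03} + \eta_{21})^2\right]$$
$$\phi_6 = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})$$
$$\phi_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{03} + \eta_{21})^2\right] + (3\eta_{12} - \eta_{30})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{03} + \eta_{21})^2\right]$$
The 18 values of the second- to fifth-order normalized central moments of the image and the 7 values of the Hu moments are used together to describe the shape of the segmented skin region image;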
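A minimal pure-Python sketch of the 25-value shape descriptor follows, assuming the segmented region is given as a binary mask (a 2-D list of 0/1) and using the usual normalization $\eta_{pq} = \mu_{pq} / \mu_{00}^{1 + (p+q)/2}$. The function names and the ordering of the 18 central moments are illustrative choices, not taken from the text.

```python
def normalized_central_moments(mask, max_order=5):
    """Return {(p, q): eta_pq} for 2 <= p + q <= max_order on a binary mask."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    m00 = float(len(pts))
    if m00 == 0:
        raise ValueError("empty region")
    cx = sum(x for x, _ in pts) / m00
    cy = sum(y for _, y in pts) / m00
    eta = {}
    for order in range(2, max_order + 1):
        for p in range(order + 1):
            q = order - p
            mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pts)
            eta[(p, q)] = mu / m00 ** (1 + order / 2.0)
    return eta  # 3 + 4 + 5 + 6 = 18 values for orders 2..5


def hu_moments(eta):
    """Seven Hu invariants from second- and third-order normalized moments."""
    n20, n02, n11 = eta[(2, 0)], eta[(0, 2)], eta[(1, 1)]
    n30, n03, n21, n12 = eta[(3, 0)], eta[(0, 3)], eta[(2, 1)], eta[(1, 2)]
    a, b = n30 + n12, n21 + n03
    return [
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2,
        a ** 2 + b ** 2,
        (n30 - 3 * n12) * a * (a ** 2 - 3 * b ** 2)
        + (3 * n21 - n03) * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        (3 * n21 - n03) * a * (a ** 2 - 3 * b ** 2)
        + (3 * n12 - n30) * b * (3 * a ** 2 - b ** 2),
    ]


def shape_features(mask):
    """18 normalized central moments followed by the 7 Hu moments."""
    eta = normalized_central_moments(mask)
    return [eta[k] for k in sorted(eta)] + hu_moments(eta)
```

Rotating the mask by 90 degrees leaves the seven Hu values unchanged up to rounding, which is the invariance that makes them useful for shape matching.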
(5) Similarity-matching decision. A weighted Euclidean distance is used as the similarity measure. Let the weight vector be $W_j$ and the features of the image currently being judged be $\phi_j$, j = 1, 2, …, 25. Let the features in the standard pornographic image feature library be $\phi'_{ij}$, i = 1, 2, …, N; j = 1, 2, …, 25, where N is the number of features in the library. The similarity $d_i$ is defined as
$$d_i = 1 - \left(\sum_{j=1}^{25} W_j\,(\phi_j - \phi'_{ij})^2\right)^{1/2}$$
After the N feature similarities $d_i$ are obtained, a threshold T_shape is set. If a similarity falls within the interval [T_shape, 1], the features of the image currently being judged are considered similar to that feature in the standard pornographic image feature library, and the number Num of similar features is counted. If Num > T_num, where T_num is the threshold on how many of the N library features the image must resemble, the image is judged to be a pornographic image; otherwise it is judged to be a normal image.
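The matching rule can be sketched as below. The code works for feature vectors of any length (25 in the text); the function names are illustrative, and the weights and both thresholds must be supplied by the caller because the source gives no values for them.

```python
import math


def similarity(features, reference, weights):
    """d_i = 1 - sqrt(sum_j W_j * (phi_j - phi'_ij)^2)."""
    return 1.0 - math.sqrt(
        sum(w * (f - r) ** 2 for w, f, r in zip(weights, features, reference))
    )


def posture_decision(features, library, weights, t_shape, t_num):
    """Return True (pornographic) if more than t_num library entries are similar."""
    num = sum(
        1
        for ref in library
        if t_shape <= similarity(features, ref, weights) <= 1.0
    )
    return num > t_num
```

With non-negative weights the similarity never exceeds 1, so the upper bound of the interval only matters as a statement of range.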
In the described system for intercepting pornographic images and harmful information on the Internet, the image processing card contains a digital signal processing circuit and a PCI bus interface circuit. The core algorithm of the pornographic image detector is stored in memory connected to the central processing unit of the digital signal processing circuit through its external memory interface. The host interface of the central processing unit is connected to a programmable logic device, the programmable logic device is connected to the image processing card through a PCI driver circuit, and the card is connected to the computer server through a PCI slot. The central processing unit of the digital signal processing circuit is a TMS320C6711. A synchronous clock circuit and a power-on reset and watchdog circuit are connected to the corresponding ports of the central processing unit, and the external SDRAM and the FLASH memory are connected to the I/O ports of the central processing unit through the bus interface. The programmable logic device is any one of the PLX9054, the PLX9052, or the AMCC S5920 or S5933.
In the described system, the server side of the multifunctional management platform contains detection process management, an image detection API and an image processing card module, and it communicates and exchanges data with the client side through a communication module. The server handles communication with the clients and at the same time listens for their image detection connection requests. On receiving a connection request from a client, it starts a thread that calls the image detection API and the image processing card module to examine the network image, and it passes the detection result back to the client. The pornography decision is made at the application layer of the server, after multiple IP packets have been parsed and assembled into an image frame. To keep the connection from breaking, a store-and-forward method is used: toward the Web site, which is the sender, the intercepting system poses as the receiver; toward the client, the real recipient browsing the image, the intercepting system poses as the sender.
In the described system, the client side of the multifunctional management platform comprises:
A data filter interface, which provides the interfaces for obtaining and returning network data;
A protocol decoding module, which extracts the HTTP protocol from the data filter interface, processes the network information, and implements the decomposition and reassembly of application-layer and IP packet data;
A trusted URL detection module, which performs trusted-URL detection;
A bad URL detection module, which performs bad-URL detection;
A bad text filtering module, which performs keyword filtering of harmful information;
An image detection process module;
An automatic update module, which automatically updates the application and its data from the Internet.
On the client side the system compares the HTTP data it obtains with the URLs in the system's blacklist URL database to detect and intercept bad URLs. It then proceeds to the second-level keyword comparison: if a browsed keyword is in the pornographic and harmful information keyword library, the content is intercepted. Depending on the results, suspicious network image information is then examined on the server side.
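The client-side levels listed above can be sketched as a single decision function. This is a simplified illustration: matching by host name, case-insensitive substring matching for keywords, and the three return values are all assumptions, and `client_filter` is a hypothetical name.

```python
from urllib.parse import urlsplit


def client_filter(url, page_text, trusted_hosts, bad_hosts, bad_keywords):
    """Return "allow", "block" or "detect_images" for one request.

    Order follows the text: trusted URL check, bad URL check,
    keyword check, then hand-off to server-side image detection.
    """
    host = (urlsplit(url).hostname or "").lower()
    if host in trusted_hosts:
        return "allow"
    if host in bad_hosts:
        return "block"
    text = page_text.lower()
    if any(k.lower() in text for k in bad_keywords):
        return "block"
    return "detect_images"
```

Only requests that pass the first two levels reach the comparatively expensive image detector, which is the point of ordering the checks this way.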
The operating system is Windows 2000 or XP. The data filter interface is built on Winsock2, which provides the Windows Sockets application programming interface through which client applications access network services, together with the Winsock service provider interface (SPI) and ws2_32.dll; the SPI is implemented by transport service providers and name resolution service providers. The Winsock SPI is an open, standard set of interface functions, so a layer can be inserted between service providers to implement an SPI hook.
In the described system, after the data filter interface intercepts outgoing data, it first checks the validity of the data and judges whether the HTTP header is an image request header. If it is an image request, it judges whether the packet was sent by the browser. If so, it duplicates the socket and sends the "sent data" to the destination HTTP server, and at the same time sends an image detection request to the server. The server starts an image detection thread and calls the image processing card, which performs the computation and decision and returns the result to the client side of the multifunctional management platform. The browser's "sent data" are then handled according to the card's result: normal data are passed straight through, while a bad image is replaced with predefined image data.
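One simple way to decide whether an intercepted HTTP header is an image request is to inspect the request line. The sketch below assumes the decision is made from the file extension of the request target, using the formats named elsewhere in this document (JPEG, GIF, BMP, TIF); the source does not say how the check is actually done, and `is_image_request` is a hypothetical helper.

```python
from urllib.parse import urlsplit

# Extensions for the image formats mentioned in the document (assumed mapping).
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".gif", ".bmp", ".tif", ".tiff")


def is_image_request(raw_head: bytes) -> bool:
    """Return True if the raw HTTP request head looks like a GET for an image."""
    try:
        request_line = raw_head.split(b"\r\n", 1)[0].decode("ascii")
        method, target, version = request_line.split(" ")
    except (UnicodeDecodeError, ValueError):
        return False  # not a well-formed request line
    if method != "GET" or not version.startswith("HTTP/"):
        return False
    return urlsplit(target).path.lower().endswith(IMAGE_EXTENSIONS)
```

A check based on the response `Content-Type` would catch images served from extension-less URLs, at the cost of waiting for the server's reply.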
In the described system, the server side of the multifunctional management platform also contains other worker threads. These include a data analysis service, which analyzes the system log, records and analyzes bad URLs, and maintains the bad URL list; an automatic update service, which regularly checks whether a new version is available and updates automatically from the Internet; and a user application interface, through which the user adds trusted URLs and bad URLs and views the system log.
In the described system, the pornographic image detector also contains a detector for other bad images. Feature samples of other specific bad images are subjected to a PCA transform in the RGB color space to build a PCA color space, and a neural network is trained on the skin color samples in the PCA color space to obtain a stable feature detector. Suspect images that have passed the icon detector and the text detector are compared against this feature detector, and the bad network images detected are passed to the skin color detector, which decides on further processing.
Beneficial effects of the invention:
1. The intercepting system of the invention filters information content at the application layer and also filters URLs at the IP layer. It can intercept pornographic image information directly, add pornographic URLs to the blacklist automatically in real time, and update the URL database. It thus moves from the passive URL filtering of the past to active content filtering and improves the filtering effect significantly. It can filter images in JPEG, GIF, BMP, TIF and other formats. The overall success rate in identifying and filtering Internet pornographic images is greater than 99%, the misjudgment rate is lower than 5%, and the filtering effect for other harmful information is greater than 80%.
2. The system's unique multifunctional management platform integrates the complex relationships among the operating system, the browser, the Internet protocols and the image detector, solves the problems of process interaction between client and server and of the division of pornographic image detection tasks and data reassembly, and makes the system independent of the browser.
3. The invention is the first in China to apply "content-based image recognition and retrieval" technology to the detection and filtering of Internet pornographic images. It creates the core algorithm of a content-based bad image detection model which, combined with clustering and neural network methods, merges multi-level intelligent detection techniques such as icon detection, text detection and pornographic image detection, achieving fast computation and accurate, comprehensive decisions. The core algorithm is embedded in the high-speed parallel image processing card, which both accelerates the algorithm and protects it, improving decision precision and decision speed. The average recognition time for a pornographic image is less than 0.5 second, so browsing speed is not affected.
4. The system's core algorithm is embedded in a high-speed DSP hardware card installed in the computer. If the card is pulled out, the computer is disconnected from the Internet. The combination of software and hardware is tamper-resistant: unlike other software, it cannot easily be uninstalled, and it also solves the slow running speed of pure software systems.
5. The invention is the first in China to build a feature library of 100,000 standard pornographic images, which solves the problem of the decision standard.
IV. Description of drawings:
Fig. 1: Functional block diagram of the system for intercepting Internet pornographic images and harmful information
Fig. 2: Composition block diagram of the system for intercepting Internet pornographic images and harmful information
Fig. 3: Workflow block diagram of the system for intercepting Internet pornographic images and harmful information
Fig. 4: Overall structure and application flow of the pornographic image detector
Fig. 5: Schematic block diagram of the image processing card for high-speed parallel computation
Fig. 6: Composition and workflow block diagram of the multifunctional management platform
Fig. 7: Workflow block diagram of the server side of the multifunctional management platform
Fig. 8: Structural block diagram of the client side of the multifunctional management platform
Fig. 9: Schematic diagram of the client-side data interface of the multifunctional management platform
Fig. 10: Workflow block diagram of the image detection module of the multifunctional management platform
Fig. 11: Schematic diagram of the store-and-forward process that keeps the TCP connection from breaking
V. Embodiments:
Embodiment 1: Referring to Fig. 2, the system of the invention for intercepting pornographic images and harmful information consists of the multifunctional management platform, an image processing card (a DSP-based PCI card) for high-speed parallel computation in which the content-based image recognition core algorithm is embedded, and the standard pornographic image feature library. That is, the pornographic image detection subsystem is embedded in the memory on the image processing card, and the card is installed in the server computer. Fig. 5 is the schematic circuit block diagram of the image processing card. The card contains a high-speed parallel digital signal processing circuit and a PCI bus interface circuit. The core algorithm of the pornographic image detection software is stored in memory connected to the central processing unit of the digital signal processing circuit through its external memory interface. The host interface of the central processing unit is connected to a programmable logic device, the programmable logic device is connected to the image processing card through a PCI driver circuit, and the card is connected to the computer server through a PCI slot. The central processing unit of the digital signal processing circuit is a TMS320C6711. A synchronous clock circuit and a power-on reset and watchdog circuit are connected to the corresponding ports of the central processing unit, and the external SDRAM and the FLASH memory are connected to the I/O ports of the central processing unit through the bus interface. The programmable logic device is a PLX9054, a PLX9052, or an AMCC S5920 or S5933.
Referring to Fig. 1 and Fig. 3, the system of the invention for intercepting pornographic images and harmful information on the Internet contains an IP address filtering subsystem, which holds a dynamic blacklist URL database. The system first extracts the URL of the web page the client asks to visit and compares it with the URLs in the blacklist URL database; if the URL is in the blacklist database, it is intercepted. The system then proceeds to the second-level keyword comparison: if a browsed keyword is in the pornographic and harmful information keyword library, the content is intercepted. The system also contains a pornographic image detection subsystem. An accurate mathematical model of pornographic images is built through repeated decision feedback: 100,000 representative standard pornographic images are selected and analyzed, their features are extracted, and a feature library of standard pornographic images is built as the basis for judging whether a network image is pornographic. A similarity-matching decision model is also established. Network information that has passed the keyword comparison undergoes the third-level, content-based image decision: the features of the network image are extracted by analysis and compared with the feature images in the standard pornographic image database. If the image matches some image in the library, it is intercepted and its web page address is added to the blacklist URL database automatically; URLs in the blacklist database that have not been used for a period of time are deleted automatically. The HTTP protocol parsing module extracts the HTTP protocol from the data filter interface, processes the network information, and implements the decomposition and reassembly of application-layer and IP packet data. The data detection module comprises normal-URL detection, bad-URL detection, keyword detection and the image detection process: bad-text filtering, trusted-URL detection and bad-URL detection are performed on the client side, and the image detection process is called to examine network images on the server side. The automatic update module automatically updates the application and the blacklist HTTP data from the Internet.
Referring to Fig. 11: because the pornography judgment is made at the application layer of the server, after multiple IP packets have been unpacked and reassembled into an image frame, a store-and-forward method is adopted so that the TCP connection is not broken: the filtering system first holds the complete data message and then sends it to the client. Toward the Web sender, the filtering system poses as the receiver; toward the real receiver, it poses as the sender. Because the filtering system handles many complete messages from many clients over its connections, and no single device has this processing capacity, we calculate the timeout precisely and drop some messages of minor importance to the overall situation, which guarantees that the connection is not broken.
Fig. 4 shows the structure and flow block diagram of the pornographic image detector of the system of the present invention for intercepting pornographic images and harmful information on the Internet.
The icon detector in the figure discriminates network images mainly by their size and aspect ratio. Its main purpose is to detect images typical of advertising sites and, at the same time, to filter out images that are too small. Web pages usually contain large numbers of advertising logos and icons. Because most of these images appear as a very narrow strip, or are small overall, and their content generally poses no harm, this class of image is discriminated first in the image judgment process.
(1) Discrimination by image size: thresholds are set on the width and height of the image, and an image below the threshold is considered to belong to the icon class. (2) Discrimination by the ratio of image height to width: a threshold is set on the ratio of height to width, so that narrow horizontal or vertical strip images can be screened out. If
min(image_width, image_height) < T_size,
the image is judged to be a bad image; such images are mostly advertisements and the like.
if (image_width > image_height) Rs = image_width / image_height;
else Rs = image_height / image_width.
If (Rs > T_logo), the image is judged to be a normal image.
In practice, based on experience, we choose the thresholds T_size = 32 and T_logo = 10.
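The two icon-detector tests can be sketched as below, using the thresholds T_size = 32 and T_logo = 10 given above. The function name `icon_class` and the labels `"small"`, `"strip"` and `"candidate"` are hypothetical; the sketch only reports which rule fired and leaves the final verdict to the caller.

```python
T_SIZE = 32   # size threshold T_size from the text
T_LOGO = 10   # aspect-ratio threshold T_logo from the text


def icon_class(width, height, t_size=T_SIZE, t_logo=T_LOGO):
    """Classify an image by size and aspect ratio.

    Returns "small" when the shorter side is below t_size, "strip" when the
    long-to-short side ratio Rs exceeds t_logo, and "candidate" otherwise.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    if min(width, height) < t_size:
        return "small"
    rs = max(width, height) / min(width, height)  # same Rs as the if/else rule
    if rs > t_logo:
        return "strip"
    return "candidate"
```

Only images labelled `"candidate"` would need to reach the text, color and gesture detectors.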
The text detector performs text/image discrimination on network images, detecting images that consist largely of text, for example Internet faxes stored as images and text advertisements. Images composed of text differ greatly from general (continuous-tone) images, mainly in the composition of their colors. We discriminate text images by analyzing the gray-level histogram of the image.
The color detector analyzes the color composition of the network image and distinguishes normal images from pornographic images by means of a skin color model.
(1) Model color space: after experimental comparison of color spaces such as RGB, LUV, HSV, LHS and XYZ, the HSV color space was chosen for the skin color model. Images are usually represented in the RGB model; the HSV color space describes color in a way closer to human visual perception, and conversion from RGB space to HSV space and the subsequent quantization are relatively simple.
(2) Skin color model: the main task of the skin color model is to determine how human skin color is distributed in the chosen HSV color space. First the HSV color space is quantized and divided into L color subspaces. The distribution of skin color over these L subspaces is then determined by statistical analysis, and clustering yields the set A of skin color subspaces and the set W of membership degrees of A.
In the statistical analysis, first determine the total number skin_count of sample skin pixels and the frequency sub_count_i of sample skin pixels in each of the L subspaces, i = 1, …, L, which satisfy

$$\sum_{i=1}^{L} \mathrm{sub\_count}_i = \mathrm{skin\_count}$$

The normalized frequency is taken as the likelihood that skin color pixels fall in the subspace:

$$v_i = \mathrm{sub\_count}_i / \mathrm{skin\_count}$$

Set a threshold T_vi on the skin color distribution probability: if $v_i \ge T\_vi$ then $w_i = v_i$; otherwise $w_i = 0$. This finally yields

$$A = \{A_1, A_2, \ldots, A_L\}$$

$$W = \{w_1, w_2, \ldots, w_L\}$$

where $w_i$ is the membership degree of the subspace $A_i$, that is, the likelihood that a color in $A_i$ is a skin color, i = 1, 2, …, L, with the parameter L = 72. Clustering thus yields the set A of skin color subspaces and the membership set W of A.
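The construction of the membership set W can be sketched as follows. The text fixes L = 72 but does not say how the HSV space is divided; the split into 8 hue, 3 saturation and 3 value levels used in `quantize_hsv` is an assumption made only for illustration, as are the function names. Labels run from 0 to 71 here rather than from 1 to L.

```python
import colorsys

L = 72  # number of color subspaces given in the text


def quantize_hsv(r, g, b):
    """Map an 8-bit RGB pixel to a subspace label in 0..71.

    Assumption: 8 hue x 3 saturation x 3 value levels (the text only fixes L).
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hq = min(int(h * 8), 7)
    sq = min(int(s * 3), 2)
    vq = min(int(v * 3), 2)
    return hq * 9 + sq * 3 + vq


def train_skin_model(skin_pixels, t_v):
    """Return the membership list W from sample skin pixels (R, G, B)."""
    counts = [0] * L
    for r, g, b in skin_pixels:
        counts[quantize_hsv(r, g, b)] += 1          # sub_count_i
    skin_count = sum(counts)
    if skin_count == 0:
        raise ValueError("no skin samples given")
    weights = []
    for c in counts:
        v = c / skin_count                          # normalized frequency v_i
        weights.append(v if v >= t_v else 0.0)      # w_i = v_i or 0
    return weights
```

Subspaces whose frequency falls below the threshold receive membership 0, so rare colors among the samples do not count as skin.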
(3) image colour of skin degree of exposure:
For any image F(x, y), x = 1, …, M; y = 1, …, N, each pixel (x, y) is converted to the HSV color space and quantized to obtain the label of the color subspace of that pixel, so that the whole image F(x, y) becomes an M × N label matrix G(m, n), m = 1, …, M; n = 1, …, N. Compute the normalized histogram Hue[k], k = 1, …, L, of G(m, n); the skin color exposure degree of the browsed image is then given by

$$\mathrm{Ratio} = \sum_{k=1}^{L} \mathrm{Hue}[k] \times w_k$$
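As a worked illustration of the Ratio formula, the sketch below takes a label matrix G and a membership list W and returns the skin color exposure degree. The function name is hypothetical, and labels are 0-based here whereas the text numbers them from 1 to L.

```python
def skin_exposure_ratio(label_matrix, weights):
    """Ratio = sum over k of Hue[k] * w_k for a matrix of subspace labels."""
    total = sum(len(row) for row in label_matrix)
    if total == 0:
        raise ValueError("empty image")
    hist = [0.0] * len(weights)
    for row in label_matrix:
        for k in row:
            hist[k] += 1.0
    # Hue[k] is the normalized histogram: count of label k over pixel count.
    return sum((h / total) * w for h, w in zip(hist, weights))
```

For the 2 × 2 label matrix [[0, 0], [1, 2]] with W = [0.5, 0, 0.25], Hue = [0.5, 0.25, 0.25] and Ratio = 0.5 × 0.5 + 0.25 × 0 + 0.25 × 0.25 = 0.3125.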
(4) Image detection: the skin color exposure degree Ratio of the image is used to distinguish normal images from pornographic images, in either a hard decision mode or a soft decision mode. (1) Hard decision mode: fix a threshold T_Valve and compare Ratio with it: if an image satisfies Ratio ≥ T_Valve, it is judged to be a pornographic image; otherwise it is a normal image. The value of T_Valve is taken in [0.10, 0.15]. (2) Soft decision mode: fix a low threshold T_Low and a high threshold T_High and compare Ratio with both: if an image satisfies Ratio ≥ T_High, it is judged to be a pornographic image; if it satisfies Ratio ≤ T_Low, it is judged to be a normal image; in all other cases the image is considered suspect, this detector makes no decision, and the image is passed on to the gesture detector. The parameters can be adjusted according to the required detection rate; in general they are determined by experiment.
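The two decision modes can be sketched as follows. The default `t_valve=0.12` is merely one value inside the stated range [0.10, 0.15]; the text gives no values for T_Low and T_High, so they are left as required arguments. Function names and return labels are hypothetical.

```python
def hard_decision(ratio, t_valve=0.12):
    """Hard mode: a single threshold T_Valve taken from [0.10, 0.15]."""
    return "pornographic" if ratio >= t_valve else "normal"


def soft_decision(ratio, t_low, t_high):
    """Soft mode: two thresholds; the band between them is left undecided."""
    if t_low > t_high:
        raise ValueError("t_low must not exceed t_high")
    if ratio >= t_high:
        return "pornographic"
    if ratio <= t_low:
        return "normal"
    return "suspect"  # handed on to the gesture detector
```

The soft mode trades extra work in the gesture detector for fewer wrong decisions near the threshold.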
The gesture detector builds a posture feature library through training. It performs posture analysis and similarity matching on the suspect images passed on by the color detector in order to distinguish normal images from pornographic images. The core algorithm of the gesture detector consists mainly of wavelet edge detection, image segmentation, morphological filtering, shape description, and similarity matching decision, which are described in turn below:
(1) Wavelet Edge Detection
The traditional wavelet edge detection principle is as follows. Let $C_{j+1}$ denote the original image, and let $C_j$, $D_j^1$, $D_j^2$, $D_j^3$ be the four sub-images obtained from the original image by the wavelet transform. Let $(\{h_k\}_{k \in Z}, \{g_k\}_{k \in Z})$ and $(\{\tilde{h}_k\}_{k \in Z}, \{\tilde{g}_k\}_{k \in Z})$ be a pair of dual filters derived from a biorthogonal wavelet. The biorthogonal wavelet decomposition and reconstruction formulas of the image are then:

$$C_{j,m,n} = \sum_{k,l \in Z} C_{j+1,k,l}\, h_{k-2m}\, h_{l-2n}$$

$$D^1_{j,m,n} = \sum_{k,l \in Z} C_{j+1,k,l}\, h_{k-2m}\, g_{l-2n}$$

$$D^2_{j,m,n} = \sum_{k,l \in Z} C_{j+1,k,l}\, g_{k-2m}\, h_{l-2n}$$

$$D^3_{j,m,n} = \sum_{k,l \in Z} C_{j+1,k,l}\, g_{k-2m}\, g_{l-2n}$$

$$C_{j+1,m,n} = \sum_{k,l \in Z} C_{j,k,l}\, \tilde{h}_{m-2k}\, \tilde{h}_{n-2l} + \sum_{k,l \in Z} D^1_{j,k,l}\, \tilde{h}_{m-2k}\, \tilde{g}_{n-2l} + \sum_{k,l \in Z} D^2_{j,k,l}\, \tilde{g}_{m-2k}\, \tilde{h}_{n-2l} + \sum_{k,l \in Z} D^3_{j,k,l}\, \tilde{g}_{m-2k}\, \tilde{g}_{n-2l}$$

Detecting the edge points of the image then amounts to finding, within some neighborhood along the direction of the gradient vector, the points at which the gradient vector magnitude is maximal. The gradient vector magnitude is proportional to

$$D_j = \sqrt{|D_j^1|^2 + |D_j^2|^2}$$

and the direction of the gradient vector is $\mathrm{Arg}(D_j^1 + i D_j^2)$.
In application, if the gradient vector magnitude $D_j$ at a point (x, y) is a local maximum within a neighborhood along the gradient direction and also satisfies $D_j > T$, where T is a threshold, the point is considered an edge point.
We use the Daubechies-4 wavelet basis to perform a pyramid wavelet decomposition of the original image, obtaining the LL low-frequency subband and the three high-frequency subbands LH, HL and HH. The LH subband contains the edges in the horizontal direction of the original image, the HL subband the edges in the vertical direction, and the HH subband the edges in the diagonal direction. We detect these three types of edge separately and combine them into one edge map. For the LH subband, the points of maximal gradient vector magnitude are sought within a neighborhood in the horizontal direction, and the inverse wavelet transform is applied keeping only the wavelet coefficients of the LH subband, which gives the edge sub-image $E_1(i, j)$. The HL and HH subbands are processed similarly, giving the edge sub-images $E_2(i, j)$ and $E_3(i, j)$. The three types of edge are combined into one edge map E(i, j) by the following formula:

$$E[i,j] = \left(E_1[i,j]^2 + E_2[i,j]^2 + E_3[i,j]^2\right)^{1/2}$$
The image passed on by the skin color detector is a color image, whereas wavelet edge detection usually operates on a gray-level image. The color image can therefore first be converted to a gray-level image, or the red channel of the color image can be processed directly.
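A simplified sketch of one decomposition level and of the gradient magnitude used in the traditional principle is given below. Several assumptions are made for illustration only: "Daubechies-4" is taken to mean the four-tap Daubechies filter, the orthogonal case is used (so the dual filters coincide with h and g), the image is extended periodically and has even dimensions, and the function names are hypothetical. The sketch stops at the magnitude map; it does not perform the local-maximum search or the per-subband inverse transform described above.

```python
import math

_S3 = math.sqrt(3.0)
_N = 4.0 * math.sqrt(2.0)
# Four-tap Daubechies low-pass filter h and its quadrature mirror high-pass g.
H = [(1 + _S3) / _N, (3 + _S3) / _N, (3 - _S3) / _N, (1 - _S3) / _N]
G = [H[3], -H[2], H[1], -H[0]]


def _analyze(signal, filt):
    """out[m] = sum_k signal[k] * filt[k - 2m], with periodic extension."""
    n = len(signal)
    return [sum(filt[t] * signal[(2 * m + t) % n] for t in range(4))
            for m in range(n // 2)]


def _columns(rows, filt):
    """Apply _analyze along the first index (down each column)."""
    cols = [_analyze(list(col), filt) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def dwt2(image):
    """One level: returns (C, D1, D2, D3) as in the decomposition formulas."""
    lo = [_analyze(row, H) for row in image]   # h along the second index
    hi = [_analyze(row, G) for row in image]   # g along the second index
    return (_columns(lo, H), _columns(hi, H),  # C,  D1
            _columns(lo, G), _columns(hi, G))  # D2, D3


def gradient_magnitude(d1, d2):
    """D = sqrt(D1^2 + D2^2) at every position of the two detail subbands."""
    return [[math.hypot(a, b) for a, b in zip(r1, r2)]
            for r1, r2 in zip(d1, d2)]
```

On a constant image all three detail subbands vanish because the coefficients of g sum to zero; a vertical step produces a response in D1 only.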
(2) Image segmentation: to describe the shape of an object in an image, the object must first be segmented from the image. Here we segment the image by combining the wavelet edge image with the skin color model, mainly extracting the exposed skin regions of the human body.
First, the wavelet edge image is analyzed to extract the leftmost, rightmost, topmost and bottommost edge points, which determine the bounding rectangle of the object. Then the pixels of the original color image that lie outside the bounding rectangle are erased, and the pixels inside the rectangle are segmented according to the skin color model: any pixel p(x, y) is converted to HSV space and quantized to obtain its label k ∈ {1, …, L}. If $w_k \ne 0$, the pixel is kept; otherwise it is erased. This gives the preliminarily segmented skin region image.
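The segmentation step can be sketched as below, assuming the edge image is available as a binary matrix and the color image has already been quantized into subspace labels. Function names are hypothetical, and labels are 0-based.

```python
def bounding_rectangle(edge_map):
    """Return (top, bottom, left, right) of all edge points, or None."""
    points = [(r, c) for r, row in enumerate(edge_map)
              for c, v in enumerate(row) if v]
    if not points:
        return None
    rows = [p[0] for p in points]
    cols = [p[1] for p in points]
    return min(rows), max(rows), min(cols), max(cols)


def segment_skin(edge_map, labels, weights):
    """Binary mask: 1 for pixels inside the rectangle whose w_k is non-zero."""
    mask = [[0] * len(row) for row in labels]
    rect = bounding_rectangle(edge_map)
    if rect is None:
        return mask
    top, bottom, left, right = rect
    for r in range(top, bottom + 1):
        for c in range(left, right + 1):
            if weights[labels[r][c]] != 0:
                mask[r][c] = 1
    return mask
```

Pixels outside the rectangle stay 0, which corresponds to erasing them in the original color image.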
(3) morphologic filtering
The preliminarily segmented skin region image produced above often contains many granular and speckle noise regions of very small area. These must be filtered, removing the noise pixels that do not belong to the object region while effectively keeping the pixels that do. Common filtering methods include low-pass, high-pass and smoothing filters; here we use mathematical morphology to process the preliminarily segmented image.
Morphology defines four basic operations: dilation, erosion, opening and closing, where opening and closing are compositions of dilation and erosion. For an input image f and a structuring element b (both f and b are in essence images), the dilation of f by b is defined as

$$(f \oplus b)(s) = \max\{\, f(s-x) + b(x) \mid x \in D_b,\ (s-x) \in D_f \,\}$$

The erosion of f by b is defined as

$$(f \ominus b)(s) = \min\{\, f(s+x) - b(x) \mid x \in D_b,\ (s+x) \in D_f \,\}$$

The opening of f by b is defined as

$$f \circ b = (f \ominus b) \oplus b$$

The closing of f by b is defined as

$$f \bullet b = (f \oplus b) \ominus b$$

where $D_f$ and $D_b$ are the domains of f and b respectively, and s and x are vectors in the integer space $Z^2$. For dilation, it suffices that the structuring element b intersects the input image f in one pixel. Conversely, for erosion, the structuring element b must lie entirely within f. Geometrically, dilation enlarges the image shape and erosion shrinks it. Opening removes the convex regions of the image that do not fit the structuring element while keeping those that do. Closing fills the concave regions of the image that do not fit the structuring element while keeping those that do. We process the preliminarily segmented skin region image with the morphological erosion operator. The eroded image is first converted to a gray-level image, and region description is then carried out.
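The erosion and dilation definitions can be illustrated with a flat structuring element, that is, b(x) = 0 on its domain, which is an assumption of this sketch (the text does not specify b). The names `erode`, `dilate`, `opening` and `SQUARE` are hypothetical; the offset list must contain (0, 0) so that every pixel has at least one valid neighbor.

```python
SQUARE = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]  # 3 x 3 domain D_b


def erode(f, offsets=SQUARE):
    """(f erosion b)(s) = min f(s + x) over x in D_b with s + x inside D_f."""
    rows, cols = len(f), len(f[0])
    return [[min(f[r + dr][c + dc] for dr, dc in offsets
                 if 0 <= r + dr < rows and 0 <= c + dc < cols)
             for c in range(cols)] for r in range(rows)]


def dilate(f, offsets=SQUARE):
    """(f dilation b)(s) = max f(s - x) over x in D_b with s - x inside D_f."""
    rows, cols = len(f), len(f[0])
    return [[max(f[r - dr][c - dc] for dr, dc in offsets
                 if 0 <= r - dr < rows and 0 <= c - dc < cols)
             for c in range(cols)] for r in range(rows)]


def opening(f, offsets=SQUARE):
    """Opening: erosion followed by dilation."""
    return dilate(erode(f, offsets), offsets)
```

Erosion alone removes isolated specks but also shrinks the skin region; opening restores the region to roughly its original extent.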
(4) Shape description: once the region image of the object has been obtained, its shape can be described in many ways, such as numerical measures of region shape, Fourier descriptors, moment descriptors and topological descriptors. In 1962 M. K. Hu showed that seven moment invariants are independent of translation, rotation and scale changes of the image; they are called the Hu moments and are very useful for describing the shape of an image. The Hu moments are defined as follows:
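Moment invariants of a two-dimensional digital image: for a two-dimensional digital image f(x, y), the moments and central moments are given by:

Moment:

$$m_{pq} = \sum_x \sum_y x^p y^q f(x, y)$$

Central moment:

$$\mu_{pq} = \sum_x \sum_y (x - \bar{x})^p (y - \bar{y})^q f(x, y)$$

where $\bar{x} = m_{10}/m_{00}$ and $\bar{y} = m_{01}/m_{00}$. The moments and central moments of each order have their own physical meaning. To make the shape description independent of the size of the image, the normalized central moments are defined as:

$$\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\,r}}, \qquad r = \frac{p+q}{2} + 1$$

From the second- and third-order moments of the image, the seven moment invariants (Hu moments) of the image are obtained:

$$\phi_1 = \eta_{20} + \eta_{02}$$

$$\phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2$$

$$\phi_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2$$

$$\phi_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2$$

$$\phi_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{03} + \eta_{21})^2\right] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{03} + \eta_{21})^2\right]$$

$$\phi_6 = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})$$

$$\phi_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{03} + \eta_{21})^2\right] + (3\eta_{12} - \eta_{30})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{03} + \eta_{21})^2\right]$$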
We use a total of 25 feature values, namely the 18 normalized central moments of orders two through five and the 7 Hu moments, to describe the shape of a segmented skin region image.
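The 25-value shape descriptor can be sketched as follows. The ordering of the 18 normalized central moments inside the vector (by order n = 2..5, then by p = 0..n) is an assumption, since the text does not fix it, and the function name is hypothetical. The image is a matrix of non-negative values with x as the column index and y as the row index.

```python
def shape_features(img):
    """Return 18 normalized central moments (orders 2..5) plus 7 Hu moments."""
    pts = [(x, y, v) for y, row in enumerate(img)
           for x, v in enumerate(row) if v]
    m00 = sum(v for _, _, v in pts)
    if m00 == 0:
        raise ValueError("empty region")
    xb = sum(x * v for x, _, v in pts) / m00
    yb = sum(y * v for _, y, v in pts) / m00

    def eta(p, q):
        mu = sum((x - xb) ** p * (y - yb) ** q * v for x, y, v in pts)
        return mu / m00 ** ((p + q) / 2 + 1)    # mu_00 equals m_00

    central = [eta(p, n - p) for n in range(2, 6) for p in range(n + 1)]
    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03
    hu = [
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2,
        a ** 2 + b ** 2,
        (n30 - 3 * n12) * a * (a ** 2 - 3 * b ** 2)
        + (3 * n21 - n03) * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        (3 * n21 - n03) * a * (a ** 2 - 3 * b ** 2)
        + (3 * n12 - n30) * b * (3 * a ** 2 - b ** 2),
    ]
    return central + hu
```

Because all values are built from central moments, shifting the region inside the image leaves the descriptor unchanged.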
(5) Similarity matching: the image feature data we obtain usually exist in the form of vectors, so their similarity is computed from these vectors; the system uses a vector-based calculation.
We use the weighted Euclidean distance as the similarity measure. Let the weight vector be $W_j$ and the current image feature be $\phi_j$, j = 1, 2, …, 25; let the features in the feature library be $\phi'_{ij}$, i = 1, 2, …, N; j = 1, 2, …, 25, where N is the number of features in the feature library. The similarity $d_i$ is defined as

$$d_i = 1 - \left(\sum_{j=1}^{25} W_j (\phi_j - \phi'_{ij})^2\right)^{1/2}$$

After the N feature similarities $d_i$ have been obtained, set a threshold T_shape. If a feature similarity falls in the interval [T_shape, 1], the current image feature is considered similar to that feature in the library, and the number Num of similar features is counted. If Num satisfies Num > T_num, where T_num is the threshold on the number of library features similar to the current image feature, the image is judged to be a pornographic image. Otherwise the image is judged to be a normal image.
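The similarity measure and the counting rule can be sketched as follows. The function names are hypothetical, and the vectors may have any common length (25 in the text). No values for T_shape or T_num are given in the text, so both are required arguments.

```python
import math


def similarity(feature, reference, weights):
    """d_i = 1 - sqrt(sum_j W_j * (phi_j - phi'_ij)^2)."""
    return 1.0 - math.sqrt(sum(w * (a - b) ** 2
                               for w, a, b in zip(weights, feature, reference)))


def count_similar(feature, library, weights, t_shape):
    """Num: how many library features have similarity in [t_shape, 1]."""
    return sum(1 for ref in library
               if t_shape <= similarity(feature, ref, weights) <= 1.0)


def is_pornographic(feature, library, weights, t_shape, t_num):
    """Decision rule Num > T_num."""
    return count_similar(feature, library, weights, t_shape) > t_num
```

Since the weighted distance is non-negative, d_i never exceeds 1, and it becomes negative for features that lie far apart.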
Referring to Fig. 6 and Fig. 7, the multifunctional management platform of the harmful-information interception system of the present invention contains a server end and a client side, which communicate and exchange data through a communication module. The server end contains a listening module and an image detection module. The server first starts a listening process to listen for connection requests from clients; on receiving a connection request from a client, it starts an image detection thread that communicates with the client, calls the image detection API to examine the image, and passes the detection result back to the client. IP address filtering and keyword filtering are completed on the client side, and pornographic image detection is completed on the server end. The server end also contains other worker threads for analyzing the system log, processing the bad URL list, the automatic update service, and the user application interface.
In Fig. 8, the client side of the multifunctional management platform subsystem contains: a data filter interface, responsible for obtaining website data and for the return interface; a protocol parsing module, which extracts and processes the HTTP protocol and performs the decomposition and reassembly of application-layer and IP packet data; a data detection module, comprising normal URL detection, bad URL detection, keyword detection and image detection; an automatic update module, which updates the application program and data automatically from the Internet; and a server communication module, which implements communication and data exchange between the client and the server.
Fig. 9 is a block diagram of the client-side data filter interface of the multifunctional management platform subsystem. The data filter interface is the Windows socket application programming interface (API) provided by Winsock2 for client access to network services, including the Winsock service provider interface (SPI), implemented by transport service providers and name resolution service providers, and ws2_32.dll. Its filtering structure is: a core DLL, which installs or uninstalls the HOOK interface; a HOOK DLL, the data processing core; and the image detection interface and network communication interface. These three communicate with one another.
Fig. 10 shows the workflow of the image detection module of the multifunctional management subsystem. After the data filter interface module intercepts the data to be sent, it first checks the validity of the data and determines whether the HTTP header is an image request header. If it is an image request, it determines whether the packet was sent by the browser; if so, it duplicates the socket and sends the data to the destination HTTP server, and at the same time calls the pornographic image detection subsystem to examine the image. Alternatively, the server starts an image detection thread that calls the pornographic image detection DSP hardware card, on which the computation and decision are carried out, and the result is returned to the multifunctional management platform. The browser data are then handled according to the result from the image detection subsystem: normal data are passed through directly, and a bad image is replaced with predefined image data.
Embodiment two: referring to Fig. 1 to Fig. 10, this embodiment differs from embodiment one in the following. The system completes IP address filtering and keyword filtering on the client side and pornographic image detection on the server end. The system can set pornographic image levels according to the matching similarity ratio between a web page image and the feature images in the standard pornographic image feature library, so that adults can be allowed to browse network information that children cannot browse. The pornographic image detector does not contain icon filtering or text filtering.
Embodiment three: referring to Fig. 1 to Fig. 10, this embodiment differs from embodiment one in that the system contains a detector for other bad images. The feature samples of other specific bad images are subjected to the PCA transform in the RGB color space to establish a PCA color space, and a neural network is trained on the skin color samples in the PCA color space to obtain a stable feature detector. Suspect images that have passed the icon detector and the text detector are compared against this feature detector; bad network images are detected and passed to the color detection subsystem for further processing. The other-bad-image detector is similar in concept to the pornographic image detector, but the latter is aimed at recognizing features of the human body, whereas bad images lack common features, so only a train-and-compare approach can be used for the decision. In many cases the RGB color space is converted to the HSI space or the YCbCr space to separate luminance information from chrominance information, and the skin color model is built on the HS two-dimensional subspace of the HSI space or the CbCr two-dimensional subspace of the YCbCr space. However, when illumination changes sharply, the color distributions in the HS subspace and the CbCr subspace change considerably, which is very unfavorable for feature detection. This part therefore uses the PCA transform to establish a PCA color space and trains a neural network on the skin color samples in that space to obtain a stable feature detector.
Image feature detection based on a neural network and the PCA transform: the present invention proposes an image feature detection algorithm based on a neural network and the PCA transform, which examines the pixels of the input image one by one. In training mode, the feature samples in the training set are subjected to the PCA transform in RGB space, yielding a linear projection matrix. The second and third column vectors of the projection matrix form a new two-dimensional feature detection space and are called the axis vectors of the PCA feature space; these two vectors correspond to the directions in RGB space along which the feature pixels vary least. The feature samples of the original training set therefore yield, after the projection transform, new feature samples that cluster tightly in the PCA feature space. The feature samples in the PCA feature space are passed to the neural network for training until the network converges. In detection mode, each pixel of the image under test is projected by the matrix formed from the second and third column vectors obtained in training mode and then passed to the neural network for detection; when all pixels have been examined, the detection result for the whole image is obtained.
PCA feature space: a good feature detection space must satisfy the following conditions:
The color information of the image is concentrated in two components; these two components contain sufficiently little non-color information (such as luminance information); and the mean square deviation of these two components is sufficiently small.
The PCA transform is the optimal transform in the mean-square-error sense and is also commonly called the KL transform. In matrix form: $A = O^T B$, where A is the vector after the transform, B is the vector to be transformed, and O is the transformation matrix, which is closely related to B and is usually composed of the eigenvectors of the autocorrelation matrix of B.
We establish the PCA feature space through the PCA transform. Let X be the set of feature samples used for training in RGB space, $X = [X_1, X_2, \ldots, X_T]$, where T is the number of feature samples. First compute the mean vector of the feature samples, $M = \frac{1}{T}\sum_{i=1}^{T} X_i$. Removing the mean from the RGB feature samples gives the zero-mean sample set $\Phi = [\Phi_1, \Phi_2, \ldots, \Phi_T]$, $\Phi_i = X_i - M$, $1 \le i \le T$. Then compute the autocorrelation matrix $S_T = \sum_{i=1}^{T} \Phi_i \Phi_i^T$. Finally obtain the eigenvalues and eigenvectors of the autocorrelation matrix $S_T$: $S_T \Psi = \Psi \Lambda$, where $\Psi = [\Psi_1, \Psi_2, \Psi_3]$ denotes the eigenvectors of the matrix and $\Lambda$ is the diagonal matrix formed from the eigenvalues $\lambda_1, \lambda_2, \lambda_3$ ($\lambda_1 \ge \lambda_2 \ge \lambda_3$). The two vectors $\Psi_2, \Psi_3$ belonging to the eigenvalues $\lambda_2, \lambda_3$ correspond to the directions in RGB space along which the feature pixels vary least, so $\Psi_2, \Psi_3$ are taken as the two principal axes of the new color space and form the PCA feature space. $\Psi_2$ and $\Psi_3$ form the linear projection matrix, through which the feature samples in the original RGB space are transformed into the PCA feature space.
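The construction of the PCA feature space can be sketched in plain Python as below. The eigen-decomposition uses cyclic Jacobi rotations, which is a choice made for this illustration and not something the text prescribes; all function names are hypothetical. The mean is taken as the sample average, and pixels are projected directly onto the two axes without re-centering, which is also an assumption.

```python
import math


def jacobi_eigen(S, sweeps=50):
    """Eigenvalues (descending) and eigenvectors of a symmetric 3 x 3 matrix."""
    A = [row[:] for row in S]
    V = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for _ in range(sweeps):
        for p in range(2):
            for q in range(p + 1, 3):
                if abs(A[p][q]) < 1e-15:
                    continue
                theta = 0.5 * math.atan2(2 * A[p][q], A[q][q] - A[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for M in (A, V):                 # columns: M <- M J
                    for k in range(3):
                        mp, mq = M[k][p], M[k][q]
                        M[k][p], M[k][q] = c * mp - s * mq, s * mp + c * mq
                for k in range(3):               # rows of A: A <- J^T A
                    ap, aq = A[p][k], A[q][k]
                    A[p][k], A[q][k] = c * ap - s * aq, s * ap + c * aq
    order = sorted(range(3), key=lambda i: -A[i][i])
    vals = [A[i][i] for i in order]
    vecs = [[V[k][i] for k in range(3)] for i in order]
    return vals, vecs


def pca_feature_space(samples):
    """Return (M, [Psi_2, Psi_3], eigenvalues) for RGB samples."""
    T = len(samples)
    mean = [sum(x[d] for x in samples) / T for d in range(3)]
    phi = [[x[d] - mean[d] for d in range(3)] for x in samples]
    S = [[sum(p[i] * p[j] for p in phi) for j in range(3)] for i in range(3)]
    vals, vecs = jacobi_eigen(S)
    return mean, vecs[1:], vals


def project(pixel, axes):
    """Project an RGB pixel onto the two axes of the PCA feature space."""
    return [sum(a * v for a, v in zip(axis, pixel)) for axis in axes]
```

The two returned axes are orthonormal, so each projected pixel is a two-dimensional point that can be fed to the neural network.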
BP neural network: neural network methods offer good parallel processing performance and good generalization ability, and do not require the prior probability distribution of the data; they have therefore shown great advantages in the field of pattern recognition. The BP neural network is the most mature and most widely used of the feed-forward neural networks; here we use a BP neural network with one hidden layer. The network is divided into three layers: i denotes an input layer node, j a hidden layer node, and k an output layer node. The learning error function of the network is defined as

$$E = \frac{1}{2}\sum_k (d_k - y_k)^2$$

where $d_k$ is the desired output of the network and $y_k$ is the actual output of the network. The weight update formulas for each layer follow:

Between the hidden layer and the output layer:

$$w_{jk}(t+1) = w_{jk}(t) + \eta\,\delta_k\,y_j$$

$$\delta_k = y_k(1 - y_k)(d_k - y_k)$$

Between the input layer and the hidden layer:

$$w_{ij}(t+1) = w_{ij}(t) + \eta\,\delta_j\,y_i$$

$$\delta_j = y_j(1 - y_j)\sum_k \delta_k w_{jk}$$

In the formulas above, $\eta$ is the learning rate, and $\delta_k$, $\delta_j$ are the correction terms of the respective layers.
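One training step with the update formulas above can be sketched as follows. The sigmoid activation is inferred from the factor y(1 − y) in the correction terms; the absence of bias terms mirrors the formulas, and the function names and the choice of list-of-lists weight storage are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_ih, w_ho):
    """Return (hidden outputs y_j, network outputs y_k)."""
    hidden = [sigmoid(sum(x[i] * w_ih[i][j] for i in range(len(x))))
              for j in range(len(w_ho))]
    out = [sigmoid(sum(hidden[j] * w_ho[j][k] for j in range(len(hidden))))
           for k in range(len(w_ho[0]))]
    return hidden, out


def train_step(x, d, w_ih, w_ho, eta):
    """Update the weights in place; return the error E before the update."""
    hidden, out = forward(x, w_ih, w_ho)
    delta_k = [y * (1 - y) * (dk - y) for y, dk in zip(out, d)]
    # delta_j uses the old hidden-to-output weights, so compute it first.
    delta_j = [hidden[j] * (1 - hidden[j])
               * sum(delta_k[k] * w_ho[j][k] for k in range(len(out)))
               for j in range(len(hidden))]
    for j in range(len(hidden)):
        for k in range(len(out)):
            w_ho[j][k] += eta * delta_k[k] * hidden[j]
    for i in range(len(x)):
        for j in range(len(hidden)):
            w_ih[i][j] += eta * delta_j[j] * x[i]
    return 0.5 * sum((dk - y) ** 2 for dk, y in zip(d, out))
```

In the detector described above, x would presumably be the two-dimensional projection of a pixel in the PCA feature space, and training would repeat such steps over all samples until the error stops decreasing.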

Claims (9)

1. A system for intercepting pornographic images and harmful information on the Internet, containing IP address filtering and keyword filtering, characterized in that: the system contains a pornographic image detector and a multifunctional management platform, a standard pornographic image feature library, and a video processing board for parallel high-speed computation; the pornographic image detector establishes a mathematical-model core algorithm that performs feature analysis and feature extraction on skin color and posture and similarity matching decisions on the features; the core algorithm is embedded in said video processing board, and the video processing board is inserted in an expansion slot of the network server; the standard pornographic image feature library holds 100,000 standard image features as the basis for decisions; the multifunctional management platform contains a server end and a client side and is responsible for managing the communication and interaction between the server and multiple client processes; the multifunctional management platform completes the detection process for browsed pornographic images on the server end, and completes the parsing, restoration and reassembly of HTTP, as well as address filtering and keyword filtering, on the client side; the multifunctional management platform integrates the relations among the operating system, the browser, the HTTP protocol and the video processing board, so that the detection and filtering of pornographic images and harmful information are independent of the browser; the data sent and received are obtained through the SPI interface of Winsock2 or XP and are then analyzed to obtain the HTTP data; after protocol parsing of the HTTP data, trusted URL detection, bad URL detection and keyword filtering are carried out on the client side, and the detection result determines whether the pornographic image detector needs to be used; if so, the server is requested to call the video processing board to examine the bad image; the server collects the image detection results, intercepts pornographic and bad images, passes normal images back to the client, and automatically adds newly discovered bad addresses to the blacklist address database; addresses in the blacklist address database that go unused for a period of time are deleted automatically, so that the blacklist database is always changing dynamically.
2. The system for intercepting pornographic images and harmful information on the Internet according to claim 1, characterized in that: said pornographic image detector contains a skin color detector and a gesture detector.
The skin color detector analyzes the color composition of the network image and, following an experimental comparison of image color spaces, uses the HSV color space to build the skin color model and determines how human skin color is distributed in the chosen HSV color space.
First the pixels of the network image are converted to the HSV color space and quantized into L color subspaces; statistical analysis then determines the total number skin_count of sample skin color pixels and the frequency sub_count_i of sample skin color pixels in each of the L subspaces, i = 1, …, L, which satisfy

$$\sum_{i=1}^{L} \mathrm{sub\_count}_i = \mathrm{skin\_count}$$

The normalized frequency is taken as the likelihood that skin color pixels fall in the subspace:

$$v_i = \mathrm{sub\_count}_i / \mathrm{skin\_count}$$

Set a threshold T_vi on the skin color distribution probability: if $v_i \ge T\_vi$ then $w_i = v_i$; otherwise $w_i = 0$. This finally yields

$$A = \{A_1, A_2, \ldots, A_L\}$$

$$W = \{w_1, w_2, \ldots, w_L\}$$

where $w_i$ is the membership degree of the subspace $A_i$, that is, the likelihood that a color in $A_i$ is a skin color, i = 1, 2, …, L, with the parameter L = 72; clustering yields the set A of skin color subspaces and the membership set W of A. The skin color exposure degree of the image is then computed:
For any image F(x, y), x = 1, …, M; y = 1, …, N, each pixel (x, y) is converted to the HSV color space and quantized to obtain the label of the color subspace of that pixel, so that the whole image F(x, y) becomes an M × N label matrix G(m, n), m = 1, …, M; n = 1, …, N. Compute the normalized histogram Hue[k], k = 1, …, L, of G(m, n), and compute the skin color exposure degree of the image by the following formula, in which $w_k$ is the membership degree of the subspace $A_k$:

$$\mathrm{Ratio} = \sum_{k=1}^{L} \mathrm{Hue}[k] \times w_k$$
The skin color exposure degree Ratio of the image is then used to distinguish normal images from pornographic images, in either a hard decision mode or a soft decision mode. (1) Hard decision mode: fix a threshold T_Valve and compare Ratio with it: if an image satisfies Ratio ≥ T_Valve, it is judged to be a pornographic image; otherwise it is a normal image; the value of T_Valve is taken in [0.10, 0.15]. (2) Soft decision mode: fix a low threshold T_Low and a high threshold T_High and compare Ratio with both: if an image satisfies Ratio ≥ T_High, it is judged to be a pornographic image; if it satisfies Ratio ≤ T_Low, it is judged to be a normal image; in all other cases the image is considered suspect, the skin color detector makes no decision, and the image is passed on to the gesture detector;
Said gesture detector performs posture analysis and recognition on the network suspect images on which the skin color detector has made no decision. First, 100,000 representative standard pornographic images are selected and analyzed to establish a standard pornographic image feature library characterized by an accurate mathematical model of pornographic images, which serves as the basis for the similarity matching decision on whether a network image is pornographic. The core algorithm of the gesture detector consists mainly of wavelet edge detection, image segmentation, morphological filtering, shape description, and similarity matching decision:
(1) Wavelet edge detection: the Daubechies-4 wavelet basis is used to perform a pyramid wavelet decomposition of the original suspect image from the network, obtaining the LL low-frequency subband and the three high-frequency subbands LH, HL and HH; the following formula

$$E[i,j] = \left(E_1[i,j]^2 + E_2[i,j]^2 + E_3[i,j]^2\right)^{1/2}$$

combines the edges contained in the high-frequency subbands LH, HL and HH into one edge map E(i, j), where $E_1[i,j]$ is the edge sub-image of the high-frequency subband LH, $E_2[i,j]$ is the edge sub-image of the high-frequency subband HL, and $E_3[i,j]$ is the edge sub-image of the high-frequency subband HH;
(2) Image segmentation: first, the wavelet edge image is analyzed, the leftmost, rightmost, topmost and bottommost edge points are extracted, and the bounding rectangle of the object is determined; the pixels of the original color image that lie outside this bounding rectangle are then erased, and the pixels inside the rectangle are segmented according to the skin color model: any pixel p(x, y) is transformed into HSV space and quantized to obtain a quantization label k ∈ {1, ..., L}; if w_k ≠ 0 the pixel is kept, otherwise it is erased, giving a preliminarily segmented skin region image, where w_k denotes the degree of membership of the corresponding subspace A_k;
(3) Morphological filtering: mathematical morphology is applied to the preliminarily segmented image to filter out noise pixels that do not belong to the object region;
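(4) Shape description: after the region image of the object has been obtained, the second- and third-order moments of the image are used to derive the seven Hu invariant moments of the image:
$$\phi_1 = \eta_{20} + \eta_{02}$$
$$\phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2$$
$$\phi_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2$$
$$\phi_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2$$
$$\phi_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{03} + \eta_{21})^2\right] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{03} + \eta_{21})^2\right]$$
$$\phi_6 = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})$$
$$\phi_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{03} + \eta_{21})^2\right] + (3\eta_{12} - \eta_{30})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{03} + \eta_{21})^2\right]$$
The 18 values of the second- to fifth-order normalized central moments of the image and the 7 values of the Hu moments are used together to describe the shape feature of the segmented skin region image;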
(5) Similarity matching decision: a weighted Euclidean distance is used as the similarity measure; let the weight vector be W_j and the feature of the current image to be judged be φ_j, where j = 1, 2, ..., 25; let the features in the standard pornographic image feature library be φ'_ij, i = 1, 2, ..., N; j = 1, 2, ..., 25, where N is the number of features in the standard pornographic image feature library; the similarity d_i is defined as
$$d_i = 1 - \left( \sum_{j=1}^{25} W_j \left( \phi_j - \phi'_{ij} \right)^2 \right)^{1/2}$$
After the N feature similarities d_i have been obtained, a threshold T_shape is set; if a feature similarity falls within the threshold interval [T_shape, 1], the feature of the image being judged is considered similar to that feature in the standard pornographic image feature library, and the number Num of similar features is counted; if Num satisfies Num > T_num, where T_num is the threshold on the number of the N library features that are similar to the feature of the image being judged, the current image is judged to be pornographic; otherwise the current image is judged to be normal.
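The shape description and matching steps of the gesture detector can be illustrated with a reduced Python sketch. It is not the claimed 25-dimensional implementation: it computes only the first two Hu moments of a binary mask and applies the weighted-distance rule to that short feature vector. The function names, the list-of-rows mask format, and the string labels are assumptions made for this sketch.

```python
import math


def normalized_central_moment(mask, p, q):
    """Return eta_pq for a binary mask given as a list of rows of 0/1."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    m00 = len(points)
    if m00 == 0:
        return 0.0
    cx = sum(x for x, _ in points) / m00
    cy = sum(y for _, y in points) / m00
    mu_pq = sum((x - cx) ** p * (y - cy) ** q for x, y in points)
    # Standard normalization: eta_pq = mu_pq / mu_00^(1 + (p + q) / 2)
    return mu_pq / m00 ** (1 + (p + q) / 2)


def hu_first_two(mask):
    """First two Hu invariant moments (phi_1, phi_2) of the mask."""
    n20 = normalized_central_moment(mask, 2, 0)
    n02 = normalized_central_moment(mask, 0, 2)
    n11 = normalized_central_moment(mask, 1, 1)
    return [n20 + n02, (n20 - n02) ** 2 + 4 * n11 ** 2]


def similarity(features, reference, weights):
    """d_i = 1 - sqrt(sum_j W_j * (phi_j - phi'_ij)^2)."""
    squared = sum(w * (f - r) ** 2
                  for w, f, r in zip(weights, features, reference))
    return 1.0 - math.sqrt(squared)


def judge(features, library, weights, t_shape, t_num):
    """Count library entries with d_i in [t_shape, 1]; compare with t_num."""
    num = sum(1 for reference in library
              if t_shape <= similarity(features, reference, weights) <= 1.0)
    return "pornographic" if num > t_num else "normal"
```

Because the moments are taken about the centroid, shifting the mask inside the image leaves the feature vector unchanged, which is why such features suit a library lookup that should not depend on where the object appears.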
3. The system for intercepting pornographic images and unhealthy information on the internet according to claim 1, characterized in that: the video processing board contains a digital signal processing circuit and a PCI bus interface circuit; the core algorithm of the pornographic image detector is stored in a memory connected to the central processing unit of the digital signal processing circuit through an external memory interface; the host interface of that central processing unit is connected to a programmable logic device, and the programmable logic device connects the video processing board through a PCI driver circuit; the video processing board is connected to the computer server through a PCI slot; the central processing unit of the digital signal processing circuit is a TMS320C6711; the synchronous clock circuit and the power-on reset dongle circuit are connected to the corresponding ports of the central processing unit; the external SDRAM memory and the FLASH memory are connected to the I/O ports of the central processing unit through a bus interface; and the programmable logic device is any one of the PLX9054, the PLX9052, or the AMCC S5920 or S5933.
4. The system for intercepting pornographic images and unhealthy information on the internet according to claim 1, characterized in that: the server end of the multifunctional management platform contains detection process management, an image detection API and an image processing card module; the server end communicates and exchanges data with the client side through a communication module; the server handles communication with the client computers and at the same time listens for their image detection connection requests; on receiving a connection request from the client side, it starts a thread that calls the image detection API and the image processing card module to check the network image, and returns the detection result to the client computer; the pornography decision is made at the application layer of the server, after multiple IP packets have been parsed and assembled into an image frame; to keep the connection uninterrupted, a "store and forward" method is used: toward the Web site, which is the sender, the system for intercepting pornographic images and unhealthy information poses as the receiver; toward the client side, which is the real recipient browsing the image, the system poses as the sender.
5. The system for intercepting pornographic images and unhealthy information on the internet according to claim 4, characterized in that the client side of the multifunctional management platform comprises:
A data filter interface, which provides the interfaces for obtaining and returning network data;
A protocol parsing module, which extracts the HTTP protocol from the data filter interface, processes the network information, and performs the decomposition and reassembly of application-layer and IP packet data;
A trusted URL detection module, which performs trusted URL detection;
A bad URL detection module, which performs bad URL detection;
A bad text filtering module, which performs keyword filtering of unhealthy information;
An image detection process module;
An automatic update module, which automatically updates the application and data from the internet;
On the client side, the system compares the HTTP data obtained with the addresses in the system's blacklist URL database to detect and block bad URLs; it then proceeds to the second-level keyword comparison and blocks the request if a browsed keyword is in the pornographic and unhealthy information keyword library; depending on the result, suspicious network image information is then checked at the server end.
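The client-side screening order in this claim can be sketched as a small Python function. This is a simplified illustration under stated assumptions: the function name, the return labels, matching on the host name only, and placing the trusted URL check first are all choices made for this sketch and are not specified by the claim.

```python
from urllib.parse import urlsplit


def screen_request(url, page_text, trusted_hosts, bad_hosts, bad_keywords):
    """Return 'allow', 'block', or 'send_images_to_server'.

    trusted_hosts and bad_hosts are sets of lower-case host names;
    bad_keywords is an iterable of strings. All three stand in for the
    trusted URL list, the blacklist URL database, and the keyword
    library named in the claim.
    """
    host = (urlsplit(url).hostname or "").lower()

    # Assumed shortcut: a trusted host skips every later check.
    if host in trusted_hosts:
        return "allow"

    # First level: blacklist URL comparison.
    if host in bad_hosts:
        return "block"

    # Second level: keyword comparison on the browsed text.
    text = page_text.lower()
    if any(keyword.lower() in text for keyword in bad_keywords):
        return "block"

    # Nothing matched: images still need the server-side detectors.
    return "send_images_to_server"
```

Ordering the cheap set lookups before the text scan, and the text scan before image analysis, keeps the expensive server-side detection for the requests that the lists cannot settle.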
6. The system for intercepting pornographic images and unhealthy information on the internet according to claim 5, characterized in that: the operating system is Windows 2000 or Windows XP; the data filter interface consists of the Windows Sockets application programming interface provided by Winsock2 for client applications to access network services, together with the Winsock Service Provider Interface (SPI) and ws2_32.dll implemented by transport service providers and name resolution service providers; the Winsock SPI is an open, standard set of interface functions, so that a layer can be inserted between service providers to implement an SPI HOOK.
7. The system for intercepting pornographic images and unhealthy information on the internet according to claim 6, characterized in that: after the data filter interface intercepts the data being sent, it first checks the validity of the data and determines whether the HTTP header is an image request header; if it is an image request, it determines whether the packet was sent by a browser; if so, it duplicates the socket and sends the "sent data" to the destination HTTP server, and at the same time sends an image detection request to the server; the server starts an image detection thread and calls the video processing board, which performs the computation and decision; the result is returned to the client side of the multifunctional management platform, which handles the browser's "sent data" according to the result from the video processing board: normal data is passed through directly, while for a bad image the data is replaced with predefined image data.
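A minimal Python sketch of the two checks at either end of this flow is given below. The claim does not say how an image request header is recognized, so the test used here (a GET request whose path ends in a common image extension) is an assumption, as are the function names, the extension list, and the "normal"/"bad" verdict labels.

```python
# Assumed list of extensions treated as image requests.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".gif", ".png", ".bmp")


def is_image_request(raw_request):
    """Guess whether raw HTTP request bytes ask for an image."""
    try:
        request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii")
        method, target, version = request_line.split(" ")
    except (UnicodeDecodeError, ValueError):
        # Not a well-formed request line: treat as a non-image request.
        return False
    if method != "GET" or not version.startswith("HTTP/"):
        return False
    path = target.split("?", 1)[0].lower()
    return path.endswith(IMAGE_EXTENSIONS)


def filter_response(image_bytes, verdict, placeholder):
    """Pass normal data through; swap a bad image for predefined data."""
    return placeholder if verdict == "bad" else image_bytes
```

For example, `is_image_request(b"GET /pics/a.jpg HTTP/1.1\r\nHost: example.com\r\n\r\n")` is true under these assumptions, while a request for `/index.html` is not and would bypass the image detection request.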
8. The system for intercepting pornographic images and unhealthy information on the internet according to any one of claims 1 to 7, characterized in that: the server end of the multifunctional management platform contains other worker threads, which include: a data analysis service, which analyzes the system log, records and analyzes bad URLs, and processes the bad URL list; an automatic update service, which regularly checks whether a new version is available and updates automatically from the internet; and a user application interface, through which the user adds trusted URLs and bad URLs and views the system log.
9. The system for intercepting pornographic images and unhealthy information on the internet according to claim 8, characterized in that: the pornographic image detector further contains another bad-image detector: feature samples of other specific bad images undergo a PCA transform in RGB color space to establish a PCA color space; a neural network is trained on the skin color samples in the PCA color space to obtain a stable feature detector; suspect images that have passed through the icon detector and the text detector are compared with this feature detector, and the bad network images detected are passed to the skin color detector for further processing.
CNB2005100485766A 2005-11-18 2005-11-18 System for blocking off erotic images and unhealthy information in internet Active CN100361450C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2005100485766A CN100361450C (en) 2005-11-18 2005-11-18 System for blocking off erotic images and unhealthy information in internet

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2005100485766A CN100361450C (en) 2005-11-18 2005-11-18 System for blocking off erotic images and unhealthy information in internet

Publications (2)

Publication Number Publication Date
CN1761204A CN1761204A (en) 2006-04-19
CN100361450C true CN100361450C (en) 2008-01-09

Family

ID=36707157

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2005100485766A Active CN100361450C (en) 2005-11-18 2005-11-18 System for blocking off erotic images and unhealthy information in internet

Country Status (1)

Country Link
CN (1) CN100361450C (en)

Families Citing this family (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101114907B (en) * 2006-07-28 2010-08-11 腾讯科技(深圳)有限公司 Method and system for managing and filtering black list
CN101035128B (en) * 2007-04-18 2010-04-21 大连理工大学 Three-folded webpage text content recognition and filtering method based on the Chinese punctuation
CN101035281B (en) * 2007-04-19 2010-05-26 北京新岸线网络技术有限公司 Classified content auditing system
CN101441717B (en) * 2007-11-21 2010-12-08 中国科学院计算技术研究所 Method and system for detecting eroticism video
CN101729853B (en) * 2009-11-13 2011-05-18 深圳创维-Rgb电子有限公司 System, method, device and installation for filtering programs
CN101867932B (en) * 2010-05-21 2012-11-28 武汉虹旭信息技术有限责任公司 Harmful information filtration system based on mobile Internet and method thereof
CN102271122B (en) * 2010-06-03 2018-04-10 中兴通讯股份有限公司 A kind of rubbish contents applied to P2P networks determine method and its system
CN102340424B (en) * 2010-07-21 2013-12-04 中国移动通信集团山东有限公司 Bad message detection method and bad message detection device
CN102411577A (en) * 2010-09-25 2012-04-11 百度在线网络技术(北京)有限公司 Method and equipment for analyzing generalization keywords based on benchmark
CN101977235B (en) * 2010-11-03 2013-03-27 北京北信源软件股份有限公司 URL (Uniform Resource Locator) filtering method aiming at HTTPS (Hypertext Transport Protocol Server) encrypted website access
CN102045348B (en) * 2010-12-01 2013-08-07 北京迅捷英翔网络科技有限公司 Link stealing prevention system and method
CN102541899B (en) * 2010-12-23 2014-04-16 阿里巴巴集团控股有限公司 Information identification method and equipment
CN102170640A (en) * 2011-06-01 2011-08-31 南通海韵信息技术服务有限公司 Mode library-based smart mobile phone terminal adverse content website identifying method
CN102523311B (en) * 2011-11-25 2014-08-06 中国科学院计算机网络信息中心 Illegal domain name recognition method and device
CN102567101A (en) * 2012-01-12 2012-07-11 郑州金惠计算机系统工程有限公司 Multi-process management system for recognizing and monitoring pornographic images of WAP (wireless application protocol) mobile phone media
CN102663093B (en) * 2012-04-10 2014-07-09 中国科学院计算机网络信息中心 Method and device for detecting bad website
CN102647425A (en) * 2012-04-20 2012-08-22 汉柏科技有限公司 Method and system for realizing anti-trojan function of firewall
CN102799655B (en) * 2012-06-29 2018-03-27 北京奇虎科技有限公司 The treating method and apparatus of imperfect picture information in a kind of webpage
CN103218559A (en) * 2013-03-25 2013-07-24 苏州德鲁克供应链管理有限公司 Supply chain protection system
CN103473299B (en) * 2013-09-06 2017-02-08 北京锐安科技有限公司 Website bad likelihood obtaining method and device
CN104268284A (en) * 2014-10-21 2015-01-07 合肥星服信息科技有限责任公司 Web browse filtering softdog device special for juveniles
US9984068B2 (en) * 2015-09-18 2018-05-29 Mcafee, Llc Systems and methods for multilingual document filtering
CN105812921B (en) * 2016-04-26 2019-12-03 Tcl海外电子(惠州)有限公司 Control method and terminal that media information plays
CN107809343B (en) * 2016-09-09 2021-03-23 中国人民解放军信息工程大学 Network protocol identification method and device
CN106529567A (en) * 2016-09-30 2017-03-22 维沃移动通信有限公司 Method and device for filtering picture based on mobile terminal
CN107016356A (en) * 2017-03-21 2017-08-04 乐蜜科技有限公司 Certain content recognition methods, device and electronic equipment
CN107256250A (en) * 2017-06-08 2017-10-17 福建中金在线信息科技有限公司 A kind of image processing method, device, server and storage medium
CN107330453B (en) * 2017-06-19 2020-07-07 中国传媒大学 Pornographic image identification method based on step-by-step identification and fusion key part detection
CN107566903B (en) * 2017-09-11 2020-07-03 北京匠数科技有限公司 Video filtering device and method and video display system
CN107895119A (en) * 2017-12-28 2018-04-10 北京奇虎科技有限公司 Program installation packet inspection method, device and electronic equipment
CN110020259A (en) * 2017-12-30 2019-07-16 惠州学院 A kind of method and its system identifying harmful picture based on User IP
CN110020252B (en) * 2017-12-30 2022-04-22 惠州学院 Method and system for identifying harmful video based on trailer content
CN110020257A (en) * 2017-12-30 2019-07-16 惠州学院 The method and system of the harmful video of identification based on User ID and video copy
CN109002842A (en) * 2018-06-27 2018-12-14 北京字节跳动网络技术有限公司 Image-recognizing method and device
CN108900910B (en) * 2018-07-16 2020-11-13 上海艾策通讯科技股份有限公司 Method and system for monitoring validity of IPTV service content
CN109284676A (en) * 2018-08-15 2019-01-29 孙燕 Human body package degree discrimination method
CN111259304A (en) * 2020-02-17 2020-06-09 猎港信息技术(上海)有限公司 Forum monitoring system based on image recognition
CN113220927A (en) * 2021-05-08 2021-08-06 百度在线网络技术(北京)有限公司 Image detection method, device, equipment and storage medium
CN114513764A (en) * 2021-12-06 2022-05-17 成都中星世通电子科技有限公司 Multi-node data storage and interaction method
CN114415792A (en) * 2021-12-28 2022-04-29 中科信息安全共性技术国家工程研究中心有限公司 Network illegal information filtering system based on content understanding and judging technology
CN117938545B (en) * 2024-03-21 2024-06-11 中国信息通信研究院 Bad information sample amplification method and system based on encrypted traffic

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1109370A2 (en) * 1999-12-04 2001-06-20 Nutzwerk Informationsgesellschaft mbH Device and method for individual filtering of information sent over a network
CN1396533A (en) * 2001-07-16 2003-02-12 友立资讯股份有限公司 Sexy file judging system and method
CN1400776A (en) * 2001-07-31 2003-03-05 友立资讯股份有限公司 Filtration system of pornographic film and its method

Also Published As

Publication number Publication date
CN1761204A (en) 2006-04-19

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Free format text: FORMER OWNER: ZHAO HUIQIN ZHOU HUI TANG HUAILI CAO WEN PENG TIANQIANG LI BICHENG ZHANG CHENMIN

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20110705

Address after: 450001 No. 5 Sakura street, hi tech Development Zone, Henan, Zhengzhou

Patentee after: Zhengzhou Jinhui Computer System Engineering Co., Ltd.

Address before: 450001 No. 5 Sakura street, hi tech Development Zone, Henan, Zhengzhou

Co-patentee before: Zhao Huiqin

Patentee before: Zhengzhou Jinhui Computer System Engineering Co., Ltd.

Co-patentee before: Zhou Long

Co-patentee before: Tang Huaili

Co-patentee before: Cao Wen

Co-patentee before: Peng Tianqiang

Co-patentee before: Li Bicheng

Co-patentee before: Zhang Chenmin