AU2895900A - Apparatus and system for classifying and control access to information - Google Patents

Apparatus and system for classifying and control access to information

Info

Publication number
AU2895900A
AU2895900A AU28959/00A AU2895900A AU2895900A AU 2895900 A AU2895900 A AU 2895900A AU 28959/00 A AU28959/00 A AU 28959/00A AU 2895900 A AU2895900 A AU 2895900A AU 2895900 A AU2895900 A AU 2895900A
Authority
AU
Australia
Prior art keywords
param
submodel
porn
exp
model exp
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
AU28959/00A
Other versions
AU761017B2 (en)
Inventor
Alan Bradley Jones
David Ross Taylor
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Total Defense LLC
Original Assignee
TEL NET MEDIA Pty Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AUPP9048A
Application filed by TEL NET MEDIA Pty Ltd
Priority to AU28959/00A
Publication of AU2895900A
Application granted
Publication of AU761017B2
Assigned to INTERNET SHERIFF TECHNOLOGY LIMITED (Request to Amend Deed and Register). Assignors: TEL.NET MEDIA PTY LTD
Assigned to Total Defense, Inc. (Alteration of Name(s) in Register under S187). Assignors: INTERNET SHERIFF TECHNOLOGY LIMITED
Anticipated expiration
Ceased

Description

WO 00/52598 PCT/AU00/00158
APPARATUS AND SYSTEM FOR CLASSIFYING AND CONTROL ACCESS TO INFORMATION
TECHNICAL FIELD OF THE INVENTION
THIS INVENTION relates to an apparatus and system for classifying information on a communications network, and in particular but not limited to an apparatus and system for classifying content servers and for selectively controlling access to classified content servers.
BACKGROUND OF THE INVENTION
The phenomenal growth of information technology has allowed many people to have access to diverse information on communications networks. The Internet in particular allows fetching of information from any cooperating computers or content servers located in different parts of the world by simply clicking references to the information. As the number of accessible computers or content servers and the amount of information over the communications network grow daily, it becomes increasingly difficult to classify them manually.
Known systems for controlling the types of information accessible on a network rely on comparing a requested destination with those on pre-determined Access Control Lists (ACL) or on word matching to determine whether to allow or deny access. This approach can be applied at the client node prior to requesting the information, or on any suitably intelligent network device capable of intercepting the request or subsequent reply prior to it reaching the requester. For example, in the case of an Internet browser running on a PC or work station, a request is made for an Internet resource such as a web site. A software program for monitoring such requests on the PC can be configured to scan a pre-determined list of site addresses for a match. If a match is found, access to the site may be denied and a suitable message is then displayed informing the user that access is denied.
Alternatively, the request may be allowed to proceed, but as data are received from the site they are scanned for a match with one or more sets of pre-determined words, word fragments or phrases. If a match is found the site is not displayed on the computer; instead a suitable message is shown. Typically, this type of control software is installed on a PC or work station which does not have particularly strict access privileges. The control software can be easily removed, disabled or otherwise circumvented, thereby defeating the control system.
A network device capable of intercepting the request or the reply to a request, such as a proxy server, may perform similar actions using the same methods of web site matching. Such a device is usually maintained by a network administrator with strict access rights. Also, a network requiring clients to connect through the network device in order to access the network can have its content control enforced. This allows content control of multiple clients from one central point.
While these known systems do provide some access control abilities, there are several disadvantages. A system based on word or phrase matching can only match text and it therefore would allow access to undesired information comprising graphic images. Also, a single word may match a broad range of sites with quite different classes of information. As an example, when the word "sex" is used to match pornographic sites the system would also block access to other sites providing non-offensive information such as articles on biology.
A system based on an access control list of prohibited sites is much more selective. Access can only be denied when attempting to access the sites which are included in the lists. While a suitably large list could bar access to a great deal of undesirable information, it is difficult to keep up to date due to the rapid increase in the number of new sites and the removal of sites.
The above systems also do not lend themselves to adaptation to other network protocols and services such as interactive chat, streaming video, email or encrypted data streams. Extending to different languages also poses a problem for globalisation of these systems.
OBJECT OF THE INVENTION
An object of the present invention is to alleviate or to reduce to a certain degree one or more of the above disadvantages.
Another object of the present invention is to provide an apparatus/system for classifying user profiles.
SUMMARY OF THE INVENTION
In one aspect therefor the present invention resides in an apparatus for classifying information on a communications network. The apparatus comprises means for obtaining one or more transmission characteristics of information on a path of said communications network, and analysing means for predicting a classification of said information based on said one or more transmission characteristics.
In a second aspect therefor the present invention resides in an apparatus for classifying content servers which are accessible on a communications network. The apparatus comprises means for obtaining one or more transmission characteristics of information provided by any of said content servers on a path of said communications network, and analysing means for predicting a classification of said information based on said one or more transmission characteristics.
In a third aspect therefor the present invention resides in a computer program for classifying information which is accessible on a communications network. The program comprises means for obtaining one or more transmission characteristics of information on a path of said communications network, and analysing means for predicting a classification of said information based on said one or more transmission characteristics.
In a fourth aspect therefor the present invention resides in a computer program for classifying content servers which are accessible on a communications network. The apparatus comprises means for obtaining one or more transmission characteristics of information provided by any of said content servers on a path of said communications network, analysing means for predicting a classification of said information based on said one or more transmission characteristics.
In a fifth aspect therefor the present invention resides in an apparatus/computer program for classifying user profiles of users accessing information or content servers on a communications network. The apparatus/computer program comprises means for obtaining one or more transmission characteristics of information or information provided by any one of said content servers on a path of said communications network, analysing means for predicting a classification of said information or said one content server based on said one or more transmission characteristics, and means for classifying user profile in accordance with the predicted classification.
The above invention may also comprise means for storing said one or more transmission characteristics.
Typically said one or more transmission characteristics include any one or more of network protocol, date and time stamps, size of transmission activities (text and image), content type of transmission activities, pattern seen within the content of the transmission and any other characteristic that can be employed for predicting classifications.
In preference said one or more transmission characteristics are obtained from network packets or fragments thereof.
It is also preferred that the analysing means includes profiling means for providing profiles of interactions based on said one or more transmission characteristics.
Typically said profiling means is arranged to process said one or more transmission characteristics for providing any one or more of frequency of interaction, duration of interaction, duration of absence of interaction, patterns of transmission, average number of http links within an object of related sites, average number of like sites visited within a time frame, and statistics from said other characteristics, for forming interaction profiles. The analysing means can then use the profiles for predicting classifications.
The invention may have a knowledge base of predetermined profiles, and the analysing means is adapted to predict a classification based on a comparison between the profile of information to be classified and predetermined profiles.
Advantageously the invention may have means for updating the knowledge base so that the classification prediction may be enhanced following classifications.
In order that the present invention can be more readily understood and be put into practical effect reference will now be made to the accompanying drawings which illustrate one preferred embodiment of the invention and wherein:
BRIEF DESCRIPTION OF THE DRAWING
Figure 1 is a schematic diagram of the apparatus according to the invention;
Figure 2 is a table of selected data of captured packets of a search engine using the apparatus shown in Figure 1;
Figure 3 is a partial table of selected data of captured packets of a news web site using the apparatus shown in Figure 1;
Figure 4 is a table of selected data of captured packets of an entertainment web site using the apparatus shown in Figure 1;
Figure 5 is a table of selected data of captured packets of the web site of an e-commerce merchant using the apparatus shown in Figure 1;
Figure 6 is a table of selected data of captured packets of the web site of another e-commerce merchant using the apparatus shown in Figure 1;
Figure 7 is a table of selected data of captured packets of a pornography web site using the apparatus shown in Figure 1;
Figure 8 is a table of selected data of captured packets of another pornography web site using the apparatus shown in Figure 1;
Figure 9 is a table of model N1 results using the apparatus shown in Figure 1;
Figure 10 is a table of model N2 results using the apparatus shown in Figure 1;
Figure 11 is a table of model N3 results using the apparatus shown in Figure 1; and
Figure 12 is a table of classification prediction confidence levels using the apparatus shown in Figure 1.
DESCRIPTION OF THE PREFERRED EMBODIMENT
Referring initially to Figure 1 there is shown an apparatus 10 for classifying media or information flowing through a path of a communications network which in this case is the Internet.
As can be seen, network traffic passing through the apparatus 10 is captured and analysed for providing statistics relating to interactions between two or more terminals (not shown). The captured traffic is first checked against a list of predetermined classifications to determine if it is known or unknown.
When the captured traffic is of an unknown classification, various models (to be described more fully below) are applied to the data set in the captured traffic in order to predict the content classification. The models use parameters derived from a knowledge base of previously classified data sets, and fitness with these parameters, to determine the classification of the content of the newly captured traffic. Thus, the web site sending the captured traffic is now classified and is added to the list of known classifications.
It should be noted that the embodiment of the apparatus 10 as described herein is for an analysis of transmission traffic using the HTTP protocol. The apparatus 10 according to the present invention is not restricted to HTTP, and is easily adaptable to analyse data carried within any networks using any known protocol. Examples of the protocols include FTP, SMTP, NNTP, etc.
Following classification the captured data set is stored in the knowledge base. As the knowledge base expands, more data are used for the model parameters. This refines the apparatus and results in improved predictive performance.
The sites that are deemed to include undesirable information are added to Access Control Lists (ACLs). The ACLs are used to control the flow of content information between terminals. For example, undesired content information can be prevented from travelling further through the network by simply not forwarding it, or by replacing it, or by intercepting the request for such content information and modifying its destination.
Classification of traffic from content servers is relatively static. On the other hand, user terminals that interact with these content servers are variable and their classifications are considered transient classifications.
Whereas classifications of content servers form a model of the style of content residing on the server, transient classifications form a model of the style of content being viewed by a user terminal, or content consumer. This in effect forms a behaviour profile of such a consumer. This profile can be used to tailor the content information to suit the consumer.
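The flow described above (check the list of known classifications, apply the models to unknown traffic, then record the result and update the ACLs) can be sketched roughly as follows. This is a minimal illustration in Python and not the apparatus itself: the `Classifier` class, its field names and the `predict` callback are hypothetical names introduced here, and the actual models are left abstract.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class Classifier:
    """Hypothetical stand-in for the classify-then-remember flow."""
    predict: Callable[[dict], str]   # assumed: applies the models to a captured profile
    undesirable: Set[str]            # assumed: classifications that should be blocked
    known: Dict[str, str] = field(default_factory=dict)   # site -> classification
    acl: Set[str] = field(default_factory=set)            # sites whose content is blocked
    knowledge_base: List[Tuple[dict, str]] = field(default_factory=list)

    def classify(self, site: str, profile: dict) -> str:
        # Known sites are answered from the list of known classifications.
        if site in self.known:
            return self.known[site]
        # Unknown sites are run through the models.
        label = self.predict(profile)
        # The result is remembered and the data set joins the knowledge base.
        self.known[site] = label
        self.knowledge_base.append((profile, label))
        if label in self.undesirable:
            self.acl.add(site)
        return label

    def allow(self, site: str) -> bool:
        # Content from sites on the ACL is not forwarded.
        return site not in self.acl
```

A caller would construct the object with its own `predict` function and set of undesirable labels, call `classify` for each captured interaction, and consult `allow` before forwarding content.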
As mentioned earlier the apparatus 10 captures a set of observed data relating to a network interaction event, and provides a set of results indicating the classification of a resource or personality residing at each network node involved in the interaction. This is accomplished by applying various statistical models to a profile, and testing this against results obtained from profiles of known classifications.
In this example of the invention this process is represented by the following formulas:
x is an unknown profile to be classified;
Profiles p1, p2, p3 ... pn are of known classifications;
Models M1, M2, M3 ... Mn are available to operate on these profiles; and
C1, C2, C3 ... Cn are profile classifications.
The population of a profile of classification C1 may be defined by the population of M1(p). M1(x) may be tested against the true population using any of the standard statistical hypothesis methods.
A pre-determined set of media terminals of a classification are modelled by various models M1, M2 .. Mn. Each model consists of an approach and a set of parameters (e.g. linear regression as the approach, with gradient and point of interception as the parameters), so that for a single classification M1(p1, p2 .. pn), M2(q1, q2 .. qn) .. Mn(r1, r2 .. rn) are used to model the population from the classification. The models may be based on mathematical structures, or arbitrary rules.
The models are continually refined as more network traffic passes through the apparatus 10, thereby increasing the population space from which the classifications are computed.
A terminal may be permanently or transitionally defined in relation to a classification. A transitionally defined terminal may move between classifications based on the fitness of the observed traffic to the models of the various classifications.
Figures 2 to 8 are tables of selected data of traffic for testing the profile of data during a network interaction with a content server to determine if it contains media content of a pornographic nature. The assumption is made that profiles for content servers contain a variable which is the average size of graphical images served.
A normal distribution or similar non-deterministic probability distribution is then used to test the hypothesis that the profile belongs to a population classified as pornographic. In this example, the population of the classification may be defined by the population of N(a,b) where N is the image size and a and b are the mean and variance respectively, based on a normal distribution. The average and standard deviation derived from the observed samples are tested against the true population using standard statistical hypothesis methods.
In some cases this approach may be broadened to encompass analysis of variance methods with multiple dependent variables, to model the characteristics of a site. Traditional ANOVA or regressive techniques may be applied to model the media content.
A variety of traditional deterministic and non-deterministic models may be applied to determine the hypothesis of profile classification. These may be changed or upgraded continually depending on the level of predictive power found. The functionality of models used is not limited to, but can include, simple rules of thumb, deterministic and non-deterministic probability models, or arbitrary calculations.
The choice of model is primarily dictated by the predictive power of that model against the population in question.
Figures 2 through 8 show examples of a basic data set that can be gathered by observing network traffic of a typical interaction between a client browser and a web server.
Figures 9 to 11 illustrate a simple classification model.
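As a rough illustration of testing observed image sizes against a population N(a,b), the sketch below uses a two-sided z-test on the sample mean. The choice of a z-test is an assumption made here for illustration; the text only says that standard statistical hypothesis methods are used, and it does not say how a test result is converted into the confidence levels reported in the figures. The function name and all numeric inputs are hypothetical.

```python
import math
from statistics import mean


def z_test_p_value(sample, pop_mean, pop_variance):
    """Two-sided p-value for the hypothesis that `sample` was drawn from a
    normal population with mean `pop_mean` (a) and variance `pop_variance` (b).

    Assumption: a simple z-test on the sample mean; other standard tests
    could be substituted.
    """
    n = len(sample)
    if n == 0 or pop_variance <= 0:
        raise ValueError("need a non-empty sample and a positive variance")
    standard_error = math.sqrt(pop_variance / n)
    z = (mean(sample) - pop_mean) / standard_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))
```

A large p-value means the observed mean image size is consistent with the modelled population; a small one means the profile is unlikely to belong to it.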
This model looks at the size, content and relationships of objects being transmitted by a content server. The outcome of this model is to determine if the media being transmitted has pornographic content.
Classification: pornographic
Standard Model:
N1(a,b), where N1 is the image size, and a and b are the mean and variance respectively, based on a normal distribution.
N2(c,d), where N2 is the ratio of text to graphics, and c and d are the total size of the text and graphic objects respectively.
N3(e), where N3 is the count of word patterns matched from a list of pre-determined words, and e is the text of an object.
Observed samples are given in the tables shown in Figures 2 to 8.
For model N1 shown in Figure 9, the normal distribution hypothesis test is applied to the observed samples to derive the results. The result shows confidence to the 93% and 87% level for sites 6 and 7 respectively, that the sites belong to a population of pornographic sites. The other samples give much lower confidence levels.
For model N2 shown in Figure 10, a simple rule is used to test if the ratio is below a pre-determined threshold. The results show that sites 2, 4, 6 and 7 are within the threshold rating.
For model N3 shown in Figure 11, a simple rule is used to test if the number of words matching a list of patterns exceeds a pre-determined threshold. The results show that sites 6 and 7 exceed the threshold.
A weighting formula is then applied to derive a final result as shown in Figure 12.
Therefore, using this example model, the apparatus 10 would predict that sites 6 and 7 are probably serving media with pornographic content, whereas sites 1 through 5 probably are not.
The attached appendix shows an example of the set of rules, constants and formulas which determine a confidence prediction based on logistic regression.
The rules are defined using "Submodel" and "Model" components to define individual data points, and aggregated data points. These are then referred to in the "ProbabilityAnalyser" equations which use standard predictive formulas.
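To make the logistic regression step concrete, the sketch below combines ratio-style data points into a single confidence value using the standard logistic function. Only the constant -3.9869 (PornLRConstant) is taken from the appendix. The coefficient values, the feature values, the function names and the zero-denominator behaviour of `ratio` are assumptions made here for illustration, since this excerpt does not give them.

```python
import math

PORN_LR_CONSTANT = -3.9869  # PornLRConstant, as listed in the appendix


def ratio(count, total):
    """RATIO(a, b) as assumed here: a / b, taken as 0.0 when b is 0."""
    return count / total if total else 0.0


def logistic_confidence(constant, coefficients, features):
    """Standard logistic regression: 1 / (1 + exp(-(constant + sum(b_i * x_i))))."""
    z = constant + sum(coefficients[name] * features[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficient and observations, for illustration only.
example_coefficients = {"PornTextWordRatioExtraHard": 25.0}
example_features = {"PornTextWordRatioExtraHard": ratio(12, 400)}
example_confidence = logistic_confidence(
    PORN_LR_CONSTANT, example_coefficients, example_features
)
```

With every feature at zero, the prediction is governed by the constant alone and comes out at roughly 0.018, so a page must show positive evidence in the weighted ratios before its confidence rises.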
Whilst the above has been given by way of illustrative example of the present invention, many variations and modifications thereto will be apparent to those skilled in the art without departing from the broad ambit and scope of the invention as herein set forth.

Claims (15)

1. An apparatus for classifying information on communications network, the apparatus comprises means for obtaining one or more transmission characteristics of information on a path of said communications network, and analysing means for predicting a classification of said information based on said one or more transmission characteristics.
2. An apparatus for classifying content servers which are accessible on a communications network, the apparatus comprises means for obtaining one or more transmission characteristics of information provided by any of said content servers on a path of said communications network, and analysing means for predicting a classification of said information based on said one or more transmission characteristics.
3. A computer program for classifying information which is accessible on a communications network, the program comprises means for obtaining one or more transmission characteristics of information on a path of said communications network, and analysing means for predicting a classification of said information based on said one or more transmission characteristics.
4. A computer program for classifying content servers which are accessible on a communications network, the apparatus comprises means for obtaining one or more transmission characteristics of information provided by any of said content servers on a path of said communications network, analysing means for predicting a classification of said information based on said one or more transmission characteristics.
5. An apparatus for classifying user profiles of users accessing information or content servers on a communications network, the apparatus comprises means for obtaining one or more transmission characteristics of information or information provided by any one of said content servers on a path of said communications network, analysing means for predicting a classification of said information or said one content server based on said one or more transmission characteristics, and means for classifying user profile in accordance with the predicted classification.
6. A computer program for classifying user profiles of users accessing information or content servers on a communications network, the program comprises means for obtaining one or more transmission characteristics of information or information provided by any one of said content servers on a path of said communications network, analysing means for predicting a classification of said information or said one content server based on said one or more transmission characteristics, and means for classifying user profile in accordance with the predicted classification.
7. The invention according to any one of claims 1 to 6 further comprising means for storing said one or more transmission characteristics.
8. The invention according to any one of claims 1 to 7 wherein said one or more transmission characteristics include any one or more of network protocol, date and time stamps, size of transmission activities (text and image), content type of transmission activities, pattern seen within the content of the transmission and any other characteristic that can be employed for predicting classifications.
9. The invention according to any one of claims 1 to 8 wherein said one or more transmission characteristics are obtained from network packets or fragments thereof.
10. The invention according to any one of claims 1 to 9 wherein the analysing means includes profiling means for providing profiles of interactions based on said one or more transmission characteristics.
11. The invention according to claim 10 wherein said profiling means is arranged to process said one or more transmission characteristics for providing any one or more of frequency of interaction, duration of interaction, duration of absence of interaction, patterns of transmission, average number of http links within an object of related sites, average number of like sites visited within a time frame, and statistics from said other characteristics, for forming interaction profiles, and the analysing means is adapted to use the profiles for predicting classifications.
12. The invention according to any one of claims 1 to 11 further comprising a knowledge base of predetermined profiles, and the analysing means is adapted to predict a classification based on a comparison between the profile of information to be classified and predetermined profiles.
13. The invention according to claim 12 further comprising means for updating the knowledge base so that the classification prediction can be enhanced following classifications.

Appendix
Substitute Sheet (Rule 26) RO/AU

#Body Text Word Ratio Models
SubModel Param AllWordCount WordList AllWords
SubModel Param AllWordCount Context BODY
#Body Text Unique Word Ratio Models
SubModel Param AllWordCountUnique WordList AllWords
SubModel Param AllWordCountUnique Context BODY
SubModel Param AllWordCountUnique Mode Distinct
#Meta Text Word Ratio Models
SubModel Param AllMetaWordCount WordList AllWords
SubModel Param AllMetaWordCount Context META
#Alternate Text Word Ratio Models
SubModel Param AllAlternateWordCount WordList AllWords
SubModel Param AllAlternateWordCount Context ALTERNATE
#
#Image models
SubModel Param LargeGIFPictureCount Dimension 201 x 201 - 999 x 999
SubModel Param LargeGIFPictureCount ImageType GIF
SubModel Param ThumbnailGIFPictureCount Dimension 51 x 51 - 200 x 200
SubModel Param ThumbnailGIFPictureCount ImageType GIF
SubModel Param IconGIFPictureCount Dimension 5 x 5 - 50 x 50
SubModel Param IconGIFPictureCount ImageType GIF
SubModel Param AllGIFPictureCount ImageType GIF
Model Exp LargeGIFPictureRatio RATIO(LargeGIFPictureCount, AllGIFPictureCount)
Model Exp ThumbnailGIFPictureRatio RATIO(ThumbnailGIFPictureCount, AllGIFPictureCount)
Model Exp IconGIFPictureRatio RATIO(IconGIFPictureCount, AllGIFPictureCount)
SubModel Param LargeJPEGPictureCount Dimension 201 x 201 - 999 x 999
SubModel Param LargeJPEGPictureCount ImageType JPEG
SubModel Param ThumbnailJPEGPictureCount Dimension 51 x 51 - 200 x 200
SubModel Param ThumbnailJPEGPictureCount ImageType JPEG
SubModel Param IconJPEGPictureCount Dimension 5 x 5 - 50 x 50
SubModel Param IconJPEGPictureCount ImageType JPEG
SubModel Param AllJPEGPictureCount ImageType JPEG
Model Exp LargeJPEGPictureRatio RATIO(LargeJPEGPictureCount, AllJPEGPictureCount)
Model Exp ThumbnailJPEGPictureRatio RATIO(ThumbnailJPEGPictureCount, AllJPEGPictureCount)
Model Exp IconJPEGPictureRatio RATIO(IconJPEGPictureCount, AllJPEGPictureCount)
#--
SubModel Param LowDepthGIFPictureCount Depth 2 - 4
SubModel Param LowDepthGIFPictureCount ImageType GIF
SubModel Param MediumDepthGIFPictureCount Depth 5 - 6
SubModel Param MediumDepthGIFPictureCount ImageType GIF
SubModel Param HighDepthGIFPictureCount Depth 7 - 16
SubModel Param HighDepthGIFPictureCount ImageType GIF
Model Exp LowDepthGIFPictureRatio RATIO(LowDepthGIFPictureCount, AllGIFPictureCount)
Model Exp MediumDepthGIFPictureRatio RATIO(MediumDepthGIFPictureCount, AllGIFPictureCount)
Model Exp HighDepthGIFPictureRatio RATIO(HighDepthGIFPictureCount, AllGIFPictureCount)
SubModel Param LowDepthJPEGPictureCount Depth 2 - 7
SubModel Param LowDepthJPEGPictureCount ImageType JPEG
SubModel Param MediumDepthJPEGPictureCount Depth 8 - 15
SubModel Param MediumDepthJPEGPictureCount ImageType JPEG
SubModel Param HighDepthJPEGPictureCount Depth 16 - 36
SubModel Param HighDepthJPEGPictureCount ImageType JPEG
Model Exp LowDepthJPEGPictureRatio RATIO(LowDepthJPEGPictureCount, AllJPEGPictureCount)
Model Exp MediumDepthJPEGPictureRatio RATIO(MediumDepthJPEGPictureCount, AllJPEGPictureCount)
Model Exp HighDepthJPEGPictureRatio RATIO(HighDepthJPEGPictureCount, AllJPEGPictureCount)
#Links Out Models
SubModel Param AllLinkOutCount IncludeLocal FALSE
#--
SubModel Param AVSLinkOutCount Classification ADULTVERIFICATION
SubModel Param AVSLinkOutCount IncludeLocal FALSE
Model Exp AVSLinkOutRatio RATIO(AVSLinkOutCount, AllLinkOutCount)

# begin porn.conf
#Body Text Word Count Models
SubModel Param PornExtraHardWordCount WordFile models/dictionary/porn/porn_words_extrahard.txt
SubModel Param PornHardWordCount WordFile models/dictionary/porn/porn_words_hard.txt
SubModel Param PornMediumWordCount WordFile models/dictionary/porn/porn_words_medium.txt
SubModel Param PornLiteWordCount WordFile models/dictionary/porn/porn_words_lite.txt
SubModel Param PornExtraLiteWordCount WordFile models/dictionary/porn/porn_words_extralite.txt
#Unique Body Text Word Count Models
SubModel Param PornExtraHardWordCountUnique WordFile models/dictionary/porn/porn_words_extrahard.txt
SubModel Param PornExtraHardWordCountUnique Mode Distinct
SubModel Param PornHardWordCountUnique WordFile models/dictionary/porn/porn_words_hard.txt
SubModel Param PornHardWordCountUnique Mode Distinct
SubModel Param PornMediumWordCountUnique WordFile models/dictionary/porn/porn_words_medium.txt
SubModel Param PornMediumWordCountUnique Mode Distinct
SubModel Param PornLiteWordCountUnique WordFile models/dictionary/porn/porn_words_lite.txt
SubModel Param PornLiteWordCountUnique Mode Distinct
SubModel Param PornExtraLiteWordCountUnique WordFile models/dictionary/porn/porn_words_extralite.txt
SubModel Param PornExtraLiteWordCountUnique Mode Distinct
#Body Text Word Ratio Models
Model Exp PornTextWordRatioExtraHard RATIO(PornExtraHardWordCount, AllWordCount)
Model Exp PornTextWordRatioHard RATIO(PornHardWordCount, AllWordCount)
Model Exp PornTextWordRatioMedium RATIO(PornMediumWordCount, AllWordCount)
Model Exp PornTextWordRatioLite RATIO(PornLiteWordCount, AllWordCount)
Model Exp PornTextWordRatioExtraLite RATIO(PornExtraLiteWordCount, AllWordCount)
#Body Text Unique Word Ratio Models
Model Exp PornTextWordRatioExtraHardUnique RATIO(PornExtraHardWordCountUnique, AllWordCountUnique)
Model Exp PornTextWordRatioHardUnique RATIO(PornHardWordCountUnique, AllWordCountUnique)
Model Exp PornTextWordRatioMediumUnique RATIO(PornMediumWordCountUnique, AllWordCountUnique)
Model Exp PornTextWordRatioLiteUnique RATIO(PornLiteWordCountUnique, AllWordCountUnique)
Model Exp PornTextWordRatioExtraLiteUnique RATIO(PornExtraLiteWordCountUnique, AllWordCountUnique)
#
#Domain Word Count Models
SubModel Param PornExtraHardDomainWordCount Context DOMAIN-NAME
SubModel Param PornExtraHardDomainWordCount WordFile models/dictionary/porn/porn_words_extrahard.txt
SubModel Param PornHardDomainWordCount Context DOMAIN-NAME
SubModel Param PornHardDomainWordCount WordFile models/dictionary/porn/porn_words_hard.txt
SubModel Param PornMediumDomainWordCount Context DOMAIN-NAME
SubModel Param PornMediumDomainWordCount WordFile models/dictionary/porn/porn_words_medium.txt
SubModel Param PornLiteDomainWordCount Context DOMAIN-NAME
SubModel Param PornLiteDomainWordCount WordFile models/dictionary/porn/porn_words_lite.txt
SubModel Param PornExtraLiteDomainWordCount Context DOMAIN-NAME
SubModel Param PornExtraLiteDomainWordCount WordFile models/dictionary/porn/porn_words_extralite.txt
#Meta Text Word Count Models
SubModel Param PornExtraHardMetaWordCount Context META
SubModel Param PornExtraHardMetaWordCount WordFile models/dictionary/porn/porn_words_extrahard.txt
SubModel Param PornHardMetaWordCount Context META
SubModel Param PornHardMetaWordCount WordFile models/dictionary/porn/porn_words_hard.txt
SubModel Param PornMediumMetaWordCount Context META
SubModel Param PornMediumMetaWordCount WordFile models/dictionary/porn/porn_words_medium.txt
SubModel Param PornLiteMetaWordCount Context META
SubModel Param PornLiteMetaWordCount WordFile models/dictionary/porn/porn_words_lite.txt
SubModel Param PornExtraLiteMetaWordCount Context META
SubModel Param PornExtraLiteMetaWordCount WordFile models/dictionary/porn/porn_words_extralite.txt
#--
#Meta Text Word Ratio Models
Model Exp PornMetaWordRatioExtraHard RATIO(PornExtraHardMetaWordCount, AllMetaWordCount)
Model Exp PornMetaWordRatioHard RATIO(PornHardMetaWordCount, AllMetaWordCount)
Model Exp PornMetaWordRatioMedium RATIO(PornMediumMetaWordCount, AllMetaWordCount)
Model Exp PornMetaWordRatioLite RATIO(PornLiteMetaWordCount, AllMetaWordCount)
Model Exp PornMetaWordRatioExtraLite RATIO(PornExtraLiteMetaWordCount, AllMetaWordCount)
#Alternate Text Word Count Models
SubModel Param PornExtraHardAlternateWordCount Context ALTERNATE
SubModel Param PornExtraHardAlternateWordCount WordFile models/dictionary/porn/porn_words_extrahard.txt
SubModel Param PornHardAlternateWordCount Context ALTERNATE
SubModel Param PornHardAlternateWordCount WordFile models/dictionary/porn/porn_words_hard.txt
SubModel Param PornMediumAlternateWordCount Context ALTERNATE
SubModel Param PornMediumAlternateWordCount WordFile models/dictionary/porn/porn_words_medium.txt
SubModel Param PornLiteAlternateWordCount Context ALTERNATE
SubModel Param PornLiteAlternateWordCount WordFile models/dictionary/porn/porn_words_lite.txt
SubModel Param PornExtraLiteAlternateWordCount Context ALTERNATE
SubModel Param PornExtraLiteAlternateWordCount WordFile models/dictionary/porn/porn_words_extralite.txt
#
#Alternate Text Word Ratio Models
Model Exp PornAlternateWordRatioExtraHard RATIO(PornExtraHardAlternateWordCount, AllAlternateWordCount)
Model Exp PornAlternateWordRatioHard RATIO(PornHardAlternateWordCount, AllAlternateWordCount)
Model Exp PornAlternateWordRatioMedium RATIO(PornMediumAlternateWordCount, AllAlternateWordCount)
Model Exp PornAlternateWordRatioLite RATIO(PornLiteAlternateWordCount, AllAlternateWordCount)
Model Exp PornAlternateWordRatioExtraLite RATIO(PornExtraLiteAlternateWordCount, AllAlternateWordCount)
#Links Out Models
SubModel Param PornLinkOutCount Classification PORN
SubModel Param PornLinkOutCount IncludeLocal FALSE
Model Exp PornLinkOutRatio RATIO(PornLinkOutCount, AllLinkOutCount)
#Logistic Models
Model Exp PornLRConstant -3.9869
Model Exp PornLRCoefficientPornTextWordRatioExtraHard
39.7450 Model Exp PornLRCoefficientPornTextWordRatioHard
355.0550 20 Model Exp PornLRCoefficientPornTextWordRatioMedium -136.436 Model Exp PornLRCoefficientPornTextWordRatioLite -63.2565 Model Exp PornLRCoefficientPornTextWordRatioExtraLite 33.9054 Model Exp PornLRCoefficientPornTextWordRatioExtraHardUnique 111.4752 25 Model Exp PornLRCoefficientPornTextWordRatioHardUnique -72.7005 Model Exp PornLRCoefficientPornTextWordRatioMediumUnique 264.1902 Model Exp PornLRCoefficientPornTextWordRatioLiteUnique 125.0743 Model Exp PornLRCoefficientPornTextWordRatioExtraLiteUnique -16.6895 30 Model Exp PornLRCoefficientPornExtraHardDomainWordCount 0.2598 Model Exp PornLRCoefficientPorn HardDomainWordCount 2.1344 Substitute Sheet (Rule 26) RO/AU WO 00/52598 PCT/AU00/00158 22 Model Exp PornLRCoefficientPornMediumDomainWordCount 0 Model Exp Porn LRCoefficientPornLiteDomainWordCount 0.0610 Model Exp PornLRCoefficientPornExtraLiteDomainWordCount 0 5 # Model Exp PornLRCoefficientPornMetaWordRatioExtraHard 0 Model Exp PornLRCoefficientPornMetaWordRatioHard 0 Model Exp PornLRCoefficientPornMetaWordRatioMedium 0 Model Exp PornLRCoefficientPornMetaWordRatioLite 0 10 Model Exp PornLRCoefficientPornMetaWordRatioExtraLite 0 Model Exp Porn LRCoefficientPornAlternateWordRatioExtraHard 16.1972 Model Exp PornLRCoefficientPornAlternateWordRatioHard 0 Model Exp PornLRCoefficientPornAlternateWordRatioMedi urn 26.4186 15 Model Exp PornLRCoefficientPornAlternateWordRatioLite 0 Model Exp PornLRCoefficientPornAlternateWordRatioExtraLite 14.1615 Model Exp PornLRCoefficientAl IGIFPictureCount 0 Model Exp PornLRCoefficientLargeGIFPictureCount 0 20 Model Exp PornLRCoefficientlconGIFPictureCount 0 Model Exp PornLRCoefficientThumbnailGIFPictureCount 0 Model Exp PornLRCoefficientLargeGIFPictureRatio 0 Model Exp PornLRCoefficientlconGIFPictureRatio 0 25 Model Exp PornLRCoefficientThumbnailGIFPictureRatio 0 Model Exp PornLRCoefficientHighDepthGIFPictureCount 0 Model Exp PornLRCoefficientMediumDepthGlFPictureCount 0 Model Exp PornLRCoefficientLowDepthGIFPictureCount 
0 Model Exp Porn LRCoefficientHighDepthGIFPictureRatio 0 30 Model Exp PornLRCoefficientMediumDepthGIFPictureRatio 0 Model Exp PornLRCoefficientLowDepthGIFPictureRatio 0 Substitute Sheet (Rule 26) RO/AU WO 00/52598 PCT/AU00/00158 23 Model Exp PornLRCoefficientAl IJPEGPictureCount 0 Model Exp PornLRCoefficientLargeJPEGPictureCount 0 5 Model Exp PornLRCoefficientlconJPEGPictureCount 0 Model Exp PornLRCoefficientThumbnailJPEGPictureCount 0 Model Exp PornLRCoefficientLargeJPEGPictureRatio 0 Model Exp PornLRCoefficientlconJPEGPictureRatio 0 Model Exp PornLRCoefficientThumbnai lJPEGPictureRatio 0 10 Model Exp PornLRCoefficientHighDepthJPEGPictureCount 0 Model Exp PornLRCoefficientMediumDepthJPEG PictureCount 0 Model Exp Porn LRCoefficientLowDepthJPEG PictureCount 0 Model Exp PornLRCoefficientHighDepthJPEGPictureRatio 0 Model Exp PornLRCoefficientMediumDepthJPEGPictureRatio 0 15 Model Exp PornLRCoefficientLowDepthJPEGPictureRatio 0 Model Exp PornLRCoefficientPornLinkOutRatio 4.6958 Model Exp PornLRCoefficientAVSLinkOutCount 0.3327 Model Exp PornLRCoefficientAVSLinkOutRatio 3.6786 20 # Model Exp PornLRLogOdds SUM(PornLRConstant, \ PROD UCT(Porn LRCoefficientPornTextWordRatioExtraHard, PornTextWordRatioExtraHard), \ PROD UCT(Porn LRCoefficientPornTextWordRatioHard, 25 PornTextWordRatioHard), \ PROD UCT(Porn LRCoefficientPornTextWordRatioMedi um, PornTextWordRatioMedium), \ PROD UCT(Porn LRCoefficientPornTextWord RatioLite, PornTextWordRatioLite), \ 30 PRODUCT(PornLRCoefficientPornTextWordRatioExtraLite, PornTextWordRatioExtraLite), \ Substitute Sheet (Rule 26) RO/AU WO 00/52598 PCT/AU00/00158 24 PRODUCT(PornLRCoefficientPornTextWordRatioExtraHardUnique, PornTextWordRatioExtraHardUnique), \ PRODUCT(Porn LRCoefficientPornTextWordRatioHard Unique, 5 PornTextWordRatioHardUnique), \ PROD UCT(Porn LRCoefficientPornTextWordRatioMedium Unique, PornTextWordRatioMediumUnique), \ PROD UCT(Porn LRCoefficientPornTextWordRatioLiteUn ique, PornTextWordRatioLiteUnique), \ 10 
PRODUCT(PornLRCoefficientPornTextWordRatioExtraLiteUnique, PornTextWordRatioExtraLiteUnique), \ PRODUCT(Porn LRCoefficientPornExtraHardDomainWordCount, PornExtraHardDomainWordCount), \ PRODUCT(Porn LRCoefficientPorn Hard Domai nWordCount, 15 PornHardDomainWordCount), \ PROD UCT(Porn LRCoefficientPornMediumDomainWordCount, PornMediumDomainWordCount), \ PRODUCT(Porn LRCoefficientPorn LiteDomainWordCount, PornLiteDomainWordCount), \ 20 PRODUCT(PornLRCoefficientPornExtraLiteDomainWordCount, PornExtraLiteDomainWordCount), \ PRODUCT(PornLRCoefficientPornMetaWordRatioExtraHard, PornMetaWordRatioExtraHard), \ PROD UCT(Porn LRCoefficientPornMetaWordRatioHard, 25 PornMetaWordRatioHard), \ PROD UCT(Porn LRCoefficientPornMetaWord RatioMed i um, PornMetaWordRatioMedium), \ PROD UCT(Porn LRCoefficientPornMetaWord RatioLite, PornMetaWordRatioLite), \ 30 PRODUCT(PornLRCoefficientPornMetaWordRatioExtraLite, PornMetaWordRatioExtraLite), \ Substitute Sheet (Rule 26) RO/AU WO 00/52598 PCT/AU00/00158 25 PRODUCT(PornLRCoefficientPornAlternateWordRatioExtraHard, PornAlternateWordRatioExtraHard), \ PROD UCT(PornLRCoefficientPornAlternateWord RatioHard, 5 PornAlternateWordRatioHard), \ PROD UCT(Porn LRCoefficientPornAlternateWordRatioMed i urnm, PornAlternateWordRatioMedium), \ PRODUCT(Porn LRCoefficientPornAlternateWordRatioLite, PornAlternateWordRatioLite), \ 10 PRODUCT(PornLRCoefficientPornAlternateWordRatioExtraLite, PornAlternateWordRatioExtraLite), \ PRODUCT(PornLRCoefficientAlIIGlFPictureCount, AIIGIFPictureCount), \ PRODUCT(Porn LRCoefficientLargeG I FPictureCount, LargeGIFPictureCount), \ 15 PRODUCT(PornLRCoefficienticonG IFPictureCount, IconGIFPictureCount), \ PRODUCT(PornLRCoefficientThumbnai lGIFPictureCount, ThumbnailGlFPictureCount), \ PROD UCT(Porn LRCoefficientLargeG IFPictureRatio, 20 LargeGIFPictureRatio), \ PRODUCT(PornLRCoefficientlconGlFPictureRatio, IconGIFPictureRatio), \ PRODUCT(PornLRCoefficientThumbnailGIFPictureRatio, ThumbnailGIFPictureRatio), \ 
PRODUCT(PornLRCoefficientHighDepthGIFPictureCount, 25 HighDepthGIFPictureCount), \ PRODUCT(Porn LRCoefficientMed i urn DepthG I FPictureCount, MediumDepthGlFPictureCount), \ PRODUCT(Porn LRCoefficientLowDepthGIFPictureCount, LowDepthGIFPictureCount), \ 30 PRODUCT(PornLRCoefficientHighDepthGIFPictureRatio, HighDepthGIFPictureRatio), \ Substitute Sheet (Rule 26) RO/AU WO 00/52598 PCT/AU00/00158 26 PROD UCT(PornLRCoefficientMedi urn DepthG IFPictureRatio, MediumDepthGIFPictureRatio), \ PROD UCT(Porn LRCoefficientLowDepthGIFPictureRatio, 5 LowDepthGIFPictureRatio), \ PRODUCT(PornLRCoefficientAlIIJPEGPictureCount, Al IIJPEGPictureCount), PROD UCT(Porn LRCoefficientLargeJPEGPictureCount, LargeJPEGPictureCount), \ 10 PRODUCT(PornLRCoefficientIconJPEGPictureCount, IconJPEGPictureCount), \ PROD UCT(PornLRCoefficientThumbnai IJ PEG PictureCount, ThumbnailJPEGPictureCount), \ PRODUCT(Porn LRCoefficientLargeJPEG PictureRatio, 15 LargeJPEGPictureRatio), \ PRODUCT(PornLRCoefficientlconjPEG PictureRatio, IconJPEGPictureRatio), \ PRODUCT(Porn LRCoefficientThumbnai lJPEG PictureRatio, ThumbnailJPEGPictureRatio), \ 20 PRODUCT(PornLRCoefficientHighDepthJ PEGPictureCount, HighDepthJPEGPictureCount), \ PRODUCT(Porn LRCoefficientMed i u m DepthJ PEG PictureCount, MediumDepthJPEGPictureCount), \ PRODUCT(PornLRCoefficientLowDepthJPEGPictureCount, 25 LowDepthJPEGPictureCount), \ PRODUCT(PornLRCoefficientHighDepthJPEGPictureRatio, HighDepthJPEGPictureRatio), \ PRODUCT(Porn LRCoefficientMediumDepthJPEG PictureRatio, MediumDepthJPEGPictureRatio), \ 30 PRODUCT(PornLRCoefficientLowDepthJPEGPictureRatio, LowDepthJPEGPictureRatio), \ Substitute Sheet (Rule 26) RO/AU WO 00/52598 PCT/AU00/00158 27 PRODUCT(PornLRCoefficientPornLinkOutRatio, PornLinkOutRatio),\ PRODUCT(PornLRCoefficientAVSLinkOutCount, AVSLinkOutCount),\ PRODUCT(PornLRCoefficientAVSLinkOutRatio, AVSLinkOutRatio)) 5 # #Probability Analysers ProbabilityAnalyser Param PornAltMetaWordCountProbability Classification PORN Probabi I 
ityAnalyser Exp PornAltMetaWordCountProbabi I ity \ 10 SUM(PornExtraHardMetaWordCount, PornHardMetaWordCount, \ PRODUCT(0.5,PornMediumMetaWordCount), \ PornExtraHardAlternateWordCount, Porn HardAlternateWordCount, \ PROD UCT(O.5,PornMed i umAlternateWordCount)) 15 ProbabilityAnalyser Param PornMetaWordRatioProbability Classification PORN Probabi I ityAnalyser Exp PornMetaWordRatioProbabi I ity \ PRODUCT(100, SUM(PornMetaWordRatioExtraHard, \ PornMetaWordRatioHard, PornMetaWordRatioMedium)) ProbabilityAnalyser Param PornLRProbabi I ity Classification PORN 20 ProbabilityAnalyser Exp PornLRProbability PRODUCT(100, RATIO(1,SUM(1,EXP(MINUS(PornLRLogOdds))))) Substitute Sheet (Rule 26) RO/AU
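The PornLRLogOdds and PornLRProbability expressions in the listing amount to a standard logistic regression: a constant plus a weighted sum of features, passed through 100 / (1 + exp(-log-odds)). A minimal Python sketch of that arithmetic follows. The coefficient values and feature names are copied from the listing; the function names, the dictionary layout, and the treatment of a zero denominator in `ratio` are assumptions made for illustration and are not part of the specification.

```python
import math

# Constant and non-zero coefficients copied from the "#Logistic Models" section.
# Coefficients listed as 0 are omitted because they add nothing to the sum.
PORN_LR_CONSTANT = -3.9869
PORN_LR_COEFFICIENTS = {
    "PornTextWordRatioExtraHard": 39.7450,
    "PornTextWordRatioHard": 355.0550,
    "PornTextWordRatioMedium": -136.436,
    "PornTextWordRatioLite": -63.2565,
    "PornTextWordRatioExtraLite": 33.9054,
    "PornTextWordRatioExtraHardUnique": 111.4752,
    "PornTextWordRatioHardUnique": -72.7005,
    "PornTextWordRatioMediumUnique": 264.1902,
    "PornTextWordRatioLiteUnique": 125.0743,
    "PornTextWordRatioExtraLiteUnique": -16.6895,
    "PornExtraHardDomainWordCount": 0.2598,
    "PornHardDomainWordCount": 2.1344,
    "PornLiteDomainWordCount": 0.0610,
    "PornAlternateWordRatioExtraHard": 16.1972,
    "PornAlternateWordRatioMedium": 26.4186,
    "PornAlternateWordRatioExtraLite": 14.1615,
    "PornLinkOutRatio": 4.6958,
    "AVSLinkOutCount": 0.3327,
    "AVSLinkOutRatio": 3.6786,
}


def ratio(numerator, denominator):
    """RATIO(a, b). Returning 0.0 for a zero denominator is an assumption."""
    return numerator / denominator if denominator else 0.0


def porn_lr_log_odds(features):
    """SUM(PornLRConstant, PRODUCT(coefficient, feature), ...).

    `features` maps feature names to values; missing features count as 0.
    """
    return PORN_LR_CONSTANT + sum(
        coefficient * features.get(name, 0.0)
        for name, coefficient in PORN_LR_COEFFICIENTS.items()
    )


def porn_lr_probability(features):
    """PRODUCT(100, RATIO(1, SUM(1, EXP(MINUS(PornLRLogOdds)))))."""
    return 100.0 * ratio(1.0, 1.0 + math.exp(-porn_lr_log_odds(features)))


if __name__ == "__main__":
    # Hypothetical page: 3 "hard" dictionary words among 200 body words.
    page = {"PornTextWordRatioHard": ratio(3, 200)}
    print(round(porn_lr_probability(page), 2))
```

With every feature at zero the log-odds equal the constant alone, so the score stays low; features with large positive coefficients, such as PornTextWordRatioHard, push it toward 100.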
AU28959/00A 1999-03-04 2000-03-06 Apparatus and system for classifying and control access to information Ceased AU761017B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU28959/00A AU761017B2 (en) 1999-03-04 2000-03-06 Apparatus and system for classifying and control access to information

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
AUPP9048A AUPP904899A0 (en) 1999-03-04 1999-03-04 Apparatus and system for classifying and control access to information
AUPP9048 1999-03-04
AU28959/00A AU761017B2 (en) 1999-03-04 2000-03-06 Apparatus and system for classifying and control access to information
PCT/AU2000/000158 WO2000052598A1 (en) 1999-03-04 2000-03-06 Apparatus and system for classifying and control access to information

Publications (2)

Publication Number Publication Date
AU2895900A (en) 2000-09-21
AU761017B2 (en) 2003-05-29

Family

ID=25620890

Family Applications (1)

Application Number Title Priority Date Filing Date
AU28959/00A Ceased AU761017B2 (en) 1999-03-04 2000-03-06 Apparatus and system for classifying and control access to information

Country Status (1)

Country Link
AU (1) AU761017B2 (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5835905A (en) * 1997-04-09 1998-11-10 Xerox Corporation System for predicting documents relevant to focus documents by spreading activation through network representations of a linked collection of documents

Also Published As

Publication number Publication date
AU761017B2 (en) 2003-05-29


Legal Events

Date Code Title Description
FGA Letters patent sealed or granted (standard patent)