AU761017B2 - Apparatus and system for classifying and control access to information - Google Patents

Info

Publication number
AU761017B2
Authority
AU
Australia
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU28959/00A
Other versions
AU2895900A (en
Inventor
Alan Bradley Jones
David Ross Taylor
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Total Defense Inc
Original Assignee
TEL NET MEDIA Pty Ltd
Telnet Media Pty Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to AUPP9048A priority Critical patent/AUPP904899A0/en
Priority to AUPP9048 priority
Application filed by TEL NET MEDIA Pty Ltd, Telnet Media Pty Ltd filed Critical TEL NET MEDIA Pty Ltd
Priority to AU28959/00A priority patent/AU761017B2/en
Priority to PCT/AU2000/000158 priority patent/WO2000052598A1/en
Publication of AU2895900A publication Critical patent/AU2895900A/en
Application granted granted Critical
Publication of AU761017B2 publication Critical patent/AU761017B2/en
Assigned to INTERNET SHERIFF TECHNOLOGY LIMITED reassignment INTERNET SHERIFF TECHNOLOGY LIMITED Request to Amend Deed and Register Assignors: TEL.NET MEDIA PTY LTD
Assigned to Total Defense, Inc. reassignment Total Defense, Inc. Alteration of Name(s) in Register under S187 Assignors: INTERNET SHERIFF TECHNOLOGY LIMITED
Application status is Ceased legal-status Critical
Anticipated expiration legal-status Critical

Description

WO 00/52598 PCT/AU00/00158

APPARATUS AND SYSTEM FOR CLASSIFYING AND CONTROL ACCESS TO INFORMATION

TECHNICAL FIELD OF THE INVENTION

THIS INVENTION relates to an apparatus and system for classifying information on a communications network, and in particular but not limited to an apparatus and system for classifying content servers and for selectively controlling access to classified content servers.

BACKGROUND OF THE INVENTION

The phenomenal growth of information technology has allowed many people to have access to diverse information on communications networks. The Internet in particular allows fetching of information from any cooperating computers or content servers located in different parts of the world by simply clicking references to the information. As the number of accessible computers or content servers and the amount of information over the communications network grow daily, it becomes increasingly difficult to classify them manually.

Known systems for controlling the types of information accessible on a network rely on comparing a requested destination with those on pre-determined Access Control Lists (ACL) or on word matching to determine whether to allow or deny access. This approach can be applied at the client node prior to requesting the information or on any suitably intelligent network device capable of intercepting the request or subsequent reply prior to it reaching the requester. For example, in the case of an Internet browser running on a PC or work station, a request is made for an Internet resource such as a web site. A software program for monitoring such requests on the PC can be configured to scan a pre-determined list of site addresses for a match. If a match is found, access to the site may be denied and a suitable message is then displayed informing the user that access is denied. Alternatively, the request may be allowed to proceed, but as data are received from the site they are scanned for a match with one or more sets of pre-determined words, word fragments or phrases. If a match is found the site is not displayed on the computer; a suitable message is shown instead. Typically, this type of control software is installed on a PC or work station which does not have particularly strict access privileges. The control software can be easily removed, disabled or otherwise circumvented, thereby defeating the control system.

A network device capable of intercepting the request or reply to a request, such as a proxy server, may perform similar actions using the same methods of web site matching. This is usually maintained by a network administrator with strict access rights. Also, a network requiring clients to connect through the network device in order to access the network can have its content control enforced. This allows content control of multiple clients from one central point.

While these known systems do provide some access control abilities, there are several disadvantages. A system based on word or phrase matching can only match text and it therefore would allow access to undesired information comprising graphic images. Also, a single word may match a broad range of sites with quite different classes of information. As an example, when the word "sex" is used to match pornographic sites the system would also block access to other sites providing non offensive information such as articles on biology.

A system based on an access control list of prohibited sites is much more selective. Access can only be denied when attempting to access the sites which are included in the lists. While a suitably large list could bar access to a great deal of undesirable information it is difficult to keep up to date due to the rapid increase in the number of new sites and removal of sites.

The above systems also do not lend themselves to adaptation to other network protocols and services such as interactive chat, streaming video, email or encrypted data streams. Extending to different languages also poses a problem for globalisation of these systems.

OBJECT OF THE INVENTION

An object of the present invention is to alleviate or to reduce to a certain degree one or more of the above disadvantages.

Another object of the present invention is to provide an apparatus/system for classifying user profiles.

SUMMARY OF THE INVENTION

In one aspect therefor the present invention resides in an apparatus for classifying information on a communications network. The apparatus comprises means for obtaining one or more transmission characteristics of information on a path of said communications network, and analysing means for predicting a classification of said information based on said one or more transmission characteristics.

In a second aspect therefor the present invention resides in an apparatus for classifying content servers which are accessible on a communications network. The apparatus comprises means for obtaining one or more transmission characteristics of information provided by any of said content servers on a path of said communications network, and analysing means for predicting a classification of said information based on said one or more transmission characteristics.

In a third aspect therefor the present invention resides in a computer program for classifying information which is accessible on a communications network. The program comprises means for obtaining one or more transmission characteristics of information on a path of said communications network, and analysing means for predicting a classification of said information based on said one or more transmission characteristics.

In a fourth aspect therefor the present invention resides in a computer program for classifying content servers which are accessible on a communications network. The program comprises means for obtaining one or more transmission characteristics of information provided by any of said content servers on a path of said communications network, and analysing means for predicting a classification of said information based on said one or more transmission characteristics.

In a fifth aspect therefor the present invention resides in an apparatus/computer program for classifying user profiles of users accessing information or content servers on a communications network. The apparatus/computer program comprises means for obtaining one or more transmission characteristics of information or information provided by any one of said content servers on a path of said communications network, analysing means for predicting a classification of said information or said one content server based on said one or more transmission characteristics, and means for classifying user profile in accordance with the predicted classification.

The above invention may also comprise means for storing said one or more transmission characteristics.

Typically said one or more transmission characteristics include any one or more of network protocol, date and time stamps, size of transmission activities (text and image), content type of transmission activities, pattern seen within the content of the transmission and any other characteristic that can be employed for predicting classifications.

In preference said one or more transmission characteristics are obtained from network packets or fragments thereof.

It is also preferred that the analysing means includes profiling means for providing profiles of interactions based on said one or more transmission characteristics. Typically said profiling means is arranged to process said one or more transmission characteristics for providing any one or more of frequency of interaction, duration of interaction, duration of absence of interaction, patterns of transmission, average number of http links within an object of related sites, average number of like sites visited within a time frame, and statistics from said other characteristics, for forming interaction profiles. The analysing means can then use the profiles for predicting classifications.
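As a rough illustration of the profiling step, the sketch below derives a few of the listed statistics (number of transmissions, duration of interaction, longest absence of interaction, average image size) from a list of captured records. The `PacketRecord` fields and the `build_profile` function are hypothetical names chosen for this example; the specification does not prescribe a data layout.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PacketRecord:
    """One captured transmission (hypothetical layout, not from the specification)."""
    timestamp: float   # seconds since the start of the capture
    content_type: str  # e.g. "text/html" or "image/jpeg"
    size: int          # payload size in bytes


def build_profile(records: List[PacketRecord]) -> Dict[str, float]:
    """Summarise captured records into a simple interaction profile."""
    if not records:
        return {"count": 0, "duration": 0.0, "longest_gap": 0.0, "avg_image_size": 0.0}
    times = sorted(r.timestamp for r in records)
    # Gaps between consecutive transmissions approximate "absence of interaction".
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    image_sizes = [r.size for r in records if r.content_type.startswith("image/")]
    return {
        "count": len(records),
        "duration": times[-1] - times[0],
        "longest_gap": max(gaps) if gaps else 0.0,
        "avg_image_size": sum(image_sizes) / len(image_sizes) if image_sizes else 0.0,
    }
```

A profile built this way could then be compared against profiles of known classifications by the analysing means.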

The invention may have a knowledge base of predetermined profiles, and the analysing means is adapted to predict a classification based on a comparison between the profile of information to be classified and predetermined profiles.

Advantageously the invention may have means for updating the knowledge base so that the classification prediction may be enhanced following classifications.

In order that the present invention can be more readily understood and be put into practical effect, reference will now be made to the accompanying drawings, which illustrate one preferred embodiment of the invention.

BRIEF DESCRIPTION OF THE DRAWING

Figure 1 is a schematic diagram of the apparatus according to the invention;
Figure 2 is a table of selected data of captured packets of a search engine using the apparatus shown in Figure 1;
Figure 3 is a partial table of selected data of captured packets of a news web site using the apparatus shown in Figure 1;
Figure 4 is a table of selected data of captured packets of an entertainment web site using the apparatus shown in Figure 1;
Figure 5 is a table of selected data of captured packets of the web site of an e-commerce merchant using the apparatus shown in Figure 1;
Figure 6 is a table of selected data of captured packets of the web site of another e-commerce merchant using the apparatus shown in Figure 1;
Figure 7 is a table of selected data of captured packets of a pornography web site using the apparatus shown in Figure 1;
Figure 8 is a table of selected data of captured packets of another pornography web site using the apparatus shown in Figure 1;
Figure 9 is a table of model N1 results using the apparatus shown in Figure 1;
Figure 10 is a table of model N2 results using the apparatus shown in Figure 1;
Figure 11 is a table of model N3 results using the apparatus shown in Figure 1; and
Figure 12 is a table of classification prediction confidence levels using the apparatus shown in Figure 1.

DESCRIPTION OF THE PREFERRED EMBODIMENT Referring initially to Figure 1 there is shown an apparatus 10 for classifying media or information flowing through a path of a communications network which in this case is the Internet.

As can be seen, network traffic passing through the apparatus 10 is captured and analysed to provide statistics relating to interactions between two or more terminals (not shown). The captured traffic is first checked against a list of predetermined classifications to determine if it is known or unknown.

When the captured traffic is of an unknown classification, various models (to be described more fully below) are applied to the data set in the captured traffic in order to predict the content classification. The models use parameters derived from a knowledge base of previously classified data sets and fitness with these parameters to determine the classification of the content of the newly captured traffic. Thus, the web site sending the captured traffic is now classified and is added to the list of known classifications.
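The control flow just described (look up the known list first, otherwise apply the models and record the outcome) can be sketched as follows. This is a minimal sketch under stated assumptions: `classify_site`, the `Model` callable type, the score range of 0 to 1, and the `threshold` default are all hypothetical and are not taken from the specification.

```python
from typing import Callable, Dict

# A model maps a captured data set to a fitness score in [0, 1] (assumed convention).
Model = Callable[[dict], float]


def classify_site(site: str, data_set: dict, known: Dict[str, str],
                  models: Dict[str, Model], threshold: float = 0.5) -> str:
    """Return the classification for a site, predicting it if it is not yet known."""
    if site in known:
        # Already on the list of known classifications: no modelling needed.
        return known[site]
    best_label, best_score = "unclassified", 0.0
    for label, model in models.items():
        score = model(data_set)
        if score > best_score:
            best_label, best_score = label, score
    label = best_label if best_score >= threshold else "unclassified"
    known[site] = label  # the site is added to the list of known classifications
    return label
```

Because the result is written back to `known`, a second request for the same site is answered from the list without re-running the models.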

It should be noted that the embodiment of the apparatus 10 as described herein is for an analysis of transmission traffic using the HTTP protocol. The apparatus 10 according to the present invention is not restricted to HTTP, and is easily adaptable to analyse data carried within any networks using any known protocol. Examples of the protocols include FTP, SMTP, NNTP, etc.

Following classification the captured data set is stored in the knowledge base. As the knowledge base expands, more data are used for the model parameters. This refines the apparatus and results in improved predictive performance.

The sites that are deemed to include undesirable information are added to Access Control Lists (ACLs). The ACLs are used to control the flow of content information between terminals. For example, undesired content information can be prevented from travelling further through the network by simply not forwarding it, by replacing it, or by intercepting the request for such content information and modifying its destination.

Classifications of traffic from content servers are relatively static. On the other hand, user terminals that interact with these content servers are variable and their classifications are considered transient classifications.

Whereas classifications of content servers form a model of the style of content residing on the server, transient classifications form a model of style of content being viewed by a user terminal, or content consumer. This in effect forms a behaviour profile of such a consumer. This profile can be used to tailor the content information to suit the consumer.

As mentioned earlier, the apparatus 10 captures a set of observed data relating to a network interaction event, and provides a set of results indicating the classification of a resource or personality residing at each network node involved in the interaction. This is accomplished by applying various statistical models to a profile, and testing this against results obtained from profiles of known classifications. In this example of the invention this process is represented by the following formulas:

x is an unknown profile to be classified;
Profiles p1, p2, p3 ... pn are of known classifications;
Models M1, M2, M3 ... Mn are available to operate on these profiles; and
C1, C2, C3 ... Cn are profile classifications.

The population of a profile of classification C1 may be defined by the population of M1. M1 may be tested against the true population using any of the standard statistical hypothesis methods.

A pre-determined set of media terminals of a classification is modelled by various models M1, M2 ... Mn. Each model consists of an approach and a set of parameters (e.g. linear regression, with gradient and point of interception), so that for a single classification M1(p1, p2 ... pn), M2(q1, q2 ... qn) ... Mn(r1, r2 ... rn) are used to model the population from the classification. The models may be based on mathematical structures, or arbitrary rules.

The models are continually refined as more network traffic passes through the apparatus 10, thereby increasing the population space from which the classifications are computed.

A terminal may be permanently or transitionally defined in relation to a classification. A transitionally defined terminal may move between classifications based on the fitness of the observed traffic to the models of the various classifications.

Figures 2 to 8 are tables of selected data of traffic for testing the profile of data during a network interaction with a content server to determine if it contains media content of a pornographic nature. The assumption is made that profiles for content servers contain a variable which is the average size of graphical images served.

A normal distribution or similar non-deterministic probability distribution is then used to test the hypothesis that the profile belongs to a population classified as pornographic. In this example, the population of the classification may be defined by the population of N(a,b) where N is the image size and a and b are the mean and variance respectively, based on a normal distribution. The average and standard deviation derived from the observed samples are tested against the true population using standard statistical hypothesis methods.
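One standard way to carry out such a test is a two-sided z-test of the sample mean against the population mean a and variance b. The sketch below is an assumption about how the test might be realised, not the method actually used to produce the figures; `z_test_p_value` is a hypothetical name, and how a p-value maps onto the confidence levels reported later is not stated in the specification.

```python
import math
from statistics import mean
from typing import Sequence


def z_test_p_value(sample: Sequence[float], pop_mean: float, pop_variance: float) -> float:
    """Two-sided p-value that the sample mean is consistent with N(pop_mean, pop_variance)."""
    n = len(sample)
    if n == 0 or pop_variance <= 0:
        raise ValueError("need a non-empty sample and a positive variance")
    # Standard error of the mean for n observations drawn from the population.
    standard_error = math.sqrt(pop_variance / n)
    z = (mean(sample) - pop_mean) / standard_error
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))
```

A p-value near 1 means the observed image sizes are consistent with the modelled population; a p-value near 0 means the hypothesis of membership would be rejected.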

In some cases this approach may be broadened to encompass analysis of variance methods with multiple dependent variables, to model the characteristics of a site. Traditional ANOVA or regressive techniques may be applied to model the media content.

A variety of traditional deterministic and non-deterministic models may be applied to test the hypothesis of profile classification. These may be changed or upgraded continually depending on the level of predictive power found. The models used can include, but are not limited to, simple rules-of-thumb, deterministic and non-deterministic probability models, or arbitrary calculations.

The choice of model is primarily dictated by the predictive power of that model against the population in question.

Figures 2 through 8 show examples of the basic data set that can be gathered by observing network traffic of a typical interaction between a client browser and a web server.

Figures 9 to 11 illustrate a simple classification model. This model looks at the size, content and relationships of objects being transmitted by a content server.

The outcome of this model is to determine if the media being transmitted has pornographic content.

Classification: pornographic

Standard Model:

N1(a,b), where N1 is the image size, and a and b are the mean and variance respectively, based on a normal distribution.

N2(c,d), where N2 is the ratio of text to graphics, and c and d are the total sizes of the text and graphic objects respectively.

N3(e), where N3 is the count of word patterns matched from a list of pre-determined words, and e is the text of an object.

Observed Samples are given in the tables shown in Figures 2 to 8.

For model N1, shown in Figure 9, the normal distribution hypothesis test is applied to the observed samples to derive the results.

The results show confidence at the 93% and 87% levels for sites 6 and 7 respectively that the sites belong to a population of pornographic sites. The other samples give much lower confidence levels.

For model N2, shown in Figure 10, a simple rule is used to test if the ratio is below a pre-determined threshold. The results show that sites 2, 4, 6 and 7 are within the threshold rating.

For model N3, shown in Figure 11, a simple rule is used to test if the number of words matching a list of patterns exceeds a pre-determined threshold.

The results show that sites 6 and 7 exceed the threshold.
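The two threshold rules for N2 and N3 are simple enough to sketch directly. The function names, the tokenisation by runs of letters and digits, and the threshold arguments below are assumptions made for illustration; the specification gives neither the threshold values nor the word list.

```python
import math
import re
from typing import Iterable


def text_to_graphics_ratio(text_bytes: int, graphic_bytes: int) -> float:
    """N2: total size of text objects divided by total size of graphic objects."""
    if graphic_bytes == 0:
        return math.inf  # no graphics at all: the ratio can never fall below a threshold
    return text_bytes / graphic_bytes


def n2_flag(text_bytes: int, graphic_bytes: int, max_ratio: float) -> bool:
    """True when the text-to-graphics ratio is below the pre-determined threshold."""
    return text_to_graphics_ratio(text_bytes, graphic_bytes) < max_ratio


def n3_count(text: str, patterns: Iterable[str]) -> int:
    """N3: number of words in the text that appear in the pre-determined list."""
    wanted = {p.lower() for p in patterns}
    words = re.findall(r"[a-z0-9]+", text.lower())
    return sum(1 for word in words if word in wanted)


def n3_flag(text: str, patterns: Iterable[str], min_count: int) -> bool:
    """True when the number of matched words exceeds the pre-determined threshold."""
    return n3_count(text, patterns) > min_count
```

Keeping the raw measures (`text_to_graphics_ratio`, `n3_count`) separate from the threshold checks makes it easy to retune the thresholds as the knowledge base grows.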

A weighting formula is then applied to derive a final result as shown in Figure 12.
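The weighting formula itself is not given in the text. One plausible form, shown here purely as an assumption, is a weighted average of the per-model results, with the rule-based models contributing 1.0 when their threshold is met and 0.0 otherwise. The `combined_confidence` name and any weights passed to it are hypothetical.

```python
from typing import Dict


def combined_confidence(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-model scores (each score assumed to lie in [0, 1])."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return weighted_sum / total_weight
```

For example, with illustrative weights of 0.5 for N1 and 0.25 each for N2 and N3, an N1 confidence of 0.93 combined with both rules firing gives 0.965.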

Therefore, using this example model, the apparatus 10 would predict that sites 6 and 7 are probably serving media with pornographic content, whereas sites 1 through 5 probably are not.

The attached appendix shows an example of the set of rules, constants and formulas which determine a confidence prediction based on logistic regression. The rules are defined using "Submodel" and "Model" components to define individual data points and aggregated data points. These are then referred to in the "ProbabilityAnalyser" equations, which use standard predictive formulas.
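A logistic regression turns a weighted sum of data points into a probability through the function 1 / (1 + exp(-z)). The sketch below applies that standard formula to the constant and the four body-text coefficients that are legible in the appendix. It is an assumption that the "ProbabilityAnalyser" combines them in exactly this way; the ExtraLite coefficient is left out because its value is cut off in the source, and any further coefficients the full appendix may contain are not represented.

```python
import math
from typing import Dict

# Constant and coefficients as printed under "#Logistic Models" in the appendix.
LR_CONSTANT = -3.9869
LR_COEFFICIENTS: Dict[str, float] = {
    "PornTextWordRatioExtraHard": 39.7450,
    "PornTextWordRatioHard": 355.0550,
    "PornTextWordRatioMedium": -136.436,
    "PornTextWordRatioLite": -63.2565,
}


def logistic_confidence(ratios: Dict[str, float]) -> float:
    """Standard logistic function of the linear predictor; missing ratios count as 0."""
    z = LR_CONSTANT + sum(
        coefficient * ratios.get(name, 0.0)
        for name, coefficient in LR_COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))
```

With every ratio at zero the prediction is governed by the constant alone and stays below 2%, while a page in which 5% of body words come from the "hard" list is pushed above 99% by the large positive coefficient.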

Whilst the above has been given by way of illustrative example of the present invention, many variations and modifications thereto will be apparent to those skilled in the art without departing from the broad ambit and scope of the invention as herein set forth.

Claims (12)

1. An apparatus for classifying information on communications network, the apparatus comprises means for obtaining one or more transmission characteristics of information on a path of said communications network, and analysing means for predicting a classification of said information based on said one or more transmission characteristics.
2. An apparatus for classifying content servers which are accessible on a communications network, the apparatus comprises means for obtaining one or more transmission characteristics of information provided by any of said content servers on a path of said communications network, and analysing means for predicting a classification of said information based on said one or more transmission characteristics.
3. A computer program for classifying information which is accessible on a communications network, the program comprises means for obtaining one or more transmission characteristics of information on a path of said communications network, and analysing means for predicting a classification of said information based on said one or more transmission characteristics.
4. A computer program for classifying content servers which are accessible on a communications network, the apparatus comprises means for obtaining one or more transmission characteristics of information provided by any of said content servers on a path of said communications network, analysing means for predicting a classification of said information based on said one or more transmission characteristics.
5. An apparatus for classifying user profiles of users accessing information or content servers on a communications network, the apparatus comprises means for obtaining one or more transmission characteristics of information or information provided by any one of said content servers on a path of said communications network, analysing means for predicting a classification of said information or said one content server based on said one or more transmission characteristics, and means for classifying user profile in accordance with the predicted classification.
6. A computer program for classifying user profiles of users accessing information or content servers on a communications network, the program comprises means for obtaining one or more transmission characteristics of information or information provided by any one of said content servers on a path of said communications network, analysing means for predicting a classification of said information or said one content server based on said one or more transmission characteristics, and means for classifying user profile in accordance with the predicted classification.
7. The invention according to any one of claims 1 to 6 further comprising means for storing said one or more transmission characteristics.
8. The invention according to any one of claims 1 to 7 wherein said one or more transmission characteristics include any one or more of network protocol, date and time stamps, size of transmission activities (text and image), content type of transmission activities, pattern seen within the content of the transmission and any other characteristic that can be employed for predicting classifications.
9. The invention according to any one of claims 1 to 8 wherein said one or more transmission characteristics are obtained from network packets or fragments thereof.
10. The invention according to any one of claims 1 to 9 wherein the analysing means includes profiling means for providing profiles of interactions based on said one or more transmission characteristics.
11. The invention according to claim 10 said profiling means is arranged to process said one or more transmission characteristics for providing any one or more of frequency of interaction, duration of interaction, duration of absence of interaction, patterns of transmission, average number of http links within an object of related sites, average number of like sites visited within a time frame, and statistics from said other characteristics, for forming interaction profiles, and the analysing means is adapted to use the profiles for predicting classifications.
12. The invention according to any one of claims 1 to 11 further comprising a knowledge base of predetermined profiles, and the analysing means is adapted to predict a classification based on a comparison between the profile of information to be classified and predetermined profiles.
13. The invention according to claim 12 further comprising means for updating the knowledge base so that the classification prediction can be enhanced following classifications.

Appendix

#Body Text Word Ratio Models
SubModel Param AllWordCount WordList AllWords
SubModel Param AllWordCount Context BODY

#Body Text Unique Word Ratio Models
SubModel Param AllWordCountUnique WordList AllWords
SubModel Param AllWordCountUnique Context BODY
SubModel Param AllWordCountUnique Mode Distinct

#Meta Text Word Ratio Models
SubModel Param AllMetaWordCount WordList AllWords
SubModel Param AllMetaWordCount Context META

#Alternate Text Word Ratio Models
SubModel Param AllAlternateWordCount WordList AllWords
SubModel Param AllAlternateWordCount Context ALTERNATE

#Image models
SubModel Param LargeGIFPictureCount Dimension 201 x 201 999 x 999
SubModel Param LargeGIFPictureCount ImageType GIF
SubModel Param ThumbnailGIFPictureCount Dimension 51 x 51 200 x 200
SubModel Param ThumbnailGIFPictureCount ImageType GIF
SubModel Param IconGIFPictureCount Dimension 5 x 5 50 x
SubModel Param IconGIFPictureCount ImageType GIF
SubModel Param AllGIFPictureCount ImageType GIF
Model Exp LargeGIFPictureRatio RATIO(LargeGIFPictureCount, AllGIFPictureCount)
Model Exp ThumbnailGIFPictureRatio RATIO(ThumbnailGIFPictureCount, AllGIFPictureCount)
Model Exp IconGIFPictureRatio RATIO(IconGIFPictureCount, AllGIFPictureCount)
SubModel Param LargeJPEGPictureCount Dimension 201 x 201 999 x 999
SubModel Param LargeJPEGPictureCount ImageType JPEG
SubModel Param ThumbnailJPEGPictureCount Dimension 51 x 51 200 x 200
SubModel Param ThumbnailJPEGPictureCount ImageType JPEG
SubModel Param IconJPEGPictureCount Dimension 5 x 5 50 x
SubModel Param IconJPEGPictureCount ImageType JPEG
SubModel Param AllJPEGPictureCount ImageType JPEG
Model Exp LargeJPEGPictureRatio RATIO(LargeJPEGPictureCount, AllJPEGPictureCount)
Model Exp ThumbnailJPEGPictureRatio RATIO(ThumbnailJPEGPictureCount, AllJPEGPictureCount)
Model Exp IconJPEGPictureRatio RATIO(IconJPEGPictureCount, AllJPEGPictureCount)
SubModel Param LowDepthGIFPictureCount Depth 2 4
SubModel Param LowDepthGIFPictureCount ImageType GIF
SubModel Param MediumDepthGIFPictureCount Depth 5 6
SubModel Param MediumDepthGIFPictureCount ImageType GIF
SubModel Param HighDepthGIFPictureCount Depth 7 16
SubModel Param HighDepthGIFPictureCount ImageType GIF
Model Exp LowDepthGIFPictureRatio RATIO(LowDepthGIFPictureCount, AllGIFPictureCount)
Model Exp MediumDepthGIFPictureRatio RATIO(MediumDepthGIFPictureCount, AllGIFPictureCount)
Model Exp HighDepthGIFPictureRatio RATIO(HighDepthGIFPictureCount, AllGIFPictureCount)
SubModel Param LowDepthJPEGPictureCount Depth 2 7
SubModel Param LowDepthJPEGPictureCount ImageType JPEG
SubModel Param MediumDepthJPEGPictureCount Depth 8
SubModel Param MediumDepthJPEGPictureCount ImageType JPEG
SubModel Param HighDepthJPEGPictureCount Depth 16 36
SubModel Param HighDepthJPEGPictureCount ImageType JPEG
Model Exp LowDepthJPEGPictureRatio RATIO(LowDepthJPEGPictureCount, AllJPEGPictureCount)
Model Exp MediumDepthJPEGPictureRatio RATIO(MediumDepthJPEGPictureCount, AllJPEGPictureCount)
Model Exp HighDepthJPEGPictureRatio RATIO(HighDepthJPEGPictureCount, AllJPEGPictureCount)

#Links Out Models
SubModel Param AllLinkOutCount IncludeLocal FALSE
SubModel Param AVSLinkOutCount Classification ADULTVERIFICATION
SubModel Param AVSLinkOutCount IncludeLocal FALSE
Model Exp AVSLinkOutRatio RATIO(AVSLinkOutCount, AllLinkOutCount)

begin porn.conf

#Body Text Word Count Models
SubModel Param PornExtraHardWordCount WordFile models/dictionary/porn/porn_words_extrahard.txt
SubModel Param PornHardWordCount WordFile models/dictionary/porn/porn_words_hard.txt
SubModel Param PornMediumWordCount WordFile models/dictionary/porn/porn_words_medium.txt
SubModel Param PornLiteWordCount WordFile models/dictionary/porn/porn_words_lite.txt
SubModel Param PornExtraLiteWordCount WordFile models/dictionary/porn/porn_words_extralite.txt

#Unique Body Text Word Count Models
SubModel Param PornExtraHardWordCountUnique WordFile models/dictionary/porn/porn_words_extrahard.txt
SubModel Param PornExtraHardWordCountUnique Mode Distinct
SubModel Param PornHardWordCountUnique WordFile models/dictionary/porn/porn_words_hard.txt
SubModel Param PornHardWordCountUnique Mode Distinct
SubModel Param PornMediumWordCountUnique WordFile models/dictionary/porn/porn_words_medium.txt
SubModel Param PornMediumWordCountUnique Mode Distinct
SubModel Param PornLiteWordCountUnique WordFile models/dictionary/porn/porn_words_lite.txt
SubModel Param PornLiteWordCountUnique Mode Distinct
SubModel Param PornExtraLiteWordCountUnique WordFile models/dictionary/porn/porn_words_extralite.txt
SubModel Param PornExtraLiteWordCountUnique Mode Distinct

#Body Text Word Ratio Models
Model Exp PornTextWordRatioExtraHard RATIO(PornExtraHardWordCount, AllWordCount)
Model Exp PornTextWordRatioHard RATIO(PornHardWordCount, AllWordCount)
Model Exp PornTextWordRatioMedium RATIO(PornMediumWordCount, AllWordCount)
Model Exp PornTextWordRatioLite RATIO(PornLiteWordCount, AllWordCount)
Model Exp PornTextWordRatioExtraLite RATIO(PornExtraLiteWordCount, AllWordCount)

#Body Text Unique Word Ratio Models
Model Exp PornTextWordRatioExtraHardUnique RATIO(PornExtraHardWordCountUnique, AllWordCountUnique)
Model Exp PornTextWordRatioHardUnique RATIO(PornHardWordCountUnique, AllWordCountUnique)
Model Exp PornTextWordRatioMediumUnique RATIO(PornMediumWordCountUnique, AllWordCountUnique)
Model Exp PornTextWordRatioLiteUnique RATIO(PornLiteWordCountUnique, AllWordCountUnique)
Model Exp PornTextWordRatioExtraLiteUnique RATIO(PornExtraLiteWordCountUnique, AllWordCountUnique)

#Domain Word Count Models
SubModel Param PornExtraHardDomainWordCount Context DOMAIN-NAME
SubModel Param PornExtraHardDomainWordCount WordFile models/dictionary/porn/porn_words_extrahard.txt
SubModel Param PornHardDomainWordCount Context DOMAIN-NAME
SubModel Param PornHardDomainWordCount WordFile models/dictionary/porn/porn_words_hard.txt
SubModel Param PornMediumDomainWordCount Context DOMAIN-NAME
SubModel Param PornMediumDomainWordCount WordFile models/dictionary/porn/porn_words_medium.txt
SubModel Param PornLiteDomainWordCount Context DOMAIN-NAME
SubModel Param PornLiteDomainWordCount WordFile models/dictionary/porn/porn_words_lite.txt
SubModel Param PornExtraLiteDomainWordCount Context DOMAIN-NAME
SubModel Param PornExtraLiteDomainWordCount WordFile models/dictionary/porn/porn_words_extralite.txt

#Meta Text Word Count Models
SubModel Param PornExtraHardMetaWordCount Context META
SubModel Param PornExtraHardMetaWordCount WordFile models/dictionary/porn/porn_words_extrahard.txt
SubModel Param PornHardMetaWordCount Context META
SubModel Param PornHardMetaWordCount WordFile models/dictionary/porn/porn_words_hard.txt
SubModel Param PornMediumMetaWordCount Context META
SubModel Param PornMediumMetaWordCount WordFile models/dictionary/porn/porn_words_medium.txt
SubModel Param PornLiteMetaWordCount Context META
SubModel Param PornLiteMetaWordCount WordFile models/dictionary/porn/porn_words_lite.txt
SubModel Param PornExtraLiteMetaWordCount Context META
SubModel Param PornExtraLiteMetaWordCount WordFile models/dictionary/porn/porn_words_extralite.txt

#Meta Text Word Ratio Models
Model Exp PornMetaWordRatioExtraHard RATIO(PornExtraHardMetaWordCount, AllMetaWordCount)
Model Exp PornMetaWordRatioHard RATIO(PornHardMetaWordCount, AllMetaWordCount)
Model Exp PornMetaWordRatioMedium RATIO(PornMediumMetaWordCount, AllMetaWordCount)
Model Exp PornMetaWordRatioLite RATIO(PornLiteMetaWordCount, AllMetaWordCount)
Model Exp PornMetaWordRatioExtraLite RATIO(PornExtraLiteMetaWordCount, AllMetaWordCount)

#Alternate Text Word Count Models
SubModel Param PornExtraHardAlternateWordCount Context ALTERNATE
SubModel Param PornExtraHardAlternateWordCount WordFile models/dictionary/porn/porn_words_extrahard.txt
SubModel Param PornHardAlternateWordCount Context ALTERNATE
SubModel Param PornHardAlternateWordCount WordFile models/dictionary/porn/porn_words_hard.txt
SubModel Param PornMediumAlternateWordCount Context ALTERNATE
SubModel Param PornMediumAlternateWordCount WordFile models/dictionary/porn/porn_words_medium.txt
SubModel Param PornLiteAlternateWordCount Context ALTERNATE
SubModel Param PornLiteAlternateWordCount WordFile models/dictionary/porn/porn_words_lite.txt
SubModel Param PornExtraLiteAlternateWordCount Context ALTERNATE
SubModel Param PornExtraLiteAlternateWordCount WordFile models/dictionary/porn/porn_words_extralite.txt

#Alternate Text Word Ratio Models
Model Exp PornAlternateWordRatioExtraHard RATIO(PornExtraHardAlternateWordCount, AllAlternateWordCount)
Model Exp PornAlternateWordRatioHard RATIO(PornHardAlternateWordCount, AllAlternateWordCount)
Model Exp PornAlternateWordRatioMedium RATIO(PornMediumAlternateWordCount, AllAlternateWordCount)
Model Exp PornAlternateWordRatioLite RATIO(PornLiteAlternateWordCount, AllAlternateWordCount)
Model Exp PornAlternateWordRatioExtraLite RATIO(PornExtraLiteAlternateWordCount, AllAlternateWordCount)

#Links Out Models
SubModel Param PornLinkOutCount Classification PORN
SubModel Param PornLinkOutCount IncludeLocal FALSE
Model Exp PornLinkOutRatio RATIO(PornLinkOutCount, AllLinkOutCount)

#Logistic Models
Model Exp PornLRConstant -3.9869
Model Exp PornLRCoefficientPornTextWordRatioExtraHard 39.7450
Model Exp PornLRCoefficientPornTextWordRatioHard 355.0550
Model Exp PornLRCoefficientPornTextWordRatioMedium -136.436
Model Exp PornLRCoefficientPornTextWordRatioLite -63.2565
Model Exp PornLRCoefficientPornTextWordRatioExtraLite
33.9054 Model Exp Porn LRCoeffi ci entPornTextWord Rati oExtra Hard U n ique 111 .4752 Model Exp Porn LRCoeffi ci entPornTextWord Rat ioHard U n iqu e -72.7005 Model Exp Porn LRCoefficientPornTextWordRatioMed ium Unique 264.1902 Model Exp Porn LRCoeff icientPornTextWord Rati oLiteU nique 125.0743 Model Exp PornLRCoeff icientPornTextWordRatioExtraLiteUnique -16.6895 Model Exp Porn LRCoeffi ci entPornE xtra Hard Domai nWordCou nt 0.2598 Model Exp Porn LRCoeffici entPorn Hard Domai nWordCount 2.1 344 Substitute Sheet (Rule 26) ROMA WO 00/52598 WO 0052598PCT/AUOO/00158 22 Model Exp PornLRCoefficientPornMediumDomainWordCoufli 0 Model Exp Porn LRCoeff icientPorn LiteDornainWordCount 0.0610 Model Exp Porn LRCoefficientPornExtraLiteDomainWordCouflt 0 Model Exp Porn LRCoeff icientPornMetaWord RatioExtraHard 0 Model Exp Porn LRCoeffi cientPorn MetaWord Rat ioHard 0 Model Exp Porn LRCoeff i ci entPornMetaWord Rat ioMed i urn 0 Model Exp Porn LRCoefficientPornMetaWordRatioLite 0 Model Exp Porn LRCoeffi cientPornMetaWordRatioExtraLite 0 Model Exp Porn LRCoefficientPornAlternateWordRatioExtraHard 1 6.1 972 Model Exp Porn LRCoeff i ci entPornAl ternateWord Ratio Hard 0 Model Exp Porn LRCoeff icientPornAlternateWordRatioMed ium 26.4186 Model Exp Porn LRCoeffi ci entPornAlternateWordRatioLite 0 Model Exp Porn LRCoeff icientPornAlternateWordRatioExtraLite 14.1 61 Model Exp Porn LRCoeff icientAl IGlEFPictureCount 0 Model Exp Porn LRCoeff icientLargeGlIFPictureCount 0 Model Exp Porn LRCoefficientlconG lFPictu reCou nt 0 Model Exp Porn LRCoefficientThumbnai lGlFPictureCount 0 Model Exp Porn LRCoeff icientLargeGlIFPictureRatio 0 Model Exp Porn LRCoeffi cientlcon G IFPictureRatio 0 Model Exp Porn LRCoeffi cientlhumbnai lGlFPi ctureRatio 0 Model Exp PorntLRCoefficientH igh DepthGlIFPi ctureCount 0 Model Exp PorntLRCoeff icientMed iumnDepthG lEPi ctu reCount 0 Model Exp Porn LRCoefficientLowDepthGlIFPictureCount 0 Model Exp Porn LRCoeff icientHigh DepthGlFPi ctureRatio 0 Model Exp Porn 
LRCoefficientMed iumDepthGl FPictureRatio 0 Model Exp Porn LRCoefficienttowDepthGlFPictureRatio 0 Substitute Sheet (Rule 26) RO/AU WO 00/52598 WO 0052598PCT/AUOO/00158 23 Model Exp Porn LRCoefficientAlIj PEG P ictureCount 0 Model Exp Porn LRCoeff icientLargej PEG PictureCount 0 Model Exp Porn LRCoeffi cien tlconj PEG P ictureCount 0 Model Exp Porn LRCoeffici entlhumbnai lJ PEG PictureCount 0 Model Exp Porn LRCoeffici entLargej PEG PictureRatio 0 Model Exp Porn LRCoefficientl conj PEG PictureRatio 0 Model Exp Porn LRCoefficientThumbnai IJ PEG Pi ctureRatio 0 Model Exp Porn LRCoeffici entH igh Depthj PEG PictureCount 0 Model Exp Porn LRCoefficientMed i urn Depthj PEG Pi ctureCount 0 Model Exp Porn LRCoeff i cient Low DepthJ PEG Pi ctu reCou nt 0 Model Exp Porn LRCoeff icientHigh DepthjPEGPictureRatio 0 Model Exp Porn LRCoeffi cientMedi um Depthj PEG Pi ctu reRatio 0 Model Exp Porn LRCoeff icientLowDepthj PEG Pi ctureRatio 0 Model Exp Porn LRCoeffi cientPorn Li n kOutRatio 4.6958 Model Exp Porn LRCoeff icientAVSLi nkOutCount 0.3327 Model Exp Porn LRCoefficientAVS Li nkOutRatio 3.6786 Model Exp Porn LRLogOdds S UM(Porn LRConstant,\ PROD UCT(Porn LRCoeffi cientPornlextWord Rat ioExtra Hard, PornlextWordRatioExtraHard), PROD U CT(Porn LRCoeffi cientPornlextWordRatioHard, PornlextWordRatioHard), PROD UCI(Porn LRCoefficientPornlextWordRatioMed iurn, PornlextWordRatioMedium), PROD UCI(Porn LRCoeffi cientPornlextWordRatioLite, PornlextWordRatioLite), PROD UCT(Porn LRCoeff ici entPornlextWord Rat ioExtratite, PornlextWordRatioExtraLite), Substitute Sheet (Rule 26) RO/AU WO 00/52598 WO 0052598PCT/AUOO/00I 58 24 PROD UCI(Porn LRCoeffi cien tPornlextWordRati oExtra Hard Unique, PornlextWord RatioExtraHard Unique), PROD UCI(Porn LRCoeff ici entPornTextWordRatioHard Unique, PornlextWord Ratio Hard U n ique), PROD UCT(Porn LRCoefficientPornlextWordRati oMedi urn Unique, PornlextWordRatioMediumUnique), PROD UCT(Porn LRCoefficientPornlextWordRatioLiteUfl ique, PornTextWord Rati 
oLiteU nique), PRO DUCT(Porn LRCoeff i cientPornTextWordRatioExtraLiteU n ique, PornlextWordRati oExtraLiteJn ique),\ PROD UCT(Porn LRCoefficientPorn ExtraHard Domai nWordCou nt, Porn ExtraHard Dorai nWordCount), PRODUCT(Porn LRCoeff icientPorn Hard Domai nWordCount, 1 5 Porn Hard Domai nWordCount), PROD UCT(Porn LRCoefficientPornMed i umDomai nWordCount, PornMedium DomainWordCount), PROD UCT(Porn LRCoeffici entPorn LiteDomai nWordCount, PorntLiteDomai nWordCount), PROD UCT(Porn LRCoeffi ci en tPorn ExtraLiteDoma i nWordlCou nt, PornExtraLiteDomai nWordCount), PROD UCT(Porn LRCoeffi c ien tPorn MetaWord Rati oExtra Hard, Porn MetaWo rd Rat ioExtra Hard), PROD UCT(Porn LRCoeffi ci entPornMetaWord Rati oHard, PornMetaWordRatioHard),\ PROD UCI(Porn LRCoefficientPornMetaWordRatioMed i urn, Porn MetaWord Rati oMed iu PROD UCT(Porn LRCoeffi cientPornMetaWordRati oLite, Porn MetaWord Rati oLite), PROD UCI(Porn LRCoeffici entPornMetaWordRatioExtraLite, PornMetaWord Rat i oExtraLi te), Substitute Sheet (Rule 26) RO/AU WO 00/52598 WO 0052598PCT/AUOO/001 58 PROD UCI(Porn LRCoeffi ci entPornAlIternateWord Rati oExtra Hard, PornAlternateWordRatioExtraHard), PROD UCI(Porn LRCoefficientPornAlternateWordRatioHard, PornAlternateWord RatioHard), PROD UCT(Porn LRCoefficientPornA IternateWordRatioMed i urn, PornAlternateWord RatioMedium), PRODUCI(Porn LRCoeff icientPornAl ternateWordRatioLite, PornAlternateWordRatioLite), PROD UCT(Porn LRCoeffi cientPornAlternateWord RatloExtraLite, PornAlternateWordRatioExtraLite), PRO DUCI(Porn LRCoeffi cientAl I GlIEPi ctureCount, AltGlFPictureCount),\ PRO DUCT(Porn LRCoeffi cienttargeG IFPictu reCount, LargeGlFPictureCount), 1 5 PRO DUCI(Porn LRCoeff icientlconGlIFPi ctureCouflt, IconGlFPictu reCount),\ PRO DUCI(Porn LRCoeffi cientlh umbnai I G IlEPictureCount, ThurnbnailGlFPictureCount), PROD UCT(Porn LRCoeff icientLargeG IlEPictureRatl o, LargeGlFPictureRatio), PROD UCT(PornLRCoeff icientlconGlIFPictureRatio, IconG IFPictureRatio),\ PROD 
UCT(Porn LRCoeff icientThurnbnai I G I FPictureRatio, ThurnbnailGlFPictureRatio), PRODUCI(PornLRCoefficientHigh DepthG IEPi ctureCount, High DepthGlFPictureCount), PROD UCI(Porn LRCoeff icientMediumDepthG IFPictureCount, Medium DepthGlIFPictureCount), PROD UCT(Porn LRCoeffi cient LowDepthG I FP ictu reCou nt, LowDepthGlFPictureCount), PROD UCI(Porn LRCoefficientHighDepthGIFPictureRatio, HighDepthGlFPictureRatio), Substitute Sheet (Rule 26) RO/AU WO 00/52598 WO 0052598PCT/AUOO/00158 26 PROD UCI(Porn LRCoefficientMedi urn DepthG IFPictu reRatio, MediumDepthGFPictL'reRatio), PROD UCT(Porn LRCoeff i ci en tLowDepth G I EPi ctureRati o, LowDepthGlFPictureRatio), PROD UCT(Porn LRCoeff iciefltAl IJ PEGcPictureCount, Al IJPEcPictureCount), PROD UCI(Porn LRCoefficientLargei PEG PictureCount, LargeJPEGPictureCouflt), PROD UCT(Porn LRCoeff icientcoliPEG Pi ctureCouflt, Iconj PEG PictureCount), PRODUCI(Porn LRCoeff icientlhumbnai Ij PEG Pi ctureCcoun L, ThumbnaiIj PEG Pi ctureCount), PROD UCT(Porn LRCoeff icientLargej PEG Pi ctureRatio, 1 5 LargeJPEGPictureRatio), PROD UCT(Porn LRCoefficientlconi PEG PictujreRatio, Iconj PEG PictureRatio), PROD UCI(Porn LRCoefficientThumbflai IJ PEG Pi cturePatio, ThumbnailjPEGPictureRatio), PROD UCT(Porn LRCoeff ici entH igh Depthj PEG Pi ctureCou nt, H ighDepthj PEG PictureCount),\ PROD UCT(Porn LRCoefficientMed i um Depthj PEG PictureCouflt, Mediu~mDepthjPEGPictureCouflt), PROD UCT(Porn LRCoeff ici entLowDepthj PEG Pictu reCount, LowDepthjPEGPictureCouflt), PROD UCI(Porn LRCoeffi cientH igh Depthj PEGPictu reRati o, High Depthj PEG Pi ctureRatio), PRO DUCT(Porn LRCoefficientMedi urn Depthj PEG PictureRatio, Medi ujmDepthj PEG PictureRatio), PROD UCI(Porn LRCoeff icientLowDepthj PEG PictureRatio, LowDepthj PEG Pi ctureRatio), Substitute Sheet (Rule 26) RO/AU WO 00/52598 WO 0052598PCT/AUOO/O01 58 27 PROD UCI(Porn LRCoeff icientPorn Lin kOutRatio, Porn LinkOutRatio),\ PROD UCT(Porn LRCoeff icientAVStinkOutCount, AVSLinkOutCount),\ 
PRODbUCI(Porn LRCoeffi cientAVS L inkOutRati o, AVSLinkOutRatio)) #Probability Analysers ProbabilIityAnalyser Param PornAltMetaWordCountProbabilIity Classification PORN Probabi I ityAnalyser Exp PornAltMetaWordCountProbabi I ity\ S UM(PornExtraHardMetaWordCount, Porn HardMetaWordCount,\ PROD UCT(O. 5, Porn Med iu mMetaWordCount),\ PornExtraHardAlternateWordCount, Porn HardAlternateWordCount, PROD UCI(0. 5, Porn Med i umAlternateWordCount)) ProbabilityAnalyser Param Porn MetaWord Rati oProbabilIity Classification PORN Probabi I ityAnalyser Exp PornMetaWord RatioProbabilIity\ PROD UCT( 100, S UM(Porn MetaWord RatioE xtra Hard,\ PornMetaWordRatioHard, Porn MetaWord Rati oMed i urn)) ProbabilityAnalyser Param Porn LRProbabilI ity Classification PORN ProbabilityAnalyser Exp Porn LRProbabilIity PRODUCI(100, RATIO( 1,SUM( 1, EXP(MI N US(Porn LRLogOdds))))) Substitute Sheet (Rule 26) RO/AU
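The PornLRLogOdds and PornLRProbability expressions in the listing above amount to a logistic regression: a constant plus a weighted sum of page features, passed through the logistic function and scaled to a percentage. The following Python sketch illustrates that arithmetic only. It uses the constant and four of the non-zero coefficients from the listing; the function names, the treatment of a zero denominator in RATIO, and the sample page values are assumptions made for illustration and are not taken from the specification.

```python
import math

# Constant and a subset of the non-zero coefficients, copied from the listing.
PORN_LR_CONSTANT = -3.9869
PORN_LR_COEFFICIENTS = {
    "PornTextWordRatioExtraHard": 39.7450,
    "PornTextWordRatioHard": 355.0550,
    "PornHardDomainWordCount": 2.1344,
    "PornLinkOutRatio": 4.6958,
}


def ratio(numerator, denominator):
    """RATIO(a, b). Returning 0.0 for a zero denominator is an assumption."""
    return numerator / denominator if denominator else 0.0


def log_odds(features):
    """SUM(constant, PRODUCT(coefficient, feature), ...).

    Features missing from the mapping are treated as 0 (an assumption).
    """
    return PORN_LR_CONSTANT + sum(
        coefficient * features.get(name, 0.0)
        for name, coefficient in PORN_LR_COEFFICIENTS.items()
    )


def lr_probability(features):
    """PRODUCT(100, RATIO(1, SUM(1, EXP(MINUS(log odds)))))."""
    return 100.0 * ratio(1.0, 1.0 + math.exp(-log_odds(features)))


if __name__ == "__main__":
    # Hypothetical page: 1000 body words, 12 of them on the "hard" word list.
    page = {"PornTextWordRatioHard": ratio(12, 1000)}
    print(f"{lr_probability(page):.1f}")
```

With every feature at zero the result depends only on the constant, so the negative constant gives a low baseline probability, and each positive coefficient raises the score as its feature grows.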
AU28959/00A 1999-03-04 2000-03-06 Apparatus and system for classifying and control access to information Ceased AU761017B2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
AUPP9048A AUPP904899A0 (en) 1999-03-04 1999-03-04 Apparatus and system for classifying and control access to information
AUPP9048 1999-03-04
AU28959/00A AU761017B2 (en) 1999-03-04 2000-03-06 Apparatus and system for classifying and control access to information
PCT/AU2000/000158 WO2000052598A1 (en) 1999-03-04 2000-03-06 Apparatus and system for classifying and control access to information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU28959/00A AU761017B2 (en) 1999-03-04 2000-03-06 Apparatus and system for classifying and control access to information

Publications (2)

Publication Number Publication Date
AU2895900A AU2895900A (en) 2000-09-21
AU761017B2 true AU761017B2 (en) 2003-05-29

Family

ID=25620890

Family Applications (1)

Application Number Title Priority Date Filing Date
AU28959/00A Ceased AU761017B2 (en) 1999-03-04 2000-03-06 Apparatus and system for classifying and control access to information

Country Status (1)

Country Link
AU (1) AU761017B2 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5835905A (en) * 1997-04-09 1998-11-10 Xerox Corporation System for predicting documents relevant to focus documents by spreading activation through network representations of a linked collection of documents

Also Published As

Publication number Publication date
AU2895900A (en) 2000-09-21

Legal Events

Date Code Title Description
FGA Letters patent sealed or granted (standard patent)