US7613766B2 - Apparatus and method for linguistic scoring - Google Patents
- Publication number
- US7613766B2 (application US10/748,677)
- Authority
- US
- United States
- Prior art keywords
- trigger
- triggers
- requisite
- category
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
Definitions
- the invention relates to linguistic analysis.
- the invention relates to systems and methods for scoring textual data based on relevance of the textual data to one or more pre-defined and/or custom categories.
- Network-based communications, for example those enabled by the Internet, have made available a wide variety of data to network users. But not all data types are appropriate for all user types. For example, a parent may seek to protect their children from pornographic Web sites, and an employer may seek to prevent hate speech or other categories of communications within their private enterprise. Accordingly, systems and methods have been developed to monitor network-based communications so that access to such data can then be blocked or reported, for example.
- a system receives selections from a user based on a list of pre-defined monitoring categories and/or optionally receives custom category definitions from the user.
- the option for custom category definitions may be advantageous due to the flexibility provided to a system administrator or other user.
- the pre-defined and/or custom monitoring categories may be or include complex hierarchical behavior. Such an approach provides monitoring algorithms that can achieve improved accuracy compared to known methods.
- the computations used in resolving a monitoring category may be re-ordered, statically and/or dynamically, to improve the efficiency of monitoring operations.
- FIG. 1 is a functional architecture for a linguistic analysis system, according to an embodiment of the invention.
- FIG. 2 is a process flow diagram of a method for performing linguistic analysis, according to an embodiment of the invention.
- FIG. 3 is a process flow diagram of a method for performing linguistic analysis, according to an embodiment of the invention.
- FIG. 4 is a schematic diagram of a trigger, according to an embodiment of the invention.
- FIG. 5 is a schematic diagram of an ordered list of pre-requisite triggers, according to an embodiment of the invention.
- FIG. 6 is a process flow diagram of a method for performing linguistic analysis, according to an embodiment of the invention.
- FIG. 7 is a process flow diagram of a method for performing linguistic analysis, according to an embodiment of the invention.
- FIG. 8A is a process flow diagram of a method for performing linguistic analysis, according to an embodiment of the invention.
- FIG. 8B is an illustration of a truth table for performing linguistic analysis, according to an embodiment of the invention.
- FIG. 9 is a process flow diagram for a dynamic reordering method, according to an embodiment of the invention.
- Scoring refers to the underlying computations required in determining whether a category is a hit (e.g., whether or not the data source has been resolved to be within a particular category). Scoring is then described as a complex aggregate behavior, where, for example, a category definition may include multiple pre-requisite triggers.
- a trigger is a regular expression (regex) or other code that performs a textual search function. Accordingly, a discussion is provided on how such linguistic triggers may be aggregated, how such triggers may be constructed, and how complex aggregated behavior may be simplified.
- threshold scoring includes a description of static re-ordering of pre-requisite triggers to improve scoring efficiency. Exemplary embodiments are also provided for Boolean logic scoring behavior using two or more pre-requisite triggers. The detailed description concludes with a discussion of dynamic re-ordering of pre-requisite triggers, which may be applied to Boolean scoring behavior and/or threshold scoring behavior as another way to improve the efficiency of linguistic scoring.
- FIG. 1 is a functional architecture for a linguistic analysis system, according to an embodiment of the invention.
- A linguistic analysis system includes an Internet 102, a Web page host 104, an email server 106, a router/firewall 108, a Linguistic Analysis Server (LAS) 110, an intranet 112, and network clients 114, 116 and 118.
- The email server 106, router/firewall 108, LAS 110, and clients 114, 116 and 118 are coupled to the intranet 112, and the Internet 102 is coupled to the router/firewall 108 and the Web page host 104.
- The LAS 110 monitors data communications on intranet 112 associated with one or more clients 114, 116 and/or 118.
- The LAS 110 may be configured to monitor email communications, chat, instant messaging (IM), point-to-point (P2P) communications, File Transfer Protocol (FTP) communications, and/or URL-based Web browser communications.
- Communications monitored by the LAS 110 may be communications local to the intranet 112 and/or between any one of clients 114, 116, and 118 and the Internet 102, for example.
- The LAS 110 may be or include, for example, a computer having an Intel 3 GHz processor, 2 GB of Random Access Memory (RAM), a 120 GB hard drive, a Compact Disc Read-Only Memory (CD ROM), and a Red Hat Linux Operating System (OS).
- The clients 114, 116, and/or 118 may be or include, for example, a personal computer, a Personal Data Assistant (PDA), a Web-enabled telephone, or other networkable user interface device.
- Internet 102, Web page host 104, email server 106, and router/firewall 108 are optional system components.
- Intranet 112 and/or Internet 102 may be replaced, for example, by a Local Area Network (LAN), Wide Area Network (WAN), or other wired or wireless network configuration.
- The LAS 110 may only monitor traffic local to the intranet 112, or only between, for example, clients 114, 116, and 118 and the Internet 102.
- The functionality of LAS 110 may reside in, for example, email server 106, router/firewall 108, and/or in each of the clients 114, 116, and 118.
- The linguistic analysis processes described below with reference to FIGS. 2, 3, and 6-9 may be implemented with computer-executable code. Moreover, such code may be stored on a CD ROM, hard drive, or other data storage medium and/or loaded into RAM for execution by a processor. For example, code for performing the processes described herein may be stored in the 120 GB hard drive of the LAS 110, loaded into the RAM of the LAS 110, and executed by the 3 GHz processor of the LAS 110.
- FIG. 2 is a process flow diagram of a method for performing linguistic analysis, according to an embodiment of the invention.
- FIG. 2 is depicted from the perspective of LAS 110 .
- the process begins by receiving a selection from a list of pre-defined categories in step 202 .
- the predefined categories may be, for instance, categories such as: adult, confidential, conflict, gambling, games, merger and acquisition, Vietnamese, resignation, shopping, sports, substance abuse, stock trading, and/or other predefined data category.
- A system administrator or other user of LAS 110 may select the predefined categories based on an Approved Usage Policy (AUP) for a corporation, or based on other criteria.
- In step 204, the LAS 110 optionally receives a custom category definition.
- a custom category definition may be based on one or more of the predefined categories. For example, in the case where a user has selected the predefined category of mergers and acquisitions, a user may further specify that when a hit is resolved for the predefined category of mergers and acquisitions, a custom category is resolved based on a particular company name. Accordingly, the form of a custom category definition may include both search criteria (e.g., a particular company name) and a link to a selected category (e.g., mergers and acquisitions).
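The two-part form of a custom category definition (search criteria plus a link to a selected category) can be sketched as follows. This is a minimal illustration, not the patented implementation: the `CustomCategory` class, the `resolve_custom` function, and the company name "Acme Corp" are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class CustomCategory:
    name: str
    linked_category: str  # predefined category that must already be a hit
    pattern: str          # search criteria, e.g. a particular company name


def resolve_custom(category: CustomCategory, hits: set, text: str) -> bool:
    """Resolve the custom category only when its linked category is a hit."""
    if category.linked_category not in hits:
        return False
    return re.search(category.pattern, text, re.IGNORECASE) is not None


# Hypothetical custom category built on the predefined M&A category.
acme_deal = CustomCategory("acme_merger", "merger and acquisition", r"\bAcme Corp\b")
```

Under this assumption, the company-name search is skipped entirely unless the predefined category has already resolved positive.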
- In step 206, the LAS 110 prepares the data source for analysis.
- Step 206 may include collecting data from a data stream, a file system, database, or other data source.
- Step 206 may further include, in combination with, or in the alternative to collecting data, partitioning the data into sessions, groups of sessions, or other logical group(s) for analysis.
- For example, the LAS 110 may collect an email correspondence and its reply from email server 106 for linguistic scoring.
- In step 208, the LAS 110 performs scoring of input data sources resulting from step 206 against the selected predefined categories and/or custom categories received in steps 202 and 204, respectively.
- In step 210, the system performs predetermined action(s) for each of the selected and/or custom categories that is resolved as a hit (also referred to herein as resolved-positive).
- Such action may include, for instance, blocking a URL, alerting an administrator via email, pager, or Simple Network Management Protocol (SNMP) trap, or logging data for later review by a system administrator, manager, or other user.
- a trigger is a regular expression (regex) or other code that performs a textual search function.
- a category is a named trigger. Triggers and/or categories can be arranged into a hierarchy of complex aggregate behavior, as illustrated in FIG. 3 and described below.
- FIG. 3 is a process flow diagram of a method for performing linguistic analysis, according to an embodiment of the invention.
- Data source 302 is a pre-requisite for resolution of triggers 304, 306, 312, 314, and 316.
- Triggers 304 and 306 are pre-requisite triggers (or contained triggers) for containing trigger 310.
- Triggers 310 and 312 are pre-requisite triggers for category 318.
- Triggers 312 and 314 are pre-requisite triggers for category 320.
- Category 320 and trigger 316 are pre-requisite triggers for category 324.
- a predefined score is associated with each trigger.
- The scores of all contained triggers are used in resolving the containing trigger. For example, if both triggers 310 and 312 are resolved positive (determined to be hits), then category 318 would be resolved using the predefined scores from triggers 310 and 312.
- FIG. 3 illustrates that a score may be modified in resolving a containing trigger. For example, if trigger 304 is resolved as a hit, then the score associated with trigger 304 is increased by 5, as illustrated by addition operator 308, in resolving trigger 310.
- The effect of addition operator 308 is to add increased importance to trigger 304 in resolving trigger 310.
- Subtraction, multiplication, and/or division operators could be used to similar effect.
- The addition operator 308 is a property of the containing trigger 310. The reason for this is more apparent when considering the relative importance of trigger 312 in FIG. 3: if trigger 312 is a hit, its score is not modified in resolving category 318, but is increased by 10 in resolving category 320.
- Another way that a score can be modified is with a negation operator.
- The score associated with trigger 316 is negated by negation operator 322 in resolving category 324.
- The negation operator is a property of the containing trigger.
- Trigger 316, category 324, and associated links are illustrated in dashed lines to indicate that category 324 may be a custom category rather than a predefined category.
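Because the addition and negation operators belong to the containing trigger rather than to the contained trigger, one way to model them is as attributes of the link between the two. The sketch below assumes this edge-based model; the `Edge` class, the `contribution` function, and the base scores (4 for trigger 312, 3 for trigger 316) are hypothetical, since the text does not give the predefined scores.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    """Link from a containing trigger to one pre-requisite trigger."""
    prerequisite: str
    add: int = 0          # e.g. addition operator 308 would be add=5
    negate: bool = False  # e.g. negation operator 322


def contribution(edge: Edge, base_scores: dict, hits: set) -> int:
    """Score the pre-requisite contributes to this particular container."""
    if edge.prerequisite not in hits:
        return 0
    value = base_scores[edge.prerequisite] + edge.add
    return -value if edge.negate else value


base_scores = {"312": 4, "316": 3}  # hypothetical predefined scores
hits = {"312", "316"}

in_318 = contribution(Edge("312"), base_scores, hits)               # unmodified
in_320 = contribution(Edge("312", add=10), base_scores, hits)       # increased by 10
in_324 = contribution(Edge("316", negate=True), base_scores, hits)  # negated
```

The same trigger 312 therefore contributes different values to categories 318 and 320 without its own predefined score changing.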
- FIG. 4 is a schematic diagram of a trigger, according to an embodiment of the invention.
- A trigger may include status data 404, invert data 406, threshold data 408, tally data 410, an ordered list of pre-requisite triggers 412, a pattern tuple 414, a list of triggers that are potentially updated if the status of the current trigger becomes resolved-positive 416, a list of triggers that are potentially updated if the status of the current trigger becomes resolved-negative 418, a user-specified name (e.g., a category name) 420, and a list of actions 422 if the category is resolved positive.
- Status data 404 may be unresolved, resolved-positive, or resolved-negative. The effect of the resolved status may be inverted according to invert data 406.
- Threshold data 408 is a predetermined number that may be used to resolve a trigger. For example, if a containing trigger has a threshold of 5, and the only pre-requisite trigger has been resolved positive and has a score of 6, then the threshold of the containing trigger has been exceeded, and the containing trigger is resolved-positive.
- the tally 410 is a parameter (e.g., a running total) that reflects the effect of all pre-requisite triggers that have been considered in resolving the containing trigger.
- the ordered list of pre-requisite triggers 412 provides information about the contained triggers (used if the status of the containing trigger is unresolved), and will be described in more detail with reference to FIG. 5 below.
- Pattern Tuple 414 includes a reference to a particular pattern-evaluation engine.
- Potential pattern-evaluation engines include regular expression engines, string matchers, numeric and character comparisons, IP-in-network/netmask-range, “always true” and “always false”.
- Pattern Tuple 414 may further include a reference to some data. This may be “raw” data, the result of applying transformations to the raw data, or data related to the raw data. One example transformation is converting all uppercase letters to lowercase.
- Related data includes the length of the data. If the data is extracted from network traffic, related data may also include the IPs of the involved hosts or information associated with the IPs of the involved hosts.
- related data may also include the name of the file, permissions of the file, and owner(s) of the file.
- evaluation of a pattern tuple may generate more data that subsequently may be used in other pattern tuples.
- This additional data, which may also be included in pattern tuple 414, may include a number of times the pattern matched, offsets from the beginning of the data to the beginning or end of matched data, etc.
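A pattern tuple pairs a pattern-evaluation engine with the data it is applied to. The following sketch illustrates that pairing using Python's standard `re` and `ipaddress` modules; the `ENGINES` registry, the engine names, and the `evaluate_tuple` function are hypothetical and stand in for whatever engines an embodiment actually provides.

```python
import ipaddress
import re

# Hypothetical registry of pattern-evaluation engines, keyed by name.
ENGINES = {
    "regex": lambda pattern, data: re.search(pattern, data) is not None,
    "substring": lambda pattern, data: pattern in data,
    "ip_in_network": lambda pattern, data: (
        ipaddress.ip_address(data) in ipaddress.ip_network(pattern)
    ),
    "always_true": lambda pattern, data: True,
    "always_false": lambda pattern, data: False,
}


def evaluate_tuple(engine: str, pattern: str, data: str, lowercase: bool = False) -> bool:
    """Apply an optional transformation to the data, then run the named engine."""
    if lowercase:
        data = data.lower()  # example transformation: uppercase to lowercase
    return ENGINES[engine](pattern, data)
```

For instance, `evaluate_tuple("ip_in_network", "10.0.0.0/8", "10.1.2.3")` checks related data (a host IP) rather than the raw text itself.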
- For example, with reference to FIG. 3, trigger 312 would include category 318 and category 320 in its list 416 of triggers that are potentially updated if it becomes resolved-positive.
- The list of triggers that are potentially updated if the status of the current trigger becomes resolved-negative 418 is also self-descriptive. Such cases may arise, for instance, where the data is inverted. For example, consider a gambling trigger containing a news story pre-requisite trigger, where the news story pre-requisite trigger has invert data 406. In this case, the gambling trigger is only evaluated if the news trigger is not a hit. The effect is that gambling is not scored for news stories related to gambling.
- Complex aggregate behavior models may be simplified with reference to data included in trigger/category 402 .
- two or more triggers containing the same pattern tuple may be collapsed into exactly one trigger so a pattern tuple is never evaluated more than once.
- When triggers are collapsed in this way, their resolved-positive output lists 416 and resolved-negative output lists 418 are appended.
- Similarly, two or more triggers containing an identical list of prerequisite triggers 504, respective scores 506, and respective negate statuses 406 may be collapsed into exactly one trigger so the list is never evaluated more than once.
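Collapsing triggers that share a pattern tuple can be sketched as a grouping step that appends the output lists of the merged triggers. The dictionary layout and the `collapse_by_pattern` function below are assumptions made for illustration only.

```python
def collapse_by_pattern(triggers: list) -> list:
    """Merge triggers with an identical pattern tuple into exactly one trigger."""
    merged = {}
    for trig in triggers:
        key = trig["pattern_tuple"]
        if key not in merged:
            merged[key] = {"pattern_tuple": key, "on_positive": [], "on_negative": []}
        # Append the output lists so every dependent trigger is still updated.
        merged[key]["on_positive"].extend(trig["on_positive"])
        merged[key]["on_negative"].extend(trig["on_negative"])
    return list(merged.values())


# Hypothetical triggers: the first two share the same pattern tuple.
triggers = [
    {"pattern_tuple": ("regex", "casino"), "on_positive": ["gambling"], "on_negative": []},
    {"pattern_tuple": ("regex", "casino"), "on_positive": ["travel"], "on_negative": ["news"]},
    {"pattern_tuple": ("substring", "poker"), "on_positive": ["gambling"], "on_negative": []},
]
collapsed = collapse_by_pattern(triggers)
```

After collapsing, the shared pattern is evaluated once and its result is propagated to both dependent lists.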
- the system may be configured so that only categories having at least one action 422 (and all prerequisite triggers of such categories) are loaded into RAM and/or resolved.
- For example, with reference to FIG. 3, if categories 320 and 324 each included actions 422, but category 318 did not include any actions 422, then trigger 304, trigger 306, trigger 310, and category 318 would not be loaded into RAM and/or would not be resolved.
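The pruning described here amounts to a reachability walk from the categories that have actions down through their pre-requisite triggers. The sketch below encodes the FIG. 3 hierarchy as a plain dictionary; the `triggers_to_load` function and the string identifiers are illustrative assumptions.

```python
def triggers_to_load(prerequisites: dict, with_actions: set) -> set:
    """Return categories with actions plus all of their pre-requisites."""
    needed = set()
    stack = list(with_actions)
    while stack:
        node = stack.pop()
        if node in needed:
            continue
        needed.add(node)
        stack.extend(prerequisites.get(node, []))
    return needed


# Pre-requisite relationships as described for FIG. 3.
fig3 = {
    "310": ["304", "306"],
    "318": ["310", "312"],
    "320": ["312", "314"],
    "324": ["320", "316"],
}
loaded = triggers_to_load(fig3, {"320", "324"})
```

With actions only on categories 320 and 324, the walk never reaches 304, 306, 310, or 318, matching the example in the text.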
- FIG. 5 is a schematic diagram of an ordered list of pre-requisite triggers, according to an embodiment of the invention.
- An ordered list 502 includes a list of prerequisite triggers 504, a list of scores for each of the prerequisite triggers 506, a total for all subsequent positive scores 508, and a total for all subsequent negative scores 510.
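One plausible reading of totals 508 and 510 is that, for each position in the ordered list, they record the sum of the positive scores and the sum of the negative scores that come after that position. The sketch below computes them under that assumption; the `subsequent_totals` function and the example scores are hypothetical.

```python
def subsequent_totals(scores: list) -> tuple:
    """For each position, total the positive and negative scores that follow it."""
    pos = [0] * len(scores)
    neg = [0] * len(scores)
    run_pos = run_neg = 0
    for i in range(len(scores) - 1, -1, -1):
        pos[i], neg[i] = run_pos, run_neg
        if scores[i] > 0:
            run_pos += scores[i]
        else:
            run_neg += scores[i]
    return pos, neg


# Hypothetical ordered scores for three pre-requisite triggers.
positive_after, negative_after = subsequent_totals([13, -2, 1])
```

Totals of this kind would let a scorer bound how far the tally can still move without evaluating the remaining triggers.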
- FIG. 6 is a process flow diagram of a method for performing linguistic analysis, according to an embodiment of the invention. To illustrate the operation of the process in FIG. 6, consider a containing trigger having three pre-requisite triggers: trigger A is associated with a score of −2, trigger B is associated with a score of +1, and trigger C is associated with a score of +13.
- The process begins in step 602 with receiving a data source.
- In step 604, the tally for a containing trigger is set equal to zero.
- In step 606, the system orders contained triggers based on decreasing absolute value of scores. In the example presented, the contained triggers would be ordered: C, A, and B in step 606.
- the system may execute step 606 using the list of prerequisite triggers 504 and the list of scores for each of the prerequisite triggers 506 .
- Step 606 is an example of static re-ordering of triggers within a complex aggregate behavior.
- In step 608, the process selects the first or next trigger (in the preceding example, trigger C would be selected first).
- In conditional step 618, it is determined whether the process is done. In other words, in step 618, it is determined whether all contained triggers have been evaluated. Where the result of conditional step 618 is in the affirmative, the process advances to step 620 where the containing trigger is identified as a non-hit (resolved negative). On the other hand, where the result of conditional step 618 is in the negative, the process advances to step 608 to select the next contained trigger (as ordered in step 606) before returning to conditional step 610.
- step 614 operates to provide an early exit where a containing trigger can be resolved by evaluating less than all pre-requisite triggers.
- the effect of ordering step 606 and selection step 608 is to further improve the efficiency of a trigger having an early exit feature.
- a trigger may be configured to perform a Boolean logic function. In such cases, the predetermined threshold is zero.
- FIG. 7 is a process flow diagram of a method for performing linguistic analysis, according to an embodiment of the invention.
- FIG. 7 illustrates a logical AND function for a category having prerequisite triggers identified as a first trigger and a second trigger.
- FIG. 7 further illustrates the application of a pattern tuple.
- the process begins in step 702 by receiving a data source.
- the process advances to conditional step 704 where it is determined whether the input data source is from a particular source account X.
- a source account may be an alias associated with any description of source.
- a source account may be an alias associated with From, MAIL FROM, and Reply To fields.
- the process advances to step 718 where the category tally is set to −MAX, and the category is a non-hit (resolved negative) in step 720.
- Steps 704 and 718 may be based on a pattern tuple 414.
- In step 706, it is determined whether the first trigger is a hit.
- In step 708, it is determined whether the score for the first trigger is >0.
- In step 710, it is determined whether the second trigger is a hit.
- In step 712, it is determined whether the score for the second trigger is >0.
- The category is a hit (resolved positive) in step 714 and the process will terminate with actions in step 716.
- Otherwise, the process advances to step 720, indicating a non-hit of the category.
- The category is a hit only when both the first trigger and the second trigger are hits, and where their associated scores are greater than zero.
- FIG. 7 also illustrates that where −MAX is applied to a trigger tally, the trigger is immediately considered to be a non-hit.
- FIG. 7 also illustrates an early exit for the case where the first trigger is not a hit (since in this instance, the second trigger is not evaluated).
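The logical AND behavior of FIG. 7 can be sketched as a short-circuiting function. Because the text does not state which outcome of conditional step 704 leads to step 718, the sketch takes the result of the pattern-tuple check as a boolean argument; `and_category`, `tuple_check_passed`, and the zero-argument evaluator callables are hypothetical names.

```python
from typing import Callable, Tuple

# A trigger evaluator returns (is_hit, score) when called.
Evaluator = Callable[[], Tuple[bool, int]]


def and_category(tuple_check_passed: bool, first: Evaluator, second: Evaluator) -> bool:
    """Hit only if both triggers are hits with scores greater than zero."""
    if not tuple_check_passed:
        return False  # tally would be set to -MAX: immediate non-hit
    hit, score = first()
    if not (hit and score > 0):
        return False  # early exit: the second trigger is never evaluated
    hit, score = second()
    return hit and score > 0
```

Passing evaluators as callables makes the early exit observable: when the first trigger fails, the second callable is simply never invoked.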
- FIG. 8A is a process flow diagram of a method for performing linguistic analysis, according to an embodiment of the invention.
- FIG. 8A illustrates a logical OR function for a category having prerequisite triggers identified as a first trigger and a second trigger.
- FIG. 8A further illustrates the application of a pattern tuple.
- The process begins in step 802 with receiving a data source.
- Steps 804 and 814 may be based on a pattern tuple 414 .
- Conditional step 806 determines whether the first trigger is a hit. Where the result of conditional step 806 is in the affirmative, the process advances to step 808 where it is determined whether the score for the first trigger is >0. Where the result of step 808 is in the affirmative, then the process advances to step 810, indicating that the category is a hit (resolved positive). Then, in step 812, appropriate action for the category is performed.
- conditional step 816 determines whether the second trigger is a hit.
- step 818 determines whether the score for the second trigger is >0.
- the process advances to step 810 , indicating that the category is a hit.
- the process advances to step 820 , indicating that the category is a non-hit.
- FIG. 8A illustrates that the category will be a hit where either the first trigger is a hit and has a score greater than zero, or where the second trigger is a hit and has a score greater than zero.
- FIG. 8A also illustrates that where −MAX is applied to a trigger tally, the trigger is immediately considered to be a non-hit.
- FIG. 8A further illustrates an early exit function, since the category is resolved positive if it is determined that the first trigger is a hit and has a score >0.
- FIG. 8B is an illustration of a truth table for performing linguistic analysis, according to an embodiment of the invention.
- FIG. 8B is a truth table for a category having a logical OR function based on 1st and 2nd pre-requisite triggers.
- The category also includes a pattern tuple that is seeking to match a particular IP address.
- Column 822 indicates whether the IP address of the input data is 123.45.678.910; column 824 indicates whether the 1st trigger score is >0; column 826 indicates whether the 2nd trigger score is >0; and column 828 indicates whether the category result will be a hit (resolved positive) or a non-hit (resolved negative).
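A truth table of the kind shown in FIG. 8B can be generated programmatically. The figure itself is not reproduced in this text, so the sketch assumes that the category can only be a hit when the pattern-tuple check passes, and that the check's polarity is decided by the caller; `or_category` and `truth_table` are hypothetical names.

```python
from itertools import product


def or_category(tuple_check_passed: bool, first_positive: bool, second_positive: bool) -> bool:
    """Hit when the tuple check passes and either trigger score is > 0."""
    return tuple_check_passed and (first_positive or second_positive)


# Rows: (tuple check, 1st trigger score > 0, 2nd trigger score > 0, category hit)
truth_table = [
    (t, a, b, or_category(t, a, b))
    for t, a, b in product([True, False], repeat=3)
]
```

Of the eight input combinations, three resolve positive under this assumption, all of them in rows where the pattern-tuple check passes.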
- Triggers may include other Boolean logic operations. For example, since a result may be inverted (a logical NOT), the AND and OR functions described above may be combined to produce an Exclusive OR (XOR) function. Thus, where p and q are pre-requisite triggers, p XOR q could be implemented via the expression (p AND (NOT q)) OR ((NOT p) AND q).
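The composition of XOR from AND, OR, and NOT, that is, (p AND (NOT q)) OR ((NOT p) AND q), can be checked directly. The function name `xor` is chosen for illustration; p and q here are the resolved hit statuses of two pre-requisite triggers.

```python
def xor(p: bool, q: bool) -> bool:
    """Exclusive OR built only from AND, OR, and NOT."""
    return (p and not q) or (not p and q)


# Compare against Python's built-in inequality on booleans.
xor_matches = all(xor(p, q) == (p != q) for p in (True, False) for q in (True, False))
```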
- FIG. 9 is a process flow diagram for a dynamic reordering method, according to an embodiment of the invention.
- the process begins in step 902 by initializing an Avoid Evaluation of This Trigger (AEOTT) rating.
- the process evaluates a first or next data source (e.g., resolves a pre-requisite trigger for the first or next data source).
- an AEOTT rating can be either incremented or decremented based on whether it is determined in step 906 that the contained trigger caused an early exit. For example, with reference to FIG. 7 , where a higher AEOTT causes a pre-requisite trigger to be evaluated later, and where it is determined that the first trigger did not cause an early exit, the AEOTT rating for the first trigger would be increased. Over time, the result is that the trigger most likely to cause an early exit (a non-hit in the case of an AND function) will be evaluated prior to other pre-requisite triggers.
- adaptive reordering could be applied to pattern tuples.
- adaptive or dynamic reordering could be applied to threshold scoring in combination with, or in the alternative to, static trigger ordering described with reference to FIG. 6 .
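Dynamic re-ordering with AEOTT ratings can be sketched for an AND-style category as follows. The class name `AdaptiveAnd`, the step size of 1, and the decision to decrement the rating of a trigger that causes an early exit are assumptions; the text only says that ratings are incremented or decremented and that a higher rating causes later evaluation.

```python
from typing import Callable, Dict


class AdaptiveAnd:
    """AND category whose pre-requisites are re-ordered by AEOTT rating."""

    def __init__(self, triggers: Dict[str, Callable[[str], bool]]):
        self.triggers = triggers
        self.aeott = {name: 0 for name in triggers}  # initialize ratings

    def order(self) -> list:
        # Lower AEOTT rating means the trigger is evaluated earlier.
        return sorted(self.triggers, key=lambda name: self.aeott[name])

    def evaluate(self, data: str) -> bool:
        for name in self.order():
            if not self.triggers[name](data):
                self.aeott[name] -= 1  # caused an early exit: evaluate sooner
                return False
            self.aeott[name] += 1      # no early exit: evaluate later
        return True


# Hypothetical pre-requisites: one that usually hits, one that rarely hits.
category = AdaptiveAnd({
    "common": lambda text: "a" in text,
    "rare": lambda text: "zebra" in text,
})
category.evaluate("a cat")
category.evaluate("a dog")
```

After a few data sources, the rarely matching trigger moves to the front, so most non-hits are resolved after a single evaluation.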
- embodiments of the invention provide, among other things, a robust and efficient system and method for linguistic scoring.
- Those skilled in the art can readily recognize that numerous variations and substitutions may be made in the invention, its use and its configuration to achieve substantially the same results as achieved by the embodiments described herein. Accordingly, there is no intention to limit the invention to the disclosed exemplary forms. Many variations, modifications and alternative constructions fall within the scope and spirit of the disclosed invention as expressed in the claims.
- Although thresholds are expressed in terms of whether a tally is greater than a predetermined threshold, the processes could be altered so that the test is whether the tally is greater than or equal to the predetermined threshold.
- Although references are made to embodiments of the invention, all embodiments disclosed herein need not be separate embodiments. In other words, many of the features disclosed herein can be utilized in combinations not expressly illustrated.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
- (p AND (NOT q)) OR ((NOT p) AND q).
Dynamic Re-Ordering
Claims (1)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/748,677 US7613766B2 (en) | 2003-12-31 | 2003-12-31 | Apparatus and method for linguistic scoring |
US12/233,323 US8234328B2 (en) | 2003-12-31 | 2008-09-18 | Apparatus and method for linguistic scoring |
US13/535,332 US8620642B2 (en) | 2003-12-31 | 2012-06-27 | Apparatus and method for linguistic scoring |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/748,677 US7613766B2 (en) | 2003-12-31 | 2003-12-31 | Apparatus and method for linguistic scoring |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/233,323 Continuation US8234328B2 (en) | 2003-12-31 | 2008-09-18 | Apparatus and method for linguistic scoring |
Publications (2)
Publication Number | Publication Date |
---|---|
US20050149317A1 (en) | 2005-07-07 |
US7613766B2 (en) | 2009-11-03 |
Family
ID=34710964
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/748,677 Active 2026-03-07 US7613766B2 (en) | 2003-12-31 | 2003-12-31 | Apparatus and method for linguistic scoring |
US12/233,323 Active 2025-10-28 US8234328B2 (en) | 2003-12-31 | 2008-09-18 | Apparatus and method for linguistic scoring |
US13/535,332 Expired - Lifetime US8620642B2 (en) | 2003-12-31 | 2012-06-27 | Apparatus and method for linguistic scoring |
Family Applications After (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/233,323 Active 2025-10-28 US8234328B2 (en) | 2003-12-31 | 2008-09-18 | Apparatus and method for linguistic scoring |
US13/535,332 Expired - Lifetime US8620642B2 (en) | 2003-12-31 | 2012-06-27 | Apparatus and method for linguistic scoring |
Country Status (1)
Country | Link |
---|---|
US (3) | US7613766B2 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060150249A1 (en) * | 2003-05-07 | 2006-07-06 | Derek Gassen | Method and apparatus for predictive and actual intrusion detection on a network |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006526424A (en) * | 2003-06-04 | 2006-11-24 | イニオン リミテッド | Biodegradable implant and method for producing the same |
US20050283357A1 (en) * | 2004-06-22 | 2005-12-22 | Microsoft Corporation | Text mining method |
US7716210B2 (en) * | 2006-12-20 | 2010-05-11 | International Business Machines Corporation | Method and apparatus for XML query evaluation using early-outs and multiple passes |
CN102202007B (en) * | 2010-03-25 | 2015-02-18 | 腾讯科技(深圳)有限公司 | Method and device for automatically counting instant messaging behaviors |
US20150317337A1 (en) * | 2014-05-05 | 2015-11-05 | General Electric Company | Systems and Methods for Identifying and Driving Actionable Insights from Data |
GB2565037A (en) * | 2017-06-01 | 2019-02-06 | Spirit Al Ltd | Online user monitoring |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5832212A (en) * | 1996-04-19 | 1998-11-03 | International Business Machines Corporation | Censoring browser method and apparatus for internet viewing |
US6266664B1 (en) | 1997-10-01 | 2001-07-24 | Rulespace, Inc. | Method for scanning, analyzing and rating digital information content |
US20020004907A1 (en) * | 2000-01-12 | 2002-01-10 | Donahue Thomas P. | Employee internet management device |
US6453345B2 (en) * | 1996-11-06 | 2002-09-17 | Datadirect Networks, Inc. | Network security and surveillance system |
US6477571B1 (en) | 1998-08-11 | 2002-11-05 | Computer Associates Think, Inc. | Transaction recognition and prediction using regular expressions |
US7032007B2 (en) * | 2001-12-05 | 2006-04-18 | International Business Machines Corporation | Apparatus and method for monitoring instant messaging accounts |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5699287A (en) * | 1992-09-30 | 1997-12-16 | Texas Instruments Incorporated | Method and device for adding and subtracting thermometer coded data |
US6487666B1 (en) * | 1999-01-15 | 2002-11-26 | Cisco Technology, Inc. | Intrusion detection signature analysis using regular expressions and logical operators |
US7676822B2 (en) * | 2001-01-11 | 2010-03-09 | Thomson Licensing | Automatic on-screen display of auxiliary information |
US20050033849A1 (en) * | 2002-06-20 | 2005-02-10 | Bellsouth Intellectual Property Corporation | Content blocking |
- 2003-12-31: US10/748,677, patent US7613766B2, Active
- 2008-09-18: US12/233,323, patent US8234328B2, Active
- 2012-06-27: US13/535,332, patent US8620642B2, Expired - Lifetime
Non-Patent Citations (2)
Title |
---|
Justin Mason, Filtering Spam with SpamAssassin, http://useast.spamassassin.orq (undated). |
Matt Sergeant, Internet Level Spam Detection and SpamAssassin 2.50, MessageLabs (undated). |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060150249A1 (en) * | 2003-05-07 | 2006-07-06 | Derek Gassen | Method and apparatus for predictive and actual intrusion detection on a network |
US8640234B2 (en) | 2003-05-07 | 2014-01-28 | Trustwave Holdings, Inc. | Method and apparatus for predictive and actual intrusion detection on a network |
Also Published As
Publication number | Publication date |
---|---|
US8620642B2 (en) | 2013-12-31 |
US20090119094A1 (en) | 2009-05-07 |
US8234328B2 (en) | 2012-07-31 |
US20120271626A1 (en) | 2012-10-25 |
US20050149317A1 (en) | 2005-07-07 |
Similar Documents
Publication | Title |
---|---|
US8620642B2 (en) | Apparatus and method for linguistic scoring |
US10560471B2 (en) | Detecting web exploit kits by tree-based structural similarity search |
US11503044B2 (en) | Method computing device for detecting malicious domain names in network traffic |
Marchal et al. | PhishStorm: Detecting phishing with streaming analytics |
Schechter et al. | Using path profiles to predict HTTP requests |
US10044737B2 (en) | Detection of beaconing behavior in network traffic |
US8260914B1 (en) | Detecting DNS fast-flux anomalies |
US10404731B2 (en) | Method and device for detecting website attack |
US9053320B2 (en) | Method of and apparatus for identifying requestors of machine-generated requests to resolve a textual identifier |
US9058381B2 (en) | Method of and apparatus for identifying machine-generated textual identifiers |
US11128641B2 (en) | Propagating belief information about malicious and benign nodes |
Zhang et al. | Toward unsupervised protocol feature word extraction |
Kim et al. | Phishing url detection: A network-based approach robust to evasion |
US10761614B2 (en) | Enhanced context-based command line interface auto-completion using multiple command matching conditions |
US8392421B1 (en) | System and method for internet endpoint profiling |
Manna et al. | Detecting network anomalies using machine learning and SNMP-MIB dataset with IP group |
US11799904B2 (en) | Malware detection using inverse imbalance subspace searching |
WO2016173327A1 (en) | Method and device for detecting website attack |
US20210084011A1 (en) | Hardware acceleration device for string matching and range comparison |
Cheng et al. | Correlate the advanced persistent threat alerts and logs for cyber situation comprehension |
Cheng et al. | Cheetah: a space-efficient HNB-based NFAT approach to supporting network forensics |
US20230336528A1 (en) | System and method for detecting dictionary-based dga traffic |
Khukalenko et al. | Machine Learning Models Stacking in the Malicious Links Detecting |
CN110460592B (en) | URL analysis method, device, equipment and medium |
Chen et al. | A Novel Network Security Situation Awareness Model for Advanced Persistent Threat |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: VERICEPT CORPORATION, COLORADO Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BABA, DAISUKE;PHILLIPS, CHARLES DOUGLAS;REEL/FRAME:015550/0528;SIGNING DATES FROM 20040518 TO 20040524 |
AS | Assignment | Owner name: SILICON VALLEY BANK, CALIFORNIA Free format text: SECURITY AGREEMENT;ASSIGNOR:VERICEPT CORPORATION;REEL/FRAME:016310/0174 Effective date: 20041229 |
AS | Assignment | Owner name: SILICON VALLEY BANK, CALIFORNIA Free format text: SECURITY AGREEMENT;ASSIGNOR:VERICEPT CORPORATION;REEL/FRAME:018244/0529 Effective date: 20060911 |
AS | Assignment | Owner name: VENTURE LENDING & LEASING IV INC., CALIFORNIA Free format text: SECURITY INTEREST;ASSIGNOR:VERICEPT CORPORATION;REEL/FRAME:018384/0352 Effective date: 20060911 |
AS | Assignment | Owner name: SILICON VALLEY BANK, CALIFORNIA Free format text: SECURITY AGREEMENT;ASSIGNOR:TW VERICEPT CORPORATION;REEL/FRAME:023234/0194 Effective date: 20090819 |
AS | Assignment | Owner name: TW VERICEPT CORPORATION, ILLINOIS Free format text: MERGER;ASSIGNOR:VERICEPT CORPORATION;REEL/FRAME:023292/0843 Effective date: 20090826 |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
AS | Assignment | Owner name: VERICEPT CORPORATION, ILLINOIS Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:VENTURE LENDING & LEASING IV, INC.;REEL/FRAME:023750/0027 Effective date: 20091015 |
FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
AS | Assignment | Owner name: TRUSTWAVE HOLDINGS, INC., ILLINOIS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:TW VERICEPT CORPORATION;REEL/FRAME:027478/0601 Effective date: 20090826 |
AS | Assignment | Owner name: SILICON VALLEY BANK, NEW YORK Free format text: SECURITY AGREEMENT;ASSIGNOR:TRUSTWAVE HOLDINGS, INC.;REEL/FRAME:027867/0199 Effective date: 20120223 |
AS | Assignment | Owner name: SILICON VALLEY BANK, CALIFORNIA Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE ADDRESS OF THE RECEIVING PARTY PREVIOUSLY RECORDED ON REEL 027867 FRAME 0199. ASSIGNOR(S) HEREBY CONFIRMS THE SECURITY AGREEMENT;ASSIGNOR:TRUSTWAVE HOLDINGS, INC.;REEL/FRAME:027886/0058 Effective date: 20120223 |
AS | Assignment | Owner name: TW VERICEPT CORPORATION, ILLINOIS Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:SILICON VALLEY BANK;REEL/FRAME:028519/0433 Effective date: 20120709 Owner name: WELLS FARGO CAPITAL FINANCE, LLC, AS AGENT, MASSAC Free format text: SECURITY AGREEMENT;ASSIGNORS:TRUSTWAVE HOLDINGS, INC.;TW SECURITY CORP.;REEL/FRAME:028518/0700 Effective date: 20120709 |
AS | Assignment | Owner name: TRUSTWAVE HOLDINGS, INC., ILLINOIS Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:SILICON VALLEY BANK;REEL/FRAME:028526/0001 Effective date: 20120709 |
AS | Assignment | Owner name: VERICEPT CORPORATION, ILLINOIS Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:SILICON VALLEY BANK;REEL/FRAME:028533/0383 Effective date: 20120709 |
FPAY | Fee payment | Year of fee payment: 4 |
FPAY | Fee payment | Year of fee payment: 8 |
MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 12 |
AS | Assignment | Owner name: SYSXNET LIMITED, IRELAND Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:TRUSTWAVE HOLDINGS, INC.;REEL/FRAME:058748/0177 Effective date: 20211017 |
AS | Assignment | Owner name: MIDCAP FINANCIAL TRUST, AS COLLATERAL AGENT, MARYLAND Free format text: SECURITY INTEREST;ASSIGNORS:SYSXNET LIMITED;CONTROLSCAN, INC.;VIKING CLOUD, INC.;REEL/FRAME:068196/0462 Effective date: 20240806 |