US11314862B2 - Method for detecting malicious scripts through modeling of script structure - Google Patents
- Publication number
- US11314862B2 (application US15/953,953)
- Authority
- US
- United States
- Prior art keywords
- script
- abstract
- end user
- unclassified
- generalized
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
- G06F21/563—Static detection by source code analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/554—Detecting local intrusion or implementing counter-measures involving event detection and direct action
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/03—Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
- G06F2221/033—Test or assess software
Definitions
- Scripts are an integral component of modern documents, software applications, web pages within a web browser, usage of the shells of operating systems (OS), embedded systems, as well as games, providing unparalleled flexibility and dynamism (e.g. an online banking website might use a script to obtain credentials and transaction information from the user).
- the spectrum of scripting languages ranges from very small and highly domain-specific languages to general-purpose programming languages used for scripting.
- Standard examples of scripting languages for specific environments include: Bash, ECMAScript (JavaScript), VBScript, PHP, PowerShell, Visual Basic, Lua, Linden Scripting Language and TrainzScript. Scripts provide important functionality, but also have security ramifications and can lead to various attacks.
- Modern web applications and documents such as pdf, docx, xlsx, html etc. contain both snippets of code (in the form of scripts and macros) and data.
- Modern enterprise web application architectures e.g., Angular JS, Ember JS
- client-side attacks such as web injection attacks from malware, cross-site scripting (XSS), and document object model (DOM) based attacks and others.
- client-side attacks result in fraud, malicious ads, lower return on investment (ROI) on ad spend, loss of sensitive customer and app data, and customer dissatisfaction.
- Malware includes computer code (e.g., a software application, a utility, or other code) that can interfere with an end user device's normal functioning.
- Script-based malware includes script code that can infect a computing end user device and cause the computing end user device to malfunction. Examples of script code that can be used to generate script-based malware include JavaScript, Visual Basic Script (VBScript), and so on.
- a user navigates to a website that includes script-based malware.
- the script code of the script-based malware is then loaded onto the user's computing end user device, e.g., as part of the website code.
- a script engine on the computing device then parses the script code and the parsed script code is compiled and executed on the computing end user device. Execution of the script code can cause a variety of undesirable activities on the computing end user device, such as the slowing and/or malfunctioning of applications running on the computing end user device. This could also be in the form of additional form fields to a login page that is designed to pass user credentials to some external address controlled by the malware, therein stealing the user's credentials and/or authorization codes to use at a later point in time (e.g. cross-site request forgery, clickjacking, crypto-jacking, etc.).
- the present disclosure describes a method for automatically categorizing a script, available either in full or in part in a language such as JavaScript, into one or more abstractions available in a reference database built previously from a corpus. These abstractions group similar scripts and therefore can be used to provide a scalable solution for fast matching and categorization of a given script, either in real time or offline. This, in turn, makes it possible to whitelist legitimate scripts, weed out non-matching scripts, and infer false positives and true attacks from raw events.
- This disclosure outlines an automated approach for accurately modeling the structure of embedded scripts using language based and machine learning based techniques to abstract the dynamism of similar scripts. This, in turn, provides a scalable and efficient way of identifying benign, malicious, known and unknown scripts from a script available in full or in part. This disclosure outlines the ability to infer the structure of scripts and then use that in detection, protection, and remediation in a variety of applications in the context of information security.
- a method may determine a structure of an unclassified end user browser script by building an abstract structure using code from unclassified end user browser script; comparing the determined structure of the unclassified end user browser script with a plurality of generalized abstract structures; if the determined structure of the unclassified end user browser script matches within a predetermined threshold of any of the plurality of generalized abstract structures, then the unclassified end user browser script is classified as benign, otherwise the determined structure is classified as malicious.
- a method includes: receiving a plurality of benign web content, each web content having a script; extracting the scripts from each of the plurality of benign web content; extracting structural features from the scripts; building a plurality of abstract syntax trees (ASTs) using the structural features of the scripts; clustering the plurality of ASTs into common clusters of ASTs by means of a predetermined tree edit distance; and generalizing the common clusters of ASTs into a plurality of generalized ASTs (GASTs) by means of associating a common node that is predictive of a particular type of script.
- a method includes: determining a structure of an unclassified end user browser script by building an abstract syntax tree (AST) using code from the unclassified end user browser script; and comparing the determined structure of the unclassified end user browser script with a plurality of generalized ASTs, the plurality of generalized ASTs determined by: building a plurality of abstract syntax trees (ASTs) from a plurality of known benign script codes, wherein building an AST includes extracting a structural feature from a first portion of known benign script code; clustering the plurality of ASTs into common clusters of ASTs by means of a predetermined tree edit distance; and generalizing the common clusters of ASTs into a plurality of generalized ASTs (GASTs) by means of associating a common node that is predictive of a particular type of script.
- FIG. 1 illustrates a communication system for detection of script-based malware according to an implementation.
- FIG. 2 illustrates an example process for iteratively modeling scripts from a corpus of script samples.
- FIG. 3 illustrates an example process of a management system to configure malware policy of a web application according to an implementation.
- FIG. 4 illustrates an example of abstract syntax tree (AST) structural feature extraction in accordance with one or more implementations.
- FIG. 5 is a flow diagram depicting an example process for detecting script-based malware in accordance with one or more implementations.
- FIG. 6 illustrates a management computing system for configuring malware policies of a web application according to an implementation.
- a script is any program written in a scripting language (a language to automate or execute tasks in a runtime environment). Scripts are usually interpreted but can be compiled. Scripting languages can be domain specific or general purpose. Examples of scripting languages are ECMAScript (JavaScript), Perl, VBScript, Python, Lua, AppleScript, ActionScript, etc. Scripts provide important functionality, but also have security ramifications and can lead to various attacks. Therefore, one of the keys to safeguarding against script-based attacks is the ability to infer the structure of scripts and then use that in detection, protection, and remediation in a variety of applications in the context of information security. It is a well-established fact that scripts are a security issue on the client side. Whitelisting scripts or small snippets of code embedded in various documents (html, pdf, docx, xlsx) is one way of providing a secure environment outside of the back-end servers.
- the technical effect of this disclosure is to: 1) automatically determine the n possible structures that a script conforms to; 2) use a combination of parser and machine-learning techniques; 3) refine the results of step 1 with the availability of additional scripts; and 4) use the automatically derived script structures of webpages and documents for threat detection, prevention, mitigation, prediction and response.
- this approach produces high-quality results that can be leveraged in: whitelisting benign scripts, detecting malicious scripts, and classifying security events into false positives and true positives (e.g. real attacks).
- the malware policy requires no changes to a web application or user device and is agent-less protection. Examples disclosed herein provide enhancements for configuring web applications of an origin server or content provider in a communication network with a malware policy according to an implementation.
- in an embodiment, the disclosure uniquely monitors the user environment, providing comprehensive understanding and analytics of how users are being attacked.
- FIG. 1 illustrates a communication network 100 to produce and implement a malware policy in a web application according to an implementation.
- Communication network 100 includes a server 110 , end user device 114 , a browser 112 , malware 101 , and a management system 150 .
- Server 110 communicates with end user devices 114 and management system 150 via communication links 170 - 171 .
- Server 110 can include a variety of different devices and entities from which web content can be retrieved, such as a cache node, a local server (e.g., a LAN server), an origin server, back end server, a content provider server, an application server, a network resource, a cloud computing resource, and so on.
- Server 110 and management system 150 can each include communication interfaces, network interfaces, processing systems, computer systems, microprocessors, storage systems, storage media, or some other processing devices or software systems, and can be distributed among multiple devices ( FIG. 6 ). Examples of servers 110 and management system 150 can include software such as an operating system, logs, databases, utilities, drivers, caching software, networking software, and other software stored on a computer-readable medium. Servers 110 and management system 150 may each comprise, in some examples, one or more server computing systems, desktop computing systems, laptop computing systems, or any other computing system, including combinations thereof.
- End user devices 114 can each be a user device, subscriber equipment, customer equipment, access terminal, smartphone, personal digital assistant (PDA), computer, tablet computing device, e-book, Internet appliance, media player, game console, or some other user communication apparatus, including combinations thereof.
- End user devices 114 can each include communication interfaces, network interfaces, processing systems, computer systems, microprocessors, storage systems, storage media, or some other processing devices or software systems ( FIG. 6 ).
- Communication links 170 - 172 each use metal, glass, optical, air, space, or some other material as the transport media.
- Communication links 170 - 172 can each use various communication protocols, such as Time Division Multiplex (TDM), asynchronous transfer mode (ATM), Internet Protocol (IP), Ethernet, synchronous optical networking (SONET), hybrid fiber-coax (HFC), circuit-switched, communication signaling, wireless communications, or some other communication format, including combinations, improvements, or variations thereof.
- Communication links 170 - 172 can each be a direct link or can include intermediate networks, systems, servers, or devices, and can include a logical network link transported over multiple physical links. Although one main link for each of links 170 - 176 is shown in FIG.
- links 170 - 172 are merely illustrative to show communication modes or access pathways. In other examples, further links can be shown, with portions of the further links shared and used for different communication sessions or different content types, among other configurations.
- Communication links 170 - 172 can each include many different signals sharing the same associated link, as represented by the associated lines in FIG. 1 , comprising resource blocks, access channels, paging channels, notification channels, forward links, reverse links, user communications, communication sessions, overhead communications, carrier frequencies, other channels, timeslots, spreading codes, transportation ports, logical transportation links, network sockets, packets, or communication directions.
- Malware 101 may be understood to include a variety of forms of hostile, malicious, or intrusive software, including computer viruses, worms, trojans, ransomware, spyware, adware, scareware, and other intentionally harmful programs. Malware 101 may be injected into a web application 105, e.g. a web page, the application 105 having some portion of script code, e.g. JavaScript, being run on a user's browser 112 on an end user device 114. Malware 101 on an infected device 114 may be totally invisible to a user, by tampering with both HTTP requests sent by a user's web browser 112 and responses received from the server 110.
- Malware 101 web injection is used by an attacker to introduce (or “inject”) code into a vulnerable computer program (e.g. browser 112 ) and change the course of execution.
- XSS cross site scripting
- XSS attacks are a type of injection, in which malicious scripts 101 are injected into otherwise benign and trusted websites 105 .
- An XSS attack may be a piece of HTML or JavaScript code, which is added by malware 101 to the website 105 when a browser 112 opens it. This may be in the form of additional form fields added to the login page or JavaScript code designed to pass user credentials to some external address controlled by the malware operator.
- XSS attacks may occur when an attacker uses a web application to send malicious code 101 , generally in the form of a browser side script, to a different end user.
- An attacker can use XSS to send a malicious script 101 to an unsuspecting user device 114 .
- the end user's browser 112 has no way to know that the malicious script 101 should not be trusted and will execute the malicious script 101 . Because it thinks the script came from a trusted source, the malicious script 101 can access any cookies, session tokens, or other sensitive information retained by the browser 112 and used with that site 105 . These malicious scripts 101 can even rewrite the content of the HTML page.
- DOM document object model
- DOM Based XSS or “type-0 XSS” is an XSS attack wherein the attack payload is executed as a result of modifying the DOM “environment” in the end user's browser 112 used by the original client side script, so that the client side script code runs in an “unexpected” manner. That is, the page 105 itself (the HTTP response that is) does not change, but the client-side code contained in the page 105 executes differently due to the malicious modifications that have occurred in the DOM environment.
- In operation, end user devices 114 generate requests for network content from server 110, such as Internet web pages or media content 105 such as videos, documents, pictures, and music. Upon receipt of a request, the server 110 processes the request and supplies (get function) the required content, e.g. web application 105, to the requesting device 114. The end user browser 112 compiles and runs the web application 105 on the end user device 114.
- the web application 105 may include a content security or anti-malware policy 103 developed by the management system 150 and transmitted to the server 110, from which it is transmitted to, compiled on, and run on the client end user side.
- the web application 105 is configured or embedded with a malware policy 103, which represents malware detection functionality capable of comparing benign scripts with unclassified scripts.
- the policy 103 may be specific to a content provider or may work for various content providers.
- the policy 103 can interact with the end user web browser 112 and/or a script engine 135 to inspect script code that is received from the server 110 and determine if the script code is associated with malware or benign ( FIG. 7 ).
- script code that is written as part of web content 105 is subjected to obfuscation to hide all or part of the actual script code.
- the web browser 112 will de-obfuscate the scripts, at which point the malware policy 103 will initially be instantiated. Should the script dynamically call resource objects or further scripts, the policy 103 may be re-instantiated at that time.
- Management system 150 is configured to provide operation 200 , which is further described in FIGS. 2-3 .
- FIG. 2 illustrates an operation 200 of a management system to configure servers of a communication network 100 , in particular, building a corpus repository of scripts to be utilized by a policy 103 configured for web applications according to an implementation.
- the processes of operation 200 are referenced parenthetically in the paragraphs that follow with reference to systems and elements of communication network 100 of FIG. 1 .
- management system 150 in a communication network 100 employs modules, such as a crawler 210, an extractor 212, and a script modeler 214, and memory such as a script repository or script abstraction database 216, and other similar configurations to provide an anti-malware policy 103 to requesting content providers 110 and eventually to end user devices 114.
- Each content service provider 110 may be provided with its own anti-malware policy 103 that provides the anti-malware operations that may be specific to its service. The specificity is dependent upon the web content repository 218. If the web content repository 218 holds web applications that are specific to the content provider 110, then the solution or policy 103 will be specific to the content provider 110.
- the policy 103 will be broadly defined. It is envisioned in this disclosure that the entire web may be scrubbed and filtered down to create a corpus by the process disclosed herein. The corpus is saved in the script repository 216, which is utilized or copied by the server 110 whenever the policy 103 is instantiated. It is foreseen that the policy may be run by the management system 150 or in a cloud, as the analysis does not perform penetration testing or vulnerability analysis but instead determines what the legitimate behaviors of the web app are or would be if allowed to run (T3).
- the scripts are extracted from the web applications 105 by an extractor 212 . It is foreseen that a crawler or scanner 210 may not be necessary, and that the policy 103 may be developed alongside development of the web application, e.g. in the production continuous integration/continuous delivery (CI/CD) cycle of the web applications or pre-production.
- the extracted scripts are then modeled by the script modeler 214 and stored in the script repository 216 as will be further discussed below.
- operation 200 includes receiving ( 301 ) a plurality of benign web content 130 (T1).
- an administrator for a particular service provider 110 may transmit to the management system 150 all benign web content 130 utilized by that service provider 110 . This would make a specific policy 103 to that service provider 110 .
- a crawler or scanner 210 may be instantiated to crawl through the web and collect web applications 130 to create a corpus of benign web applications 130 to which the modeling algorithm of the modeler 214 may be applied.
- the scanner 210 can record scripts or fragments of script code that will be generated as part of a script unfolding or de-obfuscating in the end user browser 112 . It is foreseen that this step may also be some combination of the two (modeling specific and having a corpus already), as the corpus increases, it may be advantageous to utilize the corpus with a specific subset of scripts utilized from a service provider 110 .
- Each of the received web content 105 generates a script or application using C, C++, Java, or some other similar programming language.
- management system 150 may extract ( 303 ) the individual scripts from each of the plurality of benign web content.
- a parser or extractor 212 may parse the benign script code received to extract structural features (e.g. 403 ) from the script code.
- the parsed script code can then be used by the script modeler 214 to build ( 307 ) a plurality of abstract syntax trees (ASTs), parsing trees, derivation trees, or concrete syntax trees ( FIG. 4 ).
- An AST is an internal data structure built by a parser after performing syntactic analysis of a script or a program. ASTs have a lot more structure than the texts of the scripts because they capture the “structure” of the script. ASTs could also be derived from portions or part of the scripts (e.g. first 40 characters of a script or a particular function call).
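The idea that an AST captures structure rather than text can be illustrated with a minimal sketch. It uses Python's standard `ast` module as a stand-in parser, which is an assumption made only for illustration: that module parses Python source, whereas the scripts discussed here are typically JavaScript and would need a JavaScript parser. The function name `structural_signature` is hypothetical and does not come from the disclosure.

```python
import ast


def structural_signature(source: str) -> tuple:
    """Return a nested tuple of AST node-type names for `source`.

    Literal values and identifiers are dropped, so two scripts that
    differ only in their data yield the same signature.
    """
    def walk(node: ast.AST) -> tuple:
        children = tuple(walk(child) for child in ast.iter_child_nodes(node))
        return (type(node).__name__, children)

    return walk(ast.parse(source))


# Two "log-in" calls that differ only in user-specific data.
sig_a = structural_signature('login("alice", 2345)')
sig_b = structural_signature('login("bob", 6789)')
# A call with a different shape (one argument instead of two).
sig_c = structural_signature('login("alice")')

print(sig_a == sig_b)
print(sig_a == sig_c)
```

The two calls that differ only in their data share one signature, while the call with a different argument count does not, which is the property the later clustering step relies on.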
- the modeler 214 will cluster ( 309 ) the ASTs into common clusters. These clusters may be accomplished by algorithms such as: hierarchical; partitioning (e.g. k-means, k-medoids, probabilistic, etc.); grid-based; co-occurrence of data; constraint based; machine learning; scalable; subspace; projection; and co-clustering.
- the tree edit distance between ordered labeled ASTs is the cost of the minimal-cost sequence of node edit operations that transforms one tree into another.
- a cost is assigned to each edit operation; the cost of an edit sequence is then the sum of the costs of its edit operations.
- the tree edit distance is the cost of the edit sequence with the minimal cost.
- a single-path function computes the tree edit distance between two relevant subtrees according to the chosen path.
- a predetermined tree edit distance (d) is a value for the threshold of comparison of one tree to another. The larger d is, the looser the match with a given tree.
- the tree edit distance d decomposes the trees into subtrees and subforests (e.g. clusters).
- the distance between two forests is computed in constant time from the solution of smaller subproblems.
- the distance between two forests is the minimum distance of four smaller problems.
- the costs of the edit operations are added to the corresponding problems.
- the subproblems that occur during the computation of the tree edit distance are called relevant subproblems. Their number, which also describes the time complexity, depends on the choice between left and right solution at each recursive step. Algorithms (e.g. Tai, Zhang and Shasha, and Klein) use a so-called decomposition strategy to determine the choice of left vs. right at each recursive step.
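The recursive decomposition into smaller forest subproblems can be sketched as follows. This is a naive memoized version of the textbook rightmost-root recursion with unit costs, not the optimized Tai, Zhang and Shasha, or Klein algorithms named above; the tuple representation of trees and the helper names (`node`, `size`, `forest_distance`, `tree_edit_distance`) are assumptions made for illustration.

```python
from functools import lru_cache


def node(label, *children):
    """Build an ordered labeled tree as (label, (child, child, ...))."""
    return (label, tuple(children))


def size(tree):
    """Number of nodes in a tree."""
    return 1 + sum(size(child) for child in tree[1])


@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Unit-cost edit distance between two ordered forests (tuples of trees)."""
    if not f1:
        return sum(size(t) for t in f2)  # insert every remaining node
    if not f2:
        return sum(size(t) for t in f1)  # delete every remaining node
    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    # Delete the rightmost root of f1; its children join the forest.
    delete = forest_distance(f1[:-1] + kids1, f2) + 1
    # Insert the rightmost root of f2.
    insert = forest_distance(f1, f2[:-1] + kids2) + 1
    # Match the two rightmost roots, renaming if the labels differ.
    rename = (forest_distance(f1[:-1], f2[:-1])
              + forest_distance(kids1, kids2)
              + (label1 != label2))
    return min(delete, insert, rename)


def tree_edit_distance(t1, t2):
    return forest_distance((t1,), (t2,))


t1 = node("f", node("a"), node("b"))
t2 = node("f", node("a"), node("c"))
t3 = node("f", node("a"))
print(tree_edit_distance(t1, t2), tree_edit_distance(t1, t3))
```

Each call reduces the problem to forests with fewer nodes, and the cost of the chosen edit operation is added to the result of the corresponding subproblem. Memoization keeps repeated subproblems from being recomputed, but the sketch makes no attempt to pick a left or right decomposition strategy.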
- A tree t2 is placed in a common cluster with a tree t1 if the tree edit distance calculated between trees t1 and t2 is within the predetermined tree edit distance d.
- Intuitively, trees that are closer according to the distance metric d are “more” similar and are clustered together as a common cluster. For example, d(t1, t2) < d(t3, t2) means that t2 is more similar to t1 than to t3.
- hierarchical clustering is advantageous over other clustering algorithms, such as k-means, in that it does not require the number of clusters a priori.
- the output of step 309 is a set of clusters, e.g. C = {c1, c2, c3, . . . , cn}, where each cluster (e.g. c1) is a set of common ASTs.
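Clustering by a predetermined distance threshold can be sketched as below. The sketch assumes single-linkage agglomerative clustering cut at the threshold d, which is equivalent to merging any two items whose distance is at most d; the disclosure does not fix a particular linkage, so this choice is an assumption. The function name `cluster_by_threshold` and the toy set-based distance in the usage lines are hypothetical stand-ins for a real tree edit distance over ASTs.

```python
def cluster_by_threshold(items, distance, d):
    """Group items so that any two items within distance d share a cluster.

    Implemented with union-find; the number of clusters is not
    specified in advance.
    """
    parent = list(range(len(items)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if distance(items[i], items[j]) <= d:
                parent[find(i)] = find(j)

    clusters = {}
    for i, item in enumerate(items):
        clusters.setdefault(find(i), []).append(item)
    return list(clusters.values())


# Toy stand-in: each "AST" is a set of labels, and the distance is the
# number of labels that appear in only one of the two sets.
asts = [frozenset("abc"), frozenset("abd"), frozenset("xyz")]
clusters = cluster_by_threshold(asts, lambda a, b: len(a ^ b), d=2)
print(len(clusters))
```

A larger d merges more trees into each cluster, which mirrors the statement that a larger d gives a looser match.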
- GAST generalized AST
- Each cluster intuitively corresponds to a functionality (e.g., as in the example GAST 400, wherein each cluster corresponds to the log-in script corresponding to individuals). But these scripts will have data corresponding to the individual (e.g. the individual's account number).
- the process 300 “abstracts away” this specific information. For example, account number might be replaced by its type (4-digit number).
- GAST corresponds to abstraction of each AST in the cluster. So, each cluster will become a GAST.
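Abstracting away user-specific values in a cluster can be sketched as follows. The sketch assumes every AST in the cluster has the same shape and represents each node as a `(kind, value, children)` tuple; both are simplifying assumptions, and the names `WILDCARD`, `generalize`, and `matches` are hypothetical. Where values differ across the cluster, the node value is replaced by a wildcard rather than by an inferred type.

```python
WILDCARD = "<ANY>"


def generalize(trees):
    """Merge same-shaped trees into one generalized tree.

    Each tree is (kind, value, children). A node keeps its value only
    if every tree in the cluster agrees on it.
    """
    kinds = {t[0] for t in trees}
    arities = {len(t[2]) for t in trees}
    if len(kinds) != 1 or len(arities) != 1:
        raise ValueError("this sketch only handles trees of identical shape")
    values = {t[1] for t in trees}
    value = values.pop() if len(values) == 1 else WILDCARD
    children = tuple(
        generalize([t[2][i] for t in trees]) for i in range(arities.pop())
    )
    return (kinds.pop(), value, children)


def matches(gast, tree):
    """True if `tree` has the generalized shape and agrees on fixed values."""
    kind, value, children = gast
    if kind != tree[0] or len(children) != len(tree[2]):
        return False
    if value != WILDCARD and value != tree[1]:
        return False
    return all(matches(g, t) for g, t in zip(children, tree[2]))


def login(account_number):
    return ("call", "login", (("arg", account_number, ()),))


gast = generalize([login(2345), login(6789), login(9788), login(9090)])
print(gast)
print(matches(gast, login(1111)))
```

A wildcard is coarser than the typed abstraction the text describes, since it accepts any value at the node rather than only values of a given type such as a 4-digit number.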
- FIG. 4 illustrates an example GAST generally at 400 .
- node 403 represents the class user
- node 405 represents the password being 6 characters long
- node 407 represents the user's ID
- node 409 represents the string name.
- although the GAST 400 illustrated is discussed with respect to script code, this is not intended to be limiting.
- the structural feature extraction scenario 300 can be used to extract structural features from a variety of different types of code (examples of which are listed above).
- the GAST 400 includes various aspects that can be used to characterize the structure as being associated with a log-in function 401 .
- AST node 407 in the ASTs in a cluster has the values {2345, 6789, 9788, 9090}.
- these values probably correspond to individuals' account numbers or IDs.
- the process will assign a type of “account-number” or “ID” to that node 407 along with a specification of this type (e.g., an integer with 4 digits).
- the plurality of ASTs in that cluster becomes a GAST ( 311 ) and will match any script that has a 4-digit value in that node.
- the process will generalize each node that corresponds to the values that might appear in the nodes of the ASTs corresponding to scripts, e.g. node 409 is assigned name and node 405 is assigned to password.
- the type system may have a sub-type relation (e.g., account-number is a sub-type of int, because an account-number is also an integer, albeit of a special form).
- the process infers a most specific type t given a set of values {v1, . . . , vk} according to the sub-typing relation. For example, the set of values {2345, 6789, 9788, 9090} is inferred to be of the type “account-number,” and not an integer (int) or floating point.
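Inferring the most specific type for a set of values can be sketched as below. The sketch assumes the sub-type relation forms a simple chain (account-number, then int, then float, then string) and that each type is recognized by a regular expression. The chain, the patterns, and the names `TYPE_CHAIN` and `infer_most_specific_type` are all assumptions for illustration; only the "account-number is a 4-digit integer" example comes from the text.

```python
import re

# Ordered from most specific to most general; each later type accepts
# every value accepted by the types before it.
TYPE_CHAIN = [
    ("account-number", re.compile(r"\d{4}")),
    ("int", re.compile(r"-?\d+")),
    ("float", re.compile(r"-?\d+(\.\d+)?")),
    ("string", re.compile(r".*", re.DOTALL)),
]


def infer_most_specific_type(values):
    """Return the name of the first (most specific) type fitting all values."""
    texts = [str(v) for v in values]
    if not texts:
        raise ValueError("at least one value is required")
    for name, pattern in TYPE_CHAIN:
        if all(pattern.fullmatch(text) for text in texts):
            return name
    return "string"


print(infer_most_specific_type([2345, 6789, 9788, 9090]))
print(infer_most_specific_type([12, 34567]))
```

Walking the chain from the most specific type downward means a set of 4-digit values is labeled as an account number even though it would also satisfy the int and float patterns.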
- the modeler 214 will infer the functionality of the plurality of GASTs {g1, . . . , gk}. As in the example above, wherein a user has a password, ID, and user name, this will be inferred to be a log-in function.
- once the modeler 214 infers the functionality, the GASTs will be assigned the predicted function as a type ( 313 ).
- the modeler 214 can infer the functionality based on: the type of website, e.g. banking, auction, retailer, etc.; the web designer template; and the web application framework or platform (e.g. Angular, Java, Flash, Silverlight, etc.). As the modeler 214 builds a larger and larger corpus of scripts and GASTs, the modeler can machine learn based on false positives, and can change the inferences on the fly to better the GAST.
- the GAST g1 corresponds to a user entering a transaction in an online banking website.
- the g 1 may contain various paths with various structures of the transaction (e.g., obtaining the user credentials, or a user providing the transaction type).
- This step requires inference from the modeler 214 . Such inferences may also be from metadata or knowledge base of the underlying web framework or document container in which the script was generated.
- the generalized ASTs are sent to server 110 as a policy 103 (T2).
- the browser 112 may use an application programming interface (API) to communicate with the server 110 .
- API commands may include commands such as PUT (PRINT), GET_REQUEST, SEND_RESPONSE, and other similar commands to provide the desired operation.
- the browser 112 may use a first command to obtain the data (from the local cache node 110 ), process the data in accordance with the web application requested, and provide the processed content to the desired end user browser.
- the browser will compile or interpret the script portion of the web application 105 into web assembly code, or WebAssembly code, wherein the operations of the application 105 may be translated from the first language to the WebAssembly language.
- This WebAssembly code is a standard that defines a binary format and a corresponding assembly-like text format for executable code in Web pages.
- the policy 103 will determine the structure of the script in order to classify the application 105 as benign or malicious.
- Some web applications 105 are dynamic and are run repeatedly; the policy 103 may check every instance of the web application 105 that is run, or only the initial instance.
- the policy 103 may only review some portion (e.g. the first) of the script to determine the script structure. This may be the case, for instance, where the script has been obfuscated and is in the process of being de-obfuscated at the browser 112.
- a process 500 for detecting script-based malware may be utilized.
- the policy 103 will determine the structure of the unclassified script being loaded on the end user browser 112 . This may be accomplished by the extractor 212 .
- the policy 103 will compare the determined structure of the unclassified script on the end user browser 112 with the plurality of generalized ASTs (determined in step 311 above). It should be understood that the policy 103 may utilize at least one of the management system 150 and the server 110 for processing the comparison, as will be further discussed below.
- the policy 103 will then determine whether the structure of the unclassified script is a match with the plurality or set of generalized ASTs. A match may be defined as falling within a predetermined tree edit distance, or within some threshold of a predetermined tree edit distance. If there is a match within the predetermined tree edit distance, then the script is classified as benign ( 509 ).
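- The matching step above can be sketched as follows. This is a minimal illustration, not the patented implementation: the tuple-based tree encoding, the unit edit costs, and the names `tree_edit_distance`, `is_benign`, and `max_distance` are all assumptions made for the example. The distance is the standard ordered tree edit distance computed by a memoized forest recursion, which is adequate for small trees only.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, (child, ...)); a forest is a tuple of trees.

@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Edit distance between two ordered forests with unit costs."""
    if not f and not g:
        return 0
    if not f:
        # Insert the rightmost root of g; its children join the forest.
        _, w_kids = g[-1]
        return forest_distance(f, g[:-1] + w_kids) + 1
    if not g:
        # Delete the rightmost root of f; its children join the forest.
        _, v_kids = f[-1]
        return forest_distance(f[:-1] + v_kids, g) + 1
    v_label, v_kids = f[-1]
    w_label, w_kids = g[-1]
    delete = forest_distance(f[:-1] + v_kids, g) + 1
    insert = forest_distance(f, g[:-1] + w_kids) + 1
    # Map the two rightmost roots onto each other (renaming if labels differ).
    rename = (forest_distance(v_kids, w_kids)
              + forest_distance(f[:-1], g[:-1])
              + (0 if v_label == w_label else 1))
    return min(delete, insert, rename)

def tree_edit_distance(a, b):
    return forest_distance((a,), (b,))

def is_benign(script_ast, generalized_asts, max_distance):
    """Benign if the script is within max_distance of any generalized AST."""
    return any(tree_edit_distance(script_ast, gast) <= max_distance
               for gast in generalized_asts)

# Hypothetical generalized AST and two unclassified scripts.
gast = ("Program", (("Call", (("Id", ()), ("Str", ()))),))
near = ("Program", (("Call", (("Id", ()), ("Num", ()))),))
far = ("Program", (("Loop", (("Assign", ()), ("Call", (("Id", ()),)))),
                   ("Call", (("Id", ()),))))

print(is_benign(near, [gast], max_distance=1))
print(is_benign(far, [gast], max_distance=1))
```

- Production systems would more likely use a memory-efficient tree edit distance algorithm, since this recursion does not scale to large scripts.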
- the policy 103 can be configured in detection mode or blocking mode. In detection mode, the policy 103 will not block the behavior but will send an alert or notification to the management system 150 and/or the server 110 . If any portion of the web application 105 does not meet the constraints, then an alert may be provided to the administrator of the server 110 indicating the issue with the web application. In blocking mode, the policy 103 will both block the malicious behavior and alert the management system 150 . In contrast, if the application 105 does meet the constraints, then the application in the WebAssembly code may be recompiled, or may finish compiling, to be run on the end user browser 112 .
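- The difference between the two modes can be summarized in a short sketch. The `Mode` enum, the `enforce` function, and its return convention are hypothetical names chosen for illustration; the text above does not define a concrete interface.

```python
from enum import Enum

class Mode(Enum):
    DETECTION = "detection"
    BLOCKING = "blocking"

def enforce(mode, is_benign):
    """Return (allow_script, send_alert) for one classification result."""
    if is_benign:
        # Script matched a generalized AST: let it run, no alert needed.
        return True, False
    if mode is Mode.BLOCKING:
        # Blocking mode: stop the script and alert the management system.
        return False, True
    # Detection mode: let the script run, but still raise an alert.
    return True, True

for mode in Mode:
    print(mode.value, enforce(mode, is_benign=False))
```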
- With such a strict black-or-white, meet-or-not threshold, the policy 103 is bound to produce false positives, false negatives, or misclassifications, mostly on the malicious side, though misclassification on the benign side is possible too. As such, the policy 103 will learn or train from the false positives.
- the server 110 , once notification is received, may allow the web application 105 to run, thereby creating an exception to the policy 103 because a script was misclassified ( 511 ). Instead of or in combination with allowing for an exception, the policy 103 will review ( 515 ) the false positive, and in particular the tree edit distance; otherwise the process 500 ends ( 513 ). It is foreseen that other algorithms may be compared with the first algorithm used, in order to reach a result that does not tend towards a false positive. Once the training or machine learning is accomplished, the updated generalized AST replaces the previous GAST ( 517 ) to limit miscalculations in future comparisons.
- FIG. 6 illustrates a management computing system 600 for a communication network 100 according to an implementation.
- Management computing system 600 is representative of any computing system or systems with which the various operational architectures, processes, scenarios, and sequences disclosed herein for a management system may be implemented.
- Management computing system 600 is an example of management system 150 of FIG. 1 , although other examples may exist.
- Management computing system 600 comprises communication interface 601 , user interface 602 , and processing system 603 .
- Processing system 603 is linked to communication interface 601 and user interface 602 .
- Processing system 603 includes processing circuitry 605 and memory device 606 that stores software 607 .
- Management computing system 600 may include other well-known components such as a battery and enclosure that are not shown for clarity.
- Management computing system 600 may comprise one or more server computing systems, desktop computing systems, laptop computing systems, distributed processing devices, or any other computing system, including combinations thereof.
- the management computing system 600 may be a cloud infrastructure or a cloud node, wherein a software layer permits the underlying physical hardware associated with clouds, which can include servers, memory, storage, and network resources, to be viewed as virtualized units or virtual machines (VMs). These virtualized units represent some fraction of the underlying computing hardware or resources supported by the cloud infrastructure. It is understood that the software 607 may be deployed seamlessly on a plurality of clouds.
- Communication interface 601 comprises components that communicate over communication links, such as network cards, ports, radio frequency (RF), processing circuitry and software, or some other communication devices.
- Communication interface 601 may be configured to communicate over metallic, wireless, or optical links.
- Communication interface 601 may be configured to use Time Division Multiplex (TDM), Internet Protocol (IP), Ethernet, optical networking, wireless protocols, communication signaling, or some other communication format—including combinations thereof.
- communication interface 601 is configured to communicate with cache nodes of the content delivery network to configure the cache nodes with HTTP acceleration services and applications.
- User interface 602 comprises components that interact with a user to receive user inputs and to present media and/or information.
- User interface 602 may include a speaker, microphone, buttons, lights, display screen, touch screen, touch pad, scroll wheel, communication port, or some other user input/output apparatus—including combinations thereof.
- User interface 602 may be omitted in some examples.
- Processing circuitry 605 comprises a microprocessor and other circuitry that retrieves and executes operating software 607 from memory device 606 .
- Memory device 606 comprises a non-transitory storage medium, such as a disk drive, flash drive, data storage circuitry, or some other memory apparatus (e.g. script repository 216 ).
- Processing circuitry 605 is typically mounted on a circuit board that may also hold memory device 606 and portions of communication interface 601 and user interface 602 .
- Software 607 comprises computer programs, firmware, or some other form of machine-readable processing instructions.
- Software 607 includes crawler module 608 , extractor module 609 , script modeler module 610 , comparison module 612 , and recursion module 614 , although any number of software modules may provide the same operation.
- Software 607 may further include an operating system, utilities, drivers, network interfaces, applications, or some other type of software.
- software 607 directs processing system 603 to operate management computing system 600 as described herein.
- crawler module 608 directs processing system 603 to crawl through a repository of web content in search of web applications 105 with script content.
- the extractor module 609 parses the application into structural features or nodes.
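- As a rough illustration of extracting structural features, the sketch below reduces a script to a tree of node types, discarding identifiers and literal values. It uses Python's standard `ast` module as a stand-in parser purely for convenience; the scripts discussed in this document are browser scripts, and the helper names `structure` and `count_nodes` are assumptions for the example.

```python
import ast

def structure(node):
    """Return (node_type, children), dropping identifiers and literal values."""
    children = tuple(structure(child) for child in ast.iter_child_nodes(node))
    return (type(node).__name__, children)

def count_nodes(tree):
    """Count the structural nodes in a (node_type, children) tree."""
    _, children = tree
    return 1 + sum(count_nodes(child) for child in children)

# Two scripts that differ only in names and literals share one structure.
script_a = structure(ast.parse("user = read('name')"))
script_b = structure(ast.parse("account = read('id')"))
script_c = structure(ast.parse("for i in items: send(i)"))

print(script_a == script_b)
print(script_a == script_c)
print(count_nodes(script_a))
```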
- script modeler module 610 builds a plurality of ASTs for each script extracted. The script modeler module 610 clusters the plurality of ASTs into common clusters by means of a predetermined tree edit distance.
- the tree is clustered with like trees. Once assembled into clusters, an entire cluster (e.g. set) is generalized into a generalized AST.
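- A minimal sketch of threshold-based clustering follows. Several simplifications are assumed: each AST is flattened to a list of node labels, the distance is a stand-in based on label counts rather than a true tree edit distance, and the greedy first-fit strategy and the names `label_distance` and `cluster` are illustrative choices only. How a cluster is then generalized into a single generalized AST is not shown.

```python
from collections import Counter

def label_distance(a, b):
    """Stand-in distance: number of node labels not shared by the two trees."""
    ca, cb = Counter(a), Counter(b)
    return sum(((ca - cb) + (cb - ca)).values())

def cluster(trees, max_distance, distance=label_distance):
    """Greedy clustering: join the first cluster whose seed is close enough."""
    clusters = []
    for tree in trees:
        for members in clusters:
            if distance(tree, members[0]) <= max_distance:
                members.append(tree)
                break
        else:
            # No existing cluster is within range, so start a new one.
            clusters.append([tree])
    return clusters

# Hypothetical flattened ASTs.
scripts = [
    ["Program", "Call", "Id", "Str"],
    ["Program", "Call", "Id", "Num"],
    ["Program", "Loop", "Assign", "Assign", "BinOp", "Call"],
]
for members in cluster(scripts, max_distance=2):
    print(len(members), members[0])
```

- The `distance` parameter lets a real tree edit distance function be substituted without changing the clustering loop.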
- the GAST has nodes that are predictive of the type of script (e.g. banking, retailer, email, auction, etc.).
- the modeler module 610 assigns a type to the GAST based on the common nodes (e.g. password for type of login scripts).
- a client device 114 that includes and/or makes use of an end user web browser 112 , which may further include a script de-obfuscator.
- the de-obfuscator may de-obfuscate and/or unfold script code.
- the management system 150 can develop a policy 103 .
- the end user device 114 can receive the client-based malware policy 103 embedded with the requested unclassified web application 105 from at least one of the management system 150 and the server 110 .
- the policy 103 may call to the extractor or parser module 609 to extract the unclassified script structural features and the modeler module 610 to model the structural features into an AST.
- the policy 103 calls to the comparison module 612 to inspect code (e.g., script) received by the client web browser 112 .
- the comparison module 612 can compare the unclassified end user web browser script with the generalized scripts in the script repository 216 to determine if the code is malicious or benign.
- the script engine of the web browser 112 is configured to load, compile, and/or run script code that is retrieved by the web browser 112 , e.g., as part of a web page that is navigated to via the web browser. If the script is determined to be malicious, a notification may be sent to the server 110 and the script may not be allowed to run. Once there is a determination, there exists a possibility that the determination is false.
- the recursion module 614 is instantiated once a misclassification has occurred.
- the server 110 , once notification is received, may allow the web application to run, thereby creating an exception to the policy 103 because a script was misclassified. Instead of or in combination with allowing for an exception, the management system 150 will review the false positive, and in particular the tree edit distance. It is foreseen that other algorithms may be compared with the first algorithm used, in order to reach a result that does not tend towards a false positive. Once the training or machine learning is accomplished, the updated generalized AST replaces the previous GAST to minimize miscalculations.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Computer Security & Cryptography (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Virology (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Computational Linguistics (AREA)
- Information Transfer Between Computers (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/953,953 US11314862B2 (en) | 2017-04-17 | 2018-04-16 | Method for detecting malicious scripts through modeling of script structure |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201762486135P | 2017-04-17 | 2017-04-17 | |
US15/953,953 US11314862B2 (en) | 2017-04-17 | 2018-04-16 | Method for detecting malicious scripts through modeling of script structure |
Publications (2)
Publication Number | Publication Date |
---|---|
US20180300480A1 (en) | 2018-10-18 |
US11314862B2 (en) | 2022-04-26 |
Family
ID=63790693
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/953,953 Active 2038-11-18 US11314862B2 (en) | 2017-04-17 | 2018-04-16 | Method for detecting malicious scripts through modeling of script structure |
Country Status (1)
Country | Link |
---|---|
US (1) | US11314862B2 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210314353A1 (en) * | 2020-04-07 | 2021-10-07 | Target Brands, Inc. | Rule-based dynamic security test system |
Families Citing this family (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112106048A (en) * | 2018-05-04 | 2020-12-18 | 谷歌有限责任公司 | Detecting injection vulnerabilities of client template systems |
US10831892B2 (en) * | 2018-06-07 | 2020-11-10 | Sap Se | Web browser script monitoring |
US11204998B2 (en) * | 2018-08-07 | 2021-12-21 | Mcafee, Llc | Detection and mitigation of fileless security threats |
US20210319098A1 (en) * | 2018-12-31 | 2021-10-14 | Intel Corporation | Securing systems employing artificial intelligence |
US11409867B2 (en) | 2019-03-28 | 2022-08-09 | Juniper Networks, Inc. | Behavioral detection of malicious scripts |
CN110362996B (en) * | 2019-06-03 | 2021-03-09 | 中国科学院信息工程研究所 | Method and system for offline detection of PowerShell malicious software |
US20200412740A1 (en) * | 2019-06-27 | 2020-12-31 | Vade Secure, Inc. | Methods, devices and systems for the detection of obfuscated code in application software files |
US11003444B2 (en) * | 2019-06-28 | 2021-05-11 | Intel Corporation | Methods and apparatus for recommending computer program updates utilizing a trained model |
US20210026969A1 (en) * | 2019-07-23 | 2021-01-28 | Chameleonx Ltd | Detection and prevention of malicious script attacks using behavioral analysis of run-time script execution events |
US10783082B2 (en) * | 2019-08-30 | 2020-09-22 | Alibaba Group Holding Limited | Deploying a smart contract |
CN110737890B (en) * | 2019-10-25 | 2021-04-02 | 中国科学院信息工程研究所 | Internal threat detection system and method based on heterogeneous time sequence event embedding learning |
CN112883372B (en) * | 2019-11-29 | 2024-02-09 | 中国电信股份有限公司 | Cross-site scripting attack detection method and device |
CN111245838B (en) * | 2020-01-13 | 2022-04-26 | 四川坤翔科技有限公司 | Method for protecting key information by anti-crawler |
US11973780B2 (en) * | 2020-10-14 | 2024-04-30 | Palo Alto Networks, Inc. | Deobfuscating and decloaking web-based malware with abstract execution |
CN112948862B (en) * | 2021-03-10 | 2021-10-29 | 山西云媒体发展有限公司 | Enterprise information service system |
US11436330B1 (en) | 2021-07-14 | 2022-09-06 | Soos Llc | System for automated malicious software detection |
US20230252148A1 (en) * | 2022-02-09 | 2023-08-10 | Microsoft Technology Licensing, Llc | Efficient usage of sandbox environments for malicious and benign documents with macros |
US11609985B1 (en) * | 2022-05-11 | 2023-03-21 | Cyberark Software Ltd. | Analyzing scripts to create and enforce security policies in dynamic development pipelines |
CN115268867B (en) * | 2022-07-26 | 2023-04-07 | 中国海洋大学 | Abstract syntax tree clipping method |
Citations (44)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060075468A1 (en) * | 2004-10-01 | 2006-04-06 | Boney Matthew L | System and method for locating malware and generating malware definitions |
US7117488B1 (en) * | 2001-10-31 | 2006-10-03 | The Regents Of The University Of California | Safe computer code formats and methods for generating safe computer code |
US7203963B1 (en) * | 2002-06-13 | 2007-04-10 | Mcafee, Inc. | Method and apparatus for adaptively classifying network traffic |
US20070113282A1 (en) * | 2005-11-17 | 2007-05-17 | Ross Robert F | Systems and methods for detecting and disabling malicious script code |
US20070152854A1 (en) * | 2005-12-29 | 2007-07-05 | Drew Copley | Forgery detection using entropy modeling |
US20070239993A1 (en) * | 2006-03-17 | 2007-10-11 | The Trustees Of The University Of Pennsylvania | System and method for comparing similarity of computer programs |
US20070240215A1 (en) * | 2006-03-28 | 2007-10-11 | Blue Coat Systems, Inc. | Method and system for tracking access to application data and preventing data exploitation by malicious programs |
US20080263659A1 (en) * | 2007-04-23 | 2008-10-23 | Christoph Alme | System and method for detecting malicious mobile program code |
US20090089759A1 (en) * | 2007-10-02 | 2009-04-02 | Fujitsu Limited | System and Method for Providing Symbolic Execution Engine for Validating Web Applications |
US20090265692A1 (en) * | 2008-04-21 | 2009-10-22 | Microsoft Corporation | Active property checking |
US20090300764A1 (en) * | 2008-05-28 | 2009-12-03 | International Business Machines Corporation | System and method for identification and blocking of malicious code for web browser script engines |
US20090327688A1 (en) * | 2008-06-28 | 2009-12-31 | Chengdu Huawei Symantec Technologies Co., Ltd. | Method and system for detecting a malicious code |
US20090328185A1 (en) * | 2004-11-04 | 2009-12-31 | Eric Van Den Berg | Detecting exploit code in network flows |
US20100031359A1 (en) * | 2008-04-14 | 2010-02-04 | Secure Computing Corporation | Probabilistic shellcode detection |
US20100180344A1 (en) * | 2009-01-10 | 2010-07-15 | Kaspersky Labs ZAO | Systems and Methods For Malware Classification |
US20110030060A1 (en) * | 2009-08-03 | 2011-02-03 | Barracuda Networks, Inc. | Method for detecting malicious javascript |
US20110197177A1 (en) * | 2010-02-09 | 2011-08-11 | Rajesh Mony | Detection of scripting-language-based exploits using parse tree transformation |
US20110231349A1 (en) * | 2010-03-22 | 2011-09-22 | Aptima, Inc. | Systems and methods of cognitive patterns knowledge generation |
US20110239294A1 (en) * | 2010-03-29 | 2011-09-29 | Electronics And Telecommunications Research Institute | System and method for detecting malicious script |
US20120216280A1 (en) * | 2011-02-18 | 2012-08-23 | Microsoft Corporation | Detection of code-based malware |
US8352409B1 (en) * | 2009-06-30 | 2013-01-08 | Symantec Corporation | Systems and methods for improving the effectiveness of decision trees |
US8381302B1 (en) * | 2009-09-14 | 2013-02-19 | Symantec Corporation | Systems and methods for translating non-comparable values into comparable values for use in heuristics |
US8401982B1 (en) * | 2010-01-14 | 2013-03-19 | Symantec Corporation | Using sequencing and timing information of behavior events in machine learning to detect malware |
US8413244B1 (en) * | 2010-11-11 | 2013-04-02 | Symantec Corporation | Using temporal attributes to detect malware |
US8418249B1 (en) * | 2011-11-10 | 2013-04-09 | Narus, Inc. | Class discovery for automated discovery, attribution, analysis, and risk assessment of security threats |
US20140310222A1 (en) * | 2013-04-12 | 2014-10-16 | Apple Inc. | Cloud-based diagnostics and remediation |
US9178904B1 (en) * | 2013-09-11 | 2015-11-03 | Symantec Corporation | Systems and methods for detecting malicious browser-based scripts |
US20160173507A1 (en) * | 2014-12-12 | 2016-06-16 | International Business Machines Corporation | Normalizing and detecting inserted malicious code |
US20160212153A1 (en) * | 2015-01-16 | 2016-07-21 | Microsoft Technology Licensing, Llc. | Code Labeling Based on Tokenized Code Samples |
US20160253500A1 (en) * | 2015-02-26 | 2016-09-01 | Mcafee, Inc. | System and method to mitigate malware |
US20160337387A1 (en) * | 2015-05-14 | 2016-11-17 | International Business Machines Corporation | Detecting web exploit kits by tree-based structural similarity search |
US9531736B1 (en) * | 2012-12-24 | 2016-12-27 | Narus, Inc. | Detecting malicious HTTP redirections using user browsing activity trees |
US9690933B1 (en) * | 2014-12-22 | 2017-06-27 | Fireeye, Inc. | Framework for classifying an object as malicious with machine learning for deploying updated predictive models |
US20170195356A1 (en) * | 2010-11-29 | 2017-07-06 | Biocatch Ltd. | Identification of computerized bots and automated cyber-attack modules |
US9825976B1 (en) * | 2015-09-30 | 2017-11-21 | Fireeye, Inc. | Detection and classification of exploit kits |
US9916448B1 (en) * | 2016-01-21 | 2018-03-13 | Trend Micro Incorporated | Detection of malicious mobile apps |
US9942264B1 (en) * | 2016-12-16 | 2018-04-10 | Symantec Corporation | Systems and methods for improving forest-based malware detection within an organization |
US20180103047A1 (en) * | 2010-11-29 | 2018-04-12 | Biocatch Ltd. | Detection of computerized bots and automated cyber-attack modules |
US20180124109A1 (en) * | 2016-11-02 | 2018-05-03 | RiskIQ, Inc. | Techniques for classifying a web page based upon functions used to render the web page |
US10216933B1 (en) * | 2016-09-16 | 2019-02-26 | Symantec Corporation | Systems and methods for determining whether malicious files are targeted |
US10484399B1 (en) * | 2017-02-16 | 2019-11-19 | Symantec Corporation | Systems and methods for detecting low-density training regions of machine-learning classification systems |
US10489587B1 (en) * | 2016-12-22 | 2019-11-26 | Symantec Corporation | Systems and methods for classifying files as specific types of malware |
US10581879B1 (en) * | 2016-12-22 | 2020-03-03 | Fireeye, Inc. | Enhanced malware detection for generated objects |
US10860719B1 (en) * | 2020-03-06 | 2020-12-08 | Cyberark Software Ltd. | Detecting and protecting against security vulnerabilities in dynamic linkers and scripts |
Patent Citations (46)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7117488B1 (en) * | 2001-10-31 | 2006-10-03 | The Regents Of The University Of California | Safe computer code formats and methods for generating safe computer code |
US7203963B1 (en) * | 2002-06-13 | 2007-04-10 | Mcafee, Inc. | Method and apparatus for adaptively classifying network traffic |
US20060075468A1 (en) * | 2004-10-01 | 2006-04-06 | Boney Matthew L | System and method for locating malware and generating malware definitions |
US20090328185A1 (en) * | 2004-11-04 | 2009-12-31 | Eric Van Den Berg | Detecting exploit code in network flows |
US20070113282A1 (en) * | 2005-11-17 | 2007-05-17 | Ross Robert F | Systems and methods for detecting and disabling malicious script code |
US20070152854A1 (en) * | 2005-12-29 | 2007-07-05 | Drew Copley | Forgery detection using entropy modeling |
US20070239993A1 (en) * | 2006-03-17 | 2007-10-11 | The Trustees Of The University Of Pennsylvania | System and method for comparing similarity of computer programs |
US20070240215A1 (en) * | 2006-03-28 | 2007-10-11 | Blue Coat Systems, Inc. | Method and system for tracking access to application data and preventing data exploitation by malicious programs |
US20080263659A1 (en) * | 2007-04-23 | 2008-10-23 | Christoph Alme | System and method for detecting malicious mobile program code |
US20090089759A1 (en) * | 2007-10-02 | 2009-04-02 | Fujitsu Limited | System and Method for Providing Symbolic Execution Engine for Validating Web Applications |
US20100031359A1 (en) * | 2008-04-14 | 2010-02-04 | Secure Computing Corporation | Probabilistic shellcode detection |
US20090265692A1 (en) * | 2008-04-21 | 2009-10-22 | Microsoft Corporation | Active property checking |
US20090300764A1 (en) * | 2008-05-28 | 2009-12-03 | International Business Machines Corporation | System and method for identification and blocking of malicious code for web browser script engines |
US20090327688A1 (en) * | 2008-06-28 | 2009-12-31 | Chengdu Huawei Symantec Technologies Co., Ltd. | Method and system for detecting a malicious code |
US20100180344A1 (en) * | 2009-01-10 | 2010-07-15 | Kaspersky Labs ZAO | Systems and Methods For Malware Classification |
US8352409B1 (en) * | 2009-06-30 | 2013-01-08 | Symantec Corporation | Systems and methods for improving the effectiveness of decision trees |
US20110030060A1 (en) * | 2009-08-03 | 2011-02-03 | Barracuda Networks, Inc. | Method for detecting malicious javascript |
US8381302B1 (en) * | 2009-09-14 | 2013-02-19 | Symantec Corporation | Systems and methods for translating non-comparable values into comparable values for use in heuristics |
US8401982B1 (en) * | 2010-01-14 | 2013-03-19 | Symantec Corporation | Using sequencing and timing information of behavior events in machine learning to detect malware |
US20110197177A1 (en) * | 2010-02-09 | 2011-08-11 | Rajesh Mony | Detection of scripting-language-based exploits using parse tree transformation |
US20110231349A1 (en) * | 2010-03-22 | 2011-09-22 | Aptima, Inc. | Systems and methods of cognitive patterns knowledge generation |
US20110239294A1 (en) * | 2010-03-29 | 2011-09-29 | Electronics And Telecommunications Research Institute | System and method for detecting malicious script |
US8413244B1 (en) * | 2010-11-11 | 2013-04-02 | Symantec Corporation | Using temporal attributes to detect malware |
US20170195356A1 (en) * | 2010-11-29 | 2017-07-06 | Biocatch Ltd. | Identification of computerized bots and automated cyber-attack modules |
US20180103047A1 (en) * | 2010-11-29 | 2018-04-12 | Biocatch Ltd. | Detection of computerized bots and automated cyber-attack modules |
US20120216280A1 (en) * | 2011-02-18 | 2012-08-23 | Microsoft Corporation | Detection of code-based malware |
US8713679B2 (en) * | 2011-02-18 | 2014-04-29 | Microsoft Corporation | Detection of code-based malware |
US8418249B1 (en) * | 2011-11-10 | 2013-04-09 | Narus, Inc. | Class discovery for automated discovery, attribution, analysis, and risk assessment of security threats |
US9531736B1 (en) * | 2012-12-24 | 2016-12-27 | Narus, Inc. | Detecting malicious HTTP redirections using user browsing activity trees |
US20140310222A1 (en) * | 2013-04-12 | 2014-10-16 | Apple Inc. | Cloud-based diagnostics and remediation |
US9178904B1 (en) * | 2013-09-11 | 2015-11-03 | Symantec Corporation | Systems and methods for detecting malicious browser-based scripts |
US20160173507A1 (en) * | 2014-12-12 | 2016-06-16 | International Business Machines Corporation | Normalizing and detecting inserted malicious code |
US9690933B1 (en) * | 2014-12-22 | 2017-06-27 | Fireeye, Inc. | Framework for classifying an object as malicious with machine learning for deploying updated predictive models |
US20160212153A1 (en) * | 2015-01-16 | 2016-07-21 | Microsoft Technology Licensing, Llc. | Code Labeling Based on Tokenized Code Samples |
US20160253500A1 (en) * | 2015-02-26 | 2016-09-01 | Mcafee, Inc. | System and method to mitigate malware |
US20160337387A1 (en) * | 2015-05-14 | 2016-11-17 | International Business Machines Corporation | Detecting web exploit kits by tree-based structural similarity search |
US9516051B1 (en) * | 2015-05-14 | 2016-12-06 | International Business Machines Corporation | Detecting web exploit kits by tree-based structural similarity search |
US9825976B1 (en) * | 2015-09-30 | 2017-11-21 | Fireeye, Inc. | Detection and classification of exploit kits |
US9916448B1 (en) * | 2016-01-21 | 2018-03-13 | Trend Micro Incorporated | Detection of malicious mobile apps |
US10216933B1 (en) * | 2016-09-16 | 2019-02-26 | Symantec Corporation | Systems and methods for determining whether malicious files are targeted |
US20180124109A1 (en) * | 2016-11-02 | 2018-05-03 | RiskIQ, Inc. | Techniques for classifying a web page based upon functions used to render the web page |
US9942264B1 (en) * | 2016-12-16 | 2018-04-10 | Symantec Corporation | Systems and methods for improving forest-based malware detection within an organization |
US10489587B1 (en) * | 2016-12-22 | 2019-11-26 | Symantec Corporation | Systems and methods for classifying files as specific types of malware |
US10581879B1 (en) * | 2016-12-22 | 2020-03-03 | Fireeye, Inc. | Enhanced malware detection for generated objects |
US10484399B1 (en) * | 2017-02-16 | 2019-11-19 | Symantec Corporation | Systems and methods for detecting low-density training regions of machine-learning classification systems |
US10860719B1 (en) * | 2020-03-06 | 2020-12-08 | Cyberark Software Ltd. | Detecting and protecting against security vulnerabilities in dynamic linkers and scripts |
Non-Patent Citations (6)
Title |
---|
Berkhin, Pavel "Survey Of Clustering Data Mining Techniques", Grouping Multidimensional Data: Recent Advances in Clustering, 2002, p. 25-71. |
Curtsinger et al. "ZOZZLE: fast and precise in-browser JavaScript malware detection", SEC'11: Proceedings of the 20th USENIX conference on Security Aug. 2011; 16 Pages. (Year: 2011). * |
Kapravelos, et al. "Revolver: An Automated Approach to the Detection of Evasive Web-based Malware," published in 22nd USENIX Security Symposium, Aug. 14-16, 2013. pp. 636-651; 16 pages. (Year: 2013). * |
Pan et al. "CSPAutoGen: Black-box Enforcement of Content Security Policy Upon Real-world Websites", In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS '16), ACM, Oct. 2016, p. 653-665. |
Pawlik et al. "Tree edit distance: Robust and memory-efficient", Information Systems, 2016, vol. 56, p. 157-173. |
Wang et al. "Constructing format-preserving printing from syntax-directed definitions", Science China Information Sciences, Nov. 2015, vol. 58, p. 1-14. |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210314353A1 (en) * | 2020-04-07 | 2021-10-07 | Target Brands, Inc. | Rule-based dynamic security test system |
US11595436B2 (en) * | 2020-04-07 | 2023-02-28 | Target Brands, Inc. | Rule-based dynamic security test system |
Also Published As
Publication number | Publication date |
---|---|
US20180300480A1 (en) | 2018-10-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11314862B2 (en) | Method for detecting malicious scripts through modeling of script structure | |
US10592676B2 (en) | Application security service | |
Sarmah et al. | A survey of detection methods for XSS attacks | |
US11783035B2 (en) | Multi-representational learning models for static analysis of source code | |
Prokhorenko et al. | Web application protection techniques: A taxonomy | |
US10089464B2 (en) | De-obfuscating scripted language for network intrusion detection using a regular expression signature | |
US11941054B2 (en) | Iterative constraint solving in abstract graph matching for cyber incident reasoning | |
Chumachenko | Machine learning methods for malware detection and classification | |
Borgolte et al. | Delta: automatic identification of unknown web-based infection campaigns | |
US11615184B2 (en) | Building multi-representational learning models for static analysis of source code | |
Pan et al. | Cspautogen: Black-box enforcement of content security policy upon real-world websites | |
Rabadi et al. | Advanced windows methods on malware detection and classification | |
Kasim | An ensemble classification-based approach to detect attack level of SQL injections | |
US20200137126A1 (en) | Creation of security profiles for web application components | |
CN115943613A (en) | Guiltless presumption (IUPG): anti-adversary and anti-false positive deep learning model | |
Otterstad et al. | Low-level exploitation mitigation by diverse microservices | |
Gupta et al. | Evaluation and monitoring of XSS defensive solutions: a survey, open research issues and future directions | |
Dib et al. | EVOLIoT: A self-supervised contrastive learning framework for detecting and characterizing evolving IoT malware variants | |
Gupta et al. | A client‐server JavaScript code rewriting‐based framework to detect the XSS worms from online social network | |
Cavalli et al. | Design of a secure shield for internet and web-based services using software reflection | |
CN110659478B (en) | Method for detecting malicious files preventing analysis in isolated environment | |
Sharif | Web attacks analysis and mitigation techniques | |
US20230306114A1 (en) | Method and system for automatically generating malware signature | |
Hiremath | A novel approach for analyzing and classifying malicious web pages | |
Sayed et al. | Detection and mitigation of malicious JavaScript using information flow control |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: TALA SECURITY, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SAWHNEY, SANJAY;BHALODE, SWAPNIL;DAVIDSON, ANDREW JOSEPH;AND OTHERS;SIGNING DATES FROM 20180413 TO 20180414;REEL/FRAME:045553/0180 |
|
FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
|
FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO SMALL (ORIGINAL EVENT CODE: SMAL); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |