CN116595518A - Malicious uniform resource locator URL detection in memory of a data processing unit - Google Patents

Malicious uniform resource locator URL detection in memory of a data processing unit

Info

Publication number
CN116595518A
Authority
CN
China
Prior art keywords
candidate
url
malicious
features
detection system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310009038.4A
Other languages
Chinese (zh)
Inventor
V·格切曼
N·罗森
H·以利沙
B·理查森
R·艾伦
A·萨利赫
R·艾拉布尼
T·阮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mellanox Technologies Ltd
Original Assignee
Mellanox Technologies Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US 17/864,310 (US 2023/0319108 A1)
Application filed by Mellanox Technologies Ltd filed Critical Mellanox Technologies Ltd
Publication of CN116595518A


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/50 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F 21/55 - Detecting local intrusion or implementing counter-measures
    • G06F 21/56 - Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F 21/562 - Static detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Abstract

The present disclosure relates to malicious uniform resource locator (URL) detection in a memory of a data processing unit. Apparatus, systems, and techniques for classifying candidate Uniform Resource Locators (URLs) as malicious using a machine learning (ML) detection system. An integrated circuit is coupled to a physical memory of a host device via a host interface. The integrated circuit hosts a hardware-accelerated security service to protect one or more computer programs executed by the host device. The security service extracts a set of features from data stored in the physical memory, the set of features including words in a candidate URL and numerical features of the URL structure of the candidate URL. The security service classifies the candidate URL as malicious or benign using the set of features and an ML detection system. The security service outputs an indication of a malicious URL in response to the candidate URL being classified as malicious.

Description

Malicious uniform resource locator URL detection in memory of a data processing unit
RELATED APPLICATIONS
The present application claims the benefit of U.S. provisional application No. 63/309,849, filed on February 14, 2022, the entire contents of which are incorporated herein by reference. The present application is related to co-pending U.S. applications entitled "MALICIOUS ACTIVITY DETECTION IN MEMORY OF A DATA PROCESSING UNIT USING MACHINE LEARNING DETECTION MODELS," "RANSOMWARE DETECTION IN MEMORY OF A DATA PROCESSING UNIT USING MACHINE LEARNING DETECTION MODELS," and "MALICIOUS DOMAIN GENERATION ALGORITHM (DGA) DETECTION IN MEMORY OF A DATA PROCESSING UNIT USING MACHINE LEARNING DETECTION MODELS."
Technical Field
At least one embodiment relates to processing resources for performing and facilitating operations for detecting whether one or more computer programs are subject to malicious activity. For example, in accordance with various novel techniques described herein, at least one embodiment relates to a processor or computing system for providing and enabling a Data Processing Unit (DPU) to determine whether one or more computer programs executed by a host device are affected by malicious activity using a Machine Learning (ML) detection system based on features extracted from data stored in a physical memory of the host device.
Background
Machine learning involves training a computing system (using training data) to identify features in the data that may facilitate detection and classification. Training may be supervised or unsupervised. The machine learning model may use various computing algorithms, such as decision tree algorithms (or other rule-based algorithms), artificial neural networks, and the like. In the inference phase, new data is input into a trained machine learning model, which can classify items of interest using features identified during training.
Drawings
Various embodiments according to the present disclosure will be described with reference to the accompanying drawings, in which:
FIG. 1A is a block diagram of an example system architecture in accordance with at least one embodiment.
FIG. 1B is a block diagram of an example system architecture in accordance with at least one embodiment.
FIG. 2 is a flowchart of an example method of malicious activity detection of data stored in memory associated with one or more computer programs executed by a host device, in accordance with at least one embodiment.
FIG. 3A is a diagram of an example random forest classification model in accordance with at least one embodiment.
FIG. 3B is a block diagram of an example system architecture for a ransomware detection system, according to at least one embodiment.
FIG. 3C is a block diagram of an example system architecture for a ransomware detection system, according to at least one embodiment.
FIG. 4 is a flowchart of an example method of ransomware detection using a random forest classification model, in accordance with at least one embodiment.
FIG. 5A is a block diagram of an example malicious Uniform Resource Locator (URL) detection system in accordance with at least one embodiment.
FIG. 5B is a block diagram of an example system architecture for a malicious URL detection system in accordance with at least one embodiment.
FIG. 5C is a block diagram of an example system architecture for a malicious URL detection system in accordance with at least one embodiment.
FIG. 6 illustrates a URL structure of a candidate URL in accordance with at least one embodiment.
FIG. 7 is a flow diagram of an example method of malicious URL detection using a binary classification model in accordance with at least one embodiment.
FIG. 8A is a block diagram of an example Domain Generation Algorithm (DGA) detection system, according to at least one embodiment.
FIG. 8B is a block diagram of an example system architecture for a DGA detection system in accordance with at least one embodiment.
FIG. 8C is a block diagram of an example system architecture for a DGA detection system in accordance with at least one embodiment.
FIG. 9A is a graph illustrating a precision-recall curve of a binary classification model of a DGA detection system according to at least one embodiment.
FIG. 9B is a graph illustrating training data prior to Uniform Manifold Approximation and Projection (UMAP) dimensionality reduction in accordance with at least one embodiment.
FIG. 9C is a graph illustrating training data after UMAP dimensionality reduction in accordance with at least one embodiment.
FIG. 10 is a flowchart of an example method of DGA detection using a two-stage classification model, according to at least one embodiment.
Detailed Description
Malicious activity can cause damage to a computer system. Malicious activity may be caused by malware (also known as malicious software or malicious code). Malware is any software intentionally designed to damage a computer, server, client, or computer network, leak private information, gain unauthorized access to information or resources, deprive a user of access to information, or otherwise interfere with the user's computer security and privacy. Common malware includes computer viruses (e.g., Trojan horses) and other infectious malware, worms, spyware, adware, rogue software, wipers, scareware, ransomware, backdoors, phishing, and the like.
One type of malicious activity is caused by ransomware. Ransomware is malware designed to deny a user or organization access to the files on their computer. Ransomware may be encryption-based or screen-lock-based. For example, by encrypting an organization's documents and demanding a ransom payment for the decryption key, ransomware places the organization in a position where paying the ransom is the easiest and cheapest way to regain access to its documents. Ransomware has rapidly become the most prominent and visible type of malware. Recent ransomware attacks have impacted hospitals' ability to provide critical services, crippled municipal public services, and caused significant losses to various organizations. Existing security solutions for ransomware are installed on hosts or virtual machines (e.g., agent-based antivirus solutions). These existing solutions are inadequate because malware can hide from them. Furthermore, these tools largely fail to detect new, unknown malware because they are mostly based on static analysis, while malware with different static features is easy to create.
Another type of malicious activity is caused by malicious URLs. Malicious URLs are links created to promote scams, attacks, and fraud. By clicking on a malicious URL, a user may download ransomware, a virus, a Trojan horse, or any other type of malware that can compromise the machine or even the organization's network. Malicious URLs may also be used to persuade users to provide sensitive information on fake websites. Existing security solutions for malicious URLs are inadequate because they focus only on detecting malicious URLs by monitoring external sources, such as email, downloaded files, and the like. This means that if a URL penetrates the host or virtual machine, current detection systems will not discover it until it is used in an external source. Sometimes, attackers use encryption or obfuscation techniques to hide malicious URLs in files. These URLs are hidden or obfuscated in files to evade scanning or to entice the user to click, and are exposed only in memory.
Another type of malicious activity is caused by domain generation algorithm (DGA) malware. DGA malware establishes a connection with a command and control (C&C) server by periodically generating a large number of candidate domain names for the command and control server and querying all of these algorithmically generated domain names to resolve the Internet Protocol (IP) address of the command and control server. An adversary, using the same algorithm embedded in the DGA malware, registers one of these DGA-generated domain names for the command and control server in advance. Eventually, the malware queries the domain name registered in advance by the adversary and resolves the IP address of the command and control server. The malware then begins communicating with the command and control server to receive new commands and updates. If the DGA malware does not find the command and control server among its previous domain names, it queries the next set of DGA-generated domain names until an available domain name is found. Existing security solutions for DGA malware detect DGA domain names when the DGA malware issues a Domain Name System (DNS) request to resolve the IP address of the command and control server.
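To make this mechanism concrete, the following is a minimal, hypothetical Python sketch of the kind of seed-based generation scheme described above; the hashing scheme, label length, and TLD are illustrative assumptions, not a reproduction of any particular malware family:

    import hashlib

    def generate_candidate_domains(seed: str, count: int = 5, tld: str = ".com") -> list:
        """Deterministically derive candidate domain names from a shared seed.

        Because the malware and the adversary's registration tooling embed the
        same algorithm and seed, both sides arrive at the same candidate list
        without ever communicating.
        """
        domains = []
        for i in range(count):
            digest = hashlib.sha256(f"{seed}-{i}".encode()).hexdigest()
            domains.append(digest[:12] + tld)  # first 12 hex chars as the label
        return domains

    print(generate_candidate_domains("example-seed"))

The defender's problem, addressed below, is that such labels look random and change constantly, so static blocklists cannot keep up.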
Embodiments of the present disclosure address the above and other deficiencies by hosting a hardware-accelerated security service on an accelerated hardware engine of an integrated circuit. The hardware-accelerated security service extracts features from data stored in memory and associated with one or more computer programs executed by a central processing unit (CPU), and determines whether the one or more computer programs are subject to malicious activity using an ML detection system based on the features extracted from the data stored in memory. The hardware-accelerated security service outputs an indication of malicious activity in response to determining that the one or more computer programs are affected by the malicious activity. The computer program may be any of a host operating system (OS), an application program, a guest operating system, a guest application program, and the like. The hardware-accelerated security service running on the DPU is an agent-free hardware product that examines the memory of one or more computer programs. Thus, malware is unaware of its presence, and the hardware-accelerated security service may detect malware during an attack, i.e., when the malware exposes itself in memory and is easier to detect. In at least one embodiment, the hardware-accelerated security service is NVIDIA BlueField AppShield. Other hardware-accelerated security services may also be used. The integrated circuit may be a Data Processing Unit (DPU), a programmable data center infrastructure on a chip. The integrated circuit may include a network interface operatively coupled to the CPU and responsible for network data path processing, while the CPU may control path initialization and exception handling.
As mentioned above, existing solutions for ransomware are installed on hosts or virtual machines (e.g., agent-based antivirus solutions) and are inadequate because ransomware can evade them and, due to their reliance on static analysis, they cannot detect new, unknown malware. Embodiments of the present disclosure address the above and other deficiencies of ransomware detection by using an agentless hardware product that examines the memory of one or more computer programs, so that ransomware is unaware of its existence, and by detecting ransomware during an attack, when the ransomware exposes itself. Embodiments of the present disclosure address the above and other deficiencies by taking a series of snapshots of data stored in memory and extracting a set of features from each snapshot in the series, each snapshot representing the data at a point in time. The ML detection system may include a random forest classification model. The random forest classification model is a time-series-based model trained to classify a process as ransomware or non-ransomware using a cascade of different numbers of snapshots in the series (e.g., 3, 5, and 10 snapshots).
As mentioned above, existing solutions for malicious URLs monitor external sources (such as email) for detection and are inadequate because malicious URLs that are encrypted or obfuscated sometimes reveal themselves only in memory. Embodiments of the present disclosure address the above and other deficiencies related to malicious URLs by monitoring memory for malicious URLs, detecting even encrypted and obfuscated URLs. Embodiments of the present disclosure may provide an extended solution to existing security solutions in cases where encrypted or obfuscated URLs are used to circumvent them. Embodiments of the present disclosure address the above and other deficiencies related to malicious URLs by taking a snapshot of data stored in memory and extracting a set of features from the snapshot. The set of features may include words in the candidate URL and numerical features of the candidate URL's structure, such as lengths, counts, and locations of parts of the URL structure. The ML detection system may include a binary classification model trained to classify candidate URLs as malicious or benign using the set of features.
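As an illustrative sketch of this style of feature extraction (the delimiters, feature names, and feature set below are assumptions for illustration, not the patent's specified features), a candidate URL can be decomposed into word tokens plus numerical structure features:

    import re
    from urllib.parse import urlparse

    def extract_url_features(candidate_url: str) -> dict:
        """Split a candidate URL into word tokens and numerical structure features."""
        parsed = urlparse(candidate_url)
        # Word features: split on common URL delimiters.
        words = [w for w in re.split(r"[/.\-_?=&:]+", candidate_url) if w]
        return {
            "words": words,
            "url_length": len(candidate_url),          # length features
            "domain_length": len(parsed.netloc),
            "path_depth": parsed.path.count("/"),      # count features
            "num_query_params": len(parsed.query.split("&")) if parsed.query else 0,
            "has_port": parsed.port is not None,       # indicator features
            "num_dots_in_host": (parsed.hostname or "").count("."),
        }

    print(extract_url_features("http://login.example.test:8080/verify/account?id=1&tok=abc"))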
As described above, existing DGA malware solutions detect DGA domains when the DGA malware issues DNS requests to resolve the command and control server's IP address. Embodiments of the present disclosure address the above-described problems, as well as other deficiencies associated with DGA malware, by detecting DGA domains in memory before the DGA malware attempts to establish a connection with a command and control server. By detecting DGA domains in memory before a connection is established, embodiments of the present disclosure can quickly and efficiently eliminate DGA malware even before it exposes itself with DNS requests. Furthermore, in most cases, DGA malware will generate multiple domains in attempting to connect with command and control servers. Embodiments of the present disclosure may collect the domains for each process, increasing the detection rate of the ML model, since its decision may be based on all of a process's domains considered together, as shown in the sketch below. Embodiments of the present disclosure address the above and other deficiencies with respect to DGA malware by taking a snapshot of data stored in memory and extracting a set of features from the snapshot. The set of features may include one or more candidate URLs. The ML detection system may include a two-stage classification model. The two-stage classification model may include a binary classification model trained, in a first stage, to classify one or more candidate URLs as having DGA domains or non-DGA domains, and a multi-class classification model trained, in a second stage, to classify the DGA family of a DGA domain among a set of DGA families. Equivalently, the binary classification model may be trained to classify one or more candidate URLs as generated by DGA malware, and the multi-class classification model may be trained to classify the DGA family of the DGA malware among a set of DGA malware families.
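A minimal sketch of such a comprehensive per-process decision (the scores, threshold, and aggregation rule are illustrative assumptions; a deployed system could equally use mean score or majority voting):

    def classify_process(domain_scores: list, threshold: float = 0.5,
                         min_flagged_fraction: float = 0.3) -> bool:
        """Aggregate per-domain DGA scores into one process-level decision.

        A process that emits many algorithmically generated domains can be
        flagged even when no single domain is conclusive on its own.
        """
        if not domain_scores:
            return False
        flagged = sum(1 for score in domain_scores if score >= threshold)
        return flagged / len(domain_scores) >= min_flagged_fraction

    print(classify_process([0.2, 0.6, 0.7, 0.1, 0.8]))  # True: 3 of 5 domains flagged
    print(classify_process([0.2, 0.1, 0.3]))            # False: no domain flagged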
System architecture
FIG. 1A is a block diagram of an example system architecture 100 in accordance with at least one embodiment. The system architecture 100 (also referred to herein as a "system" or "computing system") includes an integrated circuit, labeled DPU 102, a host device 104, and a Security Information and Event Management (SIEM) or extended detection and response (XDR) system 106. The system architecture 100 may be part of a data center including one or more data stores, one or more server machines, and other components of the data center infrastructure. In embodiments, the network 108 may include a public network (e.g., the internet), a private network (e.g., a local area network (LAN) or wide area network (WAN)), a wired network (e.g., an Ethernet network), a wireless network (e.g., an 802.11 network or Wi-Fi network), a cellular network (e.g., a Long Term Evolution (LTE) network), routers, hubs, switches, server computers, and/or combinations thereof.
In at least one embodiment, DPU 102 is integrated as a system on a chip (SoC) that can be considered data center infrastructure on a chip. In at least one embodiment, DPU 102 includes DPU hardware 110 and a software framework with acceleration libraries 112. DPU hardware 110 may include a CPU 114 (e.g., a single-core or multi-core CPU), one or more hardware accelerators 116, memory 118, one or more host interfaces 120, and one or more network interfaces 121. The software framework and acceleration libraries 112 may include one or more hardware-accelerated services, including a hardware-accelerated security service 122 (e.g., NVIDIA DOCA AppShield) (also referred to herein as "AppShield"), a hardware-accelerated virtualization service 124, a hardware-accelerated networking service 126, a hardware-accelerated storage service 128, a hardware-accelerated artificial intelligence/machine learning (AI/ML) service 130, and a hardware-accelerated management service 132. In at least one embodiment, DPU 102 includes an ML detection system 134 that includes one or more ML detection models trained to determine whether one or more computer programs executed by host device 104 (e.g., a physical machine or virtual machine (VM)) are subject to malicious activity based on features extracted from data stored in host physical memory 148 associated with the one or more computer programs. The host physical memory 148 may include one or more volatile and/or non-volatile memory devices configured to store data for the host device 104. In at least one embodiment, the ML detection system 134 includes a ransomware detection system 136, a malicious URL detection system 138, a DGA detection system 140, and optionally other malware detection systems 142.
In at least one embodiment, the hardware-accelerated security service 122 includes data extraction logic 146 that extracts data stored in host physical memory 148 (referred to as extracted data 147) through the host interface 120. In at least one embodiment, the data extraction logic 146 may obtain a snapshot or series of snapshots of data stored in the host physical memory 148 through the host interface 120. Each snapshot represents data at a point in time. In at least one embodiment, the data extraction logic 146 has feature extraction logic to extract one or more features and send the extracted features (rather than the extracted data 147) to the ML detection system 134. For example, the data extraction logic 146 may extract candidate URLs from the extracted data 147 and send the candidate URLs to the ML detection system 134.
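As a rough illustration of pulling candidate URLs out of extracted memory data (the regular expression, buffer format, and length bounds are assumptions; the actual behavior of data extraction logic 146 is not specified at this level of detail):

    import re

    URL_PATTERN = re.compile(rb"https?://[\x21-\x7e]{4,200}")  # printable ASCII run

    def extract_candidate_urls(snapshot: bytes) -> list:
        """Scan a raw memory snapshot for URL-shaped byte runs.

        A URL that is encrypted or obfuscated on disk and on the wire must
        eventually be decoded in memory, which is where this scan can see it.
        """
        return [match.group().decode("ascii", errors="ignore")
                for match in URL_PATTERN.finditer(snapshot)]

    blob = b"\x00\x00GET http://suspicious.example/payload.bin\x00padding"
    print(extract_candidate_urls(blob))  # ['http://suspicious.example/payload.bin']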
In at least one embodiment, the data extraction logic 146 extracts and sends a series of snapshots to the ML detection system 134, and the ML detection system 134 includes feature extraction logic 144 to extract a set of features from different process plug-ins (e.g., memory plug-ins). Feature extraction logic 144 extracts a set of features from the different memory plug-ins for each snapshot of the series of snapshots. In at least one embodiment, the extracted features are fed into a ransomware detection system 136. In at least one embodiment, the ransomware detection system 136 includes a random forest classification model. The random forest classification model may be a time-series-based model trained to classify a process as ransomware or non-ransomware using a cascade of different numbers of snapshots in the series of snapshots. In at least one embodiment, the cascade of different numbers of snapshots includes a first number of snapshots taken over a first amount of time, a second number of snapshots taken over a second amount of time greater than the first amount of time, and a third number of snapshots taken over a third amount of time greater than the second amount of time. The second number of snapshots includes the first number of snapshots, and the third number of snapshots includes the second number of snapshots. Additional details of the different memory plug-ins and the random forest classification model are described below with respect to FIGS. 3A-4.
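A minimal sketch of such a snapshot cascade using scikit-learn (the per-snapshot feature width, the synthetic training data, and the any-stage-fires decision rule are illustrative assumptions):

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    N_FEATURES = 32  # assumed number of plug-in features per snapshot

    # One model per cascade stage; each stage sees a longer observation window.
    cascade = {n: RandomForestClassifier(n_estimators=100, random_state=0)
               for n in (3, 5, 10)}

    # Training on synthetic data for illustration only; real feature vectors
    # come from the memory plug-ins (process lists, connections, modules, ...).
    rng = np.random.default_rng(0)
    for n, model in cascade.items():
        X = rng.normal(size=(200, n * N_FEATURES))  # 200 labeled processes
        y = rng.integers(0, 2, size=200)            # 1 = ransomware, 0 = benign
        model.fit(X, y)

    def classify(snapshots: np.ndarray) -> bool:
        """Flag a process when any stage with enough snapshots predicts ransomware."""
        return any(model.predict(snapshots[:n].reshape(1, -1))[0] == 1
                   for n, model in cascade.items() if len(snapshots) >= n)

    print(classify(rng.normal(size=(5, N_FEATURES))))  # stages 3 and 5 are consulted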
In at least one embodiment, the data extraction logic 146 extracts and sends a snapshot to the ML detection system 134, and the ML detection system 134 includes feature extraction logic 144 to extract a set of features from the snapshot. The set of features includes words in a candidate URL and numerical features of the URL structure of the candidate URL. The URL structure may include a scheme, subdomain, domain, top-level domain (TLD), port, path, query, fragment, or other structure, such as a secondary domain, subdirectory, etc., as shown in FIG. 6. The numerical features may include the lengths of the words in the candidate URL and counts of different parts of the URL structure (e.g., two TLDs, an indication of a port, three fragments, or the like). In at least one embodiment, feature extraction logic 144 may extract the candidate URL and the numerical features and tokenize the words into tokens. In at least one embodiment, malicious URL detection system 138 includes a binary classification model trained to classify candidate URLs as malicious or benign using the set of features. In at least one embodiment, the binary classification model includes an embedding layer, a long short-term memory (LSTM) layer, and a fully connected neural network layer. The embedding layer receives an input sequence of tokens representing the words in the candidate URL and generates an input vector from the input sequence of tokens. The LSTM layer is trained to generate an output vector based on the input vector. The fully connected neural network layer is trained to classify the candidate URL as malicious or benign using the output vector of the LSTM layer and the numerical features of the URL structure. Additional details regarding the URL features and the binary classification model are described below with respect to FIGS. 5A-7.
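A minimal PyTorch sketch of this architecture (the vocabulary size, embedding and hidden dimensions, and the number of numerical features are illustrative assumptions):

    import torch
    import torch.nn as nn

    class MaliciousURLClassifier(nn.Module):
        """Embedding -> LSTM over word tokens, concatenated with numerical
        URL-structure features, followed by a fully connected classifier head."""

        def __init__(self, vocab_size=10000, embed_dim=64, hidden_dim=128,
                     num_numeric=16):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            self.head = nn.Sequential(
                nn.Linear(hidden_dim + num_numeric, 64),
                nn.ReLU(),
                nn.Linear(64, 1),  # single logit: malicious vs. benign
            )

        def forward(self, token_ids, numeric_features):
            embedded = self.embedding(token_ids)   # (batch, seq, embed_dim)
            _, (h_n, _) = self.lstm(embedded)      # h_n: (1, batch, hidden_dim)
            combined = torch.cat([h_n[-1], numeric_features], dim=1)
            return self.head(combined)

    model = MaliciousURLClassifier()
    tokens = torch.randint(1, 10000, (2, 20))  # two candidate URLs, 20 tokens each
    numeric = torch.randn(2, 16)               # length/count/indicator features
    print(torch.sigmoid(model(tokens, numeric)))  # per-URL malicious probability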
In at least one embodiment, the data extraction logic 146 extracts and sends a snapshot to the ML detection system 134, and the ML detection system 134 includes feature extraction logic 144 to extract a set of features from the snapshot. The set of features includes domain characters in one or more candidate URLs. The domain of a candidate URL may include a plurality of domain characters, and feature extraction logic 144 may extract the domain characters as features of the one or more candidate URLs. Feature extraction logic 144 may tokenize the domain characters into tokens. In at least one embodiment, DGA detection system 140 includes a two-stage classification model. The two-stage classification model may include a binary classification model in the first stage and a multi-class classification model in the second stage. The binary classification model is trained to classify the one or more candidate URLs as having a DGA domain or a non-DGA domain using the set of features in the first stage. The multi-class classification model is trained to classify the DGA family of a DGA domain among a set of DGA families using the set of features in the second stage. In at least one embodiment, the binary classification model is a convolutional neural network (CNN) having an embedding layer to receive an input sequence of tokens representing the domain characters in the one or more candidate URLs and generate an input vector based on the input sequence of tokens. The CNN is trained to classify the one or more candidate URLs as having a DGA domain or a non-DGA domain using the input vector from the embedding layer in the first stage. In at least one embodiment, the multi-class classification model includes a twin network (Siamese network) of CNNs with embedding layers. The twin network is trained to classify the DGA family using the input vector from the embedding layer in the second stage. Additional details regarding the DGA features and the two-stage classification model are described below with respect to FIGS. 8A-10.
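A minimal PyTorch sketch of the first-stage character-level CNN (the character vocabulary, embedding width, filter count, and kernel size are illustrative assumptions; the second-stage twin network is only indicated in a comment):

    import torch
    import torch.nn as nn

    class DGABinaryClassifier(nn.Module):
        """Stage one: an embedding layer over domain characters feeding a 1-D
        convolution, max-pooled into a single DGA vs. non-DGA logit."""

        def __init__(self, num_chars=40, embed_dim=32, num_filters=64,
                     kernel_size=4):
            super().__init__()
            self.embedding = nn.Embedding(num_chars, embed_dim, padding_idx=0)
            self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size)
            self.fc = nn.Linear(num_filters, 1)

        def forward(self, char_ids):
            x = self.embedding(char_ids).transpose(1, 2)  # (batch, embed, seq)
            x = torch.relu(self.conv(x))
            x = torch.amax(x, dim=2)                      # global max pooling
            return self.fc(x)

    # Stage two (the twin/Siamese network) would reuse the same embedding and
    # convolutional encoder on pairs of domains to place a detected DGA domain
    # within a family by similarity.
    model = DGABinaryClassifier()
    batch = torch.randint(1, 40, (3, 30))  # three domains, 30 characters each
    print(torch.sigmoid(model(batch)))     # per-domain DGA probability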
In at least one embodiment, the ML detection system 134 can output an indication 149 of the classification made by the ML detection system 134. The indication 149 may be an indication of ransomware, an indication of a malicious URL, an indication of a DGA domain, an indication that one or more computer programs executed by the host device 104 are subject to malicious activity, an indication of a classification by other malware detection systems 142, or the like. In at least one embodiment, the ML detection system 134 may send the indication 149 to the hardware-accelerated security service 122, and the hardware-accelerated security service 122 may send an alert 151 to the SIEM or XDR system 106. Alert 151 may include information about the ransomware, malicious URL, DGA domain, or the like. In at least one embodiment, the ML detection system 134 may send an indication 155 to the SIEM or XDR system 106 in addition to or in lieu of sending the indication 149 to the hardware-accelerated security service 122.
In at least one embodiment, DPU 102 extracts the data stored in host physical memory 148 and sends the extracted data 147 to another computing system hosting an ML detection system, as shown in FIG. 1B, wherein ML detection system 154 is hosted on accelerated AI/ML pipeline 153. In at least one embodiment, the accelerated AI/ML pipeline may be the NVIDIA Morpheus network security platform. The accelerated AI/ML pipeline 153 may perform preprocessing operations, inference, post-processing operations, actions, or any combination thereof. The accelerated AI/ML pipeline 153 may be a combination of hardware and software, such as an NVIDIA EGX platform and software for accelerating AI/ML operations on the NVIDIA EGX platform. The accelerated AI/ML pipeline 153 may provide advantages such as processing up to 60 times faster than a CPU. The accelerated AI/ML pipeline 153 may also provide the advantage that many inferences can be made in parallel (e.g., up to millions of parallel inferences). Additional details of the ML detection system 154 are described below in conjunction with FIG. 1B.
It should be noted that, unlike a CPU or graphics processing unit (GPU), DPU 102 is a new class of programmable processor that incorporates three key elements, including, for example: 1) an industry-standard, high-performance, software-programmable CPU (single-core or multi-core) tightly coupled with the other SoC components; 2) a high-performance network interface that can parse, process, and efficiently transfer data to GPUs and CPUs at line rate, or at the speed of the rest of the network; and 3) a rich set of flexible and programmable acceleration engines that can offload and improve application performance for AI and machine learning, security, telecommunications, storage, and the like. These capabilities can provide an isolated, bare-metal, cloud-native computing platform for cloud-scale computing. In at least one embodiment, DPU 102 may be used as a stand-alone embedded processor. In at least one embodiment, DPU 102 may be incorporated into a network interface controller (also known as a smart network interface card (SmartNIC)) used as a component in a server system. A DPU-based network interface card (network adapter) can offload processing tasks that would otherwise be performed by the server system's CPU. A DPU-based SmartNIC may perform any combination of encryption/decryption, firewall, Transmission Control Protocol/Internet Protocol (TCP/IP), and Hypertext Transfer Protocol (HTTP) processing using its own onboard processor. For example, a SmartNIC may be used for high-traffic web servers.
In at least one embodiment, DPU 102 may be configured for modern cloud workloads and high-performance computing in traditional enterprises. In at least one embodiment, DPU 102 may provide a set of software-defined networking, storage, security, and management services (e.g., 122-132) at data center scale, with the ability to offload, accelerate, and isolate data center infrastructure. In at least one embodiment, DPU 102 may provide a multi-tenant, cloud-native environment through these software services. In at least one embodiment, DPU 102 may provide data center services that could otherwise consume up to hundreds of CPU cores, freeing up valuable CPU cycles to run critical business applications. In at least one embodiment, DPU 102 may be considered a new type of processor designed to process data center infrastructure software, offloading and accelerating the computational load of virtualization, networking, storage, security, cloud-native AI/ML services, and other management services (e.g., 122-132).
In at least one embodiment, DPU 102 may include connections to packet-based interconnects (e.g., Ethernet), switch fabric interconnects (e.g., InfiniBand, Fibre Channel, Omni-Path), or the like. In at least one embodiment, DPU 102 may provide an accelerated, fully programmable data center configured with security (e.g., zero-trust security) to prevent data leakage and network attacks. In at least one embodiment, DPU 102 may include a network adapter, an array of processor cores, and an infrastructure offload engine with full software programmability. In at least one embodiment, DPU 102 may sit at the edge of a server to provide flexible, secure, high-performance cloud and AI workloads. In at least one embodiment, DPU 102 may reduce the total cost of ownership and increase data center efficiency. In at least one embodiment, DPU 102 may provide a software framework 112 (e.g., NVIDIA DOCA™) enabling developers to quickly create applications and services for DPU 102, such as security services 122, virtualization services 124, networking services 126, storage services 128, AI/ML services 130, and management services 132. In at least one embodiment, the ML detection system 134 is implemented in the AI/ML service 130. In another embodiment, ML detection system 134 is implemented on one or more hardware accelerators 116 or other components of DPU hardware 110. In at least one embodiment, software framework 112 facilitates utilizing the hardware accelerators of DPU 102 to provide data center performance, efficiency, and security.
In at least one embodiment, DPU 102 may provide networking services 126 with virtual switches (vSwitches), virtual routers (vRouters), network address translation (NAT), load balancing, and network function virtualization (NFV). In at least one embodiment, DPU 102 may provide storage services 128, including NVMe™ over Fabrics (NVMe-oF™) technology, elastic storage virtualization, hyper-converged infrastructure (HCI) encryption, data integrity, compression, deduplication, or the like. NVM Express™ is an open logical device interface specification for accessing non-volatile storage media attached via a PCI Express (PCIe) interface. NVMe-oF™ provides an efficient mapping of NVMe commands onto several network transport protocols, enabling one computer (the "initiator") to access block-level storage devices attached to another computer (the "target") very efficiently and with minimal latency. The term "fabric" is a generalization of the more specific concepts of networks and input/output (I/O) channels; it essentially refers to an N:M interconnect of elements, typically in a peripheral environment. NVMe-oF™ technology enables the NVMe command set to be transported over various interconnect infrastructures, including networks (e.g., Internet Protocol (IP)/Ethernet) and I/O channels (e.g., Fibre Channel). In at least one embodiment, DPU 102 may provide security services 122 using next-generation firewalls (NGFW), intrusion detection systems (IDS), intrusion prevention systems (IPS), a root of trust, micro-segmentation, distributed denial of service (DDoS) prevention techniques, and ML detection using data extraction logic 146 (AppShield) and ML detection system 134. An NGFW is a network security device that provides capabilities beyond a stateful firewall, such as application awareness and control, integrated intrusion prevention, and cloud-delivered threat intelligence. In at least one embodiment, the one or more network interfaces 121 may include an Ethernet interface (single-port or dual-port) and an InfiniBand interface (single-port or dual-port). In at least one embodiment, the one or more host interfaces 120 may include a PCIe interface and a PCIe switch. In at least one embodiment, one or more host interfaces 120 can include other memory interfaces. In at least one embodiment, the CPU 114 may include multiple cores (e.g., up to 8 64-bit core pipelines), with an L2 cache shared per pair of cores, an L3 cache with an eviction policy, support for double data rate (DDR) dual in-line memory modules (DIMMs) (e.g., DDR4 DIMMs), and a DDR4 DRAM controller. Memory 118 may be an on-board DDR4 memory supporting error-correcting code (ECC) error protection. In at least one embodiment, CPU 114 may include a single core and a DRAM controller with L2 and L3 caches. In at least one embodiment, the one or more hardware accelerators 116 may include a security accelerator, a storage accelerator, and a networking accelerator. In at least one embodiment, the ML detection system 134 is hosted by the security accelerator.
In at least one embodiment, the security accelerator may provide secure boot with a hardware root of trust, secure firmware updates, Cerberus compliance, regular expression (RegEx) acceleration, IP Security (IPsec)/Transport Layer Security (TLS) data-in-motion encryption with AES-GCM 128/256-bit keys, data-at-rest encryption (e.g., Advanced Encryption Standard (AES) with ciphertext stealing (XTS), such as AES-XTS 256/512), a Secure Hash Algorithm (SHA) 256-bit hardware accelerator, hardware public key accelerators (e.g., Rivest-Shamir-Adleman (RSA), Diffie-Hellman, Digital Signature Algorithm (DSA), ECC, Elliptic Curve Digital Signature Algorithm (EC-DSA), Elliptic Curve Diffie-Hellman (EC-DH)), and a true random number generator (TRNG). In at least one embodiment, the storage accelerator may provide BlueField SNAP for NVMe™ and VirtIO-blk, NVMe-oF™ acceleration, compression and decompression acceleration, and data hashing and deduplication. In at least one embodiment, the network accelerator may provide Remote Direct Memory Access (RDMA) over Converged Ethernet (RoCE), Zero Touch RoCE, stateless offloads for TCP, IP, and User Datagram Protocol (UDP), large receive offload (LRO), large send offload (LSO), checksum, transmit side scaling (TSS), receive side scaling (RSS), header-data split (HDS), virtual local area network (VLAN) insertion/stripping, single root I/O virtualization (SR-IOV), virtual Ethernet cards (e.g., VirtIO-net), multiple functions per port, VMware NetQueue support, virtualization hierarchies, and ingress and egress quality of service (QoS) levels (e.g., 1K ingress and egress QoS levels). In at least one embodiment, DPU 102 may also provide boot options including secure boot (RSA authenticated), remote boot over Ethernet, remote boot over Internet Small Computer Systems Interface (iSCSI), Preboot Execution Environment (PXE), and Unified Extensible Firmware Interface (UEFI).
In at least one embodiment, DPU 102 may provide management services including a 1GbE out-of-band management port, a network controller sideband interface (NC-SI), Management Component Transport Protocol (MCTP) over System Management Bus (SMBus) and MCTP over PCIe, Platform Level Data Model (PLDM) for monitoring and control, PLDM for firmware updates, Inter-Integrated Circuit (I2C) interfaces for device control and configuration, Serial Peripheral Interface (SPI) to flash memory, an embedded multimedia card (eMMC) memory controller, a universal asynchronous receiver/transmitter (UART), and Universal Serial Bus (USB).
In at least one embodiment, the hardware-accelerated security service 122 is an adaptive cloud security service (e.g., NVIDIA AppShield) that provides real-time network visibility, detection, and response to network threats. In at least one embodiment, hardware-accelerated security service 122 acts as a monitoring or telemetry agent for DPU 102 or for a network security platform (e.g., 153 in FIG. 1B), such as the NVIDIA Morpheus platform, an AI-enabled, cloud-native network security platform. The Morpheus platform is an open application framework that enables network security developers to create AI/ML pipelines that filter, process, and classify large amounts of real-time data, enabling clients to continuously inspect network and server telemetry data at scale. The NVIDIA Morpheus platform can provide information security for a data center, enabling dynamic protection, real-time telemetry, and adaptive defense to detect and remediate network security threats.
Previously, users, devices, data, and applications within a data center were implicitly trusted, and perimeter security was considered sufficient to protect them from external threats. In at least one embodiment, DPU 102, using hardware-accelerated security service 122, may define a security perimeter with a zero-trust protection model that recognizes that no one and nothing inside or outside the network can be implicitly trusted. The hardware-accelerated security service 122 may implement network protection by encrypting, applying fine-grained access control to, and micro-segmenting every host and all network traffic. The hardware-accelerated security service 122 may provide isolation, deploying security agents in a trusted domain independent of the host domain. In the event that the host device is compromised, this isolation of the hardware-accelerated security service 122 may prevent malware from knowing about or accessing the hardware-accelerated security service 122, helping to prevent the attack from spreading to other servers. In at least one embodiment, the hardware-accelerated security service 122 described herein may provide host monitoring, enabling network security providers to create accelerated intrusion detection system (IDS) solutions to identify attacks on any physical or virtual machine. The hardware-accelerated security service 122 may feed the SIEM or XDR system 106 with data regarding application status. The hardware-accelerated security service 122 may also provide enhanced forensic investigation and incident response.
As described above, an attacker may attempt to exploit vulnerabilities in security control mechanisms to move laterally to other servers and devices in the data center network. The hardware-accelerated security service 122 described herein may enable a security team to shield its applications, continually verify their integrity, and in turn detect malicious activity. In the event that an attacker kills the processes of a security control mechanism, the hardware-accelerated security service 122 described herein may mitigate the attack by quarantining the compromised host device, preventing malware from accessing confidential data or spreading to other resources.
Traditionally, security tools operate in the same host domain as the malware. Thus, malware may employ hiding techniques on the host device that enable it to silently take over and tamper with agents and the operating system (OS). For example, where antivirus software on a host device needs to keep running without being suspended, the hardware-accelerated security service 122 described herein can actively monitor that process to detect any anomalies, malware, or intrusions, as described in more detail in the various embodiments below. In this case, the malware runs in the host domain, while the hardware-accelerated security service 122 runs in a domain separate from the host domain.
Host device 104 may be a desktop computer, a notebook computer, a smart phone, a tablet computer, a server, or any suitable computing device capable of performing the techniques described herein. In some embodiments, host device 104 may be a computing device of a cloud computing platform. For example, the host device 104 may be a server machine of a cloud computing platform or a component of a server machine. In such embodiments, host device 104 may be coupled to one or more edge devices (not shown) through network 108. An edge device refers to a computing device that is capable of communicating between computing devices at the boundary of two networks. For example, the edge device may be connected to the host device 104, one or more data stores, one or more server machines, through the network 108, and to one or more terminal devices (not shown) through another network. In such examples, the edge device may enable communication between host device 104, one or more data stores, one or more server machines, and one or more client devices. In other or similar embodiments, the host device 104 may be an edge device or a component of an edge device. For example, the host device 104 may facilitate communication between one or more data stores connected to the host device 104 via the network 108, one or more server machines, and one or more client devices connected to the host device 104 via another network.
In still other or similar embodiments, the host device 104 may be a terminal device or a component of a terminal device. For example, host device 104 may be, or be a component of, a device such as a television, a smart phone, a cellular phone, a data center server, a DPU, a personal digital assistant (PDA), a portable media player, a netbook, a laptop, an e-book reader, a tablet, a desktop computer, a set-top box, a gaming console, a computing device for an autonomous vehicle, a monitoring device, and so forth. In such embodiments, host device 104 may be connected to DPU 102 through network 108 via one or more network interfaces 121. In other or similar embodiments, host device 104 may be connected to an edge device (not shown) through another network, and the edge device may be connected to DPU 102 through network 108.
In at least one embodiment, the host device 104 executes one or more computer programs. The one or more computer programs may be any process, routine, or code executed by the host device 104, such as a host operating system, an application program, a guest operating system of a virtual machine, or a guest application program, such as one executing in a container. Host device 104 may include one or more single-core CPUs, one or more multi-core CPUs, one or more GPUs, one or more hardware accelerators, or the like.
In at least one embodiment, one or more computer programs reside in a first computing domain (e.g., a host domain), and the hardware-accelerated security service 122 and the ML detection system 134 reside in a second computing domain (e.g., a DPU domain or an infrastructure domain) that is different from the first computing domain. In at least one embodiment, the malicious activity is caused by malware, and the hardware-accelerated security service 122 is out-of-band security software in a trusted domain that is distinct and separate from the malware. That is, malware may reside in the host domain, while the hardware-accelerated security service 122, because it is in the trusted domain, may monitor physical memory to detect malware in the host domain. In at least one embodiment, DPU 102 includes a Direct Memory Access (DMA) controller (not shown in FIG. 1A) coupled with host interface 120. The DMA controller may read data from the host physical memory 148 through the host interface 120. In at least one embodiment, the DMA controller uses PCIe technology to read data from the host physical memory 148. In addition, other techniques may be used to read data from the host physical memory 148.
Although the various embodiments described above are for embodiments in which hardware-accelerated security services 122 and ML detection system 134 are implemented in DPU 102, in other embodiments, some operations are performed on DPU 102 while other operations are performed on another computing device, such as described and illustrated in fig. 1B. In other embodiments, DPU 102 may be any computing system or computing device capable of performing the techniques described herein.
FIG. 1B is a block diagram of an example system architecture 150 in accordance with at least one embodiment. The system architecture 150 is similar to the system architecture 100, as indicated by similar reference numerals, except as described below. The system architecture 150 includes an integrated circuit, labeled DPU 152, the host device 104, the SIEM or XDR system 106, and an accelerated AI/ML pipeline 153. As described above, the accelerated AI/ML pipeline 153 is a network security platform. In at least one embodiment, the accelerated AI/ML pipeline may be the NVIDIA Morpheus network security platform. As described above, the NVIDIA Morpheus platform is an AI-enabled, cloud-native network security platform and an open application framework that enables network security developers to create AI/ML pipelines that filter, process, and classify large amounts of real-time data, enabling clients to continuously inspect network and server telemetry data at scale, with dynamic protection, real-time telemetry, and adaptive defense to detect and remediate network security threats. In at least one embodiment of FIG. 1B, DPU 152 extracts data stored in host physical memory 148 and sends the extracted data 147 to the accelerated AI/ML pipeline 153, which hosts ML detection system 154. In this embodiment, the ML detection system 154 includes a ransomware detection system 136, a malicious URL detection system 138, a DGA detection system 140, and optionally other malware detection systems 142, similar to the ML detection system 134 of FIG. 1A.
In at least one embodiment, the ML detection system 154 can output an indication 149 of the classification made by the ML detection system 154. The indication 149 may be an indication of ransomware, an indication of a malicious URL, an indication of a DGA domain, an indication that one or more computer programs executed by the host device 104 are subject to malicious activity, an indication of a classification by other malware detection systems 142, or the like. In at least one embodiment, the ML detection system 154 may send the indication 149 to the hardware-accelerated security service 122, and the hardware-accelerated security service 122 may send an alert 151 to the SIEM or XDR system 106. Alert 151 may include information about the ransomware, malicious URL, DGA domain, etc. In at least one embodiment, the ML detection system 154 may send an indication 155 to the SIEM or XDR system 106 in addition to or in lieu of sending the indication 149 to the hardware-accelerated security service 122.
In at least one embodiment, one or more computer programs reside in a first computing domain (e.g., a host domain), and the hardware-accelerated security service 122 and the ML detection system 154 reside in a second computing domain (e.g., a DPU domain) that is different from the first computing domain. In another embodiment, one or more computer programs reside in a first computing domain (e.g., a host domain), the hardware-accelerated security service 122 resides in a second computing domain (e.g., a DPU domain), and the ML detection system 154 resides in a third computing domain that is different from the first and second computing domains.
In at least one embodiment, the malicious activity is caused by malware, and the hardware-accelerated security service 122 is out-of-band security software in a trusted domain that is distinct and separate from the malware. That is, malware may reside in the host domain, while the hardware-accelerated security service 122, because it is in the trusted domain, may monitor physical memory to detect malware in the host domain. In at least one embodiment, DPU 152 includes a DMA controller (not shown in FIG. 1B) coupled with host interface 120. The DMA controller may read data from the host physical memory 148 through the host interface 120. In at least one embodiment, the DMA controller uses PCIe technology to read data from the host physical memory 148. In addition, other techniques may be used to read data from the host physical memory 148.
Additional details of the ransomware detection system 136 are described below in connection with FIGS. 3A-4. Additional details of the malicious URL detection system 138 are described below with respect to FIGS. 5A-7. Additional details of DGA detection system 140 are described below with respect to FIGS. 8A-10. The following are additional details of the general operation of detecting malicious activity using i) DPU 102, including the hardware-accelerated security service 122 and the ML detection system 134; or ii) DPU 152, including the hardware-accelerated security service 122, together with the accelerated AI/ML pipeline 153 (also referred to as accelerated pipeline hardware), including the ML detection system 154.
FIG. 2 is a flowchart of an example method 200 of malicious activity detection on data in memory associated with one or more computer programs executed by a host device, in accordance with at least one embodiment. In at least one embodiment, method 200 may be performed by processing logic of DPU 102. In at least one embodiment, the method 200 may be performed by processing logic of the DPU 152 and processing logic of the accelerated AI/ML pipeline 153. The processing logic may be hardware, firmware, software, or any combination thereof. The method 200 may be performed by one or more data processing units (e.g., a DPU, a CPU, and/or a GPU), which may include (or communicate with) one or more memory devices. In at least one embodiment, the method 200 may be performed by multiple processing threads, each thread performing one or more separate functions, routines, subroutines, or operations of the method. In at least one embodiment, the processing threads implementing method 200 may be synchronized (e.g., using signals, critical sections, and/or other thread synchronization logic). Alternatively, the processing threads implementing method 200 may execute asynchronously with respect to each other. The various operations of method 200 may be performed in a different order than that shown in FIG. 2. Some operations of the method may be performed concurrently with other operations. In at least one embodiment, one or more of the operations shown in FIG. 2 may not always be performed.
Referring to FIG. 2, processing logic (of the DPU 102 or 152) extracts a plurality of features from data stored in a memory associated with one or more computer programs executed by a host device (block 202). Processing logic determines whether the one or more computer programs are affected by malicious activity based on the plurality of features extracted from the data stored in memory using a machine learning (ML) detection system (implemented on DPU 102 or accelerated AI/ML pipeline 153) (block 204). Processing logic outputs an indication of malicious activity in response to determining that the one or more computer programs are affected by the malicious activity (block 206).
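Expressed as a sketch, blocks 202-206 form a simple loop; all four callables here are hypothetical stand-ins for the components described above, not names from the disclosure:

    def malicious_activity_detection(snapshot_source, feature_extractor,
                                     ml_detection_system, alert_sink):
        """Run the three blocks of method 200 over a stream of observations:
        extract features (block 202), classify them with the ML detection
        system (block 204), and output an indication when malicious activity
        is determined (block 206)."""
        for snapshot in snapshot_source:
            features = feature_extractor(snapshot)        # block 202
            if ml_detection_system(features):             # block 204
                alert_sink({"verdict": "malicious",       # block 206
                            "features": features})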
In at least one embodiment, the one or more computer programs may be a host operating system, an application program, a guest operating system, a guest application program, or the like. The malicious activity detected at block 204 may be caused by luxo software, malicious URLs, DGA malware, or other malware described herein.
In at least one embodiment directed to ransomware, processing logic obtains a series of snapshots of data stored in memory, each snapshot representing the data at a point in time. Processing logic extracts a set of features from the different memory plug-ins for each snapshot of the series of snapshots. As described herein, processing logic determines whether the malicious activity was caused by ransomware using a random forest classification model of the ML detection system. The random forest classification model may be a time-series-based model trained to classify the process as ransomware or non-ransomware using a cascade of different numbers of snapshots in the series, e.g., 3, 5, and 10 snapshots or other combinations of different numbers of snapshots.
In at least one embodiment directed to malicious URLs, processing logic obtains a snapshot of data stored in memory, the snapshot representing the data at a point in time. Processing logic extracts a set of features from the snapshot, the set of features including words in a candidate URL and numerical features of the URL structure of the candidate URL. Processing logic may tokenize the words into tokens. As described herein, processing logic determines whether the malicious activity is caused by a malicious URL using a binary classification model of the ML detection system that is trained to classify candidate URLs as malicious or benign using the set of features. The binary classification model may include an embedding layer, an LSTM layer, and a fully connected neural network layer. The embedding layer may receive an input sequence of tokens representing the words in the candidate URL and generate an input vector from the input sequence of tokens. The LSTM layer is trained to generate an output vector based on the input vector. The fully connected neural network layer is trained to classify the candidate URL as malicious or benign using the output vector from the LSTM layer and the numerical features of the URL structure.
In at least one embodiment directed to DGA malware, processing logic obtains a snapshot of data stored in memory, the snapshot representing the data at a point in time. Processing logic extracts a set of features from the snapshot, the set of features including domain characters of one or more candidate URLs. Processing logic may tokenize the domain characters into tokens. The ML detection system includes a two-stage classification model including a binary classification model and a multi-class classification model. The binary classification model is trained to classify the one or more candidate URLs as having a DGA domain or a non-DGA domain using the set of features in a first stage. The multi-class classification model is trained to classify the DGA family of a DGA domain among a set of DGA families using the set of features in a second stage. The binary classification model of the first stage may include a CNN having an embedding layer to receive an input sequence of tokens representing the domain characters in the one or more candidate URLs and generate an input vector based on the input sequence of tokens. The CNN is trained to classify the one or more candidate URLs as having a DGA domain or a non-DGA domain using the input vector from the embedding layer in the first stage. The multi-class classification model includes a Siamese network of CNNs with embedding layers trained to classify the DGA family using input vectors from the embedding layers in the second stage.
Ransomware detection
As described above, one type of malicious activity is caused by ransomware. In an example computing system, there may be active ransomware, trojans, or Remote Access Trojans (RATs) running among hundreds of legitimate computer programs (also referred to as non-ransomware), such as antivirus software, compression software, cleaning software, drivers, scanners, editors, or the like. These computer programs create and run up to thousands of processes. Observing a snapshot of these processes, for example, the system may have 1000 processes that are not considered ransomware and one or more processes that are considered ransomware, which are encrypting the system's data. An observation may be a snapshot per process identifier (PID) (PID + snapshot). In at least one embodiment, the hardware-accelerated security service 122 is a proactive system for detecting ransomware activity in an operating system by constantly monitoring the physical memory of the host and virtual machines based on multiple plug-ins (also referred to as volatility plug-ins, memory plug-ins, or process plug-ins). The plug-ins may extract information such as process lists, network connections, kernel modules, or the like. This information may include indications that may be used by the ML detection system 134 (or 154) for feature extraction. The plug-ins may be used to obtain a data dump of the host physical memory 148. The plug-ins allow for real-time memory analysis (or real-time data analysis) of the host physical memory 148. The plug-ins may obtain selected data required for a particular purpose, such as building a process list. The plug-ins allow the DMA controller on DPU 102 (or DPU 152) to access host physical memory 148. The data 147 extracted by the feature extraction logic 144 may be stored in the memory 118 for analysis by the DPU 102 (or the DPU 152 or the accelerated AI/ML pipeline 153) without the malware being aware of or able to modify the data. In at least one embodiment, DPU 102 may process extracted data 147 and extract features or indications from extracted data 147 before sending them to the ML detection system 134 (or 154). DPU 102 (or DPU 152) may collect real-time data using out-of-band memory retrieval via the hardware-accelerated security service 122. DPU 102 may integrate the ransomware detection system 136 with the real-time data collected by the hardware-accelerated security service 122 to detect ransomware in host physical memory 148.
In at least one embodiment, the data extraction logic 146 may take snapshots of the host physical memory 148 and record data from multiple plug-ins serially for each snapshot. In each streamed snapshot, the ML detection system 134 (or 154) receives data from multiple memory plug-ins. These plug-ins may include LdrModules, VadInfo, Handles, ThreadList, Envars, or the like. The ThreadList plug-in may provide information about the list of threads and their states, such as running, pending, or stopped (e.g., the wait reason). The LdrModules plug-in may provide information about hidden or injected activity types in a process. The Handles plug-in may provide information about handles, handle tables, pointers, and the files, keys, threads, or processes in a process. The VadInfo plug-in may provide information about Virtual Address Descriptors (VADs). The Envars plug-in may provide information about environment variables.
In at least one embodiment, each snapshot requires multiple seconds (e.g., 4 seconds) to capture. Feature extraction logic 144 may extract features (e.g., the 100 dominant features) from each snapshot. For reference, 99% of ransomware takes 8 seconds or more to encrypt a machine, 97% takes 12 seconds or more, 87% takes 20 seconds or more, and 55% takes 40 seconds or more. In at least one embodiment, the data extraction logic 146 may therefore take two or more snapshots for 99% of ransomware, three or more snapshots for 97%, five or more snapshots for 87%, and ten or more snapshots for 55%.
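The relationship between snapshot cadence and encryption time can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only; the 4-second interval and the percentile figures are taken from the description above, and the helper name is hypothetical.

```python
# Minimal sketch: how many snapshots fit inside a ransomware encryption window,
# assuming a fixed snapshot interval (4 s here, per the description above).
SNAPSHOT_INTERVAL_S = 4

# (share of ransomware, minimum seconds needed to encrypt the machine)
ENCRYPTION_PERCENTILES = [(0.99, 8), (0.97, 12), (0.87, 20), (0.55, 40)]

def snapshots_available(encryption_time_s: int, interval_s: int = SNAPSHOT_INTERVAL_S) -> int:
    """Number of full snapshots that can be captured before encryption completes."""
    return encryption_time_s // interval_s

for share, seconds in ENCRYPTION_PERCENTILES:
    print(f"{share:.0%} of ransomware: >= {seconds}s to encrypt "
          f"-> {snapshots_available(seconds)} snapshots available")
```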
In at least one embodiment, the feature extraction logic 144 may extract various features from the LdrModules plug-in, including, for example: ldrmodules_df_size_int.
In at least one embodiment, feature extraction logic 144 may extract various features from the Envars plug-in (most system processes have fewer than 35 environment variables, while malware processes have more), including, for example: envars_pathext (.COM, .EXE, .BAT, .CMD, .VBS, .VBE, .JS, .JSE, .WSF, .WSH, .MSC, .CPL) and envars_df_count.
In at least one embodiment, feature extraction logic 144 may extract various features from the VadInfo plug-in including, for example: get_commit_charge_max_vad, page_noaccess_vad_ratio, count_entry_commit_charge_vad, get_commit_charge_min_vad_page_noaccess, page_noaccess_count, page_readonly_vads_count, ratio_private_memory, page_noaccess_vad_count, get_commit_charge_mean_vad, vad_ratio, get_commit_charge_max_page_noaccess, get_commit_charge_mean_page_execute_readwrite, get_commit_charge_mean_vad_page_noaccess, get_commit_charge_max_page_execute_readwrite, get_commit_charge_min_vad, page_readonly_vad_ratio, page_readwrite_ratio, page_noaccess_ratio, or the like.
In at least one embodiment, feature extraction logic 144 may extract various features from the ThreadList plug-in, including, for example: threadlist_df_wait_reason_9 (e.g., this feature is greater than zero for 25% of ransomware), threadlist_df_wait_reason_31 (e.g., this feature is greater than zero for 25% of ransomware), threadlist_df_state_2, threadlist_df_state_unique, threadlist_df_wait_reason_13.
In at least one embodiment, feature extraction logic 144 may extract various features from the Handles plug-in, including, for example: double_extension_len_handles, count_double_extension_handles, check_doc_file_handles_count, handles_df_section_ratio, handles_df_waitcompletionpacket_count, handles_df_directory_count, handles_df_section_count, handles_df_tpworkerfactory_count, handles_df_directory_ratio, handles_df_semaphore_ratio, handles_df_mutant_ratio, handles_df_event_ratio, handles_df_tpworkerfactory_ratio, handles_df_file_count, handles_df_iocompletion_ratio, handles_df_thread_ratio, handles_df_key_ratio, handles_df_file_ratio, file_users_exists, handles_df_semaphore_count, handles_df_iocompletionreserve_count, handles_df_mutant_count, handles_df_event_count, handles_df_key_count, file_windows_count, handles_df_name_unique.
In at least one embodiment, feature extraction logic 144 may extract a count of double file extensions (count_double_extension) that identifies how many files are duplicates of existing files but carry an additional extension (e.g., copy.docx and copy.docx.donut).
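As an illustration, a double-extension count of this kind could be computed from the file paths seen among a process's handles. This is a hedged sketch, not the patent's implementation; the function name and the ransomware-style ".donut" suffix are assumptions carried over from the example above.

```python
from collections import Counter
from pathlib import PurePosixPath

def count_double_extensions(paths: list[str]) -> int:
    """Count files that duplicate another file's name but carry an extra extension,
    e.g. 'copy.docx' alongside 'copy.docx.donut'."""
    names = Counter(PurePosixPath(p).name for p in paths)
    count = 0
    for name in names:
        stem, dot, _suffix = name.rpartition(".")
        # If stripping the last extension yields another file we have seen,
        # this file is a double-extension duplicate (a common ransomware artifact).
        if dot and stem in names:
            count += 1
    return count

# Example: one encrypted duplicate among the handles of a process.
print(count_double_extensions(["docs/copy.docx", "docs/copy.docx.donut", "docs/a.txt"]))  # 1
```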
In at least one embodiment, once feature extraction logic 144 extracts feature sets from two or more snapshots (N different snapshots), the different numbers of snapshots are fed into the ransomware detection system 136. In at least one embodiment, the ransomware detection system 136 includes a random forest classification model that is trained to classify processes as ransomware or non-ransomware, as shown in FIG. 3A.
FIG. 3A is a schematic diagram of an example random forest classification model 300 in accordance with at least one embodiment. The random forest classification model 300 obtains streamed snapshots of features from the plug-ins in real time. In at least one embodiment, random forest classification model 300 uses feature extraction logic 144 to extract a set of features from each snapshot. In another embodiment, random forest classification model 300 receives a set of features from feature extraction logic 144. The random forest classification model 300 is a time-series-based model that uses a cascade of different numbers of snapshots to classify a process as ransomware or non-ransomware. In the illustrated embodiment, three, five, and ten snapshots are used to classify a process.
In at least one embodiment, the first random forest classification model 302 receives a first feature set in a first snapshot 304, a second feature set in a second snapshot 306, and a third feature set in a third snapshot 308. The first random forest classification model 302 uses the feature sets from these snapshots 304-308 to classify the process as ransomware 301 or non-ransomware 303. In at least one embodiment, the first random forest classification model 302 may output an indication of ransomware 305 in response to the process being classified as ransomware 301. The indication of ransomware 305 may specify a confidence level that the process corresponds to the ransomware class. The confidence level may be a percentage prediction that the process is ransomware. For example, if the confidence level meets a confidence criterion (e.g., a confidence threshold), the first random forest classification model 302 may classify the process as ransomware 301. Alternatively, the first random forest classification model 302 may output an indication of non-ransomware in response to the process being classified as non-ransomware 303. The indication of non-ransomware may indicate a confidence level that the process corresponds to the non-ransomware class. In this embodiment, the first random forest classification model 302 is used as the first of a plurality of stages in the time-series-based model (the random forest classification model 300).
In at least one embodiment, the second random forest classification model 310 receives the feature sets from the three snapshots 304-308 used by the first random forest classification model 302, as well as a fourth feature set in a fourth snapshot 312 and a fifth feature set in a fifth snapshot 314. The second random forest classification model 310 uses the feature sets from the five snapshots 304-308 and 312-314 to classify the process as ransomware 307 or non-ransomware 309. In at least one embodiment, the second random forest classification model 310 may output an indication of ransomware 305 in response to the process being classified as ransomware 307. The indication of ransomware 305 may specify a confidence level that the process corresponds to the ransomware class. Alternatively, the second random forest classification model 310 may output an indication of non-ransomware in response to the process being classified as non-ransomware 309, along with a confidence level that the process corresponds to the non-ransomware class. In this embodiment, the second random forest classification model 310 is used as the second stage in the time-series-based model (the random forest classification model 300).
In at least one embodiment, the third random forest classification model 316 receives the feature sets from the five snapshots 304-308 and 312-314 used by the second random forest classification model 310, as well as feature sets from four additional snapshots, including a tenth snapshot 318. The third random forest classification model 316 uses the feature sets from the ten snapshots 304-308, 312-314, and 318 to classify the process as ransomware 311 or non-ransomware 313. In at least one embodiment, the third random forest classification model 316 may output an indication of ransomware 305 in response to the process being classified as ransomware 311. The indication of ransomware 305 may specify a confidence level that the process corresponds to the ransomware class. Alternatively, the third random forest classification model 316 may output an indication of non-ransomware in response to the process being classified as non-ransomware 313, along with a confidence level that the process corresponds to the non-ransomware class. In this embodiment, the third random forest classification model 316 is used as the third stage of the time-series-based model (the random forest classification model 300).
In at least one embodiment, the ransomware detection system 136 may output an indication of ransomware 305 in response to the process being classified as ransomware. The indication of ransomware 305 may specify a confidence level that the process corresponds to the ransomware class. Alternatively, the ransomware detection system 136 may output an indication of non-ransomware in response to the process being classified as non-ransomware, along with a confidence level that the process corresponds to the non-ransomware class.
In at least one embodiment, numbers of snapshots in the series other than 3, 5, and 10 may be used. In at least one embodiment, a first number of snapshots may be taken within a first amount of time, and a second number of snapshots may be taken within a second amount of time greater than the first amount of time, the second number of snapshots including the first number of snapshots; that is, the different numbers of snapshots cascade, with each subsequent set including the previous snapshots. Similarly, a third number of snapshots may be obtained over a third amount of time greater than the second amount of time, the third number of snapshots including the second number of snapshots. A sketch of such a cascade appears below.
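The following is a minimal sketch of the three-stage cascade described above, using scikit-learn random forests. The feature dimensionality, the stacking of per-snapshot feature vectors, and all names are illustrative assumptions, not the patent's code.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

N_FEATURES = 100  # e.g., the ~100 dominant plug-in features per snapshot

class SnapshotCascade:
    """Three random forests over cascaded windows of 3, 5, and 10 snapshots."""
    def __init__(self, stages=(3, 5, 10)):
        self.stages = stages
        self.models = [RandomForestClassifier(n_estimators=100) for _ in stages]

    def _flatten(self, snapshots, n):
        # Concatenate the first n per-snapshot feature vectors into one row.
        return np.concatenate(snapshots[:n]).reshape(1, -1)

    def fit(self, processes, labels):
        # processes: list of per-process snapshot feature lists; labels: 1 = ransomware.
        for n, model in zip(self.stages, self.models):
            X = np.vstack([self._flatten(s, n) for s in processes])
            model.fit(X, labels)

    def predict_proba(self, snapshots):
        # Use the deepest stage for which enough snapshots have streamed in
        # (assumes at least stages[0] snapshots have arrived).
        usable = [i for i, n in enumerate(self.stages) if len(snapshots) >= n]
        i = usable[-1]
        X = self._flatten(snapshots, self.stages[i])
        return self.models[i].predict_proba(X)[0, 1]  # P(ransomware)
```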
As described above, the ML detection model (e.g., random forest classification model 300) may be deployed in the ransomware detection system 136 resident in the DPU 102, as described in more detail with respect to FIG. 3B, or in the ransomware detection system 136 resident in the accelerated AI/ML pipeline 153, as described in more detail with respect to FIG. 3C.
Fig. 3B is a block diagram of an example system architecture 320 for the ransomware detection system 136, according to at least one embodiment. In system architecture 320, DPU 102 hosts the hardware-accelerated security service 122 and the ransomware detection system 136. The hardware-accelerated security service 122 takes snapshots of the memory plug-ins 321, as described above with respect to FIG. 3A, and sends or otherwise provides the snapshots of the memory plug-ins 321 to the ransomware detection system 136. The ransomware detection system 136 classifies one or more processes as ransomware or non-ransomware using the random forest classification model 300, and outputs an indication of ransomware 305 (or an indication of non-ransomware) to the SIEM or XDR system 106 for further action by the SIEM or XDR system 106. The SIEM or XDR system 106 may monitor and display the ransomware classification results, for example, on a dashboard displayed to a user or operator of the SIEM or XDR system 106.
Fig. 3C is a block diagram of an example system architecture 340 for a ransomware detection system, according to at least one embodiment. In the system architecture 340, the DPU 152 hosts the hardware-accelerated security service 122, and the accelerated AI/ML pipeline 153 hosts the ransomware detection system 136. The hardware-accelerated security service 122 extracts snapshots of the memory plug-ins 341, as described above with respect to FIG. 3A, and sends or otherwise provides the snapshots of the memory plug-ins 341 to a publisher-subscriber service 342 (e.g., Kafka). The publisher-subscriber service 342 sends or otherwise provides the snapshots of the memory plug-ins 341 to the ransomware detection system 136. The ransomware detection system 136 classifies one or more processes as ransomware or non-ransomware using the random forest classification model 300, and outputs an indication of ransomware 305 (or an indication of non-ransomware) to the SIEM or XDR system 106 for further action by the SIEM or XDR system 106.
FIG. 4 is a flowchart of an example method 400 of ransomware detection using a random forest classification model, in accordance with at least one embodiment. In at least one embodiment, method 400 may be performed by processing logic of DPU 102. In at least one embodiment, the method 400 may be performed by processing logic of the DPU 152 and processing logic of the accelerated AI/ML pipeline 153. In at least one embodiment, the method 400 may be performed by the processing logic of the ransomware detection system 136 of FIGS. 1A-1B and 3A-3B. The processing logic may be hardware, firmware, software, or any combination thereof. The method 400 may be performed by one or more data processing units (e.g., a DPU, a CPU, and/or a GPU), which may include (or be in communication with) one or more memory devices. In at least one embodiment, the method 400 may be performed by a plurality of processing threads, each thread performing one or more separate functions, routines, subroutines, or operations of the method. In at least one embodiment, the processing threads implementing method 400 may be synchronized (e.g., using signals, critical sections, and/or other thread synchronization logic). Alternatively, the processing threads implementing method 400 may execute asynchronously with respect to each other. The various operations of method 400 may be performed in a different order than that shown in FIG. 4. Some operations of the method may be performed concurrently with other operations. In at least one embodiment, one or more of the operations shown in FIG. 4 may not always be performed.
Referring to FIG. 4, processing logic obtains a series of snapshots of data stored in physical memory of a host device, the data being associated with one or more computer programs executed by the host device (block 402). Processing logic extracts a set of features from each snapshot of the series of snapshots using the ML detection system, each snapshot representing the data at a point in time (block 404). Processing logic classifies a process of the one or more computer programs as ransomware or non-ransomware using the ML detection system and the set of features (block 406). Processing logic outputs an indication of ransomware in response to the process being classified as ransomware (block 408).
In another embodiment, the ML detection system includes a random forest classification model (e.g., 300). The random forest classification model is a time-series-based model that is trained to classify a process as ransomware or non-ransomware using a cascade of different numbers of snapshots in a series of snapshots. In at least one embodiment, the cascade of different numbers of snapshots includes a first number of snapshots taken within a first amount of time and a second number of snapshots taken within a second amount of time greater than the first amount of time, the second number of snapshots including the first number of snapshots. In another embodiment, the cascade of different numbers of snapshots includes a third number of snapshots taken over a third amount of time greater than the second amount of time, the third number of snapshots including the second number of snapshots.
In another embodiment, the ML detection system comprises a time-based classification model trained to classify processes as ransomware or non-ransomware over different amounts of time using different feature sets. In at least one embodiment, the different feature sets include a first set of snapshots representing data stored in physical memory over a first period of time, and a second set of snapshots representing data stored in physical memory over a second period of time that is greater than the first period of time. In another embodiment, the different feature sets further include a third set of snapshots representing data stored in physical memory over a third period of time that is greater than the second period of time. In other embodiments, processing logic may perform other operations described above with respect to the ransomware detection system 136.
Other malware detection
As described above, one type of malicious activity is caused by malware. As described above, the data extraction logic 146 may take snapshots of the host physical memory 148 and record data from multiple plug-ins serially for each snapshot. In each streamed snapshot, the ML detection system 134 (or 154) receives data from multiple memory plug-ins. In at least one embodiment, feature extraction logic 144 may extract a set of features from one or more snapshots. In at least one embodiment, once feature extraction logic 144 extracts a set of features, the features are fed into the other malware detection systems 142. In at least one embodiment, the other malware detection systems 142 include one or more trained ML models to classify processes of one or more computer programs as malware or non-malware using the set of features.
Malicious URL detection
As described above, one type of malicious activity is caused by malicious URLs. As described above, the data extraction logic 146 may take snapshots of the host physical memory 148 and record data from multiple plug-ins serially for each snapshot. In each streamed snapshot, the ML detection system 134 (or 154) receives data from multiple memory plug-ins. In at least one embodiment, feature extraction logic 144 may extract one or more candidate URLs from one or more snapshots. In at least one embodiment, once the feature extraction logic 144 extracts a candidate URL, the candidate URL is fed into the malicious URL detection system 138. In at least one embodiment, malicious URL detection system 138 includes a binary classification model trained to classify candidate URLs as malicious or benign, as shown in FIG. 5A.
FIG. 5A is a block diagram of an example malicious URL detection system 138 in accordance with at least one embodiment. The malicious URL detection system 138 includes feature extraction logic 144 and a binary classification model 500 that is trained to classify a candidate URL as malicious or benign using a set of features. Feature extraction logic 144 receives extracted data 147 and extracts one or more candidate URLs from extracted data 147. For the binary classification model 500, the feature extraction logic 144 extracts word features and numerical features of the candidate URL. In at least one embodiment, feature extraction logic 144 may tokenize words in the candidate URL into word tokens 507 and determine numerical features 509 of the URL structure, as shown in FIG. 6. Feature extraction logic 144 may provide the word tokens 507 and numerical features 509 to the binary classification model 500, which is trained to classify the candidate URL as malicious 501 or benign 503 using the word tokens 507 and numerical features 509. In at least one embodiment, feature extraction logic 144 may clean up the text of the candidate URL, for example, by deleting slashes, punctuation marks, words of fewer than three characters, words of greater than a specified number of characters (e.g., 15 characters), or the like. Feature extraction logic 144 may determine the presence of a port, a count of the number of domains, a count of the number of TLDs, the length of each word, or the like. In at least one embodiment, feature extraction logic 144 may extract a set of numerical features 509 (e.g., 20 numerical features) describing the URL structure in addition to the word tokens 507.
In at least one embodiment, feature extraction logic 144 may extract candidate URLs with a text regex (regular expression) function. For example, feature extraction logic 144 may extract candidate URLs from the heap in the case of dynamic allocation (e.g., malloc()) or from the stack in the case of static allocation (e.g., char arr[] = "STRING"). In another embodiment, feature extraction logic 144 may extract VAD tree information from the VadInfo plug-in.
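The regex-based URL carving mentioned above might look like the following sketch over a raw memory buffer; the pattern and buffer handling are illustrative assumptions, not the actual extraction logic.

```python
import re

# Loose URL pattern for carving candidate URLs out of raw process memory.
URL_RE = re.compile(rb"https?://[\x21-\x7e]{4,2048}")

def extract_candidate_urls(memory_dump: bytes) -> list[str]:
    """Scan a memory snapshot (heap/stack regions) for candidate URLs."""
    return [m.group().decode("ascii", errors="ignore")
            for m in URL_RE.finditer(memory_dump)]

# Example: a C string such as char arr[] = "http://example.test/login" that was
# statically allocated on the stack would be carved out of the snapshot bytes.
print(extract_candidate_urls(b"\x00\x01http://example.test/login\x00junk"))
```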
In at least one embodiment, binary classification model 500 includes an embedding layer 502, an LSTM layer 504, and a fully connected neural network layer 506. The embedding layer 502 may receive the word tokens 507 as an input sequence of tokens representing words in the candidate URL. The embedding layer 502 may generate an input vector 511 based on the input sequence of tokens. The input vector 511 may include one embedding for each word and represent the word tokens 507 in a different representation than the input sequence. The input vector 511 may represent the words in the candidate URL in the vector space used by the LSTM layer 504. LSTM layer 504 may receive the input vector 511 and generate an output vector 513 based on the input vector 511. The fully connected neural network layer 506 may receive the output vector 513 from the LSTM layer 504 and the numerical features 509. The fully connected neural network layer 506 is trained to classify the candidate URL as malicious 501 or benign 503 using the output vector 513 from the LSTM layer 504 and the numerical features 509 of the URL structure. In at least one embodiment, the fully connected neural network layer 506 may determine a confidence level that the candidate URL corresponds to the malicious class. The confidence level may be a percentage prediction that the URL is malicious. For example, if the confidence level meets a confidence criterion (e.g., a confidence threshold), the fully connected neural network layer 506 may classify the candidate URL as malicious 501.
In at least one embodiment, the malicious URL detection system 138 may output an indication of a malicious URL 505 in response to the candidate URL being classified as malicious 501. The indication of the malicious URL 505 may specify a confidence level that the candidate URL corresponds to the malicious class. Alternatively, the malicious URL detection system 138 may output an indication of a benign URL in response to the candidate URL being classified as benign 503. The indication of the benign URL may indicate a confidence level that the candidate URL is benign.
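A hedged PyTorch sketch of a binary classifier of this shape follows: embedding, LSTM, then a fully connected head that also consumes the numerical URL-structure features. Vocabulary size, dimensions, and layer names are assumptions for illustration, not the patent's implementation.

```python
import torch
import torch.nn as nn

class UrlBinaryClassifier(nn.Module):
    """Embedding -> LSTM over word tokens, concatenated with numerical URL features."""
    def __init__(self, vocab_size=10_000, embed_dim=64, hidden_dim=128, n_numeric=20):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden_dim + n_numeric, 64), nn.ReLU(),
            nn.Linear(64, 1),  # logit: malicious vs. benign
        )

    def forward(self, tokens, numeric):
        # tokens: (batch, seq_len) word-token ids; numeric: (batch, n_numeric)
        embedded = self.embedding(tokens)           # input vectors
        _, (h_n, _) = self.lstm(embedded)           # output vector = final hidden state
        features = torch.cat([h_n[-1], numeric], dim=1)
        return torch.sigmoid(self.head(features))  # confidence of "malicious"

model = UrlBinaryClassifier()
tokens = torch.randint(1, 10_000, (2, 12))  # two candidate URLs, 12 tokens each
numeric = torch.rand(2, 20)                 # e.g., 20 URL-structure features
print(model(tokens, numeric).shape)         # torch.Size([2, 1])
```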
As described above, the ML detection model (e.g., binary classification model 500) may be deployed in a malicious URL detection system 138 resident in the DPU 102, as described in more detail with respect to FIG. 5B, or in a malicious URL detection system 138 resident in the accelerated AI/ML pipeline 153, as described in more detail with respect to FIG. 5C.
FIG. 5B is a block diagram of an example system architecture 520 for the malicious URL detection system 138 in accordance with at least one embodiment. In system architecture 520, DPU 102 hosts hardware-accelerated security services 122 and malicious URL detection system 138. The hardware-accelerated security service 122 takes a snapshot of the memory plug-in 321 as described above with respect to fig. 3A. The hardware-accelerated security service 122 may extract candidate URLs 521 from any one or more snapshots of the memory plug-in and send or otherwise provide the candidate URLs 521 to the malicious URL detection system 138. In another embodiment, the hardware-accelerated security service 122 extracts a snapshot of the memory plug-in 321, as described above with respect to FIG. 3A, and sends the snapshot of the memory plug-in 321 to the malicious URL detection system 138, the malicious URL detection system 138 extracting the candidate URL 521, as shown in FIG. 5A. The malicious URL detection system 138 may use the binary classification model 500 to classify the candidate URLs 521 as malicious or benign and output an indication of the malicious URLs 505 (or an indication of the benign URLs) to the SIEM or XDR system 106 in order for the SIEM or XDR system 106 to take further action. The SIEM or XDR system 106 can monitor and display the classification results of the malicious URLs, for example, on a dashboard displayed to a user or operator of the SIEM or XDR system 106.
FIG. 5C is a block diagram of an example system architecture 540 of the malicious URL detection system 138 in accordance with at least one embodiment. In the system architecture 540, the DPU 152 hosts the hardware-accelerated security service 122, and the accelerated AI/ML pipeline 153 hosts the malicious URL detection system 138. The hardware-accelerated security service 122 extracts the candidate URL 521, as described above with respect to FIG. 3A, and sends or otherwise provides the candidate URL 521 to a publisher-subscriber service 542 (e.g., Kafka). The publisher-subscriber service 542 sends or otherwise provides the candidate URL 521 to the malicious URL detection system 138. The malicious URL detection system 138 may use the binary classification model 500 to classify the candidate URL 521 as malicious or benign, and output an indication of the malicious URL 505 (or an indication of a benign URL) to the SIEM or XDR system 106 for further action by the SIEM or XDR system 106.
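As a hedged illustration of the publisher-subscriber hand-off (e.g., Kafka) between the security service and the detection system, the sketch below uses the kafka-python client; the topic name, broker address, and message format are assumptions, not details from this disclosure.

```python
import json
from kafka import KafkaProducer, KafkaConsumer

# Producer side (security service on the DPU): publish carved candidate URLs.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("candidate-urls", {"pid": 1234, "url": "http://example.test/login"})
producer.flush()

# Consumer side (malicious URL detection system in the AI/ML pipeline).
consumer = KafkaConsumer(
    "candidate-urls",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
for message in consumer:
    candidate = message.value  # fed to the binary classification model
    print(candidate["url"])
```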
As described above with respect to FIGS. 5A-5C, feature extraction logic 144 may extract candidate URLs and extract features from the candidate URLs, including words in the candidate URLs and numerical features of the candidate URL structure, as shown in candidate URL 600 of FIG. 6.
FIG. 6 illustrates a URL structure of a candidate URL 600 in accordance with at least one embodiment. Candidate URL 600 includes words, numbers, characters, and punctuation marks organized in a URL structure. The URL structure of the candidate URL 600 may include a scheme 602, subdomain 604, domain 606, TLD 608, port 610, path 612, query 614, and fragment 616. Subdomain 604, domain 606, and TLD 608 may make up the host domain 618. Candidate URLs may include other URL structures, such as secondary domains, subdirectories, or the like. When extracting features, the feature extraction logic 144 may analyze each URL structure and extract a word (if any), the length of the word and/or URL structure, the location of the URL structure, a count of URL structures (e.g., when there are two TLDs), an indication of the existence of a URL structure (e.g., an indication of a port in the URL structure), or the like. Feature extraction logic 144 may tokenize the words extracted from candidate URL 600 (e.g., https, www, welcometothejungle, jobs, developer, page, fra) and delete slashes and other punctuation marks between potential words. Feature extraction logic 144 may delete words that exceed a specified length, such as welcometothejungle, or may reduce longer words to smaller words, such as welcome, the, and jungle. Feature extraction logic 144 may output a sequence representing the numerical features of the URL structure for input to binary classification model 500. The tokens of the words may be processed by the embedding layer 502, converting them into the vector space used by the LSTM layer 504. The fully connected neural network layer 506 may use the output of the LSTM layer 504 and the numerical features of the URL structure to classify the candidate URL 600 as malicious 501 or benign 503.
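The following is a hedged sketch of splitting a candidate URL into the structural pieces of FIG. 6 and deriving a few numerical features, using Python's standard urllib; the feature list in the description above is richer, and the helper names here are assumptions.

```python
import re
from urllib.parse import urlparse

def url_structure_features(url: str) -> dict:
    """Split a candidate URL into scheme/host/path/query/fragment and derive
    simple numerical features of its structure."""
    parts = urlparse(url)                      # scheme, netloc, path, query, fragment
    host_labels = parts.hostname.split(".") if parts.hostname else []
    words = [w for w in re.split(r"[\W_]+", url) if 3 <= len(w) <= 15]
    return {
        "has_port": parts.port is not None,
        "num_labels": len(host_labels),        # subdomain + domain + TLD count
        "path_len": len(parts.path),
        "num_words": len(words),
        "mean_word_len": sum(map(len, words)) / len(words) if words else 0.0,
    }

print(url_structure_features(
    "https://www.welcometothejungle.com/fra/jobs?page=2#developer"))
```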
FIG. 7 is a flow diagram of an example method 700 of malicious URL detection using a binary classification model in accordance with at least one embodiment. In at least one embodiment, method 700 may be performed by processing logic of DPU 102. In at least one embodiment, the method 700 may be performed by processing logic of the DPU 152 and processing logic of the accelerated AI/ML pipeline 153. In at least one embodiment, the method 700 may be performed by the processing logic of the malicious URL detection system 138 of FIGS. 1A-1B and 5A-5B. The processing logic may be hardware, firmware, software, or any combination thereof. Method 700 may be performed by one or more data processing units (e.g., a DPU, a CPU, and/or a GPU), which may include (or be in communication with) one or more memory devices. In at least one embodiment, the method 700 may be performed by multiple processing threads, each thread performing one or more separate functions, routines, subroutines, or operations of the method. In at least one embodiment, the processing threads implementing method 700 may be synchronized (e.g., using signals, critical sections, and/or other thread synchronization logic). Alternatively, the processing threads implementing method 700 may execute asynchronously with respect to each other. The various operations of method 700 may be performed in a different order than that shown in FIG. 7. Some operations of the method may be performed concurrently with other operations. In at least one embodiment, one or more of the operations shown in FIG. 7 may not always be performed.
Referring to FIG. 7, processing logic obtains a snapshot of data stored in physical memory of a host device, the data associated with one or more computer programs executed by the host device (block 702). Processing logic extracts a set of features from the snapshot, including candidate URLs, using the ML detection system (block 704). The set of features may include words in the candidate URL and numerical features of the URL structure of the candidate URL. Processing logic classifies the candidate URL as malicious or benign using the ML detection system using the set of features (block 706). Processing logic outputs an indication of the malicious URL in response to the candidate URL being classified as malicious (block 708).
In at least one embodiment, the URL structure includes one or more of a subdomain, a domain, a TLD, a port, a path, a query, and a fragment.
In a further embodiment, the ML detection system includes a binary classification model trained to classify the candidate URL as malicious or benign using tokens representing words in the candidate URL and numerical features of the candidate URL structure. In at least one embodiment, the binary classification model includes an embedding layer and an LSTM layer that process the tokens representing words in the candidate URL, and a fully connected neural network layer trained to classify the candidate URL as malicious or benign using the LSTM output and the numerical features of the URL structure.
In another embodiment, the ML detection system comprises a binary classification model trained to classify candidate URLs as malicious or benign by combining natural language processing (NLP) of the words in a candidate URL with numerical features of the URL structure of the candidate URL.
DGA domain detection
As described above, one type of malicious activity is caused by DGA malware. As described above, the data extraction logic 146 may take snapshots of the host physical memory 148 and record data from multiple plug-ins serially for each snapshot. In each streamed snapshot, the ML detection system 134 (or 154) receives data from multiple memory plug-ins. In at least one embodiment, feature extraction logic 144 may extract one or more candidate URLs from one or more snapshots. In at least one embodiment, once feature extraction logic 144 extracts one or more candidate URLs, the one or more candidate URLs are fed into DGA detection system 140. In at least one embodiment, DGA detection system 140 includes a two-stage classification model that is trained to classify one or more candidate URLs as having a DGA domain or a non-DGA domain in a first stage and to classify the DGA family of a DGA domain among a set of DGA families using a set of features in a second stage, as shown in FIG. 8A. In another embodiment, the two-stage classification model includes a binary classification model trained to classify one or more candidate URLs as generated by DGA malware in a first stage, and a multi-class classification model trained to classify the DGA family of the DGA malware among a set of DGA malware families in a second stage.
FIG. 8A is a block diagram of an example DGA detection system 140 according to at least one embodiment. DGA detection system 140 includes feature extraction logic 144 and a two-stage classification model 800 that is trained to classify one or more candidate URLs as having a DGA domain or a non-DGA domain using a set of features. Feature extraction logic 144 receives extracted data 147 and extracts one or more candidate URLs from extracted data 147. For the two-stage classification model 800, feature extraction logic 144 extracts domain character features of the one or more candidate URLs. In at least one embodiment, feature extraction logic 144 may tokenize the domain characters into character tokens 807. Feature extraction logic 144 may provide the character tokens 807 to the two-stage classification model 800, which is trained to classify the one or more candidate URLs as having a DGA domain 801 or a non-DGA domain using the character tokens 807.
In at least one embodiment, feature extraction logic 144 may extract candidate URLs with a text regex (regular expression) function, as described above. For example, feature extraction logic 144 may extract candidate URLs from the heap in the case of dynamic allocation (e.g., malloc()) or from the stack in the case of static allocation (e.g., char arr[] = "STRING"). In another embodiment, feature extraction logic 144 may extract VAD tree information from the VadInfo plug-in.
In at least one embodiment, the first stage of the two-stage classification model 800 is a binary classification model 802. Binary classification model 802 may include an embedding layer 804 and a CNN 806. The embedding layer 804 may receive the character tokens 807 as an input sequence of tokens representing domain characters in the one or more candidate URLs. The embedding layer 804 may generate an input vector 811 based on the input token sequence. The input vector 811 may include an embedding for each domain character of a set of domain characters and represent the character tokens 807 in a different representation than the input token sequence. The input vector 811 may represent the domain characters of the one or more candidate URLs in the vector space used by the CNN 806. CNN 806 may receive the input vector 811 and use the input vector 811 to classify the one or more candidate URLs as having a DGA domain 801 or a non-DGA domain. In general, CNN 806 identifies whether the domain characters constitute a random character sequence or a word sequence in a language (e.g., the English language). In at least one embodiment, CNN 806 may determine a confidence level that the one or more candidate URLs correspond to the DGA domain class. The confidence level may be a percentage prediction that the domain is a DGA domain. For example, if the confidence level meets a confidence criterion (e.g., a confidence threshold), the CNN 806 may classify the one or more candidate URLs as having a DGA domain 801.
In at least one embodiment, DGA detection system 140 may output an indication of DGA malware 805 in response to the one or more candidate URLs being classified as having a DGA domain 801. The indication of DGA malware 805 may specify a confidence level that the one or more candidate URLs correspond to the DGA domain class. Alternatively, DGA detection system 140 may output an indication of a non-DGA domain in response to the one or more candidate URLs being classified as having a non-DGA domain. The indication of the non-DGA domain may indicate a confidence level that the one or more candidate URLs have a non-DGA domain. A sketch of such a first-stage classifier appears below.
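The following is a hedged PyTorch sketch of a character-level embedding + CNN binary classifier of the kind just described; the character set, dimensions, and names are illustrative assumptions. Training it with class weights (e.g., the 100:1 non-DGA weighting noted later in this description) could reduce false positives.

```python
import torch
import torch.nn as nn

class DgaBinaryCnn(nn.Module):
    """Character embedding + 1-D CNN that scores a domain as DGA vs. non-DGA."""
    def __init__(self, n_chars=40, embed_dim=32, n_filters=128, kernel=4):
        super().__init__()
        self.embedding = nn.Embedding(n_chars, embed_dim, padding_idx=0)
        self.conv = nn.Conv1d(embed_dim, n_filters, kernel_size=kernel)
        self.head = nn.Linear(n_filters, 1)

    def forward(self, char_tokens):
        # char_tokens: (batch, max_domain_len) integer ids of domain characters
        x = self.embedding(char_tokens).transpose(1, 2)  # (batch, embed, len)
        x = torch.relu(self.conv(x)).amax(dim=2)         # global max pooling
        return torch.sigmoid(self.head(x))               # confidence of DGA domain

CHARSET = "abcdefghijklmnopqrstuvwxyz0123456789-."
encode = lambda d: torch.tensor([[CHARSET.index(c) + 1 for c in d]])

model = DgaBinaryCnn()
print(model(encode("xjwqpzkfur.com")))  # e.g., tensor([[0.49]]) before training
```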
In at least one embodiment, the two-stage classification model 800 includes a first-stage binary classification model 802 and a second-stage multi-class classification model 810. Binary classification model 802 may classify one or more candidate URLs as having DGA fields 801. If the binary classification model 802 classifies one or more candidate URLs as having DGA domains 801 in a first stage, the multi-class classification model 810 may classify DGA families 803 of the DGA domains 801 among a set of DGA families. In another embodiment, binary classification model 802 may be trained to classify one or more candidate URLs as being generated by DGA malware in a first stage, and multi-class classification model 810 may be trained to classify DGA families of DGA malware among a set of DGA malware families.
In at least one embodiment, the multi-class classification model 810 may include a Siamese network 812 having an embedding layer 814 and a CNN 816. The embedding layer 814 may receive the character tokens 807 as an input sequence of tokens representing domain characters in the one or more candidate URLs. The embedding layer 814 may generate an input vector 813 based on the input sequence of tokens. The input vector 813 may include an embedding for each of a set of domain characters and represent the character tokens 807 in a different representation than the input token sequence. The input vector 813 may represent the domain characters of the one or more candidate URLs in the vector space used by the CNN 816. CNN 816 may receive the input vector 813 and use the input vector 813 to classify the DGA family 803 of the one or more candidate URLs among a set of DGA families. In at least one embodiment, CNN 816 may determine a confidence level that the one or more candidate URLs belong to DGA family 803. The confidence level may be a percentage prediction for the DGA family. For example, if the confidence level meets a confidence criterion (e.g., a confidence threshold), the CNN 816 may classify the one or more candidate URLs as belonging to DGA family 803.
In at least one embodiment, DGA detection system 140 may output an indication of a DGA malware family 815 in response to the one or more candidate URLs being classified as belonging to DGA family 803. The indication of the DGA malware family 815 may specify a confidence level that the one or more candidate URLs belong to DGA family 803. Alternatively, DGA detection system 140 may output an indication of an "other" DGA family in response to the one or more candidate URLs being classified as not belonging to one of the set of DGA families. The "other" DGA family indication may also indicate a confidence level. In another embodiment, DGA detection system 140 may output an indication of a DGA, the probability of being a DGA, the most likely DGA family, or an "other" DGA family.
In at least one embodiment, the binary classification model 802 and the multi-class classification model 810 may run concurrently. In another embodiment, logic may be used to trigger the multi-class classification model 810 in response to one or more candidate URLs being classified as having DGA fields 801.
In at least one embodiment, CNN 816 is trained on a set of DGA families, for example: Banjori, Corebot, Cryptolocker, Dircrypt, Emotet, Flubot, Gameover, Murofet, Necurs, Newgoz, Padcrypt, Pykspa, Qadars, Ramdo, Ramnit, Ranbyus, Rovnix, Simda, and Tinba; all other DGA families may be classified as "other" DGA families.
In at least one embodiment, the CNNs 806, 816 with embedding layers 804, 814 may use tokens of domain characters as features. To reduce false positives, the non-DGA domain and DGA domain classes may be weighted (e.g., non-DGA domain: 100 and DGA domain: 1). In at least one embodiment, the Siamese network 812 with CNN 816 and embedding layer 814 may use the same tokens of domain characters as features, as in the sketch below.
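The following is a hedged sketch of the second-stage Siamese arrangement: two weight-sharing copies of the character-CNN encoder embed a pair of domains, and the distance between the embeddings drives family clustering. The dimensions and the contrastive loss are illustrative assumptions, not the patent's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CharCnnEncoder(nn.Module):
    """Shared character-embedding + CNN trunk used by both Siamese branches."""
    def __init__(self, n_chars=40, embed_dim=32, n_filters=128, out_dim=64):
        super().__init__()
        self.embedding = nn.Embedding(n_chars, embed_dim, padding_idx=0)
        self.conv = nn.Conv1d(embed_dim, n_filters, kernel_size=4)
        self.proj = nn.Linear(n_filters, out_dim)

    def forward(self, char_tokens):
        x = self.embedding(char_tokens).transpose(1, 2)
        x = torch.relu(self.conv(x)).amax(dim=2)
        return F.normalize(self.proj(x), dim=1)  # unit-norm domain embedding

def contrastive_loss(z1, z2, same_family, margin=1.0):
    """Pull same-family domain embeddings together, push others apart."""
    dist = (z1 - z2).norm(dim=1)
    return torch.where(same_family.bool(),
                       dist.pow(2),
                       (margin - dist).clamp(min=0).pow(2)).mean()

# At inference, a domain could be assigned the family whose centroid embedding
# is nearest; domains far from every known centroid fall into the "other" class.
```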
In another embodiment, the two-stage classification model 800 may use other NLP models to process the domain characters of one or more candidate URLs to classify them as having a DGA domain 801. In another embodiment, the NLP model or binary classification model 802 may be used without the Siamese network 812, classifying candidate URLs only as having a DGA domain 801 or a non-DGA domain, and not by DGA family. In another embodiment, feature extraction logic 144 may extract domain characters, numerical features of the URL, words of the candidate URL, or the like for a more complex classification model.
FIG. 8B is a block diagram of an example system architecture 820 of the DGA detection system 140 according to at least one embodiment. In system architecture 820, DPU 102 hosts hardware-accelerated security services 122 and DGA detection system 140. The hardware-accelerated security service 122 takes a snapshot of the memory plug-in 321 as described above with respect to fig. 3A. The hardware-accelerated security service 122 may extract one or more candidate URLs 821 from any one or more snapshots of the memory plug-in and send or otherwise provide the one or more candidate URLs 821 to the DGA detection system 140. In another embodiment, the hardware-accelerated security service 122 extracts a snapshot of the memory plug-in 321 as described above with respect to FIG. 3A and sends the snapshot of the memory plug-in 321 to the DGA detection system 140, and the DGA detection system 140 extracts one or more candidate URLs 821 as shown in FIG. 8A. DGA detection system 140 uses a two-stage classification model 800 to classify one or more candidate URLs 821 as having DGA domains or non-DGA domains and to classify DGA families among multiple DGA families. DGA detection system 140 may output indications 805 of DGA malware (or indications of non-malware) and/or indications 815 of DGA families (or indications of other DGA families) to SIEM or XDR system 106 in order for SIEM or XDR system 106 to take further action. The SIEM or XDR system 106 can monitor and display the classification results of the DGA domain, for example, on a dashboard displayed to a user or operator of the SIEM or XDR system 106.
FIG. 8C is a block diagram of an example system architecture 840 of DGA detection system 140 according to at least one embodiment. In the system architecture 840, the DPU 152 hosts the hardware-accelerated security service 122, while the accelerated AI/ML pipeline 153 hosts the DGA detection system 140. The hardware-accelerated security service 122 extracts one or more candidate URLs 821, as described above with respect to FIG. 3A, and sends or otherwise provides the one or more candidate URLs 821 to a publisher-subscriber service 842 (e.g., Kafka). The publisher-subscriber service 842 sends or otherwise provides the one or more candidate URLs 821 to the DGA detection system 140. DGA detection system 140 uses the two-stage classification model 800 to classify the one or more candidate URLs 821 as having DGA domains or non-DGA domains and to classify the DGA family among multiple DGA families. DGA detection system 140 may output an indication of DGA malware 805 (or an indication of non-malware) and/or an indication of a DGA malware family 815 (or an indication of an "other" DGA family) to the SIEM or XDR system 106 for further action by the SIEM or XDR system 106.
In at least one embodiment, binary classification model 802 may be evaluated for performance. For one performance evaluation, the training dataset included DGA domains (e.g., 361,108) and non-DGA domains (e.g., 715,761), and the test dataset included DGA domains (e.g., 444,033) and non-DGA domains (e.g., 178,941). In at least one embodiment, the test set does not reflect the real-world class distribution, so the precision calculation was adjusted to assume that non-DGA domains are 100 times more common than DGA domains. A precision-recall curve is shown in FIG. 9A.
FIG. 9A is a diagram 900 illustrating a precision-recall curve 902 of the binary classification model 802 of DGA detection system 140 according to at least one embodiment. Recall and precision can be expressed by the following formulas, where TP is true positives, FN is false negatives, and FP is false positives:

recall = TP / (TP + FN)

precision = TP / (TP + FP)

with the class distribution adjusted so that DGAAmount = 0.01 · NotDGAAmount.
As shown in the precision-recall curve 902 of fig. 9A, the recall value is 0.9 when the precision is equal to 0.9.
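The adjusted precision under the assumed 1:100 class ratio can be computed as in this hedged sketch; it simply rescales the false-positive contribution from the over-represented non-DGA test set. The example counts are assumptions chosen to reproduce the (0.9, 0.9) operating point above.

```python
def adjusted_precision(tp: int, fp: int, n_dga: int, n_not_dga: int,
                       dga_prevalence: float = 0.01) -> float:
    """Precision re-weighted so non-DGA domains count 1/dga_prevalence times
    more than DGA domains (here, 100x), matching the assumed deployment mix."""
    tp_rate = tp / n_dga                 # recall on DGA test domains
    fp_rate = fp / n_not_dga             # false-positive rate on non-DGA domains
    weighted_tp = tp_rate * dga_prevalence
    weighted_fp = fp_rate * 1.0          # non-DGA prevalence normalized to 1
    return weighted_tp / (weighted_tp + weighted_fp)

# Example with assumed counts: 90% recall and a ~0.1% false-positive rate
# yield an adjusted precision of about 0.9.
print(round(adjusted_precision(tp=399_630, fp=179, n_dga=444_033,
                               n_not_dga=178_941), 3))
```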
In at least one embodiment, the multi-class classification model 810 may be evaluated for performance. For one performance evaluation, the training dataset included non-DGA domains (e.g., 715,761) and DGA domains (e.g., 327,673) from the following families: Banjori, Corebot, Cryptolocker, Dircrypt, Emotet, Flubot, Gameover, Murofet, Necurs, Newgoz, Padcrypt, Pykspa, Qadars, Ramdo, Ramnit, Ranbyus, Rovnix, Simda, and Tinba. The test dataset included non-DGA domains (e.g., 178,941) and DGA domains (e.g., 1,310,693) from the same families. In at least one embodiment, the embeddings learned by the multi-class classification model 810 may be visualized using UMAP. The raw training data before UMAP dimension reduction is shown in FIG. 9B. To display the clustering capability of the multi-class classification model 810, UMAP may be used for dimension reduction to two axes, as shown in FIG. 9C.
Fig. 9B is a graph 920 illustrating training data 922 prior to a UMAP dimension reduction in accordance with at least one embodiment.
Fig. 9C is a graph 940 illustrating training data 942 after UMAP dimension reduction in accordance with at least one embodiment. As shown in graph 940, the multi-class classification model 810 may successfully cluster the training data into multiple classes, one per DGA family, without many false positives. The multi-class classification model 810 may classify the DGA family of the domain characters of one or more candidate URLs among the multiple DGA families. If a candidate URL does not fall within a specified family, it may be categorized as part of the "other" DGA family classification.
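A visualization of this kind can be produced with the umap-learn package, as in the hedged sketch below; the embedding source and labels are random placeholders standing in for the learned family embeddings.

```python
import numpy as np
import umap  # umap-learn package
import matplotlib.pyplot as plt

# Assume `embeddings` are the encoder outputs for test domains and `families`
# the integer family labels (0 = non-DGA, 1..N = DGA families).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(2000, 64))   # placeholder for learned embeddings
families = rng.integers(0, 20, size=2000)  # placeholder labels

reducer = umap.UMAP(n_components=2)        # reduce 64-D embeddings to 2 axes
coords = reducer.fit_transform(embeddings)

plt.scatter(coords[:, 0], coords[:, 1], c=families, s=2, cmap="tab20")
plt.title("Domain embeddings after UMAP dimension reduction")
plt.show()
```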
FIG. 10 is a flowchart of an example method 1000 of DGA detection using a two-stage classification model, according to at least one embodiment. In at least one embodiment, method 1000 may be performed by processing logic of DPU 102. In at least one embodiment, the method 1000 may be performed by processing logic of the DPU 152 and processing logic of the accelerated AI/ML pipeline 153. In at least one embodiment, the method 1000 may be performed by the processing logic of the DGA detection system 140 of FIGS. 1A-1B and 8A-8B. The processing logic may be hardware, firmware, software, or any combination thereof. Method 1000 may be performed by one or more data processing units (e.g., a DPU, a CPU, and/or a GPU), which may include (or be in communication with) one or more memory devices. In at least one embodiment, the method 1000 may be performed by a plurality of processing threads, each thread performing one or more separate functions, routines, subroutines, or operations of the method. In at least one embodiment, the processing threads implementing method 1000 may be synchronized (e.g., using signals, critical sections, and/or other thread synchronization logic). Alternatively, the processing threads implementing method 1000 may execute asynchronously with respect to each other. The various operations of method 1000 may be performed in a different order than that shown in FIG. 10. Some operations of the method may be performed concurrently with other operations. In at least one embodiment, one or more of the operations shown in FIG. 10 may not always be performed.
Referring to FIG. 10, processing logic obtains a snapshot of data stored in physical memory, the data associated with one or more computer programs executed by a host device (block 1002). Processing logic extracts a set of features from the snapshot using the ML detection system, the set of features including domain characters in one or more candidate URLs (block 1004). Processing logic classifies the one or more candidate URLs as having a DGA domain or a non-DGA domain using the ML detection system using a set of features (block 1006). Processing logic outputs an indication of DGA malware in response to the one or more candidate URLs being classified as having a DGA domain (block 1008). In another embodiment, processing logic uses the ML detection system to classify DGA families of DGA malware among a set of DGA malware families using the set of features (block 1010). Processing logic outputs an indication of the DGA family of DGA malware (block 1012).
In at least one embodiment, the ML detection system includes a two-stage classification model, including a binary classification model and a multi-class classification model. In at least one embodiment, the binary classification model is trained to classify one or more candidate URLs as having a DGA domain or a non-DGA domain in a first stage, and the multi-class classification model is trained to classify the DGA family of a DGA domain among a set of DGA families in a second stage. In at least one embodiment, the binary classification model is trained to tokenize domain characters in the one or more candidate URLs and use the tokens to classify the one or more candidate URLs as having a DGA domain or a non-DGA domain in the first stage. In at least one embodiment, the multi-class classification model is trained to classify the DGA family of the DGA domain among the set of DGA families using the tokens in the second stage.
In at least one embodiment, the binary classification model includes a CNN having an embedding layer to tokenize the domain characters of the one or more candidate URLs into tokens, the CNN using the tokens of the domain characters as a feature set, and the multi-class classification model includes a Siamese network of CNNs having embedding layers, the Siamese network using the tokens of the domain characters as a feature set.
Other variations are within the spirit of the present disclosure. Thus, while the disclosed technology is susceptible to various modifications and alternative constructions, certain illustrative embodiments thereof are shown in the drawings and have been described above in detail. It should be understood, however, that there is no intention to limit the disclosure to the specific forms disclosed, but on the contrary, the intention is to cover all modifications, alternative constructions, and equivalents falling within the spirit and scope of the disclosure, as defined in the appended claims.
The use of the terms "a" and "an" and "the" and similar referents in the context of describing the disclosed embodiments (especially in the context of the following claims) is to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context, and is not to be construed as a definition of the terms. The terms "comprising," "having," "including," and "containing" are to be construed as open-ended terms (meaning "including, but not limited to") unless otherwise noted. The term "connected," when unmodified and referring to physical connections, is to be construed as partly or wholly contained within, attached to, or joined together, even if there is something intervening. Recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. The use of the term "set" (e.g., "a set of items") or "subset," unless otherwise indicated or contradicted by context, is to be construed as a non-empty collection comprising one or more members. Furthermore, unless otherwise indicated or contradicted by context, the term "subset" of a corresponding set does not necessarily denote a proper subset of the corresponding set; the subset and the corresponding set may be equal.
Conjunctive language, such as a phrase of the form "at least one of A, B, and C" or "at least one of A, B and C," unless specifically stated otherwise or clearly contradicted by context, is otherwise understood with the context as generally used to present that an item, term, etc., may be any one of the listed members or any non-empty subset of them. For example, in the illustrative case of a set having three members, the conjunctive phrases "at least one of A, B, and C" and "at least one of A, B and C" refer to any one of the following sets: {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}. Thus, such conjunctive language is not generally intended to imply that certain embodiments require at least one of A, at least one of B, and at least one of C each to be present. In addition, unless otherwise indicated herein or otherwise clearly contradicted by context, the term "plurality" indicates a state of being plural (e.g., "a plurality of items" indicates multiple items). A plurality is at least two items, but can be more when so indicated either explicitly or by context. Furthermore, unless stated otherwise or otherwise clear from context, the phrase "based on" means "based at least in part on" and not "based solely on."
The operations of the processes described herein may be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. In at least one embodiment, a process such as those described herein (or variations and/or combinations thereof) is performed under control of one or more computer systems configured with executable instructions and is implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) executing collectively on one or more processors, by hardware, or combinations thereof. In at least one embodiment, the code is stored on a computer-readable storage medium, for example, in the form of a computer program comprising a plurality of instructions executable by one or more processors. In at least one embodiment, the computer-readable storage medium is a non-transitory computer-readable storage medium that excludes transitory signals (e.g., a propagating transient electric or electromagnetic transmission) but includes non-transitory data storage circuitry (e.g., buffers, caches, and queues) within transceivers of transitory signals. In at least one embodiment, code (e.g., executable code or source code) is stored on a set of one or more non-transitory computer-readable storage media having stored thereon executable instructions (or other memory storing executable instructions) that, when executed (i.e., as a result of being executed) by one or more processors of a computer system, cause the computer system to perform operations described herein. In at least one embodiment, the set of non-transitory computer-readable storage media comprises multiple non-transitory computer-readable storage media, where one or more individual non-transitory storage media of the multiple non-transitory computer-readable storage media lack all of the code while the multiple non-transitory computer-readable storage media collectively store all of the code. In at least one embodiment, the executable instructions are executed such that different instructions are executed by different processors; for example, a non-transitory computer-readable storage medium stores instructions, a main CPU executes some of the instructions, and a GPU executes other instructions. In at least one embodiment, different components of a computer system have separate processors, and different processors execute different subsets of the instructions.
Accordingly, in at least one embodiment, a computer system is configured to implement one or more services that individually or collectively perform the operations of the processes described herein, and such a computer system is configured with applicable hardware and/or software that enable the performance of the operations. Moreover, a computer system that implements at least one embodiment of the present disclosure is a single device and, in another embodiment, is a distributed computer system comprising multiple devices that operate differently, such that the distributed computer system performs the operations described herein and such that a single device does not perform all of the operations.
The use of any and all examples, or exemplary language (e.g., "such as") provided herein, is intended merely to better illuminate embodiments of the disclosure and does not pose a limitation on the scope of the disclosure unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the disclosure.
All references, including publications, patent applications, and patents, cited herein are hereby incorporated by reference to the same extent as if each reference were individually and specifically indicated to be incorporated by reference and were set forth in its entirety herein.
In the description and claims, the terms "coupled" and "connected," along with their derivatives, may be used. It should be understood that these terms may not be intended as synonyms for each other. Rather, in particular examples, "connected" or "coupled" may be used to indicate that two or more elements are in direct or indirect physical or electrical contact with each other. "coupled" may also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other.
Unless specifically stated otherwise, it is appreciated that throughout the description, terms such as "processing," "computing," "calculating," "determining," or the like, refer to the action and/or processes of a computer or computing system, or similar electronic computing device, that manipulates and/or transforms data represented as physical quantities (e.g., electronic quantities) within the computing system's registers and/or memories into other data similarly represented as physical quantities within the computing system's memories, registers or other such information storage, transmission or display devices.
In a similar manner, the term "processor" may refer to any device or portion of a device that processes electronic data from registers and/or memory and transforms that electronic data into other electronic data that may be stored in registers and/or memory. As a non-limiting example, a "processor" may be a CPU or a GPU. A "computing platform" may comprise one or more processors. As used herein, "software" processes may include, for example, software and/or hardware entities that perform work over time, such as tasks, threads, and intelligent agents. Also, each process may refer to multiple processes carrying out instructions in sequence or in parallel, continuously or intermittently. The terms "system" and "method" are used herein interchangeably insofar as a system may embody one or more methods, and the methods may be considered a system.
In this document, reference may be made to obtaining, acquiring, receiving, or inputting analog or digital data into a subsystem, computer system, or computer-implemented machine. Obtaining, acquiring, receiving, or inputting analog and digital data can be accomplished in a variety of ways, such as by receiving the data as a parameter of a function call or a call to an application programming interface. In some implementations, the process of obtaining, acquiring, receiving, or inputting analog or digital data can be accomplished by transferring the data via a serial or parallel interface. In another implementation, the process of obtaining, acquiring, receiving, or inputting analog or digital data can be accomplished by transferring the data via a computer network from the providing entity to the acquiring entity. Reference may also be made to providing, outputting, transmitting, sending, or presenting analog or digital data. In various examples, the process of providing, outputting, transmitting, sending, or presenting analog or digital data can be accomplished by transferring the data as an input or output parameter of a function call, a parameter of an application programming interface, or an interprocess communication mechanism.
While the above discussion sets forth exemplary implementations of the technology, other architectures may be used to implement the described functionality and are intended to be within the scope of the present disclosure. Furthermore, while specific allocations of responsibilities are defined above for purposes of discussion, various functions and responsibilities may be allocated and divided in different ways depending on the circumstances.
Furthermore, although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter claimed in the appended claims is not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed as example forms of implementing the claims.

Claims (26)

1. A method, comprising:
obtaining, by a data processing unit (DPU), a snapshot of data stored in physical memory of a host device, the data being associated with one or more computer programs executed by the host device;
extracting a set of features from the snapshot using a machine learning (ML) detection system, wherein the set of features includes words in a candidate uniform resource locator (URL) and numerical features of a URL structure of the candidate URL;
classifying, by the ML detection system, the candidate URL as malicious or benign using the set of features; and
outputting an indication of a malicious URL in response to the candidate URL being classified as malicious.
2. The method of claim 1, wherein the ML detection system includes a binary classification model trained to classify the candidate URL as malicious or benign using tokens representing the words in the candidate URL and the numerical features of the URL structure of the candidate URL.
3. The method of claim 2, wherein the URL structure comprises one or more of a subdomain, a domain, a top-level domain (TLD), a port, a path, a query, and a fragment.
4. The method of claim 2, wherein the binary classification model comprises:
a long short-term memory (LSTM) layer trained to tokenize the words in the candidate URL into tokens; and
a fully connected neural network layer trained to classify the candidate URL as malicious or benign using the tokens and the numerical features of the URL structure.
5. The method of claim 1, wherein the ML detection system includes a binary classification model trained to classify the candidate URL as malicious or benign using natural language processing (NLP) in conjunction with features of the URL structure of the candidate URL.
6. The method of claim 1, further comprising:
tokenizing the words in the candidate URL into tokens, wherein the ML detection system comprises a binary classification model trained to classify the candidate URL as malicious or benign using the tokens of the candidate URL and the numerical features of the URL structure of the candidate URL, and wherein the binary classification model comprises:
an embedding layer to receive the tokens as an input sequence of tokens representing the words in the candidate URL and to generate input vectors based on the input sequence of tokens;
a long short-term memory (LSTM) layer trained to generate an output vector based on the input vectors; and
a fully connected neural network layer trained to classify the candidate URL as malicious or benign using the output vector of the LSTM layer and the numerical features of the URL structure.
7. An integrated circuit, comprising:
a host interface operably coupled to a physical memory associated with a host device;
a central processing unit (CPU) operatively coupled to the host interface; and
an acceleration hardware engine operably coupled to the host interface and the CPU, wherein the CPU and the acceleration hardware engine are configured to host a hardware-accelerated security service to protect one or more computer programs executed by the host device, wherein the hardware-accelerated security service is configured to:
obtain a snapshot of data stored in the physical memory, the data associated with the one or more computer programs executed by the host device;
extract a set of features from the snapshot using a machine learning (ML) detection system, wherein the set of features includes words in a candidate uniform resource locator (URL) and numerical features of a URL structure of the candidate URL;
classify, by the ML detection system, the candidate URL as malicious or benign using the set of features; and
output an indication of a malicious URL in response to the candidate URL being classified as malicious.
8. The integrated circuit of claim 7, wherein the integrated circuit is a data processing unit (DPU), and wherein the DPU is programmable data center infrastructure on a chip.
9. The integrated circuit of claim 7, further comprising a network interface operably coupled to the CPU, the network interface to handle network data path processing, wherein the CPU is to handle control path initialization and exception handling.
10. The integrated circuit of claim 7, wherein the one or more computer programs comprise at least one of a host operating system (OS), an application program, a guest operating system, or a guest application program.
11. The integrated circuit of claim 7, wherein:
the hardware-accelerated security service is to obtain the snapshot of the data stored in the physical memory, the snapshot representing the data at a point in time; and
the ML detection system includes:
feature extraction logic to extract the set of features from the snapshot, the set of features including the words in the candidate URL and the numerical features of the URL structure of the candidate URL; and
a binary classification model trained to classify the candidate URL as malicious or benign using the set of features.
12. The integrated circuit of claim 11, wherein the feature extraction logic is to tokenize the words into tokens, and wherein the binary classification model comprises:
an embedding layer to receive the tokens as an input sequence of tokens representing the words in the candidate URL and to generate input vectors based on the input sequence of tokens;
a long short-term memory (LSTM) layer trained to generate an output vector based on the input vectors; and
a fully connected neural network layer trained to classify the candidate URL as malicious or benign using the output vector of the LSTM layer and the numerical features of the URL structure.
13. The integrated circuit of claim 7, wherein the one or more computer programs reside in a first computing domain, wherein the hardware-accelerated security service and the ML detection system reside in a second computing domain different from the first computing domain.
14. The integrated circuit of claim 7, wherein the hardware-accelerated security service is out-of-band security software located in a trusted domain that is distinct and separate from the malicious URL.
15. The integrated circuit of claim 7, further comprising a Direct Memory Access (DMA) controller coupled to the host interface, wherein the DMA controller is to read the data from the physical memory via the host interface.
16. The integrated circuit of claim 15, wherein the host interface is a Peripheral Component Interconnect Express (PCIe) interface.
17. A computing system, comprising:
a data processing unit (DPU) comprising a host interface, a central processing unit (CPU), and an acceleration hardware engine, the DPU to host a hardware-accelerated security service to protect one or more computer programs executed by a host device, wherein the hardware-accelerated security service is to extract a plurality of features from data stored in physical memory associated with the host device, the data associated with the one or more computer programs; and
acceleration pipeline hardware coupled to the DPU, wherein the acceleration pipeline hardware is to:
obtain a snapshot of the data stored in the physical memory, the data associated with the one or more computer programs executed by the host device;
extract a set of features from the snapshot using a machine learning (ML) detection system, wherein the set of features includes words in a candidate uniform resource locator (URL) and numerical features of a URL structure of the candidate URL;
classify, by the ML detection system, the candidate URL as malicious or benign using the set of features; and
output an indication of a malicious URL in response to the candidate URL being classified as malicious.
18. The computing system of claim 17, wherein the DPU is programmable data center infrastructure on a chip.
19. The computing system of claim 17, wherein the DPU further comprises a network interface operatively coupled to the CPU, the network interface to handle network data path processing, wherein the CPU is for control path initialization and exception handling.
20. The computing system of claim 17, wherein the one or more computer programs comprise at least one of a host operating system (OS), an application program, a guest operating system, or a guest application program.
21. The computing system of claim 17, wherein:
the hardware-accelerated security service is to obtain the snapshot of the data stored in the physical memory, the snapshot representing the data at a point in time; and
the ML detection system includes:
feature extraction logic to extract the set of features from the snapshot, the set of features including the words in the candidate URL and the numerical features of the URL structure of the candidate URL; and
a binary classification model trained to classify the candidate URL as malicious or benign using the set of features.
22. The computing system of claim 21, wherein the feature extraction logic is to tokenize the words into tokens, and wherein the binary classification model comprises:
an embedding layer to receive the tokens as an input sequence of tokens representing the words in the candidate URL and to generate input vectors based on the input sequence of tokens;
a long short-term memory (LSTM) layer trained to generate an output vector based on the input vectors; and
a fully connected neural network layer trained to classify the candidate URL as malicious or benign using the output vector from the LSTM layer and the numerical features of the URL structure.
23. The computing system of claim 17, wherein the one or more computer programs reside in a first computing domain, wherein the hardware-accelerated security service resides in a second computing domain different from the first computing domain, and wherein the ML detection system resides in the second computing domain or a third computing domain different from the first computing domain and the second computing domain.
24. The computing system of claim 17, wherein the hardware-accelerated security service is out-of-band security software in a trusted domain that is distinct and separate from the malicious URL.
25. The computing system of claim 17, further comprising a Direct Memory Access (DMA) controller coupled to the host interface, wherein the DMA controller is to read the data from the physical memory via the host interface.
26. The computing system of claim 23, wherein the host interface is a Peripheral Component Interconnect Express (PCIe) interface.
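The claims above recite two kinds of features extracted from a candidate URL: words in the candidate URL and numerical features of its URL structure (subdomain, domain, TLD, port, path, query, and fragment). The following Python sketch shows one plausible extraction using only the standard library; the helper name extract_features and the particular numerical features chosen are assumptions for illustration, not a list taken from the claims.

```python
# Illustrative feature extraction (assumed details, not the claimed
# implementation): word tokens plus numerical URL-structure features.
import re
from urllib.parse import urlparse

def extract_features(candidate_url: str):
    """Return (words, numeric) features for a candidate URL."""
    parsed = urlparse(candidate_url)
    host = parsed.hostname or ""
    labels = host.split(".")
    tld = labels[-1] if len(labels) > 1 else ""
    domain = labels[-2] if len(labels) > 1 else host
    subdomain = ".".join(labels[:-2])

    # Word features: word-like tokens drawn from the whole URL.
    words = re.findall(r"[a-z0-9]+", candidate_url.lower())

    # Numerical features of the URL structure (an illustrative selection).
    numeric = {
        "url_len": len(candidate_url),
        "subdomain_len": len(subdomain),
        "domain_len": len(domain),
        "tld_len": len(tld),
        "port": parsed.port or 0,
        "path_depth": parsed.path.count("/"),
        "num_query_params": parsed.query.count("&") + 1 if parsed.query else 0,
        "has_fragment": int(bool(parsed.fragment)),
        "digit_ratio": sum(c.isdigit() for c in host) / max(len(host), 1),
    }
    return words, numeric

words, numeric = extract_features(
    "http://login.examp1e-bank.com:8080/verify?id=42&session=x#top")
```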
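Claims 4, 6, 12, and 22 describe a binary classification model built from an embedding layer over word tokens, a long short-term memory (LSTM) layer that produces an output vector, and a fully connected layer that combines that vector with the numerical URL-structure features. A minimal PyTorch sketch of such a model follows; the vocabulary size, layer widths, and the count of nine numerical features (matching the extraction sketch above) are assumptions, not values from the disclosure.

```python
# Minimal sketch (assumed sizes, not the claimed implementation) of the
# embedding + LSTM + fully connected binary classifier described above.
import torch
import torch.nn as nn

class UrlClassifier(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=64, hidden_dim=128, n_numeric=9):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim + n_numeric, 2)  # benign vs. malicious

    def forward(self, token_ids, numeric_feats):
        # token_ids: (batch, seq_len) word tokens of the candidate URL;
        # numeric_feats: (batch, n_numeric) URL-structure features.
        vectors = self.embed(token_ids)      # input vectors from the tokens
        _, (h_n, _) = self.lstm(vectors)     # h_n: (1, batch, hidden_dim)
        combined = torch.cat([h_n[-1], numeric_feats], dim=1)
        return self.fc(combined)             # logits over {benign, malicious}

model = UrlClassifier()
tokens = torch.randint(1, 10_000, (1, 16))      # a tokenized candidate URL
numeric = torch.rand(1, 9)                      # its structural features
verdict = model(tokens, numeric).argmax(dim=1)  # 0 = benign, 1 = malicious
```

Feeding the fully connected layer both the LSTM output vector and the raw numerical features lets the classifier weigh lexical evidence (suspicious words) against structural evidence (unusual lengths, ports, or digit ratios) in a single decision.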
CN202310009038.4A (priority date 2022-02-14; filed 2023-01-04; status: pending; published as CN116595518A) Malicious uniform resource locator URL detection in memory of a data processing unit

Applications Claiming Priority (3)

Application Number | Priority Date | Filing Date | Title
US 63/309,849 | 2022-02-14 | |
US 17/864,310 (published as US 2023/0319108 A1) | 2022-02-14 | 2022-07-13 | Malicious uniform resource locator (URL) detection in memory of a data processing unit using machine learning detection models
US 17/864,310 | | 2022-07-13 |

Publications (1)

Publication Number | Publication Date
CN116595518A (en) | 2023-08-15

Family

ID=87594282

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202310009038.4A (pending; published as CN116595518A) | Malicious uniform resource locator URL detection in memory of a data processing unit | 2022-02-14 | 2023-01-04

Country Status (1)

Country | Link
CN | CN116595518A (en)

Similar Documents

Publication Publication Date Title
EP3814961B1 (en) Analysis of malware
EP3820113B1 (en) Visual detection of phishing websites via headless browser
US10956477B1 (en) System and method for detecting malicious scripts through natural language processing modeling
JP2022133461A (en) Real-time detection of and protection from malware and steganography in kernel mode
US11062024B2 (en) Computer-security event security-violation detection
CN109074454B (en) Automatic malware grouping based on artifacts
US20210400057A1 (en) Detection of ransomware
US20230262076A1 (en) Malicious domain generation algorithm (dga) detection in memory of a data processing unit using machine learning detection models
US11374946B2 (en) Inline malware detection
US11636208B2 (en) Generating models for performing inline malware detection
US20230216868A1 (en) Analysis of endpoint detect and response data
US20230021885A1 (en) Phishing Mitigation Service
US20230409715A1 (en) Methods and systems for trusted unknown malware detection and classification in linux cloud environments
US20230171267A1 (en) Selective security scan to reduce signature candidates
US20230259625A1 (en) Ransomware detection in memory of a data processing unit using machine learning detection models
US20230259614A1 (en) Malicious activity detection in memory of a data processing unit using machine learning detection models
US20230319108A1 (en) Malicious uniform resource locator (url) detection in memory of a data processing unit using machine learning detection models
CN116595518A (en) Malicious uniform resource locator URL detection in memory of a data processing unit
CN116595520A (en) Malicious domain generation algorithm DGA detection in memory of a data processing unit
CN116595521A Ransomware detection in memory of a data processing unit
CN116595519A (en) Malicious activity detection in memory of data processing unit
EP3999985A1 (en) Inline malware detection
Rao et al. Advances in Malware Analysis and Detection in Cloud Computing Environments: A Review.
US20220245249A1 (en) Specific file detection baked into machine learning pipelines
Preetam Behavioural analytics for threat detection

Legal Events

Code | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination