CN115801466B - Flow-based mining script detection method and device

Publication number
CN115801466B
Authority
CN
China
Legal status
Active
Application number
CN202310080142.2A
Other languages
Chinese (zh)
Other versions
CN115801466A (en
Inventor
郭静海
张福
程度
Current Assignee
Beijing Shengxin Network Technology Co ltd
Original Assignee
Beijing Shengxin Network Technology Co ltd
Application filed by Beijing Shengxin Network Technology Co ltd
Priority to CN202310080142.2A
Publication of CN115801466A
Application granted
Publication of CN115801466B

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure relates to a flow-based mining script detection method and device. The method comprises: in response to traffic data producing a script file download operation, matching preset threat intelligence data against the traffic data to obtain a matching result; when the match succeeds, determining, through a mining recognition model, a first word in the script file that matches a feature word set and a second word that matches a unique word set; determining whether the script file is a mining script according to the first word and the second word; and generating alarm information when the script file is a mining script. In this way, traffic data on a device can be matched against threat intelligence data, the feature words and unique words in the script file can be determined with the mining recognition model, and the mining script can be identified from them. The method can therefore support real-time detection under high traffic, with fast detection and low performance overhead, and can improve the efficiency and accuracy of mining script detection.

Description

Flow-based mining script detection method and device
Technical Field
The disclosure relates to the field of computer technology, and in particular to a flow-based mining script detection method and device.
Background
With the development of blockchain technology, virtual currencies have emerged. Because such currencies are obtained through large amounts of computation, various mining Trojans have appeared. A mining Trojan steals the computing resources of other devices, causing them to run at full load, slowing them down, and shortening the service life of their hardware.
An attacker can exploit a vulnerable system to remotely implant a mining script file into another device. The script file can detect the type of the operating system and download the corresponding mining Trojan and control program. To monopolize system resources, it also inspects the processes of the operating system and forcibly kills unrelated ones, and it can additionally add scheduled tasks, specify externally connected domain names, clear history records, download environment installation packages, and so on. This has a large impact on the device itself and inconveniences the device's actual owner.
Disclosure of Invention
The disclosure provides a flow-based mining script detection method and device.
According to an aspect of the present disclosure, there is provided a flow-based mining script detection method, including:
in response to traffic data producing a script file download operation, matching preset threat intelligence data against the traffic data to obtain a matching result;
when the matching result is a successful match, determining, through a mining recognition model, a first word in the script file that matches a feature word set and a second word that matches a unique word set, wherein the feature word set is a set of words whose occurrence frequency in mining script samples is greater than a first threshold, and the unique word set is a set of words that occur in mining script samples and do not occur in normal script samples;
determining whether the script file is a mining script according to the first word and the second word;
and generating alarm information when the script file is a mining script.
In some embodiments of the disclosure, the determining, by the mining identification model, the first word in the script file that matches the set of feature words includes:
performing word segmentation and denoising processing on the script file to obtain a word set of the script file;
determining undetermined words with occurrence frequencies greater than a second threshold value in the word set;
among the pending words, the first word that appears in the feature word set is determined.
In some embodiments of the disclosure, the determining, by the mining identification model, the second term in the script file that matches the set of unique terms includes:
performing word segmentation and denoising processing on the script file to obtain a word set of the script file;
in the set of words, the second word that appears in the set of unique words is determined.
In some embodiments of the present disclosure, determining whether the script file is an ore mining script based on the first term and the second term includes:
and determining that the script file is an ore mining script under the condition that the total number of the first words and the second words is larger than a third threshold value.
In some embodiments of the present disclosure, the method further comprises:
when the matching result is a successful match and the script file is identified as not being a mining script, acquiring unique words that appear in the script file and do not appear in normal script samples, and adding the unique words to the unique word set;
and generating alarm information.
In some embodiments of the present disclosure, the method further comprises:
under the condition that the matching result is unmatched, determining whether the script file is an ore mining script or not through the ore mining identification model;
under the condition that the script file is identified as the mining script, acquiring the IP address and domain name information of the flow data;
updating the threat information data through the IP address and the domain name information;
and generating alarm information.
In some embodiments of the present disclosure, the method further comprises:
and if the matching result is unmatched and the script file is identified as not being the mining script, determining the script file as a normal script.
According to another aspect of the present disclosure, there is provided a flow-based mining script detection apparatus, including:
the matching module is used for responding to the operation of downloading the script file generated by the flow data, and matching processing is carried out through the preset threat information data and the flow data to obtain a matching result;
the word determining module is used for determining, through the mining recognition model and when the matching result is a successful match, a first word in the script file that matches a feature word set and a second word that matches a unique word set, wherein the feature word set is a set of words whose occurrence frequency in mining script samples is greater than a first threshold, and the unique word set is a set of words that occur in mining script samples and do not occur in normal script samples;
the identification module is used for determining whether the script file is a mining script according to the first word and the second word;
and the alarm module is used for generating alarm information when the script file is a mining script.
In some embodiments of the disclosure, the term determination module is further to:
performing word segmentation and denoising processing on the script file to obtain a word set of the script file;
determining undetermined words with occurrence frequencies greater than a second threshold value in the word set;
among the pending words, the first word that appears in the feature word set is determined.
In some embodiments of the disclosure, the term determination module is further to:
performing word segmentation and denoising processing on the script file to obtain a word set of the script file;
in the set of words, the second word that appears in the set of unique words is determined.
In some embodiments of the present disclosure, the identification module is further to:
and determining that the script file is an ore mining script under the condition that the total number of the first words and the second words is larger than a third threshold value.
In some embodiments of the present disclosure, the apparatus further comprises: the word set updating module is used for:
when the matching result is a successful match and the script file is identified as not being a mining script, acquiring unique words that appear in the script file and do not appear in normal script samples, and adding the unique words to the unique word set;
and generating alarm information.
In some embodiments of the present disclosure, the apparatus further comprises a threat intelligence data update module for:
under the condition that the matching result is unmatched, determining whether the script file is an ore mining script or not through the ore mining identification model;
under the condition that the script file is identified as the mining script, acquiring the IP address and domain name information of the flow data;
updating the threat information data through the IP address and the domain name information;
and generating alarm information.
In some embodiments of the present disclosure, the apparatus further comprises: and the normal script confirming module is used for confirming the script file as a normal script under the condition that the matching result is unmatched and the script file is identified as not being the mining script.
According to another aspect of the present disclosure, there is provided a flow-based mining script detection device, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to perform the above method.
According to another aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described method.
According to the flow-based mining script detection method of the embodiments of the present disclosure, traffic data on a device can be matched against threat intelligence data. If the match succeeds, the feature words and unique words in the script file are determined with the mining recognition model, and the mining script is then identified from them. The method can therefore support real-time detection under high traffic, with fast detection and low performance overhead, and can improve the efficiency and accuracy of mining script detection. In addition, the unique word set, the feature word set, and the threat intelligence data can be updated continuously during recognition, which realizes a self-learning process, reduces the number of samples needed for learning, and progressively improves recognition accuracy.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure. Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and, together with the description, serve to explain the technical aspects of the disclosure.
FIG. 1 illustrates a flow chart of a flow-based mining script detection method according to an embodiment of the present disclosure;
FIG. 2 illustrates an application schematic of a flow-based mining script detection method in accordance with an embodiment of the present disclosure;
FIG. 3 illustrates a block diagram of a flow-based mining script detection device, according to an embodiment of the present disclosure;
FIG. 4 illustrates a block diagram of a flow-based mining script detection apparatus, in accordance with an embodiment of the present disclosure;
FIG. 5 shows a block diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the disclosure will be described in detail below with reference to the drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The term "and/or" herein merely describes an association between associated objects and covers three cases; for example, "A and/or B" may mean: A exists alone, both A and B exist, or B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality; for example, "including at least one of A, B, and C" may mean including any one or more elements selected from the set consisting of A, B, and C.
Furthermore, numerous specific details are set forth in the following detailed description in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art have not been described in detail in order not to obscure the present disclosure.
To address the problems in the related art, traffic data on a device can be matched against threat intelligence data. If the match succeeds, the feature words and unique words in the script file are determined with the mining recognition model, and the mining script is then identified from them. This can support real-time detection under high traffic, with fast detection and low performance overhead, and can improve the efficiency and accuracy of mining script detection.
FIG. 1 shows a flowchart of a flow-based mining script detection method according to an embodiment of the present disclosure. As shown in FIG. 1, the method includes:
step S11, in response to traffic data producing a script file download operation, matching preset threat intelligence data against the traffic data to obtain a matching result;
step S12, when the matching result is a successful match, determining, through a mining recognition model, a first word in the script file that matches a feature word set and a second word that matches a unique word set, wherein the feature word set is a set of words whose occurrence frequency in mining script samples is greater than a first threshold, and the unique word set is a set of words that occur in mining script samples and do not occur in normal script samples;
step S13, determining whether the script file is a mining script according to the first word and the second word;
step S14, generating alarm information when the script file is a mining script.
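As an illustrative sketch only, and not the implementation of the disclosure, steps S11 to S14 might be arranged as follows in Python. The `DownloadEvent` structure, the `detect` function, the alarm text, and the use of a plain set for the threat intelligence data are all assumptions introduced for illustration; the recognition model is passed in as a callable so that this sketch stays independent of how the model is built.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set


@dataclass
class DownloadEvent:
    """A script download observed in traffic data (hypothetical structure)."""
    src_ip: str
    domain: str
    script_text: str


def detect(event: DownloadEvent,
           threat_intel: Set[str],
           is_mining_script: Callable[[str], bool]) -> Optional[str]:
    """Return alarm text for the matched-and-mining case, else None."""
    # S11: match the traffic's IP address / domain against threat intelligence.
    matched = event.src_ip in threat_intel or event.domain in threat_intel
    if not matched:
        # The unmatched branch is covered by further embodiments, not here.
        return None
    # S12 + S13: the recognition model judges the downloaded script.
    if is_mining_script(event.script_text):
        # S14: generate alarm information.
        return f"ALERT: mining script from {event.domain} ({event.src_ip})"
    return None
```

Only the four core steps are covered here; the additional branches described in later embodiments (for example, raising an alarm when the match succeeds but the model disagrees) are deliberately left out of this sketch.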
In some embodiments of the present disclosure, if a device (e.g., an electronic device such as a computer or a mobile phone) is maliciously attacked while running, it may produce traffic data from the attacker's access. For example, the attacker accesses the device over the Internet and causes it to remotely download a script file, which may be a mining script. Once downloaded and executed, the script maliciously occupies the device's running resources, inconveniencing the device's user and harming the device.
Fig. 2 illustrates an application schematic of a flow-based mining script detection method according to an embodiment of the present disclosure.
In some embodiments of the present disclosure, in step S11, the preset threat intelligence data may record a known domain name or IP address with a threat, and if the domain name or IP address of the visitor matches the record in the threat intelligence data, the traffic data may be considered to have a security risk, and the script file downloaded in the traffic data may be a mining script.
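A minimal sketch of the matching in step S11 could look like the following. The function name `matches_threat_intel`, the two separate sets, the domain normalization, and the treatment of subdomains as matches are assumptions made for illustration; the disclosure only states that known threatening domain names or IP addresses are recorded.

```python
from typing import Set


def matches_threat_intel(src_ip: str, domain: str,
                         bad_ips: Set[str], bad_domains: Set[str]) -> bool:
    """Return True if the visitor's IP address or domain is a recorded threat."""
    if src_ip.strip() in bad_ips:
        return True
    # Normalize the domain: trim, drop a trailing dot, lower-case.
    normalized = domain.strip().rstrip(".").lower()
    labels = normalized.split(".")
    # Assumption: a subdomain of a listed domain also counts as a match,
    # so every suffix of the name is checked against the list.
    candidates = {".".join(labels[i:]) for i in range(len(labels))}
    return bool(candidates & bad_domains)
```

Using hash sets keeps each lookup close to constant time, which is in line with the stated goal of real-time detection under high traffic.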
In some embodiments of the present disclosure, in step S12, it may be determined whether the script downloaded in the flow data is a mining script through the mining identification model. In the judging, the characteristic word set and the unique word set can be used for judging.
In some embodiments of the present disclosure, the feature word set is a word set obtained by counting words occurring in a plurality of mine-mining script samples, where the frequency of occurrence of feature words in the word set in the mine-mining script samples is higher, for example, greater than a first threshold value, so that the feature words may be representative words in the mine-mining script samples, and if there are a large number of feature words in a certain script file, the probability that the script file is a mine-mining script is higher.
In some embodiments of the present disclosure, using the feature words alone may not accurately determine whether a script file is a mining script, and the accuracy of the determination can be further improved on top of the feature words. Besides counting the words in the plurality of mining script samples, the words in normal script samples (non-mining scripts) can also be counted, and the words that appear only in the mining script samples and not in the normal script samples can be determined. These words serve as the unique words of mining scripts, and the set they form is the unique word set. If a script file contains a large number of unique words that appear only in mining script samples, the probability that the script is a mining script is higher. By using the feature word set and the unique word set together, the accuracy of identifying mining scripts can be improved.
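The construction of the two word sets can be sketched as below. This is an assumption-laden illustration: the regular-expression tokenizer, the interpretation of "occurrence frequency" as a total count across all mining samples, and the names `tokenize` and `build_word_sets` are not taken from the disclosure.

```python
import re
from collections import Counter
from typing import Iterable, List, Set, Tuple

# Hypothetical tokenizer: identifier-like tokens of at least two characters.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]+")


def tokenize(text: str) -> List[str]:
    return [t.lower() for t in TOKEN_RE.findall(text)]


def build_word_sets(mining_samples: Iterable[str],
                    normal_samples: Iterable[str],
                    first_threshold: int) -> Tuple[Set[str], Set[str]]:
    """Return (feature_words, unique_words) built from labelled samples."""
    mining_counts: Counter = Counter()
    for sample in mining_samples:
        mining_counts.update(tokenize(sample))
    normal_vocab: Set[str] = set()
    for sample in normal_samples:
        normal_vocab.update(tokenize(sample))
    # Feature words: frequency in mining samples above the first threshold.
    feature_words = {w for w, c in mining_counts.items() if c > first_threshold}
    # Unique words: seen in mining samples, never seen in normal samples.
    unique_words = set(mining_counts) - normal_vocab
    return feature_words, unique_words
```

Note that the two sets may overlap: a word can be both frequent in mining samples and absent from normal ones.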
In some embodiments of the present disclosure, in step S12, a first term in the script file that matches the set of feature words and a second term that matches the set of unique words may be determined. That is, feature words and unique words in the script file are found.
In some embodiments of the disclosure, the determining, by the mining identification model, the first word in the script file that matches the set of feature words includes: performing word segmentation and denoising processing on the script file to obtain a word set of the script file; determining undetermined words with occurrence frequencies greater than a second threshold value in the word set; among the pending words, the first word that appears in the feature word set is determined.
In some embodiments of the present disclosure, the word segmentation and denoising process may be performed using algorithms in the related art, for example, a barker segmentation and aliquoting algorithm, and the present disclosure is not limited to the specific manner of the word segmentation and denoising process. After processing, a set of words of the script file may be obtained.
In some embodiments of the present disclosure, pending words whose occurrence frequency is greater than a second threshold may be determined in the word set. The second threshold may be the same as or different from the first threshold, which is not limited by the present disclosure.
In some embodiments of the present disclosure, a first term that appears in the feature term set may be determined among the pending terms, i.e., an intersection of the pending terms and the feature term set is determined, where the term in the intersection is the first term.
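The three sub-steps above (segmentation and denoising, frequency filtering, intersection) can be sketched as follows. The tokenizer, the comment stripping, and the small `NOISE` stop list are stand-ins chosen for illustration, since the disclosure does not fix a particular segmentation or denoising algorithm.

```python
import re
from collections import Counter
from typing import Set

TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]+")
# Hypothetical noise list: shell keywords carrying no signal.
NOISE = {"if", "then", "fi", "do", "done", "echo"}


def word_counts(script_text: str) -> Counter:
    """Segment the script into words and drop noise (comments, keywords)."""
    without_comments = re.sub(r"#.*", "", script_text)
    tokens = (t.lower() for t in TOKEN_RE.findall(without_comments))
    return Counter(t for t in tokens if t not in NOISE)


def first_words(script_text: str, feature_words: Set[str],
                second_threshold: int) -> Set[str]:
    counts = word_counts(script_text)
    # Pending words: frequency in this script above the second threshold.
    pending = {w for w, c in counts.items() if c > second_threshold}
    # First words: intersection of pending words and the feature word set.
    return pending & feature_words
```

Filtering by frequency before intersecting means a feature word that appears only incidentally in a script does not count toward the decision.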
In some embodiments of the disclosure, the determining, by the mining identification model, the second term in the script file that matches the set of unique terms includes: performing word segmentation and denoising processing on the script file to obtain a word set of the script file; in the set of words, the second word that appears in the set of unique words is determined.
In some embodiments of the present disclosure, the word segmentation and denoising processes are described above and are not described in detail herein. Alternatively, the above word set may be used directly.
In some embodiments of the present disclosure, a second term that appears in the set of unique terms may be determined from the set of terms, i.e., an intersection of the set of unique terms and the set of terms is determined, and the terms in the intersection are the second term.
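Determining the second words reduces to a set intersection with no frequency filter. A short sketch, assuming the same kind of hypothetical regular-expression tokenizer as a stand-in for the unspecified segmentation step:

```python
import re
from typing import Set

TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]+")


def second_words(script_text: str, unique_words: Set[str]) -> Set[str]:
    """Words of the script that also appear in the unique word set."""
    word_set = {t.lower() for t in TOKEN_RE.findall(script_text)}
    # Second words: intersection of the script's word set and the unique set.
    return word_set & unique_words
```

Unlike the first words, a single occurrence is enough here, which fits the idea that a unique word never appears in normal scripts at all.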
In some embodiments of the present disclosure, in step S13, whether the script file is a mining script may be determined comprehensively from the first word and the second word, so as to improve the accuracy of the determination. In an example, the total number of first words and second words may be computed, and if the total is greater than a third threshold, the script file is determined to be a mining script.
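The example decision rule can be written in a few lines. One point the text leaves open is how to count a word that is both a first word and a second word; the sketch below assumes it is counted once in each group (a plain sum of the two set sizes), which is an interpretation rather than something the disclosure states.

```python
from typing import Set


def is_mining_script(first_words: Set[str], second_words: Set[str],
                     third_threshold: int) -> bool:
    """Judge a script as mining if the combined word count exceeds the threshold."""
    # Assumption: overlapping words are counted in both groups.
    total = len(first_words) + len(second_words)
    return total > third_threshold
```

If overlapping words should count only once, `len(first_words | second_words)` could be used instead, with the third threshold tuned accordingly.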
In some embodiments of the present disclosure, in step S14, if it is determined that the script file is an ore-mining script, an alarm message may be generated, thereby prompting the device or a user of the device to take measures. For example, the script file is isolated or deleted, etc., so that the script file is prevented from being executed, and the device is prevented from being attacked maliciously.
In some embodiments of the present disclosure, family identification may also be performed on the script file before generating the alarm information, for example, identifying information such as a category family of the mine excavation script to which the script file belongs. The present disclosure is not limited by the specific manner in which the family is identified.
In some embodiments of the present disclosure, the method further comprises: when the matching result is a successful match and the script file is identified as not being a mining script, acquiring unique words that appear in the script file and do not appear in normal script samples, and adding the unique words to the unique word set; and generating alarm information. That is, if the traffic data matches the threat intelligence data but the mining recognition model judges that the script file is not a mining script, the script file can still be automatically classified as a mining script, and the unique words in it can be collected and added to the unique word set, improving the accuracy of judgments that use the unique word set. Furthermore, the feature words in the mining script can be counted and added to the feature word set, further improving the accuracy of judgments that use the feature word set. Family identification, alarm information generation, and other processing can also be performed on the script file.
By the method, the mining identification model can automatically learn, and when the script file is determined to be the mining script but is not successfully identified by the mining identification model, the unique word set and the characteristic word set can be automatically updated based on words in the script file, so that the accuracy of judgment by using the mining identification model in the subsequent process is improved.
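The self-learning update of the unique word set might be sketched as follows. The function name `learn_from_missed_script` and the choice to keep the vocabulary of normal samples as a set (`normal_vocab`) are assumptions for illustration; updating the feature word set would additionally require frequency statistics and is not shown.

```python
from typing import Iterable, Set


def learn_from_missed_script(script_words: Iterable[str],
                             normal_vocab: Set[str],
                             unique_words: Set[str]) -> Set[str]:
    """Add the script's words that never occur in normal samples.

    Called when the traffic matched threat intelligence but the model
    did not recognize the script. Returns the newly added words.
    """
    candidates = set(script_words) - normal_vocab
    added = candidates - unique_words
    unique_words |= added  # update the shared set in place
    return added
```

Returning the newly added words makes it easy to log what the model learned from each missed script.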
In some embodiments of the present disclosure, the method further comprises: under the condition that the matching result is unmatched, determining whether the script file is an ore mining script or not through the ore mining identification model; under the condition that the script file is identified as the mining script, acquiring the IP address and domain name information of the flow data; updating the threat information data through the IP address and the domain name information; and generating alarm information.
In some embodiments of the present disclosure, if the traffic data does not match the threat intelligence data, i.e., the traffic data does not come from a known threatening domain name or IP address, but the mining recognition model identifies the script file as a mining script, the script file may be determined to be a mining script. Because the unknown domain name or IP address that accessed the device has performed a malicious access (i.e., the act of causing the device to download the mining script), the domain name information or IP address may be added to the threat intelligence data as a known threatening IP address or domain name. If that IP address or domain name accesses the device again during subsequent use, the access may be directly determined to be malicious. Further, family identification may be performed on the mining script and alarm information may be generated.
By the method, threat information data can be automatically updated under the condition that the mining script comes from an unknown domain name or IP address, so that threat information data are enriched, and the detection breadth of equipment is improved.
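A possible shape for this update, under the assumption that threat intelligence is held in memory as two sets, is sketched below. `ThreatIntel` and `handle_unmatched` are hypothetical names, and a real deployment would presumably persist the data rather than keep it in a Python object.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ThreatIntel:
    """In-memory threat intelligence store (illustrative only)."""
    ips: Set[str] = field(default_factory=set)
    domains: Set[str] = field(default_factory=set)

    def matches(self, ip: str, domain: str) -> bool:
        return ip in self.ips or domain.lower() in self.domains

    def add(self, ip: str, domain: str) -> None:
        self.ips.add(ip)
        self.domains.add(domain.lower())


def handle_unmatched(intel: ThreatIntel, ip: str, domain: str,
                     model_says_mining: bool) -> bool:
    """Handle traffic that did not match; return True if an alarm is due."""
    if model_says_mining:
        # Record the previously unknown source so later access matches directly.
        intel.add(ip, domain)
        return True
    return False
```

After the update, a repeat visit from the same address is caught by the matching step alone, without waiting for the model's verdict.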
In some embodiments of the present disclosure, the method further comprises: when the matching result is unmatched and the script file is identified as not being a mining script, determining the script file to be a normal script. That is, if the traffic data does not come from a threatening IP address or domain name and the mining recognition model determines that the script file is not a mining script but a normal script, the script can be processed normally without isolation or deletion, minimizing the impact on normal access behavior.
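Taken together, the embodiments describe four combinations of matching result and model verdict. The following sketch summarizes them as a small decision function; the `Outcome` tuple and its field names are assumptions introduced only to make the combinations explicit.

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    alarm: bool                # generate alarm information
    update_word_sets: bool     # extend the unique/feature word sets
    update_threat_intel: bool  # record the source IP address / domain


def decide(matched: bool, model_says_mining: bool) -> Outcome:
    if matched and model_says_mining:
        return Outcome(alarm=True, update_word_sets=False, update_threat_intel=False)
    if matched:
        # Known threatening source, model missed it: learn new words.
        return Outcome(alarm=True, update_word_sets=True, update_threat_intel=False)
    if model_says_mining:
        # Unknown source, model caught it: learn the new source.
        return Outcome(alarm=True, update_word_sets=False, update_threat_intel=True)
    # Unknown source and model sees a normal script: process normally.
    return Outcome(alarm=False, update_word_sets=False, update_threat_intel=False)
```

Each signal covers the other's blind spot: the word sets grow when threat intelligence catches what the model missed, and the threat intelligence grows when the model catches what the list missed.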
According to the flow-based mining script detection method of the embodiments of the present disclosure, traffic data on a device can be matched against threat intelligence data. If the match succeeds, the feature words and unique words in the script file are determined with the mining recognition model, and the mining script is then identified from them. The method can therefore support real-time detection under high traffic, with fast detection and low performance overhead, and can improve the efficiency and accuracy of mining script detection. In addition, the unique word set, the feature word set, and the threat intelligence data can be updated continuously during recognition, which realizes a self-learning process, reduces the number of samples needed for learning, and progressively improves recognition accuracy.
FIG. 3 shows a block diagram of a flow-based mining script detection apparatus according to an embodiment of the present disclosure. As shown in FIG. 3, the apparatus comprises:
the matching module 11 is used for responding to the operation of downloading the script file generated by the flow data, and carrying out matching processing on the preset threat information data and the flow data to obtain a matching result;
the word determining module 12 is configured to determine, when the matching result is that the matching is successful, a first word matched with a feature word set and a second word matched with a specific word set in the script file through the mining recognition model, where the feature word set is a word set with a frequency of occurrence greater than a first threshold value in a mining script sample, and the specific word set is a word set that occurs in the mining script sample and does not occur in a normal script sample;
the recognition module 13 is used for determining whether the script file is an ore excavation script according to the characteristic words and the unique words;
and the alarm module 14 is used for generating alarm information when the script file is an ore excavation script.
In some embodiments of the disclosure, the term determination module is further to:
performing word segmentation and denoising processing on the script file to obtain a word set of the script file;
determining undetermined words with occurrence frequencies greater than a second threshold value in the word set;
among the pending words, the first word that appears in the feature word set is determined.
In some embodiments of the disclosure, the term determination module is further to:
performing word segmentation and denoising processing on the script file to obtain a word set of the script file;
in the set of words, the second word that appears in the set of unique words is determined.
In some embodiments of the present disclosure, the identification module is further to:
and determining that the script file is an ore mining script under the condition that the total number of the first words and the second words is larger than a third threshold value.
In some embodiments of the present disclosure, the apparatus further comprises: the word set updating module is used for:
when the matching result is a successful match and the script file is identified as not being a mining script, acquiring unique words that appear in the script file and do not appear in normal script samples, and adding the unique words to the unique word set;
and generating alarm information.
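The update step can be sketched as a set difference followed by an in-place union. The function and parameter names are hypothetical, and the normal-sample vocabulary is assumed to be available as a plain set.

```python
def update_unique_words(script_words, normal_vocab, unique_words: set) -> set:
    """Add script words absent from normal samples to the unique word set.

    Mutates unique_words in place and returns only the newly found words.
    """
    new_words = set(script_words) - set(normal_vocab)
    unique_words.update(new_words)
    return new_words
```

Because the download source already matched the threat information data, the alarm is raised in this branch even though the model did not flag the script; the update lets the model catch similar scripts later.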
In some embodiments of the present disclosure, the apparatus further comprises a threat information data updating module configured to:
when the matching result indicates no match, determine through the mining recognition model whether the script file is a mining script;
when the script file is identified as a mining script, acquire the IP address and domain name information of the flow data;
update the threat information data with the IP address and the domain name information;
and generate alarm information.
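A minimal sketch of this branch, assuming the threat information data is held as two in-memory sets. The `ThreatIntel` class, the function name, and the alert dictionary layout are all hypothetical; the addresses used in any call would come from the flow data.

```python
from dataclasses import dataclass, field


@dataclass
class ThreatIntel:
    """In-memory stand-in for the threat information data (hypothetical)."""
    domains: set[str] = field(default_factory=set)
    ips: set[str] = field(default_factory=set)

    def matches(self, domain: str, ip: str) -> bool:
        """A flow matches when either its domain or its IP is already known."""
        return domain in self.domains or ip in self.ips


def on_unmatched_download(intel: ThreatIntel, domain: str, ip: str, is_mining: bool):
    """Handle a download whose source was not found in the threat data.

    Returns an alert dict when the model flags the script, otherwise None.
    """
    if not is_mining:
        return None
    # Record the source so later flows from it match directly.
    intel.domains.add(domain)
    intel.ips.add(ip)
    return {"type": "mining_script", "domain": domain, "ip": ip}
```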
In some embodiments of the present disclosure, the apparatus further comprises a normal script confirming module configured to determine that the script file is a normal script when the matching result indicates no match and the script file is identified as not being a mining script.
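Taken together, the modules cover four cases defined by two checks: whether the flow data matched the threat information data, and whether the model identified the script as a mining script. The sketch below maps each case to the actions described above; the function name and the action labels are hypothetical.

```python
def handle_download(matched: bool, is_mining: bool) -> list[str]:
    """Map the two checks to the actions described for each case."""
    if matched and is_mining:
        # Known-bad source and a flagged script: alert only.
        return ["alert"]
    if matched:
        # Known-bad source, script not flagged: learn new unique words, then alert.
        return ["update_unique_words", "alert"]
    if is_mining:
        # Unknown source, flagged script: record the source, then alert.
        return ["update_threat_intel", "alert"]
    # Unknown source and an unflagged script: treat as normal.
    return ["mark_normal"]
```

Only the last case produces no alarm, so the threat information data and the word-based model each act as a backstop for the other.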
In some embodiments, functions or modules included in an apparatus provided by the embodiments of the present disclosure may be used to perform a method described in the foregoing method embodiments, and specific implementations thereof may refer to descriptions of the foregoing method embodiments, which are not repeated herein for brevity.
The disclosed embodiments also provide a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described method. The computer readable storage medium may be a non-volatile computer readable storage medium.
The embodiment of the disclosure also provides an electronic device, which comprises: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to perform the above method.
Embodiments of the present disclosure also provide a computer program product comprising computer readable code which, when run on a device, causes a processor in the device to execute instructions for implementing the flow-based mining script detection method provided in any of the embodiments above.
The present disclosure also provides another computer program product for storing computer readable instructions that, when executed, cause a computer to perform the operations of the flow-based mining script detection method provided in any of the above embodiments.
The electronic device may be provided as a terminal, server or other form of device.
Fig. 4 illustrates a block diagram of a flow-based mining script detection apparatus 800, in accordance with an embodiment of the present disclosure. For example, device 800 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, exercise device, personal digital assistant, or the like.
Referring to fig. 4, device 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interactions between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the device 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and the like. The memory 804 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power component 806 provides power to the various components of the device 800. The power component 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device 800.
The multimedia component 808 includes a screen between the device 800 and the user that provides an output interface. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only an edge of a touch or slide action, but also a duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. The front camera and/or the rear camera may receive external multimedia data when the device 800 is in an operational mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 further includes a speaker for outputting audio signals.
Input/output interface 812 provides an interface between processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 814 includes one or more sensors for providing status assessment of various aspects of the device 800. For example, the sensor assembly 814 may detect an on/off state of the device 800 and the relative positioning of components, such as the display and keypad of the device 800. The sensor assembly 814 may also detect a change in position of the device 800 or of a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in temperature of the device 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the device 800 and other devices. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 816 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra-Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing the methods described above.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as memory 804 including computer program instructions executable by processor 820 of device 800 to perform the above-described method.
Fig. 5 illustrates a block diagram of an electronic device 1900 according to an embodiment of the disclosure. For example, electronic device 1900 may be provided as a server. Referring to FIG. 5, electronic device 1900 includes a processing unit 1922 that further includes one or more processors and memory resources represented by a storage unit 1932 for storing instructions, such as application programs, that can be executed by processing unit 1922. The application programs stored in storage unit 1932 may include one or more modules each corresponding to a set of instructions. Further, the processing unit 1922 is configured to execute instructions to perform the methods described above.
The electronic device 1900 may also include a power module 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an I/O interface 1958. The electronic device 1900 may run an operating system stored in the storage unit 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as a storage unit 1932, including computer program instructions executable by the processing unit 1922 of the electronic device 1900 to perform the methods described above.
The present disclosure may be a system, method, and/or computer program product. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for causing a processor to implement aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include the following: portable computer disks, hard disks, Random Access Memory (RAM), Read-Only Memory (ROM), Erasable Programmable Read-Only Memory (EPROM or flash memory), Static Random Access Memory (SRAM), portable Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Disks (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. Computer readable storage media, as used herein, are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., optical pulses through fiber optic cables), or electrical signals transmitted through wires.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
Computer program instructions for performing the operations of the present disclosure can be assembly instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, Field Programmable Gate Arrays (FPGAs), or Programmable Logic Arrays (PLAs), is personalized with state information of the computer readable program instructions, and the electronic circuitry can then execute the computer readable program instructions to implement aspects of the present disclosure.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The computer program product may be realized in particular by means of hardware, software or a combination thereof. In an alternative embodiment, the computer program product is embodied as a computer storage medium, and in another alternative embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), or the like.
The foregoing description of the embodiments of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the improvement of technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (5)

1. A flow-based mining script detection method, characterized by comprising:
in response to a script file download operation arising in flow data, matching preset threat information data against the flow data to obtain a matching result, wherein the preset threat information data records domain names or IP addresses known to pose a threat;
under the condition that the matching result is successful, determining, through a mining recognition model, a first word in the script file that matches a feature word set and a second word in the script file that matches a unique word set, wherein the feature word set is the set of words whose occurrence frequency in mining script samples is greater than a first threshold, and the unique word set is the set of words that occur in mining script samples and do not occur in normal script samples;
determining whether the script file is a mining script according to the first word and the second word;
generating alarm information under the condition that the script file is a mining script;
the determining, by the mining recognition model, a first word in the script file that matches the feature word set includes:
performing word segmentation and denoising processing on the script file to obtain a word set of the script file;
determining undetermined words with occurrence frequencies greater than a second threshold value in the word set;
determining the first word appearing in the feature word set in the undetermined words;
the determining, by the mining recognition model, a second word in the script file that matches the set of unique words includes:
performing word segmentation and denoising processing on the script file to obtain a word set of the script file;
determining, among the set of words, the second word that appears in the set of unique words;
the method further comprises the steps of:
under the condition that the matching result is successful and the script file is identified as not being a mining script, acquiring unique words which appear in the script file and do not appear in normal script samples, and adding the unique words to the unique word set;
generating alarm information;
the method further comprises the steps of:
under the condition that the matching result is unmatched, determining whether the script file is a mining script through the mining recognition model;
under the condition that the script file is identified as the mining script, acquiring the IP address and domain name information of the flow data;
updating the threat information data through the IP address and the domain name information;
generating alarm information;
the method further comprises the steps of:
and if the matching result is unmatched and the script file is identified as not being the mining script, determining the script file as a normal script.
2. The flow-based mining script detection method of claim 1, wherein determining whether the script file is a mining script according to the first word and the second word comprises:
determining that the script file is a mining script under the condition that the total number of the first words and the second words is greater than a third threshold.
3. A flow-based mining script detection device, characterized by comprising:
the matching module is used for, in response to a script file download operation arising in flow data, matching preset threat information data against the flow data to obtain a matching result, wherein the preset threat information data records domain names or IP addresses known to pose a threat;
the word determining module is used for determining, through the mining recognition model and under the condition that the matching result is successful, a first word in the script file that matches a feature word set and a second word in the script file that matches a unique word set, wherein the feature word set is the set of words whose occurrence frequency in mining script samples is greater than a first threshold, and the unique word set is the set of words that occur in mining script samples and do not occur in normal script samples;
the recognition module is used for determining whether the script file is a mining script according to the first word and the second word;
the alarm module is used for generating alarm information under the condition that the script file is a mining script;
the word determining module is further configured to:
performing word segmentation and denoising processing on the script file to obtain a word set of the script file;
determining undetermined words with occurrence frequencies greater than a second threshold value in the word set;
determining the first word appearing in the feature word set in the undetermined words;
the word determining module is further configured to:
performing word segmentation and denoising processing on the script file to obtain a word set of the script file;
determining, among the set of words, the second word that appears in the set of unique words;
the device further comprises a word set updating module for:
under the condition that the matching result is successful and the script file is identified as not being a mining script, acquiring unique words which appear in the script file and do not appear in normal script samples, and adding the unique words to the unique word set;
generating alarm information;
the device further comprises a threat information data updating module for:
under the condition that the matching result is unmatched, determining whether the script file is a mining script through the mining recognition model;
under the condition that the script file is identified as a mining script, acquiring the IP address and domain name information of the flow data;
updating the threat information data through the IP address and the domain name information;
generating alarm information;
the device further comprises a normal script confirming module for determining that the script file is a normal script under the condition that the matching result is unmatched and the script file is identified as not being a mining script.
4. A flow-based mining script detection equipment, characterized by comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the memory-stored instructions to perform the method of claim 1 or 2.
5. A computer readable storage medium, characterized in that it has stored thereon computer program instructions which, when executed by a processor, implement the method according to claim 1 or 2.
CN202310080142.2A 2023-02-08 2023-02-08 Flow-based mining script detection method and device Active CN115801466B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310080142.2A CN115801466B (en) 2023-02-08 2023-02-08 Flow-based mining script detection method and device

Publications (2)

Publication Number Publication Date
CN115801466A CN115801466A (en) 2023-03-14
CN115801466B true CN115801466B (en) 2023-05-02

Family

ID=85430463

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310080142.2A Active CN115801466B (en) 2023-02-08 2023-02-08 Flow-based mining script detection method and device

Country Status (1)

Country Link
CN (1) CN115801466B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103544436A (en) * 2013-10-12 2014-01-29 深圳先进技术研究院 System and method for distinguishing phishing websites
CN104901962A (en) * 2015-05-28 2015-09-09 北京椒图科技有限公司 Method and device for detecting webpage attack data
CN108399337A (en) * 2018-03-16 2018-08-14 北京奇虎科技有限公司 Webpage digs the method and device of mine script for identification
CN110427755A (en) * 2018-10-16 2019-11-08 新华三信息安全技术有限公司 A kind of method and device identifying script file
CN110933060A (en) * 2019-11-22 2020-03-27 上海交通大学 Excavation Trojan detection system based on flow analysis
CN112087414A (en) * 2019-06-14 2020-12-15 北京奇虎科技有限公司 Detection method and device for mining trojans
CN113139189A (en) * 2021-04-29 2021-07-20 广州大学 Method, system and storage medium for identifying mining malicious software
CN115438340A (en) * 2022-08-31 2022-12-06 济南大学 Mining behavior identification method and system based on morpheme characteristics

Also Published As

Publication number Publication date
CN115801466A (en) 2023-03-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant