CN109981818B - Domain name semantic anomaly analysis method and device, computer equipment and storage medium thereof - Google Patents

Info

Publication number
CN109981818B
Authority
CN
China
Prior art keywords
domain name
protocol
state machine
finite state
finite
Prior art date
Legal status
Active
Application number
CN201910226208.8A
Other languages
Chinese (zh)
Other versions
CN109981818A (en)
Inventor
吴必强
Current Assignee
Shanghai Yutong Electronic Technology Co ltd
Original Assignee
Shanghai Yutong Electronic Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Yutong Electronic Technology Co ltd filed Critical Shanghai Yutong Electronic Technology Co ltd
Priority to CN201910226208.8A priority Critical patent/CN109981818B/en
Publication of CN109981818A publication Critical patent/CN109981818A/en
Application granted granted Critical
Publication of CN109981818B publication Critical patent/CN109981818B/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00Network arrangements, protocols or services for addressing or naming
    • H04L61/09Mapping addresses
    • H04L61/10Mapping addresses of different types
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00Network arrangements, protocols or services for addressing or naming
    • H04L61/45Network directories; Name-to-address mapping
    • H04L61/4505Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2101/00Indexing scheme associated with group H04L61/00
    • H04L2101/30Types of network names

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer And Data Communications (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention relates to the field of computers, and in particular to a domain name semantic anomaly analysis method and device, computer equipment and a storage medium thereof. The method comprises the following steps: receiving a domain name network data message and extracting element information of the domain name protocol from the message; generating a finite state machine of the domain name protocol according to a pre-generated finite state machine model; and performing state inference on the element information of the domain name protocol according to the finite state machine of the domain name protocol, and outputting the state of the domain name. The invention extracts domain name protocol elements, calculates a Hash value and an information entropy from them, and performs semantic deduction for four known cases; by simplifying the domain name protocol semantic finite state machine, it reduces the amount of calculation and improves the accuracy of anomaly analysis.

Description

Domain name semantic anomaly analysis method and device, computer equipment and storage medium thereof
Technical Field
The invention relates to the field of computers, in particular to a domain name semantic anomaly analysis method and device, computer equipment and a storage medium thereof.
Background
A network protocol is a set of rules, standards or conventions established for data exchange in a computer network. Network protocols are the framework and nerves of today's information networks and the link that maintains normal network communication. The domain name system is one of the most critical infrastructures of the Internet, so anomaly analysis of domain name semantics is of great importance.
Existing domain name semantic anomaly analysis methods mainly comprise payload analysis and flow analysis. Payload analysis mainly analyzes and detects specific DNS (Domain Name System) channels; it generally uses rule matching and cannot detect unknown anomalies. The accuracy of flow analysis is not high. A domain name anomaly detection method that can effectively detect unknown anomalies while ensuring detection accuracy is therefore urgently needed.
Disclosure of Invention
The present invention aims to overcome the defects of the prior art by providing a domain name semantic anomaly analysis method, apparatus, computer device and storage medium, so as to solve the above technical problems.
The embodiment of the invention provides a domain name semantic anomaly analysis method, which comprises the following steps:
receiving a domain name network data message, and extracting element information of a domain name protocol in the message;
generating a finite state machine of the domain name protocol according to a pre-generated finite state machine model;
and performing state inference on the element information of the domain name protocol according to the finite state machine of the domain name protocol, and outputting the state of the domain name.
The embodiment of the present invention further provides a device for analyzing semantic abnormality of a domain name, including:
the information receiving unit is used for receiving a domain name network data message and extracting the element information of a domain name protocol in the message;
the information processing unit is used for generating a finite state machine of the domain name protocol according to a pre-generated finite state machine model;
and the information output unit is used for performing state inference on the element information of the domain name protocol according to the finite state machine of the domain name protocol and outputting the state of the domain name.
The embodiment of the present invention further provides a computer device, which includes a memory and a processor, where the memory stores a computer program, and when the computer program is executed by the processor, the processor executes the steps of the domain name semantic anomaly analysis method.
An embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the processor is enabled to execute the steps of the above method for analyzing domain name semantic anomalies.
According to the domain name semantic anomaly analysis method provided by the embodiment of the invention, domain name protocol elements are extracted and a Hash value and an information entropy are calculated from them, and semantic deduction is performed for four known cases; by simplifying the domain name protocol semantic finite state machine, the amount of calculation is reduced and the accuracy of anomaly analysis is improved.
Drawings
FIG. 1 is a diagram illustrating an environment for implementing a domain name semantic anomaly analysis method suitable for embodiments of the present invention;
FIG. 2 is a diagram illustrating the steps of a domain name semantic anomaly analysis method suitable for embodiments of the present invention;
FIG. 3 illustrates a state transition schematic suitable for embodiments of the present invention;
FIG. 4(a) shows a detailed state transition diagram suitable for embodiments of the present invention;
FIG. 4(b) shows a simplified diagram of a state transition suitable for embodiments of the present invention;
FIG. 4(c) illustrates a state transition process diagram for anomaly analysis suitable for embodiments of the present invention;
FIG. 4(d) shows a state transition process diagram for yet another anomaly analysis suitable for embodiments of the present invention;
FIG. 5 is a diagram illustrating an apparatus for analyzing semantic anomalies of domain names according to an embodiment of the present invention;
FIG. 6 shows an internal structural diagram of a computer apparatus suitable for an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the embodiments described herein are merely illustrative of the present invention and are not intended to limit it.
Fig. 1 is a diagram illustrating an implementation environment of a domain name semantic anomaly analysis method suitable for an embodiment of the present invention, which is detailed as follows:
in the embodiment of the present invention, the terminal 120 receives the domain name information in the internet 110, and performs state judgment on the domain name on the terminal 120, and outputs the domain name state.
In the embodiment of the present invention, the terminal 120 may be an intelligent terminal device such as a personal computer, a smart phone, a tablet computer, and the like.
Fig. 2 shows a step diagram of a domain name semantic anomaly analysis method suitable for the embodiment of the present invention, which is detailed as follows:
in step S201, domain name network packet information is received, and domain name protocol element information in the packet is extracted.
In the embodiment of the present invention, the terminal 120 may obtain the domain name network data message from network flow data, or directly from a physical network card, and extract the domain name protocol element information in the message. A domain name, commonly known as a website address, is globally unique. Because domain names are easy to remember, people usually use them to access websites, and a domain name server resolves a domain name to the Internet Protocol address mapped to it. The domain name protocol element information includes the domain name protocol attribute vector describing the network flow data, the state of the domain name, the transition conditions of the domain name state, and the like.
As an embodiment of the present invention, the terminal 120 is a traffic capturing device configured in a data network, including but not limited to an intelligent device such as a personal computer and a mobile phone, and after acquiring domain name information in the data network, the terminal 120 analyzes the domain name information to obtain a domain name address, domain name protocol attribute vectors, a state of a domain name, a conversion condition of the state of the domain name, and other domain name protocol element information, and stores the domain name address, the domain name protocol attribute vectors, the state of the domain name, and the conversion condition of the state of the domain name on the terminal 120 for subsequent domain name semantic anomaly analysis.
In the embodiment of the invention, the domain name is captured and analyzed through the terminal, so that preparation is made for subsequent domain name semantic analysis.
In step S202, a finite state machine of the domain name protocol is generated according to a pre-generated finite state machine model.
In the embodiment of the invention, a finite state machine model is preset, and the finite state machine of the domain name protocol is generated according to the domain name information acquired in the preceding step.
In the embodiment of the present invention, a finite state machine refers to a mathematical model representing a finite number of states and the behaviors, such as transitions and actions, between those states. The pre-generated finite state machine is derived from a local wireless network protocol; because the wireless network protocol is standardized, the derived finite state machine is also standardized. The finite state machine is defined independently of any particular device and is applicable to every device. By contrast, when a fault data interface is debugged by a method defined by the device itself, the interfaces of different devices are difficult to unify. The pre-generated finite state machine model allows various devices to be debugged uniformly, which effectively solves this problem of interfaces being difficult to unify.
As an embodiment of the present invention, the finite state machine M is a five-tuple, $M = (\Sigma, Q, V, D, T)$, where $\Sigma$ is a finite event table; $Q$ is a finite state set comprising an initial state $q_0$ and a final state $q_z$; $V$ is a finite set of state variables; $D$ is a finite set of state variable value ranges; and $T$ represents the state transition conditions in $M$. The finite events of domain name anomaly analysis are derived from the semantics of the domain name protocol specification, and all states and state transition conditions are extracted. The finite events are recorded as the finite event table $\Sigma$ = {start, query, abnormal, idle, response, normal, stop}; the states are recorded as the finite state set $Q = \{q_i \in Q \mid i = 0, 1, 2, \ldots\}$; and the state transition conditions are recorded as $T = \{t_i \in T \mid i = 0, 1, 2, \ldots\}$.
Fig. 3 shows a state transition schematic suitable for an embodiment of the present invention, where state 0, state 1 and state 2 represent 3 states in a finite state machine, T represents a condition for making a state transition, and δ and θ are a set of state transitions.
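The five-tuple and the transition scheme of Fig. 3 can be illustrated with a minimal Python sketch. The event names come from the event table in the text; the state names (`q0`, `q1`, `q2`, `qz`), the transition table, and the names `FiniteStateMachine`, `step` and `run` are hypothetical and chosen only for illustration, not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class FiniteStateMachine:
    """Five-tuple M = (Sigma, Q, V, D, T), modelled loosely on the text."""
    events: frozenset            # Sigma: finite event table
    states: frozenset            # Q: finite state set
    initial: str                 # q0: initial state
    finals: frozenset            # qz: final state(s)
    variables: dict = field(default_factory=dict)    # V: state variables
    domains: dict = field(default_factory=dict)      # D: value ranges of V
    transitions: dict = field(default_factory=dict)  # T: (state, event) -> state

    def step(self, state, event):
        if event not in self.events:
            raise ValueError(f"unknown event: {event}")
        # Assumption: an event with no matching transition leaves the state unchanged.
        return self.transitions.get((state, event), state)


def run(fsm, events):
    """Feed events until a final state is reached; return the last state."""
    state = fsm.initial
    for event in events:
        state = fsm.step(state, event)
        if state in fsm.finals:
            break
    return state


EVENTS = frozenset(
    {"start", "query", "abnormal", "idle", "response", "normal", "stop"}
)

# Hypothetical transition table, for illustration only.
fsm = FiniteStateMachine(
    events=EVENTS,
    states=frozenset({"q0", "q1", "q2", "qz"}),
    initial="q0",
    finals=frozenset({"qz"}),
    transitions={
        ("q0", "start"): "q1",
        ("q1", "query"): "q2",
        ("q2", "response"): "q1",
        ("q2", "abnormal"): "qz",
        ("q1", "stop"): "qz",
    },
)

print(run(fsm, ["start", "query", "abnormal"]))
```

Keeping the transition conditions in a plain dictionary keyed by (state, event) makes the later merging of equivalent states a simple rewrite of keys and values.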
According to the embodiment of the invention, the finite state machine of the domain name protocol is generated according to the preset finite state machine model, which ensures that the finite state machine of the domain name protocol is standardized and facilitates the subsequent identification of domain name semantic anomalies.
In step S203, the state of the element information of the domain name protocol is inferred according to the finite state machine of the domain name protocol, and the state of the domain name is output.
In the embodiment of the present invention, the domain name protocol semantic abnormality at least includes the following cases:
Firstly, the number of levels in a domain name is usually no more than 5, and the first byte of each level's domain name label consists of a 2-bit flag and a 6-bit label length; domain name labels follow the host name naming rules. In real environments, host names longer than 25 characters account for only about 2% of the total, so abnormal behavior can be inferred if several consecutive domain name labels in a domain name network message exceed 25 characters.
Secondly, normal domain names usually use dictionary words or other character strings that look meaningful. A domain name that draws evenly on the various characters of the character set, or that contains runs of consecutive consonants and digits, is suspicious. Information entropy from information theory is used to describe the degree of disorder of the information: the larger the entropy value of a domain name, the more likely the domain name is abnormal.
Thirdly, when a domain name appears repeatedly in the same domain name protocol message, the domain name protocol specification allows the later occurrence of a domain name label to be represented by an offset pointer to the position where the earlier occurrence appears, relative to the beginning of the domain name protocol message. The implicit semantics are that the offset value must be less than the position of the offset pointer itself minus the beginning of the domain name message; otherwise, the message may contain covert channel traffic.
Fourthly, the domain name protocol data in the network flow data is parsed according to the domain name protocol specification. If no error occurs during parsing, the parse pointer must be positioned at the end of the User Datagram Protocol (UDP) payload when parsing finishes; otherwise, the data packet is not a legal domain name system message, and may be a domain name tunnel or contain data injected into slack space.
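The first, third and fourth cases above can be sketched as checks performed while walking a DNS message. This is a minimal illustration under stated assumptions, not the patented implementation: the names `read_name`, `check_payload` and `DnsAnomaly` are hypothetical, "several consecutive labels" is assumed to mean two or more (`MIN_RUN`), and the pointer hop limit `MAX_HOPS` is an added safeguard against pointer loops that the text does not mention.

```python
import struct

MAX_LABEL = 25   # threshold from the text: labels longer than 25 characters
MIN_RUN = 2      # assumption: "several consecutive labels" means at least 2
MAX_HOPS = 16    # assumption: guard against compression-pointer loops


class DnsAnomaly(Exception):
    """Raised when the message violates the parsing rules sketched here."""


def read_name(msg, pos):
    """Return (labels, next_pos) for the name that starts at offset pos."""
    labels, next_pos, hops = [], None, 0
    while True:
        if pos >= len(msg):
            raise DnsAnomaly("name runs past end of message")
        length = msg[pos]
        flag = length >> 6                 # top 2 bits of the first byte
        if flag == 0b11:                   # compression pointer
            if pos + 1 >= len(msg):
                raise DnsAnomaly("truncated compression pointer")
            target = ((length & 0x3F) << 8) | msg[pos + 1]
            if target >= pos:              # case 3: must point backwards
                raise DnsAnomaly("compression pointer does not point backwards")
            hops += 1
            if hops > MAX_HOPS:
                raise DnsAnomaly("too many compression pointers")
            if next_pos is None:
                next_pos = pos + 2         # parsing resumes after first pointer
            pos = target
        elif flag == 0b00:                 # ordinary label, 6-bit length
            if length == 0:                # root label ends the name
                return labels, (pos + 1 if next_pos is None else next_pos)
            end = pos + 1 + length
            if end > len(msg):
                raise DnsAnomaly("label runs past end of message")
            labels.append(msg[pos + 1:end].decode("ascii", "replace"))
            pos = end
        else:
            raise DnsAnomaly("reserved label flag bits")


def check_payload(payload):
    """Return a list of anomaly descriptions for one UDP payload."""
    findings, names = [], []
    try:
        if len(payload) < 12:
            raise DnsAnomaly("shorter than DNS header")
        qd, an, ns, ar = struct.unpack("!4H", payload[4:12])
        pos = 12
        for _ in range(qd):                # question section
            labels, pos = read_name(payload, pos)
            names.append(labels)
            pos += 4                       # QTYPE + QCLASS
        for _ in range(an + ns + ar):      # answer, authority, additional
            labels, pos = read_name(payload, pos)
            names.append(labels)
            if pos + 10 > len(payload):
                raise DnsAnomaly("truncated resource record")
            (rdlength,) = struct.unpack("!H", payload[pos + 8:pos + 10])
            pos += 10 + rdlength
        if pos != len(payload):            # case 4: must end at payload end
            findings.append("parse did not end at end of UDP payload")
    except DnsAnomaly as exc:
        return [str(exc)]
    for labels in names:                   # case 1: consecutive long labels
        run_len = 0
        for label in labels:
            run_len = run_len + 1 if len(label) > MAX_LABEL else 0
            if run_len >= MIN_RUN:
                findings.append("consecutive labels longer than 25 characters")
                break
    return findings
```

A well-formed query for `www.example.com` yields an empty list, while the same query with two extra trailing bytes is reported as not ending at the end of the UDP payload.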
As an embodiment of the present invention, after the finite state machine of the domain name protocol is obtained, the abnormal state of a domain name can be obtained by performing state deduction on the domain name; if a termination state of the finite state machine is reached, that state is output and the state space of the finite state machine is emptied.
Performing semantic deduction for the four cases above improves the accuracy of the anomaly analysis.
In the embodiment of the present invention, extracting the domain name protocol element information in the packet includes: forming a domain name protocol attribute vector by using the extracted protocol elements; and forming nodes of the state chain by using the extracted protocol elements, and determining a set of value ranges of the node state variables.
As an embodiment of the present invention, the extracted protocol elements are used to form a domain name protocol attribute vector describing the network flow data, giving a feature vector F = {network layer source address F1, network layer destination address F2, transport layer source port F3, transport layer destination port F4, domain name protocol stream identifier F5}. The elements are extracted by parsing the data flow protocol layer by layer according to the OSI reference model (Open Systems Interconnection Reference Model, OSI/RM): the network layer source address, network layer destination address, transport layer source port, transport layer destination port, domain name protocol identifier and domain name protocol variables are extracted as the values of the feature vector. A datagram is thereby mapped into a point set D on an N-dimensional vector space, where each dimension of D corresponds to one feature vector attribute of the datagram. After vectorization, all datagrams can be represented in the form of a Cartesian product:
$$\mathrm{Dom}(f_1) \times \mathrm{Dom}(f_2) \times \cdots \times \mathrm{Dom}(f_n)$$
where $\{f_1, f_2, \ldots, f_n\}$ is the n-dimensional feature vector of the datagram, and $\mathrm{Dom}(f_i)$ represents all values of the attribute $f_i$. Layer-by-layer parsing of the data flow protocol extracts the following elements:
network layer protocol element extraction decode_net (network layer) contains DecodeIP (Internet Protocol) and yields the network layer source address F1 and the network layer destination address F2;
transport layer protocol element extraction decode_trans (transport layer) contains DecodeUDP and yields the transport layer source port F3 and the transport layer destination port F4;
domain name protocol element extraction decode_dns contains DecodeDNS and yields the domain name protocol stream identifier F5 and the attribute values of the protocol variables.
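The layer-by-layer extraction listed above can be sketched in Python for a raw IPv4/UDP packet. The function names follow the decode_net, decode_trans and decode_dns steps in the text, but their signatures and return values are assumptions, as is the choice of the DNS transaction ID as the stream identifier F5; the text does not say which field serves as F5.

```python
import struct
from typing import NamedTuple, Optional


class FeatureVector(NamedTuple):
    f1_src_addr: int   # network layer source address
    f2_dst_addr: int   # network layer destination address
    f3_src_port: int   # transport layer source port
    f4_dst_port: int   # transport layer destination port
    f5_stream_id: int  # assumption: the DNS transaction ID


def decode_net(packet):
    """DecodeIP: return (F1, F2, transport bytes) for IPv4 carrying UDP."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return None
    ihl = (packet[0] & 0x0F) * 4           # IPv4 header length in bytes
    if ihl < 20 or len(packet) < ihl or packet[9] != 17:  # 17 = UDP
        return None
    src, dst = struct.unpack("!II", packet[12:20])
    return src, dst, packet[ihl:]


def decode_trans(segment):
    """DecodeUDP: return (F3, F4, UDP payload)."""
    if len(segment) < 8:
        return None
    sport, dport = struct.unpack("!HH", segment[:4])
    return sport, dport, segment[8:]


def decode_dns(payload):
    """DecodeDNS: return the stream identifier F5 (first 2 header bytes)."""
    if len(payload) < 12:
        return None
    (txid,) = struct.unpack("!H", payload[:2])
    return txid


def extract_features(packet) -> Optional[FeatureVector]:
    net = decode_net(packet)
    if net is None:
        return None
    src, dst, segment = net
    trans = decode_trans(segment)
    if trans is None:
        return None
    sport, dport, payload = trans
    txid = decode_dns(payload)
    if txid is None:
        return None
    return FeatureVector(src, dst, sport, dport, txid)


# Hand-built packet: 192.168.0.1:50000 -> 8.8.8.8:53, DNS ID 0xBEEF.
ip = struct.pack("!BBHHHBBHII", 0x45, 0, 40, 0, 0, 64, 17, 0,
                 0xC0A80001, 0x08080808)
udp = struct.pack("!HHHH", 50000, 53, 20, 0)
dns = struct.pack("!6H", 0xBEEF, 0x0100, 1, 0, 0, 0)
print(extract_features(ip + udp + dns))
```

Each decoder returns `None` on malformed input so that a packet that is not IPv4, not UDP, or too short for a DNS header is simply skipped.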
As an embodiment of the invention, the extracted protocol elements are used to form the nodes of the state chain and to determine the set D of the value ranges of all node state variables. The state variables include QFlag, QName, QType, QClass, RFlag, RLables, RAs[Anc], RAu[Nsc], Rad[Arc] and the like; the state variables are concrete data items in the computer language.
In the embodiment of the present invention, the performing state inference on the domain name protocol according to the finite state machine of the domain name protocol includes:
and calculating a Hash value according to the domain name protocol attribute vector, and inquiring or creating a state chain entrance.
As an embodiment of the present invention, Hash calculation is performed on the feature vector F = {network layer source address F1, network layer destination address F2, transport layer source port F3, transport layer destination port F4, domain name protocol flow identifier F5}, and a state chain entry is queried or created using the following Hash formula:
Hash=(F1|F2(F3<<16)|(F4<<16)|F5)%N
and after the Hash value is obtained through calculation, assigning values to the state node variables by utilizing the extracted protocol elements.
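The query-or-create step can be sketched as a small hash table of state chains. Two points are assumptions rather than facts from the text: the printed formula has no operator between F2 and (F3<<16), and this sketch assumes a bitwise OR there; N is taken to be the number of hash buckets, with a hypothetical value of 1024. The names `flow_hash`, `StateChainTable` and `lookup_or_create` are illustrative only.

```python
from collections import defaultdict

N_BUCKETS = 1024  # hypothetical value for N in the Hash formula


def flow_hash(fv, n=N_BUCKETS):
    """Hash of the feature vector (F1..F5).

    Assumption: the operator missing between F2 and (F3 << 16) in the
    printed formula is a bitwise OR.
    """
    f1, f2, f3, f4, f5 = fv
    return (f1 | f2 | (f3 << 16) | (f4 << 16) | f5) % n


class StateChainTable:
    """Hash buckets, each holding a chain of (feature vector, state node)."""

    def __init__(self, n=N_BUCKETS):
        self.n = n
        self.buckets = defaultdict(list)

    def lookup_or_create(self, fv):
        """Return (node, created) for the flow described by fv."""
        chain = self.buckets[flow_hash(fv, self.n)]
        for key, node in chain:
            if key == fv:                  # existing state chain entry
                return node, False
        node = {"state": "q0", "vars": {}}  # new entry in the initial state
        chain.append((fv, node))
        return node, True


table = StateChainTable()
flow = (0xC0A80001, 0x08080808, 50000, 53, 0xBEEF)
node, created = table.lookup_or_create(flow)
# Assign state node variables from extracted protocol elements.
node["vars"]["QName"] = "www.example.com"
```

Comparing the full feature vector inside a bucket keeps distinct flows apart even when their hash values collide.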
In the embodiment of the present invention, performing state inference on the element information of the domain name protocol according to the finite state machine of the domain name protocol includes:
and calculating the information entropy value of the domain name according to the information entropy formula and the domain name protocol element information.
As an embodiment of the present invention, the domain name information entropy value is calculated using the following information entropy formula,
$$H(X) = -\sum_{i=1}^{n} p_i \log p_i$$
where $n$ is the number of classes in the domain name character set $X$, and $p_i$ is the probability of occurrence of the $i$-th class of element in $X$.
Normal domain names start with a letter (case-insensitive) and end with a letter or a digit, and the internal characters can only be letters, digits and "-". When all digits are classified into one class, n is 28. The probability of occurrence of "-" is very small, and the probability of occurrence of the 25 keyboard characters other than letters, digits and "-" is taken as 0. For a domain name of length L, the larger its information entropy value, the higher the probability of abnormality.
According to the embodiment of the invention, the information entropy value of the domain name is calculated through the information entropy formula, so that the probability that the domain name is abnormal can be estimated, which improves the accuracy of domain name semantic anomaly analysis.
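The entropy calculation can be sketched as follows, using the 28 character classes described in the text (26 case-insensitive letters, all digits as one class, and "-"). The function names, the base-2 logarithm and the choice to ignore the "." separators between labels are assumptions made for this illustration.

```python
import math
from collections import Counter


def char_class(ch):
    """Map a character to its class: letters case-folded, digits merged."""
    if ch in "0123456789":
        return "0-9"          # all digits count as a single class
    return ch.lower()


def domain_entropy(name):
    """Shannon entropy of a domain name over the character classes.

    Assumptions: log base 2, and label separators (".") are ignored.
    """
    classes = [char_class(c) for c in name if c != "."]
    if not classes:
        return 0.0
    total = len(classes)
    counts = Counter(classes)
    return -sum((k / total) * math.log2(k / total) for k in counts.values())


print(domain_entropy("google"), domain_entropy("x7k2q9zb"))
```

A name made of one repeated character has entropy 0, and a name whose characters are spread evenly over many classes scores higher, which matches the statement that a larger entropy value indicates a higher probability of abnormality.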
In an embodiment of the present invention, generating the finite state machine of the domain name protocol according to the pre-generated finite state machine model specifically comprises: creating a finite state machine of the domain name protocol according to the definition of a preset domain name protocol semantic anomaly finite state machine; and simplifying the finite state machine of the domain name protocol.
As an embodiment of the present invention, the finite state machine M is a five-tuple, $M = (\Sigma, Q, V, D, T)$, where $\Sigma$ is a finite event table; $Q$ is a finite state set comprising an initial state $q_0$ and a final state $q_z$; $V$ is a finite set of state variables; $D$ is a finite set of state variable value ranges; and $T$ represents the state transition conditions in $M$. The finite events of domain name anomaly analysis are derived from the semantics of the domain name protocol specification, and all states and state transition conditions are extracted. The finite events are recorded as the finite event table $\Sigma$ = {start, query, abnormal, idle, response, normal, stop}; the states are recorded as the finite state set $Q = \{q_i \in Q \mid i = 0, 1, 2, \ldots\}$; and the state transition conditions are recorded as $T = \{t_i \in T \mid i = 0, 1, 2, \ldots\}$.
Fig. 4(a) shows a detailed state transition diagram suitable for embodiments of the present invention, where δ is 0.
In the embodiment of the invention, pairs of states representing the same behavior are merged repeatedly until only one state remains for each behavior in the state machine, yielding a usable simplified state machine. The specific implementation idea is as follows: starting the search from a given marked state, if a state with the same identifier attribute as the marked state is found, that state is marked as a state to be merged and is merged into the target state. To prevent loss, the out-degree and in-degree relations of the state to be merged are also merged into the target state; that is, the outgoing transitions of all states to be merged are merged into the outgoing transitions of the target state, and the incoming transitions of all states to be merged are redirected to the target state. The state machine of Fig. 4(a) can thus be simplified to Fig. 4(b), which is the state transition diagram after simplification.
FIG. 4(c) illustrates a state transition process diagram for anomaly analysis suitable for embodiments of the present invention;
FIG. 4(d) shows a state transition process diagram for yet another anomaly analysis suitable for embodiments of the present invention;
In the implementation of the invention, the four similar states used for anomaly judgment in domain name string parsing, namely the request area, the response area of the resource record, the authority area of the resource record and the additional area of the resource record, can be combined into one, thereby simplifying the state machine model. The simplification preserves the interaction structure of the original protocol state machine while merging states with similar behavioral meaning, and reduces the scale of the state machine.
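The merging of states with the same behavior can be sketched in Python as a rewrite of the transition table. The state names, the behavior identifiers and the function name `merge_equivalent_states` are hypothetical; the sketch assumes that the first state listed for each behavior becomes the target state and that merging must not make the machine nondeterministic.

```python
def merge_equivalent_states(transitions, behavior):
    """Merge states that share a behavior identifier.

    transitions: dict mapping (state, event) -> next state
    behavior:    dict mapping state -> behavior identifier; the first state
                 listed for each identifier becomes the target state
    Returns (merged transitions, mapping from each state to its target).
    """
    target_of_behavior = {}
    target = {}
    for state, ident in behavior.items():
        target[state] = target_of_behavior.setdefault(ident, state)

    merged = {}
    for (src, event), dst in transitions.items():
        key = (target[src], event)         # fold outgoing transitions
        new_dst = target[dst]              # redirect incoming transitions
        if merged.setdefault(key, new_dst) != new_dst:
            raise ValueError(f"merging makes {key} nondeterministic")
    return merged, target


# Hypothetical example: the four message areas share one parsing behavior.
behavior = {
    "start": "init",
    "question": "parse_name",
    "answer": "parse_name",
    "authority": "parse_name",
    "additional": "parse_name",
    "done": "final",
}
transitions = {
    ("start", "query"): "question",
    ("question", "response"): "answer",
    ("answer", "response"): "authority",
    ("authority", "response"): "additional",
    ("additional", "stop"): "done",
}
merged, target = merge_equivalent_states(transitions, behavior)
print(len(transitions), "->", len(merged))
```

In this example the five original transitions collapse to three, because the four area states become a single state with a self-loop on the response event.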
According to the embodiment of the invention, the finite state machine is simplified, so that the calculation amount of domain name semantic anomaly analysis is greatly reduced, the calculation errors are reduced, and the accuracy of anomaly analysis is further improved.
Fig. 5 is a schematic structural diagram of an apparatus for analyzing semantic anomalies of domain names according to an embodiment of the present invention, which is detailed as follows:
an information receiving unit 501, configured to receive a domain name network data packet, and extract element information of a domain name protocol in the packet.
In the embodiment of the present invention, the terminal 120 may obtain the domain name network data message from network flow data, or directly from a physical network card, and extract the domain name protocol element information in the message. A domain name, commonly known as a website address, is globally unique. Because domain names are easy to remember, people usually use them to access websites, and a domain name server resolves a domain name to the Internet Protocol address mapped to it. The domain name protocol element information includes the domain name protocol attribute vector describing the network flow data, the state of the domain name, the transition conditions of the domain name state, and the like.
As an embodiment of the present invention, the terminal 120 is a traffic capturing device configured in a data network, including but not limited to an intelligent device such as a personal computer and a mobile phone, and after acquiring domain name information in the data network, the terminal 120 analyzes the domain name information to obtain a domain name address, domain name protocol attribute vectors, a state of a domain name, a conversion condition of the state of the domain name, and other domain name protocol element information, and stores the domain name address, the domain name protocol attribute vectors, the state of the domain name, and the conversion condition of the state of the domain name on the terminal 120 for subsequent domain name semantic anomaly analysis.
In the embodiment of the invention, the domain name is captured and analyzed through the terminal, so that preparation is made for subsequent domain name semantic analysis.
An information processing unit 502, configured to generate a finite state machine of the domain name protocol according to a pre-generated finite state machine model;
In the embodiment of the invention, a finite state machine model is preset, and the finite state machine of the domain name protocol is generated according to the domain name information acquired in the preceding step.
In the embodiment of the present invention, a finite state machine refers to a mathematical model representing a finite number of states and the behaviors, such as transitions and actions, between those states. The pre-generated finite state machine is derived from a local wireless network protocol; because the wireless network protocol is standardized, the derived finite state machine is also standardized. The finite state machine is defined independently of any particular device and is applicable to every device. By contrast, when a fault data interface is debugged by a method defined by the device itself, the interfaces of different devices are difficult to unify. The pre-generated finite state machine model allows various devices to be debugged uniformly, which effectively solves this problem of interfaces being difficult to unify.
As an embodiment of the present invention, the finite state machine M is a five-tuple, $M = (\Sigma, Q, V, D, T)$, where $\Sigma$ is a finite event table; $Q$ is a finite state set comprising an initial state $q_0$ and a final state $q_z$; $V$ is a finite set of state variables; $D$ is a finite set of state variable value ranges; and $T$ represents the state transition conditions in $M$. The finite events of domain name anomaly analysis are derived from the semantics of the domain name protocol specification, and all states and state transition conditions are extracted. The finite events are recorded as the finite event table $\Sigma$ = {start, query, abnormal, idle, response, normal, stop}; the states are recorded as the finite state set $Q = \{q_i \in Q \mid i = 0, 1, 2, \ldots\}$; and the state transition conditions are recorded as $T = \{t_i \in T \mid i = 0, 1, 2, \ldots\}$.
Fig. 3 shows a state transition schematic suitable for an embodiment of the present invention, where state 0, state 1 and state 2 represent 3 states in a finite state machine, T represents a condition for making a state transition, and δ and θ are a set of state transitions.
According to the embodiment of the invention, the finite state machine of the domain name protocol is generated according to the preset finite state machine model, which ensures that the finite state machine of the domain name protocol is standardized and facilitates the subsequent identification of domain name semantic anomalies.
An information output unit 503, configured to perform state inference on the element information of the domain name protocol according to the finite state machine of the domain name protocol, and output the state of the domain name.
In the embodiment of the present invention, the domain name protocol semantic abnormality at least includes the following cases:
First, the number of domain name levels is usually no more than 5, and the first byte of each domain name label consists of a 2-bit flag and a 6-bit label length; domain name labels follow the host name naming rules. In real environments, only about 2% of host names are longer than 25 characters, so abnormal behavior can be inferred if several consecutive domain name labels in a domain name network message exceed 25 characters.
Secondly, normal domain names usually use dictionary words or other strings that look meaningful. If a domain name instead draws evenly on the characters of the character set, or contains runs of consecutive consonants and digits, its degree of disorder can be described by the information entropy of information theory: the larger the entropy value of the domain name, the more likely the domain name is abnormal.
Thirdly, when a domain name appears repeatedly in the same domain name protocol message, the domain name protocol specification allows a later occurrence of the domain name label to be represented by an offset pointer to the position where the earlier occurrence appears, measured from the beginning of the domain name protocol message. The implicit semantics are that the offset value must be less than the position of the offset pointer itself relative to the beginning of the message; otherwise the message may contain covert channel traffic.
Fourthly, the domain name protocol data in the network traffic is parsed according to the domain name protocol specification. If no error occurs during parsing, the pointer must be located at the end of the User Datagram Protocol (UDP) payload when parsing finishes; otherwise the data packet is not a legal domain name system packet and may be a domain name tunnel or carry data injected into slack space.
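The first, third and fourth cases above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented method: the function names read_name, label_anomalies and slack_bytes are hypothetical, the thresholds 5 and 25 are the ones quoted in the text, "several consecutive labels" is assumed to mean at least two, and the pointer rule is applied slightly more strictly than the text states (a pointer must target a position before the start of the name segment being read, which also guarantees that parsing terminates).

```python
import struct
from typing import List, Tuple

MAX_LEVELS = 5    # level threshold quoted in the text
LONG_LABEL = 25   # label length threshold quoted in the text


def read_name(msg: bytes, offset: int) -> Tuple[List[str], int]:
    """Decode one domain name; return (labels, offset just after the name)."""
    labels: List[str] = []
    pos, limit, end = offset, offset, None
    while True:
        if pos >= len(msg):
            raise ValueError("name runs past end of message")
        first = msg[pos]
        flag, length = first >> 6, first & 0x3F   # 2-bit flag, 6-bit length
        if flag == 0b11:                          # compression pointer
            if pos + 1 >= len(msg):
                raise ValueError("truncated compression pointer")
            target = (length << 8) | msg[pos + 1]
            if target >= limit:                   # case 3: must point backward
                raise ValueError("compression pointer does not point backward")
            if end is None:
                end = pos + 2                     # name ends after first pointer
            pos = limit = target
            continue
        if flag != 0b00:
            raise ValueError("reserved label flag bits")
        if length == 0:                           # root label terminates name
            pos += 1
            break
        if pos + 1 + length > len(msg):
            raise ValueError("label runs past end of message")
        labels.append(msg[pos + 1:pos + 1 + length].decode("ascii", "replace"))
        pos += 1 + length
    return labels, (end if end is not None else pos)


def label_anomalies(labels: List[str]) -> List[str]:
    """Case 1: too many levels, or consecutive over-long labels."""
    notes = []
    if len(labels) > MAX_LEVELS:
        notes.append("more than 5 levels")
    run = 0
    for label in labels:
        run = run + 1 if len(label) > LONG_LABEL else 0
        if run >= 2:   # assumption: "several" means at least two in a row
            notes.append("consecutive labels longer than 25 characters")
            break
    return notes


def slack_bytes(msg: bytes) -> int:
    """Case 4: bytes left over after the last record of the message."""
    if len(msg) < 12:
        raise ValueError("shorter than the 12-byte header")
    _id, _flags, qd, an, ns, ar = struct.unpack("!6H", msg[:12])
    pos = 12
    for _ in range(qd):                 # question: name + type + class
        _, pos = read_name(msg, pos)
        pos += 4
    for _ in range(an + ns + ar):       # record: name + 10 fixed bytes + rdata
        _, pos = read_name(msg, pos)
        if pos + 10 > len(msg):
            raise ValueError("truncated resource record")
        (rdlength,) = struct.unpack("!H", msg[pos + 8:pos + 10])
        pos += 10 + rdlength
    if pos > len(msg):
        raise ValueError("record data runs past end of message")
    return len(msg) - pos
```

A nonzero result from slack_bytes, or a ValueError raised while parsing, corresponds to the situation described in the fourth case, where the message is not a well-formed domain name system packet.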
As an embodiment of the present invention, after a finite state machine of a domain name protocol is obtained, an abnormal state of a domain name can be obtained by performing state deduction on the domain name, and if a certain termination state of the finite state machine is reached, the state is output and a state space of the finite state machine is emptied.
The accuracy of the anomaly analysis can be improved by performing semantic deduction on the above 4 cases.
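The entropy measure used in the second case can be sketched as follows. The text names only "information entropy"; the use of the Shannon formula H = -Σ p_i log2 p_i over character frequencies, the removal of dots, the lower-casing, and the function name shannon_entropy are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(name: str) -> float:
    """Character-level Shannon entropy of a domain name, in bits."""
    text = name.replace(".", "").lower()   # assumption: ignore label separators
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A name that spreads its characters evenly across the character set yields a higher value than a name built from repeated letters, which is the property the second case relies on. Any decision threshold would have to be chosen from real traffic and is not given in the text.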
The embodiment of the invention provides a device capable of carrying out abnormity analysis on domain name semantics, which can efficiently and accurately judge the abnormal state of a domain name.
FIG. 6 shows an internal block diagram of a computer device suitable for embodiments of the present invention, detailed as follows:
in the embodiment of the present invention, the computer device may specifically be the terminal 120 in fig. 1. As shown in fig. 6, the computer apparatus includes a processor, a memory, a network interface, an input device, and a display screen connected through a system bus. Wherein the memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program, which when executed by the processor, causes the processor to implement a method for domain name semantic anomaly recognition. The internal memory may also have stored therein a computer program that, when executed by the processor, causes the processor to perform a method for domain name semantic anomaly identification. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
The embodiment of the present invention further provides a computer device, where the computer device includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, and the processor implements the following steps when executing the computer program:
receiving a domain name network data message, and extracting element information of a domain name protocol in the message;
generating a finite state machine of the domain name protocol according to a pre-generated finite state machine model;
and performing state inference on the element information of the domain name protocol according to the finite state machine of the domain name protocol, and outputting the state of the domain name.
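The first of these steps, extracting element information from a received message, can be sketched for the fixed 12-byte header of a domain name system message. The description does not enumerate which protocol elements are extracted, so the selection of fields, the dictionary layout, and the function names extract_header and event_for are assumptions. The mapping to the "query" and "response" events reuses two entries of the event table Σ.

```python
import struct
from typing import Dict


def extract_header(msg: bytes) -> Dict[str, int]:
    """Extract header elements from a DNS message (fixed 12-byte header)."""
    if len(msg) < 12:
        raise ValueError("shorter than the 12-byte header")
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", msg[:12])
    return {
        "id": ident,
        "qr": flags >> 15,               # 0 = query, 1 = response
        "opcode": (flags >> 11) & 0xF,
        "rcode": flags & 0xF,
        "qdcount": qd,                   # entries in the question section
        "ancount": an,                   # records in the answer section
        "nscount": ns,                   # records in the authority section
        "arcount": ar,                   # records in the additional section
    }


def event_for(header: Dict[str, int]) -> str:
    """Map a parsed header to an event name from the event table."""
    return "query" if header["qr"] == 0 else "response"
```

The resulting dictionary could serve as part of the attribute vector on which state inference is performed; how the remaining elements are encoded is left open here.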
An embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the processor is caused to execute the following steps:
receiving a domain name network data message, and extracting element information of a domain name protocol in the message;
generating a finite state machine of the domain name protocol according to a pre-generated finite state machine model;
and performing state inference on the element information of the domain name protocol according to the finite state machine of the domain name protocol, and outputting the state of the domain name.
It should be understood that, although the steps in the flowcharts of the embodiments of the present invention are shown in the sequence indicated by the arrows, the steps are not necessarily performed in that sequence. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and described and may be performed in other orders. Moreover, at least a portion of the steps in various embodiments may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and these sub-steps or stages are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and which, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several embodiments of the present invention, and although their description is relatively specific and detailed, it should not be construed as limiting the scope of the present invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of protection of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (4)

1. A domain name semantic anomaly analysis method is characterized by comprising the following steps:
receiving a domain name network data message, and extracting element information of a domain name protocol in the message; the extracting of the domain name protocol element information in the message includes: forming a domain name protocol attribute vector by using the extracted protocol elements; forming nodes of a state chain by using the extracted protocol elements, and determining a set of value ranges of the node state variables;
generating a finite state machine of the domain name protocol according to a pre-generated finite state machine model; the state inference of the domain name protocol according to the finite state machine of the domain name protocol comprises: calculating a Hash value according to the domain name protocol attribute vector, and inquiring or creating a state connection port;
performing state inference on the element information of the domain name protocol according to the finite state machine of the domain name protocol, and outputting the state of the domain name; the state inference on the element information of the domain name protocol according to the finite state machine of the domain name protocol comprises the following steps: calculating the information entropy value of the domain name according to the information entropy formula and the domain name protocol element information; and performing semantic deduction on the domain name according to the Hash value and the information entropy value to obtain state information of the domain name; the generating of the finite state machine of the domain name protocol according to the pre-generated finite state machine model comprises: creating a finite state machine of the domain name protocol according to the definition of a preset domain name protocol semantic exception finite state machine; and simplifying the finite state machine of the domain name protocol; the simplifying of the finite state machine of the domain name protocol comprises: combining the 4 similar states of the request area, the response area of the resource record, the authority area of the resource record and the additional area of the resource record in the finite state machine of the domain name protocol into 1 state, thereby simplifying the finite state machine of the domain name protocol.
2. An apparatus using the domain name semantic anomaly analysis method according to claim 1, comprising:
the information receiving unit is used for receiving a domain name network data message and extracting the element information of a domain name protocol in the message;
the information processing unit is used for generating a finite state machine of the domain name protocol according to a pre-generated finite state machine model;
and the information output unit is used for performing state inference on the element information of the domain name protocol according to the finite state machine of the domain name protocol and outputting the state of the domain name.
3. A computer arrangement comprising a memory and a processor, the memory having stored therein a computer program which, when executed by the processor, causes the processor to carry out the steps of a domain name semantic anomaly analysis method as claimed in claim 1.
4. A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, causes the processor to carry out the steps of a method of domain name semantic anomaly analysis as claimed in claim 1.
CN201910226208.8A 2019-03-25 2019-03-25 Domain name semantic anomaly analysis method and device, computer equipment and storage medium thereof Active CN109981818B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910226208.8A CN109981818B (en) 2019-03-25 2019-03-25 Domain name semantic anomaly analysis method and device, computer equipment and storage medium thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910226208.8A CN109981818B (en) 2019-03-25 2019-03-25 Domain name semantic anomaly analysis method and device, computer equipment and storage medium thereof

Publications (2)

Publication Number Publication Date
CN109981818A CN109981818A (en) 2019-07-05
CN109981818B true CN109981818B (en) 2022-02-25

Family

ID=67080231

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910226208.8A Active CN109981818B (en) 2019-03-25 2019-03-25 Domain name semantic anomaly analysis method and device, computer equipment and storage medium thereof

Country Status (1)

Country Link
CN (1) CN109981818B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112307167A (en) * 2020-10-30 2021-02-02 广州华多网络科技有限公司 Text sentence cutting method and device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1573782A (en) * 2003-06-23 2005-02-02 微软公司 Advanced spam detection techniques
CN102752154A (en) * 2012-07-29 2012-10-24 西北工业大学 Detecting method of dead link of Web site
CN105141598A (en) * 2015-08-14 2015-12-09 中国传媒大学 APT (Advanced Persistent Threat) attack detection method and APT attack detection device based on malicious domain name detection
CN107733851A (en) * 2017-08-23 2018-02-23 刘胜利 DNS tunnels Trojan detecting method based on communication behavior analysis

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7626940B2 (en) * 2004-12-22 2009-12-01 Intruguard Devices, Inc. System and method for integrated header, state, rate and content anomaly prevention for domain name service

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Detection of DNS Anomalies using Flow Data Analysis; Anestis Karasaridis et al.; IEEE Globecom 2006; 2006-12-30; full text *

Also Published As

Publication number Publication date
CN109981818A (en) 2019-07-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant