CN114143084B - Malicious domain name judging method and device, electronic equipment and storage medium - Google Patents

Malicious domain name judging method and device, electronic equipment and storage medium

Info

Publication number
CN114143084B
Authority
CN
China
Prior art keywords
domain name
domain
index
vector value
port
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111445685.7A
Other languages
Chinese (zh)
Other versions
CN114143084A (en
Inventor
许梦磊
童志明
肖新光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Antiy Technology Group Co Ltd
Original Assignee
Antiy Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Antiy Technology Group Co Ltd
Priority to CN202111445685.7A
Publication of CN114143084A
Application granted
Publication of CN114143084B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/1483 Countermeasures against malicious traffic service impersonation, e.g. phishing, pharming or web spoofing

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The embodiment of the invention discloses a malicious domain name judging method and device, electronic equipment, and a storage medium, relates to the technical field of network security, and enables predictive judgment of malicious domain names. The malicious domain name judging method comprises the following steps: acquiring a domain name to be judged; analyzing the domain name to be judged to obtain a characterization domain vector value and a behavior domain vector value; converting the characterization domain vector value and the behavior domain vector value into an x-axis coordinate point and a y-axis coordinate point in a plane rectangular coordinate system, respectively; combining the x-axis coordinate point and the y-axis coordinate point to obtain the coordinate position of the domain name to be judged in the plane rectangular coordinate system; converting the coordinate position into a coordinate position in a preset Poincaré disc; calculating the distance between the converted coordinate position and the center of the preset Poincaré disc; and judging the malicious degree of the domain name according to the distance. The method is suitable for judging malicious c2 domain names.

Description

Malicious domain name judging method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of network security technologies, and in particular, to a malicious domain name determining method, a malicious domain name determining device, an electronic device, and a storage medium.
Background
With the rapid development of computer technology and Internet communication over the last decades, online payment and online shopping have become an integral part of people's lives. The accumulation of capital inevitably attracts lawless persons, and phishing and various malicious domain names emerge one after another. Traditional malicious domain name determination mostly relies on existing knowledge bases and therefore cannot, to a certain extent, achieve predictive determination.
Disclosure of Invention
In view of the above, embodiments of the present invention provide a malicious domain name determination method, a malicious domain name determination device, an electronic device, and a storage medium that can implement predictive determination.
In a first aspect, an embodiment of the present invention provides a malicious domain name determining method, including:
acquiring a domain name to be judged;
analyzing the domain name to be judged to obtain a characterization domain vector value and a behavior domain vector value, wherein the characterization domain comprises at least one of average length of meaningful words in the domain name, number of the meaningful words in the domain name, number of confusion characters in the domain name and average adjacent degree of the confusion characters in the domain name, and the behavior domain comprises at least one of domain name connection duration proportion, domain name connection time period, whether a domain name connection port is a protocol standard port and whether the domain name connection port is a contracted common port;
converting the characterization domain vector value and the behavior domain vector value into an x-axis coordinate point and a y-axis coordinate point in a plane rectangular coordinate system respectively;
combining the x-axis coordinate point and the y-axis coordinate point to obtain a coordinate position of the domain name to be determined in a plane rectangular coordinate system after conversion;
converting the coordinate position into a coordinate position in a preset Poincaré disc;
calculating the distance between the coordinate position in the converted Poincaré disc and the center of the preset Poincaré disc;
and judging the malicious degree of the domain name according to the distance.
With reference to the first aspect, in an implementation manner of the first aspect, the analyzing the domain name to be judged to obtain a characterization domain vector value and a behavior domain vector value includes:
analyzing the domain name to be judged to obtain a meaning index and a confusion index, and combining the meaning index and the confusion index to obtain the characterization domain vector value, wherein the meaning index is calculated by at least one of the average length of meaning words in the domain name and the number of meaning words in the domain name, and the confusion index is calculated by at least one of the number of confusion characters in the domain name and the average adjacent degree of confusion characters in the domain name;
analyzing the domain name to be judged to obtain a time index and a port index, and combining the time index and the port index to obtain the behavior domain vector value, wherein the time index is obtained by calculating at least one of a domain name connection duration proportion and a domain name connection period, and the port index is obtained by calculating at least one of whether a domain name connection port is a protocol standard port and whether the domain name connection port is a contracted common port.
With reference to the first aspect, in another implementation manner of the first aspect, the converting the characterization domain vector value and the behavior domain vector value into an x-axis coordinate point and a y-axis coordinate point in a plane rectangular coordinate system includes:
assigning weight vectors to the characterization domain vector value and the behavior domain vector value to obtain, respectively, an x-axis coordinate point and a y-axis coordinate point in the plane rectangular coordinate system.
With reference to the first aspect, in a further implementation manner of the first aspect, the determining, according to the distance, a malicious degree of the domain name includes:
and judging the malicious degree of the domain name according to the relation between the distance and a preset threshold value.
In a second aspect, an embodiment of the present invention provides a malicious domain name determining apparatus, including:
the acquisition module is used for acquiring the domain name to be judged;
the analysis module is used for analyzing the domain name to be judged to obtain a characterization domain vector value and a behavior domain vector value, wherein the characterization domain comprises at least one of average length of meaningful words in the domain name, number of the meaningful words in the domain name, number of confusion characters in the domain name and average adjacent degree of the confusion characters in the domain name, and the behavior domain comprises at least one of domain name connection duration proportion, domain name connection time period, whether a domain name connection port is a protocol standard port and whether the domain name connection port is a contracted common port;
the first conversion module is used for respectively converting the characterization domain vector value and the behavior domain vector value into an x-axis coordinate point and a y-axis coordinate point in a plane rectangular coordinate system;
the combination module is used for combining the x-axis coordinate point and the y-axis coordinate point to obtain the coordinate position of the domain name to be determined in the plane rectangular coordinate system after the domain name to be determined is converted;
the second conversion module is used for converting the coordinate position into a coordinate position in a preset Poincaré disc;
the computing module is used for computing the distance between the coordinate position in the converted Poincaré disc and the center of the preset Poincaré disc;
and the judging module is used for judging the malicious degree of the domain name according to the distance.
With reference to the second aspect, in an implementation manner of the second aspect, the analysis module includes:
the first analysis subunit is used for analyzing the domain name to be judged to obtain a meaning index and a confusion index, and combining the meaning index and the confusion index to obtain the characterization domain vector value, wherein the meaning index is calculated by at least one of the average length of meaning words in the domain name and the number of meaning words in the domain name, and the confusion index is calculated by at least one of the number of confusion characters in the domain name and the average adjacent degree of confusion characters in the domain name;
the second analysis subunit is configured to analyze the domain name to be determined to obtain a time index and a port index, and combine the time index and the port index to obtain the behavior domain vector value, where the time index is obtained by calculating at least one of a domain name connection duration proportion and a domain name connection period, and the port index is obtained by calculating at least one of whether the domain name connection port is a protocol standard port and whether the domain name connection port is a contracted common port.
With reference to the second aspect, in another implementation manner of the second aspect, the first conversion module includes:
and the conversion subunit is used for assigning weight vectors to the characterization domain vector value and the behavior domain vector value to obtain, respectively, an x-axis coordinate point and a y-axis coordinate point in the plane rectangular coordinate system.
With reference to the second aspect, in a further implementation manner of the second aspect, the determining module includes:
and the judging subunit is used for judging the malicious degree of the domain name according to the relation between the distance and a preset threshold value.
In a third aspect, an embodiment of the present invention provides an electronic device, including: the device comprises a shell, a processor, a memory, a circuit board and a power circuit, wherein the circuit board is arranged in a space surrounded by the shell, and the processor and the memory are arranged on the circuit board; a power supply circuit for supplying power to each circuit or device of the electronic apparatus; the memory is used for storing executable program codes; the processor executes a program corresponding to the executable program code by reading the executable program code stored in the memory for performing any of the methods described above.
In a fourth aspect, embodiments of the present invention also provide a computer-readable storage medium storing one or more programs executable by one or more processors to implement any of the methods described above.
According to the malicious domain name judging method and device, electronic equipment, and storage medium provided by the embodiments of the invention, a domain name to be judged is first obtained and analyzed to obtain a characterization domain vector value and a behavior domain vector value. The two vector values are converted into an x-axis coordinate point and a y-axis coordinate point in a plane rectangular coordinate system, respectively, and the two points are combined to obtain the coordinate position of the domain name to be judged in that coordinate system. The coordinate position is then converted into a coordinate position in a preset Poincaré disc, the distance between the converted position and the center of the preset Poincaré disc is calculated, and finally the malicious degree of the domain name is judged according to the distance. In this way, the embodiment of the invention converts the <characterization, behavior> attribute values of the domain name to be judged into the Poincaré disc model and judges the malicious degree of the domain name according to the distance between its coordinate position in the Poincaré disc and the center of the disc, thereby realizing predictive judgment. The embodiment of the invention is a domain-driven (characterization and behavior) malicious domain name judging method and is particularly suitable for judging malicious c2 domain names.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a malicious domain name determination method of the present invention;
FIG. 2 is a flow chart of an embodiment of a malicious domain name determination method according to the present invention;
FIG. 3 is a schematic diagram of a Poincaré disc in an embodiment of a malicious domain name determination method of the present invention;
FIG. 4 is a schematic diagram of a malicious domain name determining apparatus according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an embodiment of the electronic device of the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
It should be understood that the described embodiments are merely some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
First, the principle of the embodiment of the present invention will be briefly described.
As shown in FIG. 1, the embodiment of the invention is a domain-driven malicious domain name judging method, where a domain is a mathematical abstraction of specific attributes (characterization and behavior) of a malicious domain name. Existing malicious c2 (Command and Control; this mainly refers to the remote control trojan scenario) domain names are used to analyze the related information generated during server-client interaction in network communication, and a malicious domain name judgment model (the Poincaré disc model) is constructed. The domain attributes (characterization and behavior) of the domain name to be judged are extracted to form an x-axis coordinate point and a y-axis coordinate point in a plane rectangular coordinate system, the two points are combined into a coordinate position in that coordinate system, and the coordinate position is then interpolated/mapped into the malicious judgment model to judge the malicious degree. The embodiment of the invention thus provides a processing mode combining abstraction and interfacing, which can effectively judge unknown c2 domain names based on an existing malicious c2 domain name library and realize predictive judgment.
In one aspect, an embodiment of the present invention provides a malicious domain name determining method, as shown in fig. 2, where the method in this embodiment may include:
step 101: acquiring a domain name to be judged;
step 102: analyzing the domain name to be judged to obtain a characterization domain vector value and a behavior domain vector value, wherein the characterization domain comprises at least one of average length of meaningful words in the domain name, number of the meaningful words in the domain name, number of confusion characters in the domain name and average adjacent degree of the confusion characters in the domain name, and the behavior domain comprises at least one of domain name connection duration proportion, domain name connection time period, whether a domain name connection port is a protocol standard port and whether the domain name connection port is a contracted common port;
in this step, the attribute value of "characterization" and the attribute value of "behavior" of the domain name to be determined are represented by the characterization domain vector value and the behavior domain vector value, respectively.
As an alternative embodiment, the analyzing the domain name to be determined to obtain a characterization domain vector value and a behavior domain vector value (step 102) may include:
step 1021: analyzing the domain name to be judged to obtain a meaning index and a confusion index, and combining the meaning index and the confusion index to obtain the characterization domain vector value, wherein the meaning index is calculated by at least one of the average length of meaning words in the domain name and the number of meaning words in the domain name, and the confusion index is calculated by at least one of the number of confusion characters in the domain name and the average adjacent degree of confusion characters in the domain name;
in this step, both the meaning index and the confusion index represent the malicious degree of the domain name. The meaning index may be calculated, for example, as follows:
IF len > 3 AND count > 3 THEN mi = 0.8
wherein, len: average length of meaningful words in domain name, count: number of meaningful words in domain name, mi: meaning index.
Whether the words in the domain name have meanings or not can be judged in a matching way through a pre-established word dictionary; in this example, the threshold value of the average length of the meaning words is 3, the threshold value of the number of the meaning words is also 3, and the setting of the threshold value can be obtained by carrying out statistical analysis based on malicious domain names in the existing malicious c2 domain name library; in this example, the meaning index is set to 0.8, and the value can be flexibly set according to the requirement, and can be set to a value between 0 and 1 for facilitating subsequent unified processing.
In the above example, the meaning index mi is a specific value, and it is understood that a range may be substituted for the specific value (the range still needs to be empirically determined as a point value for easy calculation in the subsequent calculation), and the method for calculating the meaning index may be, for example, as follows:
IF len > 3 AND count > 3 THEN mi > 0.75 AND mi < 1
IF len > 3 AND count <= 3 THEN mi > 0.5 AND mi <= 0.75
IF len <= 3 AND count > 3 THEN mi > 0.5 AND mi <= 0.75
IF len <= 3 AND count <= 3 THEN mi > 0 AND mi <= 0.5
In this example, a smaller meaning index indicates a higher degree of maliciousness, and a larger meaning index indicates a lower degree.
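The four range rules above can be sketched as a small lookup. This is a minimal illustration, not the source's implementation: the function name `meaning_index_range` and the choice to return a `(low, high)` tuple are assumptions, and the thresholds default to the example value 3.

```python
def meaning_index_range(avg_len, count, len_threshold=3, count_threshold=3):
    """Return the (low, high) range of the meaning index mi.

    avg_len: average length of the meaningful words in the domain name
    count:   number of meaningful words in the domain name
    The thresholds default to the example value 3 used in the rules.
    """
    long_words = avg_len > len_threshold
    many_words = count > count_threshold
    if long_words and many_words:
        return (0.75, 1.0)   # 0.75 < mi < 1
    if long_words or many_words:
        return (0.5, 0.75)   # 0.5 < mi <= 0.75
    return (0.0, 0.5)        # 0 < mi <= 0.5


# Two meaningful words with an average length of 3 fall in the lowest range.
print(meaning_index_range(avg_len=3, count=2))
```

As the description notes, a point value inside the returned range still has to be chosen empirically before the later steps.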
The method for calculating the confusion index may be, for example, as follows:
IF count <= 2 AND degree > 2 THEN ci > 0.75 AND ci < 1
IF count > 2 AND degree > 2 THEN ci > 0.5 AND ci <= 0.75
IF count <= 2 AND degree <= 2 THEN ci > 0.5 AND ci <= 0.75
IF count > 2 AND degree <= 2 THEN ci > 0 AND ci <= 0.5
where count: number of confusing characters, degree: average adjacent degree, ci: confusion index.
The number of confusing characters can be determined by matching against a pre-established word dictionary, and corresponding matching rules (such as bad character rules, good suffix rules and the like) can be designed. In this example, the thresholds for count and degree are likewise obtained by statistical analysis of the malicious domain names in an existing malicious c2 domain name library; the value of the confusion index can be set flexibly as required, and is also set to a value between 0 and 1 to facilitate subsequent unified processing.
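A minimal sketch of the confusion-index inputs and rules follows. It makes several assumptions not stated in the source: the confusable set contains only the digits 0 and 1, the average adjacent degree is the mean number of characters between consecutive confusing characters, and a domain with fewer than two confusing characters gets a degree of 0. The names `CONFUSABLE`, `confusion_features` and `confusion_index_range` are hypothetical.

```python
# Assumption: only the digit/letter pairs 0~o and 1~l are treated as confusable.
CONFUSABLE = {"0": "o", "1": "l"}


def confusion_features(domain):
    """Return (count, average adjacent degree) of confusing characters."""
    positions = [i for i, ch in enumerate(domain) if ch in CONFUSABLE]
    count = len(positions)
    if count < 2:
        return count, 0.0  # assumption: no pair means a degree of 0
    # Number of characters separating each consecutive pair.
    gaps = [b - a - 1 for a, b in zip(positions, positions[1:])]
    return count, sum(gaps) / len(gaps)


def confusion_index_range(count, degree):
    """Return the (low, high) range of the confusion index ci."""
    few = count <= 2
    spread = degree > 2
    if few and spread:
        return (0.75, 1.0)   # 0.75 < ci < 1
    if not few and not spread:
        return (0.0, 0.5)    # 0 < ci <= 0.5
    return (0.5, 0.75)       # 0.5 < ci <= 0.75


count, degree = confusion_features("test.g0.1f.com")
print(count, degree, confusion_index_range(count, degree))
```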
Step 1022: analyzing the domain name to be judged to obtain a time index and a port index, and combining the time index and the port index to obtain the behavior domain vector value, wherein the time index is obtained by calculating at least one of a domain name connection duration proportion and a domain name connection period, and the port index is obtained by calculating at least one of whether a domain name connection port is a protocol standard port and whether the domain name connection port is a contracted common port.
In this step, the time index and the port index also represent the malicious degree of the domain name, and the calculation method is similar to that of the meaning index and the confusion index.
The method for calculating the time index may be, for example, as follows:
IF ratio IN [0, 0.25) AND time IN [18:00, 00:00) THEN ti > 0.75 AND ti < 1
IF ratio IN [0.75, 1] AND time IN [00:00, 05:00) THEN ti > 0 AND ti <= 0.25
where ratio: domain name connection duration proportion, time: domain name connection period, ti: time index.
ratio has 4 segments: [0, 0.25), [0.25, 0.5), [0.5, 0.75), [0.75, 1]
time has 3 segments: [00:00, 05:00), [05:00, 18:00), [18:00, 00:00)
This gives 12 combinations in total, which are not all listed here.
The domain name connection duration proportion is preferably computed over a reference test period (one week, one month and the like): within that period, it is the ratio of the time the computer spends connected to the domain name after startup to the time the computer is networked. The domain name connection period is the time of day at which the connection occurs.
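The segmenting described above can be sketched with `bisect`. Only the two combinations given in the text are filled in; the remaining ten are deliberately left out rather than guessed. The names `RATIO_EDGES`, `HOUR_EDGES`, `TIME_INDEX_RANGES` and `time_index_range` are hypothetical, and representing the connection period by a whole hour is an assumption.

```python
import bisect

# Segment boundaries: [0,0.25), [0.25,0.5), [0.5,0.75), [0.75,1]
RATIO_EDGES = [0.25, 0.5, 0.75]
# Segment boundaries in hours: [00:00,05:00), [05:00,18:00), [18:00,00:00)
HOUR_EDGES = [5, 18]

# Keys are (ratio segment, time segment). Only the two combinations stated
# in the text are listed; the other ten would come from statistics on a
# known malicious c2 domain name library.
TIME_INDEX_RANGES = {
    (0, 2): (0.75, 1.0),  # low ratio, evening: 0.75 < ti < 1
    (3, 0): (0.0, 0.25),  # high ratio, small hours: 0 < ti <= 0.25
}


def time_index_range(ratio, hour):
    """Return the (low, high) range of ti, or None if the rule is not listed."""
    ratio_seg = bisect.bisect_right(RATIO_EDGES, ratio)
    hour_seg = bisect.bisect_right(HOUR_EDGES, hour)
    return TIME_INDEX_RANGES.get((ratio_seg, hour_seg))


print(time_index_range(0.8, 2))
```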
The method for calculating the port index may be, for example, as follows:
IF port == Standard THEN pi > 0.75 AND pi < 1
IF port == Common THEN pi > 0.5 AND pi <= 0.75
IF port != Standard AND port != Common THEN pi > 0 AND pi <= 0.5
where port: connection port, Standard: protocol standard port, Common: contracted common port, pi: port index.
Lists of protocol standard ports and contracted common ports are required. The protocol standard ports may include 23, 25, 67, 68, 443, 8080, etc., and the contracted common ports may include 1314, 4444, etc.
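The port rules reduce to two set lookups. In this sketch the port sets contain only the examples listed in the text (a real deployment would use fuller lists), and the names `STANDARD_PORTS`, `COMMON_PORTS` and `port_index_range` are hypothetical.

```python
# Example ports taken from the text; both lists are open-ended there ("etc.").
STANDARD_PORTS = {23, 25, 67, 68, 443, 8080}
COMMON_PORTS = {1314, 4444}


def port_index_range(port):
    """Return the (low, high) range of the port index pi."""
    if port in STANDARD_PORTS:
        return (0.75, 1.0)   # 0.75 < pi < 1
    if port in COMMON_PORTS:
        return (0.5, 0.75)   # 0.5 < pi <= 0.75
    return (0.0, 0.5)        # 0 < pi <= 0.5


print(port_index_range(1314))
```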
The calculation of each index will be described in detail below by taking the domain name test.g0.1f.com to be determined as an example.
Meaning index: the number of meaningful words in the domain name is 2 (test and com; in other examples the suffix com need not be counted as a meaningful word), and the average length of the meaningful words is 3 (test has length 4 and com has length 3, giving an average of 3.5, which is rounded down to 3);
confusion index: the number of confusing characters in the domain name is 2 (the digit 0 is confusable with the letter o, and the digit 1 with the letter l), and the average adjacent degree of the confusing characters is 1 (that is, they are separated by 1 character; if the digits 0 and 1 were directly adjacent, the average adjacent degree would be 0);
time index: the domain name connection duration proportion is 0.8 (the ratio of the time the computer spends connected to the domain name after startup to the time the computer is networked), and the domain name connection period is 00:00 to 05:00;
port index: the connection port is 1314, which is not a protocol standard port but is a contracted common port (e.g., the TCP common port is 8080 and the UDP common port is 443).
For the domain name to be determined, the calculated indices are shown in table 1 below.
TABLE 1
Domain name | Meaning index | Confusion index | Time index | Port index
test.g0.1f.com | 0.41 | 0.11 | 0.38 | 0.10
It should be noted that, in table 1, the numerical values of the indexes are also preprocessed so that the sum of the indexes is 1, so as to facilitate subsequent calculation; in the preprocessing process, different weights can be given to the indexes to more accurately reflect the relation between the indexes and the malicious degree of the domain name, and it can be understood that the preprocessing step can be omitted.
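The preprocessing described here (optional weighting, then scaling so the indices sum to 1) might look like the sketch below. The raw input values are hypothetical, since the source gives only the preprocessed results, and `normalize_indices` is an illustrative name.

```python
def normalize_indices(raw, weights=None):
    """Scale index values so that they sum to 1, optionally weighting first."""
    if weights is None:
        weights = [1.0] * len(raw)
    weighted = [r * w for r, w in zip(raw, weights)]
    total = sum(weighted)
    if total == 0:
        raise ValueError("at least one weighted index must be non-zero")
    return [v / total for v in weighted]


# Hypothetical raw values for (meaning, confusion, time, port).
raw = [0.45, 0.60, 0.20, 0.60]
print(normalize_indices(raw))
```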
Step 103: converting the characterization domain vector value and the behavior domain vector value into an x-axis coordinate point and a y-axis coordinate point in a plane rectangular coordinate system respectively;
As an optional embodiment, the converting the characterization domain vector value and the behavior domain vector value into an x-axis coordinate point and a y-axis coordinate point in a plane rectangular coordinate system (step 103) may include:
assigning weight vectors to the characterization domain vector value and the behavior domain vector value to obtain, respectively, an x-axis coordinate point and a y-axis coordinate point in the plane rectangular coordinate system.
In the foregoing embodiment, the characterization domain vector value is obtained by combining the meaning index and the confusion index, and the behavior domain vector value is obtained by combining the time index and the port index. Both are 1×2 vectors, and a 2×1 weight vector can be used to convert each into a point value. The weight vectors can be obtained by statistical analysis of the malicious domain names in an existing malicious c2 domain name library; the specific steps are as follows:
TABLE 2
Domain name | Meaning index | Confusion index | Time index | Port index
test.g0.1f.com | 0.41 | 0.11 | 0.38 | 0.10
qyhxyw.com | 0.48 | 0.05 | 0.35 | 0.12
api.vk3.co | 0.45 | 0.07 | 0.36 | 0.12
aaaa.920xz.com | 0.46 | 0.09 | 0.34 | 0.11
oauth.vk.com | 0.40 | 0.11 | 0.37 | 0.12
kf11.f3322.net | 0.42 | 0.11 | 0.32 | 0.15
7895237.cn | 0.41 | 0.09 | 0.38 | 0.14
oxxtxxt.biz | 0.45 | 0.04 | 0.39 | 0.12
www.ygx5.com | 0.44 | 0.09 | 0.36 | 0.11
zjjwh2005.8800.org | 0.43 | 0.04 | 0.39 | 0.14
fk.appledoesnt.com | 0.39 | 0.05 | 0.40 | 0.16
bak.hnhxzz.com | 0.41 | 0.11 | 0.35 | 0.13
Table 2 shows an existing malicious c2 domain name library and the index values of the domain names in it. Based on Table 2, an evaluation matrix of the characterization domain (meaning index, confusion index) is first established.
Matrix normalization is then applied to the evaluation matrix to obtain a normalized matrix.
The entropy values of the meaning index and the confusion index are calculated as follows:
$e_T = 0.9973$
$e_C = 0.9756$
The difference coefficients of the meaning index and the confusion index are determined as follows:
$g_T = 0.0027$
$g_C = 0.0244$
The index weights of the meaning index and the confusion index are calculated as follows:
$w_T = 0.0996$
$w_C = 0.4502$
The characterization domain weight vector obtained from the previous step is as follows:
$W_T = [0.4502, 0.0996]^{T}$
Similarly, the behavior domain weight vector obtained is as follows:
$W_S = [0.3732, 0.0770]^{T}$
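The sequence of evaluation matrix, normalization, entropy, difference coefficient and weight follows the shape of the entropy weight method. The sketch below applies a textbook version of that method to the meaning and confusion columns of Table 2. The normalization used in the source is not shown, so dividing each column by its sum is an assumption and the printed numbers are not expected to match the values quoted above. The names `entropy` and `entropy_weights` are hypothetical.

```python
import math

# Meaning index and confusion index columns of Table 2.
MEANING = [0.41, 0.48, 0.45, 0.46, 0.40, 0.42, 0.41, 0.45, 0.44, 0.43, 0.39, 0.41]
CONFUSION = [0.11, 0.05, 0.07, 0.09, 0.11, 0.11, 0.09, 0.04, 0.09, 0.04, 0.05, 0.11]


def entropy(column):
    """Normalized Shannon entropy of one index column (1 means uniform)."""
    total = sum(column)
    props = [v / total for v in column]  # assumption: sum normalization
    return -sum(p * math.log(p) for p in props if p > 0) / math.log(len(column))


def entropy_weights(columns):
    """Weight each column by its difference coefficient g = 1 - e."""
    diffs = [1.0 - entropy(col) for col in columns]
    total = sum(diffs)
    return [d / total for d in diffs]


weights = entropy_weights([MEANING, CONFUSION])
print(weights)
```

Under this sketch the confusion index receives the larger weight, because its column varies more across the library, which agrees in direction with $w_C > w_T$ above.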
step 104: combining the x-axis coordinate point and the y-axis coordinate point to obtain a coordinate position of the domain name to be determined in a plane rectangular coordinate system after conversion;
In this example, for the domain name to be judged, test.g0.1f.com, the resulting coordinate position is (0.313, 0.166).
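Steps 103 and 104 amount to multiplying each 1×2 index vector by a 2×1 weight vector and pairing the two results. The sketch below shows that product with the values printed in this description. A plain dot product gives roughly (0.196, 0.150) rather than (0.313, 0.166), so the source evidently applies further processing that is not shown; the helper name `to_coordinate` is hypothetical.

```python
def to_coordinate(index_pair, weight_pair):
    """Dot product of a 1x2 index vector with a 2x1 weight vector."""
    return sum(v * w for v, w in zip(index_pair, weight_pair))


W_T = (0.4502, 0.0996)  # characterization domain weights as printed
W_S = (0.3732, 0.0770)  # behavior domain weights as printed

# Index values of test.g0.1f.com from Table 1.
x = to_coordinate((0.41, 0.11), W_T)  # meaning, confusion
y = to_coordinate((0.38, 0.10), W_S)  # time, port
position = (x, y)
print(position)
```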
Step 105: converting the coordinate position into a coordinate position in a preset Poincaré disc;
The Poincaré disc uses hyperbolic geometry (also known as Lobachevsky geometry), whose properties make it suitable for converting scattered points into the Poincaré disc for attribute determination. In this step, the Poincaré disc is used to judge the malicious degree of the domain name in the subsequent steps, and it can be constructed as follows:
First, a domain name maliciousness ranking criterion is given based on analysis experience, as shown in Table 3 below.
TABLE 3
Domain name relative maliciousness score | Maliciousness ranking
0~47.5 | High maliciousness
47.5~82.5 | Medium maliciousness
82.5 or more | Low maliciousness
It will be appreciated that the relative maliciousness score partitioning described in Table 3 above is by way of example only and may be flexibly partitioned as desired.
Then, the complex analysis of the Poincaré disc model is carried out as follows:
S1: Simplify the fractional linear transformation
For the fractional linear transformation formula,
where $e^{i\theta}$ represents a rotation (Euler's formula $e^{i\theta} = \cos\theta + i\sin\theta$), R is the radius of the Poincaré disc being mapped to, a is the point to be mapped to the center of the Poincaré disc, and z is any point in the Poincaré disc;
taking R = 1 and θ = 0 gives:
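The formulas themselves are not reproduced in this text. As a hedged reconstruction, one standard fractional linear transformation that maps a disc of radius $R$ onto itself and sends the point $a$ to the center is given below, followed by its $R = 1$, $\theta = 0$ special case. The general form is an assumption about what the source used. The special case is consistent with the ring radii 0.212 and 0.563 stated for FIG. 3.

```latex
w = e^{i\theta}\,\frac{R^{2}\,(z - a)}{R^{2} - \bar{a}\,z}
\qquad\xrightarrow{\;R = 1,\ \theta = 0\;}\qquad
w = \frac{z - a}{1 - \bar{a}\,z}
```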
S2: Determine the differential form of the Poincaré metric
For the general differential form of the Poincaré metric,
where k is a proportionality constant;
taking R = 1 and k = 1 gives:
S3: Map the malicious critical point coordinates
According to the domain name maliciousness ranking criterion, the critical point between medium and low maliciousness (score 82.5) maps to 0.563;
similarly, a score of 47.5 maps to 0.399 and a score of 0 maps to 0.
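The mapping formula is not reproduced in this text. The three stated pairs are numerically consistent with solving $r / (1 - r^2) = \text{score} / 100$ for the radius $r$, which matches a metric density of the form $1 / (1 - |z|^2)$. That relation is inferred from the numbers and is an assumption, not a quotation of the source. The sketch below solves the resulting quadratic; `score_to_radius` is a hypothetical name.

```python
import math


def score_to_radius(score):
    """Solve r / (1 - r**2) = score / 100 for the root r in [0, 1)."""
    s = score / 100.0
    if s == 0:
        return 0.0
    # s*r**2 + r - s = 0, taking the positive root.
    return (-1.0 + math.sqrt(1.0 + 4.0 * s * s)) / (2.0 * s)


for score in (82.5, 47.5, 0):
    print(score, round(score_to_radius(score), 3))
```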
S4: Transform the malicious critical points
The critical point between medium and low maliciousness is taken as (0.563, 0), and the critical point between medium and high maliciousness as (0.399, 0). The medium/low critical point serves as the minimum-maliciousness point for the determination, so (0.563, 0) is transformed to the center of the Poincaré disc. The fractional linear transformation is then applied to (0.399, 0).
Refer to the Poincaré disc shown in FIG. 3, where the radius R = 1. The origin o is the critical point between medium and low maliciousness and represents the score 82.5 in Table 3; b is the critical point between medium and high maliciousness (the b-ring radius is 0.212) and represents the score 47.5 in Table 3; a is the high-maliciousness extreme point (the a-ring radius is 0.563) and represents the score 0 in Table 3. That is, points outside the a-ring are low-maliciousness or mapping-error points, points between the a-ring and the b-ring are high-maliciousness points, and points inside the b-ring are medium-maliciousness points.
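The ring radii can be checked with a short sketch, assuming the transformation is the unit-disc map $w = (z - a)/(1 - \bar{a} z)$ with $a = 0.563$. That form is a reconstruction, since the formula is not reproduced in this text, and `to_disc` is a hypothetical name.

```python
def to_disc(z, a=complex(0.563, 0.0)):
    """Unit-disc fractional linear map that sends the point a to the center."""
    return (z - a) / (1 - a.conjugate() * z)


center = to_disc(complex(0.563, 0.0))       # medium/low critical point
b_ring = abs(to_disc(complex(0.399, 0.0)))  # medium/high critical point
a_ring = abs(to_disc(complex(0.0, 0.0)))    # score 0

print(center, round(b_ring, 3), round(a_ring, 3))
```

Under this assumption the medium/high critical point lands at radius 0.212 and the score-0 point at radius 0.563, matching the b-ring and a-ring of FIG. 3.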
In this example, following the steps above, the coordinate position (0.313, 0.166) corresponding to the domain name to be judged, test.g0.1f.com, is converted into the coordinate position (-0.303, 0.202) in the preset Poincaré disc.
Step 106: Calculate the distance between the converted coordinate position in the Poincaré disc and the center of the preset Poincaré disc;
Step 107: Judge the malicious degree of the domain name according to the distance.
As an alternative embodiment, judging the malicious degree of the domain name according to the distance (step 107) may include:
judging the malicious degree of the domain name according to the relation between the distance and a preset threshold.
In this step, the preset thresholds are the a-ring radius and the b-ring radius of the Poincaré disc in Fig. 3.
In the foregoing example, the converted coordinate position of the domain name to be judged is (-0.303, 0.202), and its distance from the center of the preset Poincaré disc is 0.505 (a translation in the Euclidean plane corresponds to a rotation in the complex plane, so the distance can be calculated with the Euclidean-plane distance algorithm). The converted coordinate position therefore lies between the a-ring and the b-ring of the Poincaré disc, and the domain name to be judged, test.g0.1f.com, is judged to be highly malicious.
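The threshold comparison in step 107 can be sketched as follows. The ring radii come from Fig. 3; the function name `judge`, the labels, and the handling of points lying exactly on a ring are assumptions. Note that the Euclidean norm of (-0.303, 0.202) evaluates to about 0.364, whereas the text reports 0.505 (which equals |x| + |y|); both values fall between the two ring radii, so the verdict is the same under this sketch.

```python
import math

B_RING = 0.212  # medium/high threshold radius (Fig. 3)
A_RING = 0.563  # high-maliciousness pole radius (Fig. 3)


def judge(x: float, y: float) -> str:
    """Classify a converted point by its Euclidean distance from the center."""
    d = math.hypot(x, y)
    if d < B_RING:
        return "medium"
    if d <= A_RING:  # boundary handling is an assumption
        return "high"
    return "low or mapping error"


print(judge(-0.303, 0.202))
```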
In the embodiment of the invention, within the score mapping range, the closer a point is to the center of the Poincaré disc, the lower the malicious degree of the domain name to be judged. It can be understood that, depending on how the Poincaré disc model is generated, the farther a point is from the center of the Poincaré disc within the score mapping range, the higher the malicious degree of the domain name to be judged.
In summary, the malicious domain name judging method provided by the embodiment of the invention first obtains a domain name to be judged and analyzes it to obtain a characterization domain vector value and a behavior domain vector value. The two vector values are converted into an x-axis coordinate point and a y-axis coordinate point in a plane rectangular coordinate system, respectively, and the two points are combined to obtain the converted coordinate position of the domain name to be judged in that coordinate system. The coordinate position is then converted into a coordinate position in a preset Poincaré disc, the distance between the converted position and the center of the preset Poincaré disc is calculated, and the malicious degree of the domain name is judged according to that distance. In this way, the embodiment of the invention converts the <characterization, behavior> attribute values of the domain name to be judged into the Poincaré disc model and judges the malicious degree of the domain name according to the distance between its coordinate position in the Poincaré disc and the center of the disc, thereby realizing predictive judgment. The embodiment of the invention is a domain-driven (characterization and behavior) malicious domain name judging method and is particularly suitable for judging malicious C2 domain names.
On the other hand, an embodiment of the present invention provides a malicious domain name determining apparatus, as shown in fig. 4, where the apparatus may include:
an obtaining module 11, configured to obtain a domain name to be judged;
an analysis module 12, configured to analyze the domain name to be judged to obtain a characterization domain vector value and a behavior domain vector value, where the characterization domain includes at least one of the average length of meaningful words in the domain name, the number of confusion characters in the domain name, and the average adjacency degree of confusion characters in the domain name, and the behavior domain includes at least one of the domain name connection duration proportion, the domain name connection period, whether the domain name connection port is a protocol standard port, and whether the domain name connection port is an agreed common port;
a first conversion module 13, configured to convert the characterization domain vector value and the behavior domain vector value into an x-axis coordinate point and a y-axis coordinate point in a plane rectangular coordinate system, respectively;
a combination module 14, configured to combine the x-axis coordinate point and the y-axis coordinate point to obtain the converted coordinate position of the domain name to be judged in the plane rectangular coordinate system;
a second conversion module 15, configured to convert the coordinate position into a coordinate position in a preset Poincaré disc;
a calculating module 16, configured to calculate the distance between the converted coordinate position in the Poincaré disc and the center of the preset Poincaré disc;
a judging module 17, configured to judge the malicious degree of the domain name according to the distance.
The device of the present embodiment may be used to implement the technical solution of the method embodiment shown in fig. 2, and its implementation principle and technical effects are similar, and are not described here again.
Preferably, the analysis module 12 includes:
a first analysis subunit, configured to analyze the domain name to be judged to obtain a meaning index and a confusion index, and to combine the meaning index and the confusion index to obtain the characterization domain vector value, where the meaning index is calculated from at least one of the average length of meaningful words in the domain name and the number of meaningful words in the domain name, and the confusion index is calculated from at least one of the number of confusion characters in the domain name and the average adjacency degree of confusion characters in the domain name;
a second analysis subunit, configured to analyze the domain name to be judged to obtain a time index and a port index, and to combine the time index and the port index to obtain the behavior domain vector value, where the time index is calculated from at least one of the domain name connection duration proportion and the domain name connection period, and the port index is calculated from at least one of whether the domain name connection port is a protocol standard port and whether the domain name connection port is an agreed common port.
Preferably, the first conversion module 13 includes:
a conversion subunit, configured to assign weight vectors to the characterization domain vector value and the behavior domain vector value to obtain the x-axis coordinate point and the y-axis coordinate point in the plane rectangular coordinate system, respectively.
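One plausible reading of assigning weight vectors is a weighted sum per domain. The sketch below assumes each domain vector holds two indices (meaning and confusion for the characterization domain, time and port for the behavior domain); the function name `weighted_coordinate`, the index values, and the weights are all hypothetical and are not taken from the embodiment.

```python
from typing import Sequence


def weighted_coordinate(values: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse a domain vector into one axis coordinate via a weighted sum."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights))


# Hypothetical index values and weights, for illustration only.
characterization = (0.4, 0.2)  # (meaning index, confusion index)
behavior = (0.3, 0.1)          # (time index, port index)

x = weighted_coordinate(characterization, (0.5, 0.5))  # x-axis coordinate point
y = weighted_coordinate(behavior, (0.5, 0.5))          # y-axis coordinate point
print((x, y))
```

The pair (x, y) is the coordinate position that the combination module would then hand to the Poincaré disc conversion.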
Preferably, the judging module 17 includes:
a judging subunit, configured to judge the malicious degree of the domain name according to the relation between the distance and a preset threshold.
An embodiment of the present invention further provides an electronic device. Fig. 5 is a schematic structural diagram of an embodiment of the electronic device, which can implement the flow of the embodiment shown in Fig. 2. As shown in Fig. 5, the electronic device may include a housing 41, a processor 42, a memory 43, a circuit board 44, and a power supply circuit 45. The circuit board 44 is arranged in the space enclosed by the housing 41, and the processor 42 and the memory 43 are arranged on the circuit board 44; the power supply circuit 45 supplies power to each circuit or device of the electronic device; the memory 43 stores executable program code; and the processor 42 reads the executable program code stored in the memory 43 and runs the corresponding program to perform the method described in any of the foregoing method embodiments.
For the specific execution of the above steps by the processor 42, and for the further steps the processor 42 performs by running the executable program code, refer to the description of the embodiment shown in Fig. 2; details are not repeated here.
The electronic device exists in a variety of forms including, but not limited to:
(1) Mobile communication devices: such devices have mobile communication capabilities and are primarily intended to provide voice and data communication. Such terminals include smart phones (e.g., iPhone), multimedia phones, feature phones, and low-end phones.
(2) Ultra-mobile personal computer devices: such devices belong to the category of personal computers, have computing and processing functions, and generally also have mobile internet access. Such terminals include PDA, MID, and UMPC devices, such as the iPad.
(3) Portable entertainment devices: such devices can display and play multimedia content. They include audio and video players (e.g., iPod), handheld game consoles, e-book readers, smart toys, and portable car navigation devices.
(4) Servers: a server comprises a processor, a hard disk, a memory, a system bus, and the like. Its architecture is similar to that of a general-purpose computer, but because it must provide highly reliable services, it has high requirements in terms of processing capacity, stability, reliability, security, scalability, and manageability.
(5) Other electronic devices with data interaction functions.
Embodiments of the present invention also provide a computer readable storage medium having stored therein a computer program which, when executed by a processor, implements the method steps of any of the method embodiments described above.
Embodiments of the present invention also provide an application program that is executed to implement the method provided by any of the method embodiments of the present invention.
It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another and do not necessarily require or imply any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a related manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the device embodiments are substantially similar to the method embodiments, they are described relatively briefly, and the relevant parts of the method embodiments may be consulted. For convenience of description, the above apparatus is described as being functionally divided into various units/modules. Of course, when implementing the present invention, the functions of the units/modules may be implemented in one or more pieces of software and/or hardware.
Those skilled in the art will appreciate that all or part of the processes of the above method embodiments may be implemented by a computer program stored on a computer readable storage medium, and that the program, when executed, may include the steps of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the scope of the present invention should be included in the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (8)

1. A malicious domain name determination method, comprising:
acquiring a domain name to be judged;
analyzing the domain name to be judged to obtain a characterization domain vector value and a behavior domain vector value, wherein the characterization domain comprises at least one of the average length of meaningful words in the domain name, the number of confusion characters in the domain name, and the average adjacency degree of confusion characters in the domain name, and the behavior domain comprises at least one of a domain name connection duration proportion, a domain name connection period, whether a domain name connection port is a protocol standard port, and whether the domain name connection port is an agreed common port;
converting the characterization domain vector value and the behavior domain vector value into an x-axis coordinate point and a y-axis coordinate point in a plane rectangular coordinate system respectively;
combining the x-axis coordinate point and the y-axis coordinate point to obtain a coordinate position of the domain name to be determined in a plane rectangular coordinate system after conversion;
converting the coordinate position into a coordinate position in a preset Poincaré disc;
calculating the distance between the converted coordinate position in the Poincaré disc and the center of the preset Poincaré disc;
judging the malicious degree of the domain name according to the distance;
wherein analyzing the domain name to be judged to obtain a characterization domain vector value and a behavior domain vector value comprises: analyzing the domain name to be judged to obtain a meaning index and a confusion index, and combining the meaning index and the confusion index to obtain the characterization domain vector value, wherein the meaning index is calculated from at least one of the average length of meaningful words in the domain name and the number of meaningful words in the domain name, and the confusion index is calculated from at least one of the number of confusion characters in the domain name and the average adjacency degree of confusion characters in the domain name; and analyzing the domain name to be judged to obtain a time index and a port index, and combining the time index and the port index to obtain the behavior domain vector value, wherein the time index is calculated from at least one of a domain name connection duration proportion and a domain name connection period, and the port index is calculated from at least one of whether a domain name connection port is a protocol standard port and whether the domain name connection port is an agreed common port.
2. The method of claim 1, wherein converting the characterization domain vector value and the behavior domain vector value into an x-axis coordinate point and a y-axis coordinate point, respectively, in a plane rectangular coordinate system comprises:
assigning weight vectors to the characterization domain vector value and the behavior domain vector value to obtain the x-axis coordinate point and the y-axis coordinate point in the plane rectangular coordinate system, respectively.
3. The method according to any one of claims 1-2, wherein said determining a level of maliciousness of said domain name based on said distance comprises:
and judging the malicious degree of the domain name according to the relation between the distance and a preset threshold value.
4. A malicious domain name determination apparatus, comprising:
the acquisition module is used for acquiring the domain name to be judged;
the analysis module is used for analyzing the domain name to be judged to obtain a characterization domain vector value and a behavior domain vector value, wherein the characterization domain comprises at least one of the average length of meaningful words in the domain name, the number of confusion characters in the domain name, and the average adjacency degree of confusion characters in the domain name, and the behavior domain comprises at least one of a domain name connection duration proportion, a domain name connection period, whether a domain name connection port is a protocol standard port, and whether the domain name connection port is an agreed common port;
the first conversion module is used for respectively converting the characterization domain vector value and the behavior domain vector value into an x-axis coordinate point and a y-axis coordinate point in a plane rectangular coordinate system;
the combination module is used for combining the x-axis coordinate point and the y-axis coordinate point to obtain the coordinate position of the domain name to be determined in the plane rectangular coordinate system after the domain name to be determined is converted;
the second conversion module is used for converting the coordinate position into a coordinate position in a preset Poincaré disc;
the computing module is used for computing the distance between the converted coordinate position in the Poincaré disc and the center of the preset Poincaré disc;
the judging module is used for judging the malicious degree of the domain name according to the distance;
the analysis module comprises: a first analysis subunit, used for analyzing the domain name to be judged to obtain a meaning index and a confusion index, and combining the meaning index and the confusion index to obtain the characterization domain vector value, wherein the meaning index is calculated from at least one of the average length of meaningful words in the domain name and the number of meaningful words in the domain name, and the confusion index is calculated from at least one of the number of confusion characters in the domain name and the average adjacency degree of confusion characters in the domain name; and a second analysis subunit, used for analyzing the domain name to be judged to obtain a time index and a port index, and combining the time index and the port index to obtain the behavior domain vector value, wherein the time index is calculated from at least one of a domain name connection duration proportion and a domain name connection period, and the port index is calculated from at least one of whether a domain name connection port is a protocol standard port and whether the domain name connection port is an agreed common port.
5. The apparatus of claim 4, wherein the first conversion module comprises:
a conversion subunit, used for assigning weight vectors to the characterization domain vector value and the behavior domain vector value to obtain the x-axis coordinate point and the y-axis coordinate point in the plane rectangular coordinate system, respectively.
6. The apparatus according to any one of claims 4-5, wherein the determining module comprises:
and the judging subunit is used for judging the malicious degree of the domain name according to the relation between the distance and a preset threshold value.
7. An electronic device, the electronic device comprising: the device comprises a shell, a processor, a memory, a circuit board and a power circuit, wherein the circuit board is arranged in a space surrounded by the shell, and the processor and the memory are arranged on the circuit board; a power supply circuit for supplying power to each circuit or device of the electronic apparatus; the memory is used for storing executable program codes; a processor executes a program corresponding to the executable program code by reading the executable program code stored in the memory for performing the method of any of the preceding claims 1-3.
8. A computer readable storage medium storing one or more programs executable by one or more processors to implement the method of any of claims 1-3.
CN202111445685.7A 2021-11-30 2021-11-30 Malicious domain name judging method and device, electronic equipment and storage medium Active CN114143084B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111445685.7A CN114143084B (en) 2021-11-30 2021-11-30 Malicious domain name judging method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114143084A (en) 2022-03-04
CN114143084B (en) 2024-02-23

Family

ID=80386068

Country Status (1)

Country Link
CN (1) CN114143084B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant