CN112968980B - Probability determination method and device, storage medium and server - Google Patents

Probability determination method and device, storage medium and server

Info

Publication number
CN112968980B
Authority
CN
China
Prior art keywords
probability
source website
domain name
dns
acceleration node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110139340.2A
Other languages
Chinese (zh)
Other versions
CN112968980A (en
Inventor
潘文军
李雪峰
尚程
梁彧
田野
傅强
王杰
杨满智
蔡琳
金红
陈晓光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Eversec Beijing Technology Co Ltd
Original Assignee
Eversec Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Eversec Beijing Technology Co Ltd
Priority to CN202110139340.2A
Publication of CN112968980A
Application granted
Publication of CN112968980B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/50 Address allocation
    • H04L61/5007 Internet protocol [IP] addresses

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Embodiments of the invention disclose a probability determination method, a probability determination device, a storage medium and a server. The method comprises: acquiring the domain name of a source website and distributing it to at least two engine nodes, wherein the engine nodes are distributed in different location areas; acquiring the initial domain name resolution (DNS) data sent by each of the at least two engine nodes and summarizing the initial DNS data to generate target DNS data, wherein the initial DNS data is obtained when a DNS server resolves the domain name in response to a DNS request sent by the corresponding engine node; and determining, according to the target DNS data, the probability that the source website contains a CDN acceleration node. With this technical scheme, the probability that the source website contains a CDN acceleration node can be determined accurately and quickly, which in turn helps to judge whether the source website contains a CDN acceleration node.

Description

Probability determination method and device, storage medium and server
Technical Field
The embodiment of the invention relates to the technical field of data processing, in particular to a probability determination method and device, a storage medium and a server.
Background
With the rapid growth in the number of Internet users and websites, relieving network congestion and speeding up users' access to information has become a major problem for many enterprises and service providers. Content Delivery Networks (CDNs) were developed in this context; a CDN can be regarded as a value-added network built on top of the existing network infrastructure.
Malicious websites can use CDN services to spread illegal content, such as harmful or fraudulent information, over the Internet. To clean up cyberspace and govern illegal information effectively, the data cached on CDN nodes and the websites they serve need to be put on record; this requires not only legal and administrative governance but also a supporting technical management system, which is an indispensable means of management.
At present, whether a website contains a CDN acceleration node is determined mainly through self-reporting by CDN enterprises. This approach suffers from missed reports and false reports of a website's CDN acceleration nodes and from untimely information updates, so the resulting information is noticeably lagging and incomplete.
Disclosure of Invention
The embodiment of the invention provides a probability determination method, a probability determination device, a storage medium and a server, which can accurately and quickly determine the probability that a source website contains CDN acceleration nodes and are beneficial to determining whether the source website contains the CDN acceleration nodes according to the determined probability.
In a first aspect, an embodiment of the present invention provides a probability determination method, including:
acquiring a domain name of a source website, and distributing the domain name of the source website to at least two engine nodes; wherein the engine nodes are distributed in different location areas;
respectively acquiring initial domain name resolution (DNS) data sent by the at least two engine nodes, summarizing the initial DNS data, and generating target DNS data; the initial DNS data is data obtained by analyzing the domain name by a DNS server in response to a DNS request sent by a corresponding engine node;
and determining the probability that the source website contains the CDN acceleration node according to the target DNS data.
In a second aspect, an embodiment of the present invention further provides a probability determination apparatus, including:
the domain name acquisition module is used for acquiring a domain name of a source website and distributing the domain name of the source website to at least two engine nodes; wherein the engine nodes are distributed in different location areas;
the DNS data acquisition module is used for respectively acquiring initial domain name resolution (DNS) data sent by the at least two engine nodes, summarizing the initial DNS data and generating target DNS data; the initial DNS data is data obtained by a DNS server resolving the domain name in response to a DNS request sent by a corresponding engine node;
and the probability determining module is used for determining the probability that the CDN acceleration node is contained in the source website according to the target DNS data.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the probability determination method provided by the embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides a server, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the computer program to implement the probability determination method according to the embodiment of the present invention.
According to the probability determination scheme provided by the embodiment of the invention, the domain name of a source website is acquired and distributed to at least two engine nodes, wherein the engine nodes are distributed in different location areas; the initial domain name resolution (DNS) data sent by each of the at least two engine nodes is acquired and summarized to generate target DNS data, wherein the initial DNS data is obtained by a DNS server resolving the domain name in response to a DNS request sent by the corresponding engine node; and the probability that the source website contains a CDN acceleration node is determined according to the target DNS data. This technical scheme addresses the problems that arise when the presence of a CDN acceleration node is determined through self-reporting by CDN enterprises, namely missed reports, false reports and untimely information updates. The probability that the source website contains a CDN acceleration node can be determined accurately and quickly, which helps to subsequently determine, according to that probability, whether the source website contains a CDN acceleration node.
Drawings
Fig. 1 is a schematic diagram illustrating a comparison between the work efficiency of a website accelerated by a CDN and the work efficiency of a website not accelerated by the CDN, provided in the related art;
Fig. 2 is a schematic diagram illustrating a working principle of a website not accelerated by a CDN, provided in the related art;
Fig. 3 is a schematic diagram illustrating a working principle of a website accelerated by a CDN, provided in the related art;
Fig. 4 is a flowchart of a probability determination method according to an embodiment of the present invention;
Fig. 5 is a flowchart of a probability determination method in another embodiment of the invention;
Fig. 6 is a schematic diagram of a probability determination apparatus in another embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a server in another embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more complete and thorough understanding of the present invention. It should be understood that the drawings and the embodiments of the present invention are illustrative only and are not intended to limit the scope of the present invention.
It should be understood that the various steps recited in the method embodiments of the present invention may be performed in a different order and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the invention is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present invention are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that the modifiers "a", "an", and "the" in the present invention are intended to be illustrative rather than limiting, and those skilled in the art will understand them to mean "one or more" unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present invention are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
In the related art, a CDN is a strategically deployed overall system comprising four elements: distributed storage, load balancing, network request redirection, and content management. Content management and global network traffic management (Traffic Management) are the core of the CDN. By determining user proximity and server load, the CDN can ensure that content is served in response to user requests in an extremely efficient manner.
In general, content services are based on caching servers, also called proxy caches, which are located at the edge of the network, just a single hop away from the user. The proxy cache is a transparent mirror of the content provider's origin server (usually located in the CDN service provider's data center). Such an architecture enables CDN service providers to deliver, on behalf of their customers (the content providers), the best possible experience to end users who cannot tolerate any delay in request response time. According to statistics, CDN technology can handle 70% to 95% of the content access volume of an entire website, which reduces the pressure on the server and improves the performance and scalability of the website. Fig. 1 is a schematic diagram comparing the working efficiency of a website accelerated by a CDN with that of a website not accelerated by a CDN in the related art. The request response time of a CDN-accelerated website is much shorter than that of a website without CDN acceleration, so a CDN-accelerated website can serve user requests efficiently.
Fig. 2 is a schematic diagram illustrating the working principle of a website that is not accelerated by a CDN in the related art. As shown in Fig. 2, a user accesses a website that does not use CDN caching as follows:
1. the user provides the domain name to be accessed, www.web.com, to a browser at the client;
2. the browser calls a domain name resolution function library to resolve the domain name and obtains the IP address x.x.x.x corresponding to the domain name;
3. the browser sends a data access request to the domain name host based on the obtained IP address;
4. the browser displays the content of the webpage according to the data returned by the domain name host.
Through these four steps, the browser completes the whole process from receiving the domain name from the user to acquiring data from the domain name host.
In a CDN network, a cache layer is added between the user and the server, and a request sent by the user at the client is directed to the cache to obtain the data of the origin server. Fig. 3 is a schematic diagram illustrating the working principle of a website accelerated by a CDN in the related art. As shown in Fig. 3, a user accesses a website cached by the CDN as follows:
1. the user provides the domain name to be accessed to a browser;
2. the browser resolves the domain name through DNS. Because the CDN adjusts the domain name resolution process, the result of this resolution is the CDN's CNAME record corresponding to the domain name; to obtain an actual IP address, the browser must resolve the returned CNAME domain name again. This process uses global load balancing DNS resolution, for example returning an IP address chosen according to geographical location information, so that users access a nearby node;
3. this resolution yields the IP address of a CDN cache server, and the browser sends an access request to that cache server;
4. based on the domain name provided by the browser, the cache server obtains the actual IP address of the domain name through the dedicated DNS resolution inside the cache, and then submits an access request to that actual IP address;
5. after obtaining the content from the actual IP address, the cache server stores the content locally for later use and also returns the obtained data to the client, completing the data service process;
6. the client displays the data returned by the cache server, which completes the whole data request process of the browser.
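The two-stage resolution in step 2 (domain name to CNAME, then CNAME to IP address) can be sketched as follows. This is a minimal illustration only: the record table, the alias `www.web.com.cdn.example.net`, the addresses and the function `resolve` are all hypothetical, and no real DNS server is queried.

```python
# Hypothetical in-memory record table; a real resolver would query DNS servers.
RECORDS = {
    "www.web.com": ("CNAME", "www.web.com.cdn.example.net"),
    "www.web.com.cdn.example.net": ("A", "203.0.113.10"),
    "plain.example.org": ("A", "198.51.100.7"),
}


def resolve(name, max_hops=8):
    """Follow CNAME records until an A record is reached.

    Returns (ip_address, cnames_seen). An empty cnames_seen list means the
    name resolved directly to an address, as in the non-CDN case of Fig. 2.
    """
    cnames = []
    for _ in range(max_hops):
        record_type, value = RECORDS[name]
        if record_type == "A":
            return value, cnames
        cnames.append(value)  # remember the alias, then resolve it again
        name = value
    raise RuntimeError("CNAME chain too long")


ip, aliases = resolve("www.web.com")
print(ip, aliases)
```

A name with a non-empty alias list corresponds to the CDN-accelerated case of Fig. 3; a name that resolves directly corresponds to Fig. 2.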
Fig. 4 is a flowchart of a probability determination method according to an embodiment of the present invention, where the method is applicable to the case of determining the probability of a web site including a CDN acceleration node, and the method may be executed by a probability determination device, which may be composed of hardware and/or software and may be generally integrated in a server. As shown in fig. 4, the method specifically includes the following steps:
Step 410, acquiring a domain name of a source website, and distributing the domain name of the source website to at least two engine nodes; wherein the engine nodes are distributed in different location areas.
In an embodiment of the present invention, the source website is a website for providing a service to the client. The domain name of the source website may be a domain name input by a user or a domain name of a preset source website. Specifically, the domain name of at least one source website is obtained, that is, the domain name of one source website may be obtained, and the domain names of a plurality of source websites may also be obtained. Specifically, when domain names of a plurality of source websites are obtained, a domain name list may be obtained, where the domain name list includes domain names of at least two source websites.
In an embodiment of the present invention, a domain name of a source website is distributed to at least two engine nodes, wherein the engine nodes are distributed in different location areas. Illustratively, the domain name of the source website may be distributed to engine nodes of 93 operators distributed over 31 provinces across the country, wherein the engine nodes may be located in servers of different location areas. Specifically, the domain name of the source website may be adjusted or modified according to a preset domain name format, and then distributed to at least two engine nodes.
Step 420, respectively acquiring initial domain name resolution DNS data sent by the at least two engine nodes, and summarizing each initial DNS data to generate target DNS data; the initial DNS data is data obtained by analyzing the domain name by a DNS server in response to a DNS request sent by a corresponding engine node.
In the embodiment of the present invention, after the domain name of the source website is distributed to at least two engine nodes, each engine node sends a DNS request to its corresponding target DNS server, where the target DNS server corresponding to an engine node may be understood as a DNS server located near that engine node. After receiving the DNS request, the target DNS server resolves the domain name of the source website, generates DNS data, and feeds the DNS data back to the corresponding engine node. The DNS data sent by each engine node is then acquired and used as the initial DNS data. Optionally, each piece of initial DNS data may include an Internet Protocol (IP) address and may also include the domain name of a CDN acceleration node. It can be understood that, if the DNS server resolves the domain name of the source website directly to the IP address of the source website, the client can access the source website through that IP address; in this case the source website does not contain a CDN acceleration node, that is, the source website serves the client directly. If instead the DNS server resolves the domain name of the source website to the domain name of a CDN acceleration node, the client obtains the service data by accessing that CDN acceleration node; in this case the source website contains a CDN acceleration node, that is, the CDN acceleration node serves the client on behalf of the source website.
In the embodiment of the invention, the initial DNS data acquired from each engine node is summarized and deduplicated to generate the target DNS data. Specifically, the initial DNS data sent by each engine node may be the same, and at this time, the same initial DNS data may be merged into one piece of DNS data. For example, the initial DNS data includes an IP address and a domain name (which may also be referred to as an alias) of the CDN acceleration node, and the domain names of the same CDN acceleration node in each initial DNS data may be deduplicated and then summarized.
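The summarizing and deduplication step might look like the sketch below. The per-node record layout (a dict with `ips` and `aliases` lists) and the function name `summarize` are assumptions made for illustration; the source does not specify a data format.

```python
def summarize(initial_dns_data):
    """Merge per-node DNS results into one deduplicated record.

    `initial_dns_data` is a list of dicts, one per engine node, each with
    optional "ips" and "aliases" lists (an assumed layout).
    """
    ips, aliases = set(), set()
    for result in initial_dns_data:
        ips.update(result.get("ips", []))
        # DNS names are case-insensitive and may carry a trailing dot,
        # so normalize before deduplicating.
        aliases.update(a.lower().rstrip(".") for a in result.get("aliases", []))
    return {"ips": sorted(ips), "aliases": sorted(aliases)}


target = summarize([
    {"ips": ["203.0.113.10"], "aliases": ["www.web.com.cdn.example.net."]},
    {"ips": ["203.0.113.10", "203.0.113.20"],
     "aliases": ["WWW.web.com.cdn.example.net"]},
])
print(target)
```

Using sets keeps the merge independent of the order in which engine nodes report back.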
Step 430, determining the probability that the source website contains the CDN acceleration node according to the target DNS data.
In the embodiment of the invention, the probability that the CDN acceleration node is contained in the source website is determined according to the target DNS data. Illustratively, the DNS data may be further analyzed, the number of IP addresses included in the DNS data, the level of the domain name of the CDN acceleration node, and a keyword in the domain name of the CDN acceleration node are determined, and the probability that the source website includes the CDN acceleration node is determined according to at least one of the number of IP addresses, the level of the domain name of the CDN acceleration node, and the keyword in the domain name of the CDN acceleration node. For example, the probability that the source website includes the CDN acceleration node may be determined according to the number of IP addresses, and a larger number of IP addresses indicates a larger probability that the source website includes the CDN acceleration node.
According to the probability determination method provided by the embodiment of the invention, the domain name of a source website is acquired and distributed to at least two engine nodes, wherein the engine nodes are distributed in different location areas; the initial domain name resolution (DNS) data sent by each of the at least two engine nodes is acquired and summarized to generate target DNS data, wherein the initial DNS data is obtained by a DNS server resolving the domain name in response to a DNS request sent by the corresponding engine node; and the probability that the source website contains a CDN acceleration node is determined according to the target DNS data. This technical scheme addresses the problems that arise when the presence of a CDN acceleration node is determined through self-reporting by CDN enterprises, namely missed reports, false reports and untimely information updates. The probability that the source website contains a CDN acceleration node can be determined accurately and quickly, which helps to subsequently determine, according to that probability, whether the source website contains a CDN acceleration node.
In some embodiments, the target DNS data includes an Internet Protocol (IP) address, and determining the probability that the source website contains a CDN acceleration node according to the target DNS data comprises: determining a first number of IP addresses contained in the target DNS data; when the first number is greater than a first preset number threshold, determining that the probability that the source website contains a CDN acceleration node is a first probability value; and when the first number is smaller than the first preset number threshold, determining that the probability that the source website contains a CDN acceleration node is 0. For example, when the target DNS data includes only IP addresses, the first number of IP addresses contained in the target DNS data is determined. When the first number is greater than the first preset number threshold (e.g., 4), the probability that the source website contains a CDN acceleration node is determined to be the first probability value (e.g., 20%); when the first number is smaller than the first preset number threshold, it is directly determined that the source website does not contain a CDN acceleration node, that is, the probability is 0. It should be noted that the embodiment of the present invention does not limit the values of the first preset number threshold or the first probability value.
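The IP-only rule can be written as a short function. This is a sketch using the example values from the text (threshold 4, first probability value 20%); the function name is hypothetical, and treating a count exactly equal to the threshold as probability 0 is an assumption, since the text only covers the "greater than" and "smaller than" cases.

```python
def probability_ip_only(ips, first_threshold=4, first_probability=0.20):
    """Probability that the source website contains a CDN acceleration node
    when the target DNS data holds only IP addresses."""
    first_number = len(set(ips))  # count distinct IP addresses
    if first_number > first_threshold:
        return first_probability
    # Assumption: a count equal to the threshold is handled like "smaller than".
    return 0.0


many = ["203.0.113.%d" % i for i in range(1, 6)]  # five distinct addresses
print(probability_ip_only(many))
print(probability_ip_only(["203.0.113.1", "203.0.113.2"]))
```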
In some embodiments, the target DNS data includes an IP address and an alias; the alias is a domain name of a CDN acceleration node of the content delivery network; determining the probability that the source website contains the CDN acceleration node according to the target DNS data, wherein the probability comprises the following steps: determining a second number of IP addresses contained in the DNS data, a hierarchy of aliases, and keywords contained in the aliases; and determining the probability that the CDN acceleration node is contained in the source website according to the second number, the hierarchy of the aliases and the keywords contained in the aliases.
Optionally, determining, according to the second number, the hierarchy of the aliases, and the keywords included in the aliases, a probability that the source website includes the CDN acceleration node includes: when the second number is larger than a second preset number threshold, judging whether the hierarchy of the alias is larger than a preset hierarchy threshold; when the hierarchy of the alias is larger than the preset hierarchy threshold value, judging whether the alias contains preset keywords or not; and when the alias contains a preset keyword, determining that the probability that the source website contains the CDN acceleration node is a second probability value. Optionally, when the second number is smaller than the second preset number threshold, determining that the probability that the source website includes the CDN acceleration node is 0; when the hierarchy of the alias is smaller than the preset hierarchy threshold value, determining that the probability that the source website contains the CDN acceleration node is a third probability value; when the alias does not contain a preset keyword, determining that the probability that the source website contains the CDN acceleration node is a fourth probability value; wherein the second probability value is greater than the fourth probability value, which is greater than the third probability value.
In the embodiment of the present invention, when the target DNS data includes an IP address and an alias (CNAME), this indicates that, when the engine nodes distributed in different location areas sent the domain name of the source website to their corresponding DNS servers for resolution, several different results were resolved for the same domain name, which suggests a high likelihood that the source website contains a CDN acceleration node. Therefore, the target DNS data is analyzed to determine the second number of IP addresses it contains, the hierarchy of the alias, and the keywords contained in the alias, and the probability that the source website contains a CDN acceleration node is determined from this information. Specifically, it is first determined whether the second number of IP addresses contained in the target DNS data is greater than a second preset number threshold. If not, the probability that the source website contains a CDN acceleration node may be directly determined to be 0; if so, it is further determined whether the hierarchy of the alias contained in the target DNS data is greater than a preset hierarchy threshold. When the hierarchy of the alias is smaller than the preset hierarchy threshold (e.g., a preset hierarchy threshold of 2), the probability that the source website contains a CDN acceleration node may be determined to be a third probability value (e.g., 20%). In other words, the third probability value applies when the second number is greater than the second preset number threshold and the hierarchy of the alias is smaller than the preset hierarchy threshold.
And when the hierarchy of the alias included in the target DNS data is greater than a preset hierarchy threshold (for example, the preset hierarchy threshold is 2), further determining whether the alias included in the target DNS data includes a preset keyword, if so, determining that the probability that the source website includes the CDN acceleration node is a second probability value (for example, 90%), otherwise, determining that the probability that the source website includes the CDN acceleration node is a fourth probability value (for example, 40%).
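The branching described above can be sketched as one function. The probability defaults (90%, 20%, 40%) and the hierarchy threshold of 2 are the examples given in the text; the number threshold of 4, the function name, and the handling of values exactly equal to a threshold are assumptions, because the text does not give a value for the second preset number threshold and only describes the "greater than" and "smaller than" cases.

```python
def probability_with_alias(second_number, alias_hierarchy, has_keyword,
                           number_threshold=4, hierarchy_threshold=2,
                           second_p=0.90, third_p=0.20, fourth_p=0.40):
    """Probability that the source website contains a CDN acceleration node
    when the target DNS data holds both IP addresses and aliases."""
    # Assumption: values equal to a threshold take the "not greater" branch.
    if second_number <= number_threshold:
        return 0.0           # too few IP addresses
    if alias_hierarchy <= hierarchy_threshold:
        return third_p       # enough IP addresses, shallow alias
    if has_keyword:
        return second_p      # deep alias containing a preset keyword
    return fourth_p          # deep alias without a preset keyword


print(probability_with_alias(10, 6, True))
print(probability_with_alias(10, 6, False))
print(probability_with_alias(10, 1, True))
print(probability_with_alias(2, 6, True))
```

Keeping the thresholds and probability values as parameters reflects the statement that the embodiment does not limit their specific sizes, only their ordering.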
In the embodiment of the present invention, when the target DNS data contains a plurality of aliases, the hierarchy of each alias is determined, and the maximum value is taken as the hierarchy of the alias contained in the target DNS data. The hierarchy of an alias can be determined from the number of identifiers contained in the alias; the hierarchy equals the number of identifiers. Optionally, the preset keywords include at least one of a DNS keyword, a CDN keyword, and a CACHE keyword.
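One way to derive the hierarchy and keyword features from a list of aliases is sketched below. It assumes that an "identifier" is a dot-separated label of the alias and that a keyword match in any alias counts; the source does not state either point explicitly, and the helper names are hypothetical.

```python
PRESET_KEYWORDS = ("dns", "cdn", "cache")


def alias_hierarchy(alias):
    """Count the dot-separated labels in an alias (assumed meaning of
    'identifiers'), ignoring a trailing dot."""
    return len([label for label in alias.strip(".").split(".") if label])


def alias_features(aliases):
    """Return (maximum hierarchy over all aliases, keyword flag)."""
    hierarchy = max((alias_hierarchy(a) for a in aliases), default=0)
    # Assumption: a preset keyword found in any alias is sufficient.
    has_keyword = any(k in a.lower() for a in aliases for k in PRESET_KEYWORDS)
    return hierarchy, has_keyword


print(alias_features(["www.web.com.cdn.example.net", "a.b"]))
```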
It should be noted that the second probability value, the fourth probability value, and the third probability value are decreased step by step, but the specific sizes of the second probability value, the third probability value, and the fourth probability value are not limited in the embodiment of the present invention. In addition, the second preset number threshold and the preset level threshold are not limited in the embodiment of the present invention.
In some embodiments, the obtaining initial domain name resolution DNS data sent by the at least two engine nodes respectively includes: controlling the at least two engine nodes to send DNS requests to corresponding DNS servers in parallel based on the domain names; and respectively acquiring initial DNS data sent by the at least two engine nodes. Exemplarily, after the domain name of the source website is distributed to the at least two engine nodes, the at least two engine nodes are controlled to send DNS requests to corresponding DNS servers based on the domain name; responding to a DNS request sent by a corresponding engine node by the DNS server, analyzing the domain name, and sending initial DNS data obtained by analysis to the corresponding engine node; and acquiring initial DNS data sent by the at least two engine nodes.
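The parallel dispatch can be illustrated with Python's standard `concurrent.futures` module. In this sketch `query_node` is a hypothetical stand-in that returns canned answers; in a real deployment it would ask a remote engine node to send the DNS request to its nearby DNS server. The node names and addresses are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def query_node(node, domain):
    """Stand-in for asking one engine node to resolve `domain`.

    The canned answers are hypothetical; no network traffic is generated.
    """
    canned = {"north": ["203.0.113.10"], "south": ["203.0.113.20"]}
    return {"node": node, "domain": domain, "ips": canned.get(node, [])}


def collect_initial_dns_data(nodes, domain):
    """Query all engine nodes in parallel and return their results
    in the same order as `nodes`."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        return list(pool.map(lambda n: query_node(n, domain), nodes))


results = collect_initial_dns_data(["north", "south"], "www.web.com")
print(results)
```

Threads suit this workload because each request spends most of its time waiting on the network rather than computing.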
Fig. 5 is a flowchart of a probability determination method in another embodiment of the invention. As shown in Fig. 5, the method includes the following steps:
Step 510, acquiring a domain name of a source website, and distributing the domain name of the source website to at least two engine nodes; wherein the engine nodes are distributed in different location areas.
Step 520, respectively acquiring initial domain name resolution (DNS) data sent by the at least two engine nodes, and summarizing the initial DNS data to generate target DNS data; the initial DNS data is data obtained by a DNS server resolving the domain name in response to a DNS request sent by the corresponding engine node, and the target DNS data comprises an IP address and an alias; the alias is the domain name of a content delivery network (CDN) acceleration node.
Step 530, determining a second number of IP addresses contained in the target DNS data, a hierarchy of the aliases, and keywords contained in the aliases.
Step 540, determining whether the second number is greater than a second preset number threshold; if so, executing step 550, otherwise, executing step 580.
Step 550, determining whether the hierarchy of the alias is greater than a preset hierarchy threshold; if so, executing step 560, otherwise, executing step 590.
Step 560, determining whether the alias contains a preset keyword; if so, executing step 570, otherwise, executing step 5100.
The preset keywords comprise at least one of DNS keywords, CDN keywords and CACHE keywords.
Step 570, determining that the probability that the source website includes the CDN acceleration node is a second probability value.
Step 580, determining that the probability that the CDN acceleration node is included in the source website is 0.
Step 590, determining that the probability that the source website includes the CDN acceleration node is a third probability value.
Step 5100, determining that the probability that the source website includes the CDN acceleration node is a fourth probability value.
Wherein the second probability value is greater than the fourth probability value, which is greater than the third probability value.
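The branching in steps 540 to 5100 can be written as a small decision function. This is a sketch under stated assumptions: the hierarchy threshold of 2 and the 90% and 40% values are the examples given in the description, while the default number threshold of 1 and the third probability value of 0.2 are hypothetical placeholders chosen only to satisfy the required ordering. The function and parameter names are illustrative.

```python
def cdn_probability(ip_count, alias_level, has_keyword,
                    count_threshold=1,    # hypothetical second preset number threshold
                    level_threshold=2,    # example hierarchy threshold from the text
                    p_second=0.9,         # example second probability value
                    p_third=0.2,          # hypothetical third probability value
                    p_fourth=0.4):        # example fourth probability value
    """Follow the flow of Fig. 5 and return the probability for one website."""
    if not (p_second > p_fourth > p_third):
        raise ValueError("expected second > fourth > third probability value")
    if not ip_count > count_threshold:      # step 540 -> step 580
        return 0.0
    if not alias_level > level_threshold:   # step 550 -> step 590
        return p_third
    if not has_keyword:                     # step 560 -> step 5100
        return p_fourth
    return p_second                         # step 570
```

Following the "if so ... otherwise" wording of the flowchart, a value exactly equal to a threshold takes the "otherwise" branch in this sketch.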
The probability determination method provided by the embodiment of the invention addresses the technical problem that, when whether a website includes a CDN acceleration node is determined through self-reporting by CDN enterprises, the website's CDN acceleration nodes may be omitted or falsely reported and the information may not be updated in time. The method can accurately and quickly determine the probability that the source website includes a CDN acceleration node, which helps to subsequently judge, according to the determined probability, whether the source website includes a CDN acceleration node.
Fig. 6 is a schematic structural diagram of a probability determination apparatus according to another embodiment of the present invention. As shown in Fig. 6, the apparatus includes: a domain name acquisition module 610, a DNS data acquisition module 620, and a probability determination module 630, wherein:
a domain name obtaining module 610, configured to obtain a domain name of a source website, and distribute the domain name of the source website to at least two engine nodes; wherein the engine nodes are distributed in different location areas;
a DNS data obtaining module 620, configured to obtain initial domain name resolution (DNS) data sent by the at least two engine nodes, respectively, and summarize the initial DNS data to generate target DNS data; the initial DNS data is data obtained by a DNS server resolving the domain name in response to a DNS request sent by the corresponding engine node;
a probability determining module 630, configured to determine, according to the target DNS data, a probability that the source website includes a CDN acceleration node.
The probability determination device provided by the embodiment of the invention acquires the domain name of a source website and distributes it to at least two engine nodes distributed in different location areas; acquires the initial domain name resolution (DNS) data sent by each of the at least two engine nodes and summarizes the initial DNS data to generate target DNS data, where the initial DNS data is obtained by a DNS server resolving the domain name in response to a DNS request sent by the corresponding engine node; and determines, according to the target DNS data, the probability that the source website includes a CDN acceleration node. This technical scheme addresses the problem that, when whether a website includes a CDN acceleration node is determined through self-reporting by CDN enterprises, the website's CDN acceleration nodes may be omitted or falsely reported and the information may not be updated in time. It can accurately and quickly determine the probability that the source website includes a CDN acceleration node, which helps to subsequently judge, according to the determined probability, whether the source website includes a CDN acceleration node.
Optionally, the target DNS data includes an internet protocol IP address;
the probability determination module is configured to:
determining a first number of IP addresses contained in the target DNS data;
when the first number is larger than a first preset number threshold, determining that the probability that the CDN acceleration node is included in the source website is a first probability value;
when the first number is smaller than the first preset number threshold, determining that the probability that the source website contains the CDN acceleration node is 0.
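The IP-count rule above can be sketched in a few lines. The threshold and the first probability value are not specified in this passage, so the defaults below are hypothetical; counting distinct addresses is also an assumption, as is sending the unaddressed "equal to the threshold" case to 0.

```python
def cdn_probability_by_ip_count(ip_addresses,
                                first_threshold=1,      # hypothetical threshold
                                first_probability=0.5): # hypothetical first probability value
    """Return the first probability value if enough distinct IPs were seen, else 0."""
    first_number = len(set(ip_addresses))  # assumed: duplicates count once
    if first_number > first_threshold:
        return first_probability
    return 0.0
```

The intuition is that a domain resolving to many different addresses from different locations is more likely to be served through CDN acceleration nodes than one that always resolves to the same address.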
Optionally, the target DNS data includes an IP address and an alias; the alias is a domain name of a CDN acceleration node of the content delivery network;
the probability determination module comprises:
an information determination unit, configured to determine a second number of IP addresses contained in the target DNS data, a hierarchy of the aliases, and keywords contained in the aliases;
and a first probability determination unit, configured to determine, according to the second number, the hierarchy of the aliases, and the keywords included in the aliases, a probability that the source website includes the CDN acceleration node.
Optionally, the first probability determining unit is configured to:
when the second number is larger than a second preset number threshold, judging whether the hierarchy of the alias is larger than a preset hierarchy threshold;
when the level of the alias is greater than the preset level threshold value, judging whether the alias contains preset keywords or not;
and when the alias contains a preset keyword, determining that the probability that the source website contains the CDN acceleration node is a second probability value.
Optionally, the apparatus further comprises:
a second probability determining unit, configured to determine that the probability that the source website includes the CDN acceleration node is 0 when the second number is smaller than the second preset number threshold;
a third probability determining unit, configured to determine, when the level of the alias is smaller than the preset level threshold, that a probability that the source website includes a CDN acceleration node is a third probability value;
a fourth probability determining unit, configured to determine, when the alias does not include a preset keyword, that a probability that the source website includes a CDN acceleration node is a fourth probability value;
wherein the second probability value is greater than the fourth probability value, which is greater than the third probability value.
Optionally, the preset keyword includes at least one of a DNS keyword, a CDN keyword, and a CACHE keyword.
Optionally, the DNS data acquiring module is configured to:
controlling the at least two engine nodes to send DNS requests to corresponding DNS servers in parallel based on the domain names;
and respectively acquiring initial DNS data sent by the at least two engine nodes.
The device can execute the methods provided by all the embodiments of the invention, and has corresponding functional modules and beneficial effects for executing the methods. For technical details which are not described in detail in the embodiments of the present invention, reference may be made to the methods provided in all the embodiments of the present invention described above.
Embodiments of the present invention also provide a storage medium containing computer-executable instructions, which when executed by a computer processor, perform a probability determination method, the method comprising:
acquiring a domain name of a source website, and distributing the domain name of the source website to at least two engine nodes; wherein the engine nodes are distributed in different location areas;
respectively acquiring initial domain name resolution (DNS) data sent by the at least two engine nodes, summarizing the initial DNS data, and generating target DNS data; the initial DNS data is data obtained by a DNS server resolving the domain name in response to a DNS request sent by the corresponding engine node;
and determining the probability that the source website contains the CDN acceleration node according to the target DNS data.
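The summarizing step, which turns the per-node initial DNS data into target DNS data, might look like the following. The description does not say how the data is merged, so this sketch assumes a simple union with duplicates removed; the dictionary layout with `ips` and `aliases` keys is hypothetical.

```python
def summarize_dns_data(initial_dns_data):
    """Merge the answers from all engine nodes into one target record."""
    ips, aliases = set(), set()
    for answer in initial_dns_data:
        ips.update(answer.get("ips", []))
        # Normalize aliases so the same name from two nodes is counted once.
        aliases.update(alias.rstrip(".").lower()
                       for alias in answer.get("aliases", []))
    return {"ips": sorted(ips), "aliases": sorted(aliases)}


target = summarize_dns_data([
    {"ips": ["192.0.2.1", "198.51.100.7"], "aliases": ["Edge.CDN.example.net."]},
    {"ips": ["198.51.100.7", "203.0.113.9"], "aliases": ["edge.cdn.example.net"]},
])
```

With this merge, the number of IP addresses in the target data reflects how many distinct addresses the domain resolved to across locations, which is the quantity the later threshold comparison relies on.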
Storage medium: any of various types of memory devices or storage devices. The term "storage medium" is intended to include: installation media such as CD-ROM, floppy disk, or tape devices; computer system memory or random access memory such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc.; non-volatile memory such as flash memory, magnetic media (e.g., a hard disk), or optical storage; registers or other similar types of memory elements, etc. The storage medium may also include other types of memory or combinations thereof. In addition, the storage medium may be located in a first computer system in which the program is executed, or may be located in a different second computer system connected to the first computer system through a network (such as the internet). The second computer system may provide program instructions to the first computer system for execution. The term "storage medium" may include two or more storage media that may reside in different locations, such as in different computer systems that are connected via a network. The storage medium may store program instructions (e.g., embodied as a computer program) that are executable by one or more processors.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the probability determination operation described above, and may also perform related operations in the probability determination method provided by any embodiment of the present invention.
The embodiment of the invention provides a server, and the server can be integrated with the probability determination device provided by the embodiment of the invention. Fig. 7 is a block diagram of a server according to an embodiment of the present invention. The server 700 may include: a memory 701, a processor 702 and a computer program stored on the memory 701 and executable by the processor, wherein the processor 702 implements the probability determination method according to the embodiment of the present invention when executing the computer program.
The server provided by the embodiment of the invention acquires the domain name of a source website and distributes it to at least two engine nodes distributed in different location areas; acquires the initial domain name resolution (DNS) data sent by each of the at least two engine nodes and summarizes the initial DNS data to generate target DNS data, where the initial DNS data is obtained by a DNS server resolving the domain name in response to a DNS request sent by the corresponding engine node; and determines, according to the target DNS data, the probability that the source website includes a CDN acceleration node. This technical scheme addresses the problem that, when whether a website includes a CDN acceleration node is determined through self-reporting by CDN enterprises, the website's CDN acceleration nodes may be omitted or falsely reported and the information may not be updated in time. It can accurately and quickly determine the probability that the source website includes a CDN acceleration node, which helps to subsequently judge, according to the determined probability, whether the source website includes a CDN acceleration node.
The probability determination device, the storage medium and the server provided in the above embodiments may execute the probability determination method provided in any embodiment of the present invention, and have corresponding functional modules and beneficial effects for executing the method. For technical details that are not described in detail in the above embodiments, reference may be made to the probability determination method provided in any embodiment of the present invention.
It should be noted that the foregoing only describes exemplary embodiments of the invention and the technical principles employed. Those skilled in the art will appreciate that the present invention is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements, and substitutions can be made by those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to them and may include other equivalent embodiments without departing from the spirit of the invention; the scope of the present invention is determined by the scope of the appended claims.

Claims (7)

1. A method for probability determination, comprising:
acquiring a domain name of a source website, and distributing the domain name of the source website to at least two engine nodes; wherein the engine nodes are distributed in different location areas;
respectively acquiring initial domain name resolution (DNS) data sent by the at least two engine nodes, summarizing the initial DNS data, and generating target DNS data; the initial DNS data is data obtained by a DNS server resolving the domain name in response to a DNS request sent by the corresponding engine node;
determining the probability that the source website contains CDN acceleration nodes according to the target DNS data;
the target DNS data comprises an IP address and an alias; the alias is a domain name of a CDN acceleration node of the content delivery network;
wherein determining the probability that the source website contains the CDN acceleration node according to the target DNS data comprises:
determining a second number of IP addresses contained in the target DNS data, a hierarchy of aliases, and keywords contained in the aliases;
determining the probability that the CDN acceleration node is contained in the source website according to the second number, the hierarchy of the aliases and the keywords contained in the aliases;
wherein determining the probability that the CDN acceleration node is contained in the source website according to the second number, the hierarchy of the aliases and the keywords contained in the aliases comprises:
when the second number is larger than a second preset number threshold, judging whether the hierarchy of the alias is larger than a preset hierarchy threshold;
when the hierarchy of the alias is larger than the preset hierarchy threshold value, judging whether the alias contains preset keywords or not;
when the alias contains a preset keyword, determining that the probability that the source website contains the CDN acceleration node is a second probability value;
further comprising:
when the second number is smaller than the second preset number threshold, determining that the probability that the CDN acceleration node is included in the source website is 0;
when the hierarchy of the alias is smaller than the preset hierarchy threshold value, determining that the probability that the source website contains the CDN acceleration node is a third probability value;
when the alias does not contain a preset keyword, determining that the probability that the source website contains the CDN acceleration node is a fourth probability value;
wherein the second probability value is greater than the fourth probability value, which is greater than the third probability value.
2. The method of claim 1, wherein the target DNS data includes an internet protocol IP address;
wherein determining the probability that the source website contains the CDN acceleration node according to the target DNS data comprises:
determining a first number of IP addresses contained in the target DNS data;
when the first number is larger than a first preset number threshold, determining that the probability that the CDN acceleration node is included in the source website is a first probability value;
when the first number is smaller than the first preset number threshold, determining that the probability that the source website contains the CDN acceleration node is 0.
3. The method of claim 1, wherein the predetermined keywords comprise at least one of DNS keywords, CDN keywords, and CACHE keywords.
4. The method according to claim 1, wherein obtaining initial domain name resolution (DNS) data sent by the at least two engine nodes respectively comprises:
controlling the at least two engine nodes to send DNS requests to corresponding DNS servers in parallel based on the domain names;
and respectively acquiring initial DNS data sent by the at least two engine nodes.
5. A probability determination device, comprising:
the domain name acquisition module is used for acquiring a domain name of a source website and distributing the domain name of the source website to at least two engine nodes; wherein the engine nodes are distributed in different location areas;
the DNS data acquisition module is used for respectively acquiring initial domain name resolution (DNS) data sent by the at least two engine nodes, summarizing the initial DNS data and generating target DNS data; the initial DNS data is data obtained by a DNS server resolving the domain name in response to a DNS request sent by the corresponding engine node;
a probability determination module, configured to determine, according to the target DNS data, a probability that the source website includes a CDN acceleration node;
the target DNS data comprises an IP address and an alias; the alias is a domain name of a CDN acceleration node of the content delivery network;
the probability determination module comprises:
an information determination unit, configured to determine a second number of IP addresses contained in the target DNS data, a hierarchy of the aliases, and keywords contained in the aliases;
a first probability determination unit, configured to determine, according to the second number, a hierarchy of the aliases, and keywords included in the aliases, a probability that the CDN acceleration node is included in the source website;
the first probability determination unit to:
when the second number is larger than a second preset number threshold, judging whether the hierarchy of the alias is larger than a preset hierarchy threshold;
when the hierarchy of the alias is larger than the preset hierarchy threshold value, judging whether the alias contains preset keywords or not;
when the alias contains a preset keyword, determining that the probability that the source website contains the CDN acceleration node is a second probability value;
the device further comprises:
a second probability determining unit, configured to determine that the probability that the source website includes the CDN acceleration node is 0 when the second number is smaller than the second preset number threshold;
a third probability determining unit, configured to determine, when the hierarchy of the alias is smaller than the preset hierarchy threshold, that the probability that the source website includes the CDN acceleration node is a third probability value;
a fourth probability determining unit, configured to determine, when the alias does not include a preset keyword, that the probability that the source website includes the CDN acceleration node is a fourth probability value;
wherein the second probability value is greater than the fourth probability value, which is greater than the third probability value.
6. A computer-readable medium, on which a computer program is stored, characterized in that the program, when being executed by a processing means, carries out the probability determination method as claimed in any one of claims 1-4.
7. A server comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the computer program implements the probability determination method according to any of claims 1-4.
CN202110139340.2A 2021-02-01 2021-02-01 Probability determination method and device, storage medium and server Active CN112968980B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110139340.2A CN112968980B (en) 2021-02-01 2021-02-01 Probability determination method and device, storage medium and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110139340.2A CN112968980B (en) 2021-02-01 2021-02-01 Probability determination method and device, storage medium and server

Publications (2)

Publication Number Publication Date
CN112968980A CN112968980A (en) 2021-06-15
CN112968980B true CN112968980B (en) 2023-04-18

Family

ID=76273072

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110139340.2A Active CN112968980B (en) 2021-02-01 2021-02-01 Probability determination method and device, storage medium and server

Country Status (1)

Country Link
CN (1) CN112968980B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106603734A (en) * 2015-10-16 2017-04-26 任子行网络技术股份有限公司 CDN service IP detection method and system
CN107342913A (en) * 2017-05-24 2017-11-10 恒安嘉新(北京)科技股份公司 The detection method and device of a kind of CDN node
CN110912769A (en) * 2019-11-12 2020-03-24 中移(杭州)信息技术有限公司 CDN cache hit rate statistical method, system, network device and storage medium
CN111277461A (en) * 2020-01-19 2020-06-12 杭州安恒信息技术股份有限公司 Method, system and equipment for identifying content distribution network node

Also Published As

Publication number Publication date
CN112968980A (en) 2021-06-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant