WO2020199029A1 - Data processing method and apparatus therefor - Google Patents

Data processing method and apparatus therefor

Info

Publication number
WO2020199029A1
Authority
WO
WIPO (PCT)
Prior art keywords
domain name
domain
database
level
category
Prior art date
Application number
PCT/CN2019/080652
Other languages
French (fr)
Chinese (zh)
Inventor
莫邵文
肖艳光
向展
周东波
招伟俊
黄芷然
邝继欧
张伟
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to PCT/CN2019/080652 priority Critical patent/WO2020199029A1/en
Priority to CN201980093696.8A priority patent/CN113545020B/en
Publication of WO2020199029A1 publication Critical patent/WO2020199029A1/en

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00: Network arrangements, protocols or services for addressing or naming

Definitions

  • This application relates to the field of computer technology, and in particular to a data processing method and device.
  • The Internet plays an increasingly important role in modern life, but while it brings convenience, it also has many negative effects. For example, a large amount of content containing violent, reactionary, and other harmful information floods cyberspace and keeps growing, threatening the physical and mental health of students; in addition, many students spend long periods online, which can cause mental and physical illness and affect healthy growth.
  • Green Internet technology solutions can effectively control the time students spend online and restrict their access to unhealthy websites.
  • The idea of the green Internet technology solution mainly includes: the green cloud server judges, according to the uniform resource locators (URLs) in black and white lists, whether the website corresponding to the URL in a user request is an unhealthy website, and prevents users from accessing unhealthy websites.
  • The embodiments of the present application provide a data processing method and device, which can effectively reduce the load of a domain name classification server, thereby helping to increase the speed of website access.
  • In a first aspect, an embodiment of the present application provides a data processing method.
  • The method includes: receiving an access request sent by a terminal device, the access request including a first domain name; obtaining the transfer probability of the first domain name; if the transfer probability of the first domain name is greater than the preset probability threshold, sending the first domain name to the domain name classification server and receiving the category of the first domain name sent by the domain name classification server; and if the category of the first domain name is the allowed access category, obtaining the Internet Protocol address corresponding to the first domain name and sending the Internet Protocol address to the terminal device.
  • The transfer probability of the first domain name can be used to characterize the probability that the category of the first domain name exists in the domain name classification server. On the one hand, sending the first domain name to the domain name classification server when its transfer probability is greater than the preset probability threshold increases the probability of successfully obtaining the category of the first domain name from the domain name classification server. On the other hand, it avoids the situation where the first domain name is sent to the domain name classification server when its transfer probability is less than or equal to the preset probability threshold but the category of the first domain name cannot be queried, which helps reduce the load of the domain name classification server and improve the speed of website access.
  • The gateway device may have a domain name database, and the domain name database may include multiple domain names and the category of each of the multiple domain names. A specific implementation of sending the first domain name to the domain name classification server if its transfer probability is greater than the preset probability threshold may be: if the transfer probability of the first domain name is greater than the preset probability threshold, and the first domain name does not exist in the domain name database, the first domain name is sent to the domain name classification server.
  • The first domain name is sent to the domain name classification server only when both conditions hold: its transfer probability is greater than the preset probability threshold, and it does not exist in the domain name database. This avoids obtaining the category of the first domain name through the domain name classification server when the first domain name and its category already exist in the domain name database. It reduces the communication traffic between the gateway device and the domain name classification server, which helps reduce the load of the domain name classification server and thereby helps improve the speed of website access.
  • the method may further include: associating and storing the category of the first domain name and the first domain name in a domain name database.
  • In this way, the category of the first domain name can be obtained directly from the domain name database when an access request including the first domain name is received again.
  • The method may further include: if the second domain name does not exist among the first-level domain names included in the domain names in the domain name database, obtaining the number of first-level domain names included in all domain names in the domain name database; if the number is greater than or equal to the preset number threshold, obtaining the storage value of the first-level domain name included in each domain name in the domain name database; and deleting the target first-level domain name and all subdomains of the target first-level domain name from the domain name database.
  • The target first-level domain name is the first-level domain name with the lowest storage value among the first-level domain names included in all domain names in the domain name database. A specific implementation of associating and storing the category of the first domain name and the first domain name in the domain name database may be: the second domain name is used as the first-level domain name, the third domain name is used as a subdomain of the second domain name, and the second domain name, the third domain name, and the category of the first domain name are associated and stored in the domain name database.
  • In this way, the target first-level domain name with the lowest storage value and all of its subdomains are deleted from the domain name database; the second domain name is then used as the first-level domain name, the third domain name is used as a subdomain of the second domain name, and the second domain name, the third domain name, and the category of the first domain name are stored in the domain name database.
  • When the gateway device subsequently receives an access request including the first domain name (composed of the second domain name and the third domain name) again, it can obtain the category of the first domain name directly from its own domain name database without going through the domain name classification server. This reduces the communication traffic between the gateway device and the domain name classification server, which helps reduce the load of the domain name classification server and thereby helps improve website access speed.
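  • The eviction-then-store behavior described above can be sketched as follows. This is a minimal illustration only: the nested-dictionary layout, the function name `store_category`, and the `storage_value` mapping are assumptions, and this passage does not specify how a storage value is computed.

```python
def store_category(db, storage_value, first_level, subdomain, category, max_first_level):
    """Store a category under first_level -> subdomain, evicting when the database is full.

    db maps a first-level domain to {subdomain: category} (assumed layout).
    storage_value maps a first-level domain to a score (hypothetical; the
    scoring rule is not given in this passage).
    """
    if first_level not in db:
        if db and len(db) >= max_first_level:
            # Target: the first-level domain with the lowest storage value.
            target = min(db, key=lambda name: storage_value.get(name, 0))
            del db[target]  # removes the target together with all of its subdomains
        db[first_level] = {}
    db[first_level][subdomain] = category


db = {
    "example.com": {"www": "allowed"},
    "example.org": {"www": "allowed", "mail": "allowed"},
}
values = {"example.com": 5, "example.org": 1}
# "www.example.net" is split into first-level "example.net" and subdomain "www".
store_category(db, values, "example.net", "www", "forbidden", max_first_level=2)
```

  • Here "example.org" has the lowest storage value, so it and both of its subdomains are removed before "example.net" is stored.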
  • The method may further include: upon detecting that the gateway device is powered on, sending a data initialization request to the domain name classification server, where the data initialization request is used to request domain name data, and the domain name data includes a domain name set and the category of each domain name in the domain name set; and receiving the domain name data and storing the domain name data in the domain name database. The domain names in the domain name set are determined by the domain name classification server according to the access time, access duration, and/or access frequency of each domain name. Each domain name in the domain name set includes a first-level domain name and a subdomain of the first-level domain name.
  • When the gateway device is powered on again, the data stored in the domain name database may have been lost. By sending a data initialization request to the domain name classification server to request domain name data, the initialization of the domain name database can be completed automatically. This helps improve the hit rate of domain names in the domain name database and reduces the number of access requests whose domain name category must be obtained through the domain name classification server, thereby helping to reduce the load of the domain name classification server.
  • an embodiment of the present application provides a data processing device, which has the function of implementing the method described in the first aspect.
  • the function can be realized by hardware, or by hardware executing corresponding software.
  • the hardware or software includes one or more units corresponding to the above functions.
  • an embodiment of the present application provides a gateway device.
  • the gateway device includes a memory and a processor.
  • the memory stores program instructions.
  • the processor is connected to the memory via a bus.
  • The processor calls the program instructions stored in the memory so that the gateway device executes the method described in the first aspect.
  • an embodiment of the present application provides a computer-readable storage medium for storing computer program instructions used by the data processing device described in the second aspect, including instructions for executing the program involved in the first aspect.
  • an embodiment of the present application provides a computer program product, which includes a program, which implements the method described in the first aspect when the program is executed.
  • FIG. 1 is a schematic diagram of the architecture of a communication system provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a data processing method provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of another data processing method provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of another data processing method provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a scenario where there is a hierarchical storage relationship between a first-level domain name and a subdomain name provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a data processing device provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a gateway device provided by an embodiment of the present application.
  • the embodiments of the present application provide a data processing method and device, which can effectively reduce the load of a domain name classification server, thereby helping to increase the speed of website access.
  • FIG. 1 is a schematic diagram of the architecture of a communication system provided by an embodiment of this application.
  • the system may include: a terminal device 101, a gateway device 102, and a domain name classification server 103.
  • the terminal device 101 is mainly used to generate an access request and send the access request to the gateway device 102.
  • the access request may include a domain name, and the access request is used to obtain content recorded in a website corresponding to the domain name.
  • a domain name is the name of a computer or group of computers on a network consisting of a string of names separated by dots, used to identify the electronic location of the computer during data transmission.
  • the terminal device 101 may generate an access request according to the domain name in the URL.
  • the terminal device 101 may be a user equipment (user equipment, UE), a remote terminal, a mobile terminal, a wireless communication device, a user device, or the like.
  • The gateway device 102 is mainly used to determine whether to send the domain name to the domain name classification server 103 according to whether the transfer probability of the domain name in the access request is greater than a preset probability threshold. After sending the domain name to the domain name classification server 103, it receives the category of the domain name sent by the domain name classification server 103. Then, when the category of the domain name is an allowed access category, it obtains the Internet Protocol address corresponding to the domain name and sends the Internet Protocol address to the terminal device 101, so that the terminal device 101 can obtain the content stored in the storage device corresponding to the Internet Protocol address.
  • The gateway device 102 may send the domain name to the domain name classification server 103 when the transfer probability of the domain name is greater than the preset probability threshold, and not send the domain name to the domain name classification server 103 when the transfer probability of the domain name is less than or equal to the preset probability threshold.
  • the domain name classification server 103 stores a large number of domain names and categories of various domain names, and the category of each domain name can be an access allowed category or an access prohibited category.
  • When the category of the domain name is an allowed access category, it indicates that the website corresponding to the domain name is a healthy website; when the category of the domain name is a prohibited access category, it indicates that the website corresponding to the domain name is an unhealthy website.
  • The domain names stored in the domain name classification server 103 may be crawled from multiple websites by the domain name classification server 103 through a web crawler, and the category of each domain name may be obtained by the domain name classification server 103 by analyzing the content of the webpage corresponding to the domain name.
  • the website corresponding to the domain name stored in the domain name classification server 103 may be a website with relatively high traffic or a relatively common website.
  • The transfer probability of a domain name may be used to characterize the probability that the category of the domain name is stored in the domain name classification server 103. For example, for the domain names in access requests generated by some background plug-ins running in the terminal device 101, the corresponding websites are relatively uncommon and have low traffic. Therefore, the probability of these domain names being crawled by the domain name classification server 103 is low. Correspondingly, the probability that the categories of these domain names are stored in the domain name classification server 103 is low, and so is the probability that the gateway device 102 obtains the categories of these domain names by sending them to the domain name classification server 103.
  • Therefore, in this embodiment of the present application, the gateway device 102 sends the domain name to the domain name classification server 103 only when the transfer probability of the domain name is greater than the preset probability threshold. This avoids the situation where the domain name is sent to the domain name classification server 103 when its transfer probability is less than or equal to the preset probability threshold but the category of the domain name cannot be queried, and thus helps reduce the load of the domain name classification server 103.
  • The gateway device 102 may be an optical network terminal (ONT), an optical network unit (ONU), or an intelligent gateway device with a routing function, etc.
  • the domain name classification server 103 may be a physical server or a cloud server (such as a green cloud server in the green Internet technical solution), which is not limited in the embodiment of the present application.
  • The above-mentioned communication system including one terminal device 101 is only an example. The communication system may also include 2, 3, or another number of terminal devices, which is not limited in this embodiment of the application.
  • FIG. 2 is a schematic flowchart of a data processing method provided by an embodiment of the present application. The method may include but is not limited to the following steps:
  • Step S201 The gateway device receives the access request sent by the terminal device, where the access request includes the first domain name.
  • the gateway device may parse the access request to obtain the first domain name.
  • the access request may be generated by the terminal device according to a user operation, or the access request may be generated by a background plug-in in the terminal device, which is not limited in the embodiment of the present application.
  • The gateway device can receive access requests sent by one or more terminal devices, and the gateway device processes the access requests sent by each terminal device in the same way. In this embodiment of the application, the case where the gateway device receives an access request sent by one terminal device is taken as an example.
  • Step S202 The gateway device obtains the transfer probability of the first domain name.
  • After the gateway device obtains the first domain name, it needs to obtain the category of the first domain name.
  • multiple domain names and categories of each domain name may be stored in the domain name classification server.
  • The websites corresponding to the domain names stored in the domain name classification server may be websites with relatively high traffic or relatively common websites.
  • the category of the first domain name may or may not exist in the domain name classification server.
  • The transfer probability of the first domain name may be used to characterize the probability that the category of the first domain name exists in the domain name classification server.
  • In an implementation manner, the gateway device may determine whether to send the first domain name to the domain name classification server according to whether the transfer probability of the first domain name is greater than the preset probability threshold. Specifically, the gateway device sends the first domain name to the domain name classification server when the transfer probability of the first domain name is greater than the preset probability threshold, and does not send the first domain name to the domain name classification server when the transfer probability of the first domain name is less than or equal to the preset probability threshold.
  • The first domain name may include a character string. A specific implementation manner for the gateway device to obtain the transfer probability of the first domain name may be: obtain each character pair in the character string and the transition probability of each character pair, obtain the transition probability of the character string according to the transition probability of each character pair, and take the transition probability of the character string as the transfer probability of the first domain name.
  • The transition probability of each character pair can be obtained by training a character conversion model based on a Markov chain.
  • In mathematics, a Markov chain is a discrete-event random process with the Markov property. At each step, the system can change from one state to another according to a probability distribution, or it can remain in the current state.
  • the change of state is called transition, and the probability related to different state changes is called transition probability.
  • the transition probability of the character pair "ab” is the probability that the next character of the character "a” is "b".
  • The training process of the character conversion model based on a Markov chain is as follows: obtain a large number of domain names, split each domain name to obtain the character pairs in each domain name, count the number of occurrences of each character pair, and standardize the total number of occurrences of each character pair (for example, by normalization) to obtain the transition probability of each character pair.
  • The transition probability of each character pair included in the first domain name is obtained according to the character conversion model based on a Markov chain, and the product of the transition probabilities of the character pairs included in the first domain name is taken as the transition probability of the character string included in the first domain name. For example, when the character string included in the first domain name is "abc", the transition probability of the character pair "ab" is p1, and the transition probability of the character pair "bc" is p2, the transition probability of the character string "abc" is p1*p2.
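  • The training and scoring steps above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, normalizing counts per leading character is one assumed reading of "normalization", and the small floor value for character pairs never seen in training is also an assumption.

```python
from collections import defaultdict


def train_transition_probs(domains):
    """Count adjacent character pairs and normalize the counts per leading character."""
    counts = defaultdict(lambda: defaultdict(int))
    for name in domains:
        for a, b in zip(name, name[1:]):
            counts[a][b] += 1
    probs = {}
    for a, following in counts.items():
        total = sum(following.values())
        for b, n in following.items():
            probs[a + b] = n / total  # estimated probability that b follows a
    return probs


def transfer_probability(name, probs, unseen=1e-6):
    """Multiply the transition probabilities of all adjacent character pairs in name."""
    p = 1.0
    for a, b in zip(name, name[1:]):
        p *= probs.get(a + b, unseen)  # assumed floor for pairs absent from training
    return p


# Tiny illustrative training set; a real model would use a large list of domain names.
probs = train_transition_probs(["abc", "abd", "abc", "acb"])
p_abc = transfer_probability("abc", probs)  # p("ab") * p("bc")
```

  • With this training set, "b" follows "a" in 3 of 4 cases and "c" follows "b" in 2 of 3 cases, so the score of "abc" is 3/4 * 2/3.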
  • Step S203 If the transfer probability of the first domain name is greater than the preset probability threshold, the gateway device sends the first domain name to the domain name classification server.
  • The gateway device can determine whether to send the first domain name to the domain name classification server according to whether the transfer probability of the first domain name is greater than the preset probability threshold. By sending the first domain name to the domain name classification server, the category of the first domain name can be queried in the domain name classification server.
  • The category of the first domain name may or may not exist in the domain name classification server.
  • A transfer probability greater than the preset probability threshold indicates that the category of the first domain name has a higher probability of being in the domain name classification server. In this case, sending the first domain name to the domain name classification server makes it more likely that the query returns the category of the first domain name.
  • A transfer probability less than or equal to the preset probability threshold indicates that the probability that the category of the first domain name exists in the domain name classification server is low. In this case, the gateway device may skip the domain name classification server, directly obtain the Internet Protocol address corresponding to the first domain name, and send the obtained Internet Protocol address to the terminal device; or the gateway device may skip the domain name classification server and directly ignore or delete the access request.
  • the preset probability threshold may be set by the gateway device by default, or may be set by the gateway device according to user operations, which is not limited in the embodiment of the present application.
  • the gateway device may calculate and count the transfer probabilities of a large number of common domain names, and use the average value of the transfer probabilities of each common domain name as the preset probability threshold.
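  • As a small sketch of the averaging approach just described, the threshold can be derived from the transfer probabilities of common domain names and then used in the strict greater-than comparison. The function names and the sample probabilities are illustrative assumptions.

```python
from statistics import mean


def preset_threshold(common_domain_probs):
    """Use the average transfer probability of common domain names as the threshold."""
    return mean(common_domain_probs)


def should_query_classifier(transfer_prob, threshold):
    """Send to the classification server only when strictly above the threshold."""
    return transfer_prob > threshold


# Illustrative transfer probabilities of three common domain names.
threshold = preset_threshold([0.25, 0.5, 0.75])
```

  • A domain name whose transfer probability equals the threshold is not sent, matching the "less than or equal to" case in the text.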
  • Step S204 The domain name classification server queries the category of the first domain name.
  • The domain name classification server can query whether the first domain name exists. If the first domain name exists, the server continues to query the category of the first domain name, which can be the allowed access category or the prohibited access category. If the first domain name does not exist, the server sends a query failure message to the gateway device; the query failure message is used to indicate that the first domain name or the category of the first domain name does not exist in the domain name classification server.
  • Step S205 The domain name classification server sends the category of the first domain name to the gateway device.
  • The domain name classification server can send the category of the first domain name to the gateway device, so that after receiving the category of the first domain name, the gateway device can process the received access request differently according to the category.
  • Step S206 If the category of the first domain name is the allowed access category, the gateway device obtains the Internet Protocol address corresponding to the first domain name, and sends the Internet Protocol address to the terminal device.
  • After the gateway device receives the category of the first domain name sent by the domain name classification server, if the category of the first domain name is an allowed access category, it indicates that the website corresponding to the first domain name is a healthy website. At this time, the gateway device can obtain the Internet Protocol address corresponding to the first domain name and send the Internet Protocol address to the terminal device, so that the terminal device can obtain the content stored in the storage device corresponding to the Internet Protocol address. In one implementation, if the category of the first domain name is the prohibited access category, it indicates that the website corresponding to the first domain name is an unhealthy website. At this time, the gateway device can ignore or delete the access request corresponding to the first domain name, to prevent the terminal device from obtaining content from unhealthy websites that could affect the user's mental health.
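  • Steps S202 to S206 can be put together in a short sketch. The function name, the category strings, and the `classify` and `resolve` callables are hypothetical stand-ins for the domain name classification server and for address lookup. For a low transfer probability this sketch takes the first of the two options described above (resolve directly); treating a query failure like a prohibited result is an assumption, since the text does not state how the gateway handles it.

```python
ALLOWED = "allowed"
FORBIDDEN = "forbidden"


def handle_access_request(domain, transfer_prob, threshold, classify, resolve):
    """Return an IP address for the terminal device, or None if the request is dropped."""
    if transfer_prob <= threshold:
        # Low chance that the classifier knows this domain: skip it and resolve directly.
        return resolve(domain)
    category = classify(domain)  # may return None on a query failure (assumed)
    if category == ALLOWED:
        return resolve(domain)
    return None  # prohibited category (or, by assumption, query failure): drop


# Stand-in data using reserved example domains and documentation addresses.
categories = {"example.com": ALLOWED, "example.org": FORBIDDEN}
addresses = {"example.com": "192.0.2.10", "example.org": "192.0.2.20"}
allowed_ip = handle_access_request("example.com", 0.9, 0.5, categories.get, addresses.get)
blocked_ip = handle_access_request("example.org", 0.9, 0.5, categories.get, addresses.get)
```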
  • A specific implementation manner for the gateway device to obtain the Internet Protocol address corresponding to the first domain name may be: the gateway device queries the Internet Protocol address corresponding to the first domain name in the Domain Name System (DNS) cache.
  • A specific implementation manner for the gateway device to obtain the Internet Protocol address corresponding to the first domain name may also be: the gateway device sends the first domain name to the domain name resolution server, so that the domain name resolution server queries the Internet Protocol address corresponding to the first domain name, and the gateway device receives the Internet Protocol address sent by the domain name resolution server.
  • A specific implementation manner for the gateway device to obtain the Internet Protocol address corresponding to the first domain name may also be: the gateway device queries in the DNS cache whether the Internet Protocol address corresponding to the first domain name exists; if it does not exist in the DNS cache, the gateway device sends the first domain name to the domain name resolution server, so that the domain name resolution server queries the Internet Protocol address corresponding to the first domain name, and the gateway device receives the Internet Protocol address sent by the domain name resolution server.
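  • The third variant above (check the DNS cache first, then fall back to the domain name resolution server) can be sketched as follows. The dictionary cache and the `query_resolver` callable are hypothetical stand-ins, and writing the resolver's answer back into the cache is an assumption that the text does not state.

```python
def lookup_ip(domain, dns_cache, query_resolver):
    """Return the cached address if present, otherwise ask the resolution server."""
    ip = dns_cache.get(domain)
    if ip is None:
        ip = query_resolver(domain)
        if ip is not None:
            dns_cache[domain] = ip  # assumed: remember the answer for later requests
    return ip


resolver_calls = []


def query_resolver(domain):
    """Stand-in for the domain name resolution server."""
    resolver_calls.append(domain)
    return {"example.com": "192.0.2.10"}.get(domain)


dns_cache = {}
first_answer = lookup_ip("example.com", dns_cache, query_resolver)
second_answer = lookup_ip("example.com", dns_cache, query_resolver)  # served from cache
```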
  • The embodiment of the present application filters by domain name. Because the data packet does not need to be unpacked, the unpacking overhead on the gateway device can be effectively reduced.
  • As long as the first domain name included in the access request meets the interception requirements (for example, the category of the first domain name is the prohibited access category), the embodiment of the present application can intercept traffic of any protocol and on any port.
  • the aforementioned access request may also include the identification of the terminal device.
  • the gateway device may also determine whether the identification of the terminal device is a preset identification. If the identification of the terminal device is the preset identification, step S202 is triggered (ie, the transfer probability of the first domain name is acquired); if the identification of the terminal device is not the preset identification, the acquisition of the Internet Protocol address corresponding to the first domain name is triggered, and The step of sending the Internet Protocol address to the terminal device.
  • the identification of the terminal device is used to uniquely identify a terminal device.
  • the identification of the terminal device may be the unique identification code of the terminal device or the physical address of the terminal device, which is not limited in the embodiment of this application.
  • The preset identifier may be a preset identifier of a terminal device whose access requests need to be restricted. In one implementation, when applied to a green Internet technical solution, the preset identifier may be the identifier of a minor's terminal device.
  • This embodiment of the application determines whether the identifier of the terminal device is a preset identifier, and triggers the step of obtaining the transfer probability of the first domain name only when it is. This avoids triggering that step for access requests from a terminal device whose identifier is not the preset identifier, that is, a terminal device that should not be restricted (such as the terminal device of an adult or parent).
  • the preset identifier may be set by the gateway device according to user operations.
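  • The identifier check can be sketched as a simple set lookup that decides which path an access request takes. The function name, the path labels, and the sample physical addresses are illustrative assumptions.

```python
def route_request(device_id, restricted_ids):
    """Choose the processing path for an access request based on the device identifier."""
    if device_id in restricted_ids:
        return "filter"  # continue with step S202: obtain the transfer probability
    return "resolve"     # unrestricted device: obtain and return the IP address directly


# Hypothetical preset identifiers, e.g. physical addresses of minors' devices.
restricted_ids = {"00:00:5e:00:53:01"}
child_path = route_request("00:00:5e:00:53:01", restricted_ids)
parent_path = route_request("00:00:5e:00:53:02", restricted_ids)
```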
  • In this embodiment, the first domain name is sent to the domain name classification server only when the transfer probability of the first domain name is greater than the preset probability threshold. This avoids the situation where the first domain name is sent to the domain name classification server when its transfer probability is less than or equal to the preset probability threshold but the category of the first domain name cannot be queried, thereby helping to reduce the load of the domain name classification server and improve website access speed.
  • FIG. 3 is a schematic flowchart of another data processing method provided by an embodiment of the present application. This method elaborates that the first domain name is sent to the domain name classification server only when the transfer probability of the first domain name is greater than the preset probability threshold and the first domain name does not exist in the domain name database. The method includes but is not limited to the following steps:
  • Step S301 The gateway device receives the access request sent by the terminal device, where the access request includes the first domain name.
  • Step S302 The gateway device obtains the transfer probability of the first domain name.
  • For the execution of step S301 and step S302, refer to the descriptions of step S201 and step S202 in FIG. 2, respectively; details are not repeated here.
  • the gateway device may have a domain name database, and the domain name database may include multiple domain names and categories of each of the multiple domain names.
  • Step S303 If the transfer probability of the first domain name is greater than the preset probability threshold, and the first domain name does not exist in the domain name database, the gateway device sends the first domain name to the domain name classification server.
  • After the gateway device receives the access request sent by the terminal device, it can determine whether the transfer probability of the first domain name is greater than the preset probability threshold and whether the first domain name exists in the domain name database. Only when both conditions hold, that is, the transfer probability is greater than the preset probability threshold and the first domain name does not exist in the domain name database, is the first domain name sent to the domain name classification server. This avoids the situation in which the gateway device obtains the category of the first domain name through the domain name classification server even though the first domain name and its category already exist in the domain name database. It reduces the communication traffic between the gateway device and the domain name classification server, which reduces the load of the domain name classification server and thereby helps to improve website access speed.
  • The domain name database stores fewer domain names than the domain name classification server, so the probability of obtaining the category of the first domain name from the domain name database is lower than the probability of obtaining it from the domain name classification server. That is, when the first domain name does not exist in the domain name database, sending the first domain name to the domain name classification server allows the category of the first domain name to be queried there.
  • The gateway device may first determine whether the transfer probability of the first domain name is greater than the preset probability threshold, and only when it is greater determine whether the first domain name exists in the domain name database. When the transfer probability of the first domain name is less than or equal to the preset probability threshold, the gateway device can trigger the step of obtaining the Internet Protocol address corresponding to the first domain name and sending the Internet Protocol address to the terminal device, or the gateway device may ignore or delete the access request corresponding to the first domain name.
  • Alternatively, the gateway device may first determine whether the first domain name exists in the domain name database, and only when it does not exist determine whether the transfer probability of the first domain name is greater than the preset probability threshold. When the first domain name exists in the domain name database, the gateway device can query the domain name database for the category of the first domain name. If the category of the first domain name is the allowed access category, the gateway device triggers the step of obtaining the Internet Protocol address corresponding to the first domain name and sending the Internet Protocol address to the terminal device; if the category of the first domain name is a forbidden category, it ignores or deletes the access request corresponding to the first domain name.
  • The embodiment of the application does not limit the order in which the gateway device judges whether the transfer probability of the first domain name is greater than the preset probability threshold and whether the first domain name exists in the domain name database. The two judgments can be executed one after the other, in either order, or simultaneously.
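  • The decision in step S303 can be sketched as follows. This is a minimal illustration, not the application's implementation: the function name `decide`, the `Action` values, and the flat dictionary layout of the domain name database are assumptions introduced here.

```python
from enum import Enum


class Action(Enum):
    USE_LOCAL_CATEGORY = "use_local_category"
    QUERY_CLASSIFICATION_SERVER = "query_classification_server"
    SKIP_CLASSIFICATION = "skip_classification"


def decide(first_domain, transfer_probability, threshold, domain_db):
    """Choose how the gateway handles one access request.

    domain_db is a hypothetical dict mapping a domain name to its category.
    """
    if first_domain in domain_db:
        # The category is already stored locally, so no server query is needed.
        return Action.USE_LOCAL_CATEGORY
    if transfer_probability > threshold:
        # Both conditions of step S303 hold: query the classification server.
        return Action.QUERY_CLASSIFICATION_SERVER
    # Probability <= threshold: resolve the IP address directly,
    # or ignore/delete the request, without contacting the server.
    return Action.SKIP_CLASSIFICATION
```

  • Because the two checks are independent, swapping the two `if` statements yields the same result, which matches the statement that their execution order is not limited.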
  • Step S304 The domain name classification server queries the category of the first domain name.
  • Step S305 The domain name classification server sends the category of the first domain name to the gateway device.
  • Step S306 If the category of the first domain name is the allowed access category, the gateway device obtains the Internet Protocol address corresponding to the first domain name, and sends the Internet Protocol address to the terminal device.
  • For the execution of step S304 to step S306, refer to the descriptions of step S204 to step S206 in FIG. 2, respectively; details are not repeated here.
  • After receiving the category of the first domain name, the gateway device may store the first domain name in association with its category in the domain name database.
  • In this way, the first domain name and its category are added to the domain name database, so that when the gateway device subsequently receives another access request including the first domain name, it can obtain the category of the first domain name directly from the domain name database in the gateway device, without having to obtain it through the domain name classification server. This reduces the communication traffic between the gateway device and the domain name classification server, which reduces the load of the domain name classification server and thereby helps to improve website access speed.
  • When the gateway device detects a power-on operation, it can send a data initialization request to the domain name classification server, receive the domain name data sent by the domain name classification server, and store the domain name data in the domain name database. The data initialization request is used to request the domain name data. The domain name data can include a domain name set and the category of each domain name in the domain name set. The domain names in the domain name set can be determined by the domain name classification server according to the access time, access duration, and/or access frequency of each domain name. Each domain name in the domain name set can include a first-level domain name and a subdomain name of the first-level domain name.
  • When the gateway device is powered on again, the data stored in the domain name database may have been lost. By sending the data initialization request to the domain name classification server upon detecting the power-on operation, the gateway device requests the domain name data and the initialization of the domain name database is completed automatically. This helps to improve the hit rate of domain names in the domain name database and reduces the number of times the category of a domain name in an access request has to be obtained through the domain name classification server, thereby helping to reduce the load of the domain name classification server.
  • The domain name classification server can collect statistics on the access time, access duration, and/or access frequency of each domain name it stores, and compute a weighted sum of these quantities to obtain the access value of each domain name. It then selects the first number of domain names from all the domain names stored in the domain name classification server in descending order of access value, and these domain names form the domain name set. Finally, it obtains the category of each domain name in the domain name set and sends the domain name set and the category of each domain name in the set to the gateway device as the domain name data.
  • The access time of a domain name can refer to the last time the domain name was hit in the domain name classification server; the access duration of a domain name can refer to the total number of hits during the storage period of the domain name in the domain name classification server; the access frequency of a domain name can refer to the ratio of the total number of hits during the storage period of the domain name in the domain name classification server to the total storage duration.
  • A higher access value indicates a higher probability that the domain name is hit in the domain name classification server, or that users are more inclined to visit the website corresponding to that domain name. Therefore, sending domain names with higher access values to the gateway device increases the probability that a domain name in the gateway device is hit, thereby helping to reduce the load of the domain name classification server.
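  • The selection of the domain name set by access value can be sketched as follows. The weights, the field names, and the assumption that the three quantities have already been scaled to comparable ranges are illustrative only; the application does not specify them.

```python
from dataclasses import dataclass


@dataclass
class DomainStats:
    name: str
    last_hit: float   # access time: most recent hit (assumed pre-scaled)
    hit_count: float  # access duration as defined in the text: total hits
    hit_rate: float   # access frequency: hits / total storage duration


def access_value(s, w_time=1.0, w_count=1.0, w_rate=1.0):
    """Weighted sum of the three statistics (weights are hypothetical)."""
    return w_time * s.last_hit + w_count * s.hit_count + w_rate * s.hit_rate


def select_domain_set(stats, first_number, **weights):
    """Return the names of the first_number domains with the highest access value."""
    ranked = sorted(stats, key=lambda s: access_value(s, **weights), reverse=True)
    return [s.name for s in ranked[:first_number]]
```

  • Setting a weight to zero drops the corresponding statistic, which is one way to read the "and/or" in the description.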
  • When the gateway device stores each domain name in the domain name set, each domain name may be divided into multiple levels for storage. The embodiment of the present application takes the division of each domain name into two levels as an example. In this case, each domain name includes a first-level domain name and a subdomain name of the first-level domain name.
  • The first-level domain name may refer to the top-level domain name or to the next-level domain name under the top-level domain name; this is not limited. For example, for a domain name such as "www.abc.com", the first-level domain name can be ".com" and the subdomain name of the first-level domain name can be "www.abc"; alternatively, the first-level domain name can be "abc.com" and the subdomain name of the first-level domain name can be "www".
  • When the gateway device divides each domain name in the domain name set into two levels for storage, it can store the hash value of the first-level domain name of each domain name and the hash value of the subdomain name of the first-level domain name.
  • A specific implementation for the gateway device to determine whether the first domain name exists in the domain name database may be: determining whether the hash value of the first-level domain name included in the first domain name exists in the domain name database. If it does not, the first domain name does not exist in the domain name database. In this way, only the hash value of the first-level domain name included in the first domain name needs to be calculated, rather than the hash value of the complete first domain name, which reduces the power consumption of the gateway device.
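  • The early-exit check on the first-level hash can be sketched as follows. The choice of SHA-256, the rule that the last two labels form the first-level domain name, and the names `split_domain`, `digest`, and `definitely_absent` are assumptions made for illustration; the application does not name a hash function or a splitting rule.

```python
import hashlib


def split_domain(domain):
    """Split a domain into (subdomain, first_level).

    Assumes the first-level domain name is the last two labels,
    e.g. "www.abc.com" -> ("www", "abc.com").
    """
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[:-2]), ".".join(labels[-2:])


def digest(text):
    """Hash one stored component (SHA-256 is an illustrative choice)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def definitely_absent(domain, first_level_hashes):
    """True if the first-level hash is missing, so the domain cannot be stored."""
    _, first_level = split_domain(domain)
    return digest(first_level) not in first_level_hashes
```

  • When `definitely_absent` returns `False`, the subdomain still has to be checked before concluding that the first domain name exists.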
  • The first domain name is sent to the domain name classification server only when the transfer probability of the first domain name is greater than the preset probability threshold and the first domain name does not exist in the domain name database. This avoids the situation in which the category of the first domain name is obtained through the domain name classification server even though the first domain name and its category already exist in the domain name database. It reduces the communication traffic between the gateway device and the domain name classification server, which reduces the load of the domain name classification server and helps to improve website access speed.
  • Figure 4 is a schematic flow diagram of another data processing method provided by an embodiment of this application. This method explains in detail how to update the first-level domain names and subdomain names stored in the domain name database when there is a hierarchical storage relationship between the first-level domain names and the subdomain names. The method includes but is not limited to the following steps:
  • Step S401 The gateway device receives an access request sent by the terminal device, where the access request includes a first domain name, and the first domain name includes a second domain name and a third domain name.
  • the gateway device may have a domain name database, and each domain name stored in the domain name database may be divided into multi-level storage.
  • the embodiment of the present application takes the two-level storage of each domain name as an example for introduction.
  • each domain name in the domain name database may include a first-level domain name and a subdomain name of the first-level domain name.
  • the first-level domain name may refer to a top-level domain name or a sub-domain name of the top-level domain name, which is not limited in this application embodiment.
  • the second domain name included in the first domain name in the access request may be a first-level domain name
  • the third domain name included in the first domain name in the access request may be a subdomain name of the second domain name.
  • Step S402 The gateway device judges whether there is a second domain name among the first-level domain names included in each domain name in the domain name database.
  • The domain names in the domain name database are stored in a two-level manner (for example, divided into first-level domain names and subdomain names), and there may be a hierarchical storage relationship between the first-level domain names and the subdomain names.
  • Take the scenario of the hierarchical storage relationship between first-level domain names and subdomain names shown in Figure 5 as an example. When the complete domain names actually stored in the domain name database are "www.abc.com", "blog.xx.com", and "blog.xx.net", and the first-level domain names and subdomain names in the domain name database have a hierarchical storage relationship, the storage form of these three complete domain names in the domain name database is as shown in Figure 5.
  • After the gateway device receives the access request sent by the terminal device, it can first determine whether the first domain name exists in the domain name database. If the first domain name does not exist in the domain name database, the gateway device triggers the step of obtaining the transfer probability of the first domain name, in order to determine whether the transfer probability of the first domain name is greater than the preset probability threshold. In this way, when the first domain name exists in the domain name database, that is, when the category of the first domain name can be queried directly in the domain name database, the gateway device avoids both obtaining the transfer probability of the first domain name and requesting the category of the first domain name from the domain name classification server, which helps to reduce the overhead of the gateway device and of the domain name classification server at the same time.
  • The gateway device can determine whether the first domain name exists in the domain name database as follows. The gateway device judges whether the second domain name exists among the first-level domain names included in each domain name in the domain name database. If it does not, the first domain name does not exist in the domain name database. If it does, the gateway device continues to determine whether the third domain name exists among the subdomain names of the second domain name in the domain name database: if the third domain name does not exist among them, the first domain name does not exist in the domain name database; if the third domain name exists among them, the first domain name exists in the domain name database.
  • If the second domain name does not exist among the first-level domain names included in each domain name in the domain name database, step S403 may be executed. If the second domain name exists among the first-level domain names included in each domain name in the domain name database, step S408 may be executed.
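  • The two-stage lookup can be sketched with a nested mapping. The layout below is only one plausible reading of the hierarchical storage relationship: it assumes the first-level domain name is the last two labels, and the category strings are placeholders rather than values from the application.

```python
def split_first_domain(first_domain):
    """Return (second_domain, third_domain): first-level part and its subdomain."""
    labels = first_domain.split(".")
    return ".".join(labels[-2:]), ".".join(labels[:-2])


def lookup(domain_db, first_domain):
    """Return (next_step, category) following the branches of steps S402 and S408."""
    second, third = split_first_domain(first_domain)
    subdomains = domain_db.get(second)
    if subdomains is None:
        return "S403", None       # second domain name not stored
    if third not in subdomains:
        return "S410", None       # second stored, third domain name missing
    return "S409", subdomains[third]  # first domain name found locally


# Hypothetical contents: first-level domain name -> {subdomain name: category}
DOMAIN_DB = {
    "abc.com": {"www": "allowed"},
    "xx.com": {"blog": "allowed"},
    "xx.net": {"blog": "forbidden"},
}
```

  • For example, `lookup(DOMAIN_DB, "mail.abc.com")` takes the S410 branch, because "abc.com" is stored but "mail" is not among its subdomain names.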
  • Step S403 If the second domain name does not exist among the first-level domain names included in each domain name in the domain name database, the gateway device obtains the transfer probability of the first domain name.
  • If the second domain name does not exist among the first-level domain names included in each domain name in the domain name database, the first domain name does not exist in the domain name database, and the gateway device needs to request the category of the first domain name from the domain name classification server.
  • Before doing so, the gateway device can obtain the transfer probability of the first domain name, and then estimate the probability that the category of the first domain name is stored in the domain name classification server according to whether the transfer probability of the first domain name is greater than the preset probability threshold. When the probability that the category of the first domain name is stored in the domain name classification server is high, the first domain name is sent to the domain name classification server.
  • Step S404 If the transfer probability of the first domain name is greater than the preset probability threshold, the gateway device sends the first domain name to the domain name classification server, so that the domain name classification server queries the category of the first domain name. The gateway device receives the category of the first domain name sent by the domain name classification server. If the category of the first domain name is an allowed access category, the gateway device obtains the Internet Protocol address corresponding to the first domain name and sends the Internet Protocol address to the terminal device.
  • For the execution of step S404, refer to the descriptions of step S203 to step S206 in FIG. 2; details are not repeated here.
  • Step S405 The gateway device obtains the number of first-level domain names included in all domain names in the domain name database, and determines whether the number of first-level domain names included in all domain names in the domain name database is greater than or equal to a preset number threshold.
  • After receiving the category of the first domain name, the gateway device may store the first domain name in association with its category in the domain name database. When the gateway device stores domain names in a two-level manner and there is a hierarchical storage relationship between first-level domain names and subdomain names, the gateway device can store the first-level domain name included in the first domain name (that is, the second domain name), the subdomain name of that first-level domain name (that is, the third domain name), and the category of the first domain name in association in the domain name database.
  • In this way, the first domain name and its category are added to the domain name database, so that when the gateway device subsequently receives another access request that includes the first domain name, it can obtain the category of the first domain name directly from the domain name database.
  • The gateway device can limit the number of first-level domain names stored in the domain name database and the number of subdomains of each first-level domain name. For example, the gateway device can set the upper limit of the number of first-level domain names stored in the domain name database to a preset number threshold (such as 10 or another number), and set the upper limit of the number of subdomains of each first-level domain name stored in the domain name database to a first preset number threshold.
  • the gateway device can modify the upper limit of the number of first-level domain names and the upper limit of the number of subdomains of each first-level domain name according to user operations.
  • After the gateway device receives the category of the first domain name sent by the domain name classification server, if the second domain name does not exist among the first-level domain names included in each domain name in the domain name database, the gateway device needs to add the second domain name, the third domain name, and the category of the first domain name (which is composed of the second domain name and the third domain name) to the domain name database. Before doing so, the gateway device first determines whether the total number of first-level domain names stored in the domain name database is greater than or equal to the preset number threshold. If the total number is less than the preset number threshold, the gateway device can add the second domain name, the third domain name, and the category of the first domain name to the domain name database directly. If the total number is greater than or equal to the preset number threshold, the gateway device needs to determine a target first-level domain name in the domain name database and delete the target first-level domain name and all of its subdomains before adding the new entry to the domain name database.
  • The preset number threshold is the preset upper limit on the number of first-level domain names. When the total number of first-level domain names stored in the domain name database is greater than the preset number threshold, the total number already exceeds the preset upper limit. When the total number of first-level domain names stored in the domain name database is equal to the preset number threshold, adding one more first-level domain name to the domain name database would cause the total number to exceed the preset upper limit.
  • The preset number threshold may be set by the gateway device by default, or may be set by the gateway device according to user operations, which is not limited in this embodiment of the application. It should be noted that the physical storage space of the gateway device is limited, and if too many first-level domain names are stored in the domain name database of the gateway device, the load of the gateway device becomes too high when querying the domain name database. Therefore, the preset number threshold should be less than or equal to the physical upper limit on the number of first-level domain names in the gateway device. This physical upper limit can be set for the gateway device at the factory.
  • The gateway device may first execute the step of obtaining the Internet Protocol address corresponding to the first domain name and then execute step S405; or it may execute step S405 first and then execute the step of obtaining the Internet Protocol address corresponding to the first domain name; or step S405 and the step of obtaining the Internet Protocol address corresponding to the first domain name may be executed at the same time. This is not limited in the embodiment of the present application.
  • Step S406 If the number of first-level domain names is greater than or equal to the preset number threshold, the gateway device obtains the storage value of the first-level domain names included in each domain name in the domain name database, and deletes the target first-level domain name and all subdomains of the target first-level domain name from the domain name database. The target first-level domain name is the first-level domain name with the lowest storage value among all the first-level domain names included in the domain name database.
  • By determining the first-level domain name with the lowest storage value as the target first-level domain name and then replacing it with the second domain name included in the first domain name, the gateway device can increase the average hit rate of the first-level domain names in the domain name database. This reduces the communication traffic between the gateway device and the domain name classification server, thereby helping to reduce the load of the domain name classification server.
  • A specific implementation for the gateway device to obtain the storage value of the first-level domain names included in each domain name in the domain name database may be: the gateway device obtains the storage value according to the use time, use duration, and/or use frequency of the first-level domain names included in each domain name in the domain name database. For example, the gateway device may compute a weighted sum of the use time, use duration, and/or use frequency of each first-level domain name to obtain its storage value.
  • The use time of a first-level domain name can refer to the last time the first-level domain name was hit in the domain name database; the use duration of a first-level domain name can refer to the total number of times the first-level domain name was hit during its storage period in the domain name database; the use frequency of a first-level domain name can refer to the ratio of the total number of hits during its storage period in the domain name database to the total storage duration.
  • If the number of first-level domain names is less than the preset number threshold, the gateway device can directly store the second domain name, the third domain name, and the category of the first domain name in association in the domain name database; that is, the gateway device can directly execute step S407.
  • Step S407 The gateway device uses the second domain name as the first-level domain name, and the third domain name as the subdomain name of the second domain name, and stores the second domain name, the third domain name, and the category of the first domain name in the domain name database in association with each other.
  • After the gateway device deletes the target first-level domain name and all subdomains of the target first-level domain name from the domain name database, it can use the second domain name as a first-level domain name and the third domain name as a subdomain name of the second domain name, and store the second domain name, the third domain name, and the category of the first domain name in association in the domain name database.
  • In this way, when the gateway device subsequently receives another access request including the first domain name (consisting of the second domain name and the third domain name), it can obtain the category of the first domain name directly from the domain name database without going through the domain name classification server. This reduces the communication traffic between the gateway device and the domain name classification server, which reduces the load of the domain name classification server and thus helps to improve website access speed.
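  • Steps S405 to S407 can be sketched as follows. The record layout, the weights, and the function names are hypothetical; the statistics are assumed to be scaled to comparable ranges before the weighted sum is taken.

```python
def storage_value(entry, w_time=1.0, w_count=1.0, w_rate=1.0):
    """Weighted sum of use time, use duration (hits) and use frequency."""
    return (w_time * entry["last_hit"]
            + w_count * entry["hit_count"]
            + w_rate * entry["hit_rate"])


def store_first_domain(domain_db, second, third, category, max_first_level):
    """Add a new first-level domain name, evicting the lowest-value one if full.

    Returns the evicted first-level domain name, or None if nothing was evicted.
    """
    evicted = None
    if domain_db and len(domain_db) >= max_first_level:
        # S406: the target is the first-level domain name with the lowest value.
        evicted = min(domain_db, key=lambda name: storage_value(domain_db[name]))
        del domain_db[evicted]  # removes the entry together with all its subdomains
    # S407: store second domain, third domain and category in association.
    domain_db[second] = {
        "last_hit": 0.0, "hit_count": 0, "hit_rate": 0.0,
        "subdomains": {third: category},
    }
    return evicted
```

  • Deleting the whole record in one operation mirrors the requirement that all subdomains of the target first-level domain name are removed with it.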
  • Step S408 If there is a second domain name among the first-level domain names included in each domain name in the domain name database, the gateway device determines whether the third domain name exists in the subdomain names of the second domain name in the domain name database.
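  • If the second domain name exists among the first-level domain names included in each domain name in the domain name database, the gateway device needs to further determine whether the third domain name exists among the subdomain names of the second domain name in the domain name database, in order to determine whether the first domain name exists in the domain name database.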
  • Step S409 If the third domain name exists among the subdomain names of the second domain name in the domain name database, the gateway device queries the domain name database for the category of the first domain name, and if the category of the first domain name is the allowed access category, the gateway device obtains the Internet Protocol address corresponding to the first domain name and sends the Internet Protocol address to the terminal device.
  • In this case, the first domain name exists in the domain name database, so the gateway device can query the domain name database directly to obtain the category of the first domain name and handle the request according to that category. Specifically, if the category of the first domain name is an allowed access category, the gateway device can obtain the Internet Protocol address corresponding to the first domain name and send the Internet Protocol address to the terminal device. If the category of the first domain name is a forbidden category, the gateway device can ignore or delete the access request, to prevent the terminal device from acquiring the content of an unhealthy website and affecting the user's mental health.
  • Step S410 If the third domain name does not exist in the subdomain name of the second domain name in the domain name database, the gateway device obtains the transfer probability of the first domain name.
  • If the third domain name does not exist among the subdomain names of the second domain name in the domain name database, the first domain name does not exist in the domain name database, and the gateway device needs to request the category of the first domain name from the domain name classification server.
  • Before doing so, the gateway device may obtain the transfer probability of the first domain name, and then estimate the probability that the category of the first domain name is stored in the domain name classification server according to whether the transfer probability of the first domain name is greater than the preset probability threshold. When the probability that the category of the first domain name is stored in the domain name classification server is high, the first domain name is sent to the domain name classification server.
  • Step S411 If the transfer probability of the first domain name is greater than the preset probability threshold, the gateway device sends the first domain name to the domain name classification server, so that the domain name classification server queries the category of the first domain name. The gateway device receives the category of the first domain name sent by the domain name classification server. If the category of the first domain name is an allowed access category, the gateway device obtains the Internet Protocol address corresponding to the first domain name and sends the Internet Protocol address to the terminal device.
  • For the execution of step S411, refer to the descriptions of step S203 to step S206 in FIG. 2; details are not repeated here.
  • Step S412 The gateway device obtains the number of subdomains of the second domain name in the domain name database, and determines whether the number of subdomains of the second domain name in the domain name database is greater than or equal to a first preset number threshold.
  • After the gateway device receives the category of the first domain name sent by the domain name classification server, it may store the third domain name in association with the category of the first domain name in the domain name database. In this way, the first domain name and its category are added to the domain name database, so that when the gateway device subsequently receives another access request that includes the first domain name, it can obtain the category of the first domain name directly from the domain name database without going through the domain name classification server. This reduces the communication traffic between the gateway device and the domain name classification server, which reduces the load of the domain name classification server and helps to improve website access speed.
  • before the gateway device associates and stores the third domain name and the category of the first domain name in the domain name database, it first needs to determine whether the total number of subdomains of the second domain name stored in the domain name database is greater than or equal to the first preset number threshold. If the total is less than the first preset number threshold, adding the third domain name to the subdomains of the second domain name will not cause the total number of subdomains of the second domain name to exceed the preset upper limit of subdomains; that is, the gateway device can directly add the third domain name and the category of the first domain name to the domain name database.
  • if the total number of subdomains of the second domain name stored in the domain name database is greater than the first preset number threshold, the total already exceeds the preset upper limit of subdomains; if the total is equal to the first preset number threshold, adding one more subdomain to the subdomains of the second domain name would cause the total to exceed the preset upper limit of subdomains.
  • in either of these two cases, the gateway device needs to determine a target subdomain among the subdomains of the second domain name stored in the domain name database and delete it, after which the third domain name and the category of the first domain name can be added to the domain name database.
  • the first preset number threshold may be set by the gateway device by default, or may be set by the gateway device according to user operations, which is not limited in this embodiment of the application. It should be noted that the physical storage space of the gateway device is limited, and if too many subdomains are stored for each first-level domain name in the domain name database, the load on the gateway device when querying the domain name database will be too high. The first preset number threshold should therefore be less than or equal to the physical upper limit on the number of subdomains in the gateway device, and the physical upper limit on the number of subdomains of each first-level domain name may be set when the gateway device leaves the factory.
  • the gateway device may first perform the step of obtaining the Internet Protocol address corresponding to the first domain name and then perform step S412; or it may perform step S412 first and then perform the step of obtaining the Internet Protocol address corresponding to the first domain name; or step S412 and the step of obtaining the Internet Protocol address corresponding to the first domain name may be performed at the same time, which is not limited in the embodiment of the present application.
  • Step S413 If the number of subdomains of the second domain name is greater than or equal to the first preset number threshold, the gateway device obtains the storage value of each subdomain of the second domain name in the domain name database, and deletes the target subdomain name in the domain name database.
  • the target subdomain name is the subdomain name with the lowest stored value among the subdomain names of the second domain name in the domain name database.
  • the gateway device determines the subdomain with the lowest storage value as the target subdomain and then replaces the target subdomain with the third domain name (that is, a subdomain of the second domain name). This can increase the average hit rate of subdomains in the domain name database, thereby reducing the communication traffic between the gateway device and the domain name classification server as far as possible, which helps reduce the load on the domain name classification server.
  • a specific implementation for the gateway device to determine the target subdomain among the subdomains of the second domain name may be: the gateway device obtains the storage value of each subdomain of the second domain name and determines the subdomain with the lowest storage value among all subdomains of the second domain name as the target subdomain.
  • a specific implementation for the gateway device to obtain the storage value of each subdomain of the second domain name may be: the gateway device obtains the storage value of each subdomain of the second domain name according to the use time, use duration, and/or use frequency of that subdomain.
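
The text says only that the storage value is derived from use time, use duration, and/or use frequency; it does not give a formula. The following sketch therefore uses a made-up weighting purely for illustration. `UsageStats`, `storage_value`, `pick_target_subdomain`, and the weights are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class UsageStats:
    last_used: float      # Unix timestamp of the most recent use ("use time")
    total_seconds: float  # accumulated use duration
    hits: int             # number of uses ("use frequency")


def storage_value(stats: UsageStats, now: float) -> float:
    """Assumed score: frequent, long, and recent use all raise the value."""
    age_hours = max(now - stats.last_used, 0.0) / 3600.0
    recency = 1.0 / (1.0 + age_hours)  # 1.0 for "just used", decays toward 0
    return stats.hits + stats.total_seconds / 60.0 + 10.0 * recency


def pick_target_subdomain(subdomains: Dict[str, UsageStats], now: float) -> str:
    """Return the name of the subdomain with the lowest storage value."""
    return min(subdomains, key=lambda name: storage_value(subdomains[name], now))
```

Any monotone combination of the three signals would serve the same purpose; the point is only that the entry least likely to be hit again is the one chosen for deletion.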
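  • if the number of subdomains of the second domain name is less than the first preset number threshold, the gateway device may directly associate and store the third domain name and the category of the first domain name in the domain name database; that is, the gateway device may directly execute step S414.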
  • Step S414 The gateway device uses the third domain name as a subdomain name of the second domain name, and stores the third domain name and the category of the first domain name in the domain name database.
  • the third domain name can be used as the subdomain name of the second domain name, and the third domain name and the category of the first domain name are associated and stored in the domain name database.
  • when the gateway device subsequently receives another access request including the first domain name (composed of the second domain name and the third domain name), it can obtain the category of the first domain name directly from the domain name database without going through the domain name classification server. This reduces the communication traffic between the gateway device and the domain name classification server, which helps reduce the load on the domain name classification server and thus improve website access speed.
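
Steps S412 to S414 taken together can be sketched as a small capped store. This is an assumption-laden illustration: the class `DomainDatabase`, its nested-dictionary layout, and the externally supplied `value` argument are hypothetical, and the storage value would in practice come from usage statistics.

```python
from typing import Dict, Optional


class DomainDatabase:
    """Maps first-level domain -> {subdomain: category}, capped per first-level domain."""

    def __init__(self, max_subdomains: int):
        if max_subdomains < 1:
            raise ValueError("max_subdomains must be at least 1")
        self.max_subdomains = max_subdomains            # the first preset number threshold
        self.entries: Dict[str, Dict[str, str]] = {}
        self.values: Dict[str, Dict[str, float]] = {}   # storage value per subdomain

    def lookup(self, second: str, third: str) -> Optional[str]:
        """Return the stored category, or None on a miss."""
        return self.entries.get(second, {}).get(third)

    def store(self, second: str, third: str, category: str, value: float = 0.0) -> None:
        subs = self.entries.setdefault(second, {})
        vals = self.values.setdefault(second, {})
        # Steps S412/S413: at or above the cap, delete the lowest-value subdomain first.
        if third not in subs and len(subs) >= self.max_subdomains:
            target = min(vals, key=vals.get)
            del subs[target]
            del vals[target]
        # Step S414: store the subdomain together with the category.
        subs[third] = category
        vals[third] = value
```

A later `lookup` for the same pair then returns the category locally, which is the cache hit the paragraph above describes.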
  • when the second domain name does not exist among the first-level domain names included in the domain names in the domain name database, the gateway device, after receiving the category of the first domain name sent by the domain name classification server, deletes the target first-level domain name with the lowest storage value and all subdomains of the target first-level domain name, then uses the second domain name as a first-level domain name and the third domain name as a subdomain of the second domain name, and associates and stores the second domain name, the third domain name, and the category of the first domain name in the domain name database.
  • in this way, on the one hand, the average hit rate of the first-level domain names in the domain name database can be increased; on the other hand, when the gateway device subsequently receives another access request including the first domain name (composed of the second domain name and the third domain name), the category of the first domain name can be obtained directly from the domain name database of the gateway device without going through the domain name classification server. This reduces the communication traffic between the gateway device and the domain name classification server, which helps reduce the load on the domain name classification server and thus improve website access speed.
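
The first-level eviction described above can be sketched in the same spirit. The function name `add_first_level_domain`, the plain dictionaries, and the `max_first_level` parameter (standing in for the preset number threshold on first-level domain names) are assumptions made for the example.

```python
from typing import Dict


def add_first_level_domain(
    db: Dict[str, Dict[str, str]],   # first-level domain -> {subdomain: category}
    values: Dict[str, float],        # storage value per first-level domain
    second: str,
    third: str,
    category: str,
    max_first_level: int,
    initial_value: float = 0.0,
) -> None:
    """Store (second, third, category), evicting the lowest-value first-level domain if full."""
    if second not in db and db and len(db) >= max_first_level:
        target = min(db, key=lambda name: values.get(name, 0.0))
        del db[target]            # removes the target and all of its subdomains
        values.pop(target, None)
    db.setdefault(second, {})[third] = category
    values.setdefault(second, initial_value)
```

Deleting the dictionary entry for the target first-level domain name drops every subdomain stored under it in one step, matching the "and all subdomains" wording.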
  • FIG. 6 is a schematic structural diagram of a data processing device provided by an embodiment of the present application.
  • the data processing device 60 is used to execute the steps performed by the gateway device in the method embodiments corresponding to FIG. 2 to FIG.
  • the data processing device 60 may include:
  • the receiving unit 601 is configured to receive an access request sent by a terminal device, where the access request includes the first domain name;
  • the obtaining unit 602 is configured to obtain the transfer probability of the first domain name
  • the sending unit 603 is configured to send the first domain name to the domain name classification server if the transfer probability of the first domain name is greater than the preset probability threshold;
  • the receiving unit 601 is further configured to receive the category of the first domain name sent by the domain name classification server;
  • the obtaining unit 602 is further configured to obtain an Internet Protocol address corresponding to the first domain name if the category of the first domain name is an allowed access category;
  • the sending unit 603 is also used to send the Internet Protocol address to the terminal device.
  • the first domain name may include a character string.
  • when the obtaining unit 602 is used to obtain the transfer probability of the first domain name, it is specifically used to: obtain each character pair in the character string and the transfer probability of each character pair; obtain, according to the transfer probability of each character pair, the transfer probability of the character string included in the first domain name; and take the transfer probability of the character string included in the first domain name as the transfer probability of the first domain name.
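
How the per-pair probabilities are combined into a probability for the whole string is not stated here. The sketch below assumes a geometric mean over adjacent character pairs and a small floor for pairs missing from the table; both choices, the function name `string_transfer_probability`, and the lookup table passed in are hypothetical.

```python
import math
from typing import Dict


def string_transfer_probability(
    s: str,
    pair_prob: Dict[str, float],  # assumed table: two-character pair -> probability
    floor: float = 1e-4,          # assumed probability for unseen pairs
) -> float:
    """Combine character-pair probabilities into one score via a geometric mean."""
    pairs = [s[i:i + 2] for i in range(len(s) - 1)]
    if not pairs:
        return 0.0  # a string shorter than two characters has no pairs
    # Work in log space so long strings do not underflow.
    log_sum = sum(math.log(max(pair_prob.get(p, floor), floor)) for p in pairs)
    return math.exp(log_sum / len(pairs))
```

Under this assumption, strings made of common character pairs score high, while random-looking strings fall toward the floor and would not pass the probability threshold.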
  • the gateway device may have a domain name database, and the domain name database may include multiple domain names and categories of each of the multiple domain names;
  • when the sending unit 603 is configured to send the first domain name to the domain name classification server if the transfer probability of the first domain name is greater than the preset probability threshold, it is specifically used to: send the first domain name to the domain name classification server if the transfer probability of the first domain name is greater than the preset probability threshold and the first domain name does not exist in the domain name database.
  • the data processing device 60 may further include a storage unit 604 configured to associate the first domain name with the category of the first domain name and store in the domain name database.
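
The combined condition (probability above the threshold and no local entry) plus the store-after-query behaviour can be sketched as below. The flat `db` dictionary and the function name `category_for` are simplifications assumed for the example.

```python
from typing import Callable, Dict, Optional


def category_for(
    domain: str,
    db: Dict[str, str],                              # local domain name database
    transfer_probability: Callable[[str], float],    # hypothetical scoring function
    query_server: Callable[[str], Optional[str]],    # stands in for the classification server
    threshold: float,
) -> Optional[str]:
    """Return the category of `domain`, preferring the local database."""
    if domain in db:
        return db[domain]  # local hit: no traffic to the classification server
    if transfer_probability(domain) > threshold:
        category = query_server(domain)
        if category is not None:
            db[domain] = category  # store the association for later requests
        return category
    return None
```

A second request for the same domain is then answered from `db` without contacting the server.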
  • each domain name in the domain name database may include a first-level domain name and a subdomain name of the first-level domain name.
  • the first domain name may include a second domain name and a third domain name; the case in which the first domain name does not exist in the domain name database may include: the second domain name does not exist among the first-level domain names included in the domain names in the domain name database, and/or the third domain name does not exist among the subdomains of the first-level domain names included in the domain names in the domain name database.
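
A rough sketch of this existence check follows. How the first domain name is divided into the second and third domain names is not specified in this passage; the sketch assumes, purely for illustration, that the last two labels form the second (first-level) domain name and everything before them forms the third domain name. `split_domain` and `missing_from_database` are hypothetical names.

```python
from typing import Dict, Tuple


def split_domain(first: str) -> Tuple[str, str]:
    """Assumed split: last two labels -> second domain name, the rest -> third."""
    labels = first.lower().rstrip(".").split(".")
    second = ".".join(labels[-2:])
    third = ".".join(labels[:-2])
    return second, third


def missing_from_database(first: str, db: Dict[str, Dict[str, str]]) -> bool:
    """True if the second domain name, or the third under it, is not stored."""
    second, third = split_domain(first)
    return second not in db or third not in db[second]
```

For example, with only `www` stored under `example.com`, a request for `mail.example.com` counts as missing because the third domain name is absent even though the second is present.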
  • the data processing device 60 may further include a deleting unit 605.
  • the obtaining unit 602 is further configured to: if the second domain name does not exist among the first-level domain names included in the domain names in the domain name database, obtain the number of first-level domain names included in all domain names in the domain name database; and if that number is greater than or equal to a preset number threshold, obtain the storage value of the first-level domain names included in the domain names in the domain name database.
  • the deleting unit 605 is configured to delete the target first-level domain name and all subdomains of the target first-level domain name from the domain name database, where the target first-level domain name is the first-level domain name with the lowest storage value among all the first-level domain names included in the domain name database.
  • when the storage unit 604 is used to associate and store the first domain name and the category of the first domain name in the domain name database, it is specifically used to: use the second domain name as a first-level domain name and the third domain name as a subdomain of the second domain name, and associate and store the second domain name, the third domain name, and the category of the first domain name in the domain name database.
  • when the obtaining unit 602 is configured to obtain the storage value of the first-level domain names included in the domain names in the domain name database, it is specifically used to: obtain the storage value of those first-level domain names according to their use time, use duration, and/or use frequency.
  • the sending unit 603 is further configured to send a data initialization request to the domain name classification server when it is detected that the gateway device is powered on. The data initialization request is used to request domain name data. The domain name data includes a domain name set and the category of each domain name in the domain name set. The domain names in the domain name set are determined by the domain name classification server according to the access time, access duration, and/or access frequency of each domain name. Each domain name in the domain name set includes a first-level domain name and a subdomain of that first-level domain name;
  • the receiving unit 601 is also used to receive domain name data;
  • the storage unit 604 is also used to store the domain name data in the domain name database.
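
The power-on initialization can be sketched as a single request followed by a bulk store. The shape of the returned domain name data (a nested dictionary from first-level domain name to subdomain to category) and the name `initialize_on_power_on` are assumptions; the request transport is abstracted as a callable.

```python
from typing import Callable, Dict

DomainData = Dict[str, Dict[str, str]]  # first-level domain -> {subdomain: category}


def initialize_on_power_on(
    request_domain_data: Callable[[], DomainData],  # stands in for the data initialization request
    db: DomainData,
) -> int:
    """Fill the local database from the server's domain name data; return entries stored."""
    data = request_domain_data()
    stored = 0
    for first_level, subdomains in data.items():
        for subdomain, category in subdomains.items():
            db.setdefault(first_level, {})[subdomain] = category
            stored += 1
    return stored
```

Because the server chooses the set by access time, duration, and/or frequency, the freshly filled database should already cover the domain names most likely to be requested.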
  • the access request may also include the identification of the terminal device, and the data processing device 60 may also include a processing unit 606. The processing unit 606 is configured to trigger the step of obtaining the transfer probability of the first domain name if the identification of the terminal device is a preset identification.
  • FIG. 7 is a schematic structural diagram of a distributed data management device provided by an embodiment of the present application.
  • the distributed data management device 70 includes: a transceiver 701, a processor 702, and a memory 703.
  • the transceiver 701, the processor 702, and the memory 703 are connected by one or more communication buses.
  • the transceiver 701 is used for receiving data or sending data.
  • the transceiver 701 may be used for receiving an access request sent by a terminal device, or for sending the first domain name to a domain name classification server.
  • the processor 702 is configured to perform corresponding functions of the gateway device in the methods described in FIGS. 2 to 4.
  • the processor 702 may be a central processing unit (CPU), a network processor (NP), a hardware chip, or any combination thereof.
  • the memory 703 is used to store program codes and the like.
  • the memory 703 may include a volatile memory, such as a random access memory (RAM); the memory 703 may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 703 may also include a combination of the foregoing types of memories.
  • the processor 702 may call the program code stored in the memory 703 to perform the following operations:
  • if the category of the first domain name is an allowed access category, obtain the Internet Protocol address corresponding to the first domain name;
  • processor 702 may also perform operations corresponding to the gateway device in the embodiment shown in FIG. 2 to FIG. 4. For details, please refer to the description in the method embodiment, which will not be repeated here.
  • the embodiment of the present application also provides a computer-readable storage medium, which can be used to store computer software instructions used by the data processing apparatus in the embodiment shown in FIG. 6, and which contains a program for executing the operations designed for the gateway device in the above embodiments.
  • the aforementioned computer-readable storage medium includes, but is not limited to, flash memory, hard disk, and solid state hard disk.
  • the embodiments of the present application also provide a computer program product.
  • when the computer program product is run by a computing device, it can execute the data processing method designed for the gateway device in the embodiments of FIGS. 2 to 4 above.
  • an embodiment of the present application also provides a chip, including a processor and a memory, where the memory is used to store a computer program, and the processor is used to call and run the computer program from the memory.
  • the computer program is used to implement the method in the above method embodiment.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted through the computer-readable storage medium.
  • the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via a wired connection (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless connection (such as infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or a data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Disclosed in embodiments of the present application are a data processing method and an apparatus for implementing the method, wherein the method is applied to a gateway device. The method comprises: receiving an access request sent by a terminal device; acquiring the transfer probability of a first domain name in the access request; if the transfer probability of the first domain name is greater than a preset probability threshold, then sending the first domain name to a domain name classification server; receiving the category of the first domain name sent by the domain name classification server; and if the category of the first domain name is an allowed access category, acquiring an Internet protocol address corresponding to the first domain name, and sending the Internet protocol address to the terminal device. By means of implementing the embodiments of the present application, the first domain name may be sent to the domain name classification server only when the transfer probability of the first domain name is greater than the preset probability threshold, which may effectively reduce the load of the domain name classification server, thereby facilitating the increase of website access speed.

Description

Data processing method and apparatus therefor
Technical field
This application relates to the field of computer technology, and in particular to a data processing method and an apparatus therefor.
Background
The Internet plays an increasingly important role in modern life, but it is undeniable that while it makes life more convenient, it also has many negative effects. For example, a large amount of content containing violent, reactionary, and other harmful information floods cyberspace and is becoming ever more prevalent, threatening the physical and mental health of students; in addition, many students indulge in the Internet for long periods, which can cause mental and physical illness and affect their healthy growth.
To prevent students from accessing harmful websites, green Internet access solutions have emerged. These solutions can effectively control the time students spend online and restrict their access to harmful websites. The main idea of a green Internet access solution is as follows: a green cloud server uses the uniform resource locators (URLs) in a blacklist and a whitelist to determine whether the website corresponding to the URL in a user request is an unhealthy website, and prevents the user from accessing unhealthy websites.
Currently, every time a user visits a website, the URL of the website to be visited must be sent to the green cloud server to query whether the corresponding website is unhealthy. However, user traffic is very high during peak hours, which makes the load on the green cloud server too high and slows down website access.
Summary of the invention
The embodiments of the present application provide a data processing method and an apparatus therefor, which can effectively reduce the load on a domain name classification server and thereby help improve website access speed.
In a first aspect, an embodiment of the present application provides a data processing method. The method includes: receiving an access request sent by a terminal device, where the access request includes a first domain name; obtaining the transfer probability of the first domain name; if the transfer probability of the first domain name is greater than a preset probability threshold, sending the first domain name to a domain name classification server; receiving the category of the first domain name sent by the domain name classification server; and if the category of the first domain name is an allowed access category, obtaining the Internet Protocol address corresponding to the first domain name and sending the Internet Protocol address to the terminal device.
In this technical solution, the transfer probability of the first domain name can be used to characterize the probability that the category of the first domain name exists in the domain name classification server. The first domain name is sent to the domain name classification server only when its transfer probability is greater than the preset probability threshold. On the one hand, this increases the probability of successfully obtaining the category of the first domain name through the domain name classification server; on the other hand, it avoids the situation in which the first domain name is sent to the domain name classification server when its transfer probability is less than or equal to the preset probability threshold but its category cannot be found, which helps reduce the load on the domain name classification server and improve website access speed.
In an implementation, the gateway device may have a domain name database, and the domain name database may include multiple domain names and the category of each of the multiple domain names. A specific implementation of sending the first domain name to the domain name classification server if the transfer probability of the first domain name is greater than the preset probability threshold may be: if the transfer probability of the first domain name is greater than the preset probability threshold and the first domain name does not exist in the domain name database, sending the first domain name to the domain name classification server.
In this technical solution, the first domain name is sent to the domain name classification server only when both conditions are met: the transfer probability of the first domain name is greater than the preset probability threshold, and the first domain name does not exist in the domain name database. This avoids obtaining the category of the first domain name through the domain name classification server when the first domain name and its category already exist in the domain name database, which reduces the communication traffic between the gateway device and the domain name classification server and helps reduce the load on the domain name classification server and improve website access speed.
In an implementation, after the category of the first domain name sent by the domain name classification server is received, the method may further include: associating and storing the first domain name and the category of the first domain name in the domain name database.
In this technical solution, by associating and storing the first domain name and the category of the first domain name in the domain name database, when an access request including the first domain name is received again later, the category of the first domain name can be obtained directly from the domain name database without going through the domain name classification server. This reduces the communication traffic between the gateway device and the domain name classification server, which helps reduce the load on the domain name classification server and improve website access speed.
In an implementation, before the first domain name and the category of the first domain name are associated and stored in the domain name database, the method may further include: if the second domain name does not exist among the first-level domain names included in the domain names in the domain name database, obtaining the number of first-level domain names included in all domain names in the domain name database; if that number is greater than or equal to a preset number threshold, obtaining the storage value of the first-level domain names included in the domain names in the domain name database; and deleting, from the domain name database, a target first-level domain name and all subdomains of the target first-level domain name, where the target first-level domain name is the first-level domain name with the lowest storage value among the first-level domain names included in all domain names in the domain name database. A specific implementation of associating and storing the first domain name and the category of the first domain name in the domain name database may be: using the second domain name as a first-level domain name, using the third domain name as a subdomain of the second domain name, and associating and storing the second domain name, the third domain name, and the category of the first domain name in the domain name database.
In this technical solution, the target first-level domain name with the lowest storage value and all of its subdomains are deleted from the domain name database; the second domain name is then used as a first-level domain name and the third domain name as a subdomain of the second domain name, and the second domain name, the third domain name, and the category of the first domain name are associated and stored in the domain name database. In this way, on the one hand, the average hit rate of the first-level domain names in the domain name database can be improved; on the other hand, when the gateway device subsequently receives another access request including the first domain name (composed of the second domain name and the third domain name), it can obtain the category of the first domain name directly from its domain name database without going through the domain name classification server. This reduces the communication traffic between the gateway device and the domain name classification server, which helps reduce the load on the domain name classification server and improve website access speed.
In an implementation, the method may further include: upon detecting that the gateway device is powered on, sending a data initialization request to the domain name classification server, where the data initialization request is used to request domain name data, the domain name data includes a domain name set and the category of each domain name in the domain name set, the domain names in the domain name set are determined by the domain name classification server according to the access time, access duration, and/or access frequency of each domain name, and each domain name in the domain name set includes a first-level domain name and a subdomain of that first-level domain name; and receiving the domain name data and storing the domain name data in the domain name database.
In this technical solution, powering the gateway device on again may cause the data stored in the domain name database to be lost. In the embodiments of the present application, when a power-on operation is detected, a data initialization request is sent to the domain name classification server to request domain name data, and the domain name data is stored in the domain name database, so that initialization of the domain name database is completed automatically. This helps improve the hit rate of domain names in the domain name database and reduces the number of times the category of the first domain name in an access request has to be obtained through the domain name classification server, which helps reduce the load on the domain name classification server.
In a second aspect, an embodiment of the present application provides a data processing apparatus that has the function of implementing the method described in the first aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more units corresponding to the above function.
In a third aspect, an embodiment of the present application provides a gateway device. The gateway device includes a memory and a processor. The memory stores program instructions, the processor is connected to the memory through a bus, and the processor calls the program instructions stored in the memory so that the service device performs the method described in the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium for storing computer program instructions used by the data processing apparatus described in the second aspect, the instructions including a program for carrying out the first aspect.
In a fifth aspect, an embodiment of the present application provides a computer program product. The program product includes a program that, when executed, implements the method described in the first aspect.
Description of the drawings
FIG. 1 is a schematic architectural diagram of a communication system provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of a data processing method provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of another data processing method provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of yet another data processing method provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a scenario, provided by an embodiment of the present application, in which a first-level domain name and its subdomain names have a hierarchical storage relationship;
FIG. 6 is a schematic structural diagram of a data processing apparatus provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a gateway device provided by an embodiment of the present application.
Detailed description
Specific embodiments of the present application are described in further detail below with reference to the accompanying drawings.
The embodiments of the present application provide a data processing method and an apparatus therefor, which can effectively reduce the load on a domain name classification server and thereby help improve website access speed.
For a better understanding of the embodiments of the present application, a communication system to which the embodiments can be applied is described below.
Refer to FIG. 1, which is a schematic architectural diagram of a communication system provided by an embodiment of this application. As shown in FIG. 1, the system may include a terminal device 101, a gateway device 102, and a domain name classification server 103.
The terminal device 101 is mainly used to generate an access request and send the access request to the gateway device 102. The access request may include a domain name, and is used to obtain the content recorded on the website corresponding to that domain name. A domain name is the name of a computer or group of computers on a network; it consists of a series of names separated by dots and identifies the electronic location of the computer during data transmission. For example, when a user enters a URL in the browser of the terminal device 101 through an input device and clicks the access button, the terminal device 101 may generate an access request according to the domain name in the URL. In an implementation, the terminal device 101 may be user equipment (UE), a remote terminal, a mobile terminal, a wireless communication device, a user apparatus, or the like.
The gateway device 102 is mainly used to decide, according to whether the transfer probability of the domain name in the access request is greater than a preset probability threshold, whether to send the domain name to the domain name classification server 103. After sending the domain name to the domain name classification server 103, the gateway device 102 receives the category of the domain name from the domain name classification server 103. When the category of the domain name is the access-allowed category, the gateway device 102 obtains the Internet Protocol address corresponding to the domain name and sends that Internet Protocol address to the terminal device 101, so that the terminal device 101 can obtain the content stored in the storage device corresponding to the Internet Protocol address.
Specifically, the gateway device 102 may send the domain name to the domain name classification server 103 when the transfer probability of the domain name is greater than the preset probability threshold, and not send the domain name to the domain name classification server 103 when the transfer probability is less than or equal to the preset probability threshold. The domain name classification server 103 stores a large number of domain names and the category of each domain name; the category of a domain name may be the access-allowed category or the access-prohibited category. The access-allowed category indicates that the website corresponding to the domain name is a healthy website; the access-prohibited category indicates that the website corresponding to the domain name is an unhealthy website. In an implementation, the domain names stored in the domain name classification server 103 may be crawled from multiple websites by the domain name classification server 103 using a web crawler, and the category of each domain name may be obtained by the domain name classification server 103 by analyzing the content of the web pages corresponding to that domain name.
In an implementation, the websites corresponding to the domain names stored in the domain name classification server 103 may be websites with relatively high traffic or relatively common websites. In this embodiment of the application, the transfer probability of a domain name may be used to characterize the probability that the category of the domain name is stored in the domain name classification server 103. Consider, for example, the domain names in access requests generated by certain background plug-ins running on the terminal device 101. The websites corresponding to these domain names are relatively uncommon and have low traffic, so the probability that the domain name classification server 103 has crawled them is low, and accordingly the probability that their categories are stored in the domain name classification server 103 is also low. In other words, even if the gateway device 102 sent these domain names to the domain name classification server 103, the probability of finding their categories there would be low. Therefore, in this embodiment of the application, the gateway device 102 sends a domain name to the domain name classification server 103 only when the transfer probability of the domain name is greater than the preset probability threshold. This avoids sending a domain name whose transfer probability is less than or equal to the preset probability threshold to the domain name classification server 103 only to find that its category cannot be retrieved, which helps reduce the load on the domain name classification server 103.
In an implementation, the gateway device 102 may be an optical network terminal (ONT), an optical network unit (ONU), an intelligent gateway device with a routing function, or the like, which is not limited in this embodiment of the application. In an implementation, the domain name classification server 103 may be a physical server or a cloud server (such as the green cloud server in a green Internet access solution), which is not limited in this embodiment of the application.
It should be noted that the above communication system includes one terminal device 101 only as an example. In other feasible implementations, the communication system may include two, three, or another number of terminal devices, which is not limited in this embodiment of the application.
It can be understood that the network architecture described in the embodiments of this application is intended to explain the technical solutions of the embodiments more clearly and does not limit them. A person of ordinary skill in the art will appreciate that, as system architectures evolve and new service scenarios emerge, the technical solutions provided in the embodiments of this application are equally applicable to similar technical problems.
The data processing method and apparatus provided by this application are described in detail below.
Based on the schematic architectural diagram of the communication system shown in FIG. 1, refer to FIG. 2. FIG. 2 is a schematic flowchart of a data processing method provided by an embodiment of the present application. The method may include, but is not limited to, the following steps:
Step S201: The gateway device receives an access request sent by the terminal device, where the access request includes a first domain name.
Specifically, after receiving the access request sent by the terminal device, the gateway device may parse the access request to obtain the first domain name. In an implementation, the access request may be generated by the terminal device according to a user operation, or it may be generated by a background plug-in in the terminal device, which is not limited in this embodiment of the application. In an implementation, the gateway device may receive access requests sent by one or more terminal devices, and the gateway device processes the access requests from every terminal device in the same way. This embodiment of the application is described using the example of the gateway device receiving an access request sent by one terminal device.
Step S202: The gateway device obtains the transfer probability of the first domain name.
Specifically, after obtaining the first domain name, the gateway device needs to obtain the category of the first domain name. In this embodiment of the application, the domain name classification server may store multiple domain names and the category of each domain name. In an implementation, the websites corresponding to the domain names stored in the domain name classification server may be websites with relatively high traffic or relatively common websites. In an implementation, the category of the first domain name may or may not exist in the domain name classification server. In this embodiment of the application, the transfer probability of the first domain name may be used to characterize the probability that the category of the first domain name exists in the domain name classification server. In an implementation, a transfer probability greater than the preset probability threshold indicates that the category of the first domain name is relatively likely to exist in the domain name classification server, and a transfer probability less than or equal to the preset probability threshold indicates that it is relatively unlikely to exist there. The gateway device may decide whether to send the first domain name to the domain name classification server according to whether the transfer probability of the first domain name is greater than the preset probability threshold. Specifically, the gateway device may send the first domain name to the domain name classification server when its transfer probability is greater than the preset probability threshold, and not send it when its transfer probability is less than or equal to the preset probability threshold. This avoids sending a first domain name whose transfer probability is less than or equal to the preset probability threshold to the domain name classification server only to find that its category cannot be retrieved, which helps reduce the load on the domain name classification server and helps improve website access speed.
In an implementation, the first domain name may include a character string, and the gateway device may obtain the transfer probability of the first domain name as follows: obtain each character pair in the character string and the transition probability of each character pair, derive the transition probability of the character string from the transition probabilities of the character pairs, and use the transition probability of the character string as the transfer probability of the first domain name. A character pair is two adjacent characters in the character string included in the first domain name; if the character string includes n characters (n >= 2), it contains n-1 character pairs. For example, when the character string included in the first domain name is "www.abc.com", its character pairs are "ww", "ww", "w.", ".a", "ab", "bc", "c.", ".c", "co", and "om". In an implementation, the transition probability of each character pair may be obtained by training a character transition model based on a Markov chain. A Markov chain is a discrete-event stochastic process that has the Markov property: at each step, the system may move from one state to another, or remain in its current state, according to a probability distribution. A change of state is called a transition, and the probabilities associated with the different state changes are called transition probabilities. In this embodiment of the application, the transition probability of the character pair "ab" is the probability that the character following "a" is "b".
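The splitting of a domain name into character pairs can be illustrated with a minimal Python sketch. The function name `char_pairs` is an assumption made for illustration and does not appear in the application.

```python
def char_pairs(domain):
    """Return every pair of adjacent characters in the domain string.

    A string of n characters (n >= 2) yields n - 1 pairs.
    """
    return [domain[i:i + 2] for i in range(len(domain) - 1)]


if __name__ == "__main__":
    # The example string from the description: 11 characters, 10 pairs.
    print(char_pairs("www.abc.com"))
```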
In an implementation, the Markov-chain-based character transition model is trained as follows: obtain a large number of domain names, split each domain name into its character pairs, count the number of occurrences of each character pair, and standardize (for example, normalize) the total number of occurrences of each character pair to obtain the transition probability of each character pair.
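One possible reading of this training procedure is sketched below in Python. The application does not specify the normalization; the sketch assumes that the count of each pair is divided by the total count of all pairs starting with the same first character, which matches the definition of the transition probability of "ab" as the probability that "a" is followed by "b". The function name `train_transition_probs` and the tiny training set are hypothetical.

```python
from collections import Counter, defaultdict


def train_transition_probs(domains):
    """Estimate P(next char | current char) from adjacent-character counts."""
    pair_counts = Counter()
    for domain in domains:
        for i in range(len(domain) - 1):
            pair_counts[domain[i:i + 2]] += 1

    # Assumed normalization: total occurrences of pairs sharing a first character.
    totals_by_first_char = defaultdict(int)
    for pair, count in pair_counts.items():
        totals_by_first_char[pair[0]] += count

    return {pair: count / totals_by_first_char[pair[0]]
            for pair, count in pair_counts.items()}


if __name__ == "__main__":
    # Illustrative training data only; a real model would use many domain names.
    probs = train_transition_probs(["abc.com", "abd.com"])
    print(probs["ab"], probs["bc"], probs["bd"])
```

With this normalization, the probabilities of all pairs that begin with the same character sum to 1.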
In an implementation, the gateway device may derive the transition probability of the character string included in the first domain name from the transition probabilities of the character pairs as follows: split the first domain name into its character pairs, obtain the transition probability of each of these character pairs from the Markov-chain-based character transition model, and use the product of these transition probabilities as the transition probability of the character string included in the first domain name. For example, when the character string included in the first domain name is "abc", the transition probability of the character pair "ab" is p1, and the transition probability of the character pair "bc" is p2, the transition probability of the character string "abc" is p1*p2.
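The product rule can be sketched in Python as follows. The application does not say how a character pair that never appeared during training is handled; the small fallback value `unseen` is an assumption of this sketch, as are the function name and the example probabilities.

```python
def string_transition_prob(domain, pair_probs, unseen=1e-6):
    """Multiply the transition probabilities of all adjacent character pairs.

    pair_probs maps a two-character pair to its transition probability.
    unseen is an assumed fallback for pairs missing from the model.
    """
    prob = 1.0
    for i in range(len(domain) - 1):
        prob *= pair_probs.get(domain[i:i + 2], unseen)
    return prob


if __name__ == "__main__":
    # Hypothetical values for p1 ("ab") and p2 ("bc").
    pair_probs = {"ab": 0.5, "bc": 0.4}
    print(string_transition_prob("abc", pair_probs))  # p1 * p2
```

Because the product shrinks as the string gets longer, an implementation might prefer to sum logarithms of the probabilities instead; that variation is not described in the application.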
Step S203: If the transfer probability of the first domain name is greater than the preset probability threshold, the gateway device sends the first domain name to the domain name classification server.
Specifically, after obtaining the transfer probability of the first domain name, the gateway device may decide whether to send the first domain name to the domain name classification server according to whether the transfer probability is greater than the preset probability threshold. Sending the first domain name to the domain name classification server allows the category of the first domain name to be looked up there. In an implementation, the category of the first domain name may or may not exist in the domain name classification server. In an implementation, a transfer probability greater than the preset probability threshold indicates that the category of the first domain name is relatively likely to exist in the domain name classification server, so when the gateway device sends the first domain name to the domain name classification server in this case, the lookup is relatively likely to return the category of the first domain name.
In an implementation, a transfer probability less than or equal to the preset probability threshold indicates that the category of the first domain name is relatively unlikely to exist in the domain name classification server. In this case, the gateway device may, without sending the first domain name to the domain name classification server, directly obtain the Internet Protocol address corresponding to the first domain name and send it to the terminal device; alternatively, the gateway device may, without sending the first domain name to the domain name classification server, directly ignore or delete the access request. This avoids sending a first domain name whose transfer probability is less than or equal to the preset probability threshold to the domain name classification server only to find that its category cannot be retrieved, which helps reduce the load on the domain name classification server.
In an implementation, the preset probability threshold may be set by the gateway device by default, or may be set by the gateway device according to a user operation, which is not limited in this embodiment of the application. For example, the gateway device may compute the transfer probabilities of a large number of common domain names and use the average of these transfer probabilities as the preset probability threshold.
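The averaging example and the threshold comparison of step S203 can be sketched in Python as follows. The function names and the sample probabilities are hypothetical; only the rule itself (average of common-domain probabilities, strict "greater than" comparison) comes from the description.

```python
from statistics import mean


def preset_threshold(common_domain_probs):
    """Use the average transfer probability of common domain names as the threshold."""
    return mean(common_domain_probs)


def should_query_classifier(transfer_prob, threshold):
    """Send the domain name to the classification server only if strictly above the threshold."""
    return transfer_prob > threshold


if __name__ == "__main__":
    threshold = preset_threshold([0.25, 0.75])  # illustrative values only
    print(threshold, should_query_classifier(0.6, threshold))
```

A transfer probability exactly equal to the threshold is not sent, consistent with the "less than or equal to" branch described above.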
Step S204: The domain name classification server looks up the category of the first domain name.
Specifically, after receiving the first domain name sent by the gateway device, the domain name classification server may check whether the first domain name exists. If the first domain name exists, the server goes on to look up its category, which may be the access-allowed category or the access-prohibited category. If the first domain name does not exist, the server sends a query failure message to the gateway device; the query failure message indicates that the first domain name, or the category of the first domain name, does not exist in the domain name classification server.
Step S205: The domain name classification server sends the category of the first domain name to the gateway device.
Specifically, after finding the category of the first domain name, the domain name classification server may send it to the gateway device, so that the gateway device, after receiving the category of the first domain name, can process the received access request differently depending on that category.
Step S206: If the category of the first domain name is the access-allowed category, the gateway device obtains the Internet Protocol address corresponding to the first domain name and sends the Internet Protocol address to the terminal device.
Specifically, after the gateway device receives the category of the first domain name from the domain name classification server, if the category is the access-allowed category, the website corresponding to the first domain name is a healthy website. In this case, the gateway device may obtain the Internet Protocol address corresponding to the first domain name and send it to the terminal device, so that the terminal device can obtain the content stored in the storage device corresponding to that Internet Protocol address. In an implementation, if the category of the first domain name is the access-prohibited category, the website corresponding to the first domain name is an unhealthy website. In this case, the gateway device may ignore or delete the access request corresponding to the first domain name, which prevents the terminal device from obtaining content from the unhealthy website and affecting the user's mental health.
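The category-dependent handling in step S206 can be sketched in Python as follows. The `Category` enum, the function name, and the convention of returning `None` for a dropped request are assumptions of this sketch; `resolve` stands in for whatever mechanism the gateway device uses to obtain the Internet Protocol address.

```python
from enum import Enum


class Category(Enum):
    ALLOWED = "access-allowed"
    PROHIBITED = "access-prohibited"


def handle_classified_request(domain, category, resolve):
    """Return the IP address to send to the terminal, or None to drop the request."""
    if category is Category.ALLOWED:
        return resolve(domain)
    # Access-prohibited: ignore or delete the access request.
    return None


if __name__ == "__main__":
    # Stand-in resolver; 192.0.2.1 is a documentation-only address.
    fake_resolve = lambda name: "192.0.2.1"
    print(handle_classified_request("www.abc.com", Category.ALLOWED, fake_resolve))
    print(handle_classified_request("www.abc.com", Category.PROHIBITED, fake_resolve))
```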
In an implementation, the gateway device has a Domain Name System (DNS) cache that records multiple domain names and the Internet Protocol address corresponding to each domain name, and the gateway device may obtain the Internet Protocol address corresponding to the first domain name by looking it up in the DNS cache. In another implementation, the gateway device may send the first domain name to a domain name resolution server, so that the domain name resolution server looks up the Internet Protocol address corresponding to the first domain name, and then receive the Internet Protocol address sent by the domain name resolution server. In yet another implementation, the gateway device may first check whether the DNS cache contains the Internet Protocol address corresponding to the first domain name; if it does not, the gateway device sends the first domain name to the domain name resolution server, so that the domain name resolution server looks up the Internet Protocol address corresponding to the first domain name, and receives the Internet Protocol address sent by the domain name resolution server.
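The third variant (cache first, then the domain name resolution server) can be sketched in Python as follows. The DNS cache is modeled as a plain dictionary and the resolution server as a callable `query_resolver`; both are simplifications assumed for illustration, and whether the result is written back into the cache is left open because the description does not say.

```python
def resolve_ip(domain, dns_cache, query_resolver):
    """Look up the IP address in the local DNS cache, falling back to the resolver.

    dns_cache maps domain names to IP addresses.
    query_resolver is a callable standing in for the domain name resolution server.
    """
    ip = dns_cache.get(domain)
    if ip is None:
        ip = query_resolver(domain)
    return ip


if __name__ == "__main__":
    cache = {"www.abc.com": "192.0.2.1"}  # documentation-only addresses
    fake_resolver = lambda name: "192.0.2.7"
    print(resolve_ip("www.abc.com", cache, fake_resolver))  # served from the cache
    print(resolve_ip("www.xyz.com", cache, fake_resolver))  # served by the resolver
```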
Existing techniques that filter by URL cannot intercept encrypted traffic without unpacking it, and they can intercept only HyperText Transfer Protocol (HTTP) traffic and traffic on well-known ports; they cannot intercept traffic of other protocols, traffic on dynamic ports, or traffic on private ports. This embodiment of the application filters by domain name instead. Because domain name packets do not need to be unpacked, the unpacking overhead on the gateway device is effectively reduced. In addition, because a domain name contains neither a protocol nor a port number, this embodiment of the application can intercept traffic of any protocol and on any port, provided that the first domain name included in the access request meets the interception requirement (for example, the category of the first domain name is the access-prohibited category).
In an implementation, the access request may further include an identifier of the terminal device. After step S201 (that is, after the gateway device receives the access request sent by the terminal device), the gateway device may also determine whether the identifier of the terminal device is a preset identifier. If it is, the gateway device triggers step S202 (that is, obtaining the transfer probability of the first domain name). If it is not, the gateway device triggers the step of obtaining the Internet Protocol address corresponding to the first domain name and sending that Internet Protocol address to the terminal device.
其中,终端设备的标识用于唯一标识一个终端设备,在一种实现方式中,终端设备的标识可以是该终端设备的唯一标识码或者该终端设备的物理地址,本申请实施例对此不作限定。预设标识可以为预先设定的需要对其访问请求进行限制的终端设备的标识,在一种实现方式中,应用于绿色上网技术方案中,预设标识可以为未成年人的终端设备的标识。本申请实施例在获取第一域名的转移概率之前,通过判断终端设备的标识是否为预设标识,并在终端设备的标识为预设标识的情况下,才触发执行获取第一域名的转移概率的步骤,可以避免终端设备的标识不为预设标识的情况下(即该终端设备(如成年人或家长的终端设备)发送的访问请求不应该被限制的情况下),触发获取该访问请求包括的第一域名的转移概率的步骤,这样一方面可以避免拦截家长(或成年人)的终端设备发起的访问请求,从而造成误操作;另一方面,也可以避免不必要的开销,降低网关设备和域名分类服务器的负载。在一种实现方式中,预设标识可以是网关设备根据用户操作设置的。The identification of the terminal device is used to uniquely identify a terminal device. In an implementation manner, the identification of the terminal device may be the unique identification code of the terminal device or the physical address of the terminal device, which is not limited in the embodiment of this application. . The preset identifier may be a preset identifier of a terminal device that needs to restrict access requests. In one implementation, it is applied to a green Internet technical solution, and the preset identifier may be an identifier of a minor terminal device . Before obtaining the transfer probability of the first domain name, this embodiment of the application determines whether the terminal device's identification is a preset identification, and triggers the execution of obtaining the transfer probability of the first domain name when the terminal device's identification is the preset identification The steps can avoid triggering the acquisition of the access request when the terminal device's identifier is not the preset identifier (that is, the terminal device (such as the terminal device of an adult or parent) should not be restricted) Include the steps of the transfer probability of the first domain name, so that on the one hand, it can avoid intercepting the access request initiated by the parent's (or adult) terminal device, thereby causing misoperation; on the other hand, it can also avoid unnecessary expenses and reduce the gateway The load of the device and the domain name classification server. 
In an implementation manner, the preset identifier may be set by the gateway device according to user operations.
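The identifier check described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `handle_access_request`, the `preset_ids` set, and the two callbacks are hypothetical names introduced here, and the callbacks stand in for the filtering path (step S202 onward) and for plain address resolution.

```python
from typing import Callable, Set


def handle_access_request(
    device_id: str,
    domain: str,
    preset_ids: Set[str],
    filter_and_resolve: Callable[[str], str],
    resolve: Callable[[str], str],
) -> str:
    """Route one access request based on the terminal device identifier.

    All names here are illustrative assumptions, not taken from the source.
    """
    if device_id in preset_ids:
        # Restricted device: continue with step S202 (transfer probability, etc.).
        return filter_and_resolve(domain)
    # Unrestricted device: skip filtering and return the IP address directly.
    return resolve(domain)
```

Keeping the identifier check first means that requests from unrestricted devices never reach the transfer-probability computation or the classification server, which is the overhead saving the paragraph describes.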
It can be seen that, by implementing this embodiment, the first domain name is sent to the domain name classification server only when its transfer probability is greater than the preset probability threshold. This avoids the case in which a first domain name whose transfer probability is less than or equal to the preset probability threshold is sent to the domain name classification server but its category cannot be found there, which helps reduce the load on the domain name classification server and improve website access speed.
Refer to Figure 3, which is a schematic flowchart of another data processing method provided by an embodiment of this application. This method explains in detail why the first domain name is sent to the domain name classification server only when two conditions are both met: the transfer probability of the first domain name is greater than the preset probability threshold, and the first domain name does not exist in the domain name database. The method includes but is not limited to the following steps:
Step S301: The gateway device receives an access request sent by the terminal device, where the access request includes a first domain name.
Step S302: The gateway device obtains the transfer probability of the first domain name.
It should be noted that, for the specific execution of steps S301 and S302, reference may be made to the descriptions of steps S201 and S202 in Figure 2, respectively, which are not repeated here.
In one implementation, the gateway device may have a domain name database, which may include multiple domain names and the category of each of those domain names.
Step S303: If the transfer probability of the first domain name is greater than the preset probability threshold and the first domain name does not exist in the domain name database, the gateway device sends the first domain name to the domain name classification server.
Specifically, after receiving the access request sent by the terminal device, the gateway device may determine whether the transfer probability of the first domain name is greater than the preset probability threshold and whether the first domain name exists in the domain name database. It sends the first domain name to the domain name classification server only when both conditions hold: the transfer probability is greater than the preset probability threshold, and the first domain name does not exist in the domain name database. This avoids the case in which the gateway device obtains the category of the first domain name from the domain name classification server even though the domain name database already contains the first domain name and its category. It thereby reduces the communication traffic between the gateway device and the domain name classification server and lowers the load on the server, which helps improve website access speed.
In one implementation, the domain name database stores fewer domain names than the domain name classification server, so the probability of finding the category of the first domain name in the domain name database is lower than the probability of finding it in the domain name classification server. That is, when the first domain name does not exist in the domain name database, its category can be queried in the domain name classification server by sending the first domain name to that server.
In one implementation, after receiving the access request sent by the terminal device, the gateway device may first determine whether the transfer probability of the first domain name is greater than the preset probability threshold, and determine whether the first domain name exists in the domain name database only when the transfer probability is greater than the threshold. When the transfer probability is less than or equal to the preset probability threshold, the gateway device may trigger the step of obtaining the Internet Protocol address corresponding to the first domain name and sending it to the terminal device, or it may ignore or delete the access request corresponding to the first domain name. In another implementation, after receiving the access request, the gateway device may first determine whether the first domain name exists in the domain name database, and determine whether the transfer probability is greater than the preset probability threshold only when the first domain name does not exist in the database. When the first domain name exists in the domain name database, the gateway device may go on to query the database for the category of the first domain name: if the category is an allowed-access category, it triggers the step of obtaining the Internet Protocol address corresponding to the first domain name and sending it to the terminal device; if the category is a forbidden-access category, it ignores or deletes the access request corresponding to the first domain name. It should be noted that the embodiments of this application do not limit the order in which the two determinations are performed: they may be performed one after the other or at the same time.
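The database-first ordering described above can be sketched as a small decision function. This is an illustrative sketch under stated assumptions: the `Action` enum, the `decide` function, the category strings `"allowed"` and `"forbidden"`, and the `low_probability_action` parameter are all hypothetical names, and the transfer probability is passed in as a number because its computation belongs to step S202.

```python
from enum import Enum
from typing import Dict


class Action(Enum):
    RESOLVE = "resolve"            # obtain the IP address and return it to the terminal
    DROP = "drop"                  # ignore or delete the access request
    QUERY_SERVER = "query_server"  # send the domain to the classification server


def decide(
    domain: str,
    transfer_probability: float,
    threshold: float,
    domain_db: Dict[str, str],
    low_probability_action: Action = Action.RESOLVE,
) -> Action:
    """Decide what the gateway does with one request (database checked first)."""
    category = domain_db.get(domain)
    if category is not None:
        # Local hit: no need to contact the classification server.
        return Action.RESOLVE if category == "allowed" else Action.DROP
    if transfer_probability > threshold:
        # Local miss and the server is likely to know the category.
        return Action.QUERY_SERVER
    # Local miss and low probability: the text permits resolving or dropping.
    return low_probability_action
```

Because the two checks are independent, swapping their order changes only which check is skipped on an early exit, not the final decision for a local miss.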
Step S304: The domain name classification server queries the category of the first domain name.
Step S305: The domain name classification server sends the category of the first domain name to the gateway device.
Step S306: If the category of the first domain name is an allowed-access category, the gateway device obtains the Internet Protocol address corresponding to the first domain name and sends that Internet Protocol address to the terminal device.
It should be noted that, for the specific execution of steps S304 to S306, reference may be made to the descriptions of steps S204 to S206 in Figure 2, respectively, which are not repeated here.
In one implementation, after step S304 (that is, after the gateway device receives the category of the first domain name sent by the domain name classification server), the gateway device may store the first domain name in the domain name database in association with its category. In this way, the first domain name and its category are added to the domain name database, so that when the gateway device later receives another access request that includes the first domain name, it can obtain the category directly from its own domain name database instead of from the domain name classification server. This reduces the communication traffic between the gateway device and the domain name classification server and lowers the load on the server, which helps improve website access speed.
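The effect of storing the returned category locally can be sketched as follows. This is a simplified illustration: `get_category` and `query_server` are hypothetical names, and the sketch assumes the classification server always returns a category, which the source does not guarantee.

```python
from typing import Callable, Dict


def get_category(
    domain: str,
    domain_db: Dict[str, str],
    query_server: Callable[[str], str],
) -> str:
    """Return the category of a domain, contacting the server only on a local miss."""
    if domain in domain_db:
        return domain_db[domain]
    category = query_server(domain)  # hypothetical call to the classification server
    domain_db[domain] = category     # store the domain with its category for next time
    return category
```

A second request for the same domain is then answered from `domain_db` without any traffic to the classification server.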
In one implementation, when the gateway device detects a power-on operation, it may send a data initialization request to the domain name classification server, receive the domain name data sent by the server, and store the domain name data in the domain name database. The data initialization request is used to request the domain name data. The domain name data may include a domain name set and the category of each domain name in the set. The domain names in the set may be determined by the domain name classification server according to the access time, access duration, and/or access frequency of each domain name, and each domain name in the set may include a first-level domain name and a subdomain name of that first-level domain name.
In one implementation, the data stored in the domain name database may be lost when the gateway device is powered on again. In this embodiment, when a power-on operation is detected, the gateway device sends a data initialization request to the domain name classification server to request domain name data, and stores the received domain name data in the domain name database, so that the domain name database is initialized automatically. This helps raise the hit rate of the domain name database and reduces the number of times the category of the first domain name in an access request has to be obtained from the domain name classification server, which helps reduce the load on that server.
In one implementation, the domain name classification server may collect the access time, access duration, and/or access frequency of each domain name it stores, and compute a weighted sum of these quantities to obtain the access value of each domain name. It then selects a first number of domain names from all the domain names it stores, in descending order of access value. The selected domain names form the domain name set. The server obtains the category of each domain name in the set and sends the domain name set, together with these categories, to the gateway device as the domain name data. Here, the access time of a domain name may refer to the moment at which the domain name was last hit in the domain name classification server; the access duration of a domain name may refer to the total number of times the domain name was hit during its storage period in the domain name classification server; and the access frequency of a domain name may refer to the ratio of the total number of hits during its storage period to the total storage duration. A higher access value indicates a higher probability that the domain name is hit in the domain name classification server, or that users are more inclined to visit the corresponding website. Sending domain names with higher access values to the gateway device therefore raises the probability that a domain name is hit in the gateway device, which helps reduce the load on the domain name classification server.
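The weighted-sum ranking described above can be sketched as follows. This is a minimal sketch under assumptions: the source does not give the weights, units, or any normalization, so the default weights and the names `DomainStats`, `access_value`, and `select_domain_set` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class DomainStats:
    name: str
    access_time: float       # moment of the last hit, in some numeric unit
    access_duration: float   # quantity the text calls "access duration"
    access_frequency: float  # hits divided by total storage duration


def access_value(s: DomainStats, weights: Tuple[float, float, float]) -> float:
    """Weighted sum of the three statistics; the weights are assumed, not from the source."""
    w_time, w_duration, w_frequency = weights
    return (w_time * s.access_time
            + w_duration * s.access_duration
            + w_frequency * s.access_frequency)


def select_domain_set(
    stats: Sequence[DomainStats],
    first_number: int,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> List[str]:
    """Return the `first_number` domain names with the highest access value."""
    ranked = sorted(stats, key=lambda s: access_value(s, weights), reverse=True)
    return [s.name for s in ranked[:first_number]]
```

In practice the three statistics have different scales, so a real deployment would probably normalize them before weighting; the sketch leaves that out because the source does not describe it.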
In one implementation, when storing the domain names in the domain name set, the gateway device may store each domain name in multiple levels. The embodiments of this application take two-level storage as an example. When the gateway device stores each domain name in two levels, each domain name may include a first-level domain name and a subdomain name of that first-level domain name, where the first-level domain name may refer to the top-level domain name or to the domain name one level below the top-level domain name, which is not limited in the embodiments of this application. For example, when the first-level domain name refers to the top-level domain name and the domain name is "www.abc.com", the first-level domain name may be ".com" and its subdomain name may be "www.abc". As another example, when the first-level domain name refers to the domain name one level below the top-level domain name and the domain name is "www.abc.com", the first-level domain name may be "abc.com" and its subdomain name may be "www". In one implementation, when the gateway device stores each domain name in the domain name set in two levels, it may store the hash value of the first-level domain name of each domain name and the hash value of the subdomain name of that first-level domain name. When the gateway device stores domain names in two levels, it may determine whether the first domain name exists in the domain name database as follows: determine whether the hash value of the first-level domain name included in the first domain name exists in the domain name database; if it does not, the first domain name does not exist in the domain name database. In this way, only the hash value of the first-level domain name included in the first domain name needs to be calculated, rather than the hash value of the complete first domain name, which can reduce the power consumption of the gateway device.
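The hashed two-level storage and the early miss check can be sketched as follows. Several points are assumptions made only for illustration: the source does not name a hash function, so SHA-256 from the standard library is used as a stand-in; the first-level domain name is taken to be the last two labels (the "abc.com" variant), which does not handle multi-label suffixes; and `split_domain`, `TwoLevelStore`, and `first_level_present` are hypothetical names.

```python
import hashlib
from typing import Dict, Tuple


def split_domain(domain: str) -> Tuple[str, str]:
    """Split 'www.abc.com' into ('abc.com', 'www'); naive two-label assumption."""
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:]), ".".join(labels[:-2])


def digest(text: str) -> str:
    # SHA-256 is an assumed stand-in; the source only says "hash value".
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class TwoLevelStore:
    def __init__(self) -> None:
        # hash(first-level domain) -> {hash(subdomain): category}
        self._db: Dict[str, Dict[str, str]] = {}

    def add(self, domain: str, category: str) -> None:
        first_level, sub = split_domain(domain)
        self._db.setdefault(digest(first_level), {})[digest(sub)] = category

    def first_level_present(self, domain: str) -> bool:
        """False means the full domain is definitely absent from the store."""
        first_level, _ = split_domain(domain)
        return digest(first_level) in self._db
```

A `False` result settles the lookup after hashing only the first-level part; a `True` result still requires checking the subdomain hash.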
By implementing this embodiment, the first domain name is sent to the domain name classification server only when both conditions hold: the transfer probability of the first domain name is greater than the preset probability threshold, and the first domain name does not exist in the domain name database. This avoids obtaining the category of the first domain name from the domain name classification server when the domain name database already contains the first domain name and its category. It thereby reduces the communication traffic between the gateway device and the domain name classification server and lowers the load on the server, which helps improve website access speed.
Refer to Figure 4, which is a schematic flowchart of yet another data processing method provided by an embodiment of this application. This method explains in detail how to update the first-level domain names and subdomain names stored in the domain name database when the first-level domain names and subdomain names in the database have a hierarchical storage relationship. The method includes but is not limited to the following steps:
Step S401: The gateway device receives an access request sent by the terminal device, where the access request includes a first domain name, and the first domain name includes a second domain name and a third domain name.
In the embodiments of this application, the gateway device may have a domain name database, and each domain name stored in the database may be stored in multiple levels. The embodiments of this application take two-level storage as an example. Specifically, each domain name in the domain name database may include a first-level domain name and a subdomain name of that first-level domain name, where the first-level domain name may refer to the top-level domain name or to the domain name one level below the top-level domain name, which is not limited in the embodiments of this application. In one implementation, the second domain name included in the first domain name in the access request may be a first-level domain name, and the third domain name included in the first domain name may be a subdomain name of the second domain name.
It should be noted that, for the specific execution of the gateway device receiving the access request sent by the terminal device, reference may be made to the description of step S201 in Figure 2, which is not repeated here.
Step S402: The gateway device determines whether the second domain name exists among the first-level domain names included in the domain names in the domain name database.
In the embodiments of this application, the domain names in the domain name database are stored in two levels (for example, divided into first-level domain names and subdomain names), and the first-level domain names and subdomain names may have a hierarchical storage relationship. Take the scenario shown in Figure 5, in which the first-level domain names and subdomain names have a hierarchical storage relationship, as an example: when the complete domain names actually stored in the domain name database are "www.abc.com", "blog.xx.com", and "blog.xx.net", these three complete domain names are stored in the domain name database in the form shown in Figure 5.
In one implementation, after receiving the access request sent by the terminal device, the gateway device may first determine whether the first domain name exists in the domain name database, and trigger obtaining the transfer probability of the first domain name, in order to determine whether it is greater than the preset probability threshold, only if the first domain name does not exist in the database. In this way, when the first domain name exists in the domain name database, that is, when its category can be queried directly in the database, the gateway device neither obtains the transfer probability of the first domain name nor requests its category from the domain name classification server, which helps reduce the overhead of both the gateway device and the domain name classification server.
In one implementation, when the domain names in the domain name database are stored in two levels and the first-level domain names and subdomain names have a hierarchical storage relationship, the gateway device may determine whether the first domain name exists in the domain name database as follows. The gateway device determines whether the second domain name exists among the first-level domain names included in the domain names in the database. If it does not, the first domain name does not exist in the database. If it does, the gateway device goes on to determine whether the third domain name exists among the subdomain names of the second domain name in the database: if the third domain name does not exist among them, the first domain name does not exist in the database; if it does, the first domain name exists in the database.
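The hierarchical existence check can be sketched with a nested mapping. This is an illustrative sketch: it assumes the first-level domain name is the domain one level below the top-level domain, so the exact layout in Figure 5 may differ, and the categories attached to the three example domain names are placeholders that do not come from the source. The names `contains` and `EXAMPLE_DB` are hypothetical.

```python
from typing import Dict

DomainDb = Dict[str, Dict[str, str]]  # first-level domain -> {subdomain: category}

# The three complete domain names from the text; categories are placeholders.
EXAMPLE_DB: DomainDb = {
    "abc.com": {"www": "allowed"},
    "xx.com": {"blog": "allowed"},
    "xx.net": {"blog": "allowed"},
}


def contains(db: DomainDb, second_domain: str, third_domain: str) -> bool:
    """Check whether the first domain (third_domain + '.' + second_domain) is stored."""
    subdomains = db.get(second_domain)
    if subdomains is None:
        # The second domain is not among the first-level domain names.
        return False
    # The second domain exists; the result depends on its subdomain names.
    return third_domain in subdomains
```

For example, `contains(EXAMPLE_DB, "abc.com", "blog")` is false even though "abc.com" is stored, because "blog" is not among the subdomain names of "abc.com".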
In the embodiments of this application, if the second domain name does not exist among the first-level domain names included in the domain names in the domain name database, step S403 may be performed; if it does exist, step S408 may be performed.
Step S403: If the second domain name does not exist among the first-level domain names included in the domain names in the domain name database, the gateway device obtains the transfer probability of the first domain name.
Specifically, if the second domain name does not exist among the first-level domain names included in the domain names in the domain name database, the first domain name does not exist in the database, and the gateway device needs to request the category of the first domain name from the domain name classification server. In one implementation, when the first domain name does not exist in the domain name database, the gateway device may obtain the transfer probability of the first domain name and, according to whether it is greater than the preset probability threshold, estimate the probability that the category of the first domain name is stored in the domain name classification server. The gateway device sends the first domain name to the domain name classification server only when that probability is high.
It should be noted that, for the specific execution of the gateway device obtaining the transfer probability of the first domain name, reference may be made to the description of step S202 in Figure 2, which is not repeated here.
Step S404: If the transfer probability of the first domain name is greater than the preset probability threshold, the gateway device sends the first domain name to the domain name classification server so that the server queries the category of the first domain name. The gateway device receives the category of the first domain name sent by the domain name classification server and, if the category is an allowed-access category, obtains the Internet Protocol address corresponding to the first domain name and sends that Internet Protocol address to the terminal device.
It should be noted that, for the specific execution of step S404, reference may be made to the descriptions of steps S203 to S206 in Figure 2, which are not repeated here.
Step S405: The gateway device obtains the number of first-level domain names included in all the domain names in the domain name database, and determines whether that number is greater than or equal to a preset number threshold.
Specifically, after receiving the category of the first domain name sent by the domain name classification server, the gateway device may store the first domain name in the domain name database in association with its category. When the domain name database in the gateway device stores domain names in two levels and the first-level domain names and subdomain names have a hierarchical storage relationship, the gateway device may store the first-level domain name included in the first domain name (that is, the second domain name), the subdomain name of that first-level domain name (that is, the third domain name), and the category of the first domain name in association with one another in the database. In this way, the first domain name and its category are added to the domain name database, so that when the gateway device later receives another access request that includes the first domain name, it can obtain the category directly from the domain name database instead of from the domain name classification server. This reduces the communication traffic between the gateway device and the domain name classification server and lowers the load on the server, which helps improve website access speed.
In one implementation, because the storage space of the gateway device is limited, the gateway device may limit the number of first-level domain names stored in the domain name database and the number of subdomain names of each first-level domain name. For example, the gateway device may set the upper limit on the number of first-level domain names stored in the database to the preset number threshold (such as 10 or another number), and set the upper limit on the number of subdomain names of each first-level domain name to a first preset number threshold (such as 20 or another number). That is, when the database stores at most 10 first-level domain names and at most 20 subdomain names for each first-level domain name, it stores at most 10 × 20 = 200 domain names. It should be noted that the gateway device may modify both upper limits according to a user operation.
In one implementation, after the gateway device receives the category of the first domain name sent by the domain name classification server, if the second domain name does not exist among the first-level domain names included in the domain names in the domain name database, that is, if the gateway device needs to add to the database the second domain name, the third domain name, and the category of the first domain name composed of the second and third domain names, the gateway device first needs to determine whether the total number of first-level domain names stored in the database is greater than or equal to the preset number threshold. If the total is less than the preset number threshold, the gateway device may directly add the second domain name, the third domain name, and the category of the first domain name to the database. If the total is greater than or equal to the preset number threshold, the gateway device needs to determine a target first-level domain name in the database and delete the target first-level domain name together with all of its subdomain names before it can add the second domain name, the third domain name, and the category of the first domain name. Here, the preset number threshold is the preset upper limit on the number of first-level domain names. A total greater than the preset number threshold indicates that the number of first-level domain names stored in the database already exceeds the preset upper limit. A total equal to the preset number threshold indicates that adding one more first-level domain name would make the number of first-level domain names in the database exceed the preset upper limit. In one implementation, the preset number threshold may be set by the gateway device by default or set by the gateway device according to a user operation, which is not limited in the embodiments of this application. It should be noted that the physical storage space of the gateway device is limited, and that storing too many first-level domain names in the domain name database makes the load on the gateway device too high when the database is queried. The preset number threshold should therefore be less than or equal to the physical upper limit on the number of first-level domain names in the gateway device, which may be set when the gateway device leaves the factory.
It should be noted that, after receiving the category of the first domain name from the domain name classification server, the gateway device may first perform the step of obtaining the Internet Protocol address corresponding to the first domain name and then perform step S405; or it may first perform step S405 and then perform the step of obtaining the Internet Protocol address corresponding to the first domain name; or it may perform step S405 and the step of obtaining the Internet Protocol address at the same time. This is not limited in the embodiments of this application.
Step S406: If the number of first-level domain names is greater than or equal to the preset number threshold, the gateway device obtains the storage value of each first-level domain name included in the domain names in the domain name database, and deletes the target first-level domain name and all of its subdomains from the domain name database. The target first-level domain name is the first-level domain name with the lowest storage value among the first-level domain names included in all domain names in the domain name database.
The higher the storage value of a first-level domain name, the higher the probability that it will be hit in the domain name database. Therefore, by determining the first-level domain name with the lowest storage value as the target first-level domain name and replacing it with the second domain name included in the first domain name, the gateway device can increase the average hit rate of the first-level domain names in the domain name database. This minimizes the communication traffic between the gateway device and the domain name classification server, which helps reduce the load on the domain name classification server.
In one implementation, the gateway device may obtain the storage value of each first-level domain name included in the domain names in the domain name database as follows: the gateway device derives the storage value from the use time, use duration, and/or use frequency of that first-level domain name. Specifically, the gateway device may compute a weighted sum of the use time, use duration, and/or use frequency of each first-level domain name to obtain its storage value. Here, the use time of a first-level domain name may refer to the moment at which it was last hit in the domain name database; the use duration may refer to the total number of times it was hit during its storage period in the domain name database; and the use frequency may refer to the ratio of the total number of times it was hit during its storage period to the total storage duration.
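The weighted sum described above can be sketched as follows. This is only an illustration under stated assumptions: the names `UsageStats`, `storage_value`, and `pick_target`, the weight values, and the conversion of the last-hit moment into a recency score are all hypothetical and are not specified by this application.

```python
from dataclasses import dataclass


@dataclass
class UsageStats:
    last_hit: float      # "use time": moment of the most recent hit (seconds)
    hit_count: int       # "use duration" as defined in the text: total hits while stored
    stored_since: float  # moment the entry was added to the database (seconds)


def storage_value(stats, now, w_time=1.0, w_count=1.0, w_freq=1.0):
    """Weighted sum of recency, hit count, and hit frequency (weights are assumptions)."""
    stored_for = max(now - stats.stored_since, 1.0)   # avoid division by zero
    recency = 1.0 / (1.0 + (now - stats.last_hit))    # assumption: more recent hit scores higher
    frequency = stats.hit_count / stored_for          # hits per second of storage
    return w_time * recency + w_count * stats.hit_count + w_freq * frequency


def pick_target(stats_by_domain, now):
    """Return the first-level domain name with the lowest storage value."""
    return min(stats_by_domain, key=lambda name: storage_value(stats_by_domain[name], now))
```

With such a function, a rarely and long-ago hit entry receives the lowest value and becomes the deletion target; a real gateway would likely normalize the three terms before weighting them.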
In one implementation, if the number of first-level domain names is less than the preset number threshold, the gateway device can directly store the second domain name, the third domain name, and the category of the first domain name in association in the domain name database; that is, the gateway device can directly perform step S407.
Step S407: The gateway device uses the second domain name as a first-level domain name and the third domain name as a subdomain of the second domain name, and stores the second domain name, the third domain name, and the category of the first domain name in association in the domain name database.
Specifically, after deleting the target first-level domain name and all of its subdomains from the domain name database, the gateway device can use the second domain name as a first-level domain name and the third domain name as a subdomain of the second domain name, and store the second domain name, the third domain name, and the category of the first domain name in association in the domain name database. In this way, when the gateway device later receives another access request that includes the first domain name (composed of the second domain name and the third domain name), it can obtain the category of the first domain name directly from the domain name database instead of from the domain name classification server. This reduces the communication traffic between the gateway device and the domain name classification server, helps reduce the load on the domain name classification server, and thus helps improve website access speed.
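The handling of a new first-level domain name (check the limit, delete the lowest-value entry if necessary, then store) can be sketched as follows. The nested-dictionary layout, the function name `store_new_first_level`, and the precomputed `values` mapping are assumptions made for illustration; the application does not prescribe a data structure.

```python
def store_new_first_level(db, values, second, third, category, max_first_level=10):
    """Store a first-level domain that is not yet in the database.

    db:     first-level domain -> {subdomain: category}   (assumed layout)
    values: first-level domain -> storage value            (assumed precomputed)
    Returns the evicted first-level domain name, or None if nothing was evicted.
    """
    evicted = None
    if len(db) >= max_first_level:
        # Target = first-level domain with the lowest storage value.
        evicted = min(db, key=lambda name: values.get(name, 0.0))
        del db[evicted]            # removes the target and all of its subdomains
        values.pop(evicted, None)
    db[second] = {third: category}  # second domain becomes a first-level entry
    values.setdefault(second, 0.0)
    return evicted
```

Deleting the dictionary entry for the target removes all of its subdomains in one operation, which mirrors the deletion of the target first-level domain name together with all of its subdomains.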
Step S408: If the second domain name is among the first-level domain names included in the domain names in the domain name database, the gateway device determines whether the third domain name is among the subdomains of the second domain name in the domain name database.
Specifically, the domain names in the domain name database are stored in two levels, with a hierarchical storage relationship between each first-level domain name and its subdomains. The first domain name can therefore be determined to exist in the domain name database only when the second domain name exists in the database and the third domain name exists among the subdomains of the second domain name. Consequently, if the second domain name is among the first-level domain names in the domain name database, the gateway device must further determine whether the third domain name is among the subdomains of the second domain name before it can determine whether the first domain name exists in the domain name database.
Step S409: If the third domain name is among the subdomains of the second domain name in the domain name database, the gateway device queries the domain name database for the category of the first domain name. If the category of the first domain name is an access-allowed category, the gateway device obtains the Internet Protocol address corresponding to the first domain name and sends the Internet Protocol address to the terminal device.
Specifically, if the third domain name is among the subdomains of the second domain name in the domain name database, the first domain name (composed of the second domain name and the third domain name) exists in the domain name database. In the embodiments of this application, when the first domain name exists in the domain name database, the database also stores the category of the first domain name. The gateway device can therefore obtain the category of the first domain name directly by querying the domain name database, and process the request differently according to that category. If the category of the first domain name is an access-allowed category, the gateway device can obtain the Internet Protocol address corresponding to the first domain name and send it to the terminal device. If the category of the first domain name is an access-forbidden category, the gateway device can ignore or delete the access request, so as to prevent the terminal device from obtaining content from unhealthy websites that could harm the user's mental health.
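The two-level lookup and the category-dependent handling of steps S408 and S409 can be sketched as follows. The nested-dictionary layout, the category strings, the way the full name is composed from the third and second domain names, and the `resolve` callback are illustrative assumptions rather than details given in this application.

```python
ALLOWED = "allowed"      # assumed label for the access-allowed category
FORBIDDEN = "forbidden"  # assumed label for the access-forbidden category


def lookup_category(db, second, third):
    """Two-level lookup: first the first-level domain, then its subdomains."""
    subdomains = db.get(second)
    if subdomains is None:
        return None               # second domain name not stored
    return subdomains.get(third)  # None if third domain name not stored


def handle_request(db, second, third, resolve):
    """Return (status, ip). `resolve` is a hypothetical name-to-IP callback."""
    category = lookup_category(db, second, third)
    if category is None:
        return ("miss", None)     # would continue with the classification server path
    if category == ALLOWED:
        return (ALLOWED, resolve(f"{third}.{second}"))
    return (FORBIDDEN, None)      # request is ignored; no address is returned
```

A miss at either level is reported the same way here, because in both cases the first domain name is not in the database and its category must be obtained elsewhere.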
Step S410: If the third domain name is not among the subdomains of the second domain name in the domain name database, the gateway device obtains the transition probability of the first domain name.
Specifically, if the third domain name is not among the subdomains of the second domain name in the domain name database, the first domain name (composed of the second domain name and the third domain name) does not exist in the domain name database, and the gateway device needs to request the category of the first domain name from the domain name classification server. In one implementation, the gateway device can obtain the transition probability of the first domain name and, according to whether it is greater than a preset probability threshold, estimate the probability that the category of the first domain name is stored in the domain name classification server. The gateway device sends the first domain name to the domain name classification server only when that probability is high.
It should be noted that, for the specific process by which the gateway device obtains the transition probability of the first domain name, reference may be made to the description of step S202 in FIG. 2; details are not repeated here.
Step S411: If the transition probability of the first domain name is greater than the preset probability threshold, the gateway device sends the first domain name to the domain name classification server so that the domain name classification server queries the category of the first domain name. The gateway device receives the category of the first domain name from the domain name classification server. If the category of the first domain name is an access-allowed category, the gateway device obtains the Internet Protocol address corresponding to the first domain name and sends the Internet Protocol address to the terminal device.
It should be noted that, for the specific execution of step S411, reference may be made to the description of steps S203 to S206 in FIG. 2; details are not repeated here.
Step S412: The gateway device obtains the number of subdomains of the second domain name in the domain name database, and determines whether that number is greater than or equal to a first preset number threshold.
Specifically, when the second domain name is among the first-level domain names in the domain name database but the third domain name is not among the subdomains of the second domain name, the gateway device can, after receiving the category of the first domain name from the domain name classification server, store the third domain name in association with the category of the first domain name in the domain name database. In this way, the first domain name and its category are added to the domain name database, so that when the gateway device later receives another access request that includes the first domain name, it can obtain the category of the first domain name directly from the domain name database instead of from the domain name classification server. This reduces the communication traffic between the gateway device and the domain name classification server, helps reduce the load on the domain name classification server, and thus helps improve website access speed.
In one implementation, before storing the third domain name in association with the category of the first domain name in the domain name database, the gateway device must first determine whether the total number of subdomains of the second domain name stored in the domain name database is greater than or equal to the first preset number threshold. If the total is less than the first preset number threshold, adding the third domain name to the subdomains of the second domain name will not cause the total to exceed the preset upper limit on the number of subdomains; that is, the gateway device can add the third domain name and the category of the first domain name to the domain name database directly. A total greater than the first preset number threshold indicates that the number of stored subdomains of the second domain name already exceeds the preset upper limit; a total equal to the first preset number threshold indicates that adding one more subdomain would cause the total to exceed the preset upper limit. In either case, the gateway device must first determine a target subdomain among the subdomains of the second domain name stored in the domain name database and delete it; only then can it add the third domain name and the category of the first domain name to the domain name database. In one implementation, the first preset number threshold may be a default of the gateway device, or may be set by the gateway device according to user operations; this is not limited in the embodiments of this application. It should be noted that the physical storage space of the gateway device is limited, and that storing too many subdomains for each first-level domain name in the domain name database would place an excessive load on the gateway device when it queries the database. The first preset number threshold should therefore be less than or equal to the physical upper limit on the number of subdomains of each first-level domain name in the gateway device, which may be set when the gateway device leaves the factory.
It should be noted that, after receiving the category of the first domain name from the domain name classification server, the gateway device may first perform the step of obtaining the Internet Protocol address corresponding to the first domain name and then perform step S412; or it may first perform step S412 and then perform the step of obtaining the Internet Protocol address corresponding to the first domain name; or it may perform step S412 and the step of obtaining the Internet Protocol address at the same time. This is not limited in the embodiments of this application.
Step S413: If the number of subdomains of the second domain name is greater than or equal to the first preset number threshold, the gateway device obtains the storage value of each subdomain of the second domain name in the domain name database, and deletes the target subdomain from the domain name database. The target subdomain is the subdomain with the lowest storage value among the subdomains of the second domain name in the domain name database.
The higher the storage value of a subdomain, the higher the probability that it will be hit in the domain name database. Therefore, by determining the subdomain with the lowest storage value as the target subdomain and replacing it with the third domain name (that is, a subdomain of the second domain name), the gateway device can increase the average hit rate of the subdomains in the domain name database. This minimizes the communication traffic between the gateway device and the domain name classification server, which helps reduce the load on the domain name classification server.
In one implementation, the gateway device may determine the target subdomain among the subdomains of the second domain name as follows: the gateway device obtains the storage value of each subdomain of the second domain name and determines the subdomain with the lowest storage value as the target subdomain. In one implementation, the gateway device may obtain the storage value of each subdomain of the second domain name from the use time, use duration, and/or use frequency of that subdomain.
In one implementation, if the number of subdomains of the second domain name is less than the first preset number threshold, the gateway device can directly store the third domain name and the category of the first domain name in association in the domain name database; that is, the gateway device can directly perform step S414.
Step S414: The gateway device uses the third domain name as a subdomain of the second domain name, and stores the third domain name and the category of the first domain name in association in the domain name database.
Specifically, after deleting the target subdomain from the domain name database, the gateway device can use the third domain name as a subdomain of the second domain name and store the third domain name and the category of the first domain name in association in the domain name database. In this way, when the gateway device later receives another access request that includes the first domain name (composed of the second domain name and the third domain name), it can obtain the category of the first domain name directly from the domain name database instead of from the domain name classification server. This reduces the communication traffic between the gateway device and the domain name classification server, helps reduce the load on the domain name classification server, and thus helps improve website access speed.
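Steps S412 to S414 can be sketched in the same style for the subdomain level. The function name `store_subdomain`, the dictionary layout, and the precomputed `values` mapping are assumptions for illustration only.

```python
def store_subdomain(subdomains, values, third, category, max_subdomains=20):
    """Add `third` under an existing first-level domain, evicting if the limit is reached.

    subdomains: subdomain -> category, for one first-level domain (assumed layout)
    values:     subdomain -> storage value (assumed precomputed)
    Returns the evicted subdomain, or None if nothing was evicted.
    """
    evicted = None
    if third not in subdomains and len(subdomains) >= max_subdomains:
        # Target = subdomain with the lowest storage value.
        evicted = min(subdomains, key=lambda name: values.get(name, 0.0))
        del subdomains[evicted]
        values.pop(evicted, None)
    subdomains[third] = category   # associate the subdomain with the category
    values.setdefault(third, 0.0)
    return evicted
```

Only the single target subdomain is removed here, in contrast to the first-level case, where the target is removed together with all of its subdomains.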
By implementing the embodiments of this application, when the second domain name is not among the first-level domain names in the domain name database, the gateway device, after receiving the category of the first domain name from the domain name classification server, deletes from the domain name database the target first-level domain name with the lowest storage value together with all of its subdomains. It then uses the second domain name as a first-level domain name and the third domain name as a subdomain of the second domain name, and stores the second domain name, the third domain name, and the category of the first domain name in association in the domain name database. In this way, on the one hand, the average hit rate of the first-level domain names in the domain name database can be increased. On the other hand, when the gateway device later receives another access request that includes the first domain name (composed of the second domain name and the third domain name), it can obtain the category of the first domain name directly from its own domain name database instead of from the domain name classification server. This reduces the communication traffic between the gateway device and the domain name classification server, helps reduce the load on the domain name classification server, and thus helps improve website access speed.
Refer to FIG. 6, which is a schematic structural diagram of a data processing apparatus provided by an embodiment of this application. The data processing apparatus 60 is configured to perform the steps performed by the gateway device in the method embodiments corresponding to FIG. 2 to FIG. 4, and may include:
a receiving unit 601, configured to receive an access request sent by a terminal device, where the access request includes a first domain name;
an obtaining unit 602, configured to obtain the transition probability of the first domain name;
a sending unit 603, configured to send the first domain name to the domain name classification server if the transition probability of the first domain name is greater than a preset probability threshold;
the receiving unit 601 being further configured to receive the category of the first domain name sent by the domain name classification server;
the obtaining unit 602 being further configured to obtain the Internet Protocol address corresponding to the first domain name if the category of the first domain name is an access-allowed category;
the sending unit 603 being further configured to send the Internet Protocol address to the terminal device.
In one implementation, the first domain name may include a character string. When obtaining the transition probability of the first domain name, the obtaining unit 602 is specifically configured to: obtain each character pair in the character string and the transition probability of each character pair; obtain the transition probability of the character string included in the first domain name according to the transition probabilities of the character pairs; and use the transition probability of the character string as the transition probability of the first domain name.
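One possible way to combine character-pair probabilities into a probability for the whole string is sketched below. The text above only states that the string's transition probability is obtained "according to" the pair probabilities; treating character pairs as adjacent characters, combining them by geometric mean, and using a small floor for unseen pairs are assumptions of this sketch, as are the names `char_pairs` and `transition_probability` and the contents of the `pair_probs` table.

```python
import math


def char_pairs(s):
    """Adjacent character pairs of a string, e.g. 'abc' -> ['ab', 'bc'] (assumed definition)."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def transition_probability(s, pair_probs, floor=1e-6):
    """Geometric mean of the pair transition probabilities (combination rule is an assumption).

    pair_probs: character pair -> transition probability (hypothetical, pre-trained table)
    floor:      probability assigned to pairs missing from the table
    """
    pairs = char_pairs(s)
    if not pairs:
        return 0.0
    log_sum = sum(math.log(pair_probs.get(p, floor)) for p in pairs)
    return math.exp(log_sum / len(pairs))
```

Averaging in log space keeps long strings from being penalized merely for their length; a plain product of the pair probabilities would be an equally valid reading of the text.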
In one implementation, the gateway device may have a domain name database, and the domain name database may include multiple domain names and the category of each of those domain names. When sending the first domain name to the domain name classification server if the transition probability of the first domain name is greater than the preset probability threshold, the sending unit 603 is specifically configured to: send the first domain name to the domain name classification server if the transition probability of the first domain name is greater than the preset probability threshold and the first domain name does not exist in the domain name database.
In one implementation, the data processing apparatus 60 may further include a storage unit 604, configured to store the first domain name in association with the category of the first domain name in the domain name database.
In one implementation, each domain name in the domain name database may include a first-level domain name and a subdomain of the first-level domain name, and the first domain name may include a second domain name and a third domain name. The first domain name not existing in the domain name database may include: the second domain name not being among the first-level domain names included in the domain names in the domain name database, and/or the third domain name not being among the subdomains of the first-level domain names included in the domain names in the domain name database.
In one implementation, the data processing apparatus 60 may further include a deleting unit 605. The obtaining unit 602 is further configured to: if the second domain name is not among the first-level domain names included in the domain names in the domain name database, obtain the number of first-level domain names included in all domain names in the domain name database; and if the number of first-level domain names is greater than or equal to the preset number threshold, obtain the storage value of each first-level domain name included in the domain names in the domain name database. The deleting unit 605 is configured to delete the target first-level domain name and all of its subdomains from the domain name database, where the target first-level domain name is the first-level domain name with the lowest storage value among the first-level domain names included in all domain names in the domain name database. When storing the first domain name in association with the category of the first domain name in the domain name database, the storage unit 604 is specifically configured to: use the second domain name as a first-level domain name and the third domain name as a subdomain of the second domain name, and store the second domain name, the third domain name, and the category of the first domain name in association in the domain name database.
In an implementation, when obtaining the storage value of each first-level domain name included in the domain names in the domain name database, the obtaining unit 602 is specifically configured to: obtain the storage value of each such first-level domain name according to its use time, use duration, and/or use frequency.
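This passage names the inputs to the storage value but not a formula. The sketch below shows one possible scoring, assumed purely for illustration: a weighted sum of a recency term derived from the last use time, the cumulative use duration, and the use count. The `UsageStats` fields, the weights, and the recency transform are all hypothetical choices, not the method of the application.

```python
from dataclasses import dataclass


@dataclass
class UsageStats:
    last_used: float       # time of the most recent use, in seconds
    total_duration: float  # cumulative use duration, in seconds
    use_count: int         # number of times the domain name was used


def storage_value(
    stats: UsageStats,
    now: float,
    w_recency: float = 1.0,
    w_duration: float = 1.0,
    w_frequency: float = 1.0,
) -> float:
    """Combine use time, use duration and use frequency into a single score."""
    age = max(now - stats.last_used, 0.0)
    # Recency term: 1.0 for a domain name used just now, decaying towards 0.
    recency = 1.0 / (1.0 + age)
    return (
        w_recency * recency
        + w_duration * stats.total_duration
        + w_frequency * stats.use_count
    )
```

Setting a weight to zero drops the corresponding factor, which mirrors the "and/or" wording of the passage.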
In an implementation, the sending unit 603 is further configured to send a data initialization request to the domain name classification server upon detecting that the gateway device is powered on. The data initialization request is used to request domain name data. The domain name data includes a domain name set and the category of each domain name in the set. The domain names in the set are determined by the domain name classification server according to the access time, access duration, and/or access frequency of each domain name, and each domain name in the set includes a first-level domain name and subdomain names of that first-level domain name. The receiving unit 601 is further configured to receive the domain name data, and the storage unit 604 is further configured to store the domain name data in the domain name database.
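The power-on initialization can be sketched on the gateway side as follows. The sketch assumes the request/response exchange with the domain name classification server is hidden behind a hypothetical `fetch_domain_data` callable that returns `(first-level domain name, subdomain name, category)` records. The record shape and the function names are assumptions, since the passage does not specify a wire format.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical record: (first-level domain name, subdomain name, category).
DomainRecord = Tuple[str, str, str]


def initialize_database(
    fetch_domain_data: Callable[[], List[DomainRecord]],
) -> Dict[str, Dict[str, str]]:
    """Build the local domain name database from the server's initial data."""
    db: Dict[str, Dict[str, str]] = {}
    # fetch_domain_data stands in for sending the data initialization request
    # and receiving the domain name data from the classification server.
    for first_level, subdomain, category in fetch_domain_data():
        db.setdefault(first_level, {})[subdomain] = category
    return db
```

In a test, `fetch_domain_data` can be replaced by a function that returns a fixed list of records.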
In an implementation, the access request may further include an identifier of the terminal device, and the data processing apparatus 60 may further include a processing unit 606. The processing unit 606 is configured to trigger the step of obtaining the transition probability of the first domain name if the identifier of the terminal device is a preset identifier.
It should be noted that, for content not mentioned in the embodiment corresponding to FIG. 6 and for the specific implementation of the steps performed by each unit, reference may be made to the embodiments shown in FIG. 2 to FIG. 4 and the foregoing description; details are not repeated here.
In an implementation, the functions implemented by the units in FIG. 6 may be implemented by a processor in combination with a transceiver. Referring to FIG. 7, FIG. 7 is a schematic structural diagram of a distributed data management device provided by an embodiment of this application. The distributed data management device 70 includes a transceiver 701, a processor 702, and a memory 703, which are connected by one or more communication buses.
The transceiver 701 is configured to receive or send data. For example, the transceiver 701 may be configured to receive an access request sent by a terminal device, or to send the first domain name to a domain name classification server.
The processor 702 is configured to perform the functions of the gateway device in the methods described in FIG. 2 to FIG. 4. The processor 702 may be a central processing unit (CPU), a network processor (NP), a hardware chip, or any combination thereof.
The memory 703 is configured to store program code and the like. The memory 703 may include a volatile memory, such as a random access memory (RAM); the memory 703 may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 703 may also include a combination of the foregoing types of memory.
The processor 702 may call the program code stored in the memory 703 to perform the following operations:
receiving an access request sent by a terminal device, where the access request includes a first domain name;
obtaining the transition probability of the first domain name;
if the transition probability of the first domain name is greater than a preset probability threshold, sending the first domain name to a domain name classification server;
receiving the category of the first domain name sent by the domain name classification server;
if the category of the first domain name is an allowed access category, obtaining the Internet Protocol address corresponding to the first domain name; and
sending the Internet Protocol address to the terminal device.
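The operations listed above can be sketched as a single request handler. This is an illustrative outline only: the probability computation, the classification server call, and address resolution are injected as callables (`transition_probability`, `classify`, `resolve`), all of which are hypothetical names. The listed operations do not state what happens when the probability does not exceed the threshold or when the category is not allowed, so the sketch delegates the first case to a hypothetical `fallback` hook and, as an assumption, returns no address in the second.

```python
from typing import Callable, Optional, Set


def handle_access_request(
    first_domain: str,
    transition_probability: Callable[[str], float],
    probability_threshold: float,
    classify: Callable[[str], str],
    resolve: Callable[[str], str],
    allowed_categories: Set[str],
    fallback: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the IP address to send to the terminal device, or None."""
    if transition_probability(first_domain) <= probability_threshold:
        # Not covered by the listed operations; delegated to a hypothetical hook.
        return fallback(first_domain)
    # Above the threshold: ask the classification server for the category.
    category = classify(first_domain)
    if category not in allowed_categories:
        # Assumption: no address is returned for a category that is not allowed.
        return None
    # Allowed access category: obtain the IP address for the terminal device.
    return resolve(first_domain)
```

Passing the collaborators in as callables keeps the control flow testable without a real classification server or DNS resolver.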
Further, the processor 702 may also perform the operations corresponding to the gateway device in the embodiments shown in FIG. 2 to FIG. 4. For details, refer to the description in the method embodiments; they are not repeated here.
An embodiment of this application further provides a computer-readable storage medium, which may be used to store the computer software instructions used by the data processing apparatus in the embodiment shown in FIG. 6, including the program designed for the gateway device in the foregoing embodiments.
The computer-readable storage medium includes, but is not limited to, a flash memory, a hard disk drive, and a solid-state drive.
An embodiment of this application further provides a computer program product. When run by a computing device, the computer program product can execute the data processing method designed for the gateway device in the embodiments of FIG. 2 to FIG. 4.
An embodiment of this application further provides a chip, including a processor and a memory. The memory is configured to store a computer program, and the processor is configured to call the computer program from the memory and run it; the computer program is used to implement the method in the foregoing method embodiments.
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described in the embodiments disclosed in this application can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such an implementation should not be considered beyond the scope of this application.
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted through the computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device, such as a server or a data center, that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid-state drive (SSD)).
The foregoing is merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (20)

  1. A data processing method, applied to a gateway device, the method comprising:
    receiving an access request sent by a terminal device, wherein the access request includes a first domain name;
    obtaining the transition probability of the first domain name;
    if the transition probability of the first domain name is greater than a preset probability threshold, sending the first domain name to a domain name classification server;
    receiving the category of the first domain name sent by the domain name classification server; and
    if the category of the first domain name is an allowed access category, obtaining the Internet Protocol address corresponding to the first domain name, and sending the Internet Protocol address to the terminal device.
  2. The method according to claim 1, wherein the first domain name includes a character string, and the obtaining the transition probability of the first domain name comprises:
    obtaining each character pair in the character string and the transition probability of each character pair; and
    obtaining the transition probability of the character string according to the transition probabilities of the character pairs, and using the transition probability of the character string as the transition probability of the first domain name.
  3. The method according to claim 1 or 2, wherein the gateway device has a domain name database, and the domain name database includes a plurality of domain names and the category of each of the plurality of domain names; and the sending the first domain name to a domain name classification server if the transition probability of the first domain name is greater than a preset probability threshold comprises:
    if the transition probability of the first domain name is greater than the preset probability threshold and the first domain name does not exist in the domain name database, sending the first domain name to the domain name classification server.
  4. The method according to claim 3, wherein after the receiving the category of the first domain name sent by the domain name classification server, the method further comprises:
    storing the first domain name in the domain name database in association with the category of the first domain name.
  5. The method according to claim 4, wherein each domain name in the domain name database includes a first-level domain name and subdomain names of the first-level domain name, and the first domain name includes a second domain name and a third domain name; and
    the first domain name not existing in the domain name database comprises: the second domain name does not exist among the first-level domain names included in the domain names in the domain name database, and/or the third domain name does not exist among the subdomain names of the first-level domain names included in the domain names in the domain name database.
  6. The method according to claim 5, wherein before the storing the first domain name in the domain name database in association with the category of the first domain name, the method further comprises:
    if the second domain name does not exist among the first-level domain names included in the domain names in the domain name database, obtaining the number of first-level domain names included in all domain names in the domain name database;
    if the number is greater than or equal to a preset number threshold, obtaining the storage value of each first-level domain name included in the domain names in the domain name database; and
    deleting, from the domain name database, a target first-level domain name and all subdomain names of the target first-level domain name, wherein the target first-level domain name is the first-level domain name with the lowest storage value among the first-level domain names included in all domain names in the domain name database; and
    the storing the first domain name in the domain name database in association with the category of the first domain name comprises:
    using the second domain name as a first-level domain name, using the third domain name as a subdomain name of the second domain name, and storing the second domain name, the third domain name, and the category of the first domain name in association in the domain name database.
  7. The method according to claim 6, wherein the obtaining the storage value of each first-level domain name included in the domain names in the domain name database comprises:
    obtaining the storage value of each first-level domain name included in the domain names in the domain name database according to the use time, use duration, and/or use frequency of that first-level domain name.
  8. The method according to any one of claims 3 to 7, wherein the method further comprises:
    upon detecting that the gateway device is powered on, sending a data initialization request to the domain name classification server, wherein the data initialization request is used to request domain name data, the domain name data includes a domain name set and the category of each domain name in the domain name set, the domain names in the domain name set are determined by the domain name classification server according to the access time, access duration, and/or access frequency of each domain name, and each domain name in the domain name set includes a first-level domain name and subdomain names of the first-level domain name; and
    receiving the domain name data, and storing the domain name data in the domain name database.
  9. The method according to any one of claims 1 to 8, wherein the access request further includes an identifier of the terminal device, and after the receiving an access request sent by a terminal device, the method further comprises:
    if the identifier of the terminal device is a preset identifier, triggering the step of obtaining the transition probability of the first domain name.
  10. A data processing apparatus, comprising:
    a receiving unit, configured to receive an access request sent by a terminal device, wherein the access request includes a first domain name;
    an obtaining unit, configured to obtain the transition probability of the first domain name; and
    a sending unit, configured to send the first domain name to a domain name classification server if the transition probability of the first domain name is greater than a preset probability threshold; wherein
    the receiving unit is further configured to receive the category of the first domain name sent by the domain name classification server;
    the obtaining unit is further configured to obtain the Internet Protocol address corresponding to the first domain name if the category of the first domain name is an allowed access category; and
    the sending unit is further configured to send the Internet Protocol address to the terminal device.
  11. The apparatus according to claim 10, wherein the first domain name includes a character string, and when obtaining the transition probability of the first domain name, the obtaining unit is specifically configured to:
    obtain each character pair in the character string and the transition probability of each character pair, obtain the transition probability of the character string according to the transition probabilities of the character pairs, and use the transition probability of the character string as the transition probability of the first domain name.
  12. The apparatus according to claim 10 or 11, wherein the gateway device has a domain name database, and the domain name database includes a plurality of domain names and the category of each of the plurality of domain names; and when sending the first domain name to the domain name classification server if the transition probability of the first domain name is greater than the preset probability threshold, the sending unit is specifically configured to:
    if the transition probability of the first domain name is greater than the preset probability threshold and the first domain name does not exist in the domain name database, send the first domain name to the domain name classification server.
  13. The apparatus according to claim 12, wherein the data processing apparatus further comprises a storage unit; and
    the storage unit is configured to store the first domain name in the domain name database in association with the category of the first domain name.
  14. The apparatus according to claim 13, wherein each domain name in the domain name database includes a first-level domain name and subdomain names of the first-level domain name, and the first domain name includes a second domain name and a third domain name; and
    the first domain name not existing in the domain name database comprises: the second domain name does not exist among the first-level domain names included in the domain names in the domain name database, and/or the third domain name does not exist among the subdomain names of the first-level domain names included in the domain names in the domain name database.
  15. The apparatus according to claim 14, wherein the data processing apparatus further comprises a deletion unit;
    the obtaining unit is further configured to: if the second domain name does not exist among the first-level domain names included in the domain names in the domain name database, obtain the number of first-level domain names included in all domain names in the domain name database; and if the number is greater than or equal to a preset number threshold, obtain the storage value of each first-level domain name included in the domain names in the domain name database;
    the deletion unit is configured to delete, from the domain name database, a target first-level domain name and all subdomain names of the target first-level domain name, wherein the target first-level domain name is the first-level domain name with the lowest storage value among the first-level domain names included in all domain names in the domain name database; and
    when storing the first domain name in the domain name database in association with the category of the first domain name, the storage unit is specifically configured to: use the second domain name as a first-level domain name, use the third domain name as a subdomain name of the second domain name, and store the second domain name, the third domain name, and the category of the first domain name in association in the domain name database.
  16. The apparatus according to claim 15, wherein when obtaining the storage value of each first-level domain name included in the domain names in the domain name database, the obtaining unit is specifically configured to:
    obtain the storage value of each first-level domain name included in the domain names in the domain name database according to the use time, use duration, and/or use frequency of that first-level domain name.
  17. The apparatus according to any one of claims 12 to 16, wherein the data processing apparatus further comprises a storage unit;
    the sending unit is further configured to send a data initialization request to the domain name classification server upon detecting that the gateway device is powered on, wherein the data initialization request is used to request domain name data, the domain name data includes a domain name set and the category of each domain name in the domain name set, the domain names in the domain name set are determined by the domain name classification server according to the access time, access duration, and/or access frequency of each domain name, and each domain name in the domain name set includes a first-level domain name and subdomain names of the first-level domain name;
    the receiving unit is further configured to receive the domain name data; and
    the storage unit is configured to store the domain name data in the domain name database.
  18. The apparatus according to any one of claims 10 to 17, wherein the access request further includes an identifier of the terminal device, and the data processing apparatus further comprises a processing unit; and
    the processing unit is configured to trigger the step of obtaining the transition probability of the first domain name if the identifier of the terminal device is a preset identifier.
  19. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, the computer program includes program instructions, and the program instructions, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 9.
  20. A gateway device, comprising a memory and a processor, wherein the memory stores program instructions, the processor is connected to the memory through a bus, and the processor executes the program instructions stored in the memory, so that the gateway device performs the method according to any one of claims 1 to 9.
PCT/CN2019/080652 2019-03-29 2019-03-29 Data processing method and apparatus therefor WO2020199029A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2019/080652 WO2020199029A1 (en) 2019-03-29 2019-03-29 Data processing method and apparatus therefor
CN201980093696.8A CN113545020B (en) 2019-03-29 2019-03-29 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/080652 WO2020199029A1 (en) 2019-03-29 2019-03-29 Data processing method and apparatus therefor

Publications (1)

Publication Number Publication Date
WO2020199029A1 true WO2020199029A1 (en) 2020-10-08

Family

ID=72664412

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/080652 WO2020199029A1 (en) 2019-03-29 2019-03-29 Data processing method and apparatus therefor

Country Status (2)

Country Link
CN (1) CN113545020B (en)
WO (1) WO2020199029A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115865427A (en) * 2022-11-14 2023-03-28 重庆伏特猫科技有限公司 Data acquisition and monitoring method based on data routing gateway

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102025713A (en) * 2010-02-09 2011-04-20 中国移动通信集团北京有限公司 Access control method, system and DNS (Domain Name Server) server
CN104901943A (en) * 2012-03-31 2015-09-09 北京奇虎科技有限公司 Method and system for accessing website
WO2018032936A1 (en) * 2016-08-18 2018-02-22 中兴通讯股份有限公司 Method and device for checking domain name generated by domain generation algorithm
CN108200034A (en) * 2017-12-27 2018-06-22 新华三信息安全技术有限公司 A kind of method and device for identifying domain name

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102355490B (en) * 2011-08-23 2013-08-21 武汉大学 Spatial information cluster cache pre-fetching method for network spatial information service system
US9875355B1 (en) * 2013-09-17 2018-01-23 Amazon Technologies, Inc. DNS query analysis for detection of malicious software

Also Published As

Publication number Publication date
CN113545020A (en) 2021-10-22
CN113545020B (en) 2022-07-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application Ref document number: 19922441; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase Ref country code: DE
122 Ep: pct application non-entry in european phase Ref document number: 19922441; Country of ref document: EP; Kind code of ref document: A1