CN104144223B - Data acquisition method and device - Google Patents

Info

Publication number
CN104144223B
Authority
CN
China
Prior art keywords
data source
connection quality
source node
source nodes
connection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410415518.1A
Other languages
Chinese (zh)
Other versions
CN104144223A (en)
Inventor
高阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd

Landscapes

  • Information Transfer Between Computers (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

An embodiment of the invention discloses a data acquisition method and device. Data source nodes that satisfy a connection request are grouped according to one factor that affects connection quality; a matching data source node is hit in each group according to a hash mapping rule; according to the connection quality index of the data source nodes, the node with the optimal connection quality is selected from the nodes hit in the groups; and data is acquired from that node. Because the node with the optimal connection quality is better suited to serving the client, acquiring data from it is more efficient, which can improve the transmission efficiency of the whole network.

Description

Data acquisition method and device
Technical Field
The present invention relates to the field of internet technologies, and in particular, to a data acquisition method and apparatus.
Background
In a tree-structured network, when a client sends a connection request to a server, the server that receives the request may not hold the response data locally and must send a data request to its parent node to obtain it.
In this process, several data source nodes satisfy the client's connection request. The proxy server must select one of them, acquire data from it, and return that data to the client in response to the connection request. In the prior art, a data source node is hit by dynamic random selection or by round-robin scheduling.
These prior-art approaches cannot flexibly hit the data source node best suited to serving the client. When data is acquired from a hit node that is poorly suited to the client, transmission efficiency is low, and the transmission efficiency of the whole network cannot be improved.
Disclosure of Invention
The embodiment of the invention aims to provide a data acquisition method and a data acquisition device, which can acquire a data source node more suitable for connecting a client, and further can improve the transmission efficiency when acquiring data and the transmission efficiency of the whole network.
In order to achieve the above purpose, the embodiment of the invention discloses a data acquisition method. The technical scheme is as follows:
grouping the data source nodes which accord with the connection request according to a factor which influences the connection quality;
the matched data source nodes are hit in each group of data source nodes according to a Hash mapping rule;
selecting a data source node with optimal connection quality from the hit matched data source nodes of each group according to the connection quality index of the data source node;
and acquiring data from the data source node with the optimal connection quality.
Before grouping the data source nodes meeting the connection request according to a factor influencing the connection quality, the method further comprises the following steps:
and initializing the connection quality index of the data source node, so that the initial value of the connection quality index ensures that the data source node conforms to the connection request.
The grouping of the data source nodes conforming to the connection request according to a factor affecting the connection quality comprises:
grouping the data source nodes which accord with the connection request according to regional distribution; or,
and grouping the data source nodes which conform to the connection request according to the operator.
Hitting the matched data source nodes in each group of data source nodes according to the hash mapping rule comprises:
using, in each group of data source nodes, the Uniform Resource Identifier (URI) of the connection request as the input of a hash function in the hash mapping rule, and hitting the matched data source node.
After the matched data source nodes are hit in each group of data source nodes according to the hash mapping rule, the method further comprises the following steps:
judging whether the time elapsed since the matched data source node was hit exceeds a preset time threshold;
and when the time threshold is exceeded, restoring the connection quality index of the hit matched data source node to its initial value.
After selecting a data source node with the optimal connection quality from the hit matched data source nodes according to the connection quality index of the data source node, the method further includes:
updating the connection quality index of the data source node with the optimal connection quality into a connection quality index measured in real time;
correspondingly, the acquiring data from the data source node with the optimal quality includes:
and acquiring data from the selected data source node according to the updated connection quality index.
The connection quality index comprises: connection speed and/or transmission bandwidth.
In order to achieve the above object, an embodiment of the present invention further discloses a data acquisition apparatus, including:
the grouping module is used for grouping the data source nodes which accord with the connection request according to a factor which influences the connection quality;
the hit module is used for hitting the matched data source nodes in each group of data source nodes divided by the grouping module according to a Hash mapping rule;
the selection module is used for selecting a data source node with the optimal connection quality from the data source nodes which are matched in each group by the hit module according to the connection quality index of the data source node;
and the acquisition module is used for acquiring data from the data source node with the optimal connection quality selected by the selection module.
The device further comprises:
and the initialization module is used for initializing the connection quality index of the data source node, so that the initial value of the connection quality index ensures that the data source node conforms to the connection request.
The grouping module is specifically configured to:
group the data source nodes that conform to the connection request according to regional distribution; or,
group the data source nodes that conform to the connection request according to operator.
The hit module is specifically configured to:
use, in each group of data source nodes, the Uniform Resource Identifier (URI) of the connection request as the input of a hash function in the hash mapping rule, and hit the matched data source node.
The device further comprises:
the judging module is used for judging whether the time elapsed since the matched data source node was hit exceeds a preset time threshold;
and the recovery module is used for restoring the connection quality index of the hit matched data source node to the initial value set by the initialization module when the judging module determines that the elapsed time exceeds the preset time threshold.
The device further comprises:
the updating module is used for updating the connection quality index of the data source node with the optimal connection quality selected by the selecting module into a connection quality index measured in real time;
correspondingly, the obtaining module is specifically configured to:
and acquiring data from the selected data source node according to the connection quality index updated by the updating module.
The connection quality index comprises: connection speed and/or transmission bandwidth.
According to the technical scheme of the embodiment of the invention, the data source nodes which accord with the connection request are grouped according to a factor which affects the connection quality, a matched data source node is hit in each group according to a Hash mapping rule, and a data source node with the optimal connection quality is selected from the hit matched data source nodes to acquire data. Because the data source nodes which accord with the connection request have the connection quality difference, the data source nodes are grouped according to a factor which affects the connection quality, the connection quality of each group of data source nodes is different, one data source node is hit from each group, the data source node can be selected from one group of data source nodes with good connection quality, and the data source node with the optimal connection quality is selected according to the connection quality index. The data source node with the optimal connection quality is more suitable for being connected with the client, so that the transmission efficiency of the process of acquiring data from the data source node is higher, and the transmission efficiency of the whole network can be improved.
Drawings
To illustrate the embodiments of the present invention and the prior-art solutions more clearly, the drawings used in their description are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a data acquisition method according to an embodiment of the present invention;
Fig. 2 is a flowchart of a second data acquisition method according to an embodiment of the present invention;
Fig. 3 is a flowchart of another data acquisition method according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a data acquisition apparatus according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of another data acquisition apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An embodiment of the present invention provides a data acquisition method, as shown in fig. 1, the data acquisition method includes:
S101: grouping the data source nodes that conform to the connection request according to a factor that affects connection quality;
In the embodiment of the invention, a data source node is a server that provides data in response to the connection request sent by the client. The connection quality of different data source nodes depends on certain factors: for example, nodes run by some operators have good connection quality while nodes run by others have poor connection quality, and likewise nodes in some regions connect well while nodes in other regions connect poorly. The embodiment groups the data source nodes that conform to the connection request according to one such factor, which ensures that a data source node can be selected from a group with good connection quality.
Specifically, the grouping may be to group the data source nodes meeting the connection request according to regional distribution, or may be to group the data source nodes meeting the connection request according to different operators.
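The grouping step can be illustrated with a minimal sketch. The `SourceNode` record, its field names, and the sample node values are assumptions made for illustration; the patent does not prescribe a data structure.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceNode:
    # Hypothetical record; field names are illustrative only.
    name: str
    operator: str
    region: str


def group_nodes(nodes, factor):
    """Group nodes by a single attribute, e.g. "operator" or "region"."""
    groups = defaultdict(list)
    for node in nodes:
        groups[getattr(node, factor)].append(node)
    return dict(groups)


nodes = [
    SourceNode("n1", "telecom", "north"),
    SourceNode("n2", "telecom", "south"),
    SourceNode("n3", "unicom", "north"),
]
by_operator = group_nodes(nodes, "operator")
by_region = group_nodes(nodes, "region")
```

Only one factor is used per grouping, matching the "or" in the text: the same node list yields different groups depending on whether `"operator"` or `"region"` is passed.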
S102: the matched data source nodes are hit in each group of data source nodes according to a Hash mapping rule;
The hash mapping rule includes a hash function, which takes an input of arbitrary length and produces an output of fixed length. Through the hash mapping rule, a set of input keys can be mapped onto a set of addresses, so that a given key hits a particular address.
In the embodiment of the present invention, the hash mapping rule maps the connection request sent by the client to a data source node. In a specific implementation, the identity of the client, such as its IP address or host name, may be used as the input of the hash function, or other data may be used. The hash mapping rule finds, in each group of data source nodes, the node that matches the input value, so that one data source node is hit in each group.
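One way to picture the per-group hit is the sketch below. The patent does not name a hash function or say how the hash output is reduced to a node, so SHA-256 and the modulo step are assumptions, as are the node names and the sample client IP.

```python
import hashlib


def hit_node(group, key):
    """Deterministically map `key` to one node of `group`.

    Assumption: SHA-256 reduced modulo the group size stands in for the
    unspecified hash mapping rule.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(group)
    return group[index]


groups = {
    "telecom": ["t1", "t2", "t3"],
    "unicom": ["u1", "u2"],
}
client_ip = "203.0.113.7"  # illustrative client identity used as the key
hits = {name: hit_node(members, client_ip) for name, members in groups.items()}
```

The same key always hits the same node within a given group, and exactly one node is hit per group, which is the property the following selection step relies on.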
S103: and selecting a data source node with the optimal connection quality from the hit matched data source nodes of each group according to the connection quality index of the data source node.
The connection quality index of a data source node may comprise one or more indicators, for example connection speed, transmission bandwidth, connection time, and delay time. The embodiment of the present invention limits neither the number of connection quality indicators nor which ones are used; the data source node with the best connection quality may be selected from the hit matched nodes of each group according to whichever indicators the user cares about.
Specifically, in the embodiment of the present invention, the connection quality indicator of the data source node is a connection speed and/or a transmission bandwidth.
S104: and acquiring data from the data source node with the optimal connection quality.
According to the embodiment of the invention, the data source nodes which accord with the connection request are grouped according to a factor which affects the connection quality, a matched data source node is hit in each group according to a Hash mapping rule, and a data source node with the optimal connection quality is selected from the hit matched data source nodes to obtain data. Because the data source nodes which accord with the connection request have the connection quality difference, the data source nodes are grouped according to a factor which affects the connection quality, the connection quality of each group of data source nodes is different, one data source node is hit from each group, the data source node can be selected from one group of data source nodes with good connection quality, and the data source node with the optimal connection quality is selected according to the connection quality index. The data source node with the optimal connection quality is more suitable for being connected with the client, so that the transmission efficiency of the process of acquiring data from the data source node is higher, and the transmission efficiency of the whole network can be improved.
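Steps S101 to S103 can be sketched end to end as follows. The node tuples, the choice of operator as the grouping factor, connection speed as the only quality index, and SHA-256 as the hash function are all assumptions for illustration; S104 (the actual data transfer) is left out.

```python
import hashlib
from collections import defaultdict

# Hypothetical node records: (name, operator, connection speed).
NODES = [
    ("a1", "telecom", 40.0),
    ("a2", "telecom", 55.0),
    ("b1", "unicom", 70.0),
    ("b2", "unicom", 30.0),
    ("c1", "mobile", 20.0),
]


def choose_source(nodes, key):
    # S101: group the nodes by one factor (here: operator).
    groups = defaultdict(list)
    for node in nodes:
        groups[node[1]].append(node)
    # S102: hit one node per group with the same hashed key.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big")
    hits = [members[value % len(members)] for members in groups.values()]
    # S103: among the hit nodes, pick the one with the best quality index.
    return max(hits, key=lambda node: node[2])


best = choose_source(NODES, "/video/42.ts")
# S104 would acquire the data from `best`; the transfer itself is omitted.
```

Note that the selection compares only the nodes that were hit, one per group, not every node in the network.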
On the basis of the foregoing embodiment, in order to obtain a better technical effect, an embodiment of the present invention provides a second data acquisition method, as shown in fig. 2, the data acquisition method includes:
S201: initializing a connection quality index of a data source node, so that the initial value of the connection quality index ensures that the data source node conforms to a connection request;
The initialization assigns an initial value to the connection quality index of each data source node. The initial value ensures that as many data source nodes as possible conform to the connection request sent by the client, so that more nodes can take part in the selection in the following steps, and it ensures that the network system containing the data source nodes can operate normally.
Specifically, the initial values of the connection quality index, such as the connection speed and the transmission bandwidth, may be added to an information table at initialization. The values in the table may be updated in real time as the network system runs, or refreshed whenever a preset update interval elapses, so that the table always reflects the latest state of each data source node. The information table is stored on the server that receives the connection request and can be inspected by the technicians who maintain the network.
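A minimal sketch of such an information table is shown below, assuming a plain in-memory dictionary keyed by node name. The field names and the initial values of 100.0 are hypothetical; the patent does not state concrete initial values or a storage format.

```python
def init_info_table(node_names, speed=100.0, bandwidth=100.0):
    """Build an information table with the same initial index values per node.

    Assumption: the defaults are placeholders chosen so that every node
    starts out eligible for selection.
    """
    return {name: {"speed": speed, "bandwidth": bandwidth} for name in node_names}


table = init_info_table(["t1", "u1", "m1"])
```

Each node gets its own entry, so a later update to one node's measured values does not affect the others.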
S202: grouping the data source nodes which accord with the connection request according to a factor which influences the connection quality;
Specifically, in this embodiment the grouping factor is the operator. The operator of each data source node is determined from the operator data center identifier that the node carries, and the data source nodes conforming to the connection request that carry identifiers of the same operator's data centers are placed in one group. For example, with three operators, Telecom, Unicom and Mobile, this step divides the data source nodes into three groups according to the Telecom, Unicom and Mobile data center identifiers. In a specific embodiment, the grouping information may also be added to the information table.
Step S201 may be performed before step S202, or may be performed simultaneously with step S202, that is, grouping is performed while initializing the connection quality index initial value of the data source node.
S203: the matched data source nodes are hit in each group of data source nodes according to a Hash mapping rule;
In contrast to the prior art, which uses the identity of the client as the input of the hash function, the embodiment of the present invention uses the URI (Uniform Resource Identifier) of the connection request sent by the client as the input of the hash function in the hash mapping rule in each group of data source nodes, so as to hit the matched data source node.
Specifically, a connection request sent by a client generally takes the form of a Uniform Resource Locator (URL). To keep the mapping accurate, the request is first cut: the domain name or IP address at the front of the URL is removed, leaving the URI, and the URI is then used as the input of the hash function to hit the matched data source node. Using the URI as the hash input maps the connection request itself directly to a data source node, so the hit node matches the request more closely. In addition, because the hash mapping is uniform, hits are not skewed toward particular nodes, and all data source nodes are used in a balanced way.
It should be noted that, because connection requests are mapped directly to data source nodes, identical requests sent by different clients hit the same node and are served from the data already cached there. This raises the cache utilization of the data source nodes and further improves the transmission efficiency of the network.
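The cutting step can be sketched with Python's standard `urllib.parse.urlsplit`. Treating the URI as the path plus the query string is an assumption; the patent only says that the domain name and IP-related content at the front of the URL are removed. The example URLs are illustrative.

```python
from urllib.parse import urlsplit


def request_uri(url):
    """Drop the scheme and host (domain name or IP) and keep the rest.

    Assumption: the remaining path and query string together form the
    URI that is fed to the hash function.
    """
    parts = urlsplit(url)
    uri = parts.path or "/"
    if parts.query:
        uri += "?" + parts.query
    return uri


# The same resource requested through a domain name and through an IP address.
uri_a = request_uri("http://cdn-a.example.com/video/42.ts?quality=hd")
uri_b = request_uri("http://10.0.0.8:8080/video/42.ts?quality=hd")
```

Both requests reduce to the same hash input, so they would hit the same data source node in each group regardless of which client sent them, which is what makes the cache reuse described above possible.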
S204: selecting a data source node with optimal connection quality from the hit matched data source nodes of each group according to the connection quality index of the data source node;
In the embodiment of the invention, if the selection uses only one connection quality index, the data source node with the optimal connection quality is found simply by comparing that index across the hit matched nodes. If the selection uses at least two connection quality indexes, the indexes may be given priority levels and their values compared in priority order to select the node with the optimal connection quality. Alternatively, a threshold may be set for each of the at least two indexes, and the data source nodes whose indexes all exceed their thresholds are taken as the nodes with the optimal connection quality.
Specifically, take a connection quality index consisting of connection speed and transmission bandwidth as an example. Suppose the data source nodes conforming to the connection request are divided into 3 groups and one matching node is hit in each group, giving 3 data source nodes in total. In one embodiment, connection speed has the higher priority: the node with the highest connection speed among the 3 is selected as the node with the optimal connection quality, and if several nodes share the highest connection speed, the one among them with the largest transmission bandwidth is selected. In another embodiment, data source node 1 has the highest connection speed but the lowest transmission bandwidth; data source node 2 has the lowest connection speed but the highest transmission bandwidth; and data source node 3 has neither the highest connection speed nor the highest transmission bandwidth, but both values exceed the set thresholds, so on balance data source node 3 is selected as the optimal node. In a specific embodiment, the values of the connection quality indexes may be recorded in an information table, and the node with the optimal connection quality is selected directly from the values in the table.
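The two selection strategies can be sketched as follows. The `Candidate` record, the numeric values, and the thresholds are invented to mirror the three-node example; the patent gives no concrete numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    # Hypothetical record for a node hit in one group.
    name: str
    speed: float
    bandwidth: float


def select_by_priority(candidates):
    """Connection speed first; transmission bandwidth breaks ties."""
    return max(candidates, key=lambda c: (c.speed, c.bandwidth))


def select_by_threshold(candidates, min_speed, min_bandwidth):
    """Keep the candidates whose indexes all exceed their thresholds."""
    return [c for c in candidates
            if c.speed > min_speed and c.bandwidth > min_bandwidth]


hits = [
    Candidate("node1", 100.0, 10.0),  # fastest, lowest bandwidth
    Candidate("node2", 20.0, 90.0),   # slowest, highest bandwidth
    Candidate("node3", 60.0, 50.0),   # neither highest, both above thresholds
]
by_priority = select_by_priority(hits)
by_threshold = select_by_threshold(hits, min_speed=50.0, min_bandwidth=40.0)
```

With these assumed values the priority strategy picks node1 and the threshold strategy keeps only node3, which reproduces the contrast between the two embodiments described above.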
S205: updating the connection quality index of the data source node with the optimal connection quality into a connection quality index measured in real time;
S206: acquiring data from the selected data source node according to the updated connection quality index.
After the data source node with the optimal connection quality is selected, its connection quality index is measured in real time and the stored index is updated to the measured value. Data is then acquired from the selected node according to the freshly measured index, which reduces lag in the system.
Specifically, the values of the connection quality indicators may all be recorded in an information table, and the contents in the information table are updated in real time so that the values of the connection quality indicators can be known in real time.
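The update in S205 can be sketched as below, assuming the information table is an in-memory dictionary and that the measurement is supplied as a callable. How the real-time measurement is taken is not described in the patent, so `fake_measure` is a stand-in that returns fixed values.

```python
def update_quality(table, node, measure):
    """Overwrite the stored index of `node` with freshly measured values.

    `measure` is a hypothetical callable returning (speed, bandwidth).
    """
    speed, bandwidth = measure(node)
    table[node] = {"speed": speed, "bandwidth": bandwidth}
    return table[node]


def fake_measure(node):
    # Stand-in for a real probe of the selected node.
    return 42.5, 80.0


table = {
    "t1": {"speed": 100.0, "bandwidth": 100.0},
    "u1": {"speed": 100.0, "bandwidth": 100.0},
}
latest = update_quality(table, "t1", fake_measure)
```

Only the selected node's entry changes; the entries of nodes that were not selected keep their previous values.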
Preferably, referring to fig. 3, after step S203 on the basis of the above embodiment, the method further includes:
S301: judging whether the time elapsed since the matched data source node was hit exceeds a preset time threshold;
S302: when the time threshold is exceeded, restoring the connection quality index of the hit matched data source node to its initial value.
Specifically, the preset time threshold may be derived from the connection speed of the data source node, or may be a fixed value. When the threshold is exceeded, a data source node that was hit by the hash mapping rule but not selected as the optimal node has its connection quality index restored to the initial value, that is, the value assigned at initialization. The node thus gets another chance to be selected as the optimal node when the next connection request arrives, which improves the stability and balance of the network system.
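Steps S301 and S302 can be sketched as a staleness check. The initial values, the fixed 30-second threshold in the usage lines, and the use of a monotonic clock are assumptions; the patent leaves the threshold and the timing mechanism open.

```python
import time

# Hypothetical initial index values assigned at initialization.
INITIAL = {"speed": 100.0, "bandwidth": 100.0}


def restore_if_stale(entry, hit_time, threshold_s, now=None):
    """S301/S302: reset `entry` to INITIAL once the threshold has passed.

    `hit_time` and `now` are readings of the same monotonic clock, in
    seconds; `now` can be passed explicitly to make the check testable.
    """
    if now is None:
        now = time.monotonic()
    if now - hit_time > threshold_s:
        entry.update(INITIAL)
        return True
    return False


entry = {"speed": 5.0, "bandwidth": 7.0}
still_fresh = restore_if_stale(entry, hit_time=0.0, threshold_s=30.0, now=10.0)
was_reset = restore_if_stale(entry, hit_time=0.0, threshold_s=30.0, now=31.0)
```

After the reset, a node that previously reported poor values competes on equal terms again the next time it is hit.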
In this embodiment, the connection quality indexes of the data source nodes are initialized, the nodes conforming to the connection request are grouped according to a factor affecting connection quality, a matched node is hit in each group according to the hash mapping rule, the node with the optimal connection quality is selected from the hit nodes, and data is acquired from the selected node according to the connection quality index measured in real time. Initializing all data source nodes ensures that more of them conform to the connection request sent by the client. Measuring and updating the quality index of the optimal node after it is selected means that data is acquired according to the latest connection quality index, which reduces lag in the network system, makes acquiring data from the data source node more efficient, and can thereby improve the transmission efficiency of the whole network.
Corresponding to the above method embodiment, an embodiment of the present invention further provides a data obtaining apparatus, where the apparatus is applied to a server, and as shown in fig. 4, the data obtaining apparatus includes:
a grouping module 401, configured to group data source nodes that meet the connection request according to a factor that affects connection quality;
a hit module 402, configured to hit the matched data source node in each group of data source nodes divided by the grouping module 401 according to a hash mapping rule;
a selecting module 403, configured to select, according to the connection quality index of the data source node, a data source node with the optimal connection quality from the data source nodes hit and matched in each group by the hitting module 402;
an obtaining module 404, configured to obtain data from the data source node with the optimal connection quality selected by the selecting module 403.
The grouping module 401 is specifically configured to:
group the data source nodes that conform to the connection request according to regional distribution; or,
group the data source nodes that conform to the connection request according to operator.
The hit module 402 is specifically configured to:
use, in each group of data source nodes, the Uniform Resource Identifier (URI) of the connection request as the input of a hash function in the hash mapping rule, and hit the matched data source node.
It should be noted that the apparatus in the embodiment of the present invention is applied to a server that receives connection requests sent by clients. When the local server has no response data, it forwards the connection request to its parent nodes, that is, to the other data source node servers that conform to the connection request, and obtains the data from them. The server, the client, and the other servers conforming to the connection request form a tree-structured network.
According to the technical scheme of the embodiment of the invention, the data source nodes which accord with the connection request are grouped according to a factor which affects the connection quality through the grouping module, a matched data source node is hit in each group according to a Hash mapping rule through the hitting module, a data source node with the optimal connection quality is selected from the hit matched data source nodes through the selecting module, and finally the data is acquired from the data source node with the optimal connection quality through the acquiring module. Because the data source nodes which accord with the connection request have connection quality difference, the grouping module in the embodiment of the invention carries out grouping according to a factor which affects the connection quality, the connection quality of each group of data source nodes has difference, the hit module hits one data source node from each group, so as to ensure that one data source node can be selected from one group of data source nodes with good connection quality, and the selection module selects the data source node with the optimal connection quality according to the connection quality index. The data source node with the optimal connection quality is more suitable for being connected with the client, so that the transmission efficiency of the process of acquiring data from the data source node by the acquisition module is higher, and the transmission efficiency of the whole network can be improved.
Further, referring to fig. 5, the apparatus further includes:
an initializing module 501, configured to initialize the connection quality index of the data source nodes, so that the initial value of the connection quality index ensures that all the data source nodes conform to the connection request.
The device further comprises:
a judging module 502, configured to judge whether a duration of time elapsed after the matched data source node is hit exceeds a preset time threshold;
a restoring module 503, configured to restore the value of the connection quality index of the hit matched data source node to the initial value obtained by the initializing module 501 when the judging module 502 determines that the time elapsed after the matched data source node is hit exceeds the preset time threshold.
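The interplay of the initializing, judging, and restoring modules can be sketched as follows. This is an illustrative Python sketch only: the class name `QualityTable`, the constant `INITIAL_QUALITY`, and the injectable `clock` parameter are hypothetical and do not come from the patent.

```python
import time

INITIAL_QUALITY = 100.0  # hypothetical initial value chosen so every node stays eligible


class QualityTable:
    """Tracks a connection quality index per node and resets stale values."""

    def __init__(self, node_names, ttl_seconds, clock=time.monotonic):
        self._clock = clock
        self._ttl = ttl_seconds          # the preset time threshold
        self._quality = {name: INITIAL_QUALITY for name in node_names}
        self._hit_at = {}                # time at which each node was last hit

    def record_hit(self, name):
        """Remember when the node was hit by the hash mapping rule."""
        self._hit_at[name] = self._clock()

    def set_quality(self, name, value):
        """Store a newly measured quality index for the node."""
        self._quality[name] = value

    def quality(self, name):
        """Return the index, restoring the initial value once the threshold has passed."""
        hit_at = self._hit_at.get(name)
        if hit_at is not None and self._clock() - hit_at > self._ttl:
            self._quality[name] = INITIAL_QUALITY
            del self._hit_at[name]
        return self._quality[name]
```

Restoring the initial value after the threshold gives a node that once measured poorly a chance to be selected again, rather than being excluded indefinitely on the basis of an old measurement.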
The device further comprises:
an updating module 504, configured to update the connection quality index of the data source node with the optimal connection quality selected by the selecting module 403 to a connection quality index measured in real time;
accordingly, the obtaining module 404 is specifically configured to:
acquire data from the selected data source node according to the connection quality index updated by the updating module 504.
In the embodiment of the invention, the initializing module initializes all data source nodes so that more data source nodes conform to the connection requests sent by clients. Meanwhile, after the connection quality index of the optimal data source node is measured in real time, the updating module updates its value, which ensures that the obtaining module acquires data from the data source node according to the latest connection quality index and reduces lag in the network system. The process of acquiring data from the data source node is therefore more efficient, and the transmission efficiency of the whole network can be improved.
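A minimal sketch of the update step, assuming the quality index is a throughput figure in bytes per second obtained from a short probe transfer. The function names, the `probe` and `fetch` callables, and the injectable `clock` are hypothetical; the patent only states that the index is replaced by a value measured in real time before the data is acquired.

```python
import time


def measure_throughput(probe, clock=time.perf_counter):
    """Time one probe transfer and return bytes per second."""
    start = clock()
    payload = probe()
    elapsed = max(clock() - start, 1e-9)  # guard against a zero interval
    return len(payload) / elapsed


def update_then_fetch(node, quality, probe, fetch, clock=time.perf_counter):
    """Refresh the node's quality index with a live measurement, then fetch the data."""
    quality[node] = measure_throughput(lambda: probe(node), clock)
    return fetch(node)
```

Passing the clock as a parameter keeps the measurement testable without real network traffic; in practice `probe` and `fetch` would wrap whatever transport the server uses.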
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
Those skilled in the art will appreciate that all or part of the steps in the above method embodiments may be implemented by a program that instructs the relevant hardware to perform the steps. The program may be stored in a computer-readable storage medium, referred to herein as a storage medium, for example ROM/RAM, a magnetic disk, or an optical disk.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (12)

1. A data acquisition method is applied to a server and is characterized by comprising the following steps:
grouping the data source nodes which accord with the connection request according to a factor which influences the connection quality;
hitting the matched data source nodes in each group of data source nodes according to a hash mapping rule;
selecting a data source node with optimal connection quality from the hit matched data source nodes of each group according to the connection quality index of the data source node;
acquiring data from the data source node with the optimal connection quality;
wherein hitting the matched data source nodes in each group of data source nodes according to the hash mapping rule comprises:
taking the uniform resource identifier (URI) of the connection request as the input of a hash function in the hash mapping rule in each group of data source nodes, and hitting the matched data source nodes.
2. The method of claim 1, wherein prior to grouping the data source nodes that are eligible for a connection request according to a factor that affects connection quality, further comprising:
and initializing the connection quality index of the data source node, so that the initial value of the connection quality index ensures that the data source node conforms to the connection request.
3. The method of claim 1, wherein grouping data source nodes that are eligible for a connection request according to a factor that affects connection quality comprises:
grouping the data source nodes which accord with the connection request according to regional distribution; or,
grouping the data source nodes which conform to the connection request according to the operator.
4. The method of claim 2, wherein after the matching data source node is hit in each group of data source nodes according to the hash mapping rule, the method further comprises:
judging whether the time elapsed after the matched data source node is hit exceeds a preset time threshold;
and when the time threshold is exceeded, restoring the value of the connection quality index of the hit matched data source node to the initial value.
5. The method according to claim 1, wherein after selecting a data source node with the best connection quality from the hit matched data source nodes according to the connection quality indicator of the data source node, the method further comprises:
updating the connection quality index of the data source node with the optimal connection quality into a connection quality index measured in real time;
correspondingly, the acquiring data from the data source node with the optimal connection quality includes:
and acquiring data from the selected data source node according to the updated connection quality index.
6. The method according to any of claims 1-5, wherein the connection quality indicator comprises: connection speed and/or transmission bandwidth.
7. A data acquisition device applied to a server is characterized by comprising:
the grouping module is used for grouping the data source nodes which accord with the connection request according to a factor which influences the connection quality;
the hit module is used for hitting the matched data source nodes in each group of data source nodes divided by the grouping module according to a Hash mapping rule;
the selection module is used for selecting a data source node with the optimal connection quality from the data source nodes which are matched in each group by the hit module according to the connection quality index of the data source node;
the acquisition module is used for acquiring data from the data source node with the optimal connection quality selected by the selection module;
wherein the hit module is specifically configured to:
take the uniform resource identifier (URI) of the connection request as the input of a hash function in the hash mapping rule in each group of data source nodes, and hit the matched data source nodes.
8. The apparatus of claim 7, further comprising:
and the initialization module is used for initializing the connection quality index of the data source node, so that the initial value of the connection quality index ensures that the data source node conforms to the connection request.
9. The apparatus of claim 7, wherein the grouping module is specifically configured to:
group the data source nodes which conform to the connection request according to regional distribution; or,
group the data source nodes which conform to the connection request according to the operator.
10. The apparatus of claim 8, further comprising:
the judging module is used for judging whether the time length after the matched data source node is hit exceeds a preset time threshold value or not;
and the restoring module is used for restoring the value of the connection quality index of the hit matched data source node to the initial value obtained by the initialization module when the judging module judges that the time length after the matched data source node is hit exceeds the preset time threshold.
11. The apparatus of claim 7, further comprising:
the updating module is used for updating the connection quality index of the data source node with the optimal connection quality selected by the selecting module into a connection quality index measured in real time;
correspondingly, the obtaining module is specifically configured to:
and acquiring data from the selected data source node according to the connection quality index updated by the updating module.
12. The apparatus according to any of claims 7-11, wherein the connection quality indicator comprises: connection speed and/or transmission bandwidth.
CN201410415518.1A 2014-08-21 2014-08-21 A kind of data capture method and device Active CN104144223B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410415518.1A CN104144223B (en) 2014-08-21 2014-08-21 A kind of data capture method and device

Publications (2)

Publication Number Publication Date
CN104144223A CN104144223A (en) 2014-11-12
CN104144223B true CN104144223B (en) 2018-02-09

Family

ID=51853290

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410415518.1A Active CN104144223B (en) 2014-08-21 2014-08-21 A kind of data capture method and device

Country Status (1)

Country Link
CN (1) CN104144223B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106487844A (en) * 2015-08-28 2017-03-08 北京奇虎科技有限公司 The method and system of the effectiveness of URL is promoted in a kind of detection
CN106210058B (en) * 2016-07-13 2019-04-16 成都知道创宇信息技术有限公司 A kind of reverse proxy method of multi-core parallel concurrent
CN106202582B (en) * 2016-08-31 2020-07-14 北京冀凯信息技术有限公司 Task scheduling method and system in enterprise distributed data warehouse
CN106878185B (en) * 2017-04-13 2020-04-07 浪潮集团有限公司 Message IP address matching circuit and method
CN108810078B (en) * 2018-04-20 2021-05-18 深圳市网心科技有限公司 Node selection method, terminal device and computer readable storage medium
CN109067817B (en) * 2018-05-31 2021-12-07 北京五八信息技术有限公司 Media content flow distribution method and device, electronic equipment and server

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101150506A (en) * 2007-08-24 2008-03-26 华为技术有限公司 Content acquisition method, device and content transmission system
CN102629929A (en) * 2012-04-18 2012-08-08 华为技术有限公司 Method and system and device for obtaining data
CN102647460A (en) * 2012-03-30 2012-08-22 华为技术有限公司 Transaction data downloading method and mobile terminal
CN103036967A (en) * 2012-12-10 2013-04-10 北京奇虎科技有限公司 Data download system and device and method for download management
CN103593368A (en) * 2012-08-16 2014-02-19 深圳市世纪光速信息技术有限公司 Method, server, terminal and system for selecting data sources

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8384753B1 (en) * 2006-12-15 2013-02-26 At&T Intellectual Property I, L. P. Managing multiple data sources

Also Published As

Publication number Publication date
CN104144223A (en) 2014-11-12

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant