CN112769643B - Resource scheduling method and device, electronic equipment and storage medium

Info

Publication number
CN112769643B
Authority
CN
China
Prior art keywords
cdn
node
quality data
fault
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011577129.0A
Other languages
Chinese (zh)
Other versions
CN112769643A (en)
Inventor
李博
马茗
罗喆
程媛
郭君健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202011577129.0A
Publication of CN112769643A
Application granted
Publication of CN112769643B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0823 Errors, e.g. transmission errors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0677 Localisation of faults

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The disclosure relates to a resource scheduling method, a resource scheduling device, electronic equipment and a storage medium. The resource scheduling method may include: acquiring quality data of the CDN; performing fault detection on the CDN based on the obtained quality data, wherein performing the fault detection on the CDN comprises locating fault nodes in the CDN according to quality data thresholds predicted by a time sequence model and/or locating the fault nodes in the CDN according to quality differences between CDN nodes; executing fault decision according to the detection result; and performing CDN scheduling according to the decision result.

Description

Resource scheduling method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of signal processing, and in particular, to a method, an apparatus, an electronic device, and a storage medium for resource scheduling.
Background
Currently, many online services use Content Delivery Networks (CDNs) to transmit data (e.g., streaming media data). Discovering CDN faults promptly and accurately, and scheduling the CDN effectively in response, is therefore very important for online services. However, existing CDN fault location and scheduling methods always rely on a manually set, fixed threshold to decide whether CDN quality data is abnormal. As a result, fault location is neither timely nor accurate enough, and effective CDN scheduling cannot be achieved.
Disclosure of Invention
The disclosure provides a method, a device, an electronic device and a storage medium for resource scheduling, so as to at least solve the problem in the related art that fault location is neither timely nor accurate enough, so that effective CDN scheduling cannot be achieved.
According to a first aspect of embodiments of the present disclosure, there is provided a method for resource scheduling, the method comprising: acquiring quality data of the CDN; performing fault detection on the CDN based on the obtained quality data, wherein performing the fault detection on the CDN comprises locating fault nodes in the CDN according to quality data thresholds predicted by a time sequence model and/or locating the fault nodes in the CDN according to quality differences between CDN nodes; executing fault decision according to the detection result; and performing CDN scheduling according to the decision result.
Optionally, the acquiring quality data of the CDN includes: acquiring quality data of each CDN node under different dimension combinations; the performing fault detection on the CDN based on the obtained quality data includes: and performing fault detection on the CDNs under the different dimensional combinations based on the acquired quality data of each CDN node under the different dimensional combinations.
Optionally, locating the faulty node in the CDN according to the quality data threshold predicted by the time series model includes: acquiring a quality data threshold value under each dimension combination or a quality data threshold value of each CDN node, which is predicted by using a time sequence model based on historical quality data of each CDN node under each dimension combination; and comparing the acquired quality data of each CDN node under each dimension combination with a corresponding quality data threshold value to locate a fault node under each dimension combination.
Optionally, the step of comparing the obtained quality data of each CDN node under each dimension combination with a corresponding quality data threshold to locate a fault node under each dimension combination includes: determining whether the quality data of each CDN node is greater than the corresponding quality data threshold, and determining whether the ratio of the quality data of each CDN node to the corresponding quality data threshold is greater than a predetermined value; determining, among all the quality data of each CDN node, the proportion of quality data whose ratio to the quality data threshold is greater than the predetermined value; and determining a CDN node whose proportion meets a preset condition under each dimension combination as a fault node under that dimension combination.
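The comparison rule in the preceding paragraph can be sketched as follows. This is only an illustrative sketch, not the disclosed implementation: the function names, the default `min_ratio` and `min_proportion` values, and the reading of the "preset condition" as "proportion at least `min_proportion`" are all assumptions, and quality data is assumed to be a metric where larger means worse (such as a download failure rate).

```python
from typing import Dict, List


def exceed_proportion(samples: List[float], threshold: float, min_ratio: float) -> float:
    """Fraction of samples that exceed the threshold by more than min_ratio times."""
    if not samples or threshold <= 0:
        return 0.0
    hits = sum(
        1 for value in samples
        if value > threshold and value / threshold > min_ratio
    )
    return hits / len(samples)


def locate_by_threshold(
    node_samples: Dict[str, List[float]],
    node_thresholds: Dict[str, float],
    min_ratio: float = 1.5,        # hypothetical "predetermined value"
    min_proportion: float = 0.6,   # hypothetical "preset condition"
) -> List[str]:
    """Return the nodes whose exceedance proportion meets the preset condition."""
    faulty = []
    for node, samples in node_samples.items():
        proportion = exceed_proportion(samples, node_thresholds[node], min_ratio)
        if proportion >= min_proportion:
            faulty.append(node)
    return sorted(faulty)


# Example data for one dimension combination (values are made up).
samples = {
    "cdn_a": [0.01, 0.02, 0.01, 0.02, 0.01],
    "cdn_b": [0.10, 0.12, 0.02, 0.11, 0.09],
}
thresholds = {"cdn_a": 0.05, "cdn_b": 0.05}
print(locate_by_threshold(samples, thresholds))
```

Requiring a sustained proportion of exceedances, rather than a single one, keeps an isolated spike from marking a node as faulty.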
Optionally, locating the faulty node in the CDN according to the quality difference between CDN nodes includes: comparing the quality data of CDN nodes under each dimension combination with each other; and according to the comparison result, determining CDN nodes with poor quality compared with other CDN nodes of a preset proportion under the same dimension combination as fault nodes in the CDN.
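A minimal sketch of this peer comparison is given below, assuming each node is summarized by a single quality value where larger means worse. The function name, the `worse_than_fraction` default, and the `margin` parameter (added here so that a node is not flagged for being only marginally worse than its peers) are assumptions and do not come from the disclosure.

```python
from typing import Dict, List


def locate_by_peer_comparison(
    node_quality: Dict[str, float],
    worse_than_fraction: float = 0.8,  # hypothetical "preset proportion"
    margin: float = 0.05,              # hypothetical minimum gap to count as "worse"
) -> List[str]:
    """Flag nodes that are worse than at least the given fraction of their peers."""
    nodes = list(node_quality)
    if len(nodes) < 2:
        return []  # nothing to compare against
    faulty = []
    for node in nodes:
        peers = [other for other in nodes if other != node]
        worse_count = sum(
            1 for other in peers
            if node_quality[node] > node_quality[other] + margin
        )
        if worse_count / len(peers) >= worse_than_fraction:
            faulty.append(node)
    return sorted(faulty)


# Example: download failure rates of four CDN nodes in one dimension combination.
quality = {"cdn_a": 0.01, "cdn_b": 0.02, "cdn_c": 0.015, "cdn_d": 0.20}
print(locate_by_peer_comparison(quality))
```

Because the comparison is relative, this approach needs no absolute threshold, which may make it suitable where historical data is too sparse to predict one.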
Optionally, the CDN quality data includes a value of an index indicating CDN quality obtained at a specific time interval over a specific length of time.
Optionally, the performing fault decision according to the detection result includes: screening out, from the located fault nodes and according to a preset fault decision rule, the fault nodes that can be scheduled.
Optionally, the performing CDN scheduling according to the decision result includes: traffic on failed nodes that can be scheduled is scheduled to normal nodes in the same dimension combination based on the current resource configuration situation of the CDN.
Optionally, the scheduling the traffic on the failed node capable of being scheduled to the normal node under the same dimension combination based on the current resource allocation condition of the CDN includes: and according to the traffic of the fault node which can be scheduled, adopting different scheduling modes to schedule the traffic on the fault node to the normal node under the same dimension combination.
Optionally, the scheduling the traffic on the fault node capable of being scheduled to the normal node of the CDN based on the current resource configuration condition of the CDN includes: for a schedulable fault node whose traffic size meets a first preset condition, after the traffic on the fault node has been scheduled to a normal node, repeating the following scheduling process: after waiting for a predetermined time, scheduling at least a portion of the moved traffic back to the fault node; if the quality of the fault node returns to normal during a specific period of time, scheduling the previously moved traffic back to the fault node; if the quality of the fault node remains abnormal during the specific period of time, scheduling the at least a portion of traffic that was moved back away from the fault node again, wherein the predetermined waiting time becomes progressively longer after each such rescheduling; and for a schedulable fault node whose traffic size meets a second preset condition, after the traffic on the fault node has been scheduled to a normal node, not scheduling the moved traffic back to the fault node even if the quality of the fault node returns to normal.
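The repeated probe-and-recover process with progressively longer waits resembles a backoff loop, which can be sketched as below. This is a simulation under stated assumptions: no real waiting or traffic movement happens, `is_healthy` is a hypothetical callback standing in for observing the node's quality after a share of traffic is moved back, and `initial_wait`, `backoff_factor` and `max_rounds` are illustrative parameters (the disclosure only says the wait becomes progressively longer).

```python
from typing import Callable, List, Tuple


def probe_and_recover(
    is_healthy: Callable[[int], bool],
    initial_wait: float = 60.0,    # hypothetical first waiting time, in seconds
    backoff_factor: float = 2.0,   # hypothetical growth factor for the wait
    max_rounds: int = 5,           # hypothetical cap so the sketch terminates
) -> Tuple[bool, List[float]]:
    """Return (recovered, waits): whether all traffic went back, and the waits used."""
    waits: List[float] = []
    wait = initial_wait
    for round_index in range(max_rounds):
        # A real scheduler would sleep for `wait`, move a share of the traffic
        # back to the fault node, and observe its quality for a period of time.
        waits.append(wait)
        if is_healthy(round_index):
            # Quality is back to normal: move the previously moved traffic back.
            return True, waits
        # Quality is still abnormal: move the probe share away again and
        # wait longer before the next attempt.
        wait *= backoff_factor
    return False, waits


# Example: the node recovers on the third probe.
print(probe_and_recover(lambda round_index: round_index >= 2))
```

Lengthening the wait after each failed probe limits how often users are exposed to a node that is still faulty.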
Optionally, the scheduling the traffic on the failed node capable of being scheduled to the normal node under the same dimension combination based on the current resource allocation condition of the CDN includes: and dispatching the traffic on the fault nodes capable of being dispatched to the normal nodes under the same dimension combination according to the dispatching priorities of the fault nodes capable of being dispatched, wherein the dispatching priorities are determined according to the traffic on the fault nodes capable of being dispatched, and the dispatching priorities of the fault nodes are higher as the traffic on the fault nodes is larger.
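As an illustration of traffic-based priority, the sketch below handles fault nodes in descending order of traffic and greedily places their traffic on the normal nodes with the most spare capacity. The greedy placement, the notion of "spare capacity" as a stand-in for the current resource configuration, and all names are assumptions made for this example only.

```python
from typing import Dict, List, Tuple


def plan_scheduling(
    faulty_traffic: Dict[str, float],
    spare_capacity: Dict[str, float],
) -> List[Tuple[str, str, float]]:
    """Return (fault_node, normal_node, amount) moves, largest fault node first."""
    remaining = dict(spare_capacity)  # do not mutate the caller's data
    plan: List[Tuple[str, str, float]] = []
    # Higher traffic on a fault node means higher scheduling priority.
    for node in sorted(faulty_traffic, key=lambda n: faulty_traffic[n], reverse=True):
        to_move = faulty_traffic[node]
        # Try the normal nodes with the most spare capacity first.
        for target in sorted(remaining, key=lambda n: remaining[n], reverse=True):
            if to_move <= 0:
                break
            moved = min(to_move, remaining[target])
            if moved > 0:
                plan.append((node, target, moved))
                remaining[target] -= moved
                to_move -= moved
    return plan


# Example within one dimension combination (traffic units are arbitrary).
plan = plan_scheduling({"f1": 30.0, "f2": 50.0}, {"n1": 60.0, "n2": 40.0})
print(plan)
```

If spare capacity runs out, the lowest-priority fault nodes are the ones left partly unscheduled, which matches the intent of prioritizing by traffic.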
Optionally, the performing fault detection on the CDN under different dimension combinations includes: performing a health test on the CDN nodes under each dimension combination based on the obtained quality data under each dimension combination; and, only for a dimension combination that passes the health test, performing fault detection based on the obtained quality data of each CDN node under that dimension combination to locate a fault node under that dimension combination.
According to a second aspect of embodiments of the present disclosure, there is provided an apparatus for resource scheduling, the apparatus comprising: a quality data acquisition unit configured to acquire quality data of the CDN; a fault detection unit configured to perform fault detection on the CDN based on the obtained quality data, wherein performing the fault detection on the CDN includes locating a fault node in the CDN according to a quality data threshold predicted by the time series model and/or locating a fault node in the CDN according to a quality difference between CDN nodes; a fault decision unit configured to perform a fault decision based on the detection result; and a scheduling unit configured to perform CDN scheduling according to the decision result.
Optionally, the acquiring quality data of the CDN includes: acquiring quality data of each CDN node under different dimensional combinations, wherein the performing fault detection on the CDNs based on the acquired quality data comprises the following steps: and performing fault detection on the CDNs under the different dimensional combinations based on the acquired quality data of each CDN node under the different dimensional combinations.
Optionally, locating the faulty node in the CDN according to the quality data threshold predicted by the time series model includes: acquiring a quality data threshold value under each dimension combination or a quality data threshold value of each CDN node, which is predicted by using a time sequence model based on historical quality data of each CDN node under each dimension combination; and comparing the acquired quality data of each CDN node under each dimension combination with a corresponding quality data threshold value to locate a fault node under each dimension combination.
Optionally, the step of comparing the obtained quality data of each CDN node under each dimension combination with a corresponding quality data threshold to locate a fault node under each dimension combination includes: determining whether the quality data of each CDN node is greater than the corresponding quality data threshold, and determining whether the ratio of the quality data of each CDN node to the corresponding quality data threshold is greater than a predetermined value; determining, among all the quality data of each CDN node, the proportion of quality data whose ratio to the quality data threshold is greater than the predetermined value; and determining a CDN node whose proportion meets a preset condition under each dimension combination as a fault node under that dimension combination.
Optionally, locating the faulty node in the CDN according to the quality difference between CDN nodes includes: comparing the quality data of CDN nodes under each dimension combination with each other; and according to the comparison result, determining CDN nodes with poor quality compared with other CDN nodes of a preset proportion under the same dimension combination as fault nodes in the CDN.
Optionally, the CDN quality data includes a value of an index indicating CDN quality obtained at a specific time interval over a specific length of time.
Optionally, the performing fault decision according to the detection result includes: screening out, from the located fault nodes and according to a preset fault decision rule, the fault nodes that can be scheduled.
Optionally, the performing CDN scheduling according to the decision result includes: traffic on failed nodes that can be scheduled is scheduled to normal nodes in the same dimension combination based on the current resource configuration situation of the CDN.
Optionally, the scheduling the traffic on the failed node capable of being scheduled to the normal node under the same dimension combination based on the current resource allocation condition of the CDN includes: and according to the traffic of the fault node which can be scheduled, adopting different scheduling modes to schedule the traffic on the fault node to the normal node under the same dimension combination.
Optionally, the scheduling the traffic on the fault node capable of being scheduled to the normal node of the CDN based on the current resource configuration condition of the CDN includes: for a schedulable fault node whose traffic meets a first preset condition, after the traffic on the fault node has been scheduled to a normal node, repeating the following scheduling process: after waiting for a predetermined time, scheduling at least a portion of the moved traffic back to the fault node; if the quality of the fault node returns to normal during a specific period of time, scheduling the previously moved traffic back to the fault node; if the quality of the fault node remains abnormal during the specific period of time, scheduling the at least a portion of traffic that was moved back away from the fault node again, wherein the predetermined waiting time becomes progressively longer after each such rescheduling; and for a schedulable fault node whose traffic meets a second preset condition, after the traffic on the fault node has been scheduled to a normal node, not scheduling the moved traffic back to the fault node even if the quality of the fault node returns to normal.
Optionally, the scheduling the traffic on the failed node capable of being scheduled to the normal node under the same dimension combination based on the current resource allocation condition of the CDN includes: and dispatching the traffic on the fault nodes capable of being dispatched to the normal nodes under the same dimension combination according to the dispatching priorities of the fault nodes capable of being dispatched, wherein the dispatching priorities are determined according to the traffic on the fault nodes capable of being dispatched, and the dispatching priorities of the fault nodes are higher as the traffic on the fault nodes is larger.
Optionally, the performing fault detection on the CDN under different dimension combinations includes: performing a health test on the CDN nodes under each dimension combination based on the obtained quality data under each dimension combination; and, only for a dimension combination that passes the health test, performing fault detection based on the obtained quality data of each CDN node under that dimension combination to locate a fault node under that dimension combination.
According to a third aspect of embodiments of the present disclosure, there is provided an electronic device comprising: at least one processor; at least one memory storing computer-executable instructions, wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform the method as described above.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer readable storage medium storing instructions, which when executed by at least one processor, cause the at least one processor to perform a method as described above.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising instructions which, when executed by at least one processor in an electronic device, cause the method as described above to be performed.
The technical solutions provided by the embodiments of the present disclosure bring at least the following beneficial effects: according to the embodiments of the present disclosure, fault nodes in the CDN are located according to a quality data threshold predicted by a time series model and/or according to quality differences between CDN nodes, and CDN scheduling is performed on that basis, so that fault nodes in the CDN can be located more promptly and accurately and effective CDN scheduling is achieved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments consistent with the disclosure and, together with the description, serve to explain the principles of the disclosure and do not constitute an undue limitation on the disclosure.
FIG. 1 is an exemplary system architecture in which exemplary embodiments of the present disclosure may be applied;
FIG. 2 is a flowchart of a method for resource scheduling in accordance with an exemplary embodiment of the present disclosure;
FIG. 3 is a schematic diagram illustrating an example of a method for resource scheduling in an exemplary embodiment of the present disclosure;
FIG. 4 is a schematic diagram illustrating performing fault detection of an exemplary embodiment of the present disclosure;
FIG. 5 is a schematic diagram illustrating performing fault decisions of an exemplary embodiment of the present disclosure;
FIG. 6 is a schematic diagram illustrating performing CDN scheduling in accordance with an exemplary embodiment of the present disclosure;
FIG. 7 is a block diagram of an apparatus for resource scheduling in accordance with an exemplary embodiment of the present disclosure;
FIG. 8 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein. The embodiments described in the examples below are not representative of all embodiments consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
It should be noted that, in this disclosure, "at least one of several items" covers three parallel cases: "any one of the items", "a combination of any of the items", and "all of the items". For example, "including at least one of A and B" covers the following three parallel cases: (1) including A; (2) including B; (3) including A and B. Likewise, "executing at least one of step one and step two" covers the following three parallel cases: (1) executing step one; (2) executing step two; (3) executing step one and step two.
Fig. 1 illustrates an exemplary system architecture 100 in which exemplary embodiments of the present disclosure may be applied.
As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as the medium providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables. A user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages (e.g., an audio-video data upload request or an audio-video data download request). Various communication client applications, such as audio and video recording software, audio and video players, instant messaging tools, mailbox clients, and social platform software, may be installed on the terminal devices 101, 102, 103. The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices that have a display screen and are capable of audio and video playback and recording, including but not limited to smart phones, tablet computers, laptop computers, and desktop computers. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services) or as a single piece of software or software module. The present disclosure is not particularly limited in this respect.
The terminal devices 101, 102, 103 may be equipped with image capturing means (e.g., cameras) to capture video data. In practice, the smallest visual unit that makes up a video is a frame. Each frame is a static image. A sequence of temporally successive frames is combined to form a dynamic video. In addition, the terminal devices 101, 102, 103 may also be equipped with components (e.g., speakers) that convert electric signals into sound in order to play sound, and with means (e.g., microphones) that convert analog audio signals into digital audio signals in order to capture sound.
The server 105 may be a server providing various services, such as a background server providing support for multimedia applications installed on the terminal devices 101, 102, 103. The background server may analyze, store, etc. the received data such as the audio and video data upload request, and may also receive the audio and video data download request sent by the terminal devices 101, 102, 103, and feed back the audio and video data indicated by the audio and video data download request to the terminal devices 101, 102, 103.
As an example, the server 105 may be a streaming media server and the network 104 may be a Content Delivery Network (CDN). The streaming media server may transmit streaming media to the respective terminal devices through the content delivery network. The content delivery network includes a plurality of nodes that provide streaming media content. When a node in the content delivery network fails, the streaming media service that the streaming media server provides to the terminal devices becomes abnormal. If the fault node can be located promptly and accurately, effective scheduling can ensure that the streaming media service continues to be provided normally.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by a plurality of servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module. The present disclosure is not particularly limited in this respect.
It should be noted that the resource scheduling method provided in the embodiments of the present disclosure is generally executed by the server 105, and accordingly, the resource scheduling device is generally disposed in the server 105. However, the resource scheduling method provided by the embodiments of the present disclosure may also be performed cooperatively by the terminal device and the server. Accordingly, the resource scheduling device may also be provided in both the terminal device and the server.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers as desired, and the disclosure is not limited in this regard.
Fig. 2 is a flowchart of a method for resource scheduling (hereinafter, simply referred to as a "resource scheduling method" for convenience of description) according to an exemplary embodiment of the present disclosure.
Referring to FIG. 2, quality data of a CDN is acquired in step S201. According to an exemplary embodiment, the CDN quality data may include values of an index indicating CDN quality, obtained at a specific time interval (or time granularity) over a specific length of time (or time span). Here, the index may be any one or more indexes indicating the quality of the CDN; for example, the index may be a download failure rate, but is not limited thereto. In addition, the CDN quality data may be stored in a predetermined database each time it is acquired, so that the CDN quality data can be acquired from the predetermined database. Since a fault in a real scenario typically involves a combination of multiple dimensions, the CDN quality data may be the quality data of each CDN node under different dimension combinations, and accordingly, acquiring the CDN quality data in step S201 may include acquiring the quality data of each CDN node under different dimension combinations. Here, a dimension combination may be, for example, a combination of the following dimensions: service line, operator (ISP), province, and traffic type. Examples of traffic types are traffic of accounts with a very high level of interest (e.g., accounts whose level of interest is greater than a first threshold), traffic of accounts with a higher level of interest (e.g., accounts whose level of interest is less than the first threshold and greater than a second threshold), and traffic of accounts with a lower level of interest (e.g., accounts whose level of interest is less than the second threshold).
FIG. 3 is a schematic diagram illustrating an example of a method for resource scheduling according to an exemplary embodiment of the present disclosure. As shown in FIG. 3, for example, the values of an index indicating the quality of the CDN may be calculated using the real-time computing framework Flink, and the calculated index values may be stored in a database using a guide store. When resource scheduling is needed, the index values for a specific time span, time granularity and dimension combination can be obtained from the database; that is, the CDN quality data is obtained. It should be noted that the present disclosure does not limit the manner in which the index is calculated.
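One possible in-memory shape for such quality data is sketched below: each sample carries its dimensions and an index value, and samples are grouped per dimension combination and per CDN node. The `QualityPoint` type, its field names, and the choice to pool samples without traffic weighting are assumptions for illustration; the disclosure does not prescribe a data layout.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class QualityPoint:
    """One index value for one time interval (all field names are hypothetical)."""
    timestamp: int               # start of the interval, in epoch seconds
    province: str
    isp: str
    service_line: str
    cdn: str
    download_failure_rate: float


def group_by_dimensions(
    points: Iterable[QualityPoint],
    dimensions: Sequence[str],
) -> Dict[Tuple[str, ...], Dict[str, List[float]]]:
    """Map each dimension combination to per-CDN-node series ordered by time."""
    grouped: Dict[Tuple[str, ...], Dict[str, List[float]]] = defaultdict(
        lambda: defaultdict(list)
    )
    for point in sorted(points, key=lambda p: p.timestamp):
        key = tuple(getattr(point, name) for name in dimensions)
        # Simplification: samples are pooled as-is, without traffic weighting.
        grouped[key][point.cdn].append(point.download_failure_rate)
    return {key: dict(series) for key, series in grouped.items()}


points = [
    QualityPoint(0, "Beijing", "isp_1", "live", "cdn_a", 0.01),
    QualityPoint(60, "Beijing", "isp_1", "live", "cdn_a", 0.02),
    QualityPoint(0, "Beijing", "isp_1", "live", "cdn_b", 0.05),
]
by_pib = group_by_dimensions(points, ("province", "isp", "service_line"))
print(by_pib)
```

The same points can be regrouped with a coarser key, for example `("service_line",)`, to obtain the quality data for a service-line-level dimension combination.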
After the quality data of the CDN is acquired, in step S202, fault detection is performed on the CDN based on the acquired quality data. According to an exemplary embodiment, performing fault detection on the CDN includes locating a fault node in the CDN according to quality data thresholds predicted by the time series model and/or locating a fault node in the CDN according to quality differences between CDN nodes. That is, a faulty node in the CDN may be located according to a quality data threshold predicted by the time series model based on the acquired quality data (hereinafter, referred to as a first fault location manner). Alternatively, a faulty node in the CDN may be located according to a quality difference between CDN nodes based on the acquired quality data (hereinafter, referred to as a second fault location manner). Alternatively, the fault nodes may be located together according to the above two fault locating modes based on the acquired quality data, in which case the finally located fault node may be a union of the fault nodes located by the above two fault locating modes.
As described above, a fault in a real scenario typically involves a combination of dimensions. For fault location, it is desirable to determine whether a fault affects one service line or several; under each affected service line, which ISPs are experiencing the fault; how large the affected region is; and whether the CDN fault in the affected region concerns traffic of a single account type or of a combination of account types. Thus, fault location usually means locating fault nodes under different dimension combinations. According to an exemplary embodiment, in step S202, fault detection may be performed on the CDN under different dimension combinations based on the obtained quality data of each CDN node under the different dimension combinations. Optionally, according to an exemplary embodiment, performing fault detection on the CDN under different dimension combinations may include: performing a health test on the CDN nodes under each dimension combination based on the obtained quality data under each dimension combination; and, only for a dimension combination that passes the health test, performing fault detection based on the obtained quality data of each CDN node under that dimension combination to locate a fault node under that dimension combination. Here, the health test may be performed on each CDN node by comparing the quality data of each CDN node under each dimension combination with a threshold set in advance by a user. For example, if the quality data of a CDN node is less than the predetermined threshold, the CDN node is determined to be a healthy node. If the number of healthy nodes under a dimension combination meets a predetermined condition (e.g., exceeds a predetermined number), the dimension combination is deemed to pass the health test.
In this case, fault detection may be performed on the dimension combination that passes the health test based on the quality data obtained for each CDN node under the dimension combination to locate a faulty node under the dimension combination.
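The health test described above can be sketched as follows. This is only an illustrative sketch: the function names, the tuple representation of a dimension combination, the sample values, and the assumption that lower quality values are better (for example, a stall rate) are not taken from the disclosure.

```python
from typing import Dict, List, Tuple

# Hypothetical representation of a dimension combination,
# e.g. (province, ISP, service line).
Combo = Tuple[str, ...]


def is_healthy(quality: float, health_threshold: float) -> bool:
    # Assumption: lower values mean better quality (e.g. a stall rate),
    # so a node is healthy when its value is below the user-set threshold.
    return quality < health_threshold


def combos_passing_health_test(
    quality_by_combo: Dict[Combo, Dict[str, float]],
    health_threshold: float,
    min_healthy_nodes: int,
) -> List[Combo]:
    """Return the dimension combinations whose healthy-node count
    exceeds min_healthy_nodes; only these go on to fault detection."""
    passing = []
    for combo, node_quality in quality_by_combo.items():
        healthy = [n for n, q in node_quality.items()
                   if is_healthy(q, health_threshold)]
        if len(healthy) > min_healthy_nodes:
            passing.append(combo)
    return passing


# Made-up sample data for two dimension combinations.
quality_by_combo = {
    ("province-1", "isp-1", "line-1"): {"cdn-a": 0.02, "cdn-b": 0.03, "cdn-c": 0.30},
    ("province-2", "isp-1", "line-1"): {"cdn-a": 0.40, "cdn-b": 0.35, "cdn-c": 0.50},
}
eligible = combos_passing_health_test(quality_by_combo, 0.10, 1)
```

In this sample, only the first combination has more than one healthy node, so only it would proceed to fault detection.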
As shown in fig. 3, for example, fault detection may be performed under a dimension combination of [ province + ISP + service line + CDN ] (which may also be referred to as [ p+i+b+c ]); such a finer dimension combination enables accurate localization of fault nodes and fault causes. As another example, fault detection may be performed under a dimension combination of [ service line + CDN ] (which may also be referred to as [ b+c ]); this is a dimension combination at the service line level, which supports fault detection for a service line. As another example, fault detection may be performed under a dimension combination of [ very high-interest account + CDN ] (which may also be referred to as [ s+c ]); this is a special-level dimension combination that can specifically locate fault nodes serving the traffic of very high-interest accounts. It should be noted that, although three example dimension combinations are given above, the present disclosure does not limit the manner in which dimensions are combined, and fault detection may be performed under any dimension combination according to actual needs.
In addition, fault nodes may be located using different fault location manners for different dimension combinations. For example, for the [ s+c ] dimension combination, the second fault location manner may be used to locate the fault node, while for the two dimension combinations [ p+i+b+c ] and [ b+c ], the first fault location manner may be used to locate the fault node.
Specifically, when locating the fault node according to the first fault locating method, a quality data threshold value under each dimension combination or a quality data threshold value of each CDN node predicted by using a time series model based on historical quality data of each CDN node under each dimension combination may be first obtained, and then the fault node under each dimension combination may be located by comparing the obtained quality data of each CDN node under each dimension combination with a corresponding quality data threshold value. For example, as shown in fig. 3, by performing fault detection on the CDN in three different dimensional combinations of [ p+i+b+c ], [ b+c ], and [ s+c ], fault nodes in the dimensional combinations can be located, respectively.
Fig. 4 is a schematic diagram illustrating performing fault detection according to an exemplary embodiment of the present disclosure. In the example of fig. 4, after CDN quality data is acquired, a health test is first performed, and fault detection is performed only for dimension combinations that pass the health test. Alternatively, it may be determined whether the number of acquired quality data is greater than a predetermined threshold, and the subsequent fault detection process may be performed only if the number of acquired quality data is greater than the predetermined threshold. Further, in the example of fig. 4, in the case where quality data is acquired and the number of quality data is greater than the predetermined threshold, the two fault location manners described above are each performed to locate the fault node. That is, fault nodes in the CDN are located not only according to the quality data thresholds predicted by the time series model but also according to the quality differences between CDN nodes.
Here, the time series model is a pre-trained machine learning model (for example, the time series model may be an ARIMA model, but is not limited thereto), and may predict a quality data threshold under each dimension combination, or a quality data threshold of each CDN node, based on historical quality data of each CDN node under each dimension combination. Here, sensitivity control parameters may be set for the time series model. The sensitivity control parameters may control the degree of fluctuation of the quality data threshold predicted by the time series model, and different sensitivity control parameters may be set for the time series model for different periods of time.
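One possible reading of a time-dependent sensitivity control parameter is sketched below. The disclosure does not specify how the parameter enters the prediction, so everything here is an assumption: the ARIMA forecast is replaced by a plain mean of the history window as a stand-in, the sensitivity is treated as a multiplier on the historical spread, and the peak-hour schedule and all names are hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence


def sensitivity_for_hour(hour: int) -> float:
    # Hypothetical schedule: a tighter bound (smaller multiplier)
    # during an assumed evening peak, a looser bound otherwise.
    return 2.0 if 19 <= hour < 23 else 3.0


def predicted_upper_bound(history: Sequence[float], sensitivity: float) -> float:
    # Stand-in forecast: the mean of the history window. This is NOT
    # ARIMA; a trained time series model would supply the forecast here.
    forecast = mean(history)
    # Assumption: the sensitivity scales how far above the forecast
    # the threshold is allowed to sit.
    return forecast + sensitivity * pstdev(history)


history = [1.0, 2.0, 3.0]  # made-up historical quality data
peak_bound = predicted_upper_bound(history, sensitivity_for_hour(20))
off_peak_bound = predicted_upper_bound(history, sensitivity_for_hour(3))
```

Under these assumptions the threshold used during the peak period is lower than the one used off-peak, so the same quality value is more likely to be flagged during the peak.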
The predicted quality data threshold may be stored in a database or memory. Each time fault detection is required, the predicted quality data threshold may be obtained from the database or memory, and the fault node under each dimension combination is located by comparing the currently obtained quality data with the predicted quality data threshold. As shown in fig. 4, a quality data threshold predicted by ARIMA (which may also be referred to as an ARIMA upper bound) may be obtained. If the quality data threshold is successfully obtained (i.e., the upper bound is not null), the fault node may be located by comparison with the quality data threshold. For example, a fault node under each dimension combination may be located by the following comparison: determining whether the quality data of each CDN node is greater than the corresponding quality data threshold, and determining whether the ratio of the quality data of each CDN node to the corresponding quality data threshold is greater than a predetermined value; determining, among all quality data of each CDN node, the proportion (also referred to as the duty ratio) of quality data that both is greater than the corresponding quality data threshold (hereinafter, may be referred to as a first condition) and has a ratio to the corresponding quality data threshold greater than the predetermined value (hereinafter, may be referred to as a second condition); and determining a CDN node whose duty ratio meets a predetermined condition under each dimension combination as a fault node under that dimension combination. Here, determining whether the quality data of each CDN node is greater than the corresponding quality data threshold may include determining whether the quality data of each CDN node is continuously greater than the corresponding quality data threshold for a predetermined period of time.
For example, suppose a CDN node has a total of five quality data Q1, Q2, Q3, Q4, and Q5 under a certain dimension combination, of which three quality data Q1, Q2, and Q3 are greater than their corresponding quality data thresholds, and of which only Q1 and Q2 have a ratio to their corresponding quality data thresholds greater than a predetermined value (e.g., 4). Then only Q1 and Q2 among the five quality data satisfy both the first condition and the second condition, and therefore, for this CDN node, the proportion (duty ratio) of quality data satisfying both conditions is 2/5. Assuming that the above predetermined condition set for the duty ratio is that the duty ratio is greater than a predetermined ratio (e.g., 1/5), the CDN node is determined to be a fault node under the dimension combination. Through the above comparison, all the fault nodes under each dimension combination can be determined. It should be noted that, although an example of a comparison method is given above, other comparison methods may be used to determine the fault node according to service requirements.
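The comparison in the example above can be sketched as follows. The function names and the sample values are illustrative assumptions; the sketch also assumes positive thresholds, a non-empty list of samples, and omits the optional requirement that a value stay above the threshold continuously for a period of time.

```python
from typing import Sequence


def violation_share(samples: Sequence[float],
                    thresholds: Sequence[float],
                    min_ratio: float) -> float:
    """Share of samples that satisfy both the first and second condition."""
    hits = 0
    for q, t in zip(samples, thresholds):
        exceeds = q > t                          # first condition
        far_above = t > 0 and q / t > min_ratio  # second condition
        if exceeds and far_above:
            hits += 1
    return hits / len(samples)


def is_fault_node(samples: Sequence[float],
                  thresholds: Sequence[float],
                  min_ratio: float,
                  min_share: float) -> bool:
    return violation_share(samples, thresholds, min_ratio) > min_share


# Made-up values mirroring Q1..Q5: three exceed the threshold of 1.0,
# but only the first two exceed it by a factor greater than 4.
samples = [5.0, 6.0, 2.0, 0.5, 0.8]
thresholds = [1.0] * 5
share = violation_share(samples, thresholds, min_ratio=4.0)
faulty = is_fault_node(samples, thresholds, min_ratio=4.0, min_share=0.2)
```

Here `share` corresponds to the 2/5 duty ratio of the example, which exceeds the 1/5 predetermined ratio, so the node is flagged.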
As described above, a fault node in the CDN may also be located based on quality differences between CDN nodes. In the second fault location manner, the quality data of the CDN nodes under each dimension combination may first be compared with each other, and then, according to the comparison result, a CDN node whose quality is poorer than that of a predetermined proportion of the other CDN nodes under the same dimension combination may be determined as a fault node in the CDN. For example, as shown in fig. 4, assuming that there are 4 CDN nodes under a certain dimension combination and the predetermined proportion is 50%, the quality data of each CDN node may be compared with that of each of the other CDN nodes under the dimension combination. If a CDN node is poorer in quality than two of the other CDN nodes (i.e., poorer than 2/3 of the other nodes), the node may be determined to be a fault node under the dimension combination because 2/3 exceeds the predetermined proportion of 50%.
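A minimal sketch of this peer comparison is given below, assuming that a larger quality value means poorer quality; the function name and the sample values are hypothetical.

```python
from typing import Dict, List


def peer_outliers(node_quality: Dict[str, float], ratio: float) -> List[str]:
    """Return nodes that are poorer than more than `ratio` of their peers
    within one dimension combination."""
    faults = []
    for node, quality in node_quality.items():
        others = [q for n, q in node_quality.items() if n != node]
        if not others:
            continue  # a single node has no peers to compare against
        # Assumption: a larger value means poorer quality.
        worse_than = sum(1 for q in others if quality > q)
        if worse_than / len(others) > ratio:
            faults.append(node)
    return faults


# Four nodes under one dimension combination, predetermined proportion 50%.
node_quality = {"cdn-a": 0.02, "cdn-b": 0.03, "cdn-c": 0.20, "cdn-d": 0.50}
fault_nodes = peer_outliers(node_quality, ratio=0.5)
```

With these values, `cdn-c` is poorer than 2 of its 3 peers and `cdn-d` is poorer than all 3, so both exceed the 50% proportion, while `cdn-b` (1 of 3) does not.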
Finally, as shown in fig. 4, the failed node determined in the above manner may be added to the CDN failed node list.
Referring back to fig. 2, after the fault node is determined through the fault detection, in step S203, a fault decision may be performed according to the detection result. Specifically, for example, fault nodes that can be scheduled (i.e., schedulable fault nodes) may be screened from the located fault nodes according to the detection result and a predetermined fault decision rule. For example, as shown in fig. 3, the schedulable fault nodes under each dimension combination may be screened, according to the detection result and a predetermined fault decision rule, from the fault nodes located under the three different dimension combinations [ p+i+b+c ], [ b+c ], and [ s+c ]. Here, the predetermined fault decision rule may be preset by the user according to actual needs. Further, the predetermined fault decision rule may include decision rules for different dimension combinations, and the fault decision rules set for different dimension combinations may be different. In addition, the predetermined fault decision rule may also include rules that compare fault nodes under different dimension combinations. For example, by comparing the fault nodes under the [ p+i+b+c ] and [ b+c ] dimension combinations, it can be determined whether certain CDN nodes of certain service lines have caused a large-area fault, and if so, the traffic on these nodes can be scheduled to normal nodes during subsequent scheduling. For another example, by comparing the fault nodes under the [ b+c ] and [ s+c ] dimension combinations, it can be determined whether the traffic of a very high-interest account is affected by a fault, and if so, the traffic of that account can be scheduled to other normal nodes during subsequent scheduling.
Fig. 5 is a schematic diagram illustrating performing a fault decision according to an exemplary embodiment of the present disclosure. As shown in fig. 5, for example, it is assumed that the fault nodes under three different dimension combinations have been determined by performing fault detection under the dimension combinations [ p+i+b+c ], [ b+c ], and [ s+c ], respectively. In this case, as shown in fig. 5, the fault nodes under the [ p+i+b+c ] dimension combination should obey, for example, the following two rules: 1. two or more healthy nodes must exist within the dimension combination, otherwise the fault nodes under the dimension combination cannot be scheduled; 2. if all CDN nodes within the dimension combination are unhealthy, the fault nodes under the dimension combination cannot be scheduled. For the fault nodes under the [ b+c ] dimension combination, since a fault under this dimension combination is typically a global fault, it is necessary to follow a rule that, for example, the proportion of fault nodes that can be scheduled generally cannot exceed forty percent. The fault nodes under the [ s+c ] dimension combination, by contrast, need not be limited by the above decision rules for the other two dimension combinations. The fault nodes that cannot be scheduled are then filtered out according to the fault decision rules, and the CDN fault nodes that can be scheduled are finally screened out.
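The example decision rules above can be sketched as a simple filter. Several points are assumptions rather than statements of the disclosure: the forty percent limit is read as a cap on the share of nodes in the combination, fault nodes beyond the cap are dropped in list order, and the function and parameter names are hypothetical.

```python
from typing import List, Sequence


def schedulable_fault_nodes(combo_kind: str,
                            fault_nodes: Sequence[str],
                            all_nodes: Sequence[str],
                            healthy_nodes: Sequence[str],
                            max_fault_percent: int = 40) -> List[str]:
    """Filter located fault nodes down to those that may be scheduled."""
    if combo_kind == "p+i+b+c":
        # Rule 1: at least two healthy nodes must exist. Rule 2 (all
        # nodes unhealthy) is a special case of failing rule 1.
        if len(healthy_nodes) < 2:
            return []
        return list(fault_nodes)
    if combo_kind == "b+c":
        # Assumption: at most max_fault_percent of the nodes in the
        # combination may be scheduled away; extra nodes are dropped in
        # the order given (the caller may pre-sort by priority).
        cap = len(all_nodes) * max_fault_percent // 100
        return list(fault_nodes)[:cap]
    # "s+c": not limited by the rules above.
    return list(fault_nodes)


blocked = schedulable_fault_nodes("p+i+b+c", ["c"], ["a", "b", "c"], ["a"])
allowed = schedulable_fault_nodes("p+i+b+c", ["c"], ["a", "b", "c"], ["a", "b"])
capped = schedulable_fault_nodes("b+c", ["a", "b", "c"],
                                 ["a", "b", "c", "d", "e"], ["d", "e"])
```

In this sample, `blocked` is empty because only one healthy node exists, and `capped` keeps two of the three fault nodes because 40% of five nodes is two.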
Referring back to fig. 2, after performing the failure decision, CDN scheduling may be performed according to the decision result in step S204. According to an exemplary embodiment, at step S204, traffic on a failed node that can be scheduled may be scheduled to a normal node in the same dimensional combination based on the current resource configuration situation of the CDN. Here, the current resource allocation situation of the CDN may be, for example, a traffic allocation situation of CDN nodes (e.g., traffic proportions assumed by each CDN node).
For example, as shown in fig. 3, after determining the failed nodes schedulable in the three different dimensional combinations of [ p+i+b+c ], [ b+c ], and [ s+c ], scheduling in the three dimensional combinations may be performed, respectively. The scheduling setting information about the scheduling priority, scheduling manner, etc. may be transmitted to the scheduling execution end through the PB/RPC to perform scheduling. In addition, the current resource allocation situation of the CDN may also be transmitted to the scheduling execution end to execute scheduling.
According to an exemplary embodiment, traffic on a failed node that can be scheduled may be scheduled to normal nodes in the same dimensional combination according to their respective scheduling priorities. Here, the scheduling priority may be determined according to traffic on the failed node that can be scheduled, and the greater the traffic on the failed node, the higher the scheduling priority of the failed node.
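As a minimal sketch, the priority rule can be expressed as a descending sort on traffic. The function name, the traffic unit, and the sample values are illustrative assumptions.

```python
from typing import Dict, List


def scheduling_order(traffic_by_fault_node: Dict[str, float]) -> List[str]:
    """Order schedulable fault nodes so that the node carrying the most
    traffic is handled first (higher traffic means higher priority)."""
    return sorted(traffic_by_fault_node,
                  key=traffic_by_fault_node.get,
                  reverse=True)


# Made-up traffic volumes per schedulable fault node.
traffic = {"cdn-a": 120.0, "cdn-b": 900.0, "cdn-c": 450.0}
order = scheduling_order(traffic)
```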
In addition, depending on the traffic size of a schedulable fault node, different scheduling modes may be adopted to schedule the traffic on the fault node to normal nodes under the same dimension combination. For example, for a schedulable fault node whose traffic size meets a first predetermined condition, after the traffic on the fault node is scheduled to a normal node, the following scheduling process is repeated: after waiting for a predetermined time, at least a portion of the scheduled-away traffic is scheduled back to the fault node; if the quality of the fault node returns to normal during a specified period of time, the previously scheduled-away traffic is scheduled back to the fault node; if the quality of the fault node remains abnormal during the specified period of time, the at least a portion of traffic that was scheduled back is scheduled away again, wherein the predetermined waiting time becomes gradually longer after each time the traffic is scheduled away again. This approach can effectively avoid the risks caused by frequent traffic scheduling. For example, the schedulable fault nodes whose traffic size meets the first predetermined condition may be fault nodes that serve accounts of lower attention (e.g., accounts whose attention is less than a second threshold) (hereinafter referred to as the first type of fault node) and fault nodes that serve accounts of higher attention (e.g., accounts whose attention is less than a first threshold and greater than the second threshold) (hereinafter referred to as the second type of fault node). In addition, for the first type of fault node, the at least a portion of traffic that is scheduled back is equal to the scheduled-away traffic, and for the second type of fault node, the at least a portion of traffic that is scheduled back is less than the scheduled-away traffic.
For a schedulable fault node whose traffic size meets a second predetermined condition, after the traffic on the fault node is scheduled to a normal node, the scheduled-away traffic is not scheduled back to the fault node even if the quality of the fault node returns to normal. For example, a schedulable fault node whose traffic size meets the second predetermined condition may be a fault node that serves an account of very high attention (e.g., an account whose attention is greater than the first threshold) (hereinafter referred to as the third type of fault node).
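The three types of fault node can be sketched as a classification on an attention value. The numeric attention scale, the handling of values exactly equal to a threshold, the default partial fraction, and all names are assumptions made only for illustration.

```python
from typing import Optional


def fault_node_type(attention: float,
                    first_threshold: float,
                    second_threshold: float) -> int:
    """Classify a fault node by the attention of the account it serves.
    Assumes first_threshold > second_threshold; values equal to a
    threshold fall into the lower type (not specified in the text)."""
    if attention > first_threshold:
        return 3  # very high attention: traffic is never scheduled back
    if attention > second_threshold:
        return 2  # higher attention: only part of the traffic is tried
    return 1      # lower attention: all scheduled-away traffic is tried


def trial_return_fraction(node_type: int,
                          partial_fraction: float = 0.1) -> Optional[float]:
    """Fraction of scheduled-away traffic to schedule back on a trial.
    partial_fraction is an illustrative value, not a prescribed one."""
    if node_type == 1:
        return 1.0
    if node_type == 2:
        return partial_fraction
    return None  # third type: no trial return at all


# Hypothetical thresholds on a made-up attention scale.
kind = fault_node_type(50_000, first_threshold=1_000_000, second_threshold=10_000)
fraction = trial_return_fraction(kind)
```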
Fig. 6 is a schematic diagram illustrating performing CDN scheduling according to an exemplary embodiment of the present disclosure. In the example of fig. 6, after the schedulable fault nodes are determined, previous CDN scheduling information, manual scheduling information (i.e., information on nodes for which scheduling is performed manually), and current CDN resource configuration information may be further obtained. The information described above may be obtained, for example, through an RPC message. Optionally, fault nodes that have previously been scheduled and fault nodes that are manually scheduled may be further filtered out of the determined schedulable fault nodes. The traffic on the fault nodes can then be scheduled to normal nodes under the same dimension combination in the scheduling manner described above. For example, as shown in fig. 6, for the third type of fault node, the traffic on the fault node may be directly scheduled to other normal nodes. Specifically, the traffic proportion at which the traffic on the fault node is distributed to the other normal nodes may be determined according to the current resource configuration situation of the CDN. The corresponding traffic proportion information may be written into the RPC message so that scheduling may be performed according to the information in the RPC message. As described above, for the third type of fault node, once a fault is found, the traffic is directly scheduled away and is not scheduled back.
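The disclosure leaves open how the traffic proportion is derived from the current resource configuration. One simple possibility, shown here purely as an assumption, is to split the fault node's traffic among the normal nodes in proportion to the traffic share each already carries; the function name and the sample shares are hypothetical.

```python
from typing import Dict


def redistribute(shares: Dict[str, float], fault_node: str) -> Dict[str, float]:
    """Move the fault node's traffic share onto the remaining nodes,
    proportionally to their current shares (an assumed policy)."""
    moved = shares[fault_node]
    normal = {n: s for n, s in shares.items() if n != fault_node}
    total = sum(normal.values())
    if total <= 0:
        raise ValueError("no normal node is carrying traffic")
    new_shares = {n: s + moved * s / total for n, s in normal.items()}
    new_shares[fault_node] = 0.0
    return new_shares


# Made-up current traffic shares under one dimension combination.
current = {"cdn-a": 0.5, "cdn-b": 0.3, "cdn-c": 0.2}
after = redistribute(current, "cdn-c")
```

With these shares, the 0.2 carried by `cdn-c` is split 5:3 between `cdn-a` and `cdn-b`, and the total share is preserved.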
For the first type of fault node and the second type of fault node, at least a portion of the traffic may be tentatively scheduled back after, for example, 30 minutes (for example, only a part of the traffic, such as 10%, may be scheduled back). If no abnormality occurs during the following 10 minutes, the traffic is scheduled back; if a problem still exists, the traffic that was tentatively scheduled back is scheduled away again, and another attempt is made after, for example, 60 minutes. This process may loop continuously, with the result being checked 10 minutes after each time the portion of traffic is scheduled back. Here, for example, whether the first type of fault node has returned to normal may be judged on the basis that 250 pieces of quality data remain stable for 10 consecutive minutes.
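The trial-return loop with a gradually longer wait can be sketched as a small simulation. The 30-minute first wait and the 10-minute observation window follow the example values in the text; doubling the wait after each failed trial is an assumption (the text only gives 30 and then 60 minutes), as are the function name and the action labels.

```python
from typing import Iterable, List, Tuple


def probe_schedule(probe_results: Iterable[bool],
                   first_wait_min: int = 30,
                   observe_min: int = 10,
                   growth: int = 2) -> List[Tuple[int, str]]:
    """Simulate the loop. Each entry of probe_results says whether the
    node stayed normal during one observation window. Returns a list of
    (minute, action) events counted from the initial scheduling away."""
    events = []
    clock = 0
    wait = first_wait_min
    for stayed_normal in probe_results:
        clock += wait
        events.append((clock, "return_portion"))   # trial return
        clock += observe_min
        if stayed_normal:
            events.append((clock, "return_all"))    # recovery confirmed
            return events
        events.append((clock, "schedule_away_again"))
        wait *= growth  # assumed back-off: each wait is longer
    return events


# Two failed trials followed by a successful one.
timeline = probe_schedule([False, False, True])
```

Under these assumptions the trials start at minutes 30, 100, and 230, which illustrates how the lengthening wait reduces how often traffic is moved while a node remains unstable.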
Above, the resource scheduling method according to the exemplary embodiment of the present disclosure has been described in connection with fig. 2 to 6. According to the resource scheduling method of the exemplary embodiment of the disclosure, the fault node in the CDN can be more timely and accurately positioned, and effective CDN scheduling is realized.
Fig. 7 is a block diagram of an apparatus for resource scheduling (hereinafter, simply referred to as "resource scheduling apparatus" for convenience of description) of an exemplary embodiment of the present disclosure.
Referring to fig. 7, a resource scheduling apparatus 700 may include a quality data acquisition unit 701, a failure detection unit 702, a failure decision unit 703, and a scheduling unit 704. Specifically, the quality data acquisition unit 701 may be configured to acquire quality data of the CDN. The fault detection unit 702 may be configured to perform fault detection on the CDN based on the acquired quality data. Here, performing fault detection on the CDN includes locating a fault node in the CDN according to a quality data threshold predicted by the time series model and/or locating a fault node in the CDN according to a quality difference between CDN nodes. The fault decision unit 703 may be configured to perform a fault decision based on the detection result. The scheduling unit 704 may be configured to perform CDN scheduling according to the decision result.
Since the resource scheduling method shown in fig. 2 may be performed by the resource scheduling apparatus 700 shown in fig. 7, and the quality data acquisition unit 701, the fault detection unit 702, the fault decision unit 703, and the scheduling unit 704 may perform operations corresponding to steps S201, S202, S203, and S204 in fig. 2, respectively, any relevant details concerning the operations performed by the units in fig. 7 may be found in the corresponding description of fig. 2 and will not be repeated here.
Further, it should be noted that, although the resource scheduling apparatus 700 is described above as being divided into units for performing the respective processes, it is clear to those skilled in the art that the processes performed by the respective units described above may be performed without any specific division of units or without explicit demarcation between the units by the resource scheduling apparatus 700. In addition, the resource scheduling device 700 may further include other units, such as a data processing unit, a storage unit, and the like.
Fig. 8 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Referring to fig. 8, an electronic device 800 may include at least one memory 801 and at least one processor 802, the at least one memory 801 storing a set of computer-executable instructions that, when executed by the at least one processor 802, perform a resource scheduling method according to an embodiment of the present disclosure.
By way of example, the electronic device may be a personal computer (PC), a tablet device, a personal digital assistant, a smart phone, or another device capable of executing the above-described set of instructions. Here, the electronic device is not necessarily a single electronic device, but may be any device or aggregate of circuits capable of executing the above-described instructions (or instruction set) individually or in combination. The electronic device may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces locally or remotely (e.g., via wireless transmission).
In an electronic device, a processor may include a Central Processing Unit (CPU), a Graphics Processor (GPU), a programmable logic device, a special purpose processor system, a microcontroller, or a microprocessor. By way of example, and not limitation, processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.
The processor may execute instructions or code stored in the memory, wherein the memory may also store data. The instructions and data may also be transmitted and received over a network via a network interface device, which may employ any known transmission protocol.
The memory may be integrated with the processor, for example, RAM or flash memory disposed within an integrated circuit microprocessor or the like. In addition, the memory may include a stand-alone device, such as an external disk drive, a storage array, or any other storage device usable by a database system. The memory and the processor may be operatively coupled or may communicate with each other, for example, through an I/O port, a network connection, etc., such that the processor is able to read files stored in the memory.
In addition, the electronic device may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the electronic device may be connected to each other via a bus and/or a network.
According to an embodiment of the present disclosure, there may also be provided a computer-readable storage medium storing instructions, wherein the instructions, when executed by at least one processor, cause the at least one processor to perform a resource scheduling method according to an exemplary embodiment of the present disclosure. Examples of the computer-readable storage medium herein include: read-only memory (ROM), programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random-access memory (RAM), dynamic random-access memory (DRAM), static random-access memory (SRAM), flash memory, nonvolatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM, DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or other optical disk storage, hard disk drives (HDD), solid state disks (SSD), card memory (such as multimedia cards, Secure Digital (SD) cards, or eXtreme Digital (xD) cards), magnetic tape, floppy disks, magneto-optical data storage, and any other device configured to store computer programs and any associated data, data files, and data structures in a non-transitory manner and to provide the computer programs and any associated data, data files, and data structures to a processor or computer so that the processor or computer can execute the programs. The computer programs in the computer-readable storage medium described above can be run in an environment deployed on a computer device such as a client, host, proxy device, or server. Further, in one example, the computer programs and any associated data, data files, and data structures are distributed across networked computer systems such that the computer programs and any associated data, data files, and data structures are stored, accessed, and executed in a distributed fashion by one or more processors or computers.
In accordance with an embodiment of the present disclosure, a computer program product may also be provided, instructions in which are executable by at least one processor in an electronic device to perform a resource scheduling method according to an exemplary embodiment of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (18)

1. A method for scheduling resources, comprising:
acquiring quality data of a content delivery network (CDN), wherein acquiring the quality data of the CDN comprises: acquiring quality data of each CDN node under different dimension combinations;
performing fault detection on the CDN based on the obtained quality data, wherein performing fault detection on the CDN includes: locating a fault node in the CDN according to a quality data threshold predicted by the time sequence model; or, locating the fault nodes in the CDN according to the quality data threshold predicted by the time sequence model, and locating the fault nodes in the CDN according to the quality difference between the CDN nodes;
screening out fault nodes which can be scheduled from the positioned fault nodes according to a detection result and a preset fault decision rule; and
dispatching the traffic on the fault node capable of being dispatched to the normal node under the same dimension combination based on the current resource configuration condition of the CDN, wherein the resource configuration condition comprises the traffic configuration condition of the CDN node,
wherein the scheduling, based on the current resource configuration condition of the CDN, the traffic on the fault node capable of being scheduled to the normal node under the same dimension combination comprises:
for a fault node which can be scheduled and whose traffic size meets a first predetermined condition, after the traffic on the fault node is scheduled to a normal node, repeating the following scheduling process: after waiting for a predetermined time, scheduling at least a portion of the scheduled-away traffic back to the fault node; scheduling the previously scheduled-away traffic back to the fault node if the quality of the fault node returns to normal during a specified period of time; and scheduling away again the at least a portion of traffic that was scheduled back if the quality of the fault node remains abnormal during the specified period of time, wherein the predetermined waiting time becomes gradually longer after each time the traffic is scheduled away again, wherein the fault nodes which can be scheduled and whose traffic size meets the first predetermined condition are fault nodes serving accounts whose attention is less than a first threshold and greater than a second threshold and fault nodes serving accounts whose attention is less than the second threshold, wherein for a fault node serving an account whose attention is less than the second threshold, the at least a portion of traffic that is scheduled back is equal to the scheduled-away traffic, and for a fault node serving an account whose attention is less than the first threshold and greater than the second threshold, the at least a portion of traffic that is scheduled back is less than the scheduled-away traffic;
and for a fault node which can be scheduled and whose traffic size meets a second predetermined condition, after the traffic on the fault node is scheduled to a normal node, not scheduling the scheduled-away traffic back to the fault node even if the quality of the fault node returns to normal, wherein the fault node which can be scheduled and whose traffic size meets the second predetermined condition is a fault node serving an account whose attention is greater than the first threshold.
2. The method of claim 1, wherein the performing fault detection on the CDN based on the acquired quality data comprises:
and performing fault detection on the CDNs under the different dimensional combinations based on the acquired quality data of each CDN node under the different dimensional combinations.
3. The method of claim 2, wherein the locating the failed node in the CDN based on the quality data threshold predicted by the time series model comprises:
acquiring a quality data threshold value under each dimension combination or a quality data threshold value of each CDN node, which is predicted by using a time sequence model based on historical quality data of each CDN node under each dimension combination;
and comparing the acquired quality data of each CDN node under each dimension combination with a corresponding quality data threshold value to locate a fault node under each dimension combination.
4. The method of claim 3, wherein locating the failed nodes under each dimension combination by comparing the acquired quality data of each CDN node under each dimension combination with the corresponding quality data threshold comprises:
determining whether the quality data of each CDN node is greater than the corresponding quality data threshold, and determining whether the ratio of the quality data of each CDN node to the corresponding quality data threshold is greater than a predetermined value;
determining the proportion, among all the quality data of each CDN node, of the quality data whose ratio to the corresponding quality data threshold is greater than the predetermined value; and
determining a CDN node whose proportion meets a preset condition under each dimension combination as a failed node under that dimension combination.
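The comparison in claim 4 can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the patented implementation: the thresholds are passed in as plain numbers (the time series model that predicts them is out of scope), thresholds are assumed positive, a larger quality value is assumed to mean worse quality, and the "preset condition" is assumed to be "proportion at or above `min_proportion`". The function names and default values are hypothetical.

```python
from typing import Sequence


def exceed_proportion(samples: Sequence[float], thresholds: Sequence[float],
                      ratio_limit: float) -> float:
    """Share of samples that exceed their threshold by more than ratio_limit.

    A sample counts when sample > threshold and sample / threshold > ratio_limit.
    The ratio test is written as a multiplication, which is equivalent for
    positive thresholds and avoids dividing by zero.
    """
    if len(samples) != len(thresholds):
        raise ValueError("samples and thresholds must have the same length")
    if not samples:
        return 0.0
    hits = sum(1 for s, t in zip(samples, thresholds)
               if s > t and s > ratio_limit * t)
    return hits / len(samples)


def is_failed(samples: Sequence[float], thresholds: Sequence[float],
              ratio_limit: float = 1.2, min_proportion: float = 0.5) -> bool:
    """Flag a node when enough of its samples are far above the threshold."""
    return exceed_proportion(samples, thresholds, ratio_limit) >= min_proportion
```

For samples `[1.0, 3.0, 5.0, 2.0]` against a constant threshold of `2.0` and `ratio_limit=1.2`, two of the four samples qualify, so the proportion is 0.5.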
5. The method of claim 2, wherein locating the failed node in the CDN according to the quality difference between CDN nodes comprises:
comparing the quality data of the CDN nodes under each dimension combination with one another; and
determining, according to the comparison result, a CDN node whose quality is worse than that of a preset proportion of the other CDN nodes under the same dimension combination as a failed node in the CDN.
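A rough sketch of the peer comparison in claim 5 follows. It assumes one aggregated quality value per node within a single dimension combination and that a larger value means worse quality (for example a stall rate); both are assumptions for illustration, as is the function name `peer_outliers` and the use of "at or above" for the preset proportion.

```python
from typing import Dict, List


def peer_outliers(quality: Dict[str, float], preset_proportion: float) -> List[str]:
    """Nodes that are worse than at least preset_proportion of their peers."""
    failed = []
    for node, value in quality.items():
        others = [v for n, v in quality.items() if n != node]
        if not others:
            continue  # a lone node has no peers to compare against
        # Fraction of the other nodes that this node is strictly worse than.
        worse_than = sum(1 for v in others if value > v) / len(others)
        if worse_than >= preset_proportion:
            failed.append(node)
    return sorted(failed)
```

Because the comparison is relative, this check can flag a node that is degraded compared with its peers even when a per-node threshold has drifted along with it.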
6. The method of claim 1, wherein the quality data of the CDN comprises values of indicators of CDN quality obtained at specific time intervals over a specific length of time.
7. The method of claim 1, wherein scheduling the traffic on the failed nodes capable of being scheduled to the normal nodes under the same dimension combination based on the current resource configuration of the CDN further comprises:
scheduling the traffic on the failed nodes capable of being scheduled to the normal nodes under the same dimension combination according to scheduling priorities of the failed nodes capable of being scheduled, wherein the scheduling priorities are determined according to the traffic on the failed nodes capable of being scheduled, and the larger the traffic on a failed node, the higher its scheduling priority.
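The priority rule in claim 7 (larger traffic is scheduled first) could be sketched as a greedy assignment like the one below. The greedy target choice (most spare capacity first), the splitting of one node's traffic across several targets, and the name `schedule` are assumptions made for the sketch; the claim specifies only the priority ordering and that the current resource configuration is taken into account.

```python
from typing import Dict, List, Tuple


def schedule(failed_traffic: Dict[str, float],
             spare: Dict[str, float]) -> List[Tuple[str, str, float]]:
    """Return (failed_node, normal_node, amount) moves, largest traffic first."""
    spare = dict(spare)  # work on a copy so the caller's view is untouched
    moves: List[Tuple[str, str, float]] = []
    # Higher traffic means higher priority; ties are broken by node name.
    for node in sorted(failed_traffic, key=lambda n: (-failed_traffic[n], n)):
        remaining = failed_traffic[node]
        # Prefer the normal node with the most spare capacity right now.
        for target in sorted(spare, key=lambda n: (-spare[n], n)):
            if remaining <= 0:
                break
            amount = min(remaining, spare[target])
            if amount > 0:
                moves.append((node, target, amount))
                spare[target] -= amount
                remaining -= amount
    return moves
```

Traffic that does not fit into the remaining spare capacity is simply left unassigned in this sketch.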
8. The method of claim 2, wherein performing fault detection on the CDN under the different dimension combinations comprises:
performing a health test on the CDN nodes under each dimension combination based on the acquired quality data under each dimension combination; and
performing fault detection only for a dimension combination that passes the health test, based on the acquired quality data of each CDN node under that dimension combination, to locate the failed nodes under that dimension combination.
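The gating in claim 8 can be illustrated with the sketch below. The claim does not say what the health test checks, so the test used here (enough nodes, each with enough samples) is purely an assumption, as are the `(region, ISP)` shape of a dimension combination and all function names. The per-node detector is passed in as a callable so that any detection rule can be plugged in.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Combination = Tuple[str, str]      # hypothetical: (region, ISP)
NodeData = Dict[str, List[float]]  # node name -> quality samples


def passes_health_test(nodes: NodeData, min_nodes: int = 2,
                       min_samples: int = 3) -> bool:
    """Assumed health test: enough nodes, each with enough samples."""
    return (len(nodes) >= min_nodes
            and all(len(s) >= min_samples for s in nodes.values()))


def detect(data: Dict[Combination, NodeData],
           is_failed: Callable[[List[float]], bool]) -> Dict[Combination, List[str]]:
    """Run fault detection only for combinations that pass the health test."""
    result: Dict[Combination, List[str]] = {}
    for combo, nodes in data.items():
        if not passes_health_test(nodes):
            continue  # skip combinations whose data cannot be trusted
        result[combo] = sorted(n for n, s in nodes.items() if is_failed(s))
    return result
```

Combinations that fail the health test are absent from the result rather than reported as having no failed nodes, which keeps "not checked" distinct from "checked and clean".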
9. An apparatus for scheduling resources, comprising:
a quality data obtaining unit configured to obtain quality data of a content delivery network CDN, wherein obtaining the quality data of the CDN includes: acquiring quality data of each CDN node under different dimension combinations;
a failure detection unit configured to perform failure detection on the CDN based on the acquired quality data, wherein performing the failure detection on the CDN includes: locating a fault node in the CDN according to a quality data threshold predicted by the time sequence model; or, locating the fault nodes in the CDN according to the quality data threshold predicted by the time sequence model, and locating the fault nodes in the CDN according to the quality difference between the CDN nodes;
a fault decision unit configured to screen out, according to a detection result and a preset fault decision rule, failed nodes capable of being scheduled from the located failed nodes; and
a scheduling unit configured to schedule traffic on a failed node capable of being scheduled to a normal node in the same dimension combination based on a current resource configuration situation of the CDN, wherein the resource configuration situation includes a traffic configuration situation of the CDN node,
wherein scheduling the traffic on the failed nodes capable of being scheduled to the normal nodes of the CDN based on the current resource configuration of the CDN comprises:
for a failed node that is capable of being scheduled and whose traffic meets a first preset condition, after the traffic on the failed node is scheduled to a normal node, repeating the following scheduling process: after waiting for a predetermined time, scheduling at least a portion of the scheduled-away traffic back to the failed node; if the quality of the failed node returns to normal within a specific period of time, scheduling the previously scheduled-away traffic back to the failed node; and if the quality of the failed node remains abnormal within the specific period of time, scheduling the at least a portion of the traffic that was scheduled back away from the failed node again,
wherein the predetermined time to wait becomes gradually longer after each time the traffic is scheduled away again,
wherein the failed nodes capable of being scheduled whose traffic meets the first preset condition are failed nodes serving accounts whose attention degree is less than a first threshold and greater than a second threshold, and failed nodes serving accounts whose attention degree is less than the second threshold,
wherein, for a failed node serving an account whose attention degree is less than the second threshold, the at least a portion of the traffic scheduled back is equal to the scheduled-away traffic, and, for a failed node serving an account whose attention degree is less than the first threshold and greater than the second threshold, the at least a portion of the traffic scheduled back is less than the scheduled-away traffic; and
for a failed node that is capable of being scheduled and whose traffic meets a second preset condition, after the traffic on the failed node is scheduled to a normal node, not scheduling the scheduled-away traffic back to the failed node even if the quality of the failed node returns to normal, wherein the failed node capable of being scheduled whose traffic meets the second preset condition is a failed node serving an account whose attention degree is greater than the first threshold.
10. The apparatus of claim 9, wherein performing fault detection on the CDN based on the acquired quality data comprises:
performing fault detection on the CDN under the different dimension combinations based on the acquired quality data of each CDN node under the different dimension combinations.
11. The apparatus of claim 10, wherein locating the failed node in the CDN according to the quality data threshold predicted by the time series model comprises:
acquiring a quality data threshold for each dimension combination, or a quality data threshold for each CDN node, predicted by the time series model based on historical quality data of each CDN node under each dimension combination; and
comparing the acquired quality data of each CDN node under each dimension combination with the corresponding quality data threshold to locate the failed nodes under each dimension combination.
12. The apparatus of claim 11, wherein locating the failed nodes under each dimension combination by comparing the acquired quality data of each CDN node under each dimension combination with the corresponding quality data threshold comprises:
determining whether the quality data of each CDN node is greater than the corresponding quality data threshold, and determining whether the ratio of the quality data of each CDN node to the corresponding quality data threshold is greater than a predetermined value;
determining the proportion, among all the quality data of each CDN node, of the quality data whose ratio to the corresponding quality data threshold is greater than the predetermined value; and
determining a CDN node whose proportion meets a preset condition under each dimension combination as a failed node under that dimension combination.
13. The apparatus of claim 10, wherein locating the failed node in the CDN according to the quality difference between CDN nodes comprises:
comparing the quality data of the CDN nodes under each dimension combination with one another; and
determining, according to the comparison result, a CDN node whose quality is worse than that of a preset proportion of the other CDN nodes under the same dimension combination as a failed node in the CDN.
14. The apparatus of claim 9, wherein the quality data of the CDN comprises values of indicators of CDN quality obtained at specific time intervals over a specific length of time.
15. The apparatus of claim 9, wherein scheduling the traffic on the failed nodes capable of being scheduled to the normal nodes under the same dimension combination based on the current resource configuration of the CDN further comprises:
scheduling the traffic on the failed nodes capable of being scheduled to the normal nodes under the same dimension combination according to scheduling priorities of the failed nodes capable of being scheduled, wherein the scheduling priorities are determined according to the traffic on the failed nodes capable of being scheduled, and the larger the traffic on a failed node, the higher its scheduling priority.
16. The apparatus of claim 10, wherein performing fault detection on the CDN under the different dimension combinations comprises:
performing a health test on the CDN nodes under each dimension combination based on the acquired quality data under each dimension combination; and
performing fault detection only for a dimension combination that passes the health test, based on the acquired quality data of each CDN node under that dimension combination, to locate the failed nodes under that dimension combination.
17. An electronic device, comprising:
at least one processor; and
at least one memory storing computer-executable instructions,
wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform the method of any one of claims 1 to 8.
18. A computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform the method of any of claims 1 to 8.
CN202011577129.0A 2020-12-28 2020-12-28 Resource scheduling method and device, electronic equipment and storage medium Active CN112769643B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011577129.0A CN112769643B (en) 2020-12-28 2020-12-28 Resource scheduling method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011577129.0A CN112769643B (en) 2020-12-28 2020-12-28 Resource scheduling method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112769643A CN112769643A (en) 2021-05-07
CN112769643B true CN112769643B (en) 2023-12-29

Family

ID=75697715

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011577129.0A Active CN112769643B (en) 2020-12-28 2020-12-28 Resource scheduling method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112769643B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113259714B (en) * 2021-06-30 2021-10-15 腾讯科技(深圳)有限公司 Content distribution processing method and device, electronic equipment and storage medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106412063A (en) * 2016-09-29 2017-02-15 赛尔网络有限公司 CDN node detection and resource scheduling system and method in education network
CN107769963A (en) * 2017-09-29 2018-03-06 贵州白山云科技有限公司 A kind of content distributing network Fault Locating Method and device
CN107911240A (en) * 2017-11-14 2018-04-13 北京知道创宇信息技术有限公司 A kind of fault detection method and device
CN108234207A (en) * 2017-12-29 2018-06-29 北京奇虎科技有限公司 A kind of Fault Locating Method and device based on content distributing network CDN
CN109842563A (en) * 2017-11-24 2019-06-04 中国电信股份有限公司 Content delivery network flow dispatching method, device and computer readable storage medium
CN110501160A (en) * 2019-07-31 2019-11-26 中国神华能源股份有限公司神朔铁路分公司 Train bearing fault early warning method, device, system and storage medium
CN111371826A (en) * 2018-12-26 2020-07-03 北京奇虎科技有限公司 CDN node performance detection method, device and system
CN111611074A (en) * 2020-05-14 2020-09-01 北京达佳互联信息技术有限公司 Method and device for scheduling cluster resources
CN111614484A (en) * 2020-04-13 2020-09-01 网宿科技股份有限公司 Node flow calling and recovering method, system and central server
CN111628878A (en) * 2019-02-27 2020-09-04 北京奇虎科技有限公司 Fault positioning method, device and system based on multi-stage network nodes
CN111931860A (en) * 2020-09-01 2020-11-13 腾讯科技(深圳)有限公司 Abnormal data detection method, device, equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10243860B2 (en) * 2016-04-05 2019-03-26 Nokia Technologies Oy Method and apparatus for end-to-end QoS/QoE management in 5G systems
EP3451614B1 (en) * 2017-01-22 2020-12-30 Huawei Technologies Co., Ltd. Dispatching method and device in content delivery network

Also Published As

Publication number Publication date
CN112769643A (en) 2021-05-07

Similar Documents

Publication Publication Date Title
CN109522287B (en) Monitoring method, system, equipment and medium for distributed file storage cluster
CN106878064B (en) Data monitoring method and device
US10178198B2 (en) System and method for selection and switching of content sources for a streaming content session
US9049105B1 (en) Systems and methods for tracking and managing event records associated with network incidents
US9553909B2 (en) System and method for assignment and switching of content sources for a streaming content session
CN110851342A (en) Fault prediction method, device, computing equipment and computer readable storage medium
CN112702198B (en) Abnormal root cause positioning method and device, electronic equipment and storage medium
US8528031B2 (en) Distributed diagnostics for internet video link
CN109039787A (en) log processing method, device and big data cluster
US10108522B1 (en) Sampling technique to adjust application sampling rate
US20160094392A1 (en) Evaluating Configuration Changes Based on Aggregate Activity Level
CN112769643B (en) Resource scheduling method and device, electronic equipment and storage medium
US10372524B2 (en) Storage anomaly detection
CN111581002A (en) Automatic fault reporting method, device and equipment for server fault
US20190349247A1 (en) Scalable and real-time anomaly detection
CN112671590B (en) Data transmission method and device, electronic equipment and computer storage medium
CN111756798B (en) Service scheduling method, device, equipment and storage medium based on gateway cascade
CN114090382A (en) Health inspection method and device for super-converged cluster
CN115391127A (en) Dial testing method and device, storage medium and chip
CN112835780A (en) Service detection method and device
CN110309045B (en) Method, apparatus, medium and computing device for determining future state of server
CN114189467B (en) Content distribution network service evaluation method and device
CN111352992B (en) Data consistency detection method, device and server
CN117971674A (en) Execution management method and device of automation script, electronic equipment and medium
CN116431347A (en) Method, device, electronic equipment and storage medium for resource processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant