WO2021051839A1 - Data processing method, device, system and storage medium - Google Patents

Data processing method, device, system and storage medium

Info

Publication number
WO2021051839A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
data
deployment
data processing
data file
Prior art date
Application number
PCT/CN2020/090810
Other languages
English (en)
French (fr)
Inventor
刘晓威
Original Assignee
深圳市网心科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市网心科技有限公司
Publication of WO2021051839A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks
    • H04L67/1074 Peer-to-peer [P2P] networks for supporting data block transmission mechanisms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching
    • H04L67/5682 Policies or rules for updating, deleting or replacing the stored data

Definitions

  • This application relates to the field of cloud computing, and in particular to a data processing method, device, system and storage medium.
  • the cloud server architecture based on the CDN model continues to make substantial progress in applications.
  • One of the main purposes of the current cloud server architecture based on the CDN model is to provide corresponding data files according to users' access requirements.
  • the current cloud server framework composed of service devices based on the CDN network mode has gradually become the current development trend.
  • the purpose of this application is to provide a data processing method, device, system and storage medium that ensure data files are accurately deployed to matching service equipment, so that the service equipment's ability to provide data files matches the users' demand for data files.
  • this application provides a data processing method, including:
  • the work quality parameters include a survival quality parameter, a TCP connection quality parameter, and a hole punching quality parameter;
  • the status information includes service status information of the service device and file deployment status information.
  • obtaining the file popularity information of the data file includes:
  • counting the file requests includes:
  • deploying the data file to the target service device includes:
  • the data files are deployed to the target service device according to the priority order of the data files being accessed.
  • obtaining the file popularity information of the data file and calculating the deployment requirements of the data file according to the file popularity information includes:
  • the flow characteristic data includes the flow fluctuation information of the data file and the correlation information between the data files;
  • the traffic characteristic data is input into the preset characteristic matching deployment model as the file popularity information, and the deployment requirements of the data file are calculated.
  • calculating the deployment requirements of the data file according to the file popularity information includes:
  • this application also provides a data processing device.
  • the device includes a memory, a processor, and a bus.
  • the memory stores a data processing program that can be transmitted to the processor by the bus and run on the processor.
  • when the data processing program is executed by the processor, the data processing method described above is implemented.
  • this application also provides a data processing system, which includes:
  • the status statistics unit is used to collect the work quality parameters of the service equipment and analyze the work quality parameters to generate status information
  • the demand calculation unit is used to obtain the file popularity information of the data file, and calculate the deployment requirements of the data file according to the file popularity information;
  • the equipment selection module is used to select the target service equipment that meets the deployment requirements according to the status information
  • the data distribution module is used to deploy the data file to the target service device.
  • the present application also provides a computer-readable storage medium on which a data processing program is stored, and the data processing program can be executed by one or more processors to implement the above-mentioned data processing method.
  • the data processing method provided in this application first collects the work quality parameters of the service equipment and analyzes them to generate the status information of the service equipment. It then obtains the file popularity information of the data file and calculates the deployment requirement of the data file according to that information. Finally, it selects the target service device that meets the deployment requirement according to the status information and deploys the data file to the target service device. Because the method first collects and analyzes the work quality parameters to generate status information that characterizes the overall working condition of the service equipment, and then selects, according to that status information, the target service device to which the data file can be deployed, data files are deployed to matching service devices according to the availability of those devices. This helps ensure that the service device's ability to provide data files is consistent with the users' access requirements for data files.
  • the present application also provides a data processing device, system, and storage medium, and the beneficial effects are the same as those described above.
  • FIG. 1 is a flowchart of a data processing method provided by an embodiment of this application.
  • FIG. 3 is a flowchart of another data processing method provided by an embodiment of the application.
  • FIG. 4 is a schematic diagram of the architecture of a scheduling deployment system in a specific scenario provided by an embodiment of the application;
  • FIG. 5 is a structural diagram of a data processing device provided by an embodiment of the application.
  • the core of this application is to provide a data processing method, device, system and storage medium that ensure data files are accurately deployed to matching service equipment, so that the service equipment's ability to provide data files matches the users' demand for data files.
  • FIG. 1 is a flowchart of a data processing method provided by an embodiment of the application. Referring to FIG. 1, the data processing method can be applied to the deployment server in a CDN, and can also be applied to other devices with deployment functions.
  • the specific steps of the data processing method include:
  • Step S10: Collect work quality parameters of the service equipment, and analyze the work quality parameters to generate status information.
  • the service device in this step refers to a device that can provide data file services to user equipment.
  • the service device may be a CDN node device, which can store data files deployed by the deployment server and can transfer the stored data files to clients that need service.
  • the service equipment may include various types of electronic equipment, such as a player cloud, optical modem, router, TV box, smart TV, IPFS mining machine, network attached storage (NAS), personal computer, or server. Because the hardware performance of a service device determines the data file service capability that the device can provide, this step collects the work quality parameters of the service device in order to generate its working status information from those parameters. The working status information characterizes the working performance of the service equipment.
  • Step S11: Obtain the file popularity information of the data file, and calculate the deployment requirement of the data file according to the file popularity information.
  • the purpose of this step is to obtain the file popularity information of the data file, and then calculate the number of data files that need to be deployed according to the file popularity information, that is, the deployment requirement of the data file.
  • the file popularity information obtained in this step may be access statistics generated from the user equipment's access to data files.
  • File popularity information represents the user equipment's demand for data files. Deploying data files that are in high demand to the service device helps ensure the hit rate of user equipment requests for data files on the service device.
  • Step S12: Select a target service device that meets the deployment requirements according to the status information.
  • the status information characterizes the service device's ability to carry deployed data files, and the deployment requirement characterizes the user equipment's access requirements for data files.
  • the status information therefore determines whether a service device, once data files are deployed to it, can meet the user equipment's access requirements for those files, so this step selects a target service device that meets the deployment requirements based on the status information. For example, if the demand for accessing a data file is 100,000 requests per unit time, the deployment should ensure that the service equipment can respond normally to 100,000 requests from user terminals per unit time.
  • the status information of the service device is used to determine whether the service device can provide 100,000 services per unit time to user terminals, and the target service device that can meet the deployment requirements is selected accordingly.
  • Step S13: Deploy the data file to the target service device.
  • the data file is deployed to the target service device that meets the data file deployment requirements, so as to ensure that the target service device normally provides the data file to the user terminal.
  • each step in this embodiment may be executed by a scheduling and deployment device that has established a communication connection with the service device, so that data files can be deployed to the service device.
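The selection in step S12 can be illustrated with a minimal sketch. It is not the method claimed in the application: the `ServiceDevice` fields, the greedy largest-capacity-first rule, and the device names are all assumptions made for illustration, reusing the 100,000 requests per unit time figure from the example above.

```python
from dataclasses import dataclass


@dataclass
class ServiceDevice:
    name: str
    healthy: bool            # hypothetical summary of the service status information
    capacity_per_unit: int   # hypothetical: requests the device can serve per unit time


def select_targets(devices, required_per_unit):
    """Pick healthy devices, largest capacity first, until the demand is covered."""
    chosen, covered = [], 0
    healthy = (d for d in devices if d.healthy)
    for dev in sorted(healthy, key=lambda d: d.capacity_per_unit, reverse=True):
        if covered >= required_per_unit:
            break
        chosen.append(dev)
        covered += dev.capacity_per_unit
    if covered < required_per_unit:
        raise ValueError("available devices cannot meet the deployment requirement")
    return chosen


devices = [
    ServiceDevice("peer-a", True, 60_000),
    ServiceDevice("peer-b", False, 80_000),   # unhealthy, so never selected
    ServiceDevice("peer-c", True, 50_000),
    ServiceDevice("peer-d", True, 20_000),
]
targets = select_targets(devices, 100_000)
```

A greedy rule keeps the number of target devices small; a real scheduler would presumably also weigh disk space and other status fields.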
  • the data processing method provided in this application first collects the work quality parameters of the service equipment and analyzes them to generate the status information of the service equipment. It then obtains the file popularity information of the data file and calculates the deployment requirement of the data file according to that information. Finally, it selects the target service device that meets the deployment requirement according to the status information and deploys the data file to the target service device. Because the method first collects and analyzes the work quality parameters to generate status information that characterizes the overall working condition of the service equipment, and then selects, according to that status information, the target service device to which the data file can be deployed, data files are deployed to matching service devices according to the availability of those devices. This helps ensure that the service device's ability to provide data files is consistent with the users' access requirements for data files.
  • the work quality parameters may include a survival quality parameter, a TCP connection quality parameter, and a hole punching quality parameter.
  • the status information includes service status information of the service device and file deployment status information.
  • the survival quality parameter of the service device characterizes the reliability of the service device's own operation;
  • the TCP connection quality parameter of the service device characterizes the reliability of data file transmission between the service device and the user device;
  • the hole punching quality parameter also characterizes the reliability of data file transmission between the service device and the user device.
  • Hole punching connections and TCP connections are the two ways in which the service device and the SDK connect. SDK refers to the software development kit in the user device application, and hole punching refers to the UDP-based connection between the service device and the SDK.
  • the survival quality parameter may specifically include the communication success rate of the service device, or the probability of resource deadlock or downtime of the service device; that is, it indicates the probability that the service device sends and receives data normally. The TCP connection quality parameter may specifically include the data transmission efficiency and the correctness of the transmitted data over a TCP connection. The hole punching quality parameter may specifically include the data transmission efficiency and the correctness of the transmitted data over a UDP connection.
  • because the work quality parameters, including the TCP connection quality parameter and the hole punching quality parameter, reflect both the service status of the service device when it serves data files over the network and the file deployment status when it acquires and deploys data files locally, this embodiment refines the specific content of the work quality parameters with respect to both capabilities. This improves the accuracy of the status information and the reliability of the target service device selected on the basis of it.
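As a rough illustration of how the three work quality parameters might be condensed into status information, the sketch below uses normalized scores in the range 0 to 1. The field names, the thresholds, and the way the scores are combined are assumptions of this example; the application does not specify them.

```python
from dataclasses import dataclass


@dataclass
class WorkQuality:
    survival: float  # probability that the device sends and receives data normally
    tcp: float       # hypothetical TCP transfer quality score, 0..1
    punch: float     # hypothetical UDP hole punching transfer quality score, 0..1


def analyze(q, min_survival=0.9, min_transfer=0.5):
    """Turn raw work quality parameters into coarse status information."""
    # The SDK can reach the device over TCP or a hole-punched UDP link,
    # so this sketch takes the better of the two as the transfer quality.
    transfer = max(q.tcp, q.punch)
    alive = q.survival >= min_survival
    return {
        "service_ok": alive and transfer >= min_transfer,  # service status
        "deploy_ok": alive,                                # file deployment status
        "score": round(q.survival * transfer, 3),
    }
```

For instance, `analyze(WorkQuality(0.95, 0.8, 0.4))` marks the device as usable for both service and deployment, while a device with a survival score of 0.5 is rejected regardless of its connection quality.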
  • FIG. 2 is a flowchart of another data processing method provided by an embodiment of the application. Referring to FIG. 2, the specific steps of the data processing method include:
  • Step S20: Collect work quality parameters of the service equipment, and analyze the work quality parameters to generate status information.
  • Step S21: Receive file requests from the user equipment in real time.
  • Step S22: Count the file requests and generate the file popularity information of the data file.
  • in this embodiment, the file popularity information is generated from real-time statistics on the file requests sent by user equipment. That is, the scheduling device establishes communication connections with both the user equipment and the service device, and the user equipment sends file requests for data files to the scheduling device. When the scheduling device receives a file request for a data file, it updates the access count for that data file accordingly.
  • the file popularity information of the data file is generated according to the number of file requests from user equipment for the data file. Since the file requests are received from the user equipment in real time, the file popularity information generated from these statistics is highly up to date.
  • Step S23: Calculate the deployment requirement of the data file according to the file popularity information.
  • Step S24: Select a target service device that meets the deployment requirements according to the status information.
  • Step S25: Deploy the data file to the target service device.
  • in this embodiment, the file popularity information of the data file corresponding to each file request is derived from counting the file requests. The file popularity information is therefore updated in real time, and the deployment requirement calculated from it is also updated in real time. This timeliness can improve the overall accuracy of data file deployment.
  • counting the file requests includes:
  • the focus of this embodiment is that, when file requests are counted, the requests at the current statistical time and the requests within a continuous time period before it are counted together. In other words, this embodiment collects statistics on file requests over a period of time, which amounts to reviewing the overall history of file requests, so as to learn how each data file has been accessed by user equipment both at the current time and during the period before it.
  • this avoids inaccurate file popularity information when the popularity of a data file spikes or is scattered over a short period, ensures that the file popularity information reflects the popularity of the data file relatively accurately, and further improves the accuracy of data file deployment.
  • the number of requests for a data file can be counted using a second-level statistical period, and the popularity of the file can be evaluated from the number of requests in the current statistical period together with the number of requests in several preceding periods.
  • for example, the statistical period can be set to 1 second; the number of requests for the data file in the current second and in the previous T seconds is then counted to obtain the popularity information of the data file.
  • T can be set as required.
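The one-second statistical period and the look-back of T seconds described above can be sketched as a small sliding-window counter. This is a minimal illustration, not the application's implementation: the class name and its methods are invented here, and the sketch assumes request timestamps arrive in non-decreasing order.

```python
from collections import deque


class SlidingHeatCounter:
    """Counts requests in one-second buckets and sums the current second
    plus the previous T seconds."""

    def __init__(self, window_seconds):
        self.window = window_seconds  # T in the text
        self.buckets = deque()        # (second, count) pairs, oldest first

    def record(self, timestamp):
        """Register one file request received at `timestamp` (in seconds)."""
        second = int(timestamp)
        if self.buckets and self.buckets[-1][0] == second:
            self.buckets[-1] = (second, self.buckets[-1][1] + 1)
        else:
            self.buckets.append((second, 1))

    def heat(self, now):
        """Return the request count for the current second and the T before it."""
        current = int(now)
        # Drop buckets that fall outside the look-back window.
        while self.buckets and self.buckets[0][0] < current - self.window:
            self.buckets.popleft()
        return sum(count for second, count in self.buckets if second <= current)
```

One counter of this kind would be kept per data file. A larger T smooths out short bursts at the cost of reacting more slowly to a file that has just become popular.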
  • deploying the data file to the target service device includes:
  • the data files are deployed to the target service device according to the priority order of the data files being accessed.
  • this embodiment takes into account that data files often differ in importance, where the importance of a data file refers to how likely it is to be accessed by users. For example, when a user watches a video, the user usually starts from the beginning of the video and may not watch it to the end. If the video file is divided into multiple data files, the data file corresponding to the beginning of the video is therefore more important than the data file corresponding to the end, and the deployment should first ensure that the data file corresponding to the beginning of the video can be provided to the user efficiently.
  • the importance of a data file within the overall set of data files should be proportional to its priority. For example, if three data files A, B and C are ranked by importance so that A is the most important, B is less important than A, and C is less important than B, then A has the highest priority, B has a lower priority than A, and C has a lower priority than B.
  • an access priority order is preset for the data files, and the data files are then deployed to the target service device in that order, from the highest priority to the lowest.
  • as a result, data files with a higher priority can be accessed by user equipment first, which helps ensure the service efficiency of the service device for the data files.
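The A, B, C example above reduces to sorting by a preset priority before deployment. The following sketch assumes, purely for illustration, that priorities are stored as integers and that a smaller number means a higher priority.

```python
def deployment_order(priorities):
    """Return file names sorted from highest to lowest access priority.

    `priorities` maps a file name to its preset priority; in this sketch
    a smaller number means a higher priority.
    """
    return [name for name, _ in sorted(priorities.items(), key=lambda kv: kv[1])]


# Segments of one video: A is the beginning, C is the end.
order = deployment_order({"C": 3, "A": 1, "B": 2})
```

The deployment loop would then send the files to the target service device in the order given by `order`, so the opening segment becomes available first.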
  • FIG. 3 is a flowchart of another data processing method provided by an embodiment of the application. Referring to FIG. 3, the specific steps of the data processing method include:
  • Step S30: Collect work quality parameters of the service equipment, and analyze the work quality parameters to generate status information.
  • Step S31: Receive the traffic characteristic data of the data file reported by the user equipment; the traffic characteristic data includes the traffic fluctuation information of the data file and the association information between data files.
  • in this step, the deployment device receives the traffic characteristic data about the data file reported by the user equipment.
  • the traffic fluctuation information refers to the overall change, over a period of time, in the volume of access to the data file by user equipment.
  • the association information between data files refers to the affiliation between different data files or the relationship between their contents.
  • Step S32: Input the traffic characteristic data as the file popularity information into the preset feature matching deployment model, and calculate the deployment requirement of the data file.
  • in this step, the traffic characteristic data is input into the preset feature matching deployment model as the file popularity information to calculate the deployment requirement of the data file.
  • the feature matching deployment model in this embodiment can represent the correspondence between deployment cost and deployment benefit. The traffic characteristic data is input into the model as the file popularity information so that the corresponding deployment cost and deployment benefit can be calculated on the basis of that data, and the deployment requirement for which the deployment cost and deployment benefit fall within a certain threshold range is selected.
  • Step S33: Select a target service device that meets the deployment requirements according to the status information.
  • Step S34: Deploy the data file to the target service device.
  • in this embodiment, the scheduling device receives the traffic characteristic data generated and reported by the user equipment, and then analyzes that data with the feature matching deployment model to generate the corresponding deployment requirements.
  • the deployment requirements can therefore be generated offline, which allows data files with higher popularity, and data files closely associated with them, to be deployed in advance, further ensuring the accuracy of data file deployment.
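The idea of choosing a deployment requirement whose cost and benefit both fall within thresholds can be illustrated with a deliberately simple stand-in. The application describes a preset (and elsewhere, trained) feature matching model; the linear cost, the capacity-based benefit, and every parameter name below are assumptions of this sketch, not part of that model.

```python
def pick_copy_count(expected_requests, per_copy_capacity, cost_per_copy,
                    max_cost, min_served_ratio, max_copies=100):
    """Return the smallest copy count whose cost and benefit meet the thresholds.

    Cost is modelled as copies * cost_per_copy; benefit is the share of
    expected requests that the copies can serve. `expected_requests` must be
    positive. Returns None if no copy count satisfies both thresholds.
    """
    for copies in range(1, max_copies + 1):
        cost = copies * cost_per_copy
        if cost > max_cost:
            break  # every larger count is even more expensive
        served = min(expected_requests, copies * per_copy_capacity)
        if served / expected_requests >= min_served_ratio:
            return copies
    return None
```

With 1,000 expected requests, 300 requests served per copy, a unit cost of 1.0, a cost ceiling of 5.0 and a 90% service target, the function settles on three copies; if each copy could serve only 100 requests, no count within the cost ceiling would qualify and it returns `None`.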
  • calculating the deployment requirements of the data file according to the file popularity information includes:
  • This embodiment uses a weighted method to calculate the deployment requirements of data files from the file popularity information. Weights can be set according to the importance of each piece of file popularity information, which makes the calculation of deployment requirements more flexible and further improves the accuracy of data file deployment.
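A weighted calculation of this kind might look like the sketch below. The inputs (request counts from several statistical windows, one weight per window, a redundancy coefficient, and a per-copy serving capacity) are assumptions chosen for illustration; the application does not give the formula.

```python
import math


def deployment_requirement(heat_by_window, weights, redundancy=1.0,
                           per_copy_capacity=100):
    """Weighted popularity converted to a number of copies, rounded up.

    heat_by_window: request counts from different statistical windows.
    weights: one non-negative weight per window (not all zero).
    redundancy: hypothetical coefficient that scales the demand up or down.
    per_copy_capacity: hypothetical requests one deployed copy can serve.
    """
    if len(heat_by_window) != len(weights):
        raise ValueError("each window needs exactly one weight")
    weighted = sum(h * w for h, w in zip(heat_by_window, weights)) / sum(weights)
    return math.ceil(weighted * redundancy / per_copy_capacity)


copies = deployment_requirement([1200, 800], [0.75, 0.25],
                                redundancy=1.5, per_copy_capacity=500)
```

Giving the most recent window the largest weight makes the requirement follow current demand, while the older window damps short spikes.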
  • the architecture of the scheduling deployment system relies on:
  • Data platform: responsible for data reporting and log statistics for the SDK (user equipment), Peer (service equipment), the scheduling and deployment system, the index and hole punching systems, and others;
  • Data statistics: based on the data platform, compile statistics on the relevant data of each system and each module;
  • Data visualization: visual display of statistical data and analysis data;
  • Monitoring and alarm: configure monitoring of relevant indicators and alarm strategies, push notifications and alarm information, etc.;
  • Operation and maintenance console: includes operation and maintenance management platforms such as hierarchical release and plan management;
  • Operation console: includes business parameter configuration, manual triggering of deployment for the provided business, viewing of business-related traffic and other data, Peer resource operation, etc.;
  • the dispatch deployment system consists of 3 main parts:
  • Peer information collection: collect Peer-related information for the scheduling and deployment index;
  • Peer information statistics: aggregate statistics on Peer information;
  • Control information issuing: issue scheduling and deployment system control commands to Peer, such as index deletion control, Peer self-protection parameter control, etc.;
  • Dial test and detection: regularly detect Peer's survival quality, TCP connection quality, hole punching quality, etc.;
  • Peer information analysis: evaluate Peer's sharing ability, service quality, and download and deployment ability, in order to support fine-grained scheduling and deployment;
  • the SDK requests the Peer information that meets the requirements from the scheduling module on demand, so that the SDK can enter the P2P sharing mode at the right time;
  • Heat information collection: collect the request records of the scheduling module;
  • request information is maintained in a sliding time window with a granularity of seconds and with multiple traffic convergence cycles, and request records for a continuous period of time in the past are collected in real time. This achieves second-level perception of hot file deployment requirements, while the multiple convergence cycles ensure that more dispersed deployment requirements are not missed;
  • Deployment demand calculation: according to the real-time popularity and the weighted redundancy coefficient, calculate the current point selection demand for the file;
  • Point selection deployment: select points according to Peer information, combining multiple point selection strategies, such as selecting points according to disk space, sharing capabilities, feature matching, or configured weights;
  • Deployment task issuance: the task information to be deployed is sent to Peer both in real time and periodically, sorted by priority;
  • Index/cache management: build the index based on Peer deployment success and failure information; process Peer cache deletion requests; maintain index and cache deletion according to disk space and the global view;
  • Deployment resource management: control the cost of CDN back-to-source deployment, and use P2P deployment resources where possible without affecting the efficiency of Peer sharing and deployment;
  • Service traffic collection: collect the service traffic characteristic data reported by the SDK;
  • Traffic characteristics analysis: analyze traffic popularity characteristics, including statistical information on the traffic fluctuations of all files, the correlation between files, Peer usage, and so on, in order to set deployment parameters such as redundancy coefficients, generate file correlation feature data for pre-deployment, and train the SDK-Peer feature matching model for deployment, scheduling, and point selection;
  • Pre-deployment demand management: according to the traffic correlation characteristics, integrate the deployment requirements of files that match the correlation characteristics. Such a file may not yet meet the popularity threshold for deployment, but its associated files already meet the high popularity threshold; the file is then considered more likely to become hot next, so it can be pre-deployed with reference to the number of deployment copies of the corresponding hot file. The pre-deployment redundancy depends on the correlation: the greater the correlation, the smaller the discount on the number of deployment copies;
  • Deployment parameter training: according to traffic characteristics, Peer usage and other data, control the adaptive adjustment of the deployment system's redundancy coefficient (business arrangement flow model & cost control), and control the feature matching rules for scheduling, deployment and point selection, etc.;
  • Service equipment resource management: as the types of service equipment increase, the differences in service capability between service devices grow larger, and the problems caused by an "equal treatment" scheduling and deployment strategy, such as uneven traffic volume across service devices and low resource utilization, become more and more serious;
  • Reverse dial test: liveness detection ensures the availability of Peer, and the overall service capability and service quality of Peer are assessed, so that fine-grained scheduling and deployment can be based on Peer's service capability and service quality;
  • the main goal of the online deployment system is to ensure timely deployment of valuable files to the target Peer;
  • the updated heat convergence method supports multiple convergence cycles (to perceive more dispersed file deployment requirements) and second-level response (previous convergence per second);
  • Offline deployment system: a new offline deployment system has been added.
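The pre-deployment demand management item in the list above states that a file associated with a hot file can be pre-deployed with a copy count derived from the hot file's, discounted less as the correlation grows. A minimal sketch of one such rule follows; the linear scaling and the 0 to 1 correlation range are assumptions of this example, since the application gives no formula.

```python
import math


def predeploy_copies(hot_file_copies, correlation):
    """Copies to pre-deploy for a file associated with a hot file.

    correlation is assumed to lie in [0, 1]; a higher value means a smaller
    discount relative to the hot file's copy count.
    """
    if not 0.0 <= correlation <= 1.0:
        raise ValueError("correlation must be between 0 and 1")
    return math.ceil(hot_file_copies * correlation)
```

Under this rule, a file with correlation 0.8 to a hot file that has 10 copies would be pre-deployed with 8 copies, and one with correlation 0.25 with 3 copies.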
  • FIG. 5 is a structural diagram of a data processing device provided by an embodiment of the application.
  • a data processing device 1 provided by an embodiment of the present application includes a memory 11, a processor 12, and a bus 13.
  • the memory 11 stores a data processing program that can be transmitted by the bus 13 to the processor 12 and run on the processor 12; when the data processing program is executed by the processor 12, the above-mentioned data processing method is implemented.
  • the memory 11 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, etc.
  • the memory 11 may be an internal storage unit of the data processing device 1 in some embodiments, such as a hard disk of the data processing device 1.
  • the memory 11 may also be an external storage device of the data processing device 1, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the data processing device 1.
  • the memory 11 may also include both an internal storage unit of the data processing apparatus 1 and an external storage device.
  • the memory 11 can be used not only to store application software and various data installed in the data processing device 1, such as the code of a video transcoding program, etc., but also to temporarily store data that has been output or will be output.
  • the processor 12 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, used to run program code stored in the memory 11 or to process data, for example to execute a video transcoding program.
  • the bus 13 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus.
  • the bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration it is drawn as a single thick line in FIG. 5, but this does not mean that there is only one bus or one type of bus.
  • the data processing device provided by this application first collects the work quality parameters of the service devices and analyzes them to generate status information for the service devices. It then obtains the file popularity information of a data file and calculates the deployment requirement of that data file from the file popularity information. Finally, it selects, according to the status information of the service devices, the target service devices that meet the deployment requirement and deploys the data file to them. Because the device first collects and analyzes the work quality parameters, producing status information that characterizes the overall working condition of the service devices, and then selects the target service devices on which the data file can be deployed according to that status information and performs the deployment, the data file is in effect deployed accurately to matching service devices according to their availability. This relatively ensures that the ability of the service devices to provide the data file is consistent with users' access demand for it.
  • This application also provides a data processing system, which includes:
  • the status statistics unit is used to collect the work quality parameters of the service devices and analyze the work quality parameters to generate status information;
  • the demand calculation unit is used to obtain the file popularity information of the data file, and calculate the deployment requirement of the data file according to the file popularity information;
  • the device selection module is used to select, according to the status information, the target service device that meets the deployment requirement;
  • the data distribution module is used to deploy the data file to the target service device.
  • the data processing system provided by this application first collects the work quality parameters of the service devices and analyzes them to generate status information for the service devices. It then obtains the file popularity information of a data file and calculates the deployment requirement of that data file from the file popularity information. Finally, it selects, according to the status information of the service devices, the target service devices that meet the deployment requirement and deploys the data file to them. Because the system first collects and analyzes the work quality parameters, producing status information that characterizes the overall working condition of the service devices, and then selects the target service devices on which the data file can be deployed according to that status information and performs the deployment, the data file is in effect deployed accurately to matching service devices according to their availability. This relatively ensures that the ability of the service devices to provide the data file is consistent with users' access demand for it.
  • the present application also provides a computer-readable storage medium on which a data processing program is stored, and the data processing program can be executed by one or more processors to implement the above-mentioned data processing method.
  • the computer-readable storage medium provided by this application first collects the work quality parameters of the service devices and analyzes them to generate status information for the service devices. It then obtains the file popularity information of a data file and calculates the deployment requirement of that data file from the file popularity information. Finally, it selects, according to the status information of the service devices, the target service devices that meet the deployment requirement and deploys the data file to them. Because the computer-readable storage medium first collects and analyzes the work quality parameters, producing status information that characterizes the overall working condition of the service devices, and then selects the target service devices on which the data file can be deployed according to that status information and performs the deployment, the data file is in effect deployed accurately to matching service devices according to their availability. This relatively ensures that the ability of the service devices to provide the data file is consistent with users' access demand for it.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division into modules is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or modules, and may be in electrical, mechanical or other forms.
  • the modules described as separate components may or may not be physically separate, and the components displayed as modules may or may not be physical modules, that is, they may be located in one place, or they may be distributed on multiple network modules. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

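A data processing method, device, system, and storage medium. The method comprises: collecting work quality parameters of service devices and analyzing the work quality parameters to generate status information; obtaining file popularity information of a data file and calculating a deployment requirement of the data file according to the file popularity information; selecting, according to the status information, a target service device that meets the deployment requirement; and deploying the data file to the target service device. The method deploys the data file accurately to a matching service device according to the availability of the service devices, which relatively ensures that the ability of the service devices to provide the data file is consistent with users' access demand for the data file. The application further provides a data processing device, system, and storage medium, with the same beneficial effects.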

Description

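Data processing method, device, system, and storage medium
This application claims priority to Chinese patent application No. 201910883498.3, filed with the Chinese Patent Office on September 18, 2019 and entitled "Data processing method, device, system, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of cloud computing, and in particular to a data processing method, device, system, and storage medium.
Background
With the continuous development of cloud computing, cloud server architectures based on the CDN model keep making substantial progress in applications. One of their main uses today is to provide data files in response to users' access demand. To further increase the number of data nodes under the CDN network model, cloud server frameworks built from service devices operating under the CDN network model are becoming the current trend.
In a cloud server architecture, the main problem currently facing the field is how to ensure that data files are deployed accurately to matching service devices, so that the ability of the service devices to provide data files matches users' demand for them.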
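Summary of the Invention
The purpose of this application is to provide a data processing method, device, system, and storage medium that ensure data files are deployed accurately to matching service devices, so that the ability of the service devices to provide data files matches users' demand for them.
To solve the above technical problem, this application provides a data processing method, comprising:
collecting work quality parameters of service devices, and analyzing the work quality parameters to generate status information;
obtaining file popularity information of a data file, and calculating a deployment requirement of the data file according to the file popularity information;
selecting, according to the status information, a target service device that meets the deployment requirement;
deploying the data file to the target service device.
Preferably, the work quality parameters include a liveness quality parameter, a TCP connection quality parameter, and a hole-punching quality parameter;
the status information includes service status information and file deployment status information of the service device.
Preferably, obtaining the file popularity information of the data file comprises:
receiving file requests sent in real time by user devices;
counting the file requests to generate the file popularity information of the data file.
Preferably, counting the file requests comprises:
counting the file requests at the current moment and over a continuous period before the current moment.
Preferably, deploying the data file to the target service device comprises:
deploying the data file to the target service device according to the access priority order of the data files.
Preferably, obtaining the file popularity information of the data file and calculating the deployment requirement of the data file according to the file popularity information comprises:
receiving traffic feature data of the data file reported by user devices, the traffic feature data including traffic fluctuation information of the data file and correlation information between data files;
inputting the traffic feature data, as the file popularity information, into a preset feature-matching deployment model to calculate the deployment requirement of the data file.
Preferably, calculating the deployment requirement of the data file according to the file popularity information comprises:
calculating the deployment requirement of the data file from the file popularity information in a weighted manner.
In addition, this application provides a data processing device comprising a memory, a processor, and a bus. The memory stores a data processing program that can be transferred over the bus to the processor and run on the processor, and the data processing program, when executed by the processor, implements the data processing method described above.
In addition, this application provides a data processing system, comprising:
a status statistics unit, configured to collect work quality parameters of service devices and analyze the work quality parameters to generate status information;
a demand calculation unit, configured to obtain file popularity information of a data file and calculate a deployment requirement of the data file according to the file popularity information;
a device selection module, configured to select, according to the status information, a target service device that meets the deployment requirement;
a data distribution module, configured to deploy the data file to the target service device.
In addition, this application provides a computer-readable storage medium storing a data processing program that can be executed by one or more processors to implement the data processing method described above.
The data processing method provided by this application first collects the work quality parameters of the service devices and analyzes them to generate status information for the service devices. It then obtains the file popularity information of a data file and calculates the deployment requirement of that data file from the file popularity information. Finally, it selects, according to the status information of the service devices, the target service devices that meet the deployment requirement and deploys the data file to them. Because the method first collects and analyzes the work quality parameters, producing status information that characterizes the overall working condition of the service devices, and then selects the target service devices on which the data file can be deployed according to that status information and performs the deployment, the data file is in effect deployed accurately to matching service devices according to their availability. This relatively ensures that the ability of the service devices to provide the data file is consistent with users' access demand for it. In addition, this application provides a data processing device, system, and storage medium, with the same beneficial effects as described above.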
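Brief Description of the Drawings
FIG. 1 is a flowchart of a data processing method provided by an embodiment of this application;
FIG. 2 is a flowchart of another data processing method provided by an embodiment of this application;
FIG. 3 is a flowchart of another data processing method provided by an embodiment of this application;
FIG. 4 is a schematic architecture diagram of a scheduling and deployment system in a specific scenario provided by an embodiment of this application;
FIG. 5 is a structural diagram of a data processing device provided by an embodiment of this application.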
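Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
In a cloud server architecture, the main problem currently facing the field is how to ensure that data files are deployed accurately to matching service devices, so that the ability of the service devices to provide data files matches users' demand for them.
The core of this application is to provide a data processing method, device, system, and storage medium that ensure data files are deployed accurately to matching service devices, so that the ability of the service devices to provide data files matches users' demand for them.
To help those skilled in the art better understand the solution of this application, the application is described in further detail below with reference to the drawings and specific embodiments.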
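FIG. 1 is a flowchart of a data processing method provided by an embodiment of this application. Referring to FIG. 1, the data processing method may be applied to a deployment server in a CDN, or to other devices with a deployment function. The method comprises the following steps:
Step S10: Collect work quality parameters of service devices, and analyze the work quality parameters to generate status information.
It should be noted that a service device in this step is a device capable of providing data file services to user devices. Specifically, the service device may be a CDN node device that can store the data files deployed by the deployment server and transmit the stored data files to clients that need the service. Service devices may include various types of electronic devices, such as 玩客云 devices, optical modems, routers, TV boxes, smart TVs, IPFS mining machines, network attached storage (NAS), personal computers, and servers. Because service devices differ in hardware performance, they also differ in their capability to serve data files. This step collects the work quality parameters of the service devices in order to generate from them the working status information of the service devices, which characterizes their working performance.
Step S11: Obtain file popularity information of a data file, and calculate a deployment requirement of the data file according to the file popularity information.
The purpose of this step is to obtain the file popularity information of the data file and then calculate from it the number of copies of the data file that need to be deployed, that is, the deployment requirement of the data file. The file popularity information obtained in this step may be access volume information generated by counting user devices' accesses to the data file. File popularity information characterizes user devices' demand for the data file; deploying the data files that are in high demand on the service devices relatively ensures the hit rate of user devices' accesses to data files on the service devices.
Step S12: Select, according to the status information, a target service device that meets the deployment requirement.
Because the status information characterizes a service device's capacity to carry deployed data files, while the deployment requirement characterizes user devices' access demand for the data file, the status information determines whether a service device can deploy the data file so as to satisfy that access demand. This step therefore selects, according to the status information, the target service devices that meet the deployment requirement. For example, if the access demand for a data file totals 100,000 requests per unit time, the deployment should ensure that the service devices can respond normally to 100,000 accesses to the data file by user terminals per unit time. It is therefore necessary to judge from the status information of a service device whether it is able to serve user terminals 100,000 times per unit time, and on that basis to select the target service devices that can meet the deployment requirement.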
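The selection in step S12 can be sketched as follows. This is only an illustration under stated assumptions, not the selection rule of the application: the `DeviceStatus` fields, the per-device capacity figure, and the greedy highest-capacity-first strategy are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class DeviceStatus:
    """Hypothetical status information for one service device."""
    device_id: str
    available: bool          # assumed field: result of liveness probing
    capacity_per_unit: int   # assumed field: requests served per unit time


def select_target_devices(statuses, required_requests):
    """Greedily pick available devices until their combined capacity
    covers the deployment requirement; return [] if it cannot be met."""
    candidates = sorted(
        (s for s in statuses if s.available),
        key=lambda s: s.capacity_per_unit,
        reverse=True,
    )
    chosen, covered = [], 0
    for status in candidates:
        if covered >= required_requests:
            break
        chosen.append(status.device_id)
        covered += status.capacity_per_unit
    return chosen if covered >= required_requests else []
```

With the 100,000-requests example from the text, two available devices with assumed capacities of 60,000 and 50,000 would together satisfy the requirement, while an unavailable device is never chosen regardless of its capacity.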
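Step S13: Deploy the data file to the target service device.
This step deploys the data file to the target service devices that meet its deployment requirement, thereby ensuring that the target service devices can provide the data file to user terminals normally.
Each step in this embodiment may be executed by a scheduling and deployment device. The scheduling device has a communication connection with the service devices, through which it can deploy data files to them.
The data processing method provided by this application first collects the work quality parameters of the service devices and analyzes them to generate status information for the service devices. It then obtains the file popularity information of a data file and calculates the deployment requirement of that data file from the file popularity information. Finally, it selects, according to the status information of the service devices, the target service devices that meet the deployment requirement and deploys the data file to them. Because the method first collects and analyzes the work quality parameters, producing status information that characterizes the overall working condition of the service devices, and then selects the target service devices on which the data file can be deployed according to that status information and performs the deployment, the data file is in effect deployed accurately to matching service devices according to their availability. This relatively ensures that the ability of the service devices to provide the data file is consistent with users' access demand for it.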
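The data flow through steps S10 to S13 can be summarized in a short sketch. The function name and all six callables are hypothetical placeholders supplied by the caller; the application does not prescribe any of these interfaces.

```python
def run_deployment_cycle(collect_params, analyze, get_heat,
                         compute_demand, select_targets, deploy):
    """Run steps S10-S13 once. Every argument is a caller-supplied
    callable (hypothetical interface, for illustration only)."""
    status = analyze(collect_params())        # S10: parameters -> status information
    demand = compute_demand(get_heat())       # S11: popularity -> deployment requirement
    targets = select_targets(status, demand)  # S12: choose target service devices
    for device in targets:                    # S13: deploy to each target
        deploy(device)
    return targets
```

Passing the stages in as callables keeps the sketch independent of how status information or popularity is actually represented.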
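As a preferred implementation, the work quality parameters may include a liveness quality parameter, a TCP connection quality parameter, and a hole-punching quality parameter. The status information includes service status information and file deployment status information of the service device.
The liveness quality parameter of a service device characterizes the reliability of the device's own operation. The TCP connection quality parameter characterizes the reliability of data file transmission between the service device and user devices, and so does the hole-punching quality parameter. Hole-punching connections and TCP connections are the two ways in which a service device and the SDK connect; the SDK is the software development kit embedded in the application on the user device, and hole punching refers to a UDP-based connection between the service device and the SDK. The liveness quality parameter may specifically include the communication success rate of the service device, or the probability that the service device suffers a resource deadlock or goes down; in other words, it reflects the probability that the service device sends and receives data normally. The TCP connection quality parameter may specifically include the data transmission efficiency and the correctness of the transmitted data over a TCP connection; the hole-punching quality parameter may specifically include the data transmission efficiency and the correctness of the transmitted data over a UDP connection. Together, the three parameters reflect the service device's service status information with respect to serving data files over the network and its file deployment status with respect to fetching and deploying data files locally. By targeting both of these capabilities, this implementation refines the specific content of the work quality parameters, which improves the accuracy of the status information and in turn the reliability of the target service devices selected from it.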
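One simple way to turn probe results into status information is to compute a success rate per channel. The sketch below assumes each probe record is a dict with the hypothetical keys `alive`, `tcp_ok`, and `punch_ok`; the application does not define a record format or an aggregation formula.

```python
def summarize_probe_results(probes):
    """Turn raw probe records into per-channel success rates.

    `probes` is a list of dicts with hypothetical boolean keys
    "alive", "tcp_ok" and "punch_ok" (one dict per probe round).
    """
    total = len(probes)
    if total == 0:
        # No probes yet: report zero quality rather than dividing by zero.
        return {"alive_rate": 0.0, "tcp_rate": 0.0, "punch_rate": 0.0}
    return {
        "alive_rate": sum(p["alive"] for p in probes) / total,
        "tcp_rate": sum(p["tcp_ok"] for p in probes) / total,
        "punch_rate": sum(p["punch_ok"] for p in probes) / total,
    }
```

A scheduler could then compare these rates against thresholds of its own choosing when deciding whether a device is a candidate for deployment.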
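FIG. 2 is a flowchart of another data processing method provided by an embodiment of this application. Referring to FIG. 2, the method comprises the following steps:
Step S20: Collect work quality parameters of service devices, and analyze the work quality parameters to generate status information.
Step S21: Receive file requests sent in real time by user devices.
Step S22: Count the file requests to generate the file popularity information of the data file.
It should be noted that the key point of this embodiment is that the file popularity information is generated by counting, in real time, the file requests sent by user devices. That is, in this embodiment the scheduling device has communication connections with both the user devices and the service devices. A user device sends the scheduling device a file request for a data file; on receiving the request, the scheduling device updates the access count of that data file, and thereby generates the file popularity information of the data file from the volume of file requests that user devices make for it. Because the file requests are received in real time, the file popularity information generated from them is highly up to date.
Step S23: Calculate the deployment requirement of the data file according to the file popularity information.
Step S24: Select, according to the status information, a target service device that meets the deployment requirement.
Step S25: Deploy the data file to the target service device.
In this embodiment, each time a user device sends a file request, the file popularity information of the corresponding data file is updated. The file popularity information is therefore updated in real time, and so is the deployment requirement calculated from it. This timeliness relatively improves the overall accuracy of data file deployment.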
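On the basis of the above embodiment, as a preferred implementation, counting the file requests comprises:
counting the file requests at the current moment and over a continuous period before the current moment.
It should be noted that the key point of this implementation is that the file requests at the current counting moment are counted together with those within a continuous period before it. In other words, this implementation counts the file requests over a period of time, which effectively takes into account the overall file-request situation over the historical period. This reveals how each data file has been accessed by user devices at the current moment and during the preceding period, and relatively avoids inaccurate file popularity information caused by popularity that changes sharply or is relatively dispersed over a short time. The file popularity information thus reflects the popularity of the data file relatively accurately, which further improves the accuracy of data file deployment.
Specifically, the number of requests for a data file may be counted with a statistics period on the order of seconds, and the popularity of the file evaluated from the request count of the current period together with the request counts of several preceding periods. For example, with a statistics period of 1 second, the popularity information of the data file is obtained from the number of accesses to it in the current 1 second and the number of requests for it in the preceding T seconds. T may be set as needed. In this way, changes in the popularity of data files can be detected at one-second granularity, improving the timeliness of data file deployment.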
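The per-second statistics described above can be sketched with one-second buckets in a sliding window. The class and method names are hypothetical, and the sketch assumes that requests are recorded in non-decreasing timestamp order; the application does not specify a data structure.

```python
from collections import defaultdict, deque


class SlidingHeatCounter:
    """Counts requests per file in one-second buckets and reports the total
    over the current second plus the previous `history_seconds` seconds (T)."""

    def __init__(self, history_seconds):
        self.history_seconds = history_seconds
        # file_id -> deque of [second, count] buckets, oldest first
        self._buckets = defaultdict(deque)

    def record(self, file_id, timestamp):
        """Record one request; timestamps are assumed to be non-decreasing."""
        second = int(timestamp)
        buckets = self._buckets[file_id]
        if buckets and buckets[-1][0] == second:
            buckets[-1][1] += 1
        else:
            buckets.append([second, 1])

    def heat(self, file_id, now):
        """Return the request count in the window [now - T, now], in whole seconds."""
        current = int(now)
        buckets = self._buckets[file_id]
        # Drop buckets that have fallen out of the window.
        while buckets and buckets[0][0] < current - self.history_seconds:
            buckets.popleft()
        return sum(count for second, count in buckets if second <= current)
```

Keeping several counters with different `history_seconds` values would be one way to approximate multiple aggregation periods, at the cost of extra memory per file.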
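Further, as a preferred implementation, deploying the data file to the target service device comprises:
deploying the data file to the target service device according to the access priority order of the data files.
It should be noted that this implementation takes into account that data files often differ in importance, where the importance of a data file means the likelihood that users will access it. For example, when users watch video, they usually start from the beginning of the video track but do not necessarily watch the whole video. So when a video file is divided into multiple data files, the data files for the beginning of the video are more important than those for the end, and it should first be ensured that the data files for the beginning of the video can be served to users efficiently.
User devices need to be able to access the more important data files in the shortest time. The importance of a data file within the overall set of data files should be proportional to its priority within that set. For example, if data files A, B, and C are ranked by importance with A the most important, B less important than A, and C less important than B, then A has the highest priority, B a lower priority than A, and C a lower priority than B. In this implementation, an access priority order is set in advance for the data files, and the data files are deployed to the target service devices in that order, from highest priority to lowest. Higher-priority data files therefore become accessible to user devices first, which relatively ensures the efficiency with which the service devices serve data files.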
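Deploying in priority order amounts to sorting the pending files before they are sent out. A minimal sketch, assuming priorities are plain numbers where a larger value means a higher access priority (the application does not fix a representation):

```python
def order_for_deployment(priorities):
    """Return file IDs sorted so that higher-priority files come first.

    `priorities` maps a file ID to a numeric priority; larger means higher
    (an assumption of this sketch). Ties keep their original order.
    """
    return sorted(priorities, key=lambda file_id: priorities[file_id], reverse=True)
```

For the A, B, C example in the text, assigning priorities 3, 2, and 1 yields the deployment order A, B, C.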
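FIG. 3 is a flowchart of another data processing method provided by an embodiment of this application. Referring to FIG. 3, the method comprises the following steps:
Step S30: Collect work quality parameters of service devices, and analyze the work quality parameters to generate status information.
Step S31: Receive traffic feature data of the data file reported by user devices; the traffic feature data includes traffic fluctuation information of the data file and correlation information between data files.
It should be noted that this embodiment computes the deployment requirement offline. The deployment device receives the traffic feature data about data files reported by user devices. The traffic fluctuation information describes how the overall access volume of a data file by user devices changes over a period of time; the correlation information between data files describes subordination relationships between different data files or associations between their contents.
Step S32: Input the traffic feature data, as the file popularity information, into a preset feature-matching deployment model to calculate the deployment requirement of the data file.
After the traffic feature data reported by user devices is received, it is input, as the file popularity information, into the preset feature-matching deployment model to calculate the deployment requirement of the data file.
It should be noted that the feature-matching deployment model in this embodiment may represent the correspondence between deployment cost and deployment benefit. The purpose of inputting the traffic feature data into the model as the file popularity information is to calculate, with the traffic feature data as the basis for generating the deployment requirement, the corresponding deployment cost and deployment benefit, and to select the deployment requirement for which the deployment cost and deployment benefit fall within a certain threshold range.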
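The cost-benefit selection can be illustrated with a deliberately simple stand-in for the feature-matching deployment model. Everything here is an assumption made for the example: the linear cost per copy, the linear benefit per served request, the fixed per-copy capacity, and the rule of returning the smallest copy count that satisfies both thresholds. The application does not disclose the model's actual form.

```python
def choose_copy_count(expected_requests, per_copy_capacity, cost_per_copy,
                      benefit_per_request, max_cost, min_benefit, max_copies=100):
    """Return the smallest number of copies whose cost stays within `max_cost`
    and whose benefit reaches `min_benefit`, or None if no count qualifies.
    All parameters are hypothetical inputs to a toy linear model."""
    for copies in range(1, max_copies + 1):
        cost = copies * cost_per_copy
        if cost > max_cost:
            return None  # cost only grows with more copies, so stop searching
        served = min(expected_requests, copies * per_copy_capacity)
        benefit = served * benefit_per_request
        if benefit >= min_benefit:
            return copies
    return None
```

In a real system the expected request volume would come from the reported traffic feature data, and the cost and benefit curves would be fitted rather than assumed linear.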
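Step S33: Select, according to the status information, a target service device that meets the deployment requirement.
Step S34: Deploy the data file to the target service device.
In this embodiment, the scheduling device receives the traffic feature data generated and reported by user devices and analyzes it with the feature-matching deployment model to generate the corresponding deployment requirement. By generating deployment requirements offline, this embodiment makes it possible to deploy in advance certain data files that are highly popular or strongly correlated, further ensuring the accuracy of data file deployment.
Further, as a preferred implementation, calculating the deployment requirement of the data file according to the file popularity information comprises:
calculating the deployment requirement of the data file from the file popularity information in a weighted manner.
This implementation calculates the deployment requirement of the data file from the file popularity information in a weighted manner. Weights can be set according to the importance of each piece of file popularity information when the deployment requirement is calculated, which makes the calculation more flexible and further ensures the accuracy of data file deployment.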
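A weighted calculation of this kind can be sketched as a weighted sum of popularity signals converted into a copy count. The signal names, the weights, and the `requests_per_copy` conversion factor are hypothetical; the application only states that weighting is used.

```python
import math


def weighted_deployment_demand(heat_signals, weights, requests_per_copy):
    """Combine several popularity signals into a number of copies to deploy.

    `heat_signals` and `weights` are dicts keyed by hypothetical signal names;
    `requests_per_copy` is the assumed load one deployed copy can absorb.
    """
    score = sum(weights[name] * value for name, value in heat_signals.items())
    return math.ceil(score / requests_per_copy)
```

For instance, weighting a current-second count of 120 and a longer-window average of 80 equally gives a score of 100, which at 40 requests per copy rounds up to 3 copies.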
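To deepen understanding of the technical solution of this application, a scenario embodiment for a specific scenario is described below:
The architecture of the scheduling and deployment system in the specific scenario is shown in FIG. 4.
The architecture of the scheduling and deployment system relies on:
Data platform: responsible for data reporting and log statistics for the SDK (user devices) and Peers (service devices), as well as for the scheduling and deployment system, the index and hole-punching systems, and so on;
Data statistics: on top of the data platform, compiles statistics for each system and module;
Data analysis: preliminary analysis of routine metrics and traffic data;
Data visualization: visual presentation of the statistical and analytical data;
Monitoring and alerting: configures monitoring and alerting policies for the relevant metrics, and pushes notices and alerts;
Operations and maintenance console: includes O&M management platforms such as staged release and contingency plan management;
Business operations console: includes business parameter configuration, manual triggering of deployment for the services provided, viewing of business-related data such as traffic, and Peer resource operations;
Billing: a dedicated billing module responsible for compiling cost-benefit data;
Login: user management platform;
Index: index maintenance;
Hole punching: assists P2P hole punching;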
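The scheduling and deployment system consists of 3 main parts:
Service device resource management:
Peer information collection: collects the Peer information relevant to scheduling and to the deployment index;
Peer information statistics: aggregates and compiles statistics on Peer information;
Control information delivery: delivers the control commands of the scheduling and deployment systems to Peers, for example index deletion control and Peer self-protection parameter control;
Dial-test liveness probing: periodically probes each Peer's liveness quality, TCP connection quality, hole-punching quality, and so on;
Peer information analysis: evaluates each Peer's sharing capability, service quality, and download and deployment capability, to enable fine-grained scheduling and deployment;
Online scheduling and deployment system:
P2P scheduling: the SDK requests from the scheduling module, on demand, information on Peers that meet its needs, so that the SDK can enter P2P sharing mode at a suitable time;
Popularity information collection: collects the request records of the scheduling module;
Popularity information aggregation: maintains request information in a sliding time window at one-second granularity, with multiple traffic aggregation periods, aggregating in real time the request records of a preceding continuous period. File deployment demand that meets the popularity threshold is thereby detected within seconds, while the multiple aggregation periods ensure that deployment demand with more dispersed popularity is not missed;
Deployment requirement calculation: calculates the current node-selection requirement for a file from its real-time popularity, weighted by a redundancy coefficient;
Node selection for deployment: selects nodes based on Peer information using a combination of strategies, for example by disk space, by sharing capability, by feature matching, or by configured weight;
Deployment task delivery: delivers pending deployment task information to Peers, both in real time and periodically, after sorting by priority;
Index/cache management: builds the index from Peer deployment success and failure information; handles Peer cache deletion requests; and manages index and cache deletion from a global view according to disk space;
Deployment resource management: controls the cost of back-to-origin deployment from the CDN, using P2P deployment resources as far as possible without affecting Peer sharing and deployment efficiency;
Business traffic collection: collects the business traffic feature data reported by the SDK;
Business traffic statistics: compiles statistics on the business traffic feature data reported by the SDK;
Traffic feature analysis: analyzes traffic features, including statistics on the traffic fluctuation of all files, correlations between files, and Peer usage, in order to set deployment parameters such as the redundancy coefficient, generate correlation feature data for pre-deployment files, and train the SDK-Peer feature-matching model used for node selection in deployment and scheduling;
Pre-deployment demand management: based on traffic correlation features, consolidates the deployment demand of files that satisfy the correlation features (such a file may not yet meet the popularity threshold for deployment, but its correlated file already meets the high-popularity threshold; the file is then considered likely to become popular soon and can be pre-deployed with reference to the number of copies deployed for the corresponding hot file; the deployment redundancy depends on the correlation: the stronger the correlation, the smaller the discount applied to the number of copies);
Deployment parameter training: based on data such as traffic features and Peer usage, controls the adaptive adjustment of the deployment system's redundancy coefficient (overall business traffic model and cost control), and controls the feature-matching rules used in node selection for scheduling and deployment;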
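Service device resource management: as the types of service devices multiply, the differences in service between them keep growing, and the problems caused by a one-size-fits-all ("equal treatment") scheduling and deployment strategy, such as unbalanced traffic across service devices and low resource utilization, are becoming increasingly serious;
In addition to routine functions such as maintaining each service device's region and carrier, type, disk, and bandwidth information, service device resource management adds the following to the framework:
Reverse dial-test liveness probing: ensures Peer availability and evaluates each Peer's overall service capability and service quality, so that scheduling and deployment can be fine-grained based on both;
Centralized control of caching and self-protection: scheduling is aware of the global service situation and controls Peer cache cleanup and sharing upper limits;
Online deployment system:
The main goal of the online deployment system is to ensure that files worth deploying are deployed to the target Peers promptly and quickly;
The updated popularity aggregation method supports multiple aggregation periods (to detect file deployment demand with more dispersed popularity) and second-level response (aggregating backward at one-second granularity);
More reasonable coordination of HTTP and P2P deployment, balancing cost and deployment efficiency;
Offline deployment system: an offline deployment system has been added;
To adapt to current video-on-demand services, whose content is diverse and lacks concentrated and sustained popularity, the system analyzes the correlations between business requests and deploys in advance certain files that are relatively hot or strongly correlated, thereby increasing the sharing ratio. Based on traffic features, it weighs quality against cost and adaptively adjusts the deployment control parameters; it further analyzes the request traffic features of the business and, in combination with a cost-benefit model, controls the deployment redundancy coefficient and so on.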
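The pre-deployment rule from the scenario (a correlated file borrows the hot file's copy count, discounted less as the correlation grows) can be sketched as below. Treating the correlation as a number between 0 and 1 and scaling the copy count linearly by it are assumptions of this example; the application states only the direction of the relationship.

```python
import math


def predeploy_copies(hot_file_copies, correlation):
    """Number of copies to pre-deploy for a file correlated with a hot file.

    `correlation` is assumed to lie in [0, 1]; the linear scaling is a
    hypothetical choice: stronger correlation means a smaller discount.
    """
    if not 0.0 <= correlation <= 1.0:
        raise ValueError("correlation must be between 0 and 1")
    return math.ceil(hot_file_copies * correlation)
```

With a hot file deployed in 10 copies, a file with correlation 0.5 would be pre-deployed in 5 copies under this rule, and a fully correlated file in all 10.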
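The embodiments of the data processing method have been described in detail above. This application also provides a data processing device corresponding to the method. Because the embodiments of the device correspond to the embodiments of the method, the description of the device embodiments is not repeated here.
FIG. 5 is a structural diagram of a data processing device provided by an embodiment of this application.
Referring to FIG. 5, a data processing device 1 provided by an embodiment of this application includes a memory 11, a processor 12, and a bus 13. The memory 11 stores a data processing program that can be transferred over the bus 13 to the processor 12 and run on the processor 12; when executed by the processor 12, the data processing program implements the data processing method described above.
The memory 11 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, magnetic disk, optical disc, and so on. In some embodiments the memory 11 may be an internal storage unit of the data processing device 1, such as its hard disk. In other embodiments the memory 11 may be an external storage device of the data processing device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the data processing device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the data processing device 1. The memory 11 can be used not only to store the application software installed on the data processing device 1 and various kinds of data, such as the code of a video transcoding program, but also to temporarily store data that has been output or is to be output.
In some embodiments the processor 12 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, used to run program code stored in the memory 11 or to process data, for example to execute a video transcoding program.
The bus 13 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration it is drawn as a single thick line in FIG. 5, but this does not mean that there is only one bus or one type of bus.
The data processing device provided by this application first collects the work quality parameters of the service devices and analyzes them to generate status information for the service devices. It then obtains the file popularity information of a data file and calculates the deployment requirement of that data file from the file popularity information. Finally, it selects, according to the status information of the service devices, the target service devices that meet the deployment requirement and deploys the data file to them. Because the device first collects and analyzes the work quality parameters, producing status information that characterizes the overall working condition of the service devices, and then selects the target service devices on which the data file can be deployed according to that status information and performs the deployment, the data file is in effect deployed accurately to matching service devices according to their availability. This relatively ensures that the ability of the service devices to provide the data file is consistent with users' access demand for it.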
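This application also provides a data processing system, comprising:
a status statistics unit, configured to collect work quality parameters of service devices and analyze the work quality parameters to generate status information;
a demand calculation unit, configured to obtain file popularity information of a data file and calculate a deployment requirement of the data file according to the file popularity information;
a device selection module, configured to select, according to the status information, a target service device that meets the deployment requirement;
a data distribution module, configured to deploy the data file to the target service device.
The data processing system provided by this application first collects the work quality parameters of the service devices and analyzes them to generate status information for the service devices. It then obtains the file popularity information of a data file and calculates the deployment requirement of that data file from the file popularity information. Finally, it selects, according to the status information of the service devices, the target service devices that meet the deployment requirement and deploys the data file to them. Because the system first collects and analyzes the work quality parameters, producing status information that characterizes the overall working condition of the service devices, and then selects the target service devices on which the data file can be deployed according to that status information and performs the deployment, the data file is in effect deployed accurately to matching service devices according to their availability. This relatively ensures that the ability of the service devices to provide the data file is consistent with users' access demand for it.
In addition, this application provides a computer-readable storage medium storing a data processing program that can be executed by one or more processors to implement the data processing method described above.
The computer-readable storage medium provided by this application first collects the work quality parameters of the service devices and analyzes them to generate status information for the service devices. It then obtains the file popularity information of a data file and calculates the deployment requirement of that data file from the file popularity information. Finally, it selects, according to the status information of the service devices, the target service devices that meet the deployment requirement and deploys the data file to them. Because the computer-readable storage medium first collects and analyzes the work quality parameters, producing status information that characterizes the overall working condition of the service devices, and then selects the target service devices on which the data file can be deployed according to that status information and performs the deployment, the data file is in effect deployed accurately to matching service devices according to their availability. This relatively ensures that the ability of the service devices to provide the data file is consistent with users' access demand for it.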
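The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and modules described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into modules is only a division by logical function, and other divisions are possible in actual implementation; for instance, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or modules, and may be electrical, mechanical, or of other forms.
Modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed across multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
It should be noted that the serial numbers of the embodiments above are for description only and do not indicate the relative merits of the embodiments. The terms "include", "comprise", and any variants thereof herein are intended to cover non-exclusive inclusion, so that a process, device, article, or method that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, device, article, or method. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, device, article, or method that includes the element.
The above are only preferred embodiments of this application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of this application.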

Claims (10)

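  1. A data processing method, characterized by comprising:
    collecting work quality parameters of service devices, and analyzing the work quality parameters to generate status information;
    obtaining file popularity information of a data file, and calculating a deployment requirement of the data file according to the file popularity information;
    selecting, according to the status information, a target service device that meets the deployment requirement;
    deploying the data file to the target service device.
  2. The data processing method according to claim 1, characterized in that the work quality parameters include a liveness quality parameter, a TCP connection quality parameter, and a hole-punching quality parameter;
    the status information includes service status information and file deployment status information of the service device.
  3. The data processing method according to claim 1, characterized in that obtaining the file popularity information of the data file comprises:
    receiving file requests sent in real time by user devices;
    counting the file requests to generate the file popularity information of the data file.
  4. The data processing method according to claim 3, characterized in that counting the file requests comprises:
    counting the file requests at the current moment and over a continuous period before the current moment.
  5. The data processing method according to claim 4, characterized in that deploying the data file to the target service device comprises:
    deploying the data file to the target service device according to the access priority order of the data files.
  6. The data processing method according to claim 1, characterized in that obtaining the file popularity information of the data file and calculating the deployment requirement of the data file according to the file popularity information comprises:
    receiving traffic feature data of the data file reported by user devices, the traffic feature data including traffic fluctuation information of the data file and correlation information between data files;
    inputting the traffic feature data, as the file popularity information, into a preset feature-matching deployment model to calculate the deployment requirement of the data file.
  7. The data processing method according to any one of claims 1 to 6, characterized in that calculating the deployment requirement of the data file according to the file popularity information comprises:
    calculating the deployment requirement of the data file from the file popularity information in a weighted manner.
  8. A data processing device, characterized in that the device comprises a memory, a processor, and a bus, the memory stores a data processing program that can be transferred over the bus to the processor and run on the processor, and the data processing program, when executed by the processor, implements the data processing method according to any one of claims 1 to 7.
  9. A data processing system, characterized in that the system comprises:
    a status statistics unit, configured to collect work quality parameters of service devices and analyze the work quality parameters to generate status information;
    a demand calculation unit, configured to obtain file popularity information of a data file and calculate a deployment requirement of the data file according to the file popularity information;
    a device selection module, configured to select, according to the status information, a target service device that meets the deployment requirement;
    a data distribution module, configured to deploy the data file to the target service device.
  10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a data processing program that can be executed by one or more processors to implement the data processing method according to any one of claims 1 to 7.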
PCT/CN2020/090810 2019-09-18 2020-05-18 Data processing method, device, system, and storage medium WO2021051839A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910883498.3 2019-09-18
CN201910883498.3A CN110460682A (zh) 2019-09-18 2019-09-18 Data processing method, device, system, and storage medium

Publications (1)

Publication Number Publication Date
WO2021051839A1 true WO2021051839A1 (zh) 2021-03-25

Family

ID=68492402

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/090810 WO2021051839A1 (zh) 2019-09-18 2020-05-18 Data processing method, device, system, and storage medium

Country Status (2)

Country Link
CN (1) CN110460682A (zh)
WO (1) WO2021051839A1 (zh)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
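CN110460682A (zh) * 2019-09-18 2019-11-15 深圳市网心科技有限公司 Data processing method, device, system, and storage medium
CN113312329B (zh) * 2020-02-26 2024-03-01 阿里巴巴集团控股有限公司 Data file scheduling method, apparatus, and device
CN111770180B (zh) * 2020-06-29 2023-06-30 百度在线网络技术(北京)有限公司 Deployment method, apparatus, device, and storage medium
CN114244903B (zh) * 2021-11-01 2024-05-28 网宿科技股份有限公司 Resource scheduling method, system, server, and storage medium
CN114881802B (zh) * 2022-07-11 2022-10-04 湖南三湘银行股份有限公司 Metadata-based data asset management method and system
CN116170476B (zh) * 2023-04-20 2023-11-03 北京华医网科技股份有限公司 Cloud service information management method and platform for continuing medical education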

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102687112A (zh) * 2009-11-03 2012-09-19 皮斯佩斯有限公司 Apparatus and method for managing files in a distributed storage system
US20150112951A1 (en) * 2013-10-23 2015-04-23 Netapp, Inc. Data management in distributed file systems
US20160173620A1 (en) * 2014-12-11 2016-06-16 International Business Machines Corporation Time-based data placement in a distributed storage system
CN105049268A (zh) * 2015-08-28 2015-11-11 东方网力科技股份有限公司 Distributed computing resource allocation system and task processing method
CN107302561A (zh) * 2017-05-23 2017-10-27 南京邮电大学 Method for placing hot-data replicas in a cloud storage system
CN110035306A (zh) * 2019-04-23 2019-07-19 深圳市网心科技有限公司 File deployment method and apparatus, and scheduling method and apparatus
CN110460682A (zh) * 2019-09-18 2019-11-15 深圳市网心科技有限公司 Data processing method, device, system, and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113485973A (zh) * 2021-07-02 2021-10-08 中国联合网络通信集团有限公司 Data synchronization method and apparatus
CN113485973B (zh) * 2021-07-02 2023-05-16 中国联合网络通信集团有限公司 Data synchronization method and apparatus

Also Published As

Publication number Publication date
CN110460682A (zh) 2019-11-15 Data processing method, device, system, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20866423

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 20/07/2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20866423

Country of ref document: EP

Kind code of ref document: A1