CN110955642A - Data acquisition optimization method, device and equipment and readable storage medium - Google Patents

Publication number
CN110955642A
Authority
CN
China
Prior art keywords
user behavior
behavior data
storage space
data
cloud storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910968760.4A
Other languages
Chinese (zh)
Inventor
任熊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910968760.4A priority Critical patent/CN110955642A/en
Publication of CN110955642A publication Critical patent/CN110955642A/en
Priority to PCT/CN2020/099365 priority patent/WO2021068568A1/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004 Server selection for load balancing
    • H04L67/1008 Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004 Server selection for load balancing
    • H04L67/1017 Server selection for load balancing based on a round robin mechanism
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Debugging And Monitoring (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of big data and discloses a data acquisition optimization method comprising the following steps: sending a user behavior log file at regular intervals to a sub-cloud storage space of a cloud storage space used for collecting data; traversing the user behavior data in the user behavior log file of the sub-cloud storage space, and counting the storage space occupied by the traversed user behavior data to obtain the size of the storage space occupied by the user behavior data; calculating the variance of the size of the storage space occupied by the user behavior data; judging whether the variance is greater than a preset threshold; and if so, adjusting the data collection frequency of the sub-cloud storage space using a weighted polling algorithm until the variance is less than or equal to the preset threshold. The invention also discloses a data acquisition optimization device, equipment and a computer-readable storage medium. The data acquisition optimization method provided by the invention addresses the technical problem of low storage space utilization in the prior art and improves the utilization rate of the storage space.

Description

Data acquisition optimization method, device and equipment and readable storage medium
Technical Field
The invention relates to the technical field of big data, in particular to a data acquisition optimization method, a data acquisition optimization device, data acquisition optimization equipment and a computer-readable storage medium.
Background
At present, with the rapid development of computer technology, people have entered the information age, and information and data storage become important parts of people's daily life. The data storage capacity of enterprises and personal users is greatly increased, and storage space resources are greatly occupied. How to optimize the data acquisition process to solve the technical problem of low utilization rate of storage space resources is a problem to be urgently solved by technical personnel in the field at present.
Disclosure of Invention
The invention mainly aims to provide a data acquisition optimization method, a data acquisition optimization device, data acquisition optimization equipment and a computer-readable storage medium, and aims to solve the technical problem of low utilization rate of storage space resources.
In order to achieve the above object, the present invention provides a data acquisition optimization method, which comprises the following steps:
executing a Linux Shell script through a cron of Linux, and regularly sending a user behavior log file to a sub-cloud storage space of a cloud storage space for collecting data, wherein the cloud storage space comprises a plurality of sub-cloud storage spaces;
traversing user behavior data in the user behavior log file of the child cloud storage space, and counting the storage space which can be occupied by the traversed user behavior data to obtain the size of the storage space occupied by the user behavior data;
calculating the variance of the size of the storage space occupied by the user behavior data through the following formula;
$$V(X) = \frac{1}{n}\sum_{t=1}^{n}\bigl(X(t) - \mu\bigr)^{2}$$
where μ is the mean size of the sub-cloud storage space used to store user behavior data X, V(X) is the variance, X(t) is the size of the storage space occupied by the user behavior data, t is the identifier of different user behavior data, and n is the number of user behavior data;
judging whether the variance is larger than a preset threshold value or not;
if the variance is larger than a preset threshold value, adjusting the data collection frequency of the sub-cloud storage space by adopting a weighted polling algorithm until the variance is smaller than or equal to the preset threshold value, if the variance is smaller than or equal to the preset threshold value, executing a Linux Shell script through a cron of Linux, and regularly sending the user behavior log file to the sub-cloud storage space of the cloud storage space for collecting data.
Optionally, after the step of executing the Linux Shell script through the cron of Linux and periodically sending the user behavior log file to the child cloud storage space of the cloud storage space for collecting data, the method further includes the following steps:
monitoring a user behavior log file in the cloud storage space in real time through a Flume plug-in, and collecting user behavior data in the user behavior log file;
and storing the user behavior data in real time through a distributed file system.
Optionally, before the step of traversing the user behavior data in the user behavior log file of the child cloud storage space, and counting the storage space that can be occupied by the traversed user behavior data to obtain the size of the storage space that is occupied by the user behavior data, the method further includes the following steps:
setting different identifications for the user behavior log files of the child cloud storage space to obtain user behavior log files with the identifications set;
traversing the user behavior data in the user behavior log file with the set identifier through a binary search tree, and counting the traversed user behavior data to obtain the number of the user behavior data corresponding to different identifiers.
Optionally, after the step of executing the Linux Shell script through the cron of Linux and periodically sending the user behavior log file to the child cloud storage space of the cloud storage space for collecting data, the method further includes the following steps:
judging whether an acquisition request of user behavior data exists or not;
if the request for acquiring the user behavior data exists at present, acquiring the user behavior data from a cloud storage space, and judging whether redundant user behavior data exists in the user behavior data;
if the user behavior data has redundant user behavior data, clearing the redundant user behavior data existing in the user behavior data through a preset redundancy strategy to obtain the user behavior data after clearing the redundant user behavior data, and if the user behavior data does not have the redundant user behavior data, not processing the user behavior data.
Optionally, if there is a request for acquiring user behavior data currently, acquiring the user behavior data from a cloud storage space, and determining whether redundant user behavior data exists in the user behavior data includes the following steps:
if the request for acquiring the user behavior data exists at present, acquiring the user behavior data from a cloud storage space through a Flume plug-in and monitoring the user behavior data to obtain a monitoring result;
and comparing a preset monitoring index with the monitoring result to judge whether redundant user behavior data exist in the user behavior data, wherein the redundant user behavior data are the user behavior data exceeding the monitoring index.
Optionally, the step of clearing, if redundant user behavior data exists in the user behavior data, the redundant user behavior data through a preset redundancy policy to obtain the user behavior data with the redundant user behavior data cleared, and not processing the user behavior data if no redundant user behavior data exists in it, includes the following steps:
if the redundant user behavior data exists in the user behavior data, judging whether the redundant user behavior data exists in the user behavior data of the data collection end;
if the redundant user behavior data exists in the user behavior data of the data collection end, the redundant user behavior data existing in the user behavior data is eliminated through a mean shift algorithm to obtain the user behavior data after the redundant user behavior data is eliminated, and if the redundant user behavior data does not exist in the user behavior data of the data collection end, the user behavior data is not processed.
Further, in order to achieve the above object, the present invention further provides a data acquisition optimization apparatus, including:
the sending module is used for executing a Linux Shell script through a cron of Linux and sending a user behavior log file at regular intervals to a sub-cloud storage space of a cloud storage space for collecting data, wherein the cloud storage space comprises a plurality of sub-cloud storage spaces;
the first traversal module is used for traversing the user behavior data in the user behavior log file of the child cloud storage space, and counting the storage space which can be occupied by the traversed user behavior data to obtain the size of the storage space occupied by the user behavior data;
the calculation module is used for calculating the variance of the size of the storage space occupied by the user behavior data through the following formula;
$$V(X) = \frac{1}{n}\sum_{t=1}^{n}\bigl(X(t) - \mu\bigr)^{2}$$
where μ is the mean size of the sub-cloud storage space used to store user behavior data X, V(X) is the variance, X(t) is the size of the storage space occupied by the user behavior data, t is the identifier of different user behavior data, and n is the number of user behavior data;
the first judgment module is used for judging whether the variance is larger than a preset threshold value or not;
and the adjusting module is used for adjusting the data collection frequency of the sub-cloud storage space by adopting a weighted polling algorithm if the variance is greater than a preset threshold value until the variance is less than or equal to the preset threshold value, executing a Linux Shell script through a cron of Linux if the variance is less than or equal to the preset threshold value, and sending the user behavior log file to the sub-cloud storage space of the cloud storage space for collecting data at regular time.
Optionally, the data acquisition optimization apparatus further includes the following modules:
the monitoring acquisition module is used for monitoring the user behavior log file in the cloud storage space in real time through a Flume plug-in and acquiring user behavior data in the user behavior log file;
and the storage module is used for storing the user behavior data in real time through a distributed file system.
Optionally, the data acquisition optimization apparatus further includes the following modules:
the setting module is used for setting different identifications for the user behavior log files of the sub-cloud storage space to obtain the user behavior log files with the identifications set;
and the second traversal module is used for traversing the user behavior data in the user behavior log file with the set identifier through a binary search tree, and counting the traversed user behavior data to obtain the number of the user behavior data corresponding to different identifiers.
Optionally, the data acquisition optimization apparatus further includes the following modules:
the second judgment module is used for judging whether an acquisition request of user behavior data exists or not;
the third judgment module is used for acquiring the user behavior data from the cloud storage space if the acquisition request of the user behavior data currently exists, and judging whether redundant user behavior data exists in the user behavior data;
and the clearing module is used for clearing the redundant user behavior data in the user behavior data through a preset redundancy strategy to obtain the user behavior data after clearing the redundant user behavior data if the redundant user behavior data exists in the user behavior data, and not processing the user behavior data if the redundant user behavior data does not exist in the user behavior data of the data collection end.
Optionally, the third determining module includes the following units:
the monitoring unit is used for acquiring the user behavior data from the cloud storage space through a Flume plug-in and monitoring the user behavior data to obtain a monitoring result if the request for acquiring the user behavior data exists at present;
and the first judgment unit is used for judging whether redundant user behavior data exist in the user behavior data or not by comparing a preset monitoring index with the monitoring result, wherein the redundant user behavior data are the user behavior data exceeding the monitoring index.
Optionally, the purge module comprises the following units:
the second judgment unit is used for judging whether redundant user behavior data exist in the user behavior data of the data collection end or not if the redundant user behavior data exist in the user behavior data;
and the clearing unit is used for clearing the redundant user behavior data in the user behavior data through a mean shift algorithm if the redundant user behavior data exists in the user behavior data of the data collection end to obtain the user behavior data after clearing the redundant user behavior data, and not processing the user behavior data if the redundant user behavior data does not exist in the user behavior data of the data collection end.
Further, in order to achieve the above object, the present invention also provides data acquisition optimization equipment, which includes a memory, a processor, and a data acquisition optimization program stored in the memory and executable on the processor, wherein when the data acquisition optimization program is executed by the processor, the steps of any one of the data acquisition optimization methods described above are implemented.
Further, to achieve the above object, the present invention also provides a computer-readable storage medium on which a data acquisition optimization program is stored, which, when executed by a processor, implements the steps of the data acquisition optimization method according to any one of the above items.
The invention has the beneficial effects that: the invention aims to solve the technical problem of low utilization rate of storage space in the prior art. A data acquisition optimization method is provided. The realization process of the invention is as follows: the data collection frequency of the sub-cloud storage spaces is adjusted by adopting a weighted polling algorithm, so that load balance among the sub-cloud storage spaces is realized, the storage space resources are saved, redundant data are eliminated by adopting a mean shift algorithm, the redundant data are prevented from occupying the storage resources, and the utilization rate of the storage space is improved.
Drawings
FIG. 1 is a schematic structural diagram of an operating environment of a data acquisition optimization device according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a first embodiment of the data acquisition optimization method of the present invention;
FIG. 3 is a schematic flow chart of a data acquisition optimization method according to a second embodiment of the present invention;
FIG. 4 is a schematic flow chart of a data acquisition optimization method according to a third embodiment of the present invention;
FIG. 5 is a detailed flowchart of step S80 in FIG. 4;
FIG. 6 is a detailed flowchart of step S90 in FIG. 4;
FIG. 7 is a schematic flow chart of a data acquisition optimization method according to a fourth embodiment of the present invention;
FIG. 8 is a functional block diagram of a first embodiment of the data collection optimization device of the present invention;
FIG. 9 is a schematic functional block diagram of a data acquisition optimizing apparatus according to a second embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The invention provides a data acquisition optimization device.
Referring to fig. 1, fig. 1 is a schematic structural diagram of an operating environment of a data acquisition optimization device according to an embodiment of the present invention.
As shown in fig. 1, the data acquisition optimization apparatus includes: a processor 1001, such as a CPU, a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 enables communication among these components. The user interface 1003 may include a display and an input unit such as a keyboard. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the hardware configuration of the data acquisition optimization device shown in fig. 1 does not constitute a limitation of the data acquisition optimization device, and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, the memory 1005, which is a kind of computer-readable storage medium, may include therein an operating system, a network communication module, a user interface module, and a data collection optimization program. The operating system is a program for managing and controlling the data acquisition optimization device and software resources, and supports the operation of the data acquisition optimization program and other software and/or programs.
In the hardware structure of the data acquisition optimization device shown in fig. 1, the network interface 1004 is mainly used for accessing a network; the user interface 1003 is mainly used for detecting a confirmation instruction, an editing instruction, and the like. And the processor 1001 may be configured to invoke the data collection optimization program stored in the memory 1005 and perform the operations of the various embodiments of the data collection optimization method below.
Based on the hardware structure of the data acquisition optimization equipment, the data acquisition optimization method provided by the invention has various embodiments.
Referring to fig. 2, fig. 2 is a schematic flow chart of a data acquisition optimization method according to a first embodiment of the present invention. In this embodiment, the data acquisition optimization method includes the following steps:
Step S10, executing a Linux Shell script through a cron of Linux, and sending a user behavior log file at regular intervals to a sub-cloud storage space of a cloud storage space for collecting data, wherein the cloud storage space comprises a plurality of sub-cloud storage spaces;
In this embodiment, multiple commands and control statements are set in the Shell script, such as the conditional control statements if and else and the loop control statements for and select. The commands in a Shell script are executed in a single run and do not continuously return information to the user. Taking advantage of this characteristic, the user behavior log file is sent at regular intervals to a sub-cloud storage space of the cloud storage space for data collection. Before this, the cloud storage space needs to be divided into a plurality of sub-cloud storage spaces for storing different user behavior log files.
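As a rough illustration of step S10, the following Python sketch shows the kind of work a cron-triggered script might do: copying one log file into a directory that stands in for a sub-cloud storage space. The function name `send_log_file`, the directory layout, and the crontab line in the comment are hypothetical; the patent itself describes a Linux Shell script and does not give its contents.

```python
import shutil
import tempfile
from pathlib import Path


def send_log_file(log_file: Path, sub_space: Path) -> Path:
    """Copy one user behavior log file into a sub-cloud storage directory.

    `sub_space` is a local directory used here as a stand-in for a
    sub-cloud storage space (an assumption made for illustration).
    """
    sub_space.mkdir(parents=True, exist_ok=True)
    target = sub_space / log_file.name
    shutil.copy2(log_file, target)  # copies file content and metadata
    return target


# Hypothetical crontab entry that would trigger a wrapper every 10 minutes:
#   */10 * * * * /usr/bin/python3 /opt/collect/send_logs.py

# Small demonstration using a temporary directory.
workdir = Path(tempfile.mkdtemp())
source = workdir / "behavior.log"
source.write_text("user=1,action=click\n")
copied = send_log_file(source, workdir / "sub_space_1")
```

The schedule itself stays in cron; the script only performs one transfer per invocation, which matches the run-once behavior described for the Shell script.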
Step S20, traversing the user behavior data in the user behavior log file of the child cloud storage space, and counting the storage space occupied by the traversed user behavior data to obtain the size of the storage space occupied by the user behavior data;
In this embodiment, in order to fully utilize the storage resources of each sub-cloud storage space and avoid wasting resources, the quantity of user behavior data in the user behavior log file sent to the sub-cloud storage space is obtained in real time. The user behavior data in the user behavior log file is traversed, and the traversed user behavior data is summed to obtain the quantity of user behavior data.
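The traversal and counting in step S20 can be sketched as follows. This is a minimal example that assumes each line of a log file is one user behavior record and that the occupied storage is measured as the UTF-8 byte length of the record; both assumptions, and the function names, are illustrative rather than taken from the patent.

```python
def occupied_sizes(log_lines):
    """Return the UTF-8 byte size of each non-empty user behavior record."""
    sizes = []
    for line in log_lines:
        record = line.rstrip("\n")
        if record:  # skip blank lines
            sizes.append(len(record.encode("utf-8")))
    return sizes


def total_occupied(log_lines):
    """Sum the sizes of all traversed records."""
    return sum(occupied_sizes(log_lines))


lines = ["click,home\n", "\n", "view,cart\n"]
print(occupied_sizes(lines))   # [10, 9]
print(total_occupied(lines))   # 19
```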
Step S30, calculating the variance of the size of the storage space occupied by the user behavior data through the following formula;
$$V(X) = \frac{1}{n}\sum_{t=1}^{n}\bigl(X(t) - \mu\bigr)^{2}$$
where μ is the mean size of the sub-cloud storage space used to store user behavior data X, V(X) is the variance, X(t) is the size of the storage space occupied by the user behavior data, t is the identifier of different user behavior data, and n is the number of user behavior data;
In this embodiment, the variance of the quantity of user behavior data is calculated by the following formula:
$$V(X) = \frac{1}{n}\sum_{t=1}^{n}\bigl(X(t) - \mu\bigr)^{2}$$
The variance describes the degree to which a variable deviates from its mean, wherein,
$$\mu = \frac{1}{n}\sum_{t=1}^{n} X(t)$$
μ is the mean size of the sub-cloud storage space used to store user behavior data X, V(X) is the variance, X(t) is the size of the storage space occupied by the user behavior data, t is the identifier of different user behavior data, and n is the number of user behavior data. Variances of different magnitudes can be obtained through the formula. A large variance indicates a large difference between the quantity of user behavior data and the mean size of the sub-cloud storage spaces; a variance of zero indicates that the quantity of user behavior data exactly matches the mean size of the sub-cloud storage spaces.
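The variance calculation of step S30 can be sketched in a few lines. The example assumes the population form of the variance (division by n), which is an assumption because the formula image is not recoverable from this text; the function name is illustrative.

```python
def population_variance(sizes):
    """Variance of occupied sizes X(t), t = 1..n, around their mean mu.

    Assumes the population form (divide by n), not the sample form.
    """
    n = len(sizes)
    if n == 0:
        raise ValueError("at least one size is required")
    mu = sum(sizes) / n
    return sum((x - mu) ** 2 for x in sizes) / n


print(population_variance([2, 4, 4, 4, 5, 5, 7, 9]))  # 4.0
print(population_variance([3, 3, 3]))                  # 0.0
```

A variance of zero, as in the second call, corresponds to the case described above in which every measured size equals the mean.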
Step S40, judging whether the variance is larger than a preset threshold value;
In this embodiment, the preset threshold is a value of the quantity of user behavior data that a sub-cloud storage space can store. It is set to prevent a situation in which some sub-cloud storage spaces hold a large amount of data while others hold little.
Step S50, if the variance is larger than a preset threshold, adjusting the data collection frequency of the sub-cloud storage space by adopting a weighted polling algorithm until the variance is smaller than or equal to the preset threshold, if the variance is smaller than or equal to the preset threshold, executing a Linux Shell script through a cron of Linux, and sending the user behavior log file to the sub-cloud storage space of the cloud storage space for collecting data at regular time, wherein the cloud storage space comprises a plurality of sub-cloud storage spaces.
In this embodiment, if the variance is greater than the preset threshold, the quantity of user behavior data deviates from the mean size of the sub-cloud storage spaces, and if the acquisition process is not adjusted, the load may become unbalanced. Therefore, if the variance is greater than the preset threshold, a weighted polling algorithm is adopted to adjust the data collection frequency of the sub-cloud storage space. The difference D = X(t) - μ may be greater than zero, less than zero, or equal to zero; only the first two cases need to be considered in this embodiment. If the difference D is greater than zero, the quantity of user behavior data is larger than the mean size of the sub-cloud storage spaces; if the difference D is less than zero, the quantity of user behavior data is smaller than the mean size of the sub-cloud storage spaces.
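Steps S40 and S50 start from two simple checks: whether the variance exceeds the threshold, and the sign of the difference D = X(t) - μ for each measurement. The sketch below illustrates both, assuming a population variance and using hypothetical function names and labels.

```python
def population_variance(sizes):
    """Population variance (divide by n); the normalization is an assumption."""
    n = len(sizes)
    mu = sum(sizes) / n
    return sum((x - mu) ** 2 for x in sizes) / n


def exceeds_threshold(sizes, threshold):
    """Step S40: True when the variance is greater than the preset threshold."""
    return population_variance(sizes) > threshold


def classify_deviations(sizes):
    """Label each X(t) by the sign of D = X(t) - mu."""
    mu = sum(sizes) / len(sizes)
    labels = []
    for x in sizes:
        d = x - mu
        if d > 0:
            labels.append("above")
        elif d < 0:
            labels.append("below")
        else:
            labels.append("equal")
    return labels


sizes = [10, 30, 20]
print(exceeds_threshold(sizes, 50))   # True
print(classify_deviations(sizes))     # ['below', 'above', 'equal']
```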
If the cloud storage space contains n sub-cloud storage spaces $S = \{S_1, S_2, \ldots, S_n\}$, the initial weights of the sub-cloud storage spaces are $W = \{W_1, W_2, \ldots, W_n\}$ and the current effective weights of the sub-cloud storage spaces are $CW = \{CW_1, CW_2, \ldots, CW_n\}$. For example, before any data is stored, the initial weights of the first and second sub-cloud storage spaces are the same, that is, $W_{first} = W_{second}$. When the quantity of user behavior data stored in the first sub-cloud storage space is larger than that stored in the second, $W_{first}$ is adjusted upward to obtain $CW_{first}$ and $W_{second}$ is adjusted downward to obtain $CW_{second}$, so that $CW_{first}$ is greater than $CW_{second}$.
In addition to its initial weight $W_i$, each sub-cloud storage space i has a current effective weight $CW_i$, and $CW_i$ is initialized to $W_i$. The mean M of the sum of the initial weights of all the sub-cloud storage spaces is obtained by the following formula:
$$M = \frac{1}{n}\sum_{i=1}^{n} W_i$$
The difference P between the current effective weight of sub-cloud storage space i and the mean M is then obtained by the formula $P = CW_i - M$, and weights are set for all sub-cloud storage spaces in the cloud storage space according to this difference. If the difference P is large and the difference D is smaller than zero, the acquisition frequency is increased; if the difference P is small and the difference D is larger than zero, the acquisition frequency is decreased. The sub-cloud storage spaces with different weights are arranged in a queue, and when there is an instruction to send the user behavior log file to a sub-cloud storage space of the cloud storage space for data collection, the weighted sub-cloud storage spaces are used in queue order.
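The weight bookkeeping above can be sketched as follows. The first two functions compute the mean M of the initial weights and the difference P between each current effective weight and M. The third function builds a polling order with the widely used smooth weighted round-robin scheme; this is a swapped-in standard technique shown for illustration, and the text does not confirm that it is the exact queueing rule intended. All names and weights are hypothetical.

```python
def mean_initial_weight(initial_weights):
    """M: mean of the initial weights W_i over all sub-cloud storage spaces."""
    return sum(initial_weights.values()) / len(initial_weights)


def weight_differences(current_weights, mean_weight):
    """P for each space: current effective weight CW_i minus the mean M."""
    return {name: cw - mean_weight for name, cw in current_weights.items()}


def smooth_weighted_round_robin(weights, rounds):
    """Return a polling order in which each space appears in proportion
    to its weight, spread as evenly as possible across the sequence."""
    running = {name: 0 for name in weights}
    total = sum(weights.values())
    order = []
    for _ in range(rounds):
        for name, weight in weights.items():
            running[name] += weight          # every space gains its weight
        chosen = max(running, key=running.get)
        running[chosen] -= total             # the chosen space pays the total
        order.append(chosen)
    return order


initial = {"S1": 4, "S2": 2}
m = mean_initial_weight(initial)                       # 3.0
print(weight_differences(initial, m))                  # {'S1': 1.0, 'S2': -1.0}
print(smooth_weighted_round_robin({"A": 5, "B": 1, "C": 1}, 7))
# ['A', 'A', 'B', 'A', 'C', 'A', 'A']
```

Over one full cycle of 7 rounds, space A is polled 5 times and B and C once each, which is how a higher weight translates into a higher collection frequency.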
Referring to fig. 3, fig. 3 is a schematic flow chart of a data acquisition optimization method according to a second embodiment of the present invention. In this embodiment, after step S10 in fig. 2, the method further includes the following steps:
Step S60, monitoring a user behavior log file in the cloud storage space in real time through a Flume plug-in, and collecting user behavior data in the user behavior log file;
In this embodiment, the user behavior log file in the cloud storage space is monitored in real time through the Flume plug-in, and the user behavior data in the user behavior log file is collected. The Flume plug-in supports two monitoring modes, HTTP and Ganglia. HTTP monitoring only returns monitoring data in JSON format from an HTTP address, whereas Ganglia monitoring displays the acquired data in a graphical interface, which is more intuitive.
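As an illustration of the HTTP monitoring mode, the sketch below parses a JSON metrics payload of the kind Flume's JSON reporting returns (enabled on the agent with -Dflume.monitoring.type=http and -Dflume.monitoring.port, and served under /metrics). The function name is hypothetical, and the use of the "Type", "ChannelSize" and "ChannelCapacity" fields as a health indicator is an assumption made for this example; the patent does not say which metrics are inspected.

```python
import json


def channel_fill_ratios(metrics_json):
    """Map each Flume channel component to ChannelSize / ChannelCapacity.

    metrics_json is the JSON text returned by the agent's metrics
    endpoint. Flume reports metric values as strings, so they are
    converted to float before dividing.
    """
    metrics = json.loads(metrics_json)
    ratios = {}
    for component, values in metrics.items():
        if values.get("Type") != "CHANNEL":
            continue  # sources and sinks have no channel capacity
        size = float(values["ChannelSize"])
        capacity = float(values["ChannelCapacity"])
        ratios[component] = size / capacity if capacity else 0.0
    return ratios


if __name__ == "__main__":
    sample = json.dumps({
        "CHANNEL.fileChannel": {
            "Type": "CHANNEL",
            "ChannelSize": "150",
            "ChannelCapacity": "600",
        },
        "SOURCE.src": {"Type": "SOURCE", "EventReceivedCount": "10"},
    })
    print(channel_fill_ratios(sample))
```

In practice the payload would be fetched from the agent over HTTP; the parsing step is kept separate here so it can be exercised without a running agent.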
Step S70, storing the user behavior data in real time through a distributed file system.
In this embodiment, a Linux Shell script is executed through Linux cron, and the user behavior log file is periodically sent to a sub-cloud storage space of the cloud storage space used for collecting data.
Referring to fig. 4, fig. 4 is a schematic flow chart of a data acquisition optimization method according to a third embodiment of the present invention. In this embodiment, after step S10 in fig. 2, the method further includes the following steps:
Step S80, judging whether there is a request for acquiring user behavior data;
In this embodiment, user behavior data is acquired only when there is a request for acquiring it, so it is necessary to judge whether such a request exists.
Step S90, if there is a request for acquiring user behavior data, acquiring the user behavior data from the cloud storage space and judging whether redundant user behavior data exists in the user behavior data; if there is no such request, no processing is performed;
In this embodiment, if there is a request for acquiring user behavior data, the user behavior data is acquired from the cloud storage space, and it is judged whether redundant user behavior data exists in it. The same user behavior data may be acquired repeatedly when data is acquired from the cloud storage space. If the acquired data set contains a large amount of repeated user behavior data, that data occupies a large amount of storage space and may affect user experience, so it is necessary to judge whether redundant user behavior data exists in the user behavior data.
Step S100, if the redundant user behavior data exists in the user behavior data, clearing the redundant user behavior data existing in the user behavior data through a preset redundancy strategy to obtain the user behavior data with the redundant user behavior data cleared, and if the redundant user behavior data does not exist in the user behavior data of the data collection end, not processing the user behavior data.
In this embodiment, if redundant user behavior data exists in the user behavior data, it is cleared through a preset redundancy strategy to obtain the user behavior data with the redundant user behavior data cleared. The purpose of processing redundant data is to prevent a large amount of repeated data from entering the data storage space. The preset redundancy strategy refers to processing redundant data through a preset algorithm; for example, the preset algorithm may be a mean shift algorithm.
Referring to fig. 5, fig. 5 is a detailed flowchart of step S80 in fig. 4. In this embodiment, step S80 includes the following steps:
Step S801, if a request for acquiring user behavior data currently exists, acquiring the user behavior data from the cloud storage space through a Flume plug-in and monitoring the user behavior data to obtain a monitoring result; if no such request exists, no processing is performed;
In this embodiment, if a request for acquiring user behavior data currently exists, the user behavior data is collected from the cloud storage space through the Flume plug-in and monitored in real time through the Flume plug-in to obtain a monitoring result. The Flume plug-in supports two monitoring modes, HTTP and Ganglia: HTTP monitoring obtains monitoring data in JSON format by accessing an HTTP address, and Ganglia monitoring displays the monitoring result in a graphical interface after the data is obtained.
Step S802, comparing a preset monitoring index with the monitoring result, and judging whether redundant user behavior data exists in the user behavior data, wherein the redundant user behavior data is the user behavior data exceeding the monitoring index.
In this embodiment, the preset monitoring index is a preset index for evaluating whether redundant data exists in the data, for example, the same data appearing ten times. The user behavior data is monitored in real time through the Flume plug-in to obtain a monitoring result, and if the same data appears ten times in the monitoring result, the monitoring index is exceeded. Whether redundant user behavior data exists in the user behavior data is judged according to the monitoring result: if the same data appears repeatedly, redundant user behavior data exists in the user behavior data.
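The comparison against the monitoring index can be sketched as a simple repeat count. This is only an illustration under stated assumptions: the function name and the record representation (hashable values such as strings) are hypothetical, and the default threshold of ten follows the example in the text, where data appearing ten times is treated as exceeding the index.

```python
from collections import Counter


def find_redundant(records, max_repeats=10):
    """Return the records whose occurrence count reaches max_repeats.

    records is any iterable of hashable user behavior records. The
    result maps each redundant record to the number of occurrences.
    """
    counts = Counter(records)
    return {rec: n for rec, n in counts.items() if n >= max_repeats}


if __name__ == "__main__":
    monitored = ["click:home"] * 10 + ["click:cart"] * 3
    print(find_redundant(monitored))
```

An empty result corresponds to the case in which no redundant user behavior data exists and no further processing is needed.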
Referring to fig. 6, fig. 6 is a detailed flowchart of step S90 in fig. 4. In this embodiment, step S90 includes the following steps:
Step S901, if redundant user behavior data exists in the user behavior data, judging whether redundant user behavior data exists in the user behavior data of the data collection end;
In this embodiment, if redundant user behavior data exists, it is judged whether redundant user behavior data exists in the user behavior data of the data collection end. Redundant user behavior data at the data collection end means that the same user behavior data has been collected repeatedly by the data collection end. This embodiment only judges whether redundant user behavior data exists at the data collection end.
Step S902, if the redundant user behavior data exists in the user behavior data of the data collection end, the redundant user behavior data existing in the user behavior data is eliminated through a mean shift algorithm to obtain the user behavior data after the redundant user behavior data is eliminated, and if the redundant user behavior data does not exist in the user behavior data of the data collection end, the user behavior data is not processed.
In this embodiment, if the data collection end collects repeated user behavior data, redundant user behavior data exists in the user behavior data collected by the data collection end. For a set D of K user behavior data in a given N-dimensional space, the first formula may be:
Figure BDA0002231382140000121
which is applied to an arbitrary user behavior data point x in the space, where S_h is the high-dimensional sphere region of radius h centered on x, k is the number of user behavior data points within S_h, and x_i is a user behavior data point within S_h. The center point may be moved to the shifted mean position using a second formula, x_(t+1) = M_t + x_t, where M_t is the shift mean in state t and x_t is the center in state t. The high-dimensional sphere region S_h is shifted through the data space by the second formula to judge whether redundant user behavior data exists in the current high-dimensional sphere region; if so, the shift mean M_t is adjusted until no redundant user behavior data exists in the current high-dimensional sphere region. To clean up redundant user data with the mean shift algorithm, the user behavior data needs to be converted into feature vectors before these steps. Through these steps, the non-redundant data is mapped into the high-dimensional sphere region and the redundant data is excluded, thereby clearing the redundant user behavior data from the user behavior data.
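The shift step can be sketched as below. Because the first formula is only available as an image, this example assumes the standard flat-kernel mean shift vector, that is, the average of (x_i - x) over the k points inside the sphere S_h; the update x_(t+1) = M_t + x_t is taken from the text. Function names, the tolerance and the iteration limit are hypothetical, and how redundant points are finally excluded is not covered here.

```python
import math


def mean_shift_vector(x, points, h):
    """Average offset (x_i - x) of the points within radius h of x."""
    neighbours = [p for p in points if math.dist(p, x) <= h]
    if not neighbours:
        return [0.0] * len(x)
    k = len(neighbours)
    return [sum(p[d] - x[d] for p in neighbours) / k for d in range(len(x))]


def shift_center(x, points, h, tol=1e-6, max_iter=100):
    """Apply x_(t+1) = M_t + x_t until the shift is smaller than tol."""
    x = list(x)
    for _ in range(max_iter):
        m = mean_shift_vector(x, points, h)
        x = [xi + mi for xi, mi in zip(x, m)]
        if math.hypot(*m) < tol:
            break
    return x


if __name__ == "__main__":
    # Feature vectors of user behavior records (illustrative values).
    data = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0), (10.0, 10.0)]
    print(shift_center((0.5, 0.5), data, h=3.0))
```

Starting near a cluster, the center converges to the cluster mean, while the distant point never enters the sphere and does not influence the result.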
Referring to fig. 7, fig. 7 is a schematic flow chart of a data acquisition optimization method according to a fourth embodiment of the present invention. In this embodiment, before step S20 in fig. 2, the method further includes the following steps:
Step S110, setting different identifiers for the user behavior log files of the sub-cloud storage space to obtain user behavior log files with the identifiers set;
In this embodiment, the user behavior log files differ from one another. To make it easier to calculate the amount of user behavior data in different user behavior log files, different identifiers are set for different user behavior log files, which facilitates managing log files by identifier; for example, only the user behavior data in the log file identified as "a" may currently need to be counted.
Step S120, traversing the user behavior data in the user behavior log file after the setting of the identification through a binary search tree, and counting the traversed user behavior data to obtain the number of the user behavior data corresponding to different identifications.
In this embodiment, the user behavior data of the user behavior log files with different identifiers is searched through binary search trees. The amount of user behavior data a binary search tree can hold is determined by its number of nodes, according to the formula M = N × K, where M is the number of user behavior data, N is the number of user behavior data that a single binary search tree can hold, N consisting only of the number X of root nodes, the number Y of left subtrees and the number Z of right subtrees, and K is the number of binary search trees. When data is queried, pre-order, in-order or post-order traversal may be used; whichever is used, the user behavior data in the user behavior log file is traversed in sequence. For example, in pre-order traversal the order is root node, left subtree, right subtree, and the traversal steps are as follows:
The first step: judging in sequence whether user behavior data exists at the current root node; if so, judging whether user behavior data exists in the current left subtree; if not, obtaining the number of user behavior data: M = X × K, where M is the number of user behavior data and X is the number of root nodes.
The second step: if user behavior data exists in the current left subtree, obtaining the number of user behavior data: M = (X + Y) × K, where Y is the number of left subtrees; if user behavior data does not exist in the current left subtree, obtaining the number of user behavior data: M = X × K.
The third step: if user behavior data exists in the current left subtree, judging whether user behavior data exists in the current right subtree; if user behavior data exists in the current right subtree, obtaining the number of user behavior data: M = (X + Y + Z) × K, where Z is the number of right subtrees; if user behavior data does not exist in the current right subtree, obtaining the number of user behavior data: M = (X + Y) × K.
The number of user behavior data in the user behavior log files with different identifiers is obtained through the above method.
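The counting procedure can be sketched with a plain binary search tree and a pre-order traversal (root, left subtree, right subtree). This is an illustrative sketch rather than the patented procedure: the class and function names are hypothetical, records are assumed to be mutually comparable values, and the count is obtained by visiting nodes directly rather than through the closed-form expressions above.

```python
class Node:
    """A binary search tree node holding one user behavior record."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the tree rooted at root and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)  # duplicates go right
    return root


def preorder(root):
    """Yield keys in pre-order: root, left subtree, right subtree."""
    if root is None:
        return
    yield root.key
    yield from preorder(root.left)
    yield from preorder(root.right)


def count_by_identifier(logs):
    """Count the records of each identified log file via a traversal.

    logs maps a log-file identifier to an iterable of records.
    """
    counts = {}
    for identifier, records in logs.items():
        root = None
        for record in records:
            root = insert(root, record)
        counts[identifier] = sum(1 for _ in preorder(root))
    return counts


if __name__ == "__main__":
    print(count_by_identifier({"a": [5, 3, 8, 1, 4], "b": [2]}))
```

In-order or post-order traversal would yield the same counts; only the visiting order changes.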
Referring to fig. 8, fig. 8 is a functional module schematic diagram of the first embodiment of the data acquisition optimization device of the present invention. In this embodiment, the data acquisition optimizing apparatus includes:
the sending module 10 is configured to execute a Linux Shell script through a cron of Linux, and send a user behavior log file to a sub-cloud storage space of a cloud storage space for collecting data at regular time, where the cloud storage space includes a plurality of sub-cloud storage spaces;
the traversal module 20 is configured to traverse the user behavior data in the user behavior log file of the child cloud storage space, and count a storage space that can be occupied by the traversed user behavior data to obtain a size of the storage space occupied by the user behavior data;
a calculating module 30, configured to calculate a variance of a size of a storage space occupied by the user behavior data according to the following formula;
Figure BDA0002231382140000141
mu is the mean value of the storage space of the sub-clouds for storing user behavior data X, V (X) is the variance, X (t) is the size of the storage space occupied by the user behavior data, t is the identification of different user behavior data, and n is the number of the user behavior data;
the judging module 40 is used for judging whether the variance is larger than a preset threshold value;
and the adjusting module 50 is configured to adjust the data collection frequency of the sub-cloud storage space by using a weighted polling algorithm if the variance is greater than a preset threshold value until the variance is less than or equal to the preset threshold value, execute a Linux Shell script through a cron of Linux if the variance is less than or equal to the preset threshold value, and periodically send the user behavior log file to the sub-cloud storage space of the cloud storage space for collecting data.
In this embodiment, the sending module 10 is configured to execute a Linux Shell script through a cron of Linux, and send a user behavior log file to a sub-cloud storage space of a cloud storage space for collecting data at regular time, where the cloud storage space includes a plurality of sub-cloud storage spaces; the traversal module 20 is configured to traverse the user behavior data in the user behavior log file of the child cloud storage space, and count a storage space that can be occupied by the traversed user behavior data to obtain a size of the storage space occupied by the user behavior data; the calculating module 30 is configured to calculate a variance of a size of a storage space occupied by the user behavior data according to the following formula;
Figure BDA0002231382140000151
mu is the mean value of the storage space of the sub-clouds for storing user behavior data X, V (X) is the variance, X (t) is the size of the storage space occupied by the user behavior data, t is the identification of different user behavior data, and n is the number of the user behavior data; the judging module 40 is configured to judge whether the variance is greater than a preset threshold; the adjusting module 50 is configured to adjust the data collection frequency of the sub-cloud storage space by using a weighted polling algorithm if the variance is greater than a preset threshold value until the variance is less than or equal to the preset threshold value, and execute a Linux shell script through cron of Linux if the variance is less than or equal to the preset threshold value, and periodically send the user behavior log file to the sub-cloud storage space of the cloud storage space used for collecting data. The device adjusts the data collection frequency of the sub-cloud storage spaces through the weighted polling algorithm in the adjusting module, so that the load balance among the sub-cloud storage spaces is realized, the storage space resources are saved, and the storage space utilization rate is improved.
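The variance check performed by the calculating and judging modules can be sketched as follows. Since the formula is only available as an image, this example assumes the population variance, V(X) = (1/n) × Σ (X(t) - mu)^2, which matches the symbol definitions given above; the function names and the sample values are hypothetical.

```python
def storage_variance(sizes):
    """Population variance of the storage sizes X(t), t = 1..n."""
    n = len(sizes)
    if n == 0:
        raise ValueError("at least one storage size is required")
    mu = sum(sizes) / n
    return sum((x - mu) ** 2 for x in sizes) / n


def needs_rebalancing(sizes, threshold):
    """True if the variance exceeds the preset threshold.

    A True result corresponds to the branch in which the data
    collection frequency is adjusted by the weighted polling algorithm.
    """
    return storage_variance(sizes) > threshold


if __name__ == "__main__":
    occupied = [10, 20, 30]  # storage occupied per sub-cloud space
    print(storage_variance(occupied))
    print(needs_rebalancing(occupied, threshold=50))
```

When the occupied sizes are equal the variance is zero, and the regular cron-driven upload simply continues.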
Referring to fig. 9, fig. 9 is a functional module schematic diagram of a data acquisition optimization device according to a second embodiment of the present invention. In this embodiment, the data acquisition optimizing apparatus includes:
the sending module 10 is configured to execute a Linux Shell script through a cron of Linux, and send a user behavior log file to a sub-cloud storage space of a cloud storage space for collecting data at regular time, where the cloud storage space includes a plurality of sub-cloud storage spaces;
the traversal module 20 is configured to traverse the user behavior data in the user behavior log file of the child cloud storage space, and count a storage space that can be occupied by the traversed user behavior data to obtain a size of the storage space occupied by the user behavior data;
a calculating module 30, configured to calculate a variance of a size of a storage space occupied by the user behavior data according to the following formula;
Figure BDA0002231382140000161
mu is the mean value of the storage space of the sub-clouds for storing user behavior data X, V (X) is the variance, X (t) is the size of the storage space occupied by the user behavior data, t is the identification of different user behavior data, and n is the number of the user behavior data;
the judging module 40 is used for judging whether the variance is larger than a preset threshold value;
the adjusting module 50 is configured to adjust the data collection frequency of the sub-cloud storage space by using a weighted polling algorithm if the variance is greater than a preset threshold value until the variance is less than or equal to the preset threshold value, execute a Linux Shell script through a cron of Linux if the variance is less than or equal to the preset threshold value, and periodically send the user behavior log file to the sub-cloud storage space of the cloud storage space for collecting data;
the monitoring acquisition module 60 is used for monitoring the user behavior log file in the cloud storage space in real time through a Flume plug-in and acquiring user behavior data in the user behavior log file;
a storage module 70, configured to store the user behavior data in real time through a distributed file system.
The invention also provides a computer readable storage medium.
In this embodiment, the computer readable storage medium has a data acquisition optimization program stored thereon, and the data acquisition optimization program, when executed by a processor, implements the steps of the data acquisition optimization method as described in any one of the above embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM), and includes instructions for causing a terminal (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The present invention is described in connection with the accompanying drawings, but the present invention is not limited to the above embodiments, which are only illustrative and not restrictive, and those skilled in the art can make various changes without departing from the spirit and scope of the invention as defined by the appended claims, and all changes that come within the meaning and range of equivalency of the specification and drawings that are obvious from the description and the attached claims are intended to be embraced therein.

Claims (10)

1. A data acquisition optimization method is characterized by comprising the following steps:
executing a Linux Shell script through a cron of Linux, and regularly sending a user behavior log file to a sub-cloud storage space of a cloud storage space for collecting data, wherein the cloud storage space comprises a plurality of sub-cloud storage spaces;
traversing user behavior data in the user behavior log file of the child cloud storage space, and counting the storage space which can be occupied by the traversed user behavior data to obtain the size of the storage space occupied by the user behavior data;
calculating the variance of the size of the storage space occupied by the user behavior data through the following formula;
Figure FDA0002231382130000011
mu is the mean value of the storage space of the sub-clouds for storing user behavior data X, V (X) is the variance, X (t) is the size of the storage space occupied by the user behavior data, t is the identification of different user behavior data, and n is the number of the user behavior data;
judging whether the variance is larger than a preset threshold value or not;
if the variance is greater than the preset threshold, adjusting the data collection frequency of the sub-cloud storage space by adopting a weighted polling algorithm until the variance is less than or equal to the preset threshold; otherwise, executing a Linux Shell script through a cron of Linux, and sending the user behavior log file to the sub-cloud storage space of the cloud storage space for collecting data at regular time.
2. The data collection optimization method of claim 1, wherein after the step of executing a Linux Shell script through a cron of Linux and periodically sending the user behavior log file to a child cloud storage space of a cloud storage space for collecting data, further comprising the steps of:
monitoring a user behavior log file in the cloud storage space in real time through a Flume plug-in, and collecting user behavior data in the user behavior log file;
and storing the user behavior data in real time through a distributed file system (HDFS).
3. The data collection optimization method of claim 1, wherein before the step of traversing the user behavior data in the user behavior log file of the child cloud storage space, and counting the storage space that can be occupied by the traversed user behavior data to obtain the size of the storage space that is occupied by the user behavior data, the method further comprises the following steps:
setting different identifications for the user behavior log files of the child cloud storage space to obtain user behavior log files with the identifications set;
traversing the user behavior data in the user behavior log file with the set identifier through a binary search tree, and counting the traversed user behavior data to obtain the number of the user behavior data corresponding to different identifiers.
4. The data collection optimization method of claim 1, wherein after the step of executing a Linux Shell script through a cron of Linux and periodically sending the user behavior log file to a child cloud storage space of a cloud storage space for collecting data, further comprising the steps of:
judging whether an acquisition request of user behavior data exists or not;
if the request for acquiring the user behavior data exists at present, acquiring the user behavior data from a cloud storage space, and judging whether redundant user behavior data exists in the user behavior data;
and if the redundant user behavior data exists in the user behavior data, clearing the redundant user behavior data existing in the user behavior data through a preset redundancy strategy to obtain the user behavior data with the redundant user behavior data cleared.
5. The data acquisition optimization method of claim 4, wherein if there is a request for acquiring user behavior data currently, acquiring the user behavior data from a cloud storage space, and determining whether there is redundant user behavior data in the user behavior data comprises the following steps:
if the request for acquiring the user behavior data exists at present, acquiring the user behavior data from a cloud storage space through a Flume plug-in and monitoring the user behavior data to obtain a monitoring result;
and comparing a preset monitoring index with the monitoring result to judge whether redundant user behavior data exist in the user behavior data, wherein the redundant user behavior data are the user behavior data exceeding the monitoring index.
6. The data acquisition optimization method of claim 4, wherein if the user behavior data includes redundant user behavior data, clearing the redundant user behavior data included in the user behavior data by using a preset redundancy policy to obtain the user behavior data with the redundant user behavior data cleared includes the following steps:
if the redundant user behavior data exists in the user behavior data, judging whether the redundant user behavior data exists in the user behavior data of the data collection end;
and if redundant user behavior data exists in the user behavior data of the data collection end, clearing the redundant user behavior data existing in the user behavior data through a mean shift algorithm to obtain the user behavior data with the redundant user behavior data cleared.
7. A data collection optimization device, comprising:
the sending module is used for executing a Linux Shell script through a cron of Linux and sending a user behavior log file to a sub-cloud storage space of a cloud storage space for collecting data at regular time, wherein the cloud storage space comprises a plurality of sub-cloud storage spaces;
the traversal module is used for traversing the user behavior data in the user behavior log file of the child cloud storage space, and counting the storage space which can be occupied by the traversed user behavior data to obtain the size of the storage space occupied by the user behavior data;
the calculation module is used for calculating the variance of the size of the storage space occupied by the user behavior data through the following formula;
Figure FDA0002231382130000031
mu is the mean value of the storage space of the sub-clouds for storing user behavior data X, V (X) is the variance, X (t) is the size of the storage space occupied by the user behavior data, t is the identification of different user behavior data, and n is the number of the user behavior data;
the judging module is used for judging whether the variance is larger than a preset threshold value or not;
and the adjusting module is used for adjusting the data collection frequency of the sub-cloud storage space by adopting a weighted polling algorithm if the variance is greater than a preset threshold value until the variance is less than or equal to the preset threshold value, executing a Linux Shell script through a cron of Linux if the variance is less than or equal to the preset threshold value, and sending the user behavior log file to the sub-cloud storage space of the cloud storage space for collecting data at regular time.
8. The data acquisition optimization device of claim 7, further comprising the following modules:
the monitoring acquisition module is used for monitoring the user behavior log file in the cloud storage space in real time through a Flume plug-in and acquiring user behavior data in the user behavior log file;
and the storage module is used for storing the user behavior data in real time through a distributed file system.
9. A data acquisition optimization device, characterized in that the data acquisition optimization device comprises a memory, a processor and a data acquisition optimization program stored on the memory and executable on the processor, which data acquisition optimization program, when executed by the processor, implements the steps of the data acquisition optimization method according to any one of claims 1 to 6.
10. A computer-readable storage medium, having stored thereon a data acquisition optimization program which, when executed by a processor, implements the steps of the data acquisition optimization method of any one of claims 1-6.
CN201910968760.4A 2019-10-12 2019-10-12 Data acquisition optimization method, device and equipment and readable storage medium Pending CN110955642A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910968760.4A CN110955642A (en) 2019-10-12 2019-10-12 Data acquisition optimization method, device and equipment and readable storage medium
PCT/CN2020/099365 WO2021068568A1 (en) 2019-10-12 2020-06-30 Data collection optimization method, apparatus and device, and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910968760.4A CN110955642A (en) 2019-10-12 2019-10-12 Data acquisition optimization method, device and equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN110955642A true CN110955642A (en) 2020-04-03

Family

ID=69975574

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910968760.4A Pending CN110955642A (en) 2019-10-12 2019-10-12 Data acquisition optimization method, device and equipment and readable storage medium

Country Status (2)

Country Link
CN (1) CN110955642A (en)
WO (1) WO2021068568A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021068568A1 (en) * 2019-10-12 2021-04-15 平安科技(深圳)有限公司 Data collection optimization method, apparatus and device, and readable storage medium
CN113888773A (en) * 2021-09-28 2022-01-04 南京领行科技股份有限公司 Data processing method, device, server and storage medium for network appointment vehicle

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115033457B (en) * 2022-06-22 2023-08-25 浙江大学 Multi-source data real-time acquisition method and system capable of monitoring and early warning

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7313759B2 (en) * 2002-10-21 2007-12-25 Sinisi John P System and method for mobile data collection
US20150278018A1 (en) * 2014-03-29 2015-10-01 Fujitsu Limited Distributed storage system and method
CN106371762A (en) * 2016-08-19 2017-02-01 浪潮(北京)电子信息产业有限公司 Optimization method and system of storage data
CN106484331A (en) * 2015-09-29 2017-03-08 华为技术有限公司 A kind of data processing method, device and flash memory device
CN106506608A (en) * 2016-10-19 2017-03-15 北京华云网际科技有限公司 The access method of distributed block data and device
CN107124472A (en) * 2017-06-26 2017-09-01 杭州迪普科技股份有限公司 Load-balancing method and device, computer-readable recording medium
CN107728948A (en) * 2017-10-18 2018-02-23 郑州云海信息技术有限公司 A kind of memory performance optimization method and device, computer equipment
CN107908748A (en) * 2017-11-17 2018-04-13 南京感度信息技术有限责任公司 Website user's behavioral data acquisition method, system and application based on big data
CN108512919A (en) * 2018-03-25 2018-09-07 东莞市华睿电子科技有限公司 A kind of cloud storage space allocation method and server
US20180275902A1 (en) * 2017-03-26 2018-09-27 Oracle International Corporation Rule-based modifications in a data storage appliance monitor
CN108710686A (en) * 2018-05-21 2018-10-26 北京五八信息技术有限公司 A kind of date storage method, device, storage medium and terminal

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140337491A1 (en) * 2013-05-08 2014-11-13 Box, Inc. Repository redundancy implementation of a system which incrementally updates clients with events that occurred via a cloud-enabled platform
CN106570835B (en) * 2016-11-02 2019-05-24 北京控制工程研究所 A kind of point cloud simplification filtering method
CN110955642A (en) * 2019-10-12 2020-04-03 平安科技(深圳)有限公司 Data acquisition optimization method, device and equipment and readable storage medium

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7313759B2 (en) * 2002-10-21 2007-12-25 Sinisi John P System and method for mobile data collection
US20150278018A1 (en) * 2014-03-29 2015-10-01 Fujitsu Limited Distributed storage system and method
CN106484331A (en) * 2015-09-29 2017-03-08 华为技术有限公司 Data processing method, device and flash memory device
CN106371762A (en) * 2016-08-19 2017-02-01 浪潮(北京)电子信息产业有限公司 Optimization method and system of storage data
CN106506608A (en) * 2016-10-19 2017-03-15 北京华云网际科技有限公司 Access method and device for distributed block data
US20180275902A1 (en) * 2017-03-26 2018-09-27 Oracle International Corporation Rule-based modifications in a data storage appliance monitor
CN107124472A (en) * 2017-06-26 2017-09-01 杭州迪普科技股份有限公司 Load-balancing method and device, computer-readable recording medium
CN107728948A (en) * 2017-10-18 2018-02-23 郑州云海信息技术有限公司 Memory performance optimization method and device, computer equipment
CN107908748A (en) * 2017-11-17 2018-04-13 南京感度信息技术有限责任公司 Website user behavior data acquisition method, system and application based on big data
CN108512919A (en) * 2018-03-25 2018-09-07 东莞市华睿电子科技有限公司 Cloud storage space allocation method and server
CN108710686A (en) * 2018-05-21 2018-10-26 北京五八信息技术有限公司 Data storage method, device, storage medium and terminal

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
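祁超 et al.: "Optimization Simulation of a Data Acquisition Method for Digital Optical Fiber Network Communication" *
郝昱文 et al.: "Research on a Storage Load Balancing Algorithm Based on a Distributed Environment" *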

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021068568A1 (en) * 2019-10-12 2021-04-15 平安科技(深圳)有限公司 Data collection optimization method, apparatus and device, and readable storage medium
CN113888773A (en) * 2021-09-28 2022-01-04 南京领行科技股份有限公司 Data processing method, device, server and storage medium for online ride-hailing vehicles

Also Published As

Publication number Publication date
WO2021068568A1 (en) 2021-04-15

Similar Documents

Publication Publication Date Title
CN109032801B (en) Request scheduling method, system, electronic equipment and storage medium
CN110955642A (en) Data acquisition optimization method, device and equipment and readable storage medium
CN108776934B (en) Distributed data calculation method and device, computer equipment and readable storage medium
CN109117275B (en) Account checking method and device based on data slicing, computer equipment and storage medium
CN108710540B (en) Resource scheduling method, device and equipment in distributed cluster
CN110928739B (en) Process monitoring method and device and computing equipment
CN108809848A (en) Load-balancing method, device, electronic equipment and storage medium
CN107656807A (en) Automatic elastic scaling method and device for virtual resources
US20220229809A1 (en) Method and system for flexible, high performance structured data processing
CN109800204A (en) Data distribution method and related product
CN113032157B (en) Automatic intelligent server capacity expansion and reduction method and system
JP2009059273A (en) Stream data control system, stream data control method and stream data control program
CN109039933B (en) Cluster network optimization method, device, equipment and medium
CN114780244A (en) Container cloud resource elastic allocation method and device, computer equipment and medium
WO2017095413A1 (en) Incremental automatic update of ranked neighbor lists based on k-th nearest neighbors
CN109885384B (en) Task parallelism optimization method and device, computer equipment and storage medium
WO2020094064A1 (en) Performance optimization method, device, apparatus, and computer readable storage medium
CN112367384B (en) Kafka cluster-based dynamic speed limiting method and device and computer equipment
CN107273413B (en) Intermediate table creating method, intermediate table inquiring method and related devices
CN109272567B (en) Three-dimensional model optimization method and device
CN116955271A (en) Method and device for storing data copy, electronic equipment and storage medium
CN113364648A (en) Flow control method, system, device, service equipment and storage medium
CN116405500B (en) System resource management method based on data analysis and cloud computing data analysis
CN110851249A (en) Data exporting method and equipment
CN113542807B (en) Resource management scheduling method and system based on digital retina platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200403
