CN112306369A - Data processing method, device, server and storage medium - Google Patents

Data processing method, device, server and storage medium

Info

Publication number
CN112306369A
CN112306369A (application CN201910689074.3A)
Authority
CN
China
Prior art keywords
data
processed
time
space
buffer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910689074.3A
Other languages
Chinese (zh)
Inventor
董磊 (Dong Lei)
陈敏 (Chen Min)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910689074.3A
Publication of CN112306369A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601: Interfaces specially adapted for storage systems
        • G06F 3/0602: specifically adapted to achieve a particular effect
            • G06F 3/0614: Improving the reliability of storage systems
                • G06F 3/0619: Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
        • G06F 3/0628: making use of a particular technique
            • G06F 3/0638: Organizing or formatting or addressing of data
                • G06F 3/0643: Management of files
            • G06F 3/0655: Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
                • G06F 3/0656: Data buffering arrangements
        • G06F 3/0668: adopting a particular infrastructure
            • G06F 3/067: Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Abstract

The embodiments of the present application disclose a data processing method and apparatus, a server, and a storage medium. In the embodiments, data to be processed is acquired; the level of the data to be processed is determined; a cache space and a cache time are allocated for the data; the data is stored into the cache space according to its level; and the data is transferred from the cache space to a preset storage space in order of level from high to low, according to the cache time. Because the data is transferred from the cache space to the preset storage space based on its level and cache time, data loss caused by too much data being acquired at the same moment and not being processed in time is avoided, and the reliability and flexibility of data processing are improved.

Description

Data processing method, device, server and storage medium
Technical Field
The present application relates to the field of communications technologies, and in particular, to a data processing method, an apparatus, a server, and a storage medium.
Background
Currently, a server can process and analyze data reported in real time to determine the running state of the current system. Because reporting times may be non-uniform, a large amount of data may be reported at a single moment, producing a high data peak. During such a peak, the load is too high and writing to memory becomes very slow, so data cannot be written in time; for example, the memory may overflow because the data cannot be processed immediately due to memory or network constraints, and the data is then lost. Data is therefore frequently lost during processing, and the reliability of data transmission is low.
Disclosure of Invention
The embodiment of the application provides a data processing method, a data processing device, a server and a storage medium, and aims to improve the reliability and flexibility of data processing.
In order to solve the above technical problem, an embodiment of the present application provides the following technical solutions:
in a first aspect, an embodiment of the present application provides a data processing method, including:
acquiring data to be processed;
determining the grade of the data to be processed;
distributing a buffer space and buffer time for the data to be processed;
storing the data to be processed into the cache space according to the grade;
and sequentially transferring the data to be processed from the cache space to a preset storage space according to the cache time and the sequence of the grades from high to low.
In some embodiments, the determining the rank of the data to be processed comprises:
acquiring an identifier carried in the data to be processed;
determining the type of the data to be processed according to the identification;
and determining the grade of the data to be processed according to the type.
In some embodiments, the allocating buffer space and buffer time for the data to be processed includes:
acquiring a time interval of the cache space and the delay time of the data to be processed;
and distributing buffer space and buffer time for the data to be processed according to the time interval and the delay time.
In some embodiments, the allocating buffer space and buffer time for the to-be-processed data according to the time interval and the delay time includes:
constructing a plurality of buffer spaces to form a ring buffer queue;
setting a time interval for each buffer space in the ring buffer queue;
obtaining the delay time of the data to be processed and the sequence number, in the ring buffer queue, of the buffer space whose data is currently being transferred;
and allocating a buffer space and a buffer time for the data to be processed according to the time interval, the delay time, and the sequence number.
In some embodiments, the sequentially transferring the data to be processed from the cache space to a preset storage space according to the order of the levels from high to low according to the cache time includes:
when the cache time is up, sequentially extracting the data to be processed from the cache space according to the sequence of the levels from high to low to obtain target data;
analyzing the target data to obtain analyzed target data;
and sending the analyzed target data to a preset storage space according to a preset rate.
In some embodiments, after sending the parsed target data to a preset storage space at a preset rate, the method further includes:
when the cache time is over, judging whether the cache space has data to be processed which is not extracted;
if yes, storing the unextracted data to be processed to a magnetic disk;
and analyzing the data to be processed in the disk according to the sequence of the levels from high to low, and sending the analyzed data to a preset storage space according to a preset rate.
In some embodiments, the acquiring the data to be processed includes:
and receiving sampling data, log data or performance data reported by a client or other servers.
In a second aspect, an embodiment of the present application further provides a data processing apparatus, including:
the acquisition unit is used for acquiring data to be processed;
the determining unit is used for determining the grade of the data to be processed;
the distribution unit is used for distributing cache space and cache time for the data to be processed;
the storage unit is used for storing the data to be processed into the cache space according to the grade;
and the unloading unit is used for transferring the data to be processed from the cache space to a preset storage space in sequence, in order of level from high to low, according to the cache time.
In some embodiments, the determining unit is specifically configured to:
acquiring an identifier carried in the data to be processed;
determining the type of the data to be processed according to the identification;
and determining the grade of the data to be processed according to the type.
In some embodiments, the dispensing unit comprises:
an obtaining module, configured to obtain a time interval of the cache space and a delay time of the to-be-processed data;
and the distribution module is used for distributing buffer space and buffer time for the data to be processed according to the time interval and the delay time.
In some embodiments, the assignment module is specifically configured to:
constructing a plurality of buffer spaces to form a ring buffer queue;
setting a time interval for each buffer space in the ring buffer queue;
obtaining the delay time of the data to be processed and the sequence number, in the ring buffer queue, of the buffer space whose data is currently being transferred;
and allocating a buffer space and a buffer time for the data to be processed according to the time interval, the delay time, and the sequence number.
In some embodiments, the unloading unit is specifically configured to:
when the cache time is up, sequentially extracting the data to be processed from the cache space according to the sequence of the levels from high to low to obtain target data;
analyzing the target data to obtain analyzed target data;
and sending the analyzed target data to a preset storage space according to a preset rate.
In some embodiments, the unloading unit is further specifically configured to:
when the cache time is over, judging whether the cache space has data to be processed which is not extracted;
if yes, storing the unextracted data to be processed to a magnetic disk;
and analyzing the data to be processed in the disk according to the sequence of the levels from high to low, and sending the analyzed data to a preset storage space according to a preset rate.
In some embodiments, the obtaining unit is specifically configured to: and receiving sampling data, log data or performance data reported by a client or other servers.
In a third aspect, an embodiment of the present application further provides a server, including a memory and a processor, where the memory stores a computer program, and the processor executes any one of the data processing methods provided in the embodiment of the present application when calling the computer program in the memory.
In a fourth aspect, an embodiment of the present application further provides a storage medium, where the storage medium is used to store a computer program, and the computer program is loaded by a processor to execute any one of the data processing methods provided in the embodiment of the present application.
In the embodiments of the present application, data to be processed can be acquired, the level of the data determined, and a cache space and a cache time allocated for it; the data is then stored into the cache space according to its level and transferred from the cache space to the preset storage space in order of level from high to low, according to the cache time. Because the data is transferred from the cache space to the preset storage space based on its level and cache time, data loss caused by too much data being acquired at the same moment and not being processed in time is avoided, and the reliability and flexibility of data processing are improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic view of a data processing scenario provided in an embodiment of the present application;
FIG. 2 is a schematic flow chart diagram of a data processing method provided in an embodiment of the present application;
FIG. 3 is another schematic flow chart diagram of a data processing method according to an embodiment of the present application;
FIG. 4 is another schematic flow chart diagram of a data processing method according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a time wheel provided in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Embodiments of the present application provide a data processing method, an apparatus, a server, and a storage medium, which are described in detail below.
Referring to fig. 1, fig. 1 is a schematic view of a data processing scenario provided in this embodiment. The data processing system may include a data processing device, which may be integrated in a server. The server may obtain data to be processed, for example data reported by a terminal or another server, and determine the level of the data, for example by grading it according to data type or data usage. A cache space and a cache time may then be allocated for the data, for example a buffer space of a time wheel together with the cache time corresponding to that space. The data may then be stored in the cache space according to its level and transferred from the cache space to the preset storage space in order of level from high to low, according to the cache time. For example, when the cache time arrives, the data in the cache space may be parsed in order of level from high to low and the parsed data sent to the preset storage space (for example, Kafka) at a preset rate; when the cache time ends, any remaining data in the cache space is stored to disk, parsed in order of level from high to low, and sent to the preset storage space at a preset rate. With this scheme, the data to be processed can be transferred from the cache space to the preset storage space based on its level and cache time, which reduces the data loss rate and improves the reliability and flexibility of data processing.
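The scenario above (cache by level, parse and forward at a preset rate, spill leftovers to disk) can be sketched in a few lines. All names, the JSON payloads, and the items-per-tick rate model here are illustrative assumptions, not details taken from the patent:

```python
import json

# Hypothetical sketch of one flush cycle: when a buffer space's cache time
# arrives, its items are parsed in order of level (1 = highest) and forwarded
# to the storage backend at a preset rate; whatever exceeds the rate budget
# is returned as the portion that would be spilled to disk.
def flush_slot(slot_items, send, max_per_tick=2):
    """slot_items: list of (level, payload) pairs; lower level = higher priority."""
    pending = sorted(slot_items, key=lambda item: item[0])  # high level first
    sent, spilled = [], []
    for i, (_, payload) in enumerate(pending):
        if i < max_per_tick:              # preset rate: at most N items per tick
            send(json.loads(payload))     # "parse" the raw data, then forward it
            sent.append(payload)
        else:
            spilled.append(payload)       # would be stored to disk for later replay
    return sent, spilled
```

A real implementation would forward the parsed items to a message queue such as Kafka and replay the spilled items later, still in level order.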
It should be noted that the scenario diagram of data processing shown in fig. 1 is only an example, and the system and the scenario of data processing described in the embodiment of the present application are for more clearly illustrating the technical solution of the embodiment of the present application, and do not form a limitation on the technical solution provided in the embodiment of the present application, and as a person having ordinary skill in the art knows, along with the evolution of the system of data processing and the occurrence of a new service scenario, the technical solution provided in the embodiment of the present application is also applicable to similar technical problems.
Referring to fig. 2, fig. 2 is a schematic flow chart of a data processing method according to an embodiment of the present application. The execution main body of the data processing method may be the data processing apparatus provided in the embodiment of the present application, or a server integrated with the data processing apparatus, where the data processing apparatus may be implemented in a hardware or software manner. The data processing method may include:
and S101, acquiring data to be processed.
The data to be processed may be in the form of a data packet or a file and may include sampling data, log data, or performance data. For example, the sampling data may include error information that occurs while a client on a terminal plays audio and video, together with the portion of the audio/video data related to that error; the log data may include log-related data such as video transcoding, video membership purchases, video frequency bands, video ratings, and gift giving; and the performance data may include platform-performance data such as video playing traffic, playing speed, and definition, for which the requirements on real-time performance and reliability are relatively high.
In some embodiments, obtaining the data to be processed may include: and receiving sampling data, log data or performance data reported by a client or other servers.
For example, the server may receive the to-be-processed data reported by a client or another server through BOSS, a real-time data middleware service system that supports reporting data in different manners or different languages and can store the reported data in memory or on disk for other services to subscribe to. Alternatively, the server may receive related data reported by the client, such as session management (Session) data; or the server may receive the to-be-processed data reported by a client or another server through the HyperText Transfer Protocol (HTTP) or the Transmission Control Protocol (TCP).
And S102, determining the grade of the data to be processed.
Because different data (i.e., different data to be processed) have different degrees of importance, the data to be processed can be graded to ensure that more important data is processed preferentially: the higher the level of the data, the more important it is; conversely, the lower the level, the less important it is.
The manner of the grade division may be flexibly set according to actual needs, for example, the data to be processed may be divided into a first grade and a second grade according to the type of the data to be processed, where the first grade is higher than the second grade, the data of the first grade is important data, and the data of the second grade is unimportant data.
For another example, the data to be processed may be divided into a plurality of levels (e.g., a first level, a second level, a third level, etc.) according to the purpose, size, or source of the data to be processed. For another example, the data to be processed may be divided into a first level, a second level, a third level, a fourth level, a fifth level, a sixth level, and the like according to the characteristic information of the data to be processed. The grades are a first grade to a sixth grade from high to low in sequence.
In some embodiments, determining the rank of the data to be processed may include: acquiring an identifier carried in data to be processed; determining the type of the data to be processed according to the identifier; and determining the grade of the data to be processed according to the type.
The identifier may be an identification (ID), a name, or a number of the data to be processed, may be used to uniquely identify the data, and may be formed of numbers, letters, or words. To make grading the data to be processed more convenient, the grading may be based on the type of the data: a mapping relationship between data type and data level may be established, so that different data types correspond to different data levels. For example, for the three types sampling data, log data, and performance data, the level of the sampling data may be set lowest (the third level), the log data to the second level, and the performance data highest (the first level). After the data to be processed is obtained, the carried identifier can be extracted from it, the type of the data determined from the identifier, and the level of the data determined from the type. For example, if the data to be processed is data related to video playing quality, its type may be determined to be performance data according to the extracted identifier, and its level determined to be the first level.
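The identifier-to-type-to-level chain described above can be sketched as follows; the prefix scheme and both mapping tables are illustrative assumptions, not from the patent:

```python
# Hypothetical mappings: an identifier prefix resolves to a data type,
# and the type resolves to a level (1 = highest, matching the example
# where performance data is first level and sampling data is third).
TYPE_BY_PREFIX = {"perf": "performance", "log": "log", "sample": "sampling"}
LEVEL_BY_TYPE = {"performance": 1, "log": 2, "sampling": 3}

def determine_level(record: dict) -> int:
    """Extract the carried identifier, determine the type, then the level."""
    identifier = record["id"]                 # identifier carried in the data
    prefix = identifier.split("-", 1)[0]
    data_type = TYPE_BY_PREFIX[prefix]        # type from identifier
    return LEVEL_BY_TYPE[data_type]           # level from type
```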
And S103, distributing buffer space and buffer time for the data to be processed.
The buffer space may be a buffer space in a preset circular buffer queue on the server, for example, a buffer space of a time wheel (which may also be referred to as a grid of the time wheel); or, the buffer space may also be a buffer space in a preset bar buffer queue on the server, for example, each buffer space in the bar buffer queue may be sequentially linked through a pointer in a linked list; alternatively, the buffer space may also be a buffer space in a discrete buffer queue preset on the server, for example, each buffer space in the discrete buffer queue may be sequentially linked through a pointer in a linked list.
The buffering time may be a time period or a time point, for example, the buffering time corresponding to the buffering space a may be a time period from a to B, the buffering time corresponding to the buffering space B may be a time period from B to C, the buffering time corresponding to the buffering space C may be a time period from C to D, the buffering time corresponding to the buffering space D may be a time period from D to e, and the like.
It should be noted that, the form of the buffer queue may be a ring, a bar, or a discrete form, and may also be flexibly set according to actual needs, the size and the position of the buffer space may be flexibly set according to actual needs, and the size of the buffer time may be flexibly set according to actual needs, and the specific content is not limited here.
In some embodiments, allocating buffer space and buffer time for the data to be processed may include: acquiring a time interval of a cache space and delay time of data to be processed; and distributing buffer space and buffer time for the data to be processed according to the time interval and the delay time.
Specifically, a time interval may be set for each buffer space in the buffer queue, and the time intervals of all buffer spaces in the queue may constitute a continuous or a discontinuous time period; optionally, to ensure continuity of the time for processing the data, the intervals of all buffer spaces may constitute a continuous period. For example, taking the buffer queue as a time wheel, the time wheel may be divided into 100 buffer spaces with a total time of 5 minutes. The total time may be uniformly distributed over the buffer spaces, giving each buffer space a cache time of 3 seconds: the time interval corresponding to the first buffer space runs from the current time to the third second, the second from the third second to the sixth second, the third from the sixth second to the ninth second, and so on; the interval corresponding to the hundredth buffer space is the last three seconds (from the 297th second to the 300th second).
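The 100-space, 5-minute time wheel in the example above can be expressed as follows (the parameters are the ones from the example; the function name is illustrative):

```python
# Time-wheel layout from the example: 100 buffer spaces over 5 minutes,
# so each space covers a 3-second interval.
NUM_SLOTS = 100
TOTAL_SECONDS = 5 * 60
SLOT_INTERVAL = TOTAL_SECONDS // NUM_SLOTS    # 3 seconds per buffer space

def slot_interval_bounds(slot_index: int):
    """Return the (start, end) offset in seconds covered by one buffer space,
    with slot 0 covering seconds 0..3 and slot 99 covering seconds 297..300."""
    start = slot_index * SLOT_INTERVAL
    return start, start + SLOT_INTERVAL
```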
For another example, taking the buffer queue as a discrete buffer queue, a plurality of discrete buffer spaces may be linked in sequence through pointers in a linked list to obtain the first through hundredth buffer spaces. A total time of 5 minutes may be uniformly allocated over the buffer spaces (non-uniform allocation is also possible), giving each buffer space a cache time of 3 seconds: the time interval corresponding to the first buffer space runs from the current time to the third second, the second from the third second to the sixth second, and so on; the interval corresponding to the hundredth buffer space is the last three seconds.
The delay time of the data to be processed can also be set flexibly according to actual needs, for example to 6 seconds or 10 seconds. A buffer space may then be allocated for the data according to the time interval and the delay time, and the cache time for buffering the data in that space determined. For example, the processing time of the data may be calculated from the delay time and matched against the time interval of each buffer space to determine which interval it falls in; this determines the buffer space in which the data may be cached, and the cache time is determined from that buffer space's time interval.
In some embodiments, allocating buffer space and buffer time for the data to be processed according to the time interval and the delay time may include: constructing a plurality of buffer spaces to form a ring buffer queue; setting a time interval for each buffer space in the annular buffer queue; obtaining the delay time of the data to be processed and the serial number of the buffer space of the currently-stored data in the annular buffer queue; and distributing buffer space and buffer time for the data to be processed according to the time interval, the delay time and the sequence number.
Specifically, to improve the convenience and flexibility of data buffering, a ring buffer queue formed by a plurality of buffer spaces may be constructed, for example a time wheel, and a time interval set for each buffer space in the queue. For example, when the time wheel includes 100 buffer spaces, its total time of 5 minutes may be uniformly allocated over the buffer spaces, giving each a cache time of 3 seconds: the time interval corresponding to the first buffer space runs from the current time to the third second, the second from the third second to the sixth second, the third from the sixth second to the ninth second, and so on; the interval corresponding to the hundredth buffer space is the last three seconds.
The delay time of the data to be processed can also be acquired and flexibly set according to actual needs, for example to 6 seconds or 10 seconds. When the cache time corresponding to a buffer space in the ring buffer queue arrives, the data in that buffer space is transferred to the preset storage space, so the sequence number, in the ring buffer queue, of the buffer space currently being transferred can be obtained. For example, when the ring buffer queue includes 100 buffer spaces, they can be numbered from 1 to 100; if the buffer space currently being transferred is the 9th, the sequence number obtained is 9. A buffer space can then be allocated for the data to be processed according to the time interval, the delay time, and the sequence number, and the corresponding cache time determined. For example, the cache time of each buffer space can be determined from its time interval (if the interval runs from a to b and the span from a to b is 3 seconds, the cache time is 3 seconds), and the sequence number of the buffer space allocated to the data to be processed is (the sequence number of the buffer space currently being transferred × the cache time of a buffer space + the delay time) ÷ the cache time, modulo the number of buffer spaces in the ring buffer queue; the data to be processed is then cached into the buffer space with that sequence number.
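The allocation formula reconstructed above reduces to advancing the current slot by delay ÷ interval positions and wrapping around the ring, a standard time-wheel computation. A sketch with the example's parameters (3-second interval, 100 buffer spaces); the function name is illustrative:

```python
# Sketch of the slot-allocation formula: target slot =
# (current slot * interval + delay) / interval, modulo the number of slots,
# which simplifies to (current slot + delay // interval) % num_slots.
def allocate_slot(current_slot: int, delay_seconds: int,
                  slot_interval: int = 3, num_slots: int = 100) -> int:
    ticks = delay_seconds // slot_interval        # how many slots the delay spans
    return (current_slot + ticks) % num_slots     # wrap around the ring

# e.g. data arriving while slot 9 is being transferred, with a 6-second
# delay, would be cached in slot (9 + 6 // 3) % 100 = 11
```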
And S104, storing the data to be processed into a cache space according to the grade.
After the buffer space is allocated, the data to be processed may be stored into the buffer space according to its level. For example, a plurality of queues may be set in the same buffer space, each queue storing data to be processed of a different level: high-level data to be processed (i.e., important data) may be stored in an important queue, and low-level data to be processed (i.e., unimportant data) in an unimportant queue, so that the data to be processed can subsequently be extracted from the corresponding queue of the buffer space according to its level and then transferred. Taking the time wheel as an example, two queues can be set in each grid of the time wheel: data with high priority is stored in the first queue, data with low priority in the second queue, and high-priority data to be processed can subsequently be extracted and transferred first.
For another example, the data to be processed may be classified into data groups corresponding to its level, and stored in the cache space grouped by the data group to which it belongs; for example, data group 1 corresponding to the first level, data group 2 corresponding to the second level, data group 3 corresponding to the third level, and so on, may be stored into the cache space in sequence. Subsequently, the data to be processed in the higher-level data group 1 can be extracted and transferred first, and then the data to be processed in data groups 2 and 3 extracted and transferred in turn.
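The per-level queues described above can be sketched roughly as follows; the `BufferSlot` class and its method names are illustrative assumptions, with level 1 taken as the most important.

```python
from collections import deque

class BufferSlot:
    """One grid of the time wheel, holding a FIFO queue per level.

    drain() yields items level by level (level 1 first), so
    higher-priority data is always transferred before lower-priority
    data, as described above.
    """
    def __init__(self, levels=(1, 2, 3)):
        self.queues = {level: deque() for level in levels}

    def store(self, item, level):
        self.queues[level].append(item)

    def drain(self):
        for level in sorted(self.queues):
            while self.queues[level]:
                yield self.queues[level].popleft()

slot = BufferSlot()
slot.store("perf-metric", level=1)
slot.store("sample-error", level=3)
slot.store("transcode-log", level=2)
print(list(slot.drain()))
```

Within a level the queue preserves arrival order; across levels, draining always empties the more important queue first.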
And S105, sequentially transferring the data to be processed from the cache space to a preset storage space according to the sequence of the levels from high to low according to the cache time.
After the data to be processed is stored in the cache space, the server may monitor whether the cache time has arrived. For example, when the cache time is 3 seconds, it may be judged whether the start of those 3 seconds has been reached; if so, the data to be processed may be extracted from the cache space in order of level from high to low and stored into the preset storage space; if not, the method waits until the start time of the cache time arrives before transferring the data.
The preset storage space may be flexibly set according to actual needs; for example, it may be the messaging system Kafka, a high-throughput distributed publish-subscribe messaging system that can process all action stream data in a website. The preset storage space may be a single Kafka instance or a Kafka cluster composed of a plurality of Kafka instances.
In some embodiments, transferring the data to be processed from the cache space to the preset storage space in order of level from high to low, according to the cache time, may include: when the cache time arrives, extracting the data to be processed from the cache space in order of level from high to low to obtain target data; parsing the target data to obtain parsed target data; and sending the parsed target data to the preset storage space at a preset rate.
In order to reduce the data loss rate, the data in the buffer space may be parsed and then transferred to the preset storage space. For example, when the buffer time is reached, the data to be processed may be extracted from the buffer space in order of level from high to low to obtain the target data: the highest-level data to be processed may be extracted first from the highest-level queue, then lower-level data from the lower-level queues in turn, and so on.
Then, the extracted target data is parsed, for example by Internet Protocol (IP) address resolution, to obtain parsed target data, which may include information such as the country, province, city, or operator corresponding to the data. At this time, the parsed target data may be sent to the preset storage space (e.g., Kafka) at a preset rate, which can be flexibly set according to actual needs.
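The rate-limited sending step can be sketched as follows, assuming a simple sleep-based pacer; the real producer interface to the storage space is not specified by the patent, so `send_fn` is a stand-in for it.

```python
import time

def send_at_rate(items, send_fn, rate_per_sec=1000):
    """Forward parsed items to the sink at a bounded rate.

    Sleeps just enough between sends so that no more than
    rate_per_sec items leave per second, smoothing peaks toward
    the downstream store as described above.
    """
    interval = 1.0 / rate_per_sec
    for item in items:
        start = time.monotonic()
        send_fn(item)
        elapsed = time.monotonic() - start
        if elapsed < interval:
            time.sleep(interval - elapsed)

sent = []
send_at_rate(range(5), sent.append, rate_per_sec=10000)
print(sent)
```

A production pacer would more likely use a token bucket so that short bursts are tolerated while the long-run rate stays bounded; the per-item sleep above is the simplest possible variant.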
In some embodiments, after sending the parsed target data to the preset storage space at the preset rate, the data processing method may further include: when the caching time is over, judging whether the data to be processed which is not extracted exists in the caching space; if yes, storing the unextracted data to be processed to a magnetic disk; analyzing the data to be processed in the disk according to the sequence of the levels from high to low, and sending the analyzed data to a preset storage space according to a preset rate.
In order to reduce the pressure on the server's memory when processing data, reduce the data loss rate, and improve the stability of data processing, data not yet processed in the cache space can be temporarily stored on disk. Specifically, while extracting the data to be processed from the buffer space, it may be determined whether the buffer time is over; for example, when the buffer time is 3 seconds, it may be judged whether the end of those 3 seconds has been reached, and if so, it may further be judged whether unextracted data to be processed remains in the buffer space. If such data exists, it can be stored to disk in order of level from high to low. For example, the cache space caches data to be processed of a first, second, and third level; when the cache time is over, if the unextracted data to be processed in the cache space is of the third level, it may be stored to disk; or, if the unextracted data is of the second and third levels, it may be stored to disk in order from the second level to the third level.
Then, the data to be processed on the disk may be parsed in order of level from high to low to obtain the parsed data: for example, the highest-level data to be processed may be extracted from the disk and parsed first, then lower-level data extracted and parsed in turn, and so on. The parsing may be IP parsing, so that parsed data including information such as province, city, or operator is obtained; the parsed data may then be sent to the preset storage space (e.g., Kafka) at a preset rate, which can be flexibly set according to actual needs. Storing data on disk ensures its reliability; meanwhile, distributed storage, such as the Hadoop Distributed File System (HDFS) or the distributed database HBase, can be used for uniform data access. Rate control may be enforced uniformly with an in-memory database or the like, or dynamic capacity control may be performed by the server.
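Spilling unextracted data to disk by level, then reading it back in level order, can be sketched as follows; the one-file-per-level layout and JSON-lines encoding are illustrative assumptions (and the lexicographic file sort below only works while level numbers stay single-digit).

```python
import json
import tempfile
from pathlib import Path

def spill_to_disk(directory, leftovers):
    """Serialize unextracted items to disk, one file per level.

    leftovers maps level -> list of items still in the cache space
    when its cache time ended, as described above.
    """
    for level in sorted(leftovers):
        path = Path(directory) / f"level-{level}.jsonl"
        with path.open("w") as f:
            for item in leftovers[level]:
                f.write(json.dumps(item) + "\n")

def read_back(directory):
    """Yield spilled items in level order for later parsing and sending."""
    for path in sorted(Path(directory).glob("level-*.jsonl")):
        with path.open() as f:
            for line in f:
                yield json.loads(line)

with tempfile.TemporaryDirectory() as d:
    spill_to_disk(d, {3: ["c"], 2: ["a", "b"]})
    print(list(read_back(d)))
```

A separate process can then consume `read_back` and feed the rate-limited sender, which matches the division of labor the patent describes for disk-buffered data.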
It should be noted that, in order to ensure the reliability of data transmission, if sending data to the preset storage space fails, the send is retried; after multiple retries fail, the data may be placed on a gray list, for which a smaller send volume is configured, and sending is reduced until subsequent data can be sent normally, after which the send volume is increased again. By adding mechanisms such as send retry and server monitoring, the reliability of the server and the safety of the data are improved, and the server can be kept within its load-bearing range.
By using the cache space and the disk to buffer data instead of the conventional approach of directly reading and sending it, data peaks are smoothed and the read/write efficiency of the data is greatly improved. Methods such as data classification and caching ensure that data is sent at a steady rate, avoiding data loss during data peaks, improving the stability of the server, and preventing data peaks from crashing it.
In the present application, the data to be processed can be acquired, its level determined, and a cache space and cache time allocated for it; the data to be processed is then stored into the cache space according to its level and transferred from the cache space to the preset storage space in order of level from high to low, according to the cache time. In this scheme, the data to be processed is transferred from the cache space to the preset storage space based on its level and cache time, which avoids the data loss caused when too much data acquired at the same moment is not processed in time, and improves the reliability and flexibility of data processing.
The method described in the above embodiments is further illustrated in detail by way of example.
Referring to fig. 3, fig. 3 is a schematic flow chart of a data processing method according to an embodiment of the present disclosure. The data processing method may be applied to a server. For example, as shown in fig. 4, the server may receive data to be processed through the BOSS data system, read the data in BOSS, classify it, and store the classified data into a time wheel; the data in the time wheel is IP-parsed according to a preset policy and the parsed data is sent to the messaging system Kafka at a preset rate; alternatively, the data in the time wheel may be stored, grouped by level, into a disk, IP-parsed from the disk, and the parsed data sent to Kafka at the preset rate. As described in detail below, the method flow in fig. 3 may include:
S201, the server receives data reported by the client through the BOSS data system.
The data may include sampling data, log data, or performance data, for example, the sampling data may include error information occurring during the process of playing audio and video by the client, the log data may include log-related data such as video transcoding, video frequency band, or video evaluation, and the performance data may include performance-related data such as video playing flow, video playing speed, and definition.
S202, the server acquires the type of the data and grades the data according to the type of the data to obtain the grade of the data.
Because different data have different importance degrees, in order to ensure that the more important data can be processed preferentially, the server can grade the data, the higher the grade of the data is, the higher the importance degree of the data is, and conversely, the lower the grade of the data is, the lower the importance degree of the data is.
In order to improve the convenience of data classification, the server may classify based on the type of the data; for example, a mapping relationship between data types and data levels may be established, so that different data types correspond to different data levels. After receiving the data, the server may determine the type of the data according to the identifier it carries, and determine the level of the data according to its type. For example, for the three types of sampling data, log data, and performance data, the level of sampling data may be set to the third level, the level of log data to the second level, and the level of performance data to the first level. If the type of the received data is log data, the level of the data may be determined to be the second level. For another example, if the received data is related to video playing quality, its type can be determined to be performance data according to the identifier extracted from the data, and its level determined to be the first level.
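The type-to-level mapping described above can be sketched in a few lines; the identifier strings and the dictionary itself are assumptions based on the three example types, not the patent's actual identifiers.

```python
# Illustrative mapping from data type to level, following the example
# above: performance data is most important (level 1), sampling data
# least important (level 3).
TYPE_TO_LEVEL = {"performance": 1, "log": 2, "sampling": 3}

def classify(type_identifier: str) -> int:
    """Derive the level from the type identifier carried in the data."""
    return TYPE_TO_LEVEL[type_identifier]

print(classify("log"))
```

Keeping the mapping in a table rather than in branching code makes it easy to add new types or re-rank existing ones without touching the classification logic.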
S203, the server allocates a cache space for the data from a preset time wheel, and determines the cache time corresponding to the cache space.
In order to improve the convenience and flexibility of data caching, the server may construct a time wheel formed by a plurality of cache spaces, for example, as shown in fig. 5, and set a time interval for each cache space in the time wheel. For example, when the time wheel includes 100 cache spaces (which may also be referred to as grids), the total time of the time wheel, 5 minutes, may be uniformly allocated to the cache spaces, so that the cache time corresponding to each cache space is 3 seconds. That is, the time interval corresponding to the first cache space of the time wheel is from the current time to the 3rd second, the time interval corresponding to the second cache space is from the 3rd second to the 6th second, the time interval corresponding to the third cache space is from the 6th second to the 9th second, and so on, until the time interval corresponding to the hundredth cache space is from the 297th second to the 300th second.
The server can also obtain the delay time of the data, which can be flexibly set according to actual needs, for example, to 10 seconds. Since the data in a cache space is transferred to Kafka when the cache time corresponding to that cache space in the time wheel arrives, the sequence number of the cache space currently being transferred (i.e., the current grid) may be obtained. For example, when the time wheel includes 100 cache spaces, the 100 cache spaces may be numbered from 1 to 100, and if the cache space currently being transferred is the 9th one, the obtained sequence number is 9. At this time, a cache space may be allocated to the data according to the time interval, the delay time, and the sequence number, and the cache time corresponding to that cache space may be determined. In this way, peak data can be distributed uniformly to different time periods (i.e., allocated cache times), so that it can be transmitted evenly to Kafka by means of memory caching, disk caching, and the like, provided that the rate at which Kafka can receive data, averaged at minute granularity, exceeds the total amount of data that needs to be written each minute.
For example, the time wheel here is a time wheel that stores data, with a linked list as its underlying structure. The time wheel is composed of a plurality of grids (i.e., cache spaces), each of which is assigned a corresponding time and represents a basic time span (i.e., cache time) relative to the current time. The total number of grids on the time wheel can be denoted wheelSize, and the total time span (interval) of the whole time wheel is the basic time span tickMs of each grid multiplied by the total number of grids wheelSize. The time wheel also has a dial pointer (currentTime) indicating the current time of the time wheel; currentTime is an integral multiple of tickMs and divides the time wheel into an expired part and an unexpired part. The grid currently pointed to by currentTime belongs to the expired part, meaning that it has just expired, and all tasks (TimerTaskList) corresponding to that grid need to be processed.
Taking as an example a time wheel whose total time is 5 minutes, with 3 seconds as the basic time span of one grid and 100 grids in total, the grid into which data is inserted is: (delayMs + startMs) / tickMs % wheelSize, where delayMs is the delay time, startMs is the time corresponding to the grid currently pointed to by the dial pointer (the current grid number times tickMs), and wheelSize is the total number of grids of the time wheel; that is, when data is inserted, its allowed delay time is added to the current dial position and the result is taken modulo the wheel size. For example, if the delay time is 10 seconds, the sequence number of the current grid is 5, the time wheel includes 100 grids, and the basic time span of each grid is 3 seconds, the grid into which the data is inserted is: (5 × 3 + 10) / 3 % 100 = 8, i.e., the data is stored in the 8th grid.
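The worked example can be checked directly against the formula, taking startMs as the current grid number times the per-grid span (the variable names below follow the patent's tickMs/wheelSize naming, applied to seconds for the example):

```python
def insert_grid(delay_s: int, current_grid: int, tick_s: int, wheel_size: int) -> int:
    """Grid index per the formula above: (delayMs + startMs) / tickMs % wheelSize."""
    start_s = current_grid * tick_s  # time at the dial pointer
    return (start_s + delay_s) // tick_s % wheel_size

# The worked example: 10 s delay, pointer at grid 5, 3 s grids, 100 grids.
print(insert_grid(delay_s=10, current_grid=5, tick_s=3, wheel_size=100))
```

Integer division discards the remainder, so (15 + 10) / 3 rounds down to 8 and the modulo leaves it unchanged, matching the patent's result of the 8th grid.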
And S204, the server stores the data into the allocated cache space in the time wheel according to the grade of the data.
After the cache space is allocated, the server may store the data into the allocated cache space in the time wheel according to its level. For example, as shown in fig. 5, data with a high level (i.e., important data) may be stored into the important queue of the cache space, and data with a low level (i.e., unimportant data) into the unimportant queue, so that the data can subsequently be extracted from the corresponding queue of the cache space according to its level and transferred. That is, in a data classification mode, two queues can be kept in each grid of the time wheel, with high-priority data placed in the higher-level queue and low-priority data in the lower-level queue.
In this embodiment, data can be divided into important and unimportant data by classification: important data is put into the important queue, unimportant data into the unimportant queue, the data is sent according to its delay time, and data that has not been sent in time is stored to disk. The data on the disk can be sent by another process; when sending, data with high priority is sent first, and after IP parsing the sending speed is controlled by a preset rate controller.
Using the time wheel to store data replaces the existing approach of directly reading and sending it, and the read/write and expiration advantages of the time wheel greatly improve the read/write efficiency of the data. Data classification is realized by keeping a plurality of queues in a single grid of the time wheel, and methods such as classification and caching guarantee that data is sent at a steady rate, improving the stability of the server and eliminating the impact of data peaks on it.
S205, the server judges whether the caching time has been reached; if yes, step S206 is executed; if not, it waits.
S206, the server sequentially extracts data from the cache space according to the sequence of the levels from high to low, and performs IP analysis on the data to obtain analyzed data.
After the data is stored in the cache space, the server may monitor whether the cache time has arrived. For example, when the cache time is 3 seconds, it may be judged whether the start of those 3 seconds has been reached; if so, the data may be extracted from the cache space of the time wheel in order of level from high to low (this may also be referred to as data removal), and after extraction the data may be IP-parsed to obtain the parsed data. If not, the server waits until the start time of the cache time arrives before transferring the data.
For data removal, the data can be indexed in a doubly linked list; to remove a piece of data, the previous pointer in the doubly linked list need only be made to point to the next piece of data, and the next piece's pointer made to point back to the previous piece. A batch of data can of course be removed in one pass in the same way.
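The constant-time unlink described above can be sketched as follows; the `Node` class is an illustrative minimal structure, not the patent's actual index.

```python
class Node:
    """Doubly linked list node used to index buffered data."""
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None

def remove(node):
    """Unlink one node in O(1): the previous node is pointed at the
    next node, and the next node back at the previous, as described
    above. No traversal of the list is needed.
    """
    if node.prev:
        node.prev.next = node.next
    if node.next:
        node.next.prev = node.prev
    node.prev = node.next = None

a, b, c = Node("a"), Node("b"), Node("c")
a.next, b.prev, b.next, c.prev = b, a, c, b
remove(b)
print(a.next.value, c.prev.value)
```

Because each removal only rewires two pointers, a whole batch can be removed by splicing the list around the batch's first and last nodes at the same constant cost.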
Regarding data priority: to guarantee the reliability of all data, everything to be sent is delivered to the back-end Kafka cluster, but some data has high real-time requirements while other data does not and can tolerate a certain delay. The data can therefore be extracted from the cache space in order of level from high to low and IP-parsed, so that the parsed data can be sent to Kafka.
And S207, the server sends the analyzed data to the message system Kafka according to a preset rate.
In order to improve the stability of the server, reduce the data loss rate, and avoid server crashes caused by second-level data peaks, the data is stored on the time wheel. By exploiting the read/write and expiration advantages of the time wheel, requests involving large amounts of data can be responded to and processed in time, improving the efficiency of data processing, while the data is sent at a stable (i.e., preset) rate, ensuring that the server is not crashed by second-level data peaks.
Because the reported data is valuable, the current operating condition of the system can be determined through real-time data processing and analysis. Since the data retention time of the BOSS data middleware is limited, data would be lost if its processing were delayed, so BOSS data needs to be fed into the Kafka message middleware as required for subsequent analysis and processing.
To move data from BOSS into Kafka, the BOSS data must be subscribed to, messages read in real time, and the messages written into Kafka through the Kafka interface. The subscription mode is push: BOSS pushes data to the BOSS Agent on the subscribing host, the BOSS Agent writes the received data into shared memory, and other services read the data through an Application Programming Interface (API). For example, data may be read through the BOSS API, given relevant processing (e.g., IP parsing, filtering, or sampling), and then written into the Kafka cluster.
The data in shared memory is first read, then IP-parsed, and the parsed data is written into Kafka. Each piece of data takes a certain processing time (reading, parsing, sending, and so on), and these steps form a serial data-processing chain, so each piece of data must complete its processing within a certain time before it reaches Kafka. Processing keeps up well at low data peaks, but at high peaks writing into Kafka becomes much slower (writes slow down when the Kafka load is high), so data cannot be written in time. Data from the BOSS system is lost if it is not read promptly. Moreover, even when peak data can all be written into the Kafka cluster, Kafka may fail to complete processing immediately because of disk or network limits, causing memory overflow or hung processes; with a small background cluster and high overall load, a cluster avalanche easily occurs. To avoid data loss, the present application buffers data into the time wheel and transfers it to Kafka from there. In addition, using the time wheel not only makes data reads and writes very fast with low time complexity, but its stable timekeeping also lets the server know exactly when each batch of data expires.
S208, the server judges whether the caching time is over; if yes, step S209 is executed; if not, step S206 is executed.
S209, the server judges whether the data which are not extracted exist in the cache space; if yes, go to step S210; if not, step S212 is executed, and when the buffering time of the next buffering space of the time wheel is reached, the data in the next buffering space is subjected to IP analysis, and the analyzed data is sent to the message system Kafka according to the preset rate.
S210, the server stores the data which are not extracted to a Disk.
The server can temporarily store data not yet processed in the cache space into the Disk, i.e., the expired data is serialized to disk. Specifically, while extracting data from the cache space, it may be determined whether the caching time is over; for example, when the caching time is 3 seconds, it may be judged whether the end of those 3 seconds has been reached, and if so, it may further be judged whether unextracted data remains in the cache space. If such data exists, it can be stored to disk in order of level from high to low. For example, the cache space caches data of a first, second, and third level; when the caching time is over, if the unextracted data in the cache space is of the second and third levels, it may be stored to disk in order from the second level to the third level.
In terms of data processing performance, the disk write speed is high, for example 200 MB/s, which exceeds the read volume of even the highest data peak; therefore, even when data cannot be written into the Kafka cluster, it can be written to disk sequentially in a short time.
After the data is stored on disk, a subsequent independent process reads it from the disk in order of level from high to low, performs operations such as IP parsing on it, meters it against the data being sent in real time, and sends the parsed data to the Kafka cluster at a constant rate. This even, steady way of sending data to Kafka avoids Kafka going down during transient second-granularity data peaks, buffers data that cannot be sent in time a second time by means of the disk, and greatly improves the stability of the background service.
It should be noted that, regarding data expiration, the expiration times of all data in the same cache space of the time wheel are identical: as soon as the cache time corresponding to the cache space is over, all remaining data in it is marked as expired and processed, i.e., stored to disk, so that storing the data on disk ensures its reliability.
S211, the server carries out IP analysis on the data in the disk according to the sequence of the levels from high to low, and sends the analyzed data to the message system Kafka according to a preset rate.
The server may sequentially parse the data in the disk from high to low in the order of the levels to obtain parsed data, for example, the highest level data may be preferentially extracted from the disk, and then, the lower level data may be sequentially extracted, and so on. And IP analysis is carried out on the extracted data, and the analyzed data can be sent to Kafka according to a preset rate, wherein the preset rate can be flexibly set according to actual needs.
In this embodiment, an intermediate buffer using a time wheel for data unloading is constructed: data read from the BOSS system is buffered into the time wheel and then, according to its buffer time, either cached to disk or sent to Kafka, so that data can be written into Kafka evenly and short-term sending peaks are avoided. The time wheel is changed from storing tasks to storing data, and in keeping with its characteristics the data is given memory caching, disk caching, priority classification, data operations, background service monitoring, and so on, so that high-priority data arrives first, the possibility of background-service downtime is reduced, data is sent at a uniform rate, second-level peak data is delivered steadily to the background service (e.g., Kafka), the probability of data loss is reduced, and the running stability of the server is improved.
It can be understood that, besides the data-transfer process, the time-wheel approach can also be used in a recommendation system: the data to be recommended is placed in the time wheel and then cleared periodically according to its priority and eviction policy (if the recommendation system has high real-time requirements, evicted data directly falls back to a default recommendation), guaranteeing the response speed of the data.
It should be noted that, as a monitoring and recovery mechanism for the server, when a node in the Kafka cluster goes down, data may fail to be written to that node; reselection may then be triggered, i.e., the data is sent to another Kafka node instead. In addition, the server can periodically monitor whether each Kafka service is alive; if not, it is removed from the sending list and then checked periodically, and sending to it resumes once it can receive again. After a Kafka node in the cluster goes down, it can be restarted, and after running for a while the Kafka service recovers. To guarantee operability and maintainability, the total number of Kafka nodes in the server should be kept large enough that even if one goes down, the remaining nodes can still receive the data.
In the present application, the server grades the received data by type to obtain its level, allocates a cache space for the data based on the time wheel, determines the cache time corresponding to that cache space, and then stores the data into the allocated cache space in the time wheel according to its level. When the cache time arrives, the data in the cache space is IP-parsed in order of level from high to low and the parsed data is sent to Kafka at a preset rate; data not processed within the cache time is stored to disk, IP-parsed there, and likewise sent to Kafka at the preset rate. In this scheme, data is stored on the time wheel and its write, execution, and expiration mechanics are applied to the data, improving the write and read speed of the cache space; the data peak period is spread along the time dimension, with data that cannot be sent in time stored periodically into the cache spaces of the time wheel, guaranteeing the sending speed and quality of the data. Data levels, caching, and similar measures ensure that high-level data is sent first, so data loss during peak periods is avoided; meanwhile, the introduction of the disk cache greatly improves data safety, and storing data on disk reduces the data loss rate.
In order to better implement the data processing method provided by the embodiment of the present application, an embodiment of the present application further provides a device based on the data processing method. The terms have the same meanings as in the data processing method, and for implementation details reference may be made to the description in the method embodiment.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present disclosure, wherein the data processing apparatus 300 may include an obtaining unit 301, a determining unit 302, an allocating unit 303, a storage unit 304, a dump unit 305, and the like.
The acquiring unit 301 is configured to acquire data to be processed.
A determining unit 302, configured to determine a level of the to-be-processed data.
An allocating unit 303, configured to allocate a buffer space and a buffer time for the to-be-processed data.
A storage unit 304, configured to store the data to be processed into the cache space according to the level.
A dump unit 305, configured to dump the to-be-processed data from the cache space to a preset storage space in order of level from high to low, according to the cache time.
In some embodiments, the determining unit 302 is specifically configured to: acquire an identifier carried in the to-be-processed data; determine the type of the to-be-processed data according to the identifier; and determine the level of the to-be-processed data according to the type.
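As a sketch of how the determining unit 302 might map an identifier to a type and then to a level, the concrete identifier formats, type names, and level values below are assumptions, since the embodiment does not fix them:

```python
# Hypothetical identifier-prefix-to-type and type-to-level tables; the
# embodiment only says the identifier determines the type, and the type
# determines the level, so these concrete mappings are illustrative.
TYPE_BY_ID_PREFIX = {"perf": "performance", "log": "log", "smp": "sampling"}
LEVEL_BY_TYPE = {"performance": 1, "log": 2, "sampling": 3}  # 1 = highest level

def level_of(record: dict) -> int:
    """Determine the level of to-be-processed data from the identifier
    carried in the record (identifier -> type -> level)."""
    identifier = record["id"]                # identifier carried in the data
    prefix = identifier.split("-", 1)[0]     # e.g. "log-20190729-001" -> "log"
    data_type = TYPE_BY_ID_PREFIX[prefix]    # identifier determines the type
    return LEVEL_BY_TYPE[data_type]          # type determines the level
```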
In some embodiments, the allocating unit 303 may include an obtaining module, an allocating module, and the like, and specifically may be as follows:
an obtaining module, configured to obtain a time interval of the cache space and a delay time of the to-be-processed data;
and an allocating module, configured to allocate a buffer space and a buffer time for the to-be-processed data according to the time interval and the delay time.
In some embodiments, the allocating module is specifically configured to: construct a plurality of buffer spaces to form a ring buffer queue; set a time interval for each buffer space in the ring buffer queue; obtain the delay time of the to-be-processed data and the sequence number of the buffer space currently being written in the ring buffer queue; and allocate a buffer space and a buffer time for the to-be-processed data according to the time interval, the delay time, and the sequence number.
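The slot arithmetic implied by the allocating module can be sketched as follows; the class and field names are illustrative, and the placement formula (current slot plus delay divided by the tick interval, modulo the ring size) is the standard time-wheel rule rather than a quotation from the embodiment:

```python
class TimeWheel:
    """Minimal time-wheel sketch (illustrative, not the embodiment's exact
    implementation): a ring of buffer spaces, each covering one tick
    interval; data with a delay is placed delay // tick slots ahead of
    the slot currently being processed."""

    def __init__(self, slots: int, tick_seconds: int):
        self.buffers = [[] for _ in range(slots)]  # ring buffer queue
        self.tick = tick_seconds                   # time interval of each buffer space
        self.current = 0                           # sequence number of the current slot

    def allocate(self, delay_seconds: int) -> int:
        # Target slot = (current slot + delay / interval) mod ring size;
        # the slot's position also fixes the buffer time.
        return (self.current + delay_seconds // self.tick) % len(self.buffers)

    def store(self, record, delay_seconds: int) -> int:
        # Store the record into its allocated buffer space and return the slot.
        slot = self.allocate(delay_seconds)
        self.buffers[slot].append(record)
        return slot
```

For example, with 8 slots and a 5-second tick, data with a 12-second delay lands two slots ahead of the current one, and the index wraps around the ring as the wheel advances.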
In some embodiments, the dump unit 305 is specifically configured to: when the cache time arrives, sequentially extract the to-be-processed data from the cache space in order of level from high to low to obtain target data; analyze the target data to obtain analyzed target data; and send the analyzed target data to a preset storage space at a preset rate.
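A minimal sketch of this extract-by-level, rate-limited sending step follows; the `parse` stub stands in for the IP analysis and `send` for the Kafka producer, both of which are assumptions:

```python
import heapq
import time

def drain(buffer_records, send, rate_per_second: float):
    """Sketch of the dump behaviour when the cache time arrives: extract
    (level, record) pairs from a buffer space with the highest level
    (smallest level number) first, analyze each one, and send the result
    at a preset rate. `send` stands in for the Kafka producer."""
    heap = [(level, i, rec) for i, (level, rec) in enumerate(buffer_records)]
    heapq.heapify(heap)                  # min-heap: level 1 is extracted first
    interval = 1.0 / rate_per_second     # preset rate -> gap between sends
    while heap:
        level, _, rec = heapq.heappop(heap)
        send(parse(rec))                 # analyzed target data -> storage space
        time.sleep(interval)             # throttle to the preset rate

def parse(rec):
    # Placeholder for the IP analysis step; the real step resolves the IP
    # carried in the data, while this stub only tags the record as parsed.
    return {**rec, "parsed": True}
```

The index `i` in the heap tuples is only a tie-breaker so that records of equal level keep their arrival order and never need to be compared directly.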
In some embodiments, the dump unit 305 is further configured to: when the cache time ends, determine whether there is unextracted to-be-processed data in the cache space; if so, store the unextracted to-be-processed data to a disk; and analyze the to-be-processed data in the disk in order of level from high to low, and send the analyzed data to a preset storage space at a preset rate.
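The disk fallback can be sketched as below; the JSON-lines file format is an assumption made for illustration, since the embodiment only specifies that unextracted data is persisted to a disk and later analyzed and sent in level order:

```python
import json
import os

def spill_to_disk(unextracted, path):
    """Persist (level, record) pairs still in the buffer space when the
    cache time ends, so they survive the data peak. The JSON-lines
    format here is an illustrative choice, not specified by the
    embodiment."""
    with open(path, "w") as f:
        for level, rec in unextracted:
            f.write(json.dumps({"level": level, "rec": rec}) + "\n")

def drain_disk(path, send):
    # Read the spilled records back, sort so that high-level data (the
    # smaller level number) goes first, and forward each record to the
    # preset storage space via `send`; then delete the spill file.
    with open(path) as f:
        rows = [json.loads(line) for line in f]
    for row in sorted(rows, key=lambda r: r["level"]):
        send(row["rec"])
    os.remove(path)
```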
In some embodiments, the obtaining unit 301 is specifically configured to: and receiving sampling data, log data or performance data reported by a client or other servers.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
In the embodiment of the present application, the obtaining unit 301 obtains the data to be processed, the determining unit 302 determines the level of the data to be processed, and the allocating unit 303 allocates a buffer space and a buffer time for it; the storage unit 304 then stores the data to be processed into the buffer space according to its level, and the dump unit 305 dumps the data to be processed from the buffer space to the preset storage space in order of level from high to low, according to the buffer time. With this scheme, the data to be processed can be dumped from the cache space to the preset storage space based on its level and the cache time, which improves the reliability and flexibility of data processing.
The embodiment of the present application further provides a server, as shown in fig. 7, which shows a schematic structural diagram of the server according to the embodiment of the present application, specifically:
the server may include components such as a processor 401 of one or more processing cores, memory 402 of one or more computer-readable storage media, a power supply 403, and an input unit 404. Those skilled in the art will appreciate that the server architecture shown in FIG. 7 is not meant to be limiting, and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components. Wherein:
the processor 401 is a control center of the server, connects various parts of the entire server using various interfaces and lines, and performs various functions of the server and processes data by running or executing software programs and/or modules stored in the memory 402 and calling data stored in the memory 402, thereby performing overall monitoring of the server. Optionally, processor 401 may include one or more processing cores; preferably, the processor 401 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 401.
The memory 402 may be used to store software programs and modules, and the processor 401 executes various functional applications and data processing by running the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created according to the use of the server, and the like. Further, the memory 402 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 access to the memory 402.
The server further includes a power supply 403 for supplying power to each component, and preferably, the power supply 403 may be logically connected to the processor 401 through a power management system, so as to implement functions of managing charging, discharging, and power consumption through the power management system. The power supply 403 may also include any component of one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
The server may also include an input unit 404, the input unit 404 being operable to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the server may further include a display unit and the like, which will not be described in detail herein. Specifically, in this embodiment, the processor 401 in the server loads the executable file corresponding to the process of one or more application programs into the memory 402 according to the following instructions, and the processor 401 runs the application program stored in the memory 402, so as to perform the following operations:
acquiring data to be processed; determining the level of the data to be processed; allocating a buffer space and a buffer time for the data to be processed; storing the data to be processed into the cache space according to the level; and dumping the data to be processed from the cache space to a preset storage space in order of level from high to low, according to the cache time.
In some embodiments, in determining the rank of the data to be processed, the processor 401 further performs: acquiring an identifier carried in data to be processed; determining the type of the data to be processed according to the identifier; and determining the grade of the data to be processed according to the type.
In some embodiments, when allocating the buffer space and the buffer time for the data to be processed, the processor 401 further performs: acquiring a time interval of a cache space and delay time of data to be processed; and distributing buffer space and buffer time for the data to be processed according to the time interval and the delay time.
In some embodiments, when allocating the buffer space and the buffer time for the data to be processed according to the time interval and the delay time, the processor 401 further performs: constructing a plurality of buffer spaces to form a ring buffer queue; setting a time interval for each buffer space in the annular buffer queue; obtaining the delay time of the data to be processed and the serial number of the buffer space of the currently-stored data in the annular buffer queue; and distributing buffer space and buffer time for the data to be processed according to the time interval, the delay time and the sequence number.
In some embodiments, when the data to be processed is sequentially transferred from the cache space to the preset storage space in the order from high to low according to the cache time, the processor 401 further performs: when the caching time is up, sequentially extracting data to be processed from the caching space according to the sequence of the levels from high to low to obtain target data; analyzing the target data to obtain analyzed target data; and sending the analyzed target data to a preset storage space according to a preset rate.
In some embodiments, after sending the parsed target data to the preset storage space at the preset rate, the processor 401 further performs: when the caching time is over, judging whether the data to be processed which is not extracted exists in the caching space; if yes, storing the unextracted data to be processed to a magnetic disk; analyzing the data to be processed in the disk according to the sequence of the levels from high to low, and sending the analyzed data to a preset storage space according to a preset rate.
In the above embodiments, the descriptions of the embodiments have respective emphasis, and parts that are not described in detail in a certain embodiment may refer to the above detailed description of the data processing method, and are not described herein again.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by a computer program, which may be stored in a computer-readable storage medium and loaded and executed by a processor, or by related hardware controlled by the computer program.
To this end, the present application provides a storage medium, in which a computer program is stored, where the computer program can be loaded by a processor to execute any one of the data processing methods provided in the present application. For example, the computer program is loaded by a processor and may perform the following steps:
acquiring data to be processed; determining the level of the data to be processed; allocating a buffer space and a buffer time for the data to be processed; storing the data to be processed into the cache space according to the level; and dumping the data to be processed from the cache space to a preset storage space in order of level from high to low, according to the cache time.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
Wherein the storage medium may include: read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Since the instructions stored in the storage medium can execute the steps in any data processing method provided in the embodiments of the present application, beneficial effects that can be achieved by any data processing method provided in the embodiments of the present application can be achieved, which are detailed in the foregoing embodiments and will not be described herein again.
The foregoing detailed description is directed to a data processing method, an apparatus, a server, and a storage medium provided in the embodiments of the present application, and specific examples are applied in the present application to explain the principles and implementations of the present application, and the descriptions of the foregoing embodiments are only used to help understand the method and the core ideas of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (10)

1. A data processing method, comprising:
acquiring data to be processed;
determining the grade of the data to be processed;
distributing a buffer space and buffer time for the data to be processed;
storing the data to be processed into the cache space according to the grade;
and sequentially transferring the data to be processed from the cache space to a preset storage space according to the cache time and the sequence of the grades from high to low.
2. The data processing method of claim 1, wherein the determining the rank of the data to be processed comprises:
acquiring an identifier carried in the data to be processed;
determining the type of the data to be processed according to the identification;
and determining the grade of the data to be processed according to the type.
3. The data processing method according to claim 1, wherein the allocating buffer space and buffer time for the data to be processed comprises:
acquiring a time interval of the cache space and the delay time of the data to be processed;
and distributing buffer space and buffer time for the data to be processed according to the time interval and the delay time.
4. The data processing method according to claim 3, wherein the allocating buffer space and buffer time to the data to be processed according to the time interval and the delay time comprises:
constructing a plurality of buffer spaces to form a ring buffer queue;
setting a time interval for each buffer space in the ring buffer queue;
obtaining the delay time of the data to be processed and the sequence number of the buffer space currently being written in the ring buffer queue;
and distributing buffer space and buffer time for the data to be processed according to the time interval, the delay time and the sequence number.
5. The data processing method according to claim 1, wherein the sequentially transferring the data to be processed from the cache space to a preset storage space in the order of the levels from high to low according to the cache time comprises:
when the cache time is up, sequentially extracting the data to be processed from the cache space according to the sequence of the levels from high to low to obtain target data;
analyzing the target data to obtain analyzed target data;
and sending the analyzed target data to a preset storage space according to a preset rate.
6. The data processing method of claim 5, wherein after sending the parsed target data to a preset storage space at a preset rate, the method further comprises:
when the cache time is over, judging whether the cache space has data to be processed which is not extracted;
if yes, storing the unextracted data to be processed to a magnetic disk;
and analyzing the data to be processed in the disk according to the sequence of the levels from high to low, and sending the analyzed data to a preset storage space according to a preset rate.
7. The data processing method according to any one of claims 1 to 6, wherein the acquiring the data to be processed comprises:
and receiving sampling data, log data or performance data reported by a client or other servers.
8. A data processing apparatus, comprising:
the acquisition unit is used for acquiring data to be processed;
the determining unit is used for determining the grade of the data to be processed;
the distribution unit is used for distributing cache space and cache time for the data to be processed;
the storage unit is used for storing the data to be processed into the cache space according to the grade;
and the dump unit is used for dumping the data to be processed from the cache space to a preset storage space in order of level from high to low according to the cache time.
9. A server, characterized by comprising a processor and a memory, in which a computer program is stored, the processor executing the data processing method according to any one of claims 1 to 7 when calling the computer program in the memory.
10. A storage medium for storing a computer program which is loaded by a processor to perform the data processing method of any one of claims 1 to 7.
CN201910689074.3A 2019-07-29 2019-07-29 Data processing method, device, server and storage medium Pending CN112306369A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910689074.3A CN112306369A (en) 2019-07-29 2019-07-29 Data processing method, device, server and storage medium

Publications (1)

Publication Number Publication Date
CN112306369A true CN112306369A (en) 2021-02-02

Family

ID=74330072

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910689074.3A Pending CN112306369A (en) 2019-07-29 2019-07-29 Data processing method, device, server and storage medium

Country Status (1)

Country Link
CN (1) CN112306369A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113204382A (en) * 2021-05-12 2021-08-03 北京百度网讯科技有限公司 Data processing method, data processing device, electronic equipment and storage medium
CN113220442A (en) * 2021-07-01 2021-08-06 北京轻松筹信息技术有限公司 Data scheduling method and device and electronic equipment
CN113791739A (en) * 2021-09-26 2021-12-14 重庆紫光华山智安科技有限公司 Data unloading method, system, electronic equipment and readable storage medium
CN117009439A (en) * 2023-10-07 2023-11-07 腾讯科技(深圳)有限公司 Data processing method, device, electronic equipment and storage medium
CN113204382B (en) * 2021-05-12 2024-05-10 北京百度网讯科技有限公司 Data processing method, device, electronic equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination