CN111309748A - Data updating method, device and server

Info

Publication number: CN111309748A
Application number: CN202010171328.5A
Authority: CN
Inventors: 郭家林; 王利
Original assignee: Individual
Current assignee: Individual
Priority date: 2020-03-12
Filing date: 2020-03-12
Publication date: 2020-06-19

Abstract

The invention provides a data updating method, a device and a server. Service types are sorted in descending order of the proportion that each service type accounts for among at least part of the service types, yielding a first sorting sequence; the data groups corresponding to the service types are then sorted according to the first sorting sequence, yielding a second sorting sequence; and a data recovery frequency is allocated to each data group according to its relative position in the second sorting sequence, so that service data groups are recovered and updated based on the data recovery frequency. The server therefore does not need to first search for the corresponding data group upon receiving a target service request and only then recover the service data group. Instead, the stored data corresponding to the service terminal is dynamically recovered and updated according to the historical service requests of the service terminal, which reduces the time the server spends searching for and recovering data in response to a service processing request and helps ensure that the service terminal obtains the processing result returned by the big data server in time.

Description

Data updating method, device and server

Technical Field

The invention relates to the technical field of big data processing, in particular to a data updating method, a data updating device and a server.

Background

Big data plays an increasingly important role in production and everyday life, and can be used to carry out a wide range of business processing. A common big-data-based service handling mechanism is the request-response mechanism: the service terminal initiates a service processing request to the big data server, the big data server responds to the request by carrying out the corresponding data service processing, and a processing result is then returned to the service terminal. However, as the number of service terminals connected to the big data server increases, it becomes difficult for a service terminal to obtain the processing result returned by the big data server in time after sending a service processing request.

Disclosure of Invention

To address the above problem, the present invention provides a data updating method, a data updating apparatus and a server.

In a first aspect of the embodiments of the present invention, a data updating method is provided, which is applied to a server, where the server communicates with a service terminal, and the method includes:

reading at least part of first service requests of a service terminal from an operation log according to a set time interval, and determining the service type of each first service request, wherein the first service requests are service requests sent to a server by the service terminal before the current moment, and the service types are used for representing data service processing types corresponding to the first service requests;

determining the proportion of each service type in at least part of the service types, and sorting the service types in descending order of proportion to obtain a first sorting sequence;

determining a data group corresponding to each service type from a preset data set, and sorting the data groups according to the first sorting sequence to obtain a second sorting sequence;

assigning a data recovery frequency to each data group according to the relative position of each data group in the second sorting sequence;

calling at least part of the data group to a data recovery thread according to the data recovery frequency so as to recover the data of at least part of the data group through the data recovery thread to obtain at least part of the service data group;

and updating at least part of the target data set to a service data interval and returning to the step of calling at least part of the data group to a data recovery thread according to the data recovery frequency so as to realize the cyclic updating of the service data group in the service data interval.

Optionally, the invoking at least part of the data group to the data recovery thread according to the data recovery frequency to perform data recovery on at least part of the data group through the data recovery thread to obtain at least part of the service data group includes:

clustering the data groups according to the data recovery frequency corresponding to each data group to obtain a plurality of categories, wherein each category comprises at least two data groups;

assigning a category identifier to each category according to the data recovery frequency of the data groups included under each category, the category identifiers having a hierarchical order from high to low;

determining the load capacity of the data recovery thread, wherein the load capacity is used for representing the maximum number of data groups subjected to data recovery simultaneously;

and calling the data group under at least one category to the data recovery thread according to the load capacity and the category identification so as to recover the data group under at least one category through the data recovery thread to obtain a service data group corresponding to the data group under at least one category.

Optionally, the invoking a data group under at least one category to the data recovery thread according to the load capacity and the category identifier includes:

according to the maximum number of data groups which correspond to the load capacity and are subjected to data recovery at the same time, carrying out interval division on the category identification to obtain a plurality of category intervals, wherein the accumulated number of the data groups corresponding to each category interval is less than or equal to the maximum number;

setting an interval identifier for each category interval according to the average value of the category identifiers corresponding to the categories in each category interval;

and circularly calling the data group in each category interval to the data recovery thread according to the sequence of interval identifications from large to small.

Optionally, the performing, by the data recovery thread, data recovery on at least part of the data group to obtain at least part of the service data group includes:

for each data group in at least part of data groups, obtaining the characteristic distribution of the data group according to the data characteristics of the data strings in the data group;

determining a position weight of each feature node in the feature distribution, wherein the position weight is used for characterizing the distance between each feature node and a central node of the feature distribution;

sequentially loading the data strings in the data group into the data recovery thread according to the sequence of the position weights from big to small to obtain the feature vector of each data string;

running the data recovery thread to determine an association value for each vector value in each feature vector, wherein the matching degree between the association value and the vector value is greater than or equal to a set value, obtaining an association vector according to the association value corresponding to each feature vector, and performing dimension expansion processing on each feature vector based on the association vector to obtain a target vector; extracting each target vector from the data recovery thread and determining a service data string corresponding to each target vector value in each target vector based on a preset vector value database; and combining the service data strings into a corresponding service data group of the data group according to the arrangement sequence of the target vector values in the target vector.

In a second aspect of the embodiments of the present invention, there is provided a data updating apparatus, applied to a server, where the server communicates with a service terminal, the apparatus including:

the type determining module is used for reading at least part of first service requests of the service terminal from the operation log according to a set time interval and determining the service type of each first service request, wherein the first service requests are service requests sent to the server by the service terminal before the current moment, and the service types are used for representing data service processing types corresponding to the first service requests;

the first sorting module is used for determining the proportion of each service type in at least part of the service types and sorting the service types in descending order of proportion to obtain a first sorting sequence;

the second sorting module is used for determining a data group corresponding to each service type from a preset data set and sorting the data groups according to the first sorting sequence to obtain a second sorting sequence;

the frequency allocation module is used for allocating a data recovery frequency to each data group according to the relative position of each data group in the second sorting sequence;

the data retrieving module is used for calling at least part of the data groups to a data recovery thread according to the data recovery frequency so as to carry out data recovery on at least part of the data groups through the data recovery thread to obtain at least part of the service data groups;

and the cyclic updating module is used for updating at least part of the target data set to a service data interval and returning to the step of calling at least part of the data group to a data recovery thread according to the data recovery frequency so as to realize cyclic updating of the service data group in the service data interval.

Optionally, the data retrieving module is specifically configured to:

In a third aspect of the embodiments of the present invention, a server is provided, including: a processor and a memory and bus connected to the processor; the processor and the memory are communicated with each other through the bus; the processor is used for calling the computer program in the memory to execute the data updating method.

In a fourth aspect of the embodiments of the present invention, there is provided a computer-readable storage medium on which a program is stored, the program implementing the data updating method described above when executed by a processor.

The data updating method, apparatus and server provided by the embodiments of the invention sort the service types in descending order of the proportion of each service type in at least part of the service types to obtain a first sorting sequence, and sort the data groups corresponding to the service types according to the first sorting sequence to obtain a second sorting sequence. A data recovery frequency is then allocated to each data group according to its relative position in the second sorting sequence, so that the service data groups are recovered and updated based on the data recovery frequency. The server therefore does not need to first search for the corresponding data group according to a target service request and then recover the service data group. Instead, the stored data corresponding to the service terminal is dynamically recovered and updated according to the historical service requests of the service terminal, which reduces the time the server spends searching for and recovering data in response to a service processing request, so that the service terminal can obtain the processing result in time after sending the service processing request to the big data server.

Drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and should therefore not be regarded as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.

Fig. 1 is a flowchart of a data updating method according to an embodiment of the present invention.

Fig. 2 is a functional block diagram of a data updating apparatus according to an embodiment of the present invention.

Fig. 3 is a schematic product module diagram of a server according to an embodiment of the present invention.

Reference numerals:

200 - server;

201 - data updating apparatus; 2011 - type determining module; 2012 - first sorting module; 2013 - second sorting module; 2014 - frequency allocation module; 2015 - data retrieving module; 2016 - cyclic update module;

211 - processor; 212 - memory; 213 - bus.

Detailed Description

Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

To aid understanding, the technical solutions of the present invention are described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific features of the embodiments and examples are detailed illustrations of the technical solutions rather than limitations on them, and that the technical features of the embodiments and examples may be combined with one another where no conflict arises.

As the number of service terminals connected to the big data server and the amount of data stored on the big data server both increase, the big data server generally compresses the various service data corresponding to the same service terminal before storing it, in order to improve storage efficiency. When a service processing request is received from the service terminal, the server must search the compressed data for the corresponding data and recover it before processing, and this search and recovery takes a long time.

Therefore, embodiments of the present invention provide a data updating method, an apparatus, and a server, which can dynamically recover and update stored data corresponding to a service terminal according to a historical service request of the service terminal, thereby reducing the time duration for the server to search for and recover data according to a service processing request, and ensuring that a processing result returned by a big data server can be obtained in time after the service terminal sends the service processing request to the big data server.

In order to achieve the above object, an embodiment of the present invention provides a data updating method applied to a server, where the server is in communication with a service terminal, and the method may specifically include the following contents.

Step S21, reading at least part of the first service requests of the service terminal from the operation log according to the set time interval, and determining the service type of each first service request.

In this embodiment, the first service request is a service request sent by a service terminal to a server before the current time, and the service type is used to represent a data service processing type corresponding to the first service request.

In this embodiment, the set time interval may be determined according to the communication frequency between the service terminal and the server: the higher the communication frequency, the shorter the set time interval, and the lower the communication frequency, the longer the set time interval. The communication frequency may be counted over one-way or two-way communication between the service terminal and the server.
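The inverse relationship between communication frequency and the set time interval might be sketched as follows. The function name, the reciprocal rule, the constant base_seconds and the clamping bounds are all illustrative assumptions; the embodiment only states that a higher communication frequency yields a shorter interval.

```python
def set_time_interval(comm_per_hour: float,
                      base_seconds: float = 3600.0,
                      min_seconds: float = 60.0,
                      max_seconds: float = 86400.0) -> float:
    """Return a log-reading interval that shrinks as communication frequency grows.

    The reciprocal rule and the bounds are assumptions made for illustration.
    """
    if comm_per_hour <= 0:
        # No observed communication: fall back to the longest interval.
        return max_seconds
    interval = base_seconds / comm_per_hour
    # Clamp so the log is read neither too often nor too rarely.
    return max(min_seconds, min(max_seconds, interval))
```

With these assumed defaults, a terminal that communicates ten times per hour is read more often than one that communicates twice per hour.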

Step S22, determining the proportion of each service type in at least part of service types, and sequencing the service types according to the sequence from big to small of the proportion to obtain a first sequencing sequence.
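As a minimal sketch of step S22, the proportion of each service type can be computed by counting occurrences and dividing by the total, after which the types are sorted in descending order. The function name and the alphabetical tie-break are assumptions, not part of the embodiment.

```python
from collections import Counter


def first_sorting_sequence(service_types):
    """Return (service_type, proportion) pairs, largest proportion first."""
    counts = Counter(service_types)
    total = sum(counts.values())
    if total == 0:
        return []
    # Descending proportion; ties are broken by name for determinism (an assumption).
    return sorted(((t, c / total) for t, c in counts.items()),
                  key=lambda item: (-item[1], item[0]))
```

For example, six historical requests of types query, query, report, query, export and report would rank query first, report second and export third.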

Step S23, determining a data group corresponding to each service type from a preset data set, and sorting each data group according to the first sorting sequence to obtain a second sorting sequence.

In this embodiment, each data group in the data set is a data group compressed by the server, and these data groups retain the characteristics and traceability of the service data before compression, and can be restored to the service data before compression by the data restoration thread of the server.
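Step S23 can be illustrated by looking up the data group of each service type and keeping the order of the first sorting sequence. The sketch assumes, hypothetically, that the preset data set is a mapping from service type to a single data group identifier; the embodiment does not specify the storage layout.

```python
def second_sorting_sequence(first_sequence, preset_data_set):
    """Order data group identifiers to follow the ranking of their service types.

    first_sequence: service types, most frequent first.
    preset_data_set: dict mapping service type -> data group id (assumed layout).
    """
    # Service types without a stored data group are skipped (an assumption).
    return [preset_data_set[t] for t in first_sequence if t in preset_data_set]
```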

Step S24, assigning a data recovery frequency to each data group according to the relative position of each data group in the second sorting sequence.

In this embodiment, the relative position may be the rank of each data group in the second sorting sequence, and the data recovery frequency characterizes the number of times per unit time that the data group is recovered to the service data before compression. The earlier a data group is ranked, the greater its data recovery frequency.
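One way to satisfy the rule that an earlier rank receives a greater data recovery frequency is a linear fall-off between an upper and a lower bound. The sketch below assumes this linear rule and the bounds max_per_hour and min_per_hour; the embodiment does not prescribe a particular mapping.

```python
def assign_recovery_frequency(second_sequence, max_per_hour=12.0, min_per_hour=1.0):
    """Map each data group to a recovery frequency that decreases with its rank."""
    n = len(second_sequence)
    if n == 0:
        return {}
    if n == 1:
        return {second_sequence[0]: max_per_hour}
    # Evenly spaced frequencies from max_per_hour (rank 0) down to min_per_hour.
    step = (max_per_hour - min_per_hour) / (n - 1)
    return {group: max_per_hour - rank * step
            for rank, group in enumerate(second_sequence)}
```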

Step S25, calling at least part of the data group to a data recovery thread according to the data recovery frequency, so as to perform data recovery on at least part of the data group by the data recovery thread to obtain at least part of the service data group.

Step S26, updating at least part of the target data set to a service data interval and returning to the step of calling at least part of the data group to the data recovery thread according to the data recovery frequency so as to realize the cyclic updating of the service data group in the service data interval.

In steps S25 and S26, the server may recover service data groups from at least part of the data groups according to the data recovery frequency of each data group, and update the recovered service data groups into the service data interval. The service data interval is the interval in which the server executes service data processing according to a service request.
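The loop formed by steps S25 and S26 might be simulated as below. This is a sketch under several assumptions: time is virtual rather than wall-clock, the service data interval is modelled as a dictionary, and zlib decompression stands in for the data recovery thread, whose actual procedure is described separately.

```python
import heapq
import zlib


def run_update_cycles(compressed_groups, frequency_per_hour, horizon_hours):
    """Simulate cyclic recovery; return (service_data_interval, recovery_counts)."""
    service_data_interval = {}
    counts = {g: 0 for g in compressed_groups}
    # Each heap entry is (next due time in hours, data group id).
    queue = [(0.0, g) for g in compressed_groups
             if frequency_per_hour.get(g, 0) > 0]
    heapq.heapify(queue)
    while queue and queue[0][0] < horizon_hours:
        due, group = heapq.heappop(queue)
        # Stand-in for the data recovery thread (assumption): plain decompression.
        service_data_interval[group] = zlib.decompress(compressed_groups[group])
        counts[group] += 1
        # Re-schedule the group, which is the "return to step S25" part of S26.
        heapq.heappush(queue, (due + 1.0 / frequency_per_hour[group], group))
    return service_data_interval, counts
```

In this simulation a data group with a frequency of four per hour is refreshed four times within a one-hour horizon, while a group with a frequency of one per hour is refreshed once.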

Further, since the service data groups in the service data interval are updated according to the data recovery frequency, when the service terminal later sends a target service request, the probability that the request type of the target service request matches a service data group already in the service data interval is increased.

Therefore, the server does not need to search the corresponding data group according to the target service request and then recover the service data group, the stored data corresponding to the service terminal can be dynamically recovered and updated according to the historical service request of the service terminal, the time for the server to search and recover the data according to the service processing request is further reduced, and the processing result returned by the big data server can be obtained in time after the service terminal sends the service processing request to the big data server.

It can be understood that based on steps S21-S26, the service types can be sorted in descending order of the percentage of each service type in at least part of the service types to obtain a first sorting sequence, and the data groups corresponding to each service type can be sorted according to the first sorting sequence to obtain a second sorting sequence, so that a data recovery frequency is allocated to each data group according to the relative position of each data group in the second sorting sequence, so as to recover and update the service data groups based on the data recovery frequency. Therefore, the server does not need to search the corresponding data group according to the target service request and then recover the service data group, the stored data corresponding to the service terminal can be dynamically recovered and updated according to the historical service request of the service terminal, the time for the server to search and recover the data according to the service processing request is further reduced, and the processing result returned by the big data server can be obtained in time after the service terminal sends the service processing request to the big data server.

In a specific implementation, to keep data recovery running smoothly, the step in S25 of calling at least part of the data groups to the data recovery thread according to the data recovery frequency, so as to perform data recovery on them through the data recovery thread and obtain at least part of the service data groups, may specifically include the following.

Step S251, clustering the data groups according to the data recovery frequency corresponding to each data group to obtain a plurality of categories, wherein each category comprises at least two data groups.

Step S252, assigning a category identifier to each category according to the data recovery frequency of the data group included under each category, the category identifiers having a hierarchical order from high to low.

Step S253, determining a load capacity of the data recovery thread, where the load capacity is used to represent a maximum number of data groups for data recovery at the same time.

Step S254, according to the load capacity and the category identifier, the data group in at least one category is called to the data recovery thread, so as to perform data recovery on the data group in at least one category through the data recovery thread, thereby obtaining a service data group corresponding to the data group in at least one category.

In this embodiment, the data groups under each category may be called to the data recovery thread in the order from high to low of the category identifier to implement data recovery, so that the smoothness of data recovery can be ensured.
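Steps S251 and S252 could be sketched as follows. The embodiment does not name a clustering algorithm, so this example assumes the simplest possible one: data groups are sorted by data recovery frequency and cut into consecutive chunks, with a trailing single group folded into the previous category so that each category holds at least two data groups. Category identifiers are assumed to be integers, with a larger value meaning a higher level.

```python
def cluster_by_frequency(frequencies, size=2):
    """Group data groups with similar recovery frequency into categories.

    frequencies: dict mapping data group id -> data recovery frequency.
    Returns a list of (category_id, [group ids]), highest category_id first.
    """
    if size < 2:
        raise ValueError("each category must hold at least two data groups")
    ordered = sorted(frequencies, key=lambda g: (-frequencies[g], g))
    chunks = [ordered[i:i + size] for i in range(0, len(ordered), size)]
    # Fold a trailing single data group into the previous category.
    if len(chunks) > 1 and len(chunks[-1]) < 2:
        chunks[-2].extend(chunks.pop())
    top = len(chunks)
    return [(top - i, chunk) for i, chunk in enumerate(chunks)]
```

The returned list is already in calling order, so a dispatcher can walk it from the first entry to the last while respecting the load capacity of the data recovery thread.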

In a specific implementation, in step S254, calling the data groups in at least one category to the data recovery thread according to the load capacity and the category identifier may specifically include the following.

Step S2541, according to the maximum number of the data groups corresponding to the load capacity and simultaneously performing data recovery, performing interval division on the category identification to obtain a plurality of category intervals, wherein the accumulated number of the data groups corresponding to each category interval is less than or equal to the maximum number.

Step S2542, setting an interval identifier for each category interval according to the mean value of the category identifiers corresponding to the categories in each category interval.

Step S2543, calling the data group in each category interval to the data recovery thread circularly according to the sequence of interval identifications from large to small.

It can be understood that, through the above, the number of data groups called to the data recovery thread can be maximized, thereby improving the operating efficiency of the data recovery thread.
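Steps S2541 to S2543 can be illustrated with a greedy packing of consecutive categories into intervals. The greedy rule, the integer category identifiers and the helper names are assumptions; the embodiment only requires that the accumulated number of data groups per interval not exceed the maximum number, that each interval identifier be the mean of its category identifiers, and that intervals be called cyclically in descending order of interval identifier.

```python
from itertools import cycle, islice
from statistics import mean


def divide_into_intervals(categories, max_groups):
    """categories: list of (category_id, [group ids]), highest id first.

    Returns a list of (interval_id, [group ids]), highest interval_id first.
    """
    intervals, current, count = [], [], 0
    for cat_id, groups in categories:
        if len(groups) > max_groups:
            raise ValueError(f"category {cat_id} exceeds the load capacity")
        # Close the current interval when adding this category would overflow it.
        if current and count + len(groups) > max_groups:
            intervals.append(current)
            current, count = [], 0
        current.append((cat_id, groups))
        count += len(groups)
    if current:
        intervals.append(current)
    labelled = [
        (mean(cat_id for cat_id, _ in interval),
         [g for _, groups in interval for g in groups])
        for interval in intervals
    ]
    return sorted(labelled, key=lambda item: -item[0])


def call_order(intervals, rounds):
    """Return the first `rounds` batches of a cyclic pass over the intervals."""
    return list(islice(cycle(intervals), rounds))
```

Packing as many categories as the load capacity allows into each interval is what keeps the number of data groups handed to the data recovery thread per call close to the maximum.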

In a specific implementation, in step S25, performing data recovery on at least part of the data groups through the data recovery thread to obtain at least part of the service data groups may specifically include the following.

(1) For each data group in at least part of the data groups, obtaining the characteristic distribution of the data group according to the data characteristics of the data strings in the data group.

(2) Determining the position weight of each feature node in the feature distribution, wherein the position weight is used for characterizing the distance between each feature node and the central node of the feature distribution.

(3) Sequentially loading the data strings in the data group into the data recovery thread according to the sequence of the position weights from large to small so as to obtain the feature vector of each data string.

(4) Running the data recovery thread to determine an association value for each vector value in each feature vector, wherein the matching degree between the association value and the vector value is greater than or equal to a set value, obtaining an association vector according to the association value corresponding to each feature vector, and performing dimension expansion processing on each feature vector based on the association vector to obtain a target vector; extracting each target vector from the data recovery thread and determining a service data string corresponding to each target vector value in each target vector based on a preset vector value database; and combining the service data strings into a corresponding service data group of the data group according to the arrangement sequence of the target vector values in the target vector.

It can be understood that, through the above, the service data group corresponding to the data group can be accurately determined based on the data recovery thread and the preset vector value database.
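Parts of this procedure can be sketched under strong assumptions. The example below covers only the loading order of items (2) and (3) and the final lookup of item (4); the association values and the dimension expansion are not modelled. It assumes that each data string has a numeric feature point, that the central node is the centroid of those points, that the position weight equals the Euclidean distance to the centroid, and that the preset vector value database is a plain mapping from target vector values to service data strings.

```python
import math


def load_order(features):
    """features: dict mapping data string id -> feature point (tuple of floats).

    Returns (ids sorted by position weight, largest first; weights dict).
    """
    dims = len(next(iter(features.values())))
    # Central node taken as the centroid of all feature points (an assumption).
    centre = tuple(sum(p[d] for p in features.values()) / len(features)
                   for d in range(dims))
    weights = {key: math.dist(point, centre) for key, point in features.items()}
    order = sorted(weights, key=lambda key: (-weights[key], key))
    return order, weights


def to_service_strings(target_vector, vector_value_db):
    """Look up each target vector value and keep the vector's own ordering."""
    return [vector_value_db[value] for value in target_vector]
```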

In an alternative embodiment, in step S21, the reading, from the operation log according to the set time interval, at least part of the first service request of the service terminal may specifically include the following.

Step S211, listing the log content of the operation log, and establishing a log content network topology, where the log content network topology includes a plurality of log content nodes, each log content node corresponds to a time weight, the time weights are ordered from high to low, and each time weight represents the number of times that the log content node is called in the operation log.

Step S212, determining the activity between each log content node and the service terminal in sequence, in descending order of time weight. Once the first activity of the current log content node has been determined, an adjustment coefficient is allocated to the next log content node according to the first activity, so that the second activity of the next log content node is adjusted by the adjustment coefficient when it is determined.

Step S213, determining the median of the activities corresponding to the log content nodes, determining at least part of target log content nodes according to the median, and determining at least part of the first service requests of the service terminal according to the at least part of target log content nodes.

In this embodiment, through the above, at least part of the first service request of the service terminal can be accurately read from the operation log.
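Steps S211 to S213 might look like the following sketch. The embodiment does not define how activity is measured, how the adjustment coefficient is derived, or on which side of the median the target nodes lie. The example therefore assumes that each node carries a raw activity value, that the coefficient for the next node is 1 plus a damping factor times the current node's adjusted activity, and that nodes at or above the median are selected. All field and parameter names are hypothetical.

```python
from statistics import median


def select_target_nodes(nodes, damping=0.5):
    """nodes: list of dicts with 'id', 'time_weight' and raw 'activity'.

    Returns (target node ids in time-weight order, adjusted activity per node).
    """
    # Visit nodes from the highest time weight to the lowest.
    ordered = sorted(nodes, key=lambda n: -n["time_weight"])
    adjusted, coeff = {}, 1.0
    for node in ordered:
        activity = node["activity"] * coeff
        adjusted[node["id"]] = activity
        # Coefficient handed to the next node depends on this node's activity
        # (assumed rule; the embodiment leaves the formula open).
        coeff = 1.0 + damping * activity
    threshold = median(adjusted.values())
    targets = [n["id"] for n in ordered if adjusted[n["id"]] >= threshold]
    return targets, adjusted
```

The first service requests would then be read from the log content associated with the returned target nodes.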

On the basis of the above, Fig. 2 is a block diagram of a data updating apparatus 201 according to an embodiment of the present invention. The data updating apparatus 201 may include the following modules.

a type determining module 2011, configured to read at least part of first service requests of a service terminal from an operation log according to a set time interval, and determine a service type of each first service request, where the first service request is a service request sent by the service terminal to a server before a current time, and the service type is used to represent a data service processing type corresponding to the first service request;

a first sorting module 2012, configured to determine a proportion of each service type in at least part of the service types, and sort the service types according to a descending order of the proportion to obtain a first sorting sequence;

a second sorting module 2013, configured to determine a data group corresponding to each service type from a preset data set, and sort each data group according to the first sorting sequence to obtain a second sorting sequence;

a frequency allocation module 2014, configured to allocate a data recovery frequency for each data group according to a relative position of each data group in the second sorting sequence;

a data retrieving module 2015, configured to call at least part of the data groups to a data recovery thread according to the data recovery frequency, so as to perform data recovery on at least part of the data groups through the data recovery thread to obtain at least part of the service data groups;

a cyclic update module 2016, configured to update at least part of the target data set to a service data interval and return to the step of calling at least part of the data groups to the data recovery thread according to the data recovery frequency, so as to implement cyclic update of the service data groups in the service data interval.

Optionally, the data retrieving module 2015 is specifically configured to:

clustering the data groups according to the data recovery frequency corresponding to each data group to obtain a plurality of categories, wherein each category comprises at least two data groups; assigning a category identifier to each category according to the data recovery frequency of the data groups included under each category, the category identifiers having a hierarchical order from high to low; determining the load capacity of the data recovery thread, wherein the load capacity is used for representing the maximum number of data groups subjected to data recovery simultaneously; and calling the data groups under at least one category to the data recovery thread according to the load capacity and the category identifiers so as to recover the data groups under the at least one category through the data recovery thread to obtain the service data groups corresponding to the data groups under the at least one category.

Optionally, the data retrieving module 2015 is specifically configured to:

according to the maximum number of data groups which correspond to the load capacity and are subjected to data recovery at the same time, carrying out interval division on the category identification to obtain a plurality of category intervals, wherein the accumulated number of the data groups corresponding to each category interval is less than or equal to the maximum number; setting an interval identifier for each category interval according to the average value of the category identifiers corresponding to the categories in each category interval; and circularly calling the data group in each category interval to the data recovery thread according to the sequence of interval identifications from large to small.

Optionally, the data retrieving module 2015 is specifically configured to:

for each data group in at least part of data groups, obtaining the characteristic distribution of the data group according to the data characteristics of the data strings in the data group; determining a position weight of each feature node in the feature distribution, wherein the position weight is used for characterizing the distance between each feature node and a central node of the feature distribution; sequentially loading the data strings in the data group into the data recovery thread according to the sequence of the position weights from big to small to obtain the feature vector of each data string; running the data recovery thread to determine an association value for each vector value in each feature vector, wherein the matching degree between the association value and the vector value is greater than or equal to a set value, obtaining an association vector according to the association value corresponding to each feature vector, and performing dimension expansion processing on each feature vector based on the association vector to obtain a target vector; extracting each target vector from the data recovery thread and determining a service data string corresponding to each target vector value in each target vector based on a preset vector value database; and combining the service data strings into a corresponding service data group of the data group according to the arrangement sequence of the target vector values in the target vector.

An embodiment of the present invention further provides a computer-readable storage medium, on which a program is stored, and the program, when executed by a processor, implements the data updating method described above.

An embodiment of the present invention further provides a processor configured to run a program, wherein the data updating method described above is executed when the program runs.

Referring to fig. 3, an embodiment of the present invention further provides a server 200, which includes a processor 211, a memory 212 connected to the processor 211, and a bus 213. The processor 211 and the memory 212 communicate with each other via the bus 213. The processor 211 is configured to call the program instructions in the memory 212 to execute the data updating method described above.

In summary, the data updating method, the data updating device, and the server provided by the embodiments of the present invention can sort at least part of the service types in descending order of the proportion of each service type to obtain a first sequence, sort the data groups corresponding to each service type according to the first sequence to obtain a second sequence, and allocate a data recovery frequency to each data group according to its relative position in the second sequence, so that the service data groups are recovered and updated based on the data recovery frequency. The server therefore does not need to search for the corresponding data group upon receiving a target service request and only then recover the service data group. Instead, the stored data corresponding to the service terminal can be dynamically recovered and updated according to the historical service requests of the service terminal. This reduces the time the server spends searching for and recovering data in response to a service processing request, so that the service terminal can obtain the processing result returned by the big data server in time after sending the service processing request.
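The overall flow summarized above can be sketched as follows. The sketch assumes that the proportion of a service type is its share of the historical service requests, and that the data recovery frequency decreases linearly with position in the second sequence; the linear rule, the `base_per_hour` parameter, and all function names are illustrative assumptions rather than the allocation rule of the embodiment.

```python
from collections import Counter


def first_sequence(history):
    """Service types sorted by their share of historical requests, largest first."""
    counts = Counter(history)
    total = len(history)
    ratios = {stype: n / total for stype, n in counts.items()}
    return sorted(ratios, key=ratios.get, reverse=True)


def second_sequence(first, groups_by_type):
    """Data groups arranged in the order of their service types in the first sequence."""
    return [g for stype in first for g in groups_by_type.get(stype, [])]


def recovery_frequencies(second, base_per_hour=12):
    """Assign a higher recovery frequency to groups nearer the front.

    Assumption: linear decay from base_per_hour down to base_per_hour / n.
    """
    n = len(second)
    return {g: base_per_hour * (n - i) / n for i, g in enumerate(second)}


# Illustrative data only.
history = ["pay", "query", "pay", "pay", "query", "report"]
groups = {"pay": ["g1", "g2"], "query": ["g3"], "report": ["g4"]}

first = first_sequence(history)
second = second_sequence(first, groups)
freqs = recovery_frequencies(second)
```

Because the frequencies are derived from historical requests only, they can be recomputed periodically without waiting for a new service processing request.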

It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or cloud server that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or cloud server. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article, or cloud server comprising the element.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims

1. A data updating method applied to a server, wherein the server is in communication with a service terminal, the method comprising:

2. The method of claim 1, wherein the calling of at least part of the data groups to a data recovery thread according to the data recovery frequency, so that the data recovery thread performs data recovery on the at least part of the data groups to obtain at least part of the service data groups, comprises:

3. The method according to claim 2, wherein the invoking of the data group under at least one category to the data recovery thread according to the load capacity and the category identification comprises:

4. The method according to any one of claims 1 to 3, wherein the data recovery of at least part of the data groups by the data recovery thread to obtain at least part of the service data groups comprises:

5. A data update apparatus, applied to a server, the server communicating with a service terminal, the apparatus comprising:

6. The apparatus of claim 5, wherein the data retrieval module is specifically configured to:

7. The apparatus of claim 6, wherein the data retrieval module is specifically configured to:

8. The apparatus according to any one of claims 5 to 7, wherein the data retrieval module is specifically configured to:

9. A server, comprising: a processor, and a memory and a bus connected to the processor; wherein the processor and the memory communicate with each other through the bus; and the processor is configured to call a computer program in the memory to perform the data updating method of any one of claims 1 to 4.

10. A computer-readable storage medium, characterized in that a program is stored thereon, wherein the program, when executed by a processor, implements the data updating method of any one of claims 1 to 4.