CN110245014B

CN110245014B - Data processing method and device

Info

Publication number: CN110245014B
Application number: CN201810196279.3A
Authority: CN
Inventors: 林世洪; 高飞; 卢兰花
Original assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Current assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Priority date: 2018-03-09
Filing date: 2018-03-09
Publication date: 2024-01-12
Anticipated expiration: 2038-03-09
Also published as: CN110245014A

Abstract

The application discloses a data processing method and device. One embodiment of the method comprises: in response to receiving data to be processed, obtaining a characteristic value associated with the data and determining, from a preset slot set, at least one candidate slot for distributing the data to a distributed system node; determining the type of the data, and determining a splitting step corresponding to the type according to a preset splitting type mapping table, wherein the splitting type mapping table is used for representing the correspondence between the type of the data and the splitting step; determining, based on the characteristic value and according to the determined splitting step, a slot for distributing the data to the distributed system node from the at least one candidate slot; and sending the data to the distributed system node corresponding to the determined slot according to a preset distributed system mapping table. The embodiment can improve the load balance of each node in the distributed system.

Description

Data processing method and device

Technical Field

The embodiment of the application relates to the technical field of computers, in particular to a data processing method and device.

Background

Before new functions of an internet commercial product are formally released, the developed product and policies are often verified on a small share of traffic. Offline testing cannot fully cover real scenarios, so online trial testing is essential, but the impact of errors needs to be controlled by traffic splitting. A common approach is to extract two groups of traffic, A and B, for a comparison test, with the different groups kept on separate, unconnected branches, so as to help evaluate whether the new function meets expectations and whether it can be released online at full traffic. To ensure randomness and a consistent user experience, the traditional technical solution imports traffic into a pre-release service cluster at the service entrance according to characteristic values such as user and traffic percentage.

Disclosure of Invention

The embodiment of the application provides a data processing method and device.

In a first aspect, an embodiment of the present application provides a data processing method, including: in response to receiving data to be processed, obtaining a characteristic value associated with the data and determining, from a preset slot set, at least one candidate slot for distributing the data to nodes of the distributed system, wherein a slot is used for managing resources of the nodes in the distributed system; determining the type of the data, and determining a splitting step corresponding to the type according to a preset splitting type mapping table, wherein the splitting type mapping table is used for representing the correspondence between the type of the data and the splitting step; determining, based on the characteristic value and according to the determined splitting step, a slot for distributing the data to the distributed system node from the at least one candidate slot; and sending the data to the distributed system node corresponding to the determined slot according to a preset distributed system mapping table, wherein the distributed system mapping table is used for representing the correspondence between slots and distributed system nodes.

In some embodiments, the characteristic value includes a user name extracted from the data; and determining, based on the characteristic value and according to the determined splitting step, a slot for distributing the data to the distributed system node from the at least one candidate slot comprises: converting the characteristic value into a hash code; computing the remainder of dividing the hash code by the number of the at least one candidate slot; and looking up the slot identifier corresponding to the remainder in a preset slot mapping table, wherein the slot mapping table is used for representing the correspondence between remainders and slot identifiers.

In some embodiments, the characteristic value includes the time at which the data was received; and determining, based on the characteristic value and according to the determined splitting step, a slot for distributing the data to the distributed system node from the at least one candidate slot comprises: obtaining the historical flow of the at least one candidate slot, wherein the historical flow refers to the amount of data distributed to a slot within a predetermined time; dividing a day into at least one time interval according to the historical flow, wherein each time interval corresponds to one candidate slot; and determining the candidate slot corresponding to the time interval in which the time falls.

In some embodiments, the method further comprises: recording the determined splitting step, the characteristic value, and the slot identifier corresponding to the slot in a log.

In some embodiments, the method further comprises: sending the log to a log server.

In a second aspect, an embodiment of the present application provides a data processing apparatus, including: an obtaining unit, configured to obtain a characteristic value associated with data in response to receiving the data to be processed, and determine at least one candidate slot for distributing the data to nodes of the distributed system from a preset slot set, where the slot is used for managing resources of the nodes in the distributed system; the first determining unit is configured to determine the type of the data and determine a splitting step corresponding to the type according to a preset splitting type mapping table, wherein the splitting type mapping table is used for representing the corresponding relation between the type of the data and the splitting step; the second determining unit is configured to determine a slot for distributing data to the distributed system node from at least one candidate slot according to the determined splitting step based on the characteristic value; and the sending unit is configured to send the data to the distributed system node corresponding to the determined slot according to a preset distributed system mapping table, wherein the distributed system mapping table is used for representing the corresponding relation between the slot and the distributed system node.

In some embodiments, the characteristic value includes a user name extracted from the data; and the second determining unit is further configured to: convert the characteristic value into a hash code; compute the remainder of dividing the hash code by the number of the at least one candidate slot; and look up the slot identifier corresponding to the remainder in a preset slot mapping table, wherein the slot mapping table is used for representing the correspondence between remainders and slot identifiers.

In some embodiments, the characteristic value includes the time at which the data was received; and the second determining unit is further configured to: obtain the historical flow of the at least one candidate slot, wherein the historical flow refers to the amount of data distributed to a slot within a predetermined time; divide a day into at least one time interval according to the historical flow, wherein each time interval corresponds to one candidate slot; and determine the candidate slot corresponding to the time interval in which the time falls.

In some embodiments, the apparatus further comprises: a log unit configured to record the determined splitting step, the characteristic value, and the slot identifier corresponding to the slot in a log.

In some embodiments, the log unit is further configured to: send the log to a log server.

In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; and a storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement any of the methods described above.

In a fourth aspect, embodiments of the present application provide a computer readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements any of the methods described above.

According to the data processing method and device provided by the embodiments of the present application, the characteristic value and the splitting step are obtained from the data, and the distributed system node corresponding to the slot into which the data is to be imported is then determined, based on the characteristic value, according to the splitting step, so that the load balance of each node in the distributed system is improved.

Drawings

Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the following drawings, in which:

FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;

FIG. 2 is a flow chart of one embodiment of a data processing method according to the present application;

FIG. 3 is a schematic illustration of an application scenario of a data processing method according to the present application;

FIG. 4 is a flow chart of yet another embodiment of a data processing method according to the present application;

FIG. 5 is a schematic diagram of a structure of one embodiment of a data processing apparatus according to the present application;

FIG. 6 is a schematic diagram of a computer system suitable for use in implementing embodiments of the present application.

Detailed Description

The present application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.

It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

FIG. 1 illustrates an exemplary system architecture 100 to which embodiments of the data processing methods or data processing apparatus of the present application may be applied.

As shown in FIG. 1, a system architecture 100 may include a terminal device 101, an offload server 102, a configuration service center 103, a log server 104, and a distributed system 105. The network is the medium used to provide communication links between the terminal device 101, offload server 102, configuration service center 103, log server 104, and distributed system 105. The network may include various connection types, such as wired or wireless communication links, or fiber optic cables.

A user may use the terminal device 101 to interact with the offload server 102 via the network, to receive or send messages and the like. Various communication client applications, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, and social platform software, may be installed on the terminal device 101.

The terminal device 101 may be any of various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.

The offload server 102 may be a server providing various services, such as a background offload server that distributes data sent from the terminal device 101 to the corresponding distributed system nodes for processing. The offload server 102 obtains splitting configuration data, such as the splitting step and the number of slots, from the configuration service center 103. The background offload server can analyze and process the received data to be processed and direct the data to the corresponding distributed system node. The offload server 102 may also record splitting results and deliver them to the log server 104.

The configuration service center 103 may include a registry and a configuration management module. The configuration management module is responsible for adding, updating, querying, and deleting splitting configuration data; the registry is responsible for storing the splitting configuration data and for notifying, in real time, the offload servers 102 that subscribe to the splitting configuration data.
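The subscribe-and-notify relationship between the registry and the offload servers can be illustrated with a minimal sketch. Everything here is hypothetical: the class `ConfigRegistry`, the callback signature, and the configuration keys `policy` and `slot_num` are illustrative names, and a real registry would store data persistently and notify subscribers over the network.

```python
from typing import Callable, Dict, List, Optional

# A listener receives the configuration key and the new value (None on deletion).
Listener = Callable[[str, Optional[dict]], None]


class ConfigRegistry:
    """Hypothetical in-memory stand-in for the registry of the configuration service center."""

    def __init__(self) -> None:
        self._configs: Dict[str, dict] = {}
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def put(self, key: str, config: dict) -> None:
        """Add or update an entry, then notify every subscriber."""
        self._configs[key] = config
        self._notify(key, config)

    def delete(self, key: str) -> None:
        """Delete an entry, then notify every subscriber with None."""
        self._configs.pop(key, None)
        self._notify(key, None)

    def get(self, key: str) -> Optional[dict]:
        return self._configs.get(key)

    def _notify(self, key: str, config: Optional[dict]) -> None:
        for listener in self._listeners:
            listener(key, config)


# An offload server keeps a local copy that follows the registry.
local_copy: Dict[str, dict] = {}


def on_change(key: str, config: Optional[dict]) -> None:
    if config is None:
        local_copy.pop(key, None)
    else:
        local_copy[key] = config


registry = ConfigRegistry()
registry.subscribe(on_change)
registry.put("order", {"policy": "hash", "slot_num": 10})
```

After the `put` call, `local_copy` holds the same entry as the registry without the offload server having to poll for it.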

The log server 104 is used to collect logs generated by the offload server 102 in real time.

The distributed system 105 includes a plurality of distributed system nodes, which may be application servers, for processing data sent by the terminal device 101, and implementing a test function. The distributed system nodes may be located on the same server or on different servers.

It should be noted that the data processing method provided in the embodiments of the present application is generally executed by the offload server 102, and accordingly, the data processing apparatus is generally disposed in the offload server 102.

The offload server 102, the configuration service center 103, the log server 104, and the distributed system 105 may be hardware or software. When they are hardware, they may be implemented as a distributed server cluster formed by a plurality of servers, or as a single server. When they are software, they may be implemented as a plurality of software programs or software modules (for example, to provide distributed services), or as a single software program or software module. No specific limitation is imposed here.

It should be understood that the numbers of terminal devices, offload servers, configuration service centers, log servers, and distributed systems in FIG. 1 are merely illustrative. There may be any number of terminal devices, offload servers, configuration service centers, log servers, and distributed systems, as required by the implementation.

With continued reference to FIG. 2, a flow 200 of one embodiment of a data processing method according to the present application is shown. The data processing method comprises the following steps:

Step 201, in response to receiving data to be processed, obtaining a characteristic value associated with the data and determining, from a preset slot set, at least one candidate slot for distributing the data to the distributed system node.

In this embodiment, the electronic device on which the data processing method runs (for example, the server shown in FIG. 1) may receive the data to be processed, through a wired or wireless connection, from the terminal that the user uses for data transmission. Slots are used to manage the resources of nodes in the distributed system. The characteristic value associated with the data may be the user name of the user sending the data, the time at which the data was received, or another value that characterizes the data. After the data is received, the idle slots corresponding to available nodes in the distributed system are looked up and used as candidate slots. Each node may correspond to one or more slots, and each slot may also correspond to one or more nodes. A slot is a logical concept in Hadoop (a distributed system infrastructure): the number of slots of a node represents the resource capacity or capability of that node, so a slot is Hadoop's unit of resources. Hadoop uses slots to manage the resources of distributed nodes. Each job applies for resources in units of slots, and each node determines the total number of slots it contains according to its own computing power and memory. When a job starts to execute, it first applies to the main thread for a slot; the main thread allocates an idle slot to it; the job then occupies the slot and returns it once the job is finished.
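The apply, occupy, and return cycle described above can be sketched as simple bookkeeping. This is an illustrative model only, not Hadoop's scheduler: the class `SlotPool`, the node names, and the consecutive numbering of slots are assumptions made for the example.

```python
from typing import Dict, List, Optional, Set


class SlotPool:
    """Hypothetical slot bookkeeping: apply for a slot, occupy it, return it."""

    def __init__(self, slots_per_node: Dict[str, int]) -> None:
        # Number slots consecutively from 1 across all nodes (an assumption).
        self.slot_to_node: Dict[int, str] = {}
        next_id = 1
        for node, count in slots_per_node.items():
            for _ in range(count):
                self.slot_to_node[next_id] = node
                next_id += 1
        self.busy: Set[int] = set()

    def idle_slots(self) -> List[int]:
        """Idle slots serve as the candidate slots for newly received data."""
        return [s for s in sorted(self.slot_to_node) if s not in self.busy]

    def acquire(self) -> Optional[int]:
        """Allocate the lowest-numbered idle slot, or None if all are busy."""
        idle = self.idle_slots()
        if not idle:
            return None
        self.busy.add(idle[0])
        return idle[0]

    def release(self, slot: int) -> None:
        """Return a slot once the job that occupied it has finished."""
        self.busy.discard(slot)


pool = SlotPool({"node-a": 2, "node-b": 1})
first = pool.acquire()
```

Here `node-a` contributes two slots and `node-b` one, reflecting that a node's slot count stands for its capacity.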

Step 202, determining the type of the data, and determining a splitting step corresponding to the type according to a preset splitting type mapping table.

In this embodiment, the splitting step indicates how to allocate a slot for the data. The splitting step is related to the data type: different data types may use different splitting algorithms and therefore perform different splitting steps. The splitting type mapping table is used for representing the correspondence between the type of the data and the splitting step. For example, data may be classified into types such as instant message data, video data, and order data, and the type of the data may be determined from fields in the data according to a predetermined format. A splitting algorithm is set in advance for each type of data, so that each type corresponds to the splitting step that uses that algorithm. If a user name exists in the data, the user name can be used as the characteristic value, and a mathematical operation is then performed on the user name to map it to a slot. If no user name exists in the data, the time at which the data was received can be used as the characteristic value, and slot mapping can be performed according to the time interval in which that time falls.
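A minimal sketch of the type lookup and the choice of characteristic value might look as follows. The field names `type` and `user_name`, the particular pairing of data types with splitting steps, and the fallback step are all assumptions for illustration; the source does not specify them.

```python
from typing import Tuple

# Hypothetical splitting type mapping table: data type -> splitting step.
SPLIT_TYPE_MAPPING = {
    "instant_message": "hash",
    "video": "random",
    "order": "remainder",
}
DEFAULT_STEP = "random"  # assumed fallback for unknown types


def determine_type(data: dict) -> str:
    """Read the data type from a hypothetical 'type' field."""
    return data.get("type", "unknown")


def splitting_step_for(data: dict) -> str:
    """Look up the splitting step that corresponds to the data type."""
    return SPLIT_TYPE_MAPPING.get(determine_type(data), DEFAULT_STEP)


def choose_characteristic_value(data: dict, received_at: str) -> Tuple[str, str]:
    """Prefer the user name; otherwise fall back to the time of receipt."""
    if "user_name" in data:
        return "user_name", data["user_name"]
    return "received_at", received_at
```

For example, `splitting_step_for({"type": "order"})` yields `"remainder"` under this illustrative table, and data without a user name falls back to its receive time.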

The splitting process involves common algorithms such as hash, remainder, random, and, or, percent, in, and not. Different algorithms use different splitting steps. For example, "hash" refers to hash code mapping, which converts the characteristic value of the data into a hash code and then uses the hash code as the slot number of the data. Hash codes are generated on the following basis: a hash code is not guaranteed to be unique. The hashing algorithm aims to give objects of the same class different hash codes as far as possible according to their differing characteristics, but this does not mean that the hash codes of different objects are always different; the result also depends on how the programmer writes the hash code algorithm. "Remainder" applies when the user name is a number: the remainder of dividing the user name by the number of candidate slots is used as the slot number of the data. For example, if the user name is 13 and there are 10 candidate slots, the remainder 3 is used as the slot number of the data. "Random" refers to randomly assigning the data to any candidate slot. "And" refers to assigning a slot when the characteristic value satisfies both condition A and condition B. "Or" refers to assigning a slot when the characteristic value satisfies either condition A or condition B. "Percent" refers to allocating slots in proportion to the data flow: for example, data is distributed in turn to one of two slots if each slot is expected to take 50% of the flow, and in turn to one of ten slots if each slot is expected to take 10% of the flow. "In" refers to assigning a slot if the characteristic value meets a given condition for that slot. "Not" refers to assigning a slot if the characteristic value does not meet a given condition for that slot.
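Four of these algorithms can be sketched as small functions that take a characteristic value and a list of candidate slot numbers. This is an illustration under stated assumptions rather than the patented implementation: `zlib.crc32` stands in for the unspecified hash function, the hash code is reduced modulo the slot count so that it always names a candidate slot, slots are numbered from 0 so that a remainder is directly a slot number, and "percent" is shown only for equal shares.

```python
import itertools
import random
import zlib
from typing import Callable, Iterator, List


def hash_step(value: str, slots: List[int]) -> int:
    # zlib.crc32 gives a repeatable hash code; Python's built-in hash()
    # is salted per process for strings, so it would not be stable.
    return slots[zlib.crc32(value.encode("utf-8")) % len(slots)]


def remainder_step(value: str, slots: List[int]) -> int:
    # Assumes the user name is numeric, as in the "remainder" description.
    return slots[int(value) % len(slots)]


def random_step(value: str, slots: List[int]) -> int:
    # The characteristic value is ignored; any candidate slot may be chosen.
    return random.choice(slots)


def make_percent_step(slots: List[int]) -> Callable[[str, List[int]], int]:
    # Equal shares only: handing data to the slots in turn gives each slot
    # 1/len(slots) of the flow (50% each for two slots, 10% each for ten).
    cycle: Iterator[int] = itertools.cycle(slots)
    return lambda value, _slots: next(cycle)


slots = list(range(10))           # candidate slots numbered 0-9
slot_a = remainder_step("13", slots)
slot_b = remainder_step("28", slots)
```

With ten candidate slots, user names 13 and 28 map to slots 3 and 8, matching the remainder example in the text. The condition-based algorithms ("and", "or", "in", "not") are not sketched because the source does not define their conditions.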

Optionally, the splitting type mapping table may be preset in the offload server, or may be downloaded from the configuration service center shown in FIG. 1. When the configuration service center updates, modifies, or deletes configuration data, it notifies the offload server to update the splitting type mapping table. The splitting step may be expressed as configuration data written in a predetermined scripting language, which the offload server parses and then executes. The configuration data may be as follows:

{
    "distrbutionId": "1", // splitting step identifier, related to the data type
    "distrbutionName": "user hash split", // splitting step name
    "eigenValue": "uid", // characteristic value
    "distrbutionPolicy": "hash", // splitting policy
    "slotNum": 10, // number of slots in the splitting calculation result
    "isValid": 1 // whether in effect: 1 = in effect, 0 = not in effect
}
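One way an offload server might parse such configuration data is sketched below. Standard JSON does not allow `//` comments, so the sketch strips them first. The normalized key spellings (quoted camelCase, including `slotNum` and `distrbutionName`), the quoted identifier value, and the function name `load_split_config` are assumptions for this example.

```python
import json
import re

# Configuration text modeled on the example above; key casing is normalized
# here as an assumption so that the text parses as JSON.
RAW_CONFIG = """
{
  "distrbutionId": "1",                  // splitting step identifier
  "distrbutionName": "user hash split",  // splitting step name
  "eigenValue": "uid",                   // characteristic value
  "distrbutionPolicy": "hash",           // splitting policy
  "slotNum": 10,                         // number of slots
  "isValid": 1                           // 1 = in effect, 0 = not in effect
}
"""


def load_split_config(text: str) -> dict:
    # Naive comment stripping: assumes "//" never occurs inside a string value.
    without_comments = re.sub(r"//[^\n]*", "", text)
    config = json.loads(without_comments)
    if config["slotNum"] <= 0:
        raise ValueError("slotNum must be positive")
    return config


config = load_split_config(RAW_CONFIG)
active = config["isValid"] == 1
```

A configuration whose `isValid` is 0 would be loaded in the same way and simply skipped by the caller.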

Step 203, determining, based on the characteristic value and according to the determined splitting step, a slot for distributing the data to the distributed system node from the at least one candidate slot.

In this embodiment, a slot is determined from the at least one candidate slot according to the splitting step, and the slot corresponds to a distributed system node. For example, if there are 10 candidate slots and the remainder splitting step is used, data with user name 13 is assigned to slot 3 and data with user name 28 is assigned to slot 8.

In some optional implementations of this embodiment, the characteristic value includes a user name extracted from the data; and determining, based on the characteristic value and according to the determined splitting step, a slot for distributing the data to the distributed system node from the at least one candidate slot comprises: converting the characteristic value into a hash code; computing the remainder of dividing the hash code by the number of the at least one candidate slot; and looking up the slot identifier corresponding to the remainder in a preset slot mapping table, wherein the slot mapping table is used for representing the correspondence between remainders and slot identifiers. A hash algorithm transforms an input of arbitrary length (also called a pre-image) into an output of fixed length. A character string can thus be converted into a number by a hash algorithm, so that a slot can be determined by a remainder operation even for data whose user name is a character string. For example, if a user named "abcd" sends data and the user name is converted into the hash code 1234, then with 10 slots in total, slot 4 is allocated to the data.
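The three sub-steps (hash code, remainder, slot mapping table lookup) can be sketched as below. `zlib.crc32` is a stand-in for the unspecified hash function, and the slot identifiers of the form `slot-N` are hypothetical.

```python
import zlib
from typing import Dict

# Hypothetical slot mapping table: remainder -> slot identifier.
SLOT_MAPPING: Dict[int, str] = {r: f"slot-{r}" for r in range(10)}


def hash_code(user_name: str) -> int:
    """Convert a string user name into a non-negative integer hash code."""
    return zlib.crc32(user_name.encode("utf-8"))


def slot_for_hash_code(code: int, slot_mapping: Dict[int, str]) -> str:
    """Take the remainder by the slot count, then look up the slot identifier."""
    remainder = code % len(slot_mapping)
    return slot_mapping[remainder]


def slot_for_user(user_name: str, slot_mapping: Dict[int, str]) -> str:
    return slot_for_hash_code(hash_code(user_name), slot_mapping)


example = slot_for_hash_code(1234, SLOT_MAPPING)
```

With the hash code 1234 from the text and 10 slots, the remainder is 4, so `example` is the identifier stored under remainder 4. The actual hash code of "abcd" under CRC-32 differs from 1234, which is only the text's illustrative value.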

Step 204, sending the data to the distributed system node corresponding to the determined slot according to a preset distributed system mapping table.

In this embodiment, the distributed system mapping table is used to characterize the correspondence between the slot and the distributed system node. The distributed system may be a plurality of clusters under one IP address. One slot may correspond to one or more distributed system nodes. For example, if slot 1 corresponds to application server 1, data is sent to application server 1 when the data is assigned to slot 1. If slot 1 corresponds to application servers 1-10, data is sent to application servers 1-10, respectively, when data is assigned to slot 1.
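The slot-to-node lookup, including the case where one slot maps to several nodes, can be sketched as follows. The table contents, the server names, and the injected `send` callback are assumptions for illustration; a real offload server would forward the data over the network.

```python
from typing import Callable, Dict, List

# Hypothetical distributed system mapping table: slot -> node(s).
DISTRIBUTED_SYSTEM_MAPPING: Dict[int, List[str]] = {
    1: ["app-server-1"],
    2: ["app-server-2", "app-server-3"],
}


def send_to_slot(slot: int, data: bytes, send: Callable[[str, bytes], None]) -> List[str]:
    """Send the data to every node mapped to the slot and return those nodes."""
    nodes = DISTRIBUTED_SYSTEM_MAPPING[slot]
    for node in nodes:
        send(node, data)
    return nodes


# Record deliveries in a list instead of performing real network I/O.
sent = []
send_to_slot(2, b"payload", lambda node, data: sent.append((node, data)))
```

Passing the transport in as a callback keeps the mapping logic independent of how data actually reaches a node.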

In some optional implementations of this embodiment, the determined splitting step, the characteristic value, and the slot identifier corresponding to the slot are recorded in a log. The log recorded by the offload server can be accessed by other servers.

In some optional implementations of this embodiment, the method further includes: sending the log to a log server. A distributed system for collecting, aggregating, and transmitting massive logs, such as Flume, may be used to collect the logs of each offload server.
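A log record holding the three items named above could be serialized as one JSON line, as in this minimal sketch. The field names and the JSON format are assumptions; the source does not specify a log format, and shipping the line to a log server is not shown.

```python
import json


def build_split_log(splitting_step: str, characteristic_value: str, slot_id: int) -> str:
    """Serialize one splitting decision as a single JSON log line."""
    record = {
        "splitting_step": splitting_step,
        "characteristic_value": characteristic_value,
        "slot_id": slot_id,
    }
    # sort_keys keeps the field order stable, which eases downstream parsing.
    return json.dumps(record, sort_keys=True)


line = build_split_log("remainder", "123", 3)
```

One line per decision is a common choice for log collectors that tail files, since each line can be parsed independently.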

With continued reference to FIG. 3, FIG. 3 is a schematic diagram of an application scenario of the data processing method according to this embodiment. In the application scenario of FIG. 3, user "123" sends data 301 to the offload server 302 via a terminal device. The offload server 302 determines that the currently available candidate slots are the 10 slots numbered 1-10. The user name "123" is extracted from the data, and a splitting step using "remainder" is determined according to the data type. The remainder of dividing 123 by 10 is 3, so slot 3 is allocated to the data. The application server 304 corresponding to slot 3 is then looked up, and the data is sent to that application server 304.
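The scenario can be traced end to end with a short sketch. The server names and the `user_name` field are hypothetical. Because the slots are numbered 1-10 while a remainder ranges over 0-9, the sketch maps remainder 0 to slot 10; the source does not say how that case is handled, so this is an assumption.

```python
from typing import Tuple

CANDIDATE_SLOTS = list(range(1, 11))  # slots numbered 1-10
SLOT_TO_SERVER = {slot: f"application-server-{slot}" for slot in CANDIDATE_SLOTS}


def route(data: dict) -> Tuple[int, str]:
    """Apply the 'remainder' splitting step and look up the target server."""
    user_name = data["user_name"]
    slot = int(user_name) % len(CANDIDATE_SLOTS)
    if slot == 0:
        # Assumption: remainder 0 has no slot in 1-10, so use the last slot.
        slot = len(CANDIDATE_SLOTS)
    return slot, SLOT_TO_SERVER[slot]


slot, server = route({"user_name": "123"})
```

For user "123" the remainder of 123 divided by 10 is 3, so the data is routed to the server mapped to slot 3.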

The method provided by the embodiment of the application improves the load balance of each node in the distributed system by associating the distributed system node for processing the data with the characteristic value of the data.

With further reference to FIG. 4, a flow 400 of yet another embodiment of a data processing method is shown. The flow 400 of the data processing method includes the steps of:

Step 401, in response to receiving data to be processed, obtaining the time at which the data was received and determining, from a preset slot set, at least one candidate slot for distributing the data to the distributed system node.

In this embodiment, the electronic device on which the data processing method runs (for example, the server shown in FIG. 1) may receive the data to be processed, through a wired or wireless connection, from the terminal that the user uses for data transmission. The time at which the data is received may be expressed in 24-hour format. After the data is received, the slots in the distributed system that can be used to distribute the data to distributed system nodes are looked up and used as candidate slots. Each node may correspond to one or more slots. A slot is a logical concept in Hadoop (a distributed system infrastructure): the number of slots of a node represents the resource capacity or capability of that node, so a slot is Hadoop's unit of resources. Hadoop uses slots to manage the resources of distributed nodes. Each job applies for resources in units of slots, and each node determines the total number of slots it contains according to its own computing power and memory. When a job starts to execute, it first applies to the main thread for a slot; the main thread allocates an idle slot to it; the job then occupies the slot and returns it once the job is finished.

Step 402, determining the type of the data, and determining a splitting step corresponding to the type according to a preset splitting type mapping table.

Step 402 is substantially the same as step 202 and will not be described in detail.

Step 403, obtaining the historical flow of at least one candidate slot.

In this embodiment, the historical flow refers to the amount of data allocated to a slot within a predetermined time, for example, the amount of data allocated to the slot per hour. The amount of data allocated to the slot in each of the 24 hours of the previous day may be counted. Alternatively, the amount of data allocated to the slot in each of the 24 hours of several consecutive days may be obtained and the average flow for each hour then determined; for example, the average flow is 2M for 00:00-01:00, 10M for 01:00-02:00, ..., and 45M for 23:00-00:00.
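Averaging the hourly counts over several consecutive days can be sketched as follows. The input layout (one list of 24 values per day) and the function name are assumptions for the example.

```python
from typing import List


def hourly_average(daily_counts: List[List[float]]) -> List[float]:
    """daily_counts[d][h] is the amount of data allocated in hour h of day d."""
    if not daily_counts or any(len(day) != 24 for day in daily_counts):
        raise ValueError("each day needs exactly 24 hourly values")
    days = len(daily_counts)
    # Average each hour of the day across all recorded days.
    return [sum(day[h] for day in daily_counts) / days for h in range(24)]


day1 = [2.0] * 24
day2 = [4.0] * 24
averages = hourly_average([day1, day2])
```

With a flat 2.0 on the first day and 4.0 on the second, every hourly average comes out as 3.0.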

Step 404, dividing the day into at least one time interval according to the historical flow.

In this embodiment, a day may be divided into at least one time interval according to the number of slots, where each time interval corresponds to one candidate slot and the historical flows in the respective time intervals are kept approximately equal. Data received within a time interval is distributed to the slot corresponding to that time interval. For example, if there are three slots, the day needs to be divided into three time intervals whose historical flows are approximately equal. Based on the historical flow, the day is divided into time interval 1 [00:00-12:00], time interval 2 [12:00-16:00], and time interval 3 [16:00-24:00]. Time interval 1 may be assigned to slot 1, time interval 2 to slot 2, and time interval 3 to slot 3.
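One simple way to obtain intervals with approximately equal historical flow is a greedy pass over cumulative hourly flow, cutting at the first hour boundary where the running total reaches each equal share. The source does not prescribe a partitioning algorithm, so this is only one possible sketch; cuts fall on whole hours, and very skewed flow can yield fewer intervals than slots.

```python
from typing import List


def split_day(hourly_flow: List[float], slot_count: int) -> List[int]:
    """Return hour boundaries, e.g. [0, 12, 16, 24] for three intervals."""
    total = sum(hourly_flow)
    boundaries = [0]
    cumulative = 0.0
    for hour, flow in enumerate(hourly_flow):
        cumulative += flow
        k = len(boundaries)  # index of the next cut to place
        # Cut once the running total reaches k equal shares of the day's flow;
        # never cut at hour 24, which is always the final boundary.
        if k < slot_count and hour + 1 < 24 and cumulative >= total * k / slot_count:
            boundaries.append(hour + 1)
    boundaries.append(24)
    return boundaries


# Illustrative flow: light in the morning, heavy at 12:00-16:00, moderate later.
flow = [1.0] * 12 + [3.0] * 4 + [1.5] * 8
boundaries = split_day(flow, 3)
```

The illustrative flow was chosen so that each of the three intervals carries one third of the daily total, which reproduces the division 00:00-12:00, 12:00-16:00, and 16:00-24:00 used in the text.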

Optionally, the time intervals may be updated at a predetermined period, so that steps 403-404 need not be performed each time data is received. For example, the offload server may perform the division once a day, when data is first received that day, and then allocate slots according to the divided time intervals for the rest of the day. The update period may also be longer, such as one week or one month. The time interval division may also be based on the average flow over a predetermined period, recalculated each day, for example the average flow over the most recent 7 days.

Step 405, determining a candidate slot corresponding to the time interval in which the time is located.

In this embodiment, if the data is received at 15:00, then under the time intervals divided in the above example, 15:00 falls in time interval 2 [12:00-16:00], so the data is assigned to slot 2.
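The lookup of the interval containing a given receive time can be sketched with a binary search over the interval boundaries. The boundaries repeat the example above; treating a time that lies exactly on a boundary (such as 12:00) as belonging to the later interval is an assumption, since the source does not say which interval owns the boundary.

```python
from bisect import bisect_right

# Interval boundaries from the example, in minutes since midnight.
BOUNDARIES = [0, 12 * 60, 16 * 60, 24 * 60]
SLOTS = [1, 2, 3]  # slot assigned to each interval, in order


def slot_for_time(hhmm: str) -> int:
    """Map a 24-hour 'HH:MM' receive time to the slot of its time interval."""
    hours, minutes = map(int, hhmm.split(":"))
    minute_of_day = hours * 60 + minutes
    # bisect_right counts the boundaries at or before the given time.
    index = bisect_right(BOUNDARIES, minute_of_day) - 1
    index = min(index, len(SLOTS) - 1)  # clamp a literal "24:00" input
    return SLOTS[index]


slot = slot_for_time("15:00")
```

For 15:00, two boundaries (00:00 and 12:00) lie at or before the time, so the second interval and therefore slot 2 is selected.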

Step 406, sending the data to the distributed system node corresponding to the determined slot according to a preset distributed system mapping table.

As can be seen from FIG. 4, compared with the embodiment corresponding to FIG. 2, the flow 400 of the data processing method in this embodiment highlights the step of splitting according to the time at which the data is received. The scheme described in this embodiment can therefore introduce a simpler splitting scheme, which improves the load balance of each node in the distributed system and reduces the splitting delay.

With further reference to fig. 5, as an implementation of the method shown in the foregoing figures, the present application provides an embodiment of a data processing apparatus, where an embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.

As shown in fig. 5, the data processing apparatus 500 of the present embodiment includes: an acquisition unit 501, a first determination unit 502, a second determination unit 503, and a transmission unit 504. Wherein the obtaining unit 501 is configured to obtain, in response to receiving data to be processed, a feature value associated with the data and determine at least one candidate slot for distributing the data to a node of the distributed system from a preset set of slots, where the slot is used to manage resources of the node in the distributed system; the first determining unit 502 is configured to determine a type of data, and determine a splitting step corresponding to the type according to a preset splitting type mapping table, where the splitting type mapping table is used to characterize a correspondence between the type of data and the splitting step; the second determining unit 503 is configured to determine, according to the determined splitting step, a slot for distributing data to the distributed system node from the at least one candidate slot based on the feature value; the sending unit 504 is configured to send data to the distributed system node corresponding to the determined slot according to a preset distributed system mapping table, where the distributed system mapping table is used to characterize a correspondence between the slot and the distributed system node.

In the present embodiment, specific processes of the acquisition unit 501, the first determination unit 502, the second determination unit 503, and the transmission unit 504 of the data processing apparatus 500 may refer to steps 201, 202, 203, 204 in the corresponding embodiment of fig. 2.
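As a rough illustration of how the four units could cooperate, the following Python sketch chains them in the order of steps 201 to 204. It is only a sketch under stated assumptions: the class name `DataProcessingApparatus`, the `feature` and `type` fields of the data, and the choice to treat every preset slot as a candidate are all hypothetical and are not prescribed by the application.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical type: a splitting step maps (characteristic value, candidate slots) to a slot id.
SplitStep = Callable[[str, List[str]], str]


@dataclass
class DataProcessingApparatus:
    slot_set: List[str]                    # preset set of slots
    split_type_map: Dict[str, SplitStep]   # splitting type mapping table: data type -> splitting step
    node_map: Dict[str, str]               # distributed system mapping table: slot id -> node

    def acquire(self, data: dict) -> Tuple[str, List[str]]:
        """Acquisition unit 501 (step 201)."""
        # Assumption: the characteristic value is carried in a "feature" field
        # and every preset slot is treated as a candidate.
        return data["feature"], list(self.slot_set)

    def first_determine(self, data: dict) -> SplitStep:
        """First determination unit 502 (step 202): look up the splitting step by data type."""
        return self.split_type_map[data["type"]]

    def second_determine(self, step: SplitStep, feature: str, candidates: List[str]) -> str:
        """Second determination unit 503 (step 203): apply the splitting step."""
        return step(feature, candidates)

    def transmit(self, data: dict, slot: str) -> str:
        """Transmission unit 504 (step 204)."""
        # A real system would forward `data` to the node; this sketch only returns the node name.
        return self.node_map[slot]

    def process(self, data: dict) -> str:
        feature, candidates = self.acquire(data)
        step = self.first_determine(data)
        slot = self.second_determine(step, feature, candidates)
        return self.transmit(data, slot)
```

Keeping the splitting steps in a lookup table means a new data type can be given its own splitting step without changing the other units.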

In some alternative implementations of the present embodiment, the characteristic value includes a user name extracted from the data; and the second determination unit 503 is further configured to: convert the characteristic value into a hash code; compute the remainder of dividing the hash code by the number of the at least one candidate slot; and query, according to a preset slot mapping table, the slot identifier corresponding to the remainder, where the slot mapping table characterizes the correspondence between remainders and slot identifiers.
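The hash-and-remainder splitting step can be sketched in Python as follows. The function names are hypothetical, and the use of MD5 is an assumption made only to obtain a hash code that is stable across processes (Python's built-in `hash()` is salted per process for strings); the application does not name a hash function.

```python
import hashlib
from typing import Dict, List


def hash_code(feature_value: str) -> int:
    # Assumption: MD5 is used purely as a stable, non-cryptographic hash here.
    digest = hashlib.md5(feature_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def select_slot_by_user(user_name: str, candidate_slots: List[str]) -> str:
    if not candidate_slots:
        raise ValueError("at least one candidate slot is required")
    # Slot mapping table: remainder -> slot identifier (built here from list order).
    slot_mapping_table: Dict[int, str] = dict(enumerate(candidate_slots))
    remainder = hash_code(user_name) % len(candidate_slots)
    return slot_mapping_table[remainder]
```

Because the hash code depends only on the user name, the same user is always routed to the same slot as long as the candidate slots do not change.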

In some alternative implementations of the present embodiment, the characteristic value includes the time at which the data was received; and the second determination unit 503 is further configured to: acquire the historical flow of the at least one candidate slot, where the historical flow refers to the amount of data distributed to the slot within a preset period; divide a day into at least one time interval according to the historical flow, where each time interval corresponds to one candidate slot; and determine the candidate slot corresponding to the time interval in which the time falls.
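A minimal Python sketch of the time-based splitting step is given below. The application does not state how the historical flow determines the interval boundaries, so the rule used here (each slot's interval length is proportional to its share of the historical flow) is an assumption, and the function names are hypothetical.

```python
from bisect import bisect_right
from datetime import time
from itertools import accumulate
from typing import Dict, List, Tuple

SECONDS_PER_DAY = 24 * 60 * 60


def build_intervals(historical_flow: Dict[str, int]) -> Tuple[List[str], List[float]]:
    """Divide a day into one interval per candidate slot.

    Assumption: interval length is proportional to the slot's historical flow.
    Returns the slots and the end of each slot's interval, in seconds since midnight.
    """
    slots = list(historical_flow)
    total = sum(historical_flow.values())
    if total <= 0:
        raise ValueError("total historical flow must be positive")
    cumulative = accumulate(historical_flow[s] for s in slots)
    ends = [SECONDS_PER_DAY * c / total for c in cumulative]
    return slots, ends


def select_slot_by_time(received_at: time, slots: List[str], ends: List[float]) -> str:
    second_of_day = received_at.hour * 3600 + received_at.minute * 60 + received_at.second
    # Find the first interval whose end lies after the receive time.
    index = bisect_right(ends, second_of_day)
    return slots[min(index, len(slots) - 1)]
```

Since the intervals can be computed ahead of time, the lookup at request time is a single binary search over the interval boundaries.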

In some optional implementations of this embodiment, the apparatus 500 further includes a log unit (not shown) configured to record the determined splitting step, the characteristic value, and the slot identifier corresponding to the slot into a log.

In some optional implementations of the present embodiment, the log unit is further configured to send the log to a log server.
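The log unit could be sketched with Python's standard `logging` module as follows. The logger name, the JSON field names, and the function names are hypothetical. Sending to a log server is left to whichever handler is passed in; for example, the standard library's `logging.handlers.SocketHandler(host, port)` could be used, with the host and port of the log server being deployment-specific assumptions.

```python
import json
import logging


def make_split_logger(handler: logging.Handler) -> logging.Logger:
    """Create a logger that forwards split records to the given handler."""
    logger = logging.getLogger("split_log")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()  # avoid duplicate handlers if called more than once
    logger.addHandler(handler)
    return logger


def record_split(logger: logging.Logger, split_step: str, feature_value: str, slot_id: str) -> str:
    """Record the splitting step, characteristic value, and slot identifier as one JSON line."""
    entry = json.dumps(
        {"split_step": split_step, "feature_value": feature_value, "slot_id": slot_id},
        sort_keys=True,
    )
    logger.info(entry)
    return entry
```

Recording all three fields in one entry makes it possible to check afterwards which splitting step sent a given request to a given slot.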

Referring now to FIG. 6, a schematic diagram of a computer system 600 suitable for use in implementing an electronic device of an embodiment of the present application is shown. The electronic device shown in fig. 6 is only an example and should not impose any limitation on the functionality and scope of use of the embodiments of the present application.

As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.

The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, mouse, etc.; an output portion 607 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. The drive 610 is also connected to the I/O interface 605 as needed. Removable media 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed as needed on drive 610 so that a computer program read therefrom is installed as needed into storage section 608.

In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611. The above-described functions defined in the method of the present application are performed when the computer program is executed by a Central Processing Unit (CPU) 601. It should be noted that, the computer readable medium described in the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. 
In the present application, however, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations of the present application may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units involved in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor, which may, for example, be described as: a processor including an acquisition unit, a first determination unit, a second determination unit, and a transmission unit. In some cases the names of these units do not limit the units themselves; for example, the acquisition unit may also be described as "a unit that, in response to receiving data to be processed, acquires a characteristic value associated with the data and determines, from a preset set of slots, at least one candidate slot for distributing the data to a distributed system node".

As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: in response to receiving data to be processed, acquire a characteristic value associated with the data and determine, from a preset set of slots, at least one candidate slot for distributing the data to a node of the distributed system, where a slot is used to manage resources of a node in the distributed system; determine the type of the data, and determine the splitting step corresponding to that type according to a preset splitting type mapping table, where the splitting type mapping table characterizes the correspondence between data types and splitting steps; determine, based on the characteristic value and according to the determined splitting step, a slot for distributing the data to a distributed system node from the at least one candidate slot; and send the data to the distributed system node corresponding to the determined slot according to a preset distributed system mapping table, where the distributed system mapping table characterizes the correspondence between slots and distributed system nodes.

The foregoing description covers only the preferred embodiments of the present application and explains the technical principles applied. Persons skilled in the art will appreciate that the scope of the invention referred to in this application is not limited to technical solutions formed by the specific combinations of the features described above, and also covers other technical solutions formed by any combination of those features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with technical features of similar function disclosed in (but not limited to) the present application.

Claims

1. A data processing method, comprising:

in response to receiving data to be processed, acquiring a characteristic value associated with the data and determining at least one candidate slot for distributing the data to nodes of the distributed system from a preset slot set, wherein the slot is used for managing resources of the nodes in the distributed system;

determining the type of the data, and determining a splitting step corresponding to the type according to a preset splitting type mapping table, wherein the splitting type mapping table is used for representing the corresponding relation between the type of the data and the splitting step;

determining a slot for distributing the data to a distributed system node from the at least one candidate slot according to the determined splitting step based on the characteristic value;

and transmitting the data to the distributed system node corresponding to the determined slot position according to a preset distributed system mapping table, wherein the distributed system mapping table is used for representing the corresponding relation between the slot position and the distributed system node.

2. The method of claim 1, wherein the characteristic value comprises a user name extracted from the data; and

the determining, based on the characteristic value, a slot for distributing the data to a distributed system node from the at least one candidate slot according to the determined splitting step includes:

converting the characteristic value into a hash code;

computing the remainder of dividing the hash code by the number of the at least one candidate slot;

and querying a slot position identifier corresponding to the remainder according to a preset slot position mapping table, wherein the slot position mapping table is used for representing the corresponding relation between the remainder and the slot position identifier.

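3. The method of claim 1, wherein the characteristic value comprises a time at which the data was received; and

the determining, based on the characteristic value, a slot for distributing the data to a distributed system node from the at least one candidate slot according to the determined splitting step includes:

acquiring historical flow of the at least one candidate slot, wherein the historical flow refers to the amount of data distributed to the slot within a preset time;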

dividing a day into at least one time interval according to the historical flow, wherein each time interval corresponds to one candidate slot;

and determining a candidate slot position corresponding to the time interval in which the time is positioned.

4. The method of claim 1, wherein the method further comprises:

recording the determined splitting step, the characteristic value and the slot position identifier corresponding to the slot position into a log.

5. The method of claim 4, wherein the method further comprises:

and sending the log to a log server.

6. A data processing apparatus comprising:

an obtaining unit, configured to obtain a characteristic value associated with data in response to receiving the data to be processed, and determine at least one candidate slot for distributing the data to a node of a distributed system from a preset slot set, wherein the slot is used for managing resources of the node in the distributed system;

a first determining unit, configured to determine the type of the data and determine a splitting step corresponding to the type according to a preset splitting type mapping table, wherein the splitting type mapping table is used for representing the corresponding relation between the type of the data and the splitting step;

a second determining unit, configured to determine, based on the characteristic value and according to the determined splitting step, a slot for distributing the data to a distributed system node from the at least one candidate slot;

and a sending unit, configured to send the data to the distributed system node corresponding to the determined slot according to a preset distributed system mapping table, wherein the distributed system mapping table is used for representing the corresponding relation between the slot and the distributed system node.

7. The apparatus of claim 6, wherein the characteristic value comprises a user name extracted from the data; and

the second determination unit is further configured to:

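convert the characteristic value into a hash code;

compute the remainder of dividing the hash code by the number of the at least one candidate slot;

and query a slot position identifier corresponding to the remainder according to a preset slot position mapping table, wherein the slot position mapping table is used for representing the corresponding relation between the remainder and the slot position identifier.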

8. The apparatus of claim 6, wherein the characteristic value comprises a time at which the data was received; and

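the second determination unit is further configured to:

acquire historical flow of the at least one candidate slot, wherein the historical flow refers to the amount of data distributed to the slot within a preset time;

divide a day into at least one time interval according to the historical flow, wherein each time interval corresponds to one candidate slot;

and determine a candidate slot corresponding to the time interval in which the time is located.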

9. The apparatus of claim 6, wherein the apparatus further comprises:

a log unit, configured to record the determined splitting step, the characteristic value and the slot position identifier corresponding to the slot position into a log.

10. The apparatus of claim 9, wherein the log unit is further configured to:

send the log to a log server.

11. An electronic device, comprising:

one or more processors;

a storage device storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-5.

12. A computer readable medium having stored thereon a computer program, wherein the program when executed by a processor implements the method of any of claims 1-5.