CN112988429B - Data processing method and device, electronic equipment and computer readable storage medium - Google Patents

Info

Publication number
CN112988429B
Authority
CN
China
Prior art keywords
data
identity
processed
message
partition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110497751.9A
Other languages
Chinese (zh)
Other versions
CN112988429A (en
Inventor
陈佛林
高斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu New Hope Finance Information Co Ltd
Original Assignee
Chengdu New Hope Finance Information Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu New Hope Finance Information Co Ltd
Priority to CN202110497751.9A
Publication of CN112988429A
Application granted
Publication of CN112988429B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a data processing method, a data processing device, an electronic device and a computer readable storage medium. The method comprises: writing the current message into a corresponding data storage table; processing the messages in the data storage table by using one or more processing threads; acquiring, through a consumption thread, processed data from the data storage table in order of identity priority from high to low; and, after the consumption thread successfully reads the processed data, marking the successfully read processed data as consumed data. The efficiency of data processing can thereby be improved while the sequential consumption of data is maintained.

Description

Data processing method and device, electronic equipment and computer readable storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data processing method, an apparatus, an electronic device, and a computer-readable storage medium.
Background
Currently, the producer-consumer data processing pattern is widely used in business systems, and various message queues have been developed to meet its data processing requirements, such as RabbitMQ, RocketMQ, ActiveMQ, Kafka, ZeroMQ, and MetaMq, as well as message queues implemented on top of databases, REDIS, ZooKeeper, and the like. These message queues follow the first-in first-out rule, so that messages are both processed and consumed in sequence. However, message processing in this mode is relatively inefficient and cannot meet the requirements of some more complex scenarios.
Disclosure of Invention
The application aims to provide a data processing method, a data processing device, an electronic device and a computer readable storage medium, which can solve the problem of low data processing efficiency.
In a first aspect, the present invention provides a data processing method, including:
writing the current message into a corresponding data storage table;
processing the messages in the data storage table by using one processing thread or a plurality of processing threads, and storing the processed data obtained by processing into the data storage table;
acquiring processed data from the data storage table through a consumption thread according to the priority of the identity from high to low;
after the consumption thread reads the processed data successfully, marking the processed data which is read successfully as consumed data.
In an optional embodiment, the writing the current message into the corresponding data storage table includes:
determining a first message partition to which the current message belongs according to the current message identity of the current message;
writing the current message identity into a position corresponding to the first message partition in the first identity table;
and writing the current message into a data storage table corresponding to the first message partition.
In the above embodiment, by storing each message in a partition manner according to the identity of each message, data can be conveniently processed concurrently based on the partition in the data processing stage, so as to improve the efficiency of data processing.
In an alternative embodiment, the processing, by using one processing thread or multiple processing threads, a message in the data storage table, and storing processed data obtained by the processing into the data storage table includes:
acquiring an identity to be processed from the first identity table;
determining a second message partition corresponding to the identity identifier to be processed according to the identity identifier to be processed;
acquiring a to-be-processed message corresponding to the to-be-processed identity from a position corresponding to the second message partition in the data storage table;
processing the message to be processed according to a set processing logic;
and if the to-be-processed message is successfully processed, updating a second identity identification table based on the to-be-processed identity identification, and writing the processed data corresponding to the to-be-processed identity identification into a data storage table corresponding to the second message partition.
In the above embodiment, by establishing two identity tables, namely the first identity table and the second identity table, the identities of data in different states can be stored in separate tables. This enables ordered data processing and data consumption, and reduces the cases in which data is processed repeatedly or enters the consumption stage without having been processed.
In an alternative embodiment, the obtaining, by the consuming thread, the processed data from the data storage table according to the priority of the identity from high to low includes:
using a target consumption thread corresponding to a target message partition to acquire the current identity of the target message partition from the second identity table;
determining all processed data of the target message partition from the position corresponding to the target message partition in the data storage table according to the current identity;
determining all consumed data of the target message partition;
determining an identity identifier with the highest priority in all processed data of the target message partition according to all processed data of the target message partition;
determining consumable data in the target message partition according to all consumed data of the target message partition, the current identity of the target message partition and the identity with the highest priority in all processed data of the target message partition;
and acquiring the consumable data from a data storage table corresponding to the target message partition from high to low according to the priority of the identity.
In the above embodiment, through the determination of the consumed data and the processed data, the situation of repeated consumption can be reduced while the ordered consumption is realized, and the accuracy of data processing consumption is improved.
In an alternative embodiment, the method further comprises:
and if processing of the to-be-processed message fails, writing the to-be-processed identity back into the first identity table.
In the above embodiment, by identifying whether the data processing is successful or not, and adopting different processing flows for the identity of the data which is successfully processed and the identity of the data which is failed to be processed, the overall success rate of the data processing can be improved.
In an alternative embodiment, the method further comprises:
if a processing thread does not acquire an identity from the sub-table corresponding to a third message partition in the first identity table, judging whether the identity recorded for the third message partition in the second identity table is the identity with the lowest priority among the consumable data in the processed data;
and if the identity recorded for the third message partition in the second identity table is not the identity with the lowest priority among the consumable data in the processed data, updating the identity recorded for the third message partition in the second identity table to that lowest-priority identity.
In the above embodiment, the identity recorded for the third message partition in the second identity table is dynamically updated, so that more processed data can be consumed in the subsequent data consumption stage, which improves the efficiency of data consumption.
In an optional embodiment, after the consuming thread successfully reads the processed data, marking the processed data successfully read as consumed data includes:
after the consumption thread reads the processed data successfully, updating a third identity identification table according to the identity identification of the currently acquired processed data, wherein the data corresponding to the identity identification in the third identity identification table is the consumed data;
and deleting the consumed data corresponding to the identification recorded in the third identification table from the data storage table.
In the embodiment, the consumed data is dynamically cleared, so that the occupied space of the data can be reduced.
In a second aspect, the present invention provides a data processing apparatus comprising:
the first writing module is used for writing the current message into a corresponding data storage table;
the processing module is used for processing the messages in the data storage table by using one processing thread or a plurality of processing threads and storing the processed data obtained by processing into the data storage table;
the acquisition module is used for acquiring processed data from the data storage table from high to low through a consumption thread according to the priority of the identity;
and the marking module is used for marking the processed data which is successfully read as the consumed data after the consumption thread successfully reads the processed data.
In a third aspect, the present invention provides an electronic device comprising: a processor, and a memory storing machine readable instructions executable by the processor, wherein, when the electronic device runs, the machine readable instructions are executed by the processor to perform the steps of the method of any of the preceding embodiments.
In a fourth aspect, the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method according to any of the preceding embodiments.
The beneficial effects of the embodiments of the application are as follows: data processing and data consumption are divided into two stages, and a highly concurrent processing mode can be adopted in the data processing stage to improve the efficiency of data processing; in the data consumption stage, the data in the data storage table can be acquired for consumption in order of identity priority from high to low, so that the sequential consumption of the data is maintained.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Fig. 2 is a flowchart of a data processing method according to an embodiment of the present application.
Fig. 3 is a detailed flowchart of step 202 of the data processing method according to the embodiment of the present application.
Fig. 4 is a detailed flowchart of step 204 of the data processing method according to the embodiment of the present application.
Fig. 5 is a detailed flowchart of step 206 of the data processing method according to the embodiment of the present application.
Fig. 6 is a schematic functional block diagram of a data processing apparatus according to an embodiment of the present application.
Detailed Description
The technical solution in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Existing message queues include RabbitMQ, RocketMQ, ActiveMQ, Kafka, ZeroMQ, MetaMq and the like, all of which use a first-in first-out (FIFO) message processing mode. Such message queues cannot handle a scenario in which: 1) the producer writes the messages into the message queue in sequence; 2) the messages in the queue are processed concurrently; 3) the consumer consumes the messages in the queue in sequence and uses them in order. In the existing approach, a message is consumed by a single thread at the consumer end only after the message has been completely processed.
Based on this, the data processing method, the data processing device, the electronic device and the computer-readable storage medium provided by the application can implement the above-mentioned scenario that cannot be implemented by the existing message queue. This is described below by means of several embodiments.
Example one
To facilitate understanding of the present embodiment, first, an electronic device executing the data processing method disclosed in the embodiments of the present application will be described in detail.
Fig. 1 is a block diagram of an electronic device. The electronic device 100 may include a memory 111, a processor 113, and an input/output unit 115. It will be understood by those of ordinary skill in the art that the structure shown in fig. 1 is merely exemplary and is not intended to limit the structure of the electronic device 100. For example, the electronic device 100 may also include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1.
The above-mentioned components of the memory 111, the processor 113 and the input/output unit 115 are directly or indirectly electrically connected to each other to realize data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The processor 113 is used to execute the executable modules stored in the memory.
The memory 111 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The memory 111 is configured to store a program, and the processor 113 executes the program after receiving an execution instruction. The method performed by the electronic device 100, as defined by the process disclosed in any embodiment of the present application, may be applied to the processor 113 or implemented by the processor 113.
The processor 113 may be an integrated circuit chip having signal processing capability. The processor 113 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor 113 may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor or any conventional processor.
The input/output unit 115 is used for the user to provide input data. The input/output unit 115 may be, but is not limited to, a mouse, a keyboard, and the like.
The electronic device 100 in this embodiment may be configured to perform each step in each method provided in this embodiment. The implementation of the data processing method is described in detail below by means of several embodiments.
Example two
Please refer to fig. 2, which is a flowchart illustrating a data processing method according to an embodiment of the present disclosure. The specific process shown in fig. 2 will be described in detail below.
Step 202, writing the current message into the corresponding data storage table.
Optionally, before step 202, the method may further include: step 201, a data storage table required to be used in the data processing and data consumption process is created.
Illustratively, the data storage table may be a queue.
Illustratively, the configuration information of the queue may be written to a queue partition configuration table in a Remote Dictionary Server (REDIS). In this embodiment, the created queue may be used to store the messages involved in the data processing process. For example, the messages involved in the data processing process may be stored in a MAP (key, value) structure.
In one example, the method "hset queue [ name ] [ np ]" is used to create the required queue. Where name represents the name of the queue created and np represents the number of partitions contained in the queue.
Optionally, a global counter may be initialized before data processing and data consumption, so as to facilitate generation of an identifier for a message to be written into the queue using the global counter. In one example, the global counter may be initialized using the method "set name 0".
In this implementation, as shown in fig. 3, step 202 may include the following steps 2021 to 2023.
Step 2021, determining the first message partition to which the current message belongs according to the current message identity of the current message.
Optionally, the identity of the message may be recorded as long-type data.
Optionally, the current message identity may be one generated using a global counter. Illustratively, the method "incr [ queue name ]" may be used to increment a global counter by one to sequentially increment the identity of the message.
Optionally, the current message identifier may also be a current identifier determined by the producer when the current message is generated.
If the identities are long-type values that increase sequentially, the first message partition to which the current message belongs can be determined by a modulo operation on the identity. For example, if the current number of partitions is m, the identity is taken modulo m, that is, the identity is divided by m and the remainder gives the index of the corresponding partition. For example, if the current identity is evenly divisible by m, the partition corresponding to the current message is the m-th partition; if the remainder of the current identity modulo m is three, the partition corresponding to the current message is the third partition.
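The modulo rule above can be sketched as follows. This is a minimal illustration only; the function name `partition_for` and its parameter names are assumptions introduced here, not names from the application.

```python
def partition_for(message_id: int, num_partitions: int) -> int:
    """Return the 1-based partition index for a message identity.

    Hypothetical helper: a zero remainder maps to the m-th partition,
    any other remainder r maps to the r-th partition.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    remainder = message_id % num_partitions
    return remainder if remainder != 0 else num_partitions
```

With four partitions, for instance, identity 12 would fall in the fourth partition and identity 7 in the third.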
Step 2022, writing the current message id into the position corresponding to the first message partition in the first id table.
In this embodiment, the first identity table is used to store the identity of the message that needs to be processed.
In this embodiment, the first identity identifier table includes storage locations corresponding to the message partitions.
In an example, the current message identity may be written into the position corresponding to the first message partition in the first identity table by using the method lpush.
Optionally, the first identity table may include a plurality of sub-tables, each sub-table being configured to store an identity of a message corresponding to one message partition.
Step 2023, writing the current message into the data storage table corresponding to the first message partition.
In this embodiment, the data storage table may correspond to a plurality of partitions, and each partition is used to store a message produced by a producer corresponding to the partition and processed data obtained based on the message produced by the producer.
Using REDIS as an example, the message may be written to the data storage table using the method "hset [ name ] [ Pn ] [ ID ] [ value ]", where Pn denotes a partition number, ID denotes the identity of the current message, and value denotes the value of the current message.
In this embodiment, the data storage table may be the data storage table initially created in step 201.
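Steps 2021 to 2023 can be sketched with in-memory structures standing in for REDIS. The class `QueueStore` and its attribute names are hypothetical; a counter stands in for the global counter, a deque per partition stands in for the sub-tables of the first identity table, and a dictionary per partition stands in for the data storage table. The sketch is single-threaded and does not reproduce the atomicity that a shared store would provide.

```python
from collections import deque
from itertools import count


class QueueStore:
    """In-memory stand-in (an assumption) for the tables described above."""

    def __init__(self, num_partitions: int):
        self.num_partitions = num_partitions
        self._counter = count(1)  # stands in for the global counter
        # First identity table: one sub-table of pending identities per partition.
        self.pending_ids = {p: deque() for p in range(1, num_partitions + 1)}
        # Data storage table: partition -> {identity: value}.
        self.data = {p: {} for p in range(1, num_partitions + 1)}

    def partition_for(self, message_id: int) -> int:
        # A zero remainder maps to the last partition.
        return message_id % self.num_partitions or self.num_partitions

    def write(self, value):
        message_id = next(self._counter)             # generate the identity
        partition = self.partition_for(message_id)   # step 2021
        self.pending_ids[partition].appendleft(message_id)  # step 2022, like lpush
        self.data[partition][message_id] = value     # step 2023, like hset
        return message_id
```

Writing five messages into a store with three partitions would place identities 1 and 4 in the first partition, 2 and 5 in the second, and 3 in the third.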
Step 204, processing the messages in the data storage table by using one processing thread or a plurality of processing threads, and storing the processed data obtained by processing into the data storage table.
In this embodiment, as shown in fig. 4, step 204 may include steps 2041 to 2046.
Step 2041, obtain the identity to be processed from the first identity table.
For any processing thread, the identity to be processed is acquired from the first identity table.
Optionally, each thread acquires the to-be-processed identity corresponding to the message to be processed from the sub-table corresponding to each message partition in the first identity table in a polling manner.
In one example, the identity to be processed may be obtained from a sub-table corresponding to each message partition in the first identity table by using the method rpop.
Step 2042, according to the identity identifier to be processed, determining a second message partition corresponding to the identity identifier to be processed.
In this embodiment, the second message partition is determined in the same manner as the first message partition, that is, by taking the identity to be processed modulo the number of partitions.
Step 2043, obtaining the to-be-processed message corresponding to the to-be-processed identity from the location corresponding to the second message partition in the data storage table.
Step 2044, the message to be processed is processed according to the set processing logic.
The processing logic for the message to be processed may differ between application scenarios. For example, for detecting the functions of a computer, the message to be processed may be a message requesting the operating parameters of the computer, and the corresponding processing logic may be to obtain the current operating parameters of the computer. For another example, if the written messages need to be stored in a database in sequence, the processing logic may convert the messages to be processed into SQL statements and concatenate the SQL statements. The processing logic for other messages is not described in detail herein.
Step 2045, if the message to be processed is successfully processed, updating a second identity identification table based on the identity identification to be processed, and writing the processed data corresponding to the identity identification to be processed into a data storage table corresponding to the second message partition.
For example, the processed data may be data obtained by processing the message to be processed by the corresponding processing logic.
The second identity table may be configured to store, for each message partition, the identity with the lowest priority among the processed data in that partition.
In this embodiment, the second identity table may also include a storage location corresponding to each message partition.
In this embodiment, the higher the priority of an identity, the earlier the processed data corresponding to that identity may be consumed in the consumption stage.
For example, the priority of the identity may be determined by the size of the identity, and the smaller the value of the identity, the higher the priority of the identity. For example, the id is of long type, and the smaller the value of the id, the higher the priority.
In this embodiment, for any designated message partition, the identity stored in the sub-table corresponding to the designated message partition in the second identity table is a designated identity. All processed data in the designated message partition whose identity has a priority not lower than that of the designated identity may be consumed. Taking the identity as long-type data as an example, if the identity corresponding to the designated message partition in the second identity table is n1, it indicates that all messages with identities no greater than n1 in the designated message partition have been processed successfully, and in the data consumption stage all processed data in the data storage table whose identities in the designated message partition are no greater than n1 may be consumed in sequence.
A data processing failure could cause incorrect data to be consumed in the subsequent consumption stage. To avoid this, and to allow a message whose processing failed to be processed again, as shown in fig. 4, the method in this embodiment may further include: step 2046, if the processing of the message to be processed fails, writing the identity to be processed back into the first identity table.
For example, the data corresponding to the identity in the first identity table may be obtained again by a processing thread, so as to be processed again by that processing thread.
The above-mentioned steps 2041 to 2046 describe the flow of message processing by one thread. In this embodiment, the flow from step 2041 to step 2046 may be executed concurrently by multiple threads, so as to implement highly concurrent processing of data and improve the efficiency of data processing.
In this embodiment, before step 206, a processing thread may fail to obtain an identity from the first identity table.
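The concurrent flow of steps 2041 to 2046 can be sketched with Python threads over in-memory structures. Everything named here is an assumption made for illustration: `run_workers`, the separate `processed` mapping (the application stores processed data in the data storage table itself), and the `max_attempts` cap, which is added only so that a permanently failing message cannot loop forever. The update of the second identity table is left out of this sketch.

```python
import threading


def run_workers(pending_ids, data, processed, handler, num_threads=4, max_attempts=3):
    """Drain pending_ids (partition -> deque of identities) with several threads."""
    lock = threading.Lock()
    attempts = {}  # hypothetical retry counter, not part of the source

    def take_job():
        # Poll the sub-table of each partition and pop from the right (like rpop).
        for partition, ids in pending_ids.items():
            if ids:
                return partition, ids.pop()
        return None

    def worker():
        while True:
            with lock:
                job = take_job()                       # step 2041
                if job is None:
                    return
                partition, message_id = job            # step 2042
                message = data[partition][message_id]  # step 2043
            try:
                result = handler(message)              # step 2044, runs outside the lock
            except Exception:
                with lock:
                    attempts[message_id] = attempts.get(message_id, 0) + 1
                    if attempts[message_id] < max_attempts:
                        # Step 2046: write the identity back so it is processed again.
                        pending_ids[partition].appendleft(message_id)
                continue
            with lock:
                # Step 2045: store the processed data under the same partition and identity.
                processed[partition][message_id] = result

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

The handler runs outside the lock so that several messages can be processed at the same time; only the table accesses are serialized. In this sketch a message that exceeds `max_attempts` is simply dropped, which a real system would likely want to report.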
The data processing method in this embodiment may further include the following step 2051 and step 2052.
Step 2051, if a processing thread does not obtain an identity from the sub-table corresponding to the third message partition in the first identity table, determining whether the identity recorded for the third message partition in the second identity table is the identity with the lowest priority among the consumable data in the processed data.
Optionally, before this determination is made, the third message partition in the data storage table is traversed to obtain all the identities corresponding to the third message partition, and the determination is then made according to all the identities of the third message partition.
In this embodiment, the consumable data is processed data whose identities form a run of consecutive priorities within the third message partition.
Step 2052, if the identity recorded for the third message partition in the second identity table is not the identity with the lowest priority among the consumable data in the processed data (that is, the maximum identity of the consumable data when a smaller identity value means a higher priority), updating the identity recorded for the third message partition in the second identity table to that identity.
Step 206, acquiring the processed data from the data storage table through the consumption thread in order of identity priority from high to low.
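One possible reading of steps 2051 and 2052 is sketched below: the value recorded in the second identity table is advanced to the largest identity that can be reached without a gap. The function name `advance_watermark` and the use of `None` for a partition with no recorded identity are assumptions. The sketch also assumes that identities come from a sequential counter and are assigned by the modulo rule, so that consecutive identities within one partition differ by the number of partitions.

```python
def advance_watermark(processed_ids, watermark, partition, num_partitions):
    """Return the largest gap-free processed identity for one partition.

    processed_ids: set of identities in this partition that were processed.
    watermark: identity currently recorded for the partition, or None.
    """
    # The first identity of a partition equals its 1-based index (assumption).
    next_id = partition if watermark is None else watermark + num_partitions
    while next_id in processed_ids:
        watermark = next_id
        next_id += num_partitions
    return watermark
```

For partition 1 of 3 with processed identities 1, 4 and 10, the recorded identity would advance to 4, because identity 7 has not yet been processed.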
In one embodiment, as shown in fig. 5, step 206 may include the following steps 2061 to 2066.
Step 2061, using the target consumption thread corresponding to the target message partition, and obtaining the current identity of the target message partition from the second identity table.
In this embodiment, each message partition may be bound to a consumption thread, and the consumption thread may be configured to obtain the processed data in the message partition bound to it.
Step 2062, determining all the processed data of the target message partition from the position corresponding to the target message partition in the data storage table according to the current identity.
In one embodiment, the processed data in the target message partition may be determined by obtaining all data in the location corresponding to the target message partition in the data storage table.
Taking the identity as long-type data as an example, among all the processed data in the target message partition, the data whose identity is not less than the minimum identity and not more than the current identity is consumable data.
Step 2063, determining all consumed data of the target message partition.
Optionally, all consumed data of the target message partition may be determined by the identity corresponding to the target message partition in the third identity table.
In this embodiment, the third id table also includes storage locations corresponding to the message partitions.
Illustratively, the third identity table is used for storing the lowest-priority identity among the consumed data of each message partition. Taking long-type identities as an example, the third identity table stores the largest identity among the consumed data of each message partition.
Step 2064, according to all the processed data of the target message partition, determining the identity with the highest priority in all the processed data of the target message partition.
Taking long-type identities as an example, the highest-priority identity among all the processed data of the target message partition is the smallest identity among that data.
Step 2065, determining consumable data in the target message partition according to all the consumed data of the target message partition, the current identity of the target message partition and the identity with the highest priority in all the processed data of the target message partition.
Taking long-type identities as an example, the processed data whose identity is larger than the largest identity in the consumed data and not larger than the current identity of the target message partition is determined to be the consumable data in the target message partition.
Step 2066, the consumable data is obtained from the data storage table corresponding to the target message partition according to the priority of the identity from high to low.
In this embodiment, the data consumption may be realized by acquiring the processed data from the data storage table.
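Steps 2061 to 2066 can be sketched as a single range selection. This is only an illustrative sketch: it assumes long-type identities where a smaller value means a higher priority, represents one partition of the data storage table as a dictionary from identity to processed data, and uses the hypothetical names `select_consumable`, `last_consumed_id` (the third identity table entry) and `current_id` (the second identity table entry).

```python
from typing import Dict, List


def select_consumable(partition_rows: Dict[int, str],
                      last_consumed_id: int,
                      current_id: int) -> List[int]:
    """Return the consumable identities of one partition, highest priority first."""
    if not partition_rows:
        return []
    # Step 2064: the highest-priority identity is the smallest one.
    lowest_id = min(partition_rows)
    # Step 2065: skip everything already consumed, stop at the current identity.
    start = max(lowest_id, last_consumed_id + 1)
    # Step 2066: ascending identity order equals priority from high to low.
    return [i for i in sorted(partition_rows) if start <= i <= current_id]
```

For a partition holding identities 5, 6, 7 and 9, with identity 5 already consumed and a current identity of 7, the selection yields 6 and then 7; identity 9 stays in the table until the current identity reaches it.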
Step 208, marking the processed data which is successfully read as consumed data after the consumption thread successfully reads the processed data.
In one embodiment, the identity of the consumed data may be stored in a table to indicate that the processed data has been consumed.
Illustratively, step 208 may include step 2081 and step 2082.
Step 2081, after the consumption thread reads the processed data successfully, updating the third identification table according to the currently acquired identification of the processed data.
The data corresponding to the identities in the third identity table is consumed data.
Alternatively, only the identity of the most recently consumed data of each message partition may be stored in the third identity table. That is, only the lowest-priority identity among the consumed data of each message partition may be stored. Taking long-type identities as an example, the third identity table may store only the largest identity among the consumed data of each message partition.
Optionally, the third identity table may also store the identities of all consumed data of each message partition.
Step 2082, deleting the consumed data corresponding to the identification recorded in the third identification table from the data storage table.
In one embodiment, the consumed data in the data storage table may be deleted after each consumption of one item of data by the consuming thread.
In another embodiment, the consumed data in each message partition in the data storage table may be deleted according to the identity recorded in the third identity table after every specified time period.
For example, when the identities of all consumed data are recorded in the third identity table, all consumed data in the data storage table may be deleted according to the identities recorded in the third identity table.
For example, when the identity with the lowest priority in the consumed data is recorded in the third identity table, all the identities with higher priorities than the identity recorded in the third identity table may be determined according to the identity recorded in the third identity table, and all the consumed data in the data storage table may be deleted according to the determined identities.
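The two deletion variants described above can be sketched as follows. This is a hedged illustration only: one partition of the data storage table is modelled as a dictionary from identity to data, identities are assumed to be long-type values, and the function names `delete_by_ids` and `delete_up_to` are hypothetical.

```python
from typing import Dict, Iterable


def delete_by_ids(rows: Dict[int, str], consumed_ids: Iterable[int]) -> None:
    """Variant 1: the third identity table lists every consumed identity."""
    for identity in consumed_ids:
        rows.pop(identity, None)  # ignore identities that were already removed


def delete_up_to(rows: Dict[int, str], last_consumed_id: int) -> None:
    """Variant 2: the third identity table holds only the largest consumed identity.

    Every row whose identity is not larger than that value (the recorded row
    and all higher-priority rows) is treated as consumed and removed.
    """
    for identity in [k for k in rows if k <= last_consumed_id]:
        del rows[identity]
```

The second variant keeps the third identity table small, at the cost of relying on strictly ordered consumption within a partition.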
In the data processing method provided by the embodiments of the application, data processing and data consumption are divided into two stages. A high-concurrency processing mode can be adopted in the data processing stage to improve data processing efficiency; in the data consumption stage, the data in the data storage table can be acquired for consumption in descending order of identity priority, so that sequential consumption of the data is maintained.
According to the data processing method provided by the embodiments of the application, high-concurrency processing is adopted in the intermediate data processing stage, which can improve data processing efficiency in scenarios where the data consumption order is strictly required. For example, operations that add and delete data must be applied in strict order, otherwise data confusion may result. When a producer needs to write sequentially transmitted messages into a final database, the messages need to be converted into SQL statements during intermediate processing; splicing the SQL statements consumes a large amount of time, so this intermediate step can use high-concurrency processing.
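The SQL scenario can be illustrated with a small two-stage sketch: statements are built concurrently, then handed to the consumer strictly in identity order. Everything here is an assumption made for illustration, including the message shape `(identity, operation, name)`, the `users` table, and the function names `to_sql` and `process_then_consume`; the string formatting is not safe for untrusted input and only stands in for the time-consuming splicing step.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Message = Tuple[int, str, str]  # (identity, operation, name); hypothetical shape


def to_sql(message: Message) -> Tuple[int, str]:
    """Stage 1 work item: splice one SQL statement from one message."""
    msg_id, op, name = message
    if op == "add":
        return msg_id, f"INSERT INTO users(name) VALUES ('{name}')"
    return msg_id, f"DELETE FROM users WHERE name = '{name}'"


def process_then_consume(messages: List[Message]) -> List[str]:
    processed: Dict[int, str] = {}
    # Stage 1: worker threads may finish in any order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        for msg_id, sql in pool.map(to_sql, messages):
            processed[msg_id] = sql
    # Stage 2: consume strictly by ascending identity (priority high to low).
    return [processed[i] for i in sorted(processed)]
```

Even if the delete message is submitted before the add message, the consumer still sees the INSERT for identity 1 before the DELETE for identity 2.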
EXAMPLE III
Based on the same inventive concept, an embodiment of the present application also provides a data processing apparatus corresponding to the data processing method. Since the principle by which the apparatus solves the problem is similar to that of the method embodiments, the implementation of the apparatus may refer to the description of the method embodiments, and repeated details are not described again.
Please refer to fig. 6, which is a schematic diagram of functional modules of a data processing apparatus according to an embodiment of the present disclosure. Each module in the data processing apparatus in this embodiment is configured to execute each step in the above-described method embodiment. The data processing apparatus includes: a first writing module 301, a processing module 302, an obtaining module 303 and a marking module 304; wherein,
a first writing module 301, configured to write the current message into a corresponding data storage table;
a processing module 302, configured to process a message in the data storage table by using one processing thread or multiple processing threads, and store processed data obtained by the processing into the data storage table;
an obtaining module 303, configured to obtain, by a consuming thread, processed data from the data storage table according to a priority of the identity identifier from high to low;
a marking module 304, configured to mark the processed data that is successfully read as consumed data after the consumption thread successfully reads the processed data.
In one possible implementation, the first writing module 301 is configured to:
determining a first message partition to which the current message belongs according to the current message identity of the current message;
writing the current message identity into a position corresponding to the first message partition in the first identity table;
and writing the current message into a data storage table corresponding to the first message partition.
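The three write steps of the first writing module can be sketched as below. The modulo partition rule, the partition count and all names (`NUM_PARTITIONS`, `first_id_table`, `data_storage_table`, `partition_of`, `write_message`) are hypothetical; the text only states that the partition is determined from the message identity.

```python
from collections import defaultdict, deque
from typing import Deque, Dict

NUM_PARTITIONS = 4  # assumed partition count, for illustration only

# partition -> identities waiting to be processed (first identity table)
first_id_table: Dict[int, Deque[int]] = defaultdict(deque)
# partition -> {identity: message} (data storage table)
data_storage_table: Dict[int, Dict[int, str]] = defaultdict(dict)


def partition_of(message_id: int) -> int:
    """Assumed rule: derive the partition from the identity by modulo."""
    return message_id % NUM_PARTITIONS


def write_message(message_id: int, payload: str) -> int:
    partition = partition_of(message_id)
    # Record the identity at the position of its partition.
    first_id_table[partition].append(message_id)
    # Store the message itself under the same partition.
    data_storage_table[partition][message_id] = payload
    return partition
```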
In a possible implementation, the processing module 302 is configured to:
acquiring an identity to be processed from the first identity table;
determining a second message partition corresponding to the identity identifier to be processed according to the identity identifier to be processed;
acquiring a to-be-processed message corresponding to the to-be-processed identity from a position corresponding to the second message partition in the data storage table;
processing the message to be processed according to a set processing logic;
and if the to-be-processed message is successfully processed, updating a second identity identification table based on the to-be-processed identity identification, and writing the processed data corresponding to the to-be-processed identity identification into a data storage table corresponding to the second message partition.
In a possible implementation, the obtaining module 303 is configured to:
using a target consumption thread corresponding to a target message partition to acquire the current identity of the target message partition from the second identity table;
determining all processed data of the target message partition from the position corresponding to the target message partition in the data storage table according to the current identity;
determining all consumed data of the target message partition;
determining an identity identifier with the highest priority in all processed data of the target message partition according to all processed data of the target message partition;
determining consumable data in the target message partition according to all consumed data of the target message partition, the current identity of the target message partition and the identity with the highest priority in all processed data of the target message partition;
and acquiring the consumable data from a data storage table corresponding to the target message partition from high to low according to the priority of the identity.
In a possible implementation manner, the data processing apparatus provided in an embodiment of the present application may further include: a second writing module, configured to write the identity to be processed back into the first identity table if processing of the message to be processed fails.
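The success and failure paths of one processing attempt can be sketched as follows. This is a hedged illustration: the queue `pending` stands for one partition's sub-table of the first identity table, `storage` for the same partition of the data storage table, and `handler` for the set processing logic; the function name `process_one`, the use of `max` to update the second identity table, and re-queuing at the tail are assumptions not stated in the text.

```python
from collections import deque
from typing import Callable, Deque, Dict


def process_one(pending: Deque[int],
                storage: Dict[int, str],
                second_id_table: Dict[int, int],
                partition: int,
                handler: Callable[[str], str]) -> bool:
    """Process one pending identity; return True on success."""
    if not pending:
        return False
    msg_id = pending.popleft()
    try:
        result = handler(storage[msg_id])
    except Exception:
        # Failure: write the identity back so that it is retried later.
        pending.append(msg_id)
        return False
    # Success: store the processed data and record the largest processed identity.
    storage[msg_id] = result
    second_id_table[partition] = max(second_id_table.get(partition, msg_id), msg_id)
    return True
```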
In a possible implementation manner, the data processing apparatus provided in an embodiment of the present application may further include:
a judging module, configured to judge, if a processing thread does not acquire an identity from the sub-table corresponding to a third message partition in the first identity table, whether the identity recorded at the position corresponding to the third message partition in the second identity table is the lowest-priority identity of the consumable data in the processed data;
and an updating module, configured to update the entry corresponding to the third message partition in the second identity table to the lowest-priority identity in the consumable data if the recorded identity is not the lowest-priority identity (the largest identity, for long-type identities) of the consumable data in the processed data.
In one possible implementation, the marking module 304 is configured to:
after the consumption thread reads the processed data successfully, updating a third identity identification table according to the identity identification of the currently acquired processed data, wherein the data corresponding to the identity identification in the third identity identification table is the consumed data;
and deleting the consumed data corresponding to the identification recorded in the third identification table from the data storage table.
In addition, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program performs the steps of the data processing method in the foregoing method embodiment.
The computer program product of the data processing method provided in the embodiment of the present application includes a computer-readable storage medium storing a program code, where instructions included in the program code may be used to execute the steps of the data processing method in the foregoing method embodiment, which may be specifically referred to in the foregoing method embodiment, and are not described herein again.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes. It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. A data processing method, comprising:
writing the current message into a corresponding data storage table;
processing the messages in the data storage table by using one processing thread or a plurality of processing threads, and storing the processed data obtained by processing into the data storage table to obtain a second identity identification table, wherein the second identity identification table is used for storing the identity identification with the lowest identity identification priority in the processed data in each message partition;
using a target consumption thread corresponding to a target message partition to acquire the current identity of the target message partition from the second identity table;
determining all processed data of the target message partition from the position corresponding to the target message partition in the data storage table according to the current identity;
determining all consumed data of the target message partition;
determining an identity identifier with the highest priority in all processed data of the target message partition according to all processed data of the target message partition;
determining consumable data in the target message partition according to all consumed data of the target message partition, the current identity of the target message partition and the identity with the highest priority in all processed data of the target message partition;
acquiring the consumable data from a data storage table corresponding to the target message partition according to the priority of the identity from high to low;
after the consumption thread reads the processed data successfully, marking the processed data which is read successfully as consumed data.
2. The data processing method of claim 1, wherein writing the current message into the corresponding data storage table comprises:
determining a first message partition to which the current message belongs according to the current message identity of the current message;
writing the current message identity into a position corresponding to the first message partition in a first identity table;
and writing the current message into a data storage table corresponding to the first message partition.
3. The data processing method according to claim 2, wherein the processing the message in the data storage table by using one processing thread or a plurality of processing threads, and storing the processed data obtained by the processing into the data storage table comprises:
acquiring an identity to be processed from the first identity table;
determining a second message partition corresponding to the identity identifier to be processed according to the identity identifier to be processed;
acquiring a to-be-processed message corresponding to the to-be-processed identity from a position corresponding to the second message partition in the data storage table;
processing the message to be processed according to a set processing logic;
and if the to-be-processed message is successfully processed, updating a second identity identification table based on the to-be-processed identity identification, and writing the processed data corresponding to the to-be-processed identity identification into a data storage table corresponding to the second message partition.
4. The data processing method of claim 3, wherein the method further comprises:
and if processing of the to-be-processed message fails, writing the to-be-processed identity into the first identity table.
5. The data processing method of claim 3, wherein the method further comprises:
if a processing thread does not acquire an identity from a sub-table corresponding to a third message partition in the first identity table, judging whether the identity recorded at the position corresponding to the third message partition in the second identity table is the identity with the lowest priority of consumable data in the processed data;
and if the identity recorded at the position corresponding to the third message partition in the second identity table is not the largest identity of the consumable data in the processed data, updating that entry to the largest identity, that is, the identity with the lowest priority, in the consumable data.
6. The data processing method of claim 1, wherein the marking the processed data that is successfully read as consumed data after the consuming thread successfully reads the processed data comprises:
after the consumption thread reads the processed data successfully, updating a third identity identification table according to the identity identification of the currently acquired processed data, wherein the data corresponding to the identity identification in the third identity identification table is the consumed data;
and deleting the consumed data corresponding to the identification recorded in the third identification table from the data storage table.
7. A data processing apparatus, comprising:
the first writing module is used for writing the current message into a corresponding data storage table;
the processing module is used for processing the messages in the data storage table by using one processing thread or a plurality of processing threads and storing the processed data obtained by processing into the data storage table to obtain a second identity identification table, wherein the second identity identification table is used for storing the identity identification with the lowest identity identification priority in the processed data in each message partition;
the acquisition module is used for acquiring processed data from the data storage table from high to low through a consumption thread according to the priority of the identity;
the marking module is used for marking the processed data which is successfully read as consumed data after the consumption thread successfully reads the processed data;
wherein the obtaining module is configured to:
using a target consumption thread corresponding to a target message partition to acquire the current identity of the target message partition from the second identity table;
determining all processed data of the target message partition from the position corresponding to the target message partition in the data storage table according to the current identity;
determining all consumed data of the target message partition;
determining an identity identifier with the highest priority in all processed data of the target message partition according to all processed data of the target message partition;
determining consumable data in the target message partition according to all consumed data of the target message partition, the current identity of the target message partition and the identity with the highest priority in all processed data of the target message partition;
and acquiring the consumable data from a data storage table corresponding to the target message partition from high to low according to the priority of the identity.
8. An electronic device, comprising: a processor, a memory storing machine-readable instructions executable by the processor, the machine-readable instructions when executed by the processor performing the steps of the method of any of claims 1 to 6 when the electronic device is run.
9. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, is adapted to carry out the steps of the method according to any one of claims 1 to 6.
CN202110497751.9A 2021-05-08 2021-05-08 Data processing method and device, electronic equipment and computer readable storage medium Active CN112988429B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110497751.9A CN112988429B (en) 2021-05-08 2021-05-08 Data processing method and device, electronic equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110497751.9A CN112988429B (en) 2021-05-08 2021-05-08 Data processing method and device, electronic equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN112988429A CN112988429A (en) 2021-06-18
CN112988429B true CN112988429B (en) 2021-08-06

Family

ID=76337265

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110497751.9A Active CN112988429B (en) 2021-05-08 2021-05-08 Data processing method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN112988429B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106874320A (en) * 2016-06-20 2017-06-20 阿里巴巴集团控股有限公司 The method and apparatus of distributive type data processing
CN107193539A (en) * 2016-03-14 2017-09-22 北京京东尚科信息技术有限公司 Multi-thread concurrent processing method and multi-thread concurrent processing system
US10726009B2 (en) * 2016-09-26 2020-07-28 Splunk Inc. Query processing using query-resource usage and node utilization data
US10831857B2 (en) * 2017-09-06 2020-11-10 Plex Systems, Inc. Secure and scalable data ingestion pipeline
CN112363853A (en) * 2020-11-10 2021-02-12 平安普惠企业管理有限公司 Kafka system-based message publishing method, device, equipment and medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Unifying Fixed Code Mapping,Communication,Synchronization and Scheduling Algorithms for Efficient and Scalable Looping Pipelining;Aristeidis Mastoras等;《IEEE Transactions on Parallel and Distributed Systems》;20180319;第2136-2149页 *
高并发多线程竞争共享资源架构;林平荣等;《计算机工程与设计》;20201116;第41卷(第11期);第3282-3288页 *

Also Published As

Publication number Publication date
CN112988429A (en) 2021-06-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant