CN115658745A - Data processing method, data processing device, computer equipment and computer readable storage medium - Google Patents

Data processing method, data processing device, computer equipment and computer readable storage medium

Info

Publication number
CN115658745A
Authority
CN
China
Prior art keywords
data
processed
state
target
external file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211236677.6A
Other languages
Chinese (zh)
Inventor
桑文锋
曹犟
刘耀洲
付力力
郭雪东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sensors Data Network Technology Beijing Co Ltd
Original Assignee
Sensors Data Network Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sensors Data Network Technology Beijing Co Ltd filed Critical Sensors Data Network Technology Beijing Co Ltd
Priority to CN202211236677.6A priority Critical patent/CN115658745A/en
Publication of CN115658745A publication Critical patent/CN115658745A/en
Pending legal-status Critical Current

Landscapes

  • Retry When Errors Occur (AREA)

Abstract

The application discloses a data processing method, a data processing device, computer equipment and a computer-readable storage medium. Data to be processed is obtained through a computing engine and processed to obtain processed data; the data state in the state backend of the computing engine is updated according to the processed data to obtain an updated data state; the processed data and the updated data state are stored in an external file; the processed data is sent to a data receiving end that requests the data, and the processed data that has been sent to the data receiving end is identified in the external file; when the data state is restored to the persisted data state indicated by a checkpoint event, target processed data that has not been identified, together with the target data state corresponding to it, is obtained from the external file; and the persisted data state in the state backend is updated according to the target data state and the target processed data is sent to the data receiving end, so that data retransmission and missed transmission can be avoided.

Description

Data processing method, data processing device, computer equipment and computer readable storage medium
Technical Field
The present application relates to the field of communications technologies, and in particular, to a data processing method, an apparatus, a computer device, and a computer-readable storage medium.
Background
In the big data field, computing engines such as Flink, MapReduce and Spark may be used to process large volumes of data and to exchange data with external systems. Some computing engines provide a mechanism for ensuring data consistency. Consistency inside the computing engine is handled automatically, but the consistency of data exchanged with an external system cannot be guaranteed. When a computing engine restarts after a service exception, its data state is restored to the persisted data state indicated by the checkpoint event. The computing engine then resends processed data obtained after the persisted data state, while processed data obtained before the persisted data state that had not yet been sent cannot be sent, so the data sent to the external system is duplicated or missed.
Disclosure of Invention
The embodiments of the application provide a data processing method, a data processing device, computer equipment and a computer-readable storage medium. When a computing engine restarts after a service exception and its data state is restored to the persisted data state indicated by a checkpoint event, the persisted data state in the state backend can be updated according to the data state stored in an external file; the processed data obtained before the updated data state is determined according to the data identifiers stored in the external file, and the target processed data that has not been sent to the data receiving end is sent to it, so that data retransmission and missed transmission after the computing engine restarts are avoided.
The data processing method provided by the embodiment of the application comprises the following steps:
acquiring data to be processed through a computing engine, and processing the data to be processed according to a data state in a state backend of the computing engine to obtain processed data;
updating the data state in the state backend of the computing engine according to the processed data to obtain an updated data state;
storing the processed data and the updated data state into an external file;
sending the processed data to a data receiving end that requests the data, and identifying, in the external file, the processed data that has been sent to the data receiving end;
when a service exception occurs in the computing engine and the data state is restored to the persisted data state indicated by a checkpoint event, acquiring, from the external file, stored target processed data that has not been identified and a target data state corresponding to the target processed data;
and updating the persisted data state in the state backend according to the target data state, and sending the target processed data to the data receiving end.
Correspondingly, an embodiment of the present application further provides a data processing apparatus, including:
a processing unit, configured to acquire data to be processed through a computing engine and process the data to be processed according to the data state in the state backend of the computing engine to obtain processed data;
an updating unit, configured to update the data state in the state backend of the computing engine according to the processed data to obtain an updated data state;
a storage unit, configured to store the processed data and the updated data state into an external file;
an identification unit, configured to send the processed data to a data receiving end that requests the data and to identify, in the external file, the processed data that has been sent to the data receiving end;
an obtaining unit, configured to, when a service exception occurs in the computing engine and the data state is restored to the persisted data state indicated by a checkpoint event, obtain, from the external file, stored target processed data that has not been identified and a target data state corresponding to the target processed data;
and a sending unit, configured to update the persisted data state in the state backend according to the target data state and send the target processed data to the data receiving end.
In one embodiment, the identification unit includes:
the first identifier acquiring subunit is used for acquiring the data identifier corresponding to the processed data which is sent to the data receiving terminal;
and the storage subunit is used for storing the data identifier in an index file preset in the external file.
In one embodiment, the identification unit includes:
a submitting subunit, configured to submit the processed data to a message queue, so as to send the processed data to the data receiving end through the message queue;
and the data identification subunit is used for identifying the processed data submitted to the message queue in the external file.
In one embodiment, the sending unit includes:
the second identifier acquiring subunit is configured to acquire a data identifier corresponding to the processed data in the message queue;
and the data sending subunit is configured to send the target processed data to the message queue if the data identifier in the message queue is not matched with the data identifier corresponding to the target processed data.
In one embodiment, the processing unit includes:
the receiving subunit is used for receiving data to be processed and storing the data to be processed into a data receiving queue;
and the data acquisition subunit is used for acquiring the data to be processed from the data receiving queue through the calculation engine.
In one embodiment, the data processing apparatus further comprises:
a position storage unit, configured to store the data reading position of the data to be processed in the data receiving queue into the external file as a target data reading position;
an information acquisition unit, configured to acquire current data to be processed from the data receiving queue and a current data reading position corresponding to the current data to be processed;
a position acquisition unit, configured to acquire the target data reading position from the external file;
and a data processing unit, configured to process the current data to be processed to obtain processed data if the positional relationship between the current data reading position and the target data reading position meets a preset condition.
In one embodiment, the processing unit includes:
a state subunit, configured to acquire a data state corresponding to the data to be processed from the state backend of the computing engine, wherein the data state comprises processed data obtained by processing historical data to be processed;
and a processing subunit, configured to process the data to be processed based on the data state to obtain processed data.
Correspondingly, the embodiment of the application also provides computer equipment, which comprises a memory and a processor; the memory stores a computer program, and the processor is used for operating the computer program in the memory to execute any data processing method provided by the embodiment of the application.
Accordingly, embodiments of the present application further provide a computer-readable storage medium, where the computer-readable storage medium is used to store a computer program, and the computer program is loaded by a processor to execute any one of the data processing methods provided in the embodiments of the present application.
In the embodiments of the application, data to be processed is acquired through the computing engine and processed according to the data state in the state backend of the computing engine to obtain processed data; the data state in the state backend is updated according to the processed data to obtain an updated data state; the processed data and the updated data state are stored in an external file; the processed data is sent to the data receiving end that requests the data, and the processed data that has been sent is identified in the external file; when a service exception occurs in the computing engine and the data state is restored to the persisted data state indicated by the checkpoint event, stored target processed data that has not been identified and the target data state corresponding to it are acquired from the external file; and the persisted data state in the state backend is updated according to the target data state, and the target processed data is sent to the data receiving end.
When the computing engine restarts after a service exception and its data state is restored to the persisted data state indicated by the checkpoint event, the persisted data state in the state backend can be updated according to the data state stored in the external file; the processed data obtained before the updated data state is determined according to the data identifiers stored in the external file, and the target processed data that has not been sent to the data receiving end is sent to it, so that data retransmission and missed transmission after the computing engine restarts are avoided.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a data processing method provided in an embodiment of the present application;
Fig. 2 is an interaction diagram of a computing engine provided in an embodiment of the present application;
Fig. 3 is a schematic diagram of a data processing apparatus provided in an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a computer device provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The embodiment of the application provides a data processing method, a data processing device, computer equipment and a computer readable storage medium. The data processing apparatus may be integrated into a computer device, and the computer device may be a server or a terminal.
The terminal may include a mobile phone, a wearable smart device, a tablet Computer, a notebook Computer, a Personal Computer (PC), a vehicle-mounted Computer, and the like.
The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, network service, cloud communication, middleware service, domain name service, security service, CDN, big data and artificial intelligence platform.
The following are detailed descriptions. It should be noted that the following description of the embodiments is not intended to limit the preferred order of the embodiments.
The present embodiment will be described from the perspective of a data processing apparatus, which may be specifically integrated in a computer device, and the computer device may be a server, or may be a terminal or other devices.
In the data processing method provided in the embodiment of the present application, as shown in Fig. 1, a specific flow of the data processing method may be as follows:
101. Acquire data to be processed through the computing engine, and process the data to be processed according to the data state in the state backend of the computing engine to obtain the processed data.
The computing engine may be a data processing engine. For example, Flink is a framework for stateful computation over unbounded and bounded data streams; it is a relatively efficient stream processing engine that treats all data as streams.
The state backend is a component provided by Flink for managing state. Flink provides three state backends: MemoryStateBackend, FsStateBackend and RocksDBStateBackend. The data state (state) stores intermediate calculation results or cached data in Flink. Some data processing depends on the intermediate data stored in the state, and a computation that depends on previous or subsequent events is called a stateful computation.
For example, the data to be processed is input into the computing engine; the computing engine acquires the data state from the state backend according to a preset program and processes the data to be processed according to that data state to obtain the corresponding processed data.
It can be understood that the state stores processed data obtained by processing historical data to be processed, that is, data received before the current data to be processed. The data state is the intermediate calculation result stored in the state, so processing the data to be processed according to the data state means processing it according to that intermediate calculation result.
The data to be processed may be data that needs to be calculated in a stateful manner by the calculation engine, and the processed data is data obtained by calculating the data to be processed, and may be, for example, a processing result or intermediate data.
That is, the step of processing the data to be processed according to the data state in the state back end of the computing engine to obtain the processed data may specifically include:
acquiring a data state corresponding to the data to be processed from the state backend of the computing engine, wherein the data state comprises processed data obtained by processing historical data to be processed;
and processing the data to be processed based on the data state to obtain the processed data.
For example, according to a preset computer program for processing the data to be processed, the data state on which the processing depends is acquired from the state backend of the computing engine, and the data to be processed is processed according to that data state to obtain the processed data.
In some application scenarios, a large amount of data may be generated in a burst and enter the computing engine all at once, which may cause the computing engine to crash. To prevent a crash caused by such a traffic burst, the data to be processed may be buffered in a message queue. That is, in an embodiment, the step "acquiring the data to be processed through the computing engine" may specifically include:
receiving data to be processed and storing the data to be processed into a data receiving queue;
and acquiring the data to be processed from the data receiving queue through the calculation engine.
For example, as shown in Fig. 2, a message queue (referred to as a data receiving queue to distinguish it from other queues) may be set between the computing engine and an upstream service. The upstream service is a system that generates the data to be processed, such as a terminal or a server. The data to be processed generated by the upstream service is stored in the message queue, and the computing engine acquires the data to be processed from the message queue.
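The buffering role of the data receiving queue can be sketched as follows. This is a minimal illustration only: Python's in-process `queue.Queue` stands in for a real message queue, and the `poll` helper and the `(offset, payload)` record layout are assumptions made for the example, not part of the application.

```python
import queue

# Hypothetical stand-in for the data receiving queue between the
# upstream service and the computing engine.
receive_queue = queue.Queue()

# An upstream burst: five records arrive at once and are buffered.
for offset in range(5):
    receive_queue.put((offset, {"address": "/home"}))

def poll(q, max_items):
    """Let the engine pull at most max_items records at its own pace."""
    batch = []
    while len(batch) < max_items and not q.empty():
        batch.append(q.get_nowait())
    return batch

first_batch = poll(receive_queue, 2)
```

The engine reads two records and the remaining three stay in the queue, so the burst never reaches the engine all at once.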
102. Update the data state in the state backend of the computing engine according to the processed data to obtain the updated data state.
For example, the data state in the state backend of the computing engine may be updated according to the processed data to obtain an updated data state. Assume that the task is to count, in real time, the total number of accesses to each address. The data to be processed may then be a request carrying an access address, and the data state in the state backend is the number of accesses to each address so far. The data to be processed is parsed to determine the access address; the processed data, namely the updated number of accesses to each address, is obtained from the data state and the current access address; and the data state is updated.
Updating the data state may consist of adding a record of the updated data state to the state backend.
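The access-count example for steps 101 and 102 can be sketched as follows. The dictionary `state_backend` and the function `process` are hypothetical names; an in-memory dictionary stands in for the engine's state backend.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the state backend:
# access address -> number of accesses seen so far.
state_backend = defaultdict(int)

def process(request):
    """Process one request against the data state, then update the state."""
    address = request["address"]            # parse the access address
    count = state_backend[address] + 1      # step 101: combine state and current record
    state_backend[address] = count          # step 102: update the data state
    return {"address": address, "count": count}

results = [process({"address": a}) for a in ["/home", "/cart", "/home"]]
```

Each result depends on the records processed before it, which is what makes the computation stateful.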
103. Store the processed data and the updated data state in an external file.
The external file may be a storage file independent of the computing engine; for example, it may be a file in the Hadoop Distributed File System (HDFS).
For example, the processed data and the updated data state are saved to the external file.
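Step 103 can be sketched as an append-only log. This is an assumption-laden illustration: a local JSON Lines file stands in for the external file (the application mentions HDFS as one option), and the record layout with `id`, `processed` and `state` fields is invented for the example.

```python
import json
import os
import tempfile

def append_record(path, data_id, processed, state):
    """Append one processed record together with the state it produced."""
    record = {"id": data_id, "processed": processed, "state": state}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())  # make the record durable before anything is sent

def load_records(path):
    """Read all records back in the order they were written."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

external_file = os.path.join(tempfile.mkdtemp(), "external.jsonl")
append_record(external_file, "r1", {"address": "/home", "count": 1}, {"/home": 1})
append_record(external_file, "r2", {"address": "/home", "count": 2}, {"/home": 2})
records = load_records(external_file)
```

Because the file lives outside the engine, its contents survive an engine restart and can be read back during recovery.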
104. Send the processed data to the data receiving end that requests the data, and identify, in the external file, the processed data that has been sent to the data receiving end.
For example, taking the above count of accesses to each address, a terminal may obtain the processed data from the computing engine, render it, and display it, so as to visualize the real-time access status of each address.
Optionally, as shown in Fig. 2, a message queue may be added between the computing engine and the downstream service to cache the processed data sent by the computing engine to the data receiving end, which facilitates asynchronous processing of the processed data. That is, the step "sending the processed data to the data receiving end that requests the data, and identifying, in the external file, the processed data that has been sent to the data receiving end" includes:
submitting the processed data to a message queue, so as to send the processed data to the data receiving end through the message queue;
and identifying, in the external file, the processed data that has been submitted to the message queue.
The data receiving end can acquire the processed data by subscribing to the message queue, so once the processed data is submitted to the message queue, it can be delivered through the queue to the data receiving end that requests the data.
Once processed data has been sent to the data receiving end or submitted to the message queue, it is regarded as delivered to the device that needs it and does not need to be sent again when an exception occurs. The processed data that has been sent or submitted therefore needs to be identified. For example, an identifier may be set for the processed data stored in the external file: if the processed data is stored in a list, the list may contain a field indicating whether each item has been sent, and the item is identified by writing a value or symbol meaning "sent" into that field.
For another example, the processed data may be stored in the form key-value1-value2, where key is the data identifier of the processed data, value1 is the processed data, and value2 indicates whether the processed data has been sent, for example 0 for not sent and 1 for sent. When the processed data is first stored in the external file, value2 is 0.
When the processed data is sent to the data receiving end or submitted to the message queue, the computing engine outputs the data identifier of the processed data, and the corresponding value2 is updated to 1 according to that data identifier.
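The key-value1-value2 layout can be sketched as follows. The dictionary `store` is a hypothetical in-memory view of the external file, and `save`, `mark_sent` and `unsent_keys` are names chosen for this illustration.

```python
# Hypothetical in-memory view of the external file:
# key (data identifier) -> [value1 (processed data), value2 (sent flag)].
store = {}

def save(key, processed):
    """Store processed data; value2 starts at 0, meaning not yet sent."""
    store[key] = [processed, 0]

def mark_sent(key):
    """Set value2 to 1 once the data has been sent or submitted to the queue."""
    store[key][1] = 1

def unsent_keys():
    """Return identifiers whose value2 is still 0."""
    return [k for k, (_, sent) in store.items() if sent == 0]

save("r1", {"count": 1})
save("r2", {"count": 2})
mark_sent("r1")
```

After a failure, only the entries returned by `unsent_keys()` would need to be sent.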
Optionally, the data identifiers may instead be stored separately, and the processed data in the external file that has not been sent may be determined from the stored data identifiers. That is, the step "identifying, in the external file, the processed data that has been sent to the data receiving end" may specifically include:
acquiring the data identifier corresponding to the processed data that has been sent to the data receiving end;
and storing the data identifier in an index file preset in the external file.
The data identifier may be an identifier corresponding to the data to be processed; it uniquely identifies the data to be processed, or the processed data obtained from it.
The index file may be a preset file for storing the data identifiers of the processed data that has been sent, and the processed data itself may be stored in a file different from the index file.
For example, if the processed data is delivered to the data receiving end through the message queue, submitting the processed data to the message queue is treated as sending it to the data receiving end; the data identifier corresponding to that processed data is then obtained and stored in the preset index file of the external file.
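The index-file variant can be sketched as follows, assuming one data identifier per line in a plain text file. The file name `sent.index` and the helper names are hypothetical.

```python
import os
import tempfile

def record_sent(index_path, data_id):
    """Append the identifier of sent processed data to the index file."""
    with open(index_path, "a", encoding="utf-8") as f:
        f.write(data_id + "\n")

def sent_ids(index_path):
    """Load the set of identifiers already recorded as sent."""
    if not os.path.exists(index_path):
        return set()
    with open(index_path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}

def unsent(stored_ids, index_path):
    """Stored identifiers that do not appear in the index file."""
    done = sent_ids(index_path)
    return [i for i in stored_ids if i not in done]

index_file = os.path.join(tempfile.mkdtemp(), "sent.index")
record_sent(index_file, "r1")
record_sent(index_file, "r2")
pending = unsent(["r1", "r2", "r3"], index_file)
```

Keeping the index in its own file means the stored processed data never has to be rewritten when something is sent; the unsent set is simply the stored identifiers minus the indexed ones.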
105. When a service exception occurs in the computing engine and the data state is restored to the persisted data state indicated by the checkpoint event, acquire from the external file the stored target processed data that has not been identified and the target data state corresponding to the target processed data.
The persisted data state may be the data state saved by a checkpoint event. The checkpoint event (Checkpoint) is a fault-tolerance mechanism provided by the computing engine. The computing engine may trigger checkpoint events periodically; when one is triggered, a snapshot of the current state of each task in the computing engine is generated and saved. Besides the current state of each task, the checkpoint event may also save the offset, in the message queue, of the data to be processed that has been read up to that state.
Under Flink's checkpoint mechanism, after the computing engine suffers a service failure and restarts, the data state in the state backend is recovered from the data state saved by the checkpoint event, and the offset of the data receiving queue is recovered as well. The computing engine then acquires and processes data to be processed starting from the recovered offset. Data that was processed after the checkpoint event and before the failure is therefore processed again and its processed data is sent again, which causes data retransmission.
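The retransmission problem can be reproduced with a toy replay. This is not Flink code: `run`, `sink` and the offsets are assumptions used only to show why replaying from a checkpointed offset resends data.

```python
def run(records, start_offset, sink):
    """Process records from start_offset onward and send each one to the sink."""
    for offset in range(start_offset, len(records)):
        sink.append(records[offset])

sink = []
records = ["a", "b", "c", "d"]
checkpoint_offset = 2                  # checkpoint taken after "a" and "b" were read

run(records[:3], 0, sink)              # "c" is sent, then the engine fails
run(records, checkpoint_offset, sink)  # restart replays from the checkpointed offset
```

The record "c" was sent after the checkpoint and before the failure, so the replay delivers it a second time.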
The processed data is sent to the data receiving end through the message queue, and the message queue may send the processed data in batches. When the checkpoint persists the state, the processed data corresponding to earlier states may not yet have been sent to the data receiving end; if a failure occurs at this moment, that processed data is missed when the engine recovers from the checkpoint.
In this embodiment of the application, the sent processed data is identified in the external file, so the target processed data that has not been sent and the data state corresponding to it can be obtained from the external file. The unsent processed data is submitted to the message queue to be sent to the data receiving end, or is sent to the data receiving end directly, so that data retransmission and missed transmission can be avoided.
It will be appreciated that in some scenarios the target processed data and the target data state are the same.
106. Update the persisted data state in the state backend according to the target data state, and send the target processed data to the data receiving end.
For example, the data state in the state backend, which has been restored to the persisted data state, is updated to the target data state, namely the latest (most recently acquired) target data state. The unsent processed data is submitted to the message queue to be sent to the data receiving end, or is sent to the data receiving end directly, so that the data stays consistent and retransmission and missed transmission are avoided.
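Steps 105 and 106 can be sketched together as one recovery routine. The record layout (`id`, `processed`, `state`), the `sent_ids` set and the `send` callback are assumptions for illustration; in particular, leaving the checkpoint state unchanged when nothing is unsent is a choice made for this sketch, not something the application specifies.

```python
def recover(checkpoint_state, external_records, sent_ids, send):
    """Restore state after a restart and send only the unsent processed data."""
    state = dict(checkpoint_state)  # state as restored from the checkpoint
    # Step 105: target processed data is whatever was stored but never marked sent.
    targets = [r for r in external_records if r["id"] not in sent_ids]
    if targets:
        # Step 106: move the restored state forward to the latest target data state.
        state = dict(targets[-1]["state"])
    for record in targets:
        send(record["processed"])
        sent_ids.add(record["id"])  # mark as sent so it is never resent
    return state

delivered = []
records = [
    {"id": "r1", "processed": {"count": 1}, "state": {"/home": 1}},
    {"id": "r2", "processed": {"count": 2}, "state": {"/home": 2}},
    {"id": "r3", "processed": {"count": 3}, "state": {"/home": 3}},
]
sent = {"r1", "r2"}
state = recover({"/home": 1}, records, sent, delivered.append)
```

Only `r3` is delivered, and the state moves from the checkpointed `{"/home": 1}` to `{"/home": 3}`, so nothing is sent twice and nothing is skipped.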
Optionally, when the target processed data is sent, the data identifiers of the processed data already in the message queue may be checked against the data identifier of the target processed data; if they are the same, the target processed data is not sent, which avoids retransmission. That is, in an embodiment, the step "sending the target processed data to the data receiving end" may specifically include:
acquiring the data identifier corresponding to the processed data in the message queue;
and if the data identifier in the message queue does not match the data identifier corresponding to the target processed data, submitting the target processed data to the message queue, so as to send it to the data receiving end through the message queue.
For example, the data identifier of the processed data cached in the message queue is obtained and compared with the data identifier of the target processed data. If they differ, the target processed data is submitted to the message queue and sent to the downstream service through the message queue; if they are the same, the target processed data is not sent.
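The identifier check before resubmission can be sketched as follows. A `collections.deque` of `(data_id, processed)` pairs stands in for the message queue, and `submit_if_absent` is a hypothetical helper name.

```python
from collections import deque

def submit_if_absent(message_queue, data_id, processed):
    """Submit target processed data only if its identifier is not already queued."""
    if any(queued_id == data_id for queued_id, _ in message_queue):
        return False  # same identifier found: skip to avoid retransmission
    message_queue.append((data_id, processed))
    return True

message_queue = deque([("r2", {"count": 2})])
first = submit_if_absent(message_queue, "r2", {"count": 2})
second = submit_if_absent(message_queue, "r3", {"count": 3})
```

Here `r2` is already cached in the queue and is skipped, while `r3` is submitted.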
Optionally, the data to be processed may be a single piece of data or batch data; in the latter case the computing engine receives multiple pieces of data at a time and processes the batch as one task. The data identifier may then be a transaction identifier (transaction ID) that identifies a batch of data received by the computing engine, and each piece of data in the batch has a sub-identifier that identifies it within the batch.
The computing engine processes each piece of data in the batch, so the processed data generated is also batch data. After all pieces of the processed data have been submitted to the message queue, the transaction identifier is stored in the external file. If a failure occurs before the batch has been completely submitted to the message queue, the processed data of each piece of data carrying the transaction ID can be obtained from the target processed data in the external file, and a piece is sent only if its sub-identifier does not match a sub-identifier already submitted.
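One possible reading of the batch case is sketched below. The source text is ambiguous here, so everything in this example is an assumption: the function name, the `stored_items` layout (transaction ID mapped to `(sub_id, processed)` pairs), the `committed_txns` set that represents transaction IDs written to the external file, and the `queued_sub_ids` set that represents sub-identifiers already in the message queue.

```python
def resend_incomplete_batch(txn_id, stored_items, committed_txns, queued_sub_ids, submit):
    """Resend only the pieces of a batch that never reached the message queue."""
    if txn_id in committed_txns:
        return []  # the whole batch was submitted before the failure
    resent = []
    for sub_id, processed in stored_items[txn_id]:
        if sub_id not in queued_sub_ids:  # piece missing from the queue
            submit(txn_id, sub_id, processed)
            resent.append(sub_id)
    committed_txns.add(txn_id)  # record the transaction ID once the batch is complete
    return resent

stored_items = {"t1": [("s1", "A"), ("s2", "B"), ("s3", "C")]}
committed = set()
submitted = []
resent = resend_incomplete_batch(
    "t1", stored_items, committed, {"s1"},
    lambda t, s, p: submitted.append((t, s, p)),
)
again = resend_incomplete_batch(
    "t1", stored_items, committed, {"s1", "s2", "s3"},
    lambda t, s, p: submitted.append((t, s, p)),
)
```

In this sketch the failure happened after `s1` was queued, so only `s2` and `s3` are resent, and a second recovery attempt for the same transaction does nothing.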
In an embodiment, after the data to be processed is acquired in step 101, its data reading position in the data receiving queue may be stored as a target data reading position. When the computing engine restarts after a failure, the data acquired from the data receiving queue can be filtered according to the stored data reading position. That is, after "acquiring the data to be processed from the data receiving queue through the computing engine", the data processing method provided in the embodiment of the present application may further include:
storing the data reading position of the data to be processed in the data receiving queue into the external file;
and after the step "updating the persisted data state in the state backend according to the target data state", the data processing method provided in the embodiment of the present application may further include:
acquiring current data to be processed and the current data reading position corresponding to it from the data receiving queue;
acquiring the target data reading position from the external file;
and if the positional relationship between the current data reading position and the target data reading position meets a preset condition, processing the current data to be processed to obtain processed data.
The data reading position may represent a position of the data to be processed in the data receiving queue, for example, the data reading position may be an offset.
When the data to be processed is acquired from the data receiving queue through the computing engine, the data reading position of the data to be processed is stored into the external file.
When the computing engine restarts because of a service exception, it resumes acquiring data to be processed from the data receiving queue, starting from the data reading position recovered from the checkpoint event. The current data to be processed and its corresponding current data reading position are acquired from the data receiving queue.
The stored target data reading position is acquired from the external file and used as a filtering condition: if the current data reading position is before the target data reading position, the current data to be processed has already been processed; if the current data reading position is after the target data reading position, the current data to be processed has not been processed.
If the position relation between the current data reading position and the target data reading position meets the preset condition, that is, the current data reading position is after the target data reading position, the current data to be processed is processed to obtain processed data.
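The position-based filtering described above can be illustrated with a minimal Python sketch. It is not part of the application: the function names, the use of integer offsets, and the treatment of an offset equal to the target position (skipped here) are all assumptions made for illustration.

```python
def should_process(current_offset: int, target_offset: int) -> bool:
    """Return True if the record was not processed before the restart."""
    # Assumption: the target position is the offset of the last record that
    # was already processed, so a record at an equal offset is skipped too.
    return current_offset > target_offset


def replay_after_restart(records, target_offset, process):
    """Process only the records located after the stored target position."""
    results = []
    for offset, payload in records:
        if should_process(offset, target_offset):
            results.append(process(payload))
    return results


# Records re-read starting from the checkpointed position (offset 3), while the
# external file indicates that everything up to offset 5 was already processed.
records = [(3, "a"), (4, "b"), (5, "c"), (6, "d"), (7, "e")]
print(replay_after_restart(records, target_offset=5, process=str.upper))
```

In this sketch only the records at offsets 6 and 7 are processed again; the records between the checkpointed position and the target position are read but discarded.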
As can be seen from the above, in the embodiment of the present application, the data to be processed is acquired through the computing engine and processed according to the data state in the state backend of the computing engine to obtain processed data; the data state of the state backend of the computing engine is updated according to the processed data to obtain an updated data state; the processed data and the updated data state are stored into an external file; the processed data is sent to the data receiving end requesting the data, and the processed data that has been sent to the data receiving end is identified in the external file; when a service exception occurs in the computing engine and the data state is restored to the disk storage data state indicated by the checkpoint event, the stored target processed data that has not been identified, together with the target data state corresponding to the target processed data, is acquired from the external file; and the disk storage data state of the state backend is updated according to the target data state, and the target processed data is sent to the data receiving end.
When the computing engine restarts because of a service exception, its data state can be restored to the disk storage data state indicated by the checkpoint event, and the disk storage data state in the state backend can then be updated according to the data state stored in the external file. The processed data obtained before the updated data state is determined according to the data identifiers stored in the external file, and the target processed data that has not been sent to the data receiving end is sent to it, which avoids both repeated sending and missed sending of data after the computing engine restarts.
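The recovery flow summarized above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation of the application: the `ExternalFile` class is an in-memory stand-in for the external file, its record layout and the names `save`, `mark_sent`, `unsent`, and `recover` are hypothetical, and the data state is modeled as a simple running count.

```python
from dataclasses import dataclass, field


@dataclass
class ExternalFile:
    """In-memory stand-in for the external file (hypothetical layout)."""
    records: list = field(default_factory=list)  # (data_id, processed, state)
    sent_ids: set = field(default_factory=set)   # identifiers of data already sent

    def save(self, data_id, processed, state):
        self.records.append((data_id, processed, state))

    def mark_sent(self, data_id):
        self.sent_ids.add(data_id)

    def unsent(self):
        return [r for r in self.records if r[0] not in self.sent_ids]


def recover(checkpoint_state, external, send):
    """Roll the restored state forward and resend data that was never identified."""
    state = checkpoint_state
    for data_id, processed, saved_state in external.unsent():
        state = saved_state          # update the state restored from the checkpoint
        send(processed)              # deliver data that never reached the receiver
        external.mark_sent(data_id)  # identify it so that it is not sent again
    return state


# The checkpoint was taken when the count was 1; the engine then produced two
# more results but failed before the third one was sent.
ext = ExternalFile()
ext.save("r1", "count=1", 1)
ext.mark_sent("r1")
ext.save("r2", "count=2", 2)
ext.mark_sent("r2")
ext.save("r3", "count=3", 3)

delivered = []
final_state = recover(checkpoint_state=1, external=ext, send=delivered.append)
print(final_state, delivered)
```

In this sketch the results r1 and r2 are not sent a second time because they are already identified, while r3 is sent exactly once and the state is advanced to the value saved with it.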
In order to better implement the data processing method provided by the embodiment of the present application, in an embodiment, a data processing apparatus is further provided. The terms are the same as those in the data processing method, and details of implementation can be referred to the description in the method embodiment.
The data processing apparatus may be specifically integrated in a computer device. As shown in fig. 3, the data processing apparatus may include a processing unit 301, an updating unit 302, a saving unit 303, an identification unit 304, an obtaining unit 305, and a sending unit 306, as follows:
(1) The processing unit 301: configured to acquire data to be processed through the computing engine, and to process the data to be processed according to the data state in the state backend of the computing engine to obtain processed data.
In an embodiment, the processing unit 301 may include a receiving subunit and a data acquiring subunit, specifically:
a receiving subunit: configured to receive data to be processed and store the data to be processed into the data receiving queue;
a data acquiring subunit: configured to acquire the data to be processed from the data receiving queue through the computing engine.
In an embodiment, the processing unit 301 may include a state subunit and a to-be-processed data processing subunit, specifically:
a state subunit: configured to acquire, from the state backend of the computing engine, the data state corresponding to the data to be processed, wherein the data state includes processed data obtained by processing historical data to be processed;
a to-be-processed data processing subunit: configured to perform data processing based on the data state and the data to be processed to obtain processed data.
(2) The updating unit 302: configured to update the data state of the state backend of the computing engine according to the processed data to obtain an updated data state.
(3) The saving unit 303: configured to save the processed data and the updated data state into an external file.
(4) The identification unit 304: configured to send the processed data to the data receiving end requesting the data, and to identify, in the external file, the processed data that has been sent to the data receiving end.
In an embodiment, the identification unit 304 may include a first identifier acquiring subunit and a storage subunit, specifically:
a first identifier acquiring subunit: configured to acquire the data identifier corresponding to the processed data that has been sent to the data receiving end;
a storage subunit: configured to store the data identifier in an index file preset in the external file.
In an embodiment, the identification unit 304 may include a submission subunit and a data identification subunit, specifically:
a submission subunit: configured to submit the processed data to a message queue, so that the processed data is sent to the data receiving end through the message queue;
a data identification subunit: configured to identify, in the external file, the processed data that has been submitted to the message queue.
(5) The obtaining unit 305: configured to, when a service exception occurs in the computing engine and the data state is restored to the disk storage data state indicated by the checkpoint event, acquire from the external file the stored target processed data that has not been identified and the target data state corresponding to the target processed data.
(6) The sending unit 306: configured to update the disk storage data state of the state backend according to the target data state, and to send the target processed data to the data receiving end.
In an embodiment, the sending unit 306 may include a second identifier acquiring subunit and a data sending subunit, specifically:
a second identifier acquiring subunit: configured to acquire the data identifier corresponding to the processed data in the message queue;
a data sending subunit: configured to send the target processed data to the message queue if the data identifier in the message queue does not match the data identifier corresponding to the target processed data.
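The check performed by the two subunits above can be sketched in Python. This is only an illustration under assumptions: the message queue is modeled as a plain list of (identifier, data) pairs, and the names `resend_missing` and `submit` are hypothetical rather than taken from the application or from any message queue client.

```python
def resend_missing(target_records, queue_ids, submit):
    """Submit target processed data whose identifier is absent from the queue."""
    known = set(queue_ids)
    resent = []
    for data_id, processed in target_records:
        if data_id not in known:  # no matching identifier in the message queue
            submit(data_id, processed)
            known.add(data_id)
            resent.append(data_id)
    return resent


# r2 reached the message queue before the failure but was never identified in
# the external file, so it appears among the target processed data with r3.
message_queue = [("r1", "count=1"), ("r2", "count=2")]
targets = [("r2", "count=2"), ("r3", "count=3")]

resent = resend_missing(
    targets,
    queue_ids=[data_id for data_id, _ in message_queue],
    submit=lambda data_id, processed: message_queue.append((data_id, processed)),
)
print(resent)
```

Comparing identifiers before resending is what keeps r2 from being submitted twice in this sketch, which would otherwise happen if the failure occurred after submission but before identification.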
In an embodiment, the data processing apparatus may further include a position storage unit, an information acquiring unit, a position acquiring unit, and a data processing unit, specifically:
a position storage unit: configured to store, into the external file, the data reading position of the data to be processed in the data receiving queue;
an information acquiring unit: configured to acquire, from the data receiving queue, the current data to be processed and the current data reading position corresponding to the current data to be processed;
a position acquiring unit: configured to acquire the target data reading position from the external file;
a data processing unit: configured to process the current data to be processed to obtain processed data if the position relation between the current data reading position and the target data reading position meets a preset condition.
As can be seen from the above, in the data processing apparatus in the embodiment of the present application, the processing unit 301 acquires the data to be processed through the computing engine and processes it according to the data state in the state backend of the computing engine to obtain processed data; the updating unit 302 updates the data state of the state backend of the computing engine according to the processed data to obtain an updated data state; the saving unit 303 saves the processed data and the updated data state into an external file; the identification unit 304 sends the processed data to the data receiving end requesting the data, and identifies, in the external file, the processed data that has been sent to the data receiving end; when a service exception occurs in the computing engine and the data state is restored to the disk storage data state indicated by the checkpoint event, the obtaining unit 305 acquires, from the external file, the stored target processed data that has not been identified and the target data state corresponding to the target processed data; and the sending unit 306 updates the disk storage data state of the state backend according to the target data state, and sends the target processed data to the data receiving end.
When the computing engine restarts because of a service exception, its data state can be restored to the disk storage data state indicated by the checkpoint event, and the disk storage data state in the state backend can then be updated according to the data state stored in the external file. The processed data obtained before the updated data state is determined according to the data identifiers stored in the external file, and the target processed data that has not been sent to the data receiving end is sent to it, which avoids both repeated sending and missed sending of data after the computing engine restarts.
An embodiment of the present application further provides a computer device, which may be a terminal or a server. Fig. 4 shows a schematic structural diagram of the computer device according to the embodiment of the present application. Specifically:
the computer device may include components such as a processor 1001 with one or more processing cores, a memory 1002 comprising one or more computer-readable storage media, a power supply 1003, and an input unit 1004. Those skilled in the art will appreciate that the computer device configuration illustrated in FIG. 4 does not constitute a limitation on the computer device, which may include more or fewer components than those illustrated, combine some components, or use a different arrangement of components. Wherein:
the processor 1001 is a control center of the computer device, connects various parts of the entire computer device using various interfaces and lines, and performs various functions of the computer device and processes data by running or executing software programs and/or modules stored in the memory 1002 and calling data stored in the memory 1002, thereby monitoring the computer device as a whole. Optionally, processor 1001 may include one or more processing cores; preferably, the processor 1001 may integrate an application processor, which mainly handles operating systems, user interfaces, computer programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 1001.
The memory 1002 may be used to store software programs and modules, and the processor 1001 executes various functional applications and data processing by operating the software programs and modules stored in the memory 1002. The memory 1002 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, a computer program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to use of the computer device, and the like. Further, the memory 1002 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 1002 may also include a memory controller to provide the processor 1001 access to the memory 1002.
The computer device further includes a power source 1003 for supplying power to each component, and preferably, the power source 1003 may be logically connected to the processor 1001 through a power management system, so that functions of managing charging, discharging, power consumption, and the like are implemented through the power management system. The power source 1003 may also include any component including one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
The computer device may further include an input unit 1004, and the input unit 1004 may be used to receive input numeric or character information, and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the computer device may further include a display unit and the like, which are not described in detail herein. Specifically, in this embodiment, the processor 1001 in the computer device loads the executable file corresponding to the process of one or more computer programs into the memory 1002 according to the following instructions, and the processor 1001 runs the computer programs stored in the memory 1002, so as to implement various functions as follows:
acquiring data to be processed through a computing engine, and processing the data to be processed according to the data state in the state backend of the computing engine to obtain processed data;
updating the data state of the state backend of the computing engine according to the processed data to obtain an updated data state;
storing the processed data and the updated data state into an external file;
sending the processed data to a data receiving end requesting the data, and identifying, in the external file, the processed data that has been sent to the data receiving end;
when a service exception occurs in the computing engine and the data state is restored to the disk storage data state indicated by the checkpoint event, acquiring, from the external file, the stored target processed data that has not been identified and the target data state corresponding to the target processed data;
and updating the disk storage data state of the state backend according to the target data state, and sending the target processed data to the data receiving end.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
As can be seen from the above, the computer device in the embodiment of the present application may acquire the data to be processed through the computing engine and process it according to the data state in the state backend of the computing engine to obtain processed data; update the data state of the state backend of the computing engine according to the processed data to obtain an updated data state; store the processed data and the updated data state into an external file; send the processed data to the data receiving end requesting the data, and identify, in the external file, the processed data that has been sent to the data receiving end; when a service exception occurs in the computing engine and the data state is restored to the disk storage data state indicated by the checkpoint event, acquire, from the external file, the stored target processed data that has not been identified and the target data state corresponding to the target processed data; and update the disk storage data state of the state backend according to the target data state, and send the target processed data to the data receiving end.
When the computing engine restarts because of a service exception, its data state can be restored to the disk storage data state indicated by the checkpoint event, and the disk storage data state in the state backend can then be updated according to the data state stored in the external file. The processed data obtained before the updated data state is determined according to the data identifiers stored in the external file, and the target processed data that has not been sent to the data receiving end is sent to it, which avoids both repeated sending and missed sending of data after the computing engine restarts.
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and executes the computer instructions, so that the computer device executes the method provided in the various alternative implementations of the above embodiments.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by a computer program, which may be stored in a computer-readable storage medium and loaded and executed by a processor, or by related hardware controlled by the computer program.
To this end, the present application provides a computer-readable storage medium, in which a computer program is stored, where the computer program can be loaded by a processor to execute any one of the data processing methods provided in the present application.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
The computer-readable storage medium may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Since the computer program stored in the computer-readable storage medium can execute any data processing method provided in the embodiments of the present application, it can achieve the beneficial effects of any of those methods; see the foregoing embodiments for details, which are not repeated here.
The foregoing detailed description has provided a data processing method, an apparatus, a computer device, and a computer-readable storage medium according to embodiments of the present application, and specific examples are applied herein to explain the principles and implementations of the present application, and the descriptions of the foregoing embodiments are only used to help understand the method and the core ideas of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (10)

1. A data processing method, comprising:
acquiring data to be processed through a computing engine, and processing the data to be processed according to a data state in a state backend of the computing engine to obtain processed data;
updating the data state of the state backend of the computing engine according to the processed data to obtain an updated data state;
storing the processed data and the updated data state into an external file;
sending the processed data to a data receiving end requesting data, and identifying, in the external file, the processed data that has been sent to the data receiving end;
when a service exception occurs in the computing engine and the data state is restored to a disk storage data state indicated by a checkpoint event, acquiring, from the external file, stored target processed data that has not been identified and a target data state corresponding to the target processed data;
and updating the disk storage data state of the state backend according to the target data state, and sending the target processed data to the data receiving end.
2. The method according to claim 1, wherein the identifying, at the external file, the processed data that has been sent to the data receiving end comprises:
acquiring a data identifier corresponding to the processed data sent to the data receiving terminal;
and storing the data identification in an index file preset in the external file.
3. The method according to claim 1, wherein the sending the processed data to a data receiving end requesting data and identifying the processed data sent to the data receiving end in the external file comprises:
submitting the processed data to a message queue so as to send the processed data to the data receiving end through the message queue;
identifying, at the external file, the processed data that has been submitted to the message queue.
4. The method of claim 3, wherein the sending the target processed data to the data receiving end comprises:
acquiring a data identifier corresponding to the processed data in the message queue;
and if the data identification in the message queue is not matched with the data identification corresponding to the target processed data, sending the target processed data to the message queue so as to send the processed data to the data receiving end through the message queue.
5. The method of claim 1, wherein the obtaining the data to be processed by the computing engine comprises:
receiving data to be processed and storing the data to be processed into a data receiving queue;
and acquiring the data to be processed from the data receiving queue through the computing engine.
6. The method of claim 5, wherein after the acquiring, by the computing engine, the data to be processed from the data receiving queue, the method further comprises:
storing the data reading position of the data to be processed in the data receiving queue into the external file;
after the updating the disk storage data state of the state backend according to the target data state, the method further includes:
acquiring current data to be processed and a current data reading position corresponding to the current data to be processed from the data receiving queue;
acquiring the target data reading position from the external file;
and if the position relation between the current data reading position and the target data reading position meets a preset condition, processing the current data to be processed to obtain processed data.
7. The method according to any one of claims 1 to 6, wherein the processing the data to be processed according to the data state in the state backend of the computing engine to obtain processed data comprises:
acquiring a data state corresponding to the data to be processed from the state backend of the computing engine, wherein the data state comprises processed data obtained by processing historical data to be processed;
and performing data processing based on the data state and the data to be processed to obtain processed data.
8. A data processing apparatus, comprising:
a processing unit, configured to acquire data to be processed through a computing engine and process the data to be processed according to a data state in a state backend of the computing engine to obtain processed data;
an updating unit, configured to update the data state of the state backend of the computing engine according to the processed data to obtain an updated data state;
a saving unit, configured to store the processed data and the updated data state into an external file;
an identification unit, configured to send the processed data to a data receiving end requesting data and identify, in the external file, the processed data that has been sent to the data receiving end;
an obtaining unit, configured to, when a service exception occurs in the computing engine and the data state is restored to a disk storage data state indicated by a checkpoint event, acquire, from the external file, stored target processed data that has not been identified and a target data state corresponding to the target processed data;
and a sending unit, configured to update the disk storage data state of the state backend according to the target data state and send the target processed data to the data receiving end.
9. A computer device comprising a memory and a processor; the memory stores a computer program, and the processor is configured to execute the computer program in the memory to perform the data processing method according to any one of claims 1 to 7.
10. A computer-readable storage medium for storing a computer program which is loaded by a processor to perform the data processing method of any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211236677.6A CN115658745A (en) 2022-10-10 2022-10-10 Data processing method, data processing device, computer equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN115658745A true CN115658745A (en) 2023-01-31

Family

ID=84987362

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211236677.6A Pending CN115658745A (en) 2022-10-10 2022-10-10 Data processing method, data processing device, computer equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN115658745A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117112302A (en) * 2023-08-30 2023-11-24 广州经传多赢投资咨询有限公司 Abnormal disaster recovery method, system, equipment and medium for financial data
CN117112302B (en) * 2023-08-30 2024-03-12 广州经传多赢投资咨询有限公司 Abnormal disaster recovery method, system, equipment and medium for financial data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination