CN116166673A - Data processing method and device, electronic equipment and computer readable storage medium - Google Patents

Data processing method and device, electronic equipment and computer readable storage medium

Info

Publication number
CN116166673A
Authority
CN
China
Prior art keywords
data processing
data
rule
source data
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210843047.9A
Other languages
Chinese (zh)
Inventor
杨祥
郭剑霓
吴海英
郭江
刘磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mashang Consumer Finance Co Ltd
Original Assignee
Mashang Consumer Finance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mashang Consumer Finance Co Ltd
Priority to CN202210843047.9A
Publication of CN116166673A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2379 Updates performed during online database operations; commit processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24564 Applying rules; Deductive queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02 Banking, e.g. interest calculation or account maintenance
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a data processing method and apparatus, an electronic device, and a computer readable medium. The method includes: acquiring source data to be processed; acquiring a target data processing rule corresponding to the source data, wherein the target data processing rule is a real-time rule for checking whether the source data meets data processing conditions and is obtained by synchronizing, in advance, the corresponding data processing rule in a rule base, and wherein the rule base stores the data processing rules corresponding to different services and, on receiving a rule change request, updates the stored data processing rule to which the request corresponds; and verifying the source data according to the target data processing rule, and performing data processing on the source data that passes the verification to obtain target data. According to the embodiments of the disclosure, the source data can be processed without changing the data processing code, so that high-quality target data can be obtained efficiently and in real time.

Description

Data processing method and device, electronic equipment and computer readable storage medium
Technical Field
The disclosure relates to the field of computer technology, and in particular, to a data processing method and device, an electronic device, and a computer readable storage medium.
Background
With the continuous development of internet technology, more and more service data is generated in real time in various service applications. In the face of changeable and complex real-time service scenarios, how to perform data processing on the service data generated by service applications, for example data governance, so as to provide real-time, high-quality data to users, is a problem to be solved urgently.
Currently, service data generated by a service application is generally handled by batch processing: service data accumulated over a period of time is obtained in batches, then screened and processed based on data processing rules pre-written in the data processing application. This approach may suffer from processing delay, insufficient real-time performance, and an inability to adapt in time to frequently changing service scenarios.
Disclosure of Invention
The disclosure provides a data processing method and device, electronic equipment and a computer readable storage medium.
In a first aspect, the present disclosure provides a data processing method, the method comprising:
acquiring source data to be processed;
acquiring a target data processing rule corresponding to the source data, wherein the target data processing rule is a rule for checking whether the source data meets data processing conditions, the target data processing rule is obtained by synchronizing the corresponding data processing rule in a rule base in advance, and the rule base is used for storing the data processing rules corresponding to different services and for updating, according to a received rule change request, the stored data processing rule corresponding to the rule change request;
and verifying the source data according to the target data processing rule, and performing data processing on the source data that passes the verification to obtain target data.
In a second aspect, the present disclosure provides a data processing apparatus comprising:
the source data acquisition unit is used for acquiring source data to be processed in real time;
a processing rule obtaining unit, configured to obtain a target data processing rule corresponding to the source data, where the target data processing rule is a rule for checking whether the source data meets a data processing condition, the target data processing rule is obtained by synchronizing corresponding data processing rules in a rule base in advance, and the rule base is configured to store data processing rules corresponding to different services, and update the stored data processing rule corresponding to the rule change request according to a received rule change request;
and the processing unit is used for verifying the source data according to the target data processing rule and performing data processing on the source data which passes the verification so as to obtain target data.
In a third aspect, the present disclosure provides an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores one or more computer programs executable by the at least one processor, one or more of the computer programs being executable by the at least one processor to enable the at least one processor to perform the data processing method described above.
In a fourth aspect, the present disclosure provides a computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the data processing method described above.
According to the embodiments provided by the disclosure, after the source data to be processed is obtained, the target data processing rule corresponding to the source data is obtained, the source data is verified based on the target data processing rule, and the source data that passes the verification is processed. Because the target data processing rule is obtained by synchronizing the corresponding data processing rule in the rule base in advance, and the rule base supports updating the data processing rule corresponding to a rule change request when such a request is received, the electronic device can verify and process the source data efficiently and in real time without changing the data processing code, thereby obtaining high-quality target data.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, without limitation to the disclosure. The above and other features and advantages will become more readily apparent to those skilled in the art by describing in detail exemplary embodiments with reference to the attached drawings, in which:
FIG. 1 is a flow chart of a data processing method according to an embodiment of the present disclosure;
FIG. 2 is a flow chart of obtaining source data provided by an embodiment of the present disclosure;
FIG. 3 is a flowchart of obtaining a target data processing rule provided by an embodiment of the present disclosure;
FIG. 4 is a flowchart of caching data processing rules provided by an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a processing framework of a data processing method according to an embodiment of the disclosure;
FIG. 6 is a block diagram of a data processing apparatus provided by an embodiment of the present disclosure;
FIG. 7 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
For a better understanding of the technical solutions of the present disclosure, exemplary embodiments of the present disclosure will be described below with reference to the accompanying drawings, in which various details of the embodiments of the present disclosure are included to facilitate understanding, and they should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Embodiments of the disclosure and features of embodiments may be combined with each other without conflict.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Terms such as "connected" are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
In the embodiments of the disclosure, data processing includes data governance, as well as adding, deleting, and modifying data, and the like. For example, in a typical data governance flow, a service application stores the service data it generates in real time in a service database. At a fixed time each day, a data extraction tool extracts the service data generated by the application on the previous day from the service database into the Hadoop Distributed File System (HDFS), and an external table mapped to that data is created with the data warehouse tool Hive, so that a data governance application can verify and govern the extracted service data according to data governance rules implemented in advance in code. The data extraction tool may be, for example, an open source tool such as Sqoop or DataX. This approach in effect governs the service data generated by the service application in a batch processing manner, so it usually suffers from serious data delay. In addition, because the data governance rules are hard-coded, it cannot support fast service iteration or respond immediately to service requirements.
To solve the above-mentioned problems, an embodiment of the present disclosure provides a data processing method, please refer to fig. 1, which is a flowchart of a data processing method provided by an embodiment of the present disclosure. The method can be applied to the electronic equipment, the electronic equipment can be a server, and the server can be a physical server or a virtual server; of course, with the continuous progress of the technology, the electronic device may also be a terminal device, that is, the method may also be applied to the terminal device alone, for example, may be applied to an edge terminal device in an edge computing scenario, which is not limited in particular herein.
As shown in fig. 1, the data processing method provided in the embodiment of the present disclosure includes the following steps S101 to S103, which are described in detail below.
Step S101, acquiring source data to be processed.
The source data refers to service data to be subjected to data processing, which is generated by a service application.
For example, the source data may be data generated and stored in real time by a service application, such as data in a MySQL database.
In the embodiment of the present disclosure, the service application for generating the service data may be at least one application in any application scenario, that is, the source data may be data in a plurality of data sources.
For example, the service application for generating the source data may be at least one of an image processing application, a voice processing application, a text processing application, a video processing application, and the like; correspondingly, the source data can be at least one of images, voice, text and video.
In the embodiments of the present disclosure, unless otherwise specified, the source data is described taking text data as an example, such as service data generated by a consumer finance application, for instance a lending application that provides services such as "borrowing" and "repayment" to users.
Step S102, a target data processing rule corresponding to the source data is obtained, wherein the target data processing rule is a rule for checking whether the source data meets data processing conditions, the target data processing rule is obtained by synchronizing corresponding data processing rules in a rule base in advance, the rule base is used for storing the data processing rules corresponding to different services, and the stored data processing rule corresponding to the rule change request is updated according to the received rule change request.
The target data processing rule is the data processing rule corresponding to the source data; it has been obtained from a rule base that supports updates and synchronized before the current moment.
In the embodiments of the present disclosure, the rule base may be a MySQL database, or may be another database, which is not limited herein.
Because the data processing rules are stored in the rule base, and the rule base updates the stored data processing rule corresponding to a rule change request when it receives that request, the data processing rules used to verify whether the source data meets the data processing conditions can be changed dynamically in the embodiments of the disclosure. And because the target data processing rule is obtained by synchronizing the corresponding data processing rule in the rule base in advance, the data processing method provided by the embodiments of the disclosure can respond to service scenario requirements in time, without changing the data processing code, when processing the source data.
For example, a front-end application interface for configuring data processing rules can be provided to the user. According to real-time service scenario requirements, the user configures the latest data processing rule in the front-end application interface, and a rule change request is sent to the rule base, thereby dynamically changing the corresponding data processing rule.
When the electronic device implementing the data processing method of the embodiments of the disclosure detects that a data processing rule in the rule base has changed, it can obtain the changed rule in advance and synchronize it into the cache, so that the cache holds the real-time target data processing rule corresponding to the source data at all times.
In general, during the operation of a service application, some dirty data, such as abnormal service data or invalid service data, may be generated due to an application operation reason or a network transmission reason, or some service data that does not meet the viewing requirement of a user may also be generated, so in order to reduce the data processing amount to increase the data processing speed, a data processing rule is generally set during the data processing, and the source data is first checked based on the data processing rule.
It should be noted that the data processing rule in the embodiments of the present disclosure may be a rule implemented based on either an SQL function or a regular expression.
For example, a data processing rule whose content is "timestamp <= currTime()" may be set to check that the generation timestamp of the source data is not greater than the current time.
For another example, a data processing rule whose content is "name.contact('name1', 'name2')" may be set to perform data processing only on source data whose name contains "name1" or "name2".
For another example, a data processing rule whose content is "(timestamp <= currTime()) & name.contact('name1', 'name2')" may be set directly to perform data processing only on source data whose name contains "name1" or "name2" and whose generation timestamp is less than or equal to the current time.
According to the above description, in the embodiment of the present disclosure, by setting the data processing rule that can be implemented based on the sql function or the regular expression, not only can user configuration be facilitated, but also the back end can be facilitated to directly verify the source data to be processed based on the data processing rule, so that the data processing speed can be greatly improved.
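The two example rules above can be illustrated with a minimal Python sketch. It is not the rule engine of the disclosure: the function names, the record layout (a dict with `name` and `timestamp` keys), and the choice of Python predicates in place of SQL functions are all assumptions made for illustration.

```python
import re
import time

# Hypothetical stand-ins for the example rules; a record is assumed to be a dict.

def timestamp_not_in_future(record, now=None):
    """Counterpart of the rule `timestamp <= currTime()`."""
    now = time.time() if now is None else now
    return record["timestamp"] <= now

def name_contains_any(record, needles=("name1", "name2")):
    """Counterpart of the name rule, expressed as a regular expression."""
    pattern = re.compile("|".join(re.escape(n) for n in needles))
    return pattern.search(record["name"]) is not None

def combined_rule(record, now=None):
    """Counterpart of the combined rule: both conditions must hold."""
    return timestamp_not_in_future(record, now) and name_contains_any(record)

record = {"name": "order_name1", "timestamp": 1_000.0}
passes = combined_rule(record, now=2_000.0)
```

Passing `now` explicitly keeps the check deterministic; a real deployment would presumably read the clock at verification time.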
Step S103, verifying the source data according to the target data processing rule, and performing data processing on the source data passing the verification to obtain target data.
After the target data processing rule corresponding to the source data is obtained in step S102, the source data may be verified based on the target data processing rule, and data processing, for example, corresponding conversion processing, is performed on the source data under the condition that the verification is passed, so as to obtain target data that is convenient for the user to view.
It should be noted that, in the embodiment of the present disclosure, the data processing on the source data that passes the verification may be performing conversion processing on the source data according to a preset rule, so as to convert the source data into target data that is convenient for a user to view.
For example, when the service application generates the source data, the data type and/or data format of the source data are chosen to reduce storage space and simplify service logic, which makes the data convenient for the electronic device to compute on but hard for a person to read. Therefore, when the source data is verified based on the target data processing rule and passes the verification, at least one conversion process, such as format conversion or data type conversion, can be performed on the verified source data to obtain target data that is easier to read. Of course, the above is only one embodiment of performing data processing on the source data that passes the verification, and is not particularly limited herein.
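As a hedged illustration of such a conversion, the sketch below turns a compact, machine-oriented record into a more readable one. The field names (`amount_cents`, `created_ms`, `status`) and the status labels are hypothetical; the disclosure does not specify any particular source format.

```python
from datetime import datetime, timezone

# Hypothetical mapping from stored status codes to readable labels.
STATUS_LABELS = {0: "pending", 1: "repaid"}

def to_target(source):
    """Convert a verified source record into readable target data."""
    return {
        # Data type conversion: integer cents to a decimal string.
        "amount": f"{source['amount_cents'] / 100:.2f}",
        # Format conversion: epoch milliseconds to an ISO 8601 UTC timestamp.
        "created_at": datetime.fromtimestamp(
            source["created_ms"] / 1000, tz=timezone.utc
        ).isoformat(),
        # Code-to-label conversion, with a fallback for unknown codes.
        "status": STATUS_LABELS.get(source["status"], "unknown"),
    }

target = to_target({"amount_cents": 123456, "created_ms": 0, "status": 1})
```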
It can be seen that, in the data processing method provided in the embodiments of the present disclosure, after the source data to be processed is obtained, the target data processing rule corresponding to the source data, itself obtained by synchronizing the corresponding data processing rule in the rule base in advance, is obtained; the source data is verified based on the target data processing rule; and data processing is performed on the source data that passes the verification. Because the rule base supports updating a data processing rule when a rule change request is received, the source data can be verified and processed in real time without changing the data processing code.
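Steps S101 to S103 can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: rules are modeled as Python callables held in an in-memory dict, the `service_id` key is hypothetical, and dropping records that have no cached rule is an assumption the disclosure does not state.

```python
from typing import Callable, Dict, Optional

Rule = Callable[[dict], bool]

def process(source: dict,
            rule_cache: Dict[str, Rule],
            convert: Callable[[dict], dict]) -> Optional[dict]:
    """Verify one source record against its cached rule, then convert it."""
    # S102: look up the pre-synchronized target rule for this record.
    rule = rule_cache.get(source["service_id"])
    # S103: verify; records without a rule or failing the rule are dropped
    # (assumption made for this sketch).
    if rule is None or not rule(source):
        return None
    # S103: data processing on the verified record.
    return convert(source)

rule_cache = {"loan-repayment": lambda r: r["amount"] > 0}
convert = lambda r: {"service": r["service_id"], "amount": f"{r['amount']:.2f}"}

result = process({"service_id": "loan-repayment", "amount": 12.5}, rule_cache, convert)
rejected = process({"service_id": "loan-repayment", "amount": -1}, rule_cache, convert)
```

Because the rule lives in `rule_cache` rather than in the body of `process`, replacing a cache entry changes the verification behavior without touching the processing code, which is the property the method relies on.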
Referring to fig. 2, a flowchart of acquiring source data is provided in an embodiment of the disclosure. That is, in the embodiment of the present disclosure, in the process of executing the above step S101, the acquiring the source data to be processed may specifically include the following steps S201 to S204.
Step S201, detecting data change information of a target database through a second detector, wherein the target database is used for storing data corresponding to different services, and the second detector obtains data change information through detecting change information of a second preset log file corresponding to the target database.
The second detector may be a tool, implemented based on the flink-cdc component, for detecting data change information in the target database.
The target database may be, for example, a MySQL database or another relational database, and is not particularly limited herein.
The flink-cdc component detects and captures change information of a database, for example, the insertion, update, and deletion of data or data tables, and records the change information in order of occurrence.
In this embodiment, detecting the data change information of the target database through the second detector may specifically be: detecting, through the second detector, change information of the second preset log file corresponding to the target database, so as to obtain the data change information.
The second preset log file corresponding to the target database is a log file that records the data change information of the target database. For example, where the target database is a MySQL database, the second preset log file may be the binlog file of the MySQL database.
Step S202, according to the data change information, obtaining changed data records in the target database.
Step S203, writing the changed data record into a second message queue.
The second message queue is used to cache the changed data records obtained based on the data change information, so that the electronic device can consume the changed data records in a streaming manner and process them one by one. The second message queue may be, for example, a Kafka message queue.
Step S204, the source data is obtained according to the changed data record in the second message queue.
That is, in the embodiment of the present disclosure, the second detector detects the data change information of the target database, and by synchronously buffering the changed data record obtained based on the data change information in advance to the second message queue and obtaining and processing the changed data record piece by piece in real time, the effect of processing the source data generated by the service application program in real time in a streaming manner can be achieved; meanwhile, the acquired changed data records are firstly stored in the second message queue and then are acquired and processed one by one, so that the embodiment can also reduce the data processing concurrency in the whole data processing system, and achieve the effects of increasing the system robustness and guaranteeing the high availability of the system.
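The flow of steps S202 to S204 can be sketched with the Python standard library. Here `queue.Queue` is only a stand-in for the Kafka message queue, and the change-event shape (`op`, `after`, `before` keys) is hypothetical rather than the actual flink-cdc or binlog format; skipping delete events is likewise an assumption of this sketch.

```python
import queue

def changed_records(change_events):
    """S202: extract changed rows from change events (hypothetical shape)."""
    for event in change_events:
        # Assumption: only inserted or updated rows become source data.
        if event["op"] in ("insert", "update"):
            yield event["after"]

def publish(records, message_queue):
    """S203: write each changed record into the message queue."""
    for rec in records:
        message_queue.put(rec)

def consume(message_queue):
    """S204: take records out of the queue one by one, in arrival order."""
    while True:
        try:
            yield message_queue.get_nowait()
        except queue.Empty:
            return

events = [
    {"op": "insert", "after": {"id": 1, "name": "name1"}},
    {"op": "delete", "before": {"id": 0}},
    {"op": "update", "after": {"id": 1, "name": "name2"}},
]
second_queue = queue.Queue()
publish(changed_records(events), second_queue)
source_data = list(consume(second_queue))
```

The queue decouples detection from processing: the detector side can write at its own pace while the consumer handles one record at a time.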
Referring to fig. 3, a flowchart of a process for obtaining target data processing rules is provided in an embodiment of the present disclosure. That is, in the embodiment of the present disclosure, in the process of executing the above step S102, the acquiring the target data processing rule corresponding to the source data may specifically include: step S301, obtaining a service identifier corresponding to source data; and step S302, acquiring the data processing rule corresponding to the service identifier from the cached multiple data processing rules as a target data processing rule.
The service identifier is a data identifier for identifying the service and/or service scenario corresponding to the source data. For example, the form "large classification-small classification" may be used to represent the service and the service scenario to which the source data corresponds.
That is, as can be seen from the above description of step S101, in the embodiments of the present disclosure the source data may be data corresponding to different service applications. Meanwhile, different service scenarios may exist within the same service application; for example, a consumer finance application may have several service scenarios at the same time, such as "borrowing", "repayment", "withdrawal", and "delay". Therefore, to improve the availability of data processing, so that the data processing system can process service data of different services and service scenarios at the same time, the acquired source data in the embodiments of the present disclosure may include the service identifier corresponding to the source data; correspondingly, a data processing rule may also include the service identifier of the service and service scenario to which it applies. Of course, in a specific implementation, a general-purpose data processing rule may also be set to verify the service data of different services and service scenarios at the same time, which is not limited in particular herein.
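Steps S301 and S302 amount to a keyed lookup in the cached rules. The sketch below assumes identifiers of the "large classification-small classification" form, stores rules as plain strings, and uses a hypothetical `"*"` key for the general-purpose rule mentioned above; none of these names or rule contents come from the disclosure.

```python
# Hypothetical cache contents: service identifier -> rule content.
cached_rules = {
    "loan-borrowing": "amount > 0",
    "loan-repayment": "timestamp <= currTime()",
    "*": "id IS NOT NULL",  # general-purpose rule (assumed key)
}

def target_rule_for(source, rules):
    """S301: read the service identifier; S302: pick the matching cached rule."""
    service_id = source["service_id"]
    # Fall back to the general-purpose rule when no specific rule is cached.
    return rules.get(service_id, rules.get("*"))

specific = target_rule_for({"service_id": "loan-repayment"}, cached_rules)
fallback = target_rule_for({"service_id": "loan-withdrawal"}, cached_rules)
```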
Referring to fig. 4, a flowchart of a cache data processing rule according to an embodiment of the present disclosure is shown. As shown in fig. 4, in the embodiment of the present disclosure, any data processing rule in the cache may be obtained through the following steps S401 to S404.
In step S401, rule change information of the rule base is detected by the first detector.
The first detector is implemented in a similar manner to the second detector, and may be, for example, a tool implemented based on a flink-cdc component for detecting rule change information in a rule base.
The rule change information is used for recording change information of rules or data tables in a rule base.
The detecting, by the second detector, rule change information of the rule base may be: and detecting the change information of a first preset log file corresponding to the rule base through a first detector to obtain the rule change information. In this embodiment, the first preset log file may be, for example, a binlog log file.
In step S402, the changed data processing rules in the rule base are obtained according to the rule change information.
In step S403, the changed data processing rules are written into a first message queue.
The first message queue is used for caching the changed data processing rules so that the electronic device can consume them in a streaming manner and process them one by one; the first message queue may be, for example, a Kafka message queue.
In step S404, the data in the cache is updated according to the changed data processing rules in the first message queue to obtain the data processing rules.
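Steps S401 to S404 can be sketched in Python as follows. This is an illustration under stated assumptions, not the disclosed implementation: the standard-library `queue.Queue` stands in for the first message queue, the change events are hand-written rather than read from a log file, and the event fields `op`, `rule_id`, and `rule` are hypothetical.

```python
import queue
from typing import Dict

# Stand-in for the first message queue (the text suggests Kafka; queue.Queue is
# used here only so the sketch runs without external services).
first_queue: "queue.Queue[dict]" = queue.Queue()

# In-memory cache of data processing rules, keyed by a hypothetical rule id.
rule_cache: Dict[str, dict] = {}


def publish_rule_changes(change_events: list) -> None:
    """S402-S403: write the changed rules derived from change events to the queue."""
    for event in change_events:
        first_queue.put(event)


def apply_pending_changes() -> int:
    """S404: consume changed rules one by one and update the cache."""
    applied = 0
    while True:
        try:
            event = first_queue.get_nowait()
        except queue.Empty:
            return applied
        if event["op"] == "delete":
            rule_cache.pop(event["rule_id"], None)
        else:  # "insert" and "update" both overwrite the cached copy
            rule_cache[event["rule_id"]] = event["rule"]
        applied += 1


# S401 is simulated: these events stand for what a log-based detector would report.
publish_rule_changes([
    {"op": "insert", "rule_id": "r1", "rule": {"field": "amount", "min": 0}},
    {"op": "update", "rule_id": "r1", "rule": {"field": "amount", "min": 1}},
    {"op": "insert", "rule_id": "r2", "rule": {"field": "user_id", "required": True}},
    {"op": "delete", "rule_id": "r2"},
])
applied_count = apply_pending_changes()
```

Because the events are applied in queue order, the cache ends up holding only the latest version of rule r1, and rule r2 is removed again.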
As can be seen from the foregoing description, in the embodiments of the present disclosure, during data processing the first detector detects and acquires the changed data processing rules in the rule base, and the changed data processing rules are acquired from the first message queue and processed so that the rules held in the cache are updated synchronously. In this way, the data processing application supports dynamic changes to the data processing rules while it is running and synchronizes the updates to the cache for use, so that the electronic device implementing the method can iterate services quickly and respond immediately to service requirements.
The above describes separately how the rule base can be monitored in real time by the first detector so that the cache always holds the latest data processing rules, and how the source data can be obtained in time by the second detector. It should be noted that, in a specific implementation, the above embodiments may also be combined to further improve the availability of the method, which is not limited herein.
In addition, it should be noted that, in the embodiment of the present disclosure, the source data to be processed may include at least one data item, and the target data processing rule corresponding to the source data may also include at least one data processing rule; each data processing rule corresponds to at least one data item. In this embodiment, verifying the source data according to the target data processing rule may be: verifying the data in each data item against the corresponding data processing rule; in the event that the data in every data item passes verification, the source data is determined to pass verification.
That is, one piece of source data may correspond to a plurality of data processing rules, and each data processing rule may correspond to at least one data item in the source data. For example, the source data may include 5 data items, the data processing rule 1 may be used to verify the content of data item 1 of one source data, the data processing rule 2 may be used to verify the content of data item 2 of the source data, and if the content of data item 1 satisfies data processing rule 1 and the content of data item 2 satisfies data processing rule 2, then the source data is determined to pass the verification; otherwise, determining that the source data is not verified.
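The item-by-item check in the example above can be sketched as follows. The item names and the two checks are illustrative assumptions; the disclosure does not state what rule 1 and rule 2 actually test.

```python
from typing import Callable, Dict

# Each rule is bound to one data item (field) and checks that item's value.
# Item names and checks are illustrative assumptions.
ItemRule = Callable[[object], bool]

target_rules: Dict[str, ItemRule] = {
    "item1": lambda v: isinstance(v, str) and len(v) > 0,       # rule 1: non-empty string
    "item2": lambda v: isinstance(v, (int, float)) and v > 0,   # rule 2: positive number
}


def verify(source: dict, rules: Dict[str, ItemRule]) -> bool:
    """Source data passes only if every rule-bound item is present and passes."""
    return all(item in source and rule(source[item]) for item, rule in rules.items())


# Five data items, of which only item1 and item2 have rules attached.
good = {"item1": "abc", "item2": 10, "item3": None, "item4": [], "item5": {}}
bad = {"item1": "abc", "item2": -5, "item3": None, "item4": [], "item5": {}}
```

Here `verify(good, target_rules)` holds because both bound items satisfy their rules, while `bad` fails on item2 alone, which is enough to reject the whole record.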
In some embodiments, in the case that the source data fails verification, the source data may also be stored in the HDFS, and the user may be notified by message push to confirm it and handle it accordingly, which is not limited herein.
In some embodiments, in the case that the source data passes verification against the target data processing rule, in order to reduce the concurrency pressure of data processing, the verified source data may first be stored in the third message queue, so that the subsequent processing module can process the source data according to its processing capability.
That is, in this embodiment, the data processing is performed on the source data that passes the verification, so as to obtain the target data, which may specifically be: acquiring source data passing verification from the third message queue; and performing corresponding data conversion processing on the source data passing the verification to obtain target data.
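A minimal sketch of this buffering step, assuming an in-process `queue.Queue` in place of the third message queue and a made-up conversion (cents to a decimal string); neither the record layout nor the conversion comes from the disclosure.

```python
import queue

third_queue = queue.Queue(maxsize=100)  # bounded buffer; the size is arbitrary


def convert(source: dict) -> dict:
    """Hypothetical conversion: express an amount in cents as a decimal string."""
    return {"user_id": source["user_id"], "amount": f"{source['amount_cents'] / 100:.2f}"}


def drain(batch_size: int) -> list:
    """Take at most batch_size verified records from the queue and convert them."""
    targets = []
    for _ in range(batch_size):
        try:
            targets.append(convert(third_queue.get_nowait()))
        except queue.Empty:
            break
    return targets


# Verified source data is buffered first ...
for cents in (1250, 99, 100000):
    third_queue.put({"user_id": "u1", "amount_cents": cents})

# ... and the consumer decides how much it handles per pass.
first_batch = drain(batch_size=2)
second_batch = drain(batch_size=2)
```

The producer side never waits for conversion to finish, and the consumer pulls only as many records as it can handle, which is the decoupling the paragraph describes.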
In addition, after the target data is obtained through the above method, to make it easier for the user to view, the method further includes: displaying the target data.
In this embodiment, in the case where the method is applied to a server, the server may provide the target data to a terminal device used by a user, and then the terminal device displays the target data; alternatively, in the case where the method is applied to a terminal device, such as an edge computing device, the target data may be presented by the terminal device directly for viewing by the user after the target data is obtained.
Of course, whether the execution body of the method is a server or a terminal device, after the target data is obtained, the execution body may not perform presentation, but the execution body may buffer the target data to the fourth message queue for consumption by a downstream application, which is not limited in particular herein.
It should be noted that, the first message queue, the second message queue, and the third message queue may be different message queues or the same message queue, where in the case that the three message queues are the same message queue, when writing the changed data processing rule, the changed data record, and the source data passing the verification into the message queue, the corresponding message types may be set, so that the data processing application consumes different types of data according to the message types.
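When a single queue carries all three kinds of messages, consumption by message type might look like the following sketch. The type tags and the message layout (`type`, `payload`) are hypothetical; the text only says that a message type may be set.

```python
import queue
from collections import defaultdict

shared_queue = queue.Queue()

# Hypothetical message type tags.
RULE_CHANGE = "rule_change"
DATA_CHANGE = "data_change"
VERIFIED = "verified_source"

handled = defaultdict(list)

# One handler per message type; each simply records the payload in this sketch.
handlers = {
    RULE_CHANGE: lambda payload: handled["rules"].append(payload),
    DATA_CHANGE: lambda payload: handled["records"].append(payload),
    VERIFIED: lambda payload: handled["to_convert"].append(payload),
}


def consume_all() -> None:
    """Dispatch every queued message to the handler registered for its type."""
    while not shared_queue.empty():
        message = shared_queue.get()
        handlers[message["type"]](message["payload"])


shared_queue.put({"type": DATA_CHANGE, "payload": {"id": 1}})
shared_queue.put({"type": RULE_CHANGE, "payload": {"rule_id": "r1"}})
shared_queue.put({"type": VERIFIED, "payload": {"id": 1}})
consume_all()
```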
Please refer to fig. 5, which is a schematic diagram of a processing framework of a data processing method according to an embodiment of the disclosure. The following describes a data processing method provided by an embodiment of the present disclosure with reference to fig. 5.
As shown in fig. 5, in the embodiment of the present disclosure, a target database for storing business data may be set as a data source of source data, and a rule base may be set as a data source of data processing rules at the same time.
While the data processing application is running, the rule base can receive data processing rules configured by the user through the front-end application interface and update the corresponding stored data processing rules accordingly. The first detector, implemented based on the flink-cdc component, acquires the changed data processing rules by detecting rule change information in the rule base and writes them into the first message queue, so that the data processing application can synchronously update the data in the cache by acquiring the changed data processing rules from the first message queue and thus always keeps the latest data processing rules in the cache.
Meanwhile, while the system is running, a second detector implemented based on the flink-cdc component can obtain changed data records by detecting data change information in the target database and write the changed data records into a second message queue, so that the system can acquire the source data to be processed by acquiring the changed data records from the second message queue.
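How a changed data record might be turned into source data can be sketched as below. The record layout is modelled loosely on a Debezium-style change event envelope (`op`, `before`, `after`, `source`); the exact layout depends on how the detector is configured and is an assumption here, as is the hypothetical `TABLE_TO_SERVICE` mapping used to derive a service identifier.

```python
import json
from typing import Optional

# Hypothetical mapping from table name to service identifier.
TABLE_TO_SERVICE = {"loan_order": "loan_app:borrowing", "repay_order": "loan_app:repayment"}


def to_source_data(raw: str) -> Optional[dict]:
    """Build source data from one change record; deletes yield nothing to verify."""
    event = json.loads(raw)
    if event.get("op") == "d" or event.get("after") is None:
        return None
    table = event.get("source", {}).get("table", "")
    source = dict(event["after"])  # the row state after the change
    source["service_id"] = TABLE_TO_SERVICE.get(table, "unknown")
    return source


change_record = json.dumps({
    "op": "u",
    "before": {"id": 7, "amount": 100},
    "after": {"id": 7, "amount": 150},
    "source": {"db": "biz", "table": "loan_order"},
})
source_data = to_source_data(change_record)
```

Skipping delete events is a design choice of this sketch; an implementation could equally forward them for separate handling.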
For the acquired source data to be processed, acquiring a target data processing rule from a plurality of cached data processing rules according to the service identifier of the source data, and checking the source data based on the target data processing rule; if the verification is passed, writing the source data passing the verification into a third message queue for data processing to obtain target data; and for the source data which fails to pass the verification, the source data can be stored in the HDFS for the user to confirm.
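The routing described above (verified records to the third message queue, failed records to a store for user confirmation) can be sketched as follows. A plain list stands in for HDFS, and the rule table and field names are illustrative assumptions.

```python
import queue

third_queue = queue.Queue()
rejected = []  # stand-in for a failure store such as HDFS

# Cached rules per service identifier: field name -> check (illustrative only).
cached_rules = {
    "loan_app:borrowing": {"amount": lambda v: isinstance(v, (int, float)) and v > 0},
}


def route(source: dict) -> bool:
    """Verify one record against its service's rules and route it by the outcome."""
    rules = cached_rules.get(source.get("service_id"), {})
    failed = [f for f, check in rules.items() if f not in source or not check(source[f])]
    if failed:
        # Keep the failing items alongside the record so a user can review them.
        rejected.append({"source": source, "failed_items": failed})
        return False
    third_queue.put(source)
    return True


results = [
    route({"service_id": "loan_app:borrowing", "amount": 300}),
    route({"service_id": "loan_app:borrowing", "amount": -1}),
]
```

Note that in this sketch a record whose service identifier has no cached rules passes by default; whether such records should pass, fail, or fall back to a general rule is a policy decision the text leaves open.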
It should be noted that, in the embodiment of the present disclosure, the data processing method may be implemented by developing a stream computing Flink job application based on an Apache Flink distributed processing framework, or may also be implemented based on other stream job frameworks, which is not limited in particular herein.
It will be appreciated that the above-mentioned method embodiments of the present disclosure may be combined with each other to form combined embodiments without departing from the underlying principles and logic; for reasons of space, the details are not repeated in the present disclosure. It will be appreciated by those skilled in the art that, in the above-described methods of the embodiments, the particular order of execution of the steps should be determined by their function and possible inherent logic.
In addition, the present disclosure further provides a data processing apparatus, an electronic device, and a computer readable storage medium, each of which may be used to implement any one of the data processing methods provided in the present disclosure; for the corresponding technical solutions and descriptions, refer to the corresponding descriptions in the method section, which are not repeated here.
Fig. 6 is a block diagram of a data processing apparatus according to an embodiment of the present disclosure.
Referring to fig. 6, an embodiment of the present disclosure provides a data processing apparatus including: a source data acquisition unit 601, a processing rule acquisition unit 602, and a processing unit 603.
The source data acquisition unit 601 is configured to acquire source data to be processed.
The processing rule obtaining unit 602 is configured to obtain a target data processing rule corresponding to source data, where the target data processing rule is a rule for verifying whether the source data meets a data processing condition, the target data processing rule is obtained by synchronizing corresponding data processing rules in a rule base in advance, the rule base is configured to store data processing rules corresponding to different services, and update the stored data processing rule corresponding to the rule change request according to a received rule change request.
The processing unit 603 is configured to verify the source data according to the target data processing rule, and perform data processing on the source data that passes the verification, so as to obtain target data.
In some embodiments, the source data obtaining unit 601 may be configured to, when obtaining source data to be processed: detecting data change information of a target database through a second detector, wherein the target database is used for storing data corresponding to different services, and the second detector obtains the data change information through detecting the change information of a second preset log file corresponding to the target database; obtaining changed data records in the target database according to the data change information; writing the changed data record into a second message queue; and obtaining the source data by obtaining the changed data record in the second message queue.
In some embodiments, the processing rule acquiring unit 602, when acquiring the target data processing rule corresponding to the source data, may be configured to: acquiring a service identifier corresponding to source data; and acquiring the data processing rule corresponding to the service identifier from the cached multiple data processing rules as a target data processing rule.
In some embodiments, the apparatus 600 further comprises a processing rule synchronization unit for: detecting rule change information of a rule base by a first detector; obtaining changed data processing rules in the rule base according to the rule change information; writing the changed data processing rule into a first message queue; and acquiring the changed data processing rule in the first message queue so as to acquire the data processing rule by updating the data in the cache.
In some embodiments, the processing rule synchronization unit, when detecting rule change information of the rule base by the first detector, may be configured to: detect, by the first detector, change information of a first preset log file corresponding to the rule base so as to obtain the rule change information.
In some embodiments, the source data includes at least one data item and the target data processing rule includes at least one data processing rule; each data processing rule corresponds to at least one data item; the processing unit 603, when verifying the source data according to the target data processing rule, may be configured to: verify the data in each data item against the corresponding data processing rule; in the case that the data in every data item passes verification, determine that the source data passes verification.
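The division of the apparatus into units 601 to 603 can be pictured as three small cooperating classes. This is only a structural sketch: the class names mirror the unit names, while the method names (`acquire`, `rules_for`, `process`), the `run` helper, and the sample rule and transform are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional

Check = Callable[[object], bool]


@dataclass
class SourceDataAcquisitionUnit:  # corresponds to unit 601
    records: Iterable[dict]

    def acquire(self) -> Iterable[dict]:
        return iter(self.records)


@dataclass
class ProcessingRuleAcquisitionUnit:  # corresponds to unit 602
    cache: Dict[str, Dict[str, Check]] = field(default_factory=dict)

    def rules_for(self, source: dict) -> Dict[str, Check]:
        return self.cache.get(source.get("service_id", ""), {})


@dataclass
class ProcessingUnit:  # corresponds to unit 603
    transform: Callable[[dict], dict]

    def process(self, source: dict, rules: Dict[str, Check]) -> Optional[dict]:
        """Verify first; only verified source data is transformed into target data."""
        ok = all(k in source and check(source[k]) for k, check in rules.items())
        return self.transform(source) if ok else None


def run(acq, rule_unit, proc) -> List[dict]:
    out = []
    for source in acq.acquire():
        target = proc.process(source, rule_unit.rules_for(source))
        if target is not None:
            out.append(target)
    return out


acq = SourceDataAcquisitionUnit(records=[
    {"service_id": "s1", "amount": 5},
    {"service_id": "s1", "amount": -5},
])
rule_unit = ProcessingRuleAcquisitionUnit(cache={"s1": {"amount": lambda v: v > 0}})
proc = ProcessingUnit(transform=lambda s: {"amount_doubled": s["amount"] * 2})
targets = run(acq, rule_unit, proc)
```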
Fig. 7 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Referring to fig. 7, an embodiment of the present disclosure provides an electronic device including: at least one processor 701; at least one memory 702, and one or more I/O interfaces 703 connected between the processor 701 and the memory 702; wherein the memory 702 stores one or more computer programs executable by the at least one processor 701, the one or more computer programs being executable by the at least one processor 701 to enable the at least one processor 701 to perform the data processing method described above.
The disclosed embodiments also provide a computer readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the data processing method described above. The computer readable storage medium may be a volatile or nonvolatile computer readable storage medium.
Embodiments of the present disclosure also provide a computer program product comprising computer readable code, or a non-transitory computer readable storage medium carrying computer readable code, which when run in a processor of an electronic device, performs the above-described data processing method.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer-readable storage media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable program instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, Random Access Memory (RAM), Read Only Memory (ROM), Erasable Programmable Read Only Memory (EPROM), Static Random Access Memory (SRAM), flash memory or other memory technology, portable compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable program instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and may include any information delivery media.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
Computer program instructions for performing the operations of the present disclosure can be assembly instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present disclosure are implemented by personalizing electronic circuitry, such as programmable logic circuitry, Field Programmable Gate Arrays (FPGAs), or Programmable Logic Arrays (PLAs), with state information of computer readable program instructions, which can execute the computer readable program instructions.
The computer program product described herein may be embodied in hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied as a computer storage medium, and in another alternative embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), or the like.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purpose of limitation. In some instances, it will be apparent to one skilled in the art that features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with other embodiments unless explicitly stated otherwise. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the disclosure as set forth in the appended claims.

Claims (10)

1. A method of data processing, comprising:
acquiring source data to be processed;
acquiring a target data processing rule corresponding to the source data, wherein the target data processing rule is a rule for checking whether the source data meets data processing conditions, the target data processing rule is obtained by synchronizing corresponding data processing rules in a rule base in advance, the rule base is used for storing the data processing rules corresponding to different services, and updating the stored data processing rule corresponding to the rule changing request according to a received rule changing request;
and verifying the source data according to the target data processing rule, and performing data processing on the source data passing the verification to obtain target data.
2. The method of claim 1, wherein the obtaining the target data processing rule corresponding to the source data comprises:
acquiring a service identifier corresponding to the source data;
and acquiring the data processing rule corresponding to the service identifier from the cached multiple data processing rules as the target data processing rule.
3. The method according to claim 2, wherein any data processing rule in the cache is obtained by:
detecting rule change information of the rule base through a first detector;
obtaining changed data processing rules in the rule base according to the rule change information;
writing the changed data processing rule into a first message queue;
and updating the data in the cache according to the changed data processing rule in the first message queue to obtain the data processing rule.
4. A method according to claim 3, wherein detecting rule change information of the rule base by a first detector comprises:
detecting the change information of a first preset log file corresponding to the rule base through the first detector to obtain the rule change information.
5. The method of claim 1, wherein the obtaining source data to be processed comprises:
detecting data change information of a target database through a second detector, wherein the target database is used for storing data corresponding to different services, and the second detector obtains the data change information by detecting change information of a second preset log file corresponding to the target database;
obtaining a changed data record in the target database according to the data change information;
writing the changed data record into a second message queue;
and obtaining the source data according to the changed data record in the second message queue.
6. The method of any of claims 1-5, wherein the source data includes at least one data item and the target data processing rule includes at least one data processing rule; each data processing rule corresponds to at least one data item;
the verifying the source data according to the target data processing rule includes:
verifying the data in each data item against the corresponding data processing rule;
in the event that the data in each data item is verified, the source data is determined to be verified.
7. The method of claim 6, wherein after the source data verification passes, the method further comprises:
storing the source data which passes the verification into a third message queue;
the performing data processing on the source data passing the verification to obtain target data comprises:
acquiring the source data passing the verification from the third message queue;
and carrying out corresponding data conversion processing on the source data passing the verification so as to obtain the target data.
8. A data processing apparatus, comprising:
a source data acquisition unit, configured to acquire source data to be processed;
a processing rule obtaining unit, configured to obtain a target data processing rule corresponding to the source data, where the target data processing rule is a rule for checking whether the source data meets a data processing condition, the target data processing rule is obtained by synchronizing corresponding data processing rules in a rule base in advance, and the rule base is configured to store data processing rules corresponding to different services, and update the stored data processing rule corresponding to the rule change request according to a received rule change request;
and a processing unit, configured to verify the source data according to the target data processing rule and perform data processing on the source data that passes the verification so as to obtain target data.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores one or more computer programs executable by the at least one processor to enable the at least one processor to perform the data processing method of any one of claims 1-7.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the data processing method according to any of claims 1-7.
CN202210843047.9A 2022-07-18 2022-07-18 Data processing method and device, electronic equipment and computer readable storage medium Pending CN116166673A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210843047.9A CN116166673A (en) 2022-07-18 2022-07-18 Data processing method and device, electronic equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN116166673A true CN116166673A (en) 2023-05-26

Family

ID=86420682

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210843047.9A Pending CN116166673A (en) 2022-07-18 2022-07-18 Data processing method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN116166673A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117746999A (en) * 2024-02-20 2024-03-22 之江实验室 Data processing method and device, storage medium and electronic equipment
CN117746999B (en) * 2024-02-20 2024-05-03 之江实验室 Data processing method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination