CN114860823A - Batch data processing method and device - Google Patents

Info

Publication number
CN114860823A
Authority
CN
China
Prior art keywords
data
data set
condition
layer logic
subset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210345850.XA
Other languages
Chinese (zh)
Inventor
陈戈
谢炜琪
帅红波
黄显超
柯星宇
梁展瑞
郑凯帆
李俊华
周赞
彭建业
陈志鹏
邓亚丽
刘铁成
吴华东
陈芬
黄国军
何波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Application filed by Bank of China Ltd filed Critical Bank of China Ltd
Priority to CN202210345850.XA priority Critical patent/CN114860823A/en
Publication of CN114860823A publication Critical patent/CN114860823A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02 Banking, e.g. interest calculation or account maintenance
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Economics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Development Economics (AREA)
  • Evolutionary Computation (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a batch data processing method and device, which can be applied in the field of computer technology. The batch data processing device obtains a fourth data set according to a first data set, a system layer logic condition, an application layer logic condition and a data layer logic condition, wherein the data in the fourth data set meets the system layer logic condition, the application layer logic condition and the data layer logic condition; the device then performs state conversion on the data in the fourth data set and marks that data according to the conversion result. By presetting reasonable three-layer logic conditions for various service scenarios, the device can automatically process the data to be processed in batches according to those conditions, switch the state of that data, and mark it according to the switching result, which improves the efficiency and accuracy of data processing.

Description

Batch data processing method and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for processing batch data.
Background
In many scenarios, the volume of service and user data is huge and complex, so the data processing workload is often staggering. For example, a banking system has a large business volume and many users, the amount of data to be processed is huge, and each piece of data has many possible states (such as card opened, normal, dormant, cleared, closed, etc.). At present, data is usually processed one piece at a time, but for scenarios with large amounts of data and complex data states, this single-data mode of operation clearly cannot meet the requirement.
Based on this, it is desirable to provide a method capable of automatically processing data in batches, so as to meet the data processing needs of scenarios in which the data to be processed and its states are complicated and varied.
Disclosure of Invention
The embodiment of the application provides a batch data processing method and device, which can automatically process data to be processed in batches, improve the efficiency and accuracy of data processing, and improve user experience.
In a first aspect, an embodiment of the present application provides a batch data processing method, including:
obtaining a second data set according to a first data set and a system layer logic condition, wherein data in the second data set meets the system layer logic condition, and the second data set is a subset of the first data set;
obtaining a third data set according to the second data set and an application layer logic condition, wherein data in the third data set meets the application layer logic condition, and the third data set is a subset of the second data set;
obtaining a fourth data set according to the third data set and a data layer logic condition, wherein data in the fourth data set meets the data layer logic condition, and the fourth data set is a subset of the third data set;
and performing state conversion on the data in the fourth data set, and marking the data in the fourth data set according to a conversion result.
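The four steps above amount to a cascade of filters followed by a state update. The following Python sketch illustrates that reading under stated assumptions: the record fields (`region_enabled`, `product`, `balance`), the three example predicates, and the target state `"closed"` are all hypothetical and are not specified by the application.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Record:
    # All fields are illustrative assumptions, not taken from the application.
    account_id: str
    state: str
    region_enabled: bool
    product: str
    balance: float
    mark: str = ""


Condition = Callable[[Record], bool]


def select(records: Iterable[Record], condition: Condition) -> List[Record]:
    """Return the subset of records that satisfies the condition."""
    return [r for r in records if condition(r)]


def process_batch(first_set: List[Record], system_cond: Condition,
                  app_cond: Condition, data_cond: Condition,
                  target_state: str) -> List[Record]:
    second_set = select(first_set, system_cond)   # subset of first_set
    third_set = select(second_set, app_cond)      # subset of second_set
    fourth_set = select(third_set, data_cond)     # subset of third_set
    for record in fourth_set:
        record.state = target_state               # state conversion
        record.mark = target_state                # mark according to the result
    return fourth_set


first_set = [
    Record("A1", "normal", True, "savings", 100.0),
    Record("A2", "normal", False, "savings", 100.0),  # fails system layer
    Record("A3", "normal", True, "loan", 100.0),      # fails application layer
    Record("A4", "normal", True, "savings", 900.0),   # fails data layer
]
fourth_set = process_batch(
    first_set,
    system_cond=lambda r: r.region_enabled,
    app_cond=lambda r: r.product == "savings",
    data_cond=lambda r: r.balance < 500,
    target_state="closed",
)
print([r.account_id for r in fourth_set])  # ['A1']
```

Because each layer only sees the output of the previous one, the subset relations stated in the steps hold by construction.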
Optionally, the method further comprises:
acquiring the first data set meeting a state transition condition from the total data set, wherein the state transition condition is used for indicating data needing state transition, and the first data set is a subset of the total data set.
Optionally, the method further comprises:
and acquiring a fifth data set meeting the quasi-state transition condition from the total data set, wherein the quasi-state transition condition is used for indicating data needing state transition as long as a target index is met, and the fifth data set is a subset of the total data set.
Optionally, the method further comprises:
acquiring a sixth data set meeting a quasi-preprocessing condition from the fifth data set, wherein the quasi-preprocessing condition is used for indicating data needing to be subjected to quasi-preprocessing, the quasi-preprocessing condition comprises the target index, and the sixth data set is a subset of the fifth data set;
adding an identifier to the data in the sixth data set, wherein the identifier is used for indicating that the data in the sixth data set is to-be-preprocessed data;
and generating a quasi-preprocessing table, wherein the quasi-preprocessing table comprises the data in the sixth data set and the identification of each data.
Optionally, the method further comprises:
obtaining a seventh data set according to the first data set and the system layer logic condition, wherein data in the seventh data set does not satisfy the system layer logic condition, and the seventh data set is a subset of the first data set;
obtaining an eighth data set according to the second data set and the application layer logic condition, wherein data in the eighth data set does not meet the application layer logic condition, and the eighth data set is a subset of the second data set;
obtaining a ninth data set according to the third data set and the data layer logic condition, wherein data in the ninth data set does not satisfy the data layer logic condition, and the ninth data set is a subset of the third data set;
and generating a prompt information table, wherein the prompt information table comprises the data in the seventh data set, the data in the eighth data set, the data in the ninth data set and relevant information of each data, and the relevant information indicates the reason why the data do not meet the corresponding conditions.
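One way to picture the seventh, eighth and ninth data sets is as the rejects of each filtering layer, collected together with a reason. The sketch below is a minimal illustration, assuming dictionary-shaped rows and hypothetical conditions and reason strings; the application does not define the table layout.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
# (layer name, condition, reason recorded when the condition fails)
Layer = Tuple[str, Callable[[Row], bool], str]


def filter_with_prompts(first_set: List[Row], layers: List[Layer]):
    """Apply the layers in order; record every rejected row with its reason."""
    passed = list(first_set)
    prompt_table = []
    for name, condition, reason in layers:
        kept = []
        for row in passed:
            if condition(row):
                kept.append(row)
            else:
                prompt_table.append({"data": row, "layer": name, "reason": reason})
        passed = kept
    return passed, prompt_table


# Hypothetical rows and conditions, for illustration only.
rows = [
    {"id": "A1", "region_enabled": True, "product": "savings", "balance": 100},
    {"id": "A2", "region_enabled": False, "product": "savings", "balance": 100},
    {"id": "A3", "region_enabled": True, "product": "loan", "balance": 100},
    {"id": "A4", "region_enabled": True, "product": "savings", "balance": 900},
]
layers = [
    ("system", lambda r: r["region_enabled"], "region not enabled"),
    ("application", lambda r: r["product"] == "savings", "product not covered"),
    ("data", lambda r: r["balance"] < 500, "balance not below 500"),
]
fourth_set, prompt_table = filter_with_prompts(rows, layers)
for entry in prompt_table:
    print(entry["data"]["id"], entry["layer"], entry["reason"])
```

Rows rejected at the system layer correspond to the seventh data set, those rejected at the application layer to the eighth, and those rejected at the data layer to the ninth.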
Optionally, after performing state transition on the data in the fourth data set and marking the data in the fourth data set according to the transition result, the method further includes:
determining that data in a tenth data set does not meet any one of the system layer logic condition, the application layer logic condition or the data layer logic condition, performing state conversion on the data in the tenth data set, and marking the data in the tenth data set according to a conversion result, where the tenth data set is a subset of the fourth data set.
In a second aspect, an embodiment of the present application further provides an apparatus for batch data processing, including:
a first obtaining unit, configured to obtain a second data set according to a first data set and a system layer logic condition, where data in the second data set satisfies the system layer logic condition, and the second data set is a subset of the first data set;
a second obtaining unit, configured to obtain a third data set according to the second data set and an application layer logic condition, where data in the third data set satisfies the application layer logic condition, and the third data set is a subset of the second data set;
a third obtaining unit, configured to obtain a fourth data set according to the third data set and a data layer logic condition, where data in the fourth data set satisfies the data layer logic condition, and the fourth data set is a subset of the third data set;
and the first conversion unit is used for carrying out state conversion on the data in the fourth data set and marking the data in the fourth data set according to the conversion result.
Optionally, the apparatus further comprises:
a fourth obtaining unit, configured to obtain, from a total data set, the first data set that satisfies a state transition condition, where the state transition condition is used to indicate data that needs to be state-transitioned, and the first data set is a subset of the total data set.
Optionally, the apparatus further comprises:
a fifth obtaining unit, configured to obtain, from the total data set, a fifth data set that satisfies the quasi-state transition condition, where the quasi-state transition condition is used to indicate data that needs to be state-transitioned as long as a target index is satisfied, and the fifth data set is a subset of the total data set.
Optionally, the apparatus further comprises:
a sixth obtaining unit, configured to obtain a sixth data set that meets a quasi-preprocessing condition from the fifth data set, where the quasi-preprocessing condition is used to indicate data that needs to be subjected to quasi-preprocessing, the quasi-preprocessing condition includes the target index, and the sixth data set is a subset of the fifth data set;
an adding unit, configured to add an identifier to the data in the sixth data set, where the identifier is used to indicate that the data in the sixth data set is to-be-preprocessed data;
and the first generating unit is used for generating a quasi-preprocessing table, and the quasi-preprocessing table comprises the data in the sixth data set and the identification of each data.
Optionally, the apparatus further comprises:
a seventh obtaining unit, configured to obtain a seventh data set according to the first data set and the system layer logic condition, where data in the seventh data set does not satisfy the system layer logic condition, and the seventh data set is a subset of the first data set;
an eighth obtaining unit, configured to obtain an eighth data set according to the second data set and the application layer logic condition, where data in the eighth data set does not satisfy the application layer logic condition, and the eighth data set is a subset of the second data set;
a ninth obtaining unit, configured to obtain a ninth data set according to the third data set and the data layer logic condition, where data in the ninth data set does not satisfy the data layer logic condition, and the ninth data set is a subset of the third data set;
a second generating unit, configured to generate a prompt information table, where the prompt information table includes the data in the seventh data set, the data in the eighth data set, the data in the ninth data set, and related information of each data, and the related information indicates a reason why the data does not satisfy a corresponding condition.
Optionally, the apparatus further comprises:
and a second conversion unit, configured to, after performing state conversion on data in the fourth data set and marking the data in the fourth data set according to a conversion result, determine that data in a tenth data set does not satisfy any one of the system layer logic condition, the application layer logic condition, or the data layer logic condition, perform state conversion on the data in the tenth data set and mark the data in the tenth data set according to the conversion result, where the tenth data set is a subset of the fourth data set.
In a third aspect, an embodiment of the present application further provides an electronic device, where the electronic device includes a processor and a memory:
the memory is used for storing a computer program;
the processor is configured to perform the method provided by the first aspect above according to the computer program.
In a fourth aspect, this application embodiment further provides a computer-readable storage medium, where the computer-readable storage medium is used to store a computer program, where the computer program is used to execute the method provided in the first aspect.
Therefore, the embodiment of the application has the following beneficial effects:
The embodiment of the application provides a batch data processing method, which may include, for example: the batch data processing device obtains a second data set according to a first data set and a system layer logic condition, wherein data in the second data set meets the system layer logic condition, and the second data set is a subset of the first data set; obtaining a third data set according to the second data set and an application layer logic condition, wherein data in the third data set meets the application layer logic condition, and the third data set is a subset of the second data set; obtaining a fourth data set according to the third data set and a data layer logic condition, wherein data in the fourth data set meets the data layer logic condition, and the fourth data set is a subset of the third data set; and performing state conversion on the data in the fourth data set, and marking the data in the fourth data set according to a conversion result. Therefore, by presetting reasonable three-layer logic conditions for various service scenarios, the batch data processing device can automatically process the data to be processed in batches according to those conditions, switch the state of that data, and mark it according to the switching result. This meets the need for efficient processing in scenarios where the data to be processed and its states are complicated and varied, improves the efficiency and accuracy of data processing, and improves user experience.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them.
FIG. 1 is a schematic flowchart of a batch data processing method according to an embodiment of the present disclosure;
FIG. 2 is a flowchart illustrating an example of a batch data processing method according to an embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of an apparatus 300 for batch data processing according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of an electronic device 400 according to an embodiment of the present disclosure.
Detailed Description
It should be noted that the method and apparatus for processing batch data provided by the present invention can be used in the field of big data, the field of computer technology, the financial field, or any other field, for example in the batch processing of data in a bank system. The foregoing is merely an example, and does not limit the application field of the batch data processing method and apparatus provided by the present invention.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, embodiments are described in detail below with reference to the drawings. It is to be understood that the specific embodiments described herein merely illustrate the present application and do not limit it. It should be noted that, for convenience of description, the drawings show only the parts related to the present application, not all structures.
For a large-scale system such as a bank system, the states of internal data are extremely complex. To meet business and regional regulatory requirements, the system needs to monitor the state of the data dynamically according to actual conditions, service types and the like, and switch that state when necessary.
At present, data is generally processed one piece at a time, but when the data volume in the system surges, especially in a large-scale system, this single-data processing mode cannot meet the requirement.
Based on this, the present application provides a method capable of automatically processing batch data, and the method may include: the batch data processing device obtains a second data set according to a first data set and a system layer logic condition, wherein data in the second data set meets the system layer logic condition, and the second data set is a subset of the first data set; obtaining a third data set according to the second data set and an application layer logic condition, wherein data in the third data set meets the application layer logic condition, and the third data set is a subset of the second data set; obtaining a fourth data set according to the third data set and a data layer logic condition, wherein data in the fourth data set meets the data layer logic condition, and the fourth data set is a subset of the third data set; and performing state conversion on the data in the fourth data set, and marking the data in the fourth data set according to a conversion result.
Therefore, by presetting reasonable three-layer logic conditions for various service scenarios, the batch data processing device can automatically process the data to be processed in batches according to those conditions, switch the state of that data, and mark it according to the switching result. This meets the need for efficient processing in scenarios where the data to be processed and its states are complicated and varied, improves the efficiency and accuracy of data processing, and improves user experience.
It can be understood that, for data processing in a large-scale system, accuracy must be ensured first: the more complex the system, the stricter the conditions for judging data. The embodiment of the present application partitions data of different dimensions through multi-layer parameter control, abstracts common processing conditions (such as system layer logic conditions), and automatically matches customized screening conditions (such as data layer logic conditions), so as to process batch data efficiently and accurately.
First, a general description will be given of concepts or terms referred to in the embodiments of the present application.
Batch data (also referred to as batch business): data generated by service scenarios in the system that require processing large volumes of data, such as batch payroll data, batch fee collection data, batch payment data, batch interest calculation data, and interest accrual and tax deduction data.
State of the data: data is the object that the system processes, and the state of a piece of data represents its real-time status. Different states allow different processing operations to be performed on the data, and the possible states vary with the system and the data. Taking an account of a bank as an example, the states of the account's data may include, but are not limited to: card opened, normal, dormant, cleared, and closed.
Preprocessing: a data processing mode in which data that is predicted to need operations such as state conversion is calculated, judged, or otherwise handled in advance, before it reaches the condition under which it must be processed.
Parameter hierarchy: the scope controlled by a parameter. Taking a banking system as an example, the parameter hierarchy can be divided into system layer parameters (also called bank layer parameters, which control permissions across the whole bank), application layer parameters (also called product layer parameters, which control permissions for part of the data), data layer parameters (which control permissions for a single piece of data), and the like. Functional logic is implemented through parameter control at these different levels.
Marking: during data processing, the system marks the data that meets a condition so that it can quickly identify that data later. For example, a "closed" mark is applied to the data of an account that meets the account closing condition.
Mark removal: after data has been marked, the system removes the existing mark from data that meets a condition during processing and restores the data to its state before marking. For example, after a "closed" mark has been applied to the data of an account that met the account closing condition, if subsequent processing finds that the data meets the condition corresponding to the "normal" state, the "closed" mark is removed and the data is restored to the "normal" state (which can also be understood as applying a "normal" mark to the data of the account).
Prompt information table: when the system evaluates the multi-layer logic conditions for the data, it records the data that does not meet a logic condition together with information such as the reason why each piece of data fails. More information about the data can then be shown to staff or users for routine business analysis, or users can consult it to understand the data, which improves user experience.
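The marking and mark removal terms defined above can be sketched as a pair of operations on an account record. This is only an illustration: the `Account` class, its `history` field, and the choice to restore the previous state from a stack are assumptions, since the application does not say how the pre-marking state is kept.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Account:
    # Hypothetical record layout, for illustration only.
    account_id: str
    state: str = "normal"
    mark: str = ""
    history: List[str] = field(default_factory=list)  # earlier states, newest last


def apply_mark(account: Account, new_state: str) -> None:
    """Convert the state and mark the data according to the conversion result."""
    account.history.append(account.state)
    account.state = new_state
    account.mark = new_state


def remove_mark(account: Account) -> None:
    """Remove the current mark and restore the state held before marking."""
    if not account.history:
        return  # nothing was marked, so there is nothing to restore
    account.state = account.history.pop()
    # Following the definition, restoring can be read as applying the
    # mark of the restored state (for example, "normal").
    account.mark = account.state


acct = Account("A1")
apply_mark(acct, "closed")
print(acct.state, acct.mark)   # closed closed
remove_mark(acct)
print(acct.state, acct.mark)   # normal normal
```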
For the purpose of facilitating an understanding of a specific implementation of the system for batch data processing provided by the embodiments of the present application, reference will be made to the following description taken in conjunction with the accompanying drawings.
It should be noted that the main body implementing the batch data processing method may be the batch data processing apparatus provided in the embodiments of the present application, and the apparatus may be carried in an electronic device or a functional module of the electronic device. The electronic device in the embodiment of the present application may be any device capable of implementing the method for batch data processing in the embodiment of the present application, and may be an IoT device, for example.
Fig. 1 is a schematic flowchart of a method for processing batch data according to an embodiment of the present application. The method may be applied to the apparatus 300 for batch data processing shown in fig. 3, or may be applied to the electronic device 400 shown in fig. 4.
As shown in fig. 1, the method may include, for example:
S101, obtaining a second data set according to a first data set and a system layer logic condition, wherein data in the second data set meets the system layer logic condition, and the second data set is a subset of the first data set.
The first data set may be a set of data that needs to be state-converted and that is screened from all data by the batch data processing apparatus. A data set composed of all data may be referred to as a total data set.
As an example, before S101, the method provided in the embodiment of the present application may further include: judging whether the data in the total data set meets a preset state transition condition, and forming the data that meets it into the first data set. In other words, before S101, the method may further include: acquiring, from the total data set, the first data set meeting the state transition condition, wherein the state transition condition is used for indicating data needing state transition, and the first data set is a subset of the total data set.
In some implementations, in order to identify in advance the batch data that is about to undergo state transition and to perform preprocessing operations such as advance notification and advance processing, the method may further handle the data in the total data set that does not satisfy the preset state transition condition: judging whether that data meets a preset quasi-state transition condition, and forming the data that meets it into a fifth data set. In other words, before S101, the method may further include: acquiring, from the total data set, a fifth data set meeting the quasi-state transition condition, wherein the quasi-state transition condition is used for indicating data that will need state transition as soon as a target index is met, and the fifth data set is a subset of the total data set. The fifth data set and the first data set do not intersect.
It should be noted that, for data in the total data set that neither satisfies the preset state transition condition nor the preset quasi-state transition condition, the batch data processing is not performed on the data.
In a specific implementation, the method shown in fig. 1 is executed for the first data set, and the corresponding batch data processing is performed after the three-layer logic conditions have been judged. For the fifth data set, it is further judged whether its data meets a quasi-preprocessing condition, and the data that meets the condition forms a sixth data set, wherein the quasi-preprocessing condition is used for indicating the data needing to be subjected to quasi-preprocessing, the quasi-preprocessing condition comprises the target index, and the sixth data set is a subset of the fifth data set. An identifier is then added to the data in the sixth data set, wherein the identifier is used for indicating that the data in the sixth data set is to-be-preprocessed data, and a quasi-preprocessing table is generated, wherein the quasi-preprocessing table comprises the data in the sixth data set and the identifier of each data.
The state transition condition, together with the quasi-state transition condition, divides the data in the total data set into data that needs state transition now (i.e., data in the first data set) and data that will soon need state transition (i.e., data in the fifth data set). Taking a financial system such as a bank as an example, the state transition condition may include at least one index such as a date, an amount, or a credit amount. Taking a due date as the state transition condition, data in the total data set whose due date is on or before the current date may be determined as data satisfying the state transition condition, and data whose due date falls within a preset period (for example, 30 days) after the current date may be determined as data about to satisfy the state transition condition.
The quasi-state transition condition may include the same indexes as the state transition condition, one of which is a target index (e.g., a date). Data satisfying the quasi-state transition condition can be summarized as data that satisfies every index of the state transition condition except the target index. For example, suppose the conditions include a due date and an amount, the condition for converting from the normal state to the closed state is a deposit of less than 500 yuan, and the quasi-preprocessing window is 30 days. Data 1 has a due date of March 30 and an amount of 100 yuan; data 2 has a due date of April 8 and an amount of 300 yuan. When batch data processing is performed on March 31, data 1 satisfies the state transition condition. Data 2 has not reached its due date, but its other index (the amount) satisfies the state transition condition, so data 2 satisfies the quasi-state transition condition.
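The worked example can be checked with a few lines of Python. This sketch takes the due dates as 30 March and 8 April and the run date as 31 March; the year 2022, the function name `classify`, and the returned labels are assumptions added for illustration.

```python
from datetime import date, timedelta

RUN_DATE = date(2022, 3, 31)   # the year is an assumption; the example gives only month and day
WINDOW = timedelta(days=30)    # look-ahead period from the example
MAX_DEPOSIT = 500              # closing applies to deposits below 500 yuan


def classify(due_date: date, amount: float, run_date: date = RUN_DATE) -> str:
    """Return which condition a record satisfies on the run date."""
    if amount >= MAX_DEPOSIT:
        return "none"                      # the non-target index already fails
    if due_date <= run_date:
        return "state_transition"          # all indexes met: first data set
    if due_date <= run_date + WINDOW:
        return "quasi_state_transition"    # only the target index is pending: fifth data set
    return "none"


print(classify(date(2022, 3, 30), 100))  # state_transition (data 1)
print(classify(date(2022, 4, 8), 300))   # quasi_state_transition (data 2)
```

Each record falls into at most one branch, which matches the statement that the first and fifth data sets do not intersect.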
The quasi-preprocessing condition is the condition used to screen again the data in the fifth data set, which already satisfies the quasi-state transition condition, down to the data that must actually be handled before state transition is performed. Because different services have different state transition rules, the indexes included in the quasi-preprocessing condition may not be identical to those included in the quasi-state transition condition. Data in the fifth data set that does not satisfy the quasi-preprocessing condition is therefore not processed during batch data processing. Only data in the fifth data set that satisfies the quasi-preprocessing condition (namely, data in the sixth data set) is marked with a quasi-conversion identifier and recorded in the quasi-preprocessing table. For example, data 2 is marked with an 'about to be closed' identifier, and a record 'to be closed on April 8' is added to the quasi-preprocessing table. In this way, data that needs to be processed can be identified quickly during subsequent batch data processing, which improves the processing efficiency of the data to be preprocessed and thus the interactive experience with the user.
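A minimal sketch of how the sixth data set and the quasi-preprocessing table might be built is given below. The function name, the record fields (including the `frozen` flag used as a stand-in quasi-preprocessing condition), the record "data 5", and the year are all hypothetical and chosen only for illustration.

```python
from datetime import date


def build_quasi_preprocessing_table(fifth_data_set, quasi_preprocessing_condition, identifier):
    """Mark records that meet the quasi-preprocessing condition and list them in a table."""
    table = []
    for record in fifth_data_set:
        if not quasi_preprocessing_condition(record):
            continue  # left untouched in this batch run
        record["quasi_identifier"] = identifier  # mark the record itself
        table.append({
            "id": record["id"],
            "identifier": identifier,
            "expected_date": record["due_date"],
        })
    return table


fifth_data_set = [
    {"id": "data 2", "due_date": date(2022, 4, 8), "amount": 300, "frozen": False},
    {"id": "data 5", "due_date": date(2022, 4, 20), "amount": 50, "frozen": True},
]
# Assumed condition: only accounts that are not frozen are to be preprocessed.
table = build_quasi_preprocessing_table(
    fifth_data_set, lambda r: not r["frozen"], "about to be closed"
)
for row in table:
    print(row["id"], row["identifier"], row["expected_date"].isoformat())
```

Only "data 2" ends up in the table in this sketch; "data 5" satisfies the quasi-state transition condition but fails the assumed quasi-preprocessing condition, so it is neither marked nor recorded.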
For S101, the system layer logic condition is a common, general data judgment condition, and may be understood as a rule specified across different systems through system-level parameter settings. For example, for some regions of a banking system, the following system layer logic condition exists: if the deposit is more than 1 million, the withdrawal operation is allowed. Then, when the deposit of an account rises from 300,000 to 2 million, the system layer logic condition is satisfied, and the data of the account becomes eligible to be converted from the withdrawal-restricted state to the normal state.
For data in the first data set which does not meet the logic condition of the system layer, a seventh data set can be formed and recorded in the prompt information table. Optionally, in order to make the prompt information table richer and enable a user or a worker to obtain more information from the prompt information table, the prompt information table may further include information such as data in the seventh data set and reasons why each data does not satisfy the system layer logic condition. In other words, the method provided by the embodiment of the present application may further include: obtaining a seventh data set according to the first data set and the system-level logic condition, wherein data in the seventh data set does not satisfy the system-level logic condition, and the seventh data set is a subset of the first data set; and generating a prompt information table, wherein the prompt information table comprises the data in the seventh data set and relevant information of each data, and the relevant information comprises a reason why the data does not meet the logic condition of a system layer.
In a specific implementation, S101 may include, for example: judging whether the data in the first data set satisfies the system layer logic condition, and forming the data in the first data set that satisfies the system layer logic condition into a second data set. The method may further include: forming the data in the first data set that does not satisfy the system layer logic condition into a seventh data set, and recording the data in the seventh data set and the related information of the data in the prompt information table.
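The split performed in S101 can be illustrated with the following sketch, which partitions the first data set into the second data set and the rows of the prompt information table, keeping the reason for each rejection. The `apply_layer` helper, the record fields, and the wording of the reason are assumptions; the threshold reuses the deposit example given above.

```python
def apply_layer(data_set, checks):
    """Split data_set by a list of (reason, predicate) checks.

    Returns (passed, prompt_rows); each prompt row keeps the record and
    the reasons why it did not satisfy the layer's logic condition.
    """
    passed, prompt_rows = [], []
    for record in data_set:
        failed = [reason for reason, ok in checks if not ok(record)]
        if failed:
            prompt_rows.append({"record": record, "reasons": failed})
        else:
            passed.append(record)
    return passed, prompt_rows


# Assumed system layer check, taken from the deposit example in the text.
system_checks = [
    ("deposit is not more than 1,000,000", lambda r: r["deposit"] > 1_000_000),
]
first_data_set = [
    {"id": "A", "deposit": 2_000_000},
    {"id": "B", "deposit": 300_000},
]
second_data_set, seventh_rows = apply_layer(first_data_set, system_checks)
print([r["id"] for r in second_data_set])  # ['A']
print(seventh_rows[0]["reasons"])          # ['deposit is not more than 1,000,000']
```

Returning the rejected records together with their reasons in a single pass avoids re-evaluating the condition when the prompt information table is generated.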
S102, obtaining a third data set according to the second data set and the application layer logic condition, wherein data in the third data set meets the application layer logic condition, and the third data set is a subset of the second data set.
For S102, the application layer logic condition belongs to a judgment condition in the application program, and may be specifically understood as a rule specified between different applications through parameter setting of the application program. The application layer logic conditions are usually screened based on the main service functions of the service, most of the data which do not meet the service rules are eliminated, and the data which meet the service rules are reserved.
For data in the second data set that does not satisfy the application layer logic condition, an eighth data set may be formed and recorded in the prompt information table. Optionally, in order to make the prompt information table richer and enable a user or a worker to obtain more information from the prompt information table, the prompt information table may further include information such as data in the eighth data set and reasons why each data does not satisfy the application layer logic condition. In other words, the method provided by the embodiment of the present application may further include: obtaining an eighth data set according to the second data set and the application layer logic condition, wherein data in the eighth data set does not meet the application layer logic condition, and the eighth data set is a subset of the second data set; and generating a prompt information table, wherein the prompt information table comprises the data in the eighth data set and relevant information of each data, and the relevant information comprises a reason why the data does not meet the application layer logic condition.
In a specific implementation, S102 may include, for example: judging whether the data in the second data set satisfies the application layer logic condition, and forming the data in the second data set that satisfies the application layer logic condition into a third data set. The method may further include: forming the data in the second data set that does not satisfy the application layer logic condition into an eighth data set, and recording the data in the eighth data set and the related information of the data in the prompt information table.
S103, obtaining a fourth data set according to the third data set and a data layer logic condition, wherein data in the fourth data set meets the data layer logic condition, and the fourth data set is a subset of the third data set.
For S103, the data layer logic condition belongs to judgment and verification of a data type, and may be specifically understood as defining different parameters corresponding to different business rules through different data types. The data layer logic condition is mostly a special check index, such as a customized business rule or a non-universal business rule.
For data in the third data set that does not satisfy the data layer logic condition, a ninth data set may be formed and recorded in the prompt information table. Optionally, in order to make the prompt information table richer and enable a user or a worker to obtain more information from the prompt information table, the prompt information table may further include information such as data in the ninth data set and reasons why each data does not satisfy the data layer logic condition. In other words, the method provided by the embodiment of the present application may further include: obtaining a ninth data set according to the third data set and the data layer logic condition, wherein data in the ninth data set does not satisfy the data layer logic condition, and the ninth data set is a subset of the third data set; and generating a prompt information table, wherein the prompt information table comprises the data in the ninth data set and relevant information of each data, and the relevant information comprises a reason why the data does not meet the data layer logic condition.
In a specific implementation, S103 may include, for example: judging whether the data in the third data set satisfies the data layer logic condition, and forming the data in the third data set that satisfies the data layer logic condition into a fourth data set. The method may further include: forming the data in the third data set that does not satisfy the data layer logic condition into a ninth data set, and recording the data in the ninth data set and the related information of the data in the prompt information table.
Thus, through the above S101 to S103, three-layer logical screening and verification are automatically performed on the data satisfying the state transition condition, so as to obtain a fourth data set, which is a set of data that can really implement state transition, and prepare for batch processing of data.
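The chaining of S101 to S103 can be sketched as three successive filters, each applied to the output of the previous one. The function names and the record fields (`region_enabled`, `product`, `custom_ok`) are hypothetical placeholders for whatever parameters a real system would configure at each layer.

```python
def screen(data_set, condition):
    """Return (kept, rejected) for one layer of screening."""
    kept = [r for r in data_set if condition(r)]
    rejected = [r for r in data_set if not condition(r)]
    return kept, rejected


def three_layer_screen(first, system_cond, application_cond, data_cond):
    second, seventh = screen(first, system_cond)        # S101
    third, eighth = screen(second, application_cond)    # S102
    fourth, ninth = screen(third, data_cond)            # S103
    rejected = {"system": seventh, "application": eighth, "data": ninth}
    return fourth, rejected


first_data_set = [
    {"id": 1, "region_enabled": True, "product": "deposit", "custom_ok": True},
    {"id": 2, "region_enabled": False, "product": "deposit", "custom_ok": True},
    {"id": 3, "region_enabled": True, "product": "loan", "custom_ok": True},
    {"id": 4, "region_enabled": True, "product": "deposit", "custom_ok": False},
]
fourth_data_set, rejected = three_layer_screen(
    first_data_set,
    system_cond=lambda r: r["region_enabled"],            # assumed system layer rule
    application_cond=lambda r: r["product"] == "deposit",  # assumed application layer rule
    data_cond=lambda r: r["custom_ok"],                    # assumed data layer rule
)
print([r["id"] for r in fourth_data_set])                 # [1]
print({k: [r["id"] for r in v] for k, v in rejected.items()})
```

Because each layer only sees what the previous layer kept, the fourth data set is by construction a subset of the third, the third a subset of the second, and the second a subset of the first.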
S104, performing state conversion on the data in the fourth data set, and marking the data in the fourth data set according to the conversion result.
In a specific implementation, state conversion may be performed on all the data in the fourth data set to obtain a conversion result, and the data in the fourth data set is then marked according to the conversion result. For example, suppose the fourth data set includes data 1, data 2, and data 3, and the screening and verification in S101 to S103 determine that data 1 needs to be converted from the "normal" state to the "clear" state, data 2 from the "normal" state to the "closed" state, and data 3 from the "closed" state to the "normal" state. S104 may then include: performing these state conversions on data 1, data 2, and data 3 respectively, with the conversion results being that the state of data 1 is "clear", the state of data 2 is "closed", and the state of data 3 is "normal"; and marking data 1, data 2, and data 3 according to these results, so that after the marking operation they are marked "clear", "closed", and "normal", respectively.
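The conversion and marking of S104 for this example can be sketched as follows. The `convert_and_mark` function, the `target_states` mapping, and the shape of the returned rows are assumptions made for illustration only.

```python
def convert_and_mark(fourth_data_set, target_states):
    """Convert each record to its target state, mark it, and log the change."""
    state_change_rows = []
    for record in fourth_data_set:
        old_state = record["state"]
        new_state = target_states[record["id"]]
        record["state"] = new_state  # state conversion
        record["mark"] = new_state   # mark according to the conversion result
        state_change_rows.append((record["id"], old_state, new_state))
    return state_change_rows


fourth_data_set = [
    {"id": "data 1", "state": "normal"},
    {"id": "data 2", "state": "normal"},
    {"id": "data 3", "state": "closed"},
]
target_states = {"data 1": "clear", "data 2": "closed", "data 3": "normal"}
for row in convert_and_mark(fourth_data_set, target_states):
    print(row)
# ('data 1', 'normal', 'clear')
# ('data 2', 'normal', 'closed')
# ('data 3', 'closed', 'normal')
```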
Therefore, batch processing of data in the total data set is completed, automatic three-layer logic processing is achieved, different data rules can be classified clearly and definitely for numerous data of a complex system, public service logic can be processed in a unified mode, customized rules can be set dynamically according to parameters, and batch data processing accuracy and processing efficiency are improved.
In some implementations, the batch data processing process may be performed periodically (e.g., daily) or triggered by an event (e.g., a user operation such as a deposit). In that case, after S104, the method provided by the embodiment of the present application may further include: determining a tenth data set, which is a subset of the fourth data set and whose data does not satisfy any one of the system layer logic condition, the application layer logic condition, or the data layer logic condition; performing state conversion on the data in the tenth data set; and marking the data in the tenth data set according to the conversion result. This process may be regarded as a de-marking operation relative to the marking process of S104, or as the marking process of a brand-new batch data processing run.
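One possible reading of this de-marking step is sketched below. Two assumptions are made and should be treated as such: "does not satisfy any one of" is interpreted as "fails at least one of the three conditions", and the state to convert back to is supplied by the caller through a hypothetical `previous_state` mapping, since the text does not specify the target state.

```python
def demark(fourth_data_set, conditions, previous_state):
    """Find records that fail at least one condition and revert their state and mark."""
    tenth_data_set = [
        r for r in fourth_data_set if not all(cond(r) for cond in conditions)
    ]
    for record in tenth_data_set:
        record["state"] = previous_state[record["id"]]  # assumed target state
        record["mark"] = record["state"]
    return tenth_data_set


# After a deposit event, "data 2" holds 800 yuan and no longer meets the
# assumed closing rule (deposit below 500 yuan); "data 4" still meets it.
fourth_data_set = [
    {"id": "data 2", "state": "closed", "mark": "closed", "deposit": 800},
    {"id": "data 4", "state": "closed", "mark": "closed", "deposit": 100},
]
conditions = [lambda r: r["deposit"] < 500]
tenth_data_set = demark(fourth_data_set, conditions, {"data 2": "normal"})
print([(r["id"], r["mark"]) for r in tenth_data_set])  # [('data 2', 'normal')]
```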
In order to more clearly introduce the method provided by the embodiment of the present application, the embodiment of the present application is exemplarily described below with reference to the flowchart shown in fig. 2.
As shown in fig. 2, the method may include, for example:
S201, judging whether the data in the total data set meets the state conversion condition, if so, executing S202, otherwise, executing S206.
S202, obtaining a first data set which meets the state conversion condition in the total data set, judging whether the data in the first data set meets the system layer logic condition, if so, executing S203, otherwise, executing S208.
S203, obtaining a second data set meeting the logic conditions of the system layer in the first data set, judging whether the data in the second data set meets the logic conditions of the application layer, if so, executing S204, otherwise, executing S208.
S204, obtaining a third data set meeting the application layer logic condition in the second data set, judging whether the data in the third data set meets the data layer logic condition, if so, executing S205, otherwise, executing S208.
S205, obtaining a fourth data set satisfying the logic condition of the data layer in the third data set, performing state conversion on the data in the fourth data set, marking the data in the fourth data set based on the conversion result, and recording the marking result in the state change table, thereby performing S209.
S206, acquiring a fifth data set which does not meet the state conversion condition but meets the quasi-state conversion condition in the total data set, judging whether the data in the fifth data set meets the quasi-preprocessing condition, if so, executing S207, otherwise, ending the batch processing flow.
S207, obtaining a sixth data set satisfying a quasi-preprocessing condition in a fifth data set, adding a quasi-preprocessing identifier to data in the sixth data set, and generating a quasi-preprocessing table, where the quasi-preprocessing table includes data in the sixth data set and the quasi-preprocessing identifier of each data, thereby performing S209.
S208, a seventh data set which does not meet the logic condition of the system layer in the first data set is obtained, an eighth data set which does not meet the logic condition of the application layer in the second data set is obtained, a ninth data set which does not meet the logic condition of the data layer in the third data set is obtained, and the data in the seventh data set, the eighth data set and the ninth data set and the related information of each data are recorded in a prompt information table, so that S209 is executed.
S209, displaying various reports, wherein the various reports include but are not limited to: a state change table, a quasi-preprocessing table, and a prompt information table.
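The flow of S201 to S209 can be condensed into the following per-record sketch, which routes each record to one of the three reports. The `run_batch` function, the keys of the `conds` dictionary, the record fields, and the concrete conditions are all hypothetical; the figure describes the flow in terms of data sets, whereas this sketch evaluates the same branches record by record for brevity.

```python
def run_batch(total_data_set, conds, convert):
    reports = {"state_change": [], "quasi_preprocessing": [], "prompt": []}
    layers = [
        ("system layer", conds["system"]),
        ("application layer", conds["application"]),
        ("data layer", conds["data"]),
    ]
    for record in total_data_set:
        if conds["transition"](record):                      # S201
            # S202 to S204: find the first layer whose condition is not met
            failed = next((name for name, ok in layers if not ok(record)), None)
            if failed is not None:                           # S208
                reports["prompt"].append(
                    (record["id"], f"{failed} logic condition not met")
                )
            else:                                            # S205
                old_state = record["state"]
                record["state"] = convert(record)
                reports["state_change"].append(
                    (record["id"], old_state, record["state"])
                )
        elif conds["quasi_transition"](record) and conds["quasi_preprocessing"](record):
            record["quasi_identifier"] = "to be preprocessed"  # S206, S207
            reports["quasi_preprocessing"].append(
                (record["id"], record["quasi_identifier"])
            )
    return reports                                           # input to S209


conds = {
    "transition": lambda r: r["days_to_due"] <= 0,
    "quasi_transition": lambda r: 0 < r["days_to_due"] <= 30,
    "quasi_preprocessing": lambda r: r["amount"] < 500,
    "system": lambda r: r["region_ok"],
    "application": lambda r: r["product"] == "deposit",
    "data": lambda r: r["amount"] < 500,
}
total_data_set = [
    {"id": "a", "state": "normal", "days_to_due": -1, "amount": 100, "region_ok": True, "product": "deposit"},
    {"id": "b", "state": "normal", "days_to_due": 0, "amount": 100, "region_ok": False, "product": "deposit"},
    {"id": "c", "state": "normal", "days_to_due": 8, "amount": 300, "region_ok": True, "product": "deposit"},
    {"id": "d", "state": "normal", "days_to_due": 90, "amount": 300, "region_ok": True, "product": "deposit"},
]
reports = run_batch(total_data_set, conds, convert=lambda r: "closed")
for name, rows in reports.items():
    print(name, rows)
```

In this sketch record "d" appears in no report, which corresponds to the branch of S206 in which the batch processing flow ends without further action.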
Therefore, the method provided by the embodiment of the application applies parameter control at more levels, provides a dedicated quasi-preprocessing mechanism, is applicable to various fields, and offers a better user interaction experience. Three-layer logic processing is the core of the technical solution provided by the embodiment of the application: each layer is controlled by its own parameters, and data is recorded in the prompt information table whenever an abnormality is encountered in any step, so that, however the data is processed, an end user can trace the source of the data and the reason for failure in the state transition process. In addition, report display serves as the visual output in the embodiment of the application, and the different types of reports ensure that there is no gap in interaction at any stage, so that richer content is displayed to users.
In addition, the quasi-preprocessing flow does not need to be executed every time: if a specific system does not need to predict data states in advance, the function may be left disabled. For batch processing of data states as a whole, however, enabling the quasi-preprocessing function can greatly improve the user experience, allows system risks caused by various data states to be predicted in advance, and improves the batch processing efficiency of data.
It can be understood that the batch data processing method provided by the embodiment of the application combines the three-layer logic processing function, the quasi-preprocessing function, and the report display function, which complement one another, into a complete mechanism for automatically processing batch data. With this mechanism, data states can be sensed in advance, real-time user interaction is available for abnormal data at each stage, the marking and de-marking of batch data states are recorded, and each step of data state conversion is kept safe and effective.
The quasi-preprocessing function plays a crucial role in the data state modification process: quasi-preprocessing can reduce a large part of the system's risks in advance, and generating the quasi-preprocessing table further improves the user interaction experience of the system.
The three-layer logic processing function takes into account that the system has various data processing modes and different business processing rules. By adopting three-layer logic processing for complex business, it can classify different data rules clearly, process common business logic in a unified manner, and allow customized rules to be set dynamically through parameters.
The report display function can globally record the conditions encountered during data processing. Taking abnormal conditions as an example, it can collect and count abnormal data and the reasons for the abnormality under various conditions, and provides users with functions for querying and generating reports, thereby further improving the user experience.
Referring to fig. 3, an embodiment of the present application further provides an apparatus 300 for batch data processing, where the apparatus 300 may include: a first obtaining unit 301, a second obtaining unit 302, a third obtaining unit 303, and a first converting unit 304, wherein:
a first obtaining unit 301, configured to obtain a second data set according to a first data set and a system-level logic condition, where data in the second data set satisfies the system-level logic condition, and the second data set is a subset of the first data set;
a second obtaining unit 302, configured to obtain a third data set according to the second data set and an application layer logic condition, where data in the third data set satisfies the application layer logic condition, and the third data set is a subset of the second data set;
a third obtaining unit 303, configured to obtain a fourth data set according to the third data set and a data layer logic condition, where data in the fourth data set satisfies the data layer logic condition, and the fourth data set is a subset of the third data set;
a first conversion unit 304, configured to perform state conversion on the data in the fourth data set, and mark the data in the fourth data set according to a conversion result.
Optionally, the apparatus 300 further comprises:
a fourth obtaining unit, configured to obtain, from a total data set, the first data set that satisfies a state transition condition, where the state transition condition is used to indicate data that needs to be state-transitioned, and the first data set is a subset of the total data set.
Optionally, the apparatus 300 further comprises:
a fifth obtaining unit, configured to obtain, from the total data set, a fifth data set that satisfies the quasi-state transition condition, where the quasi-state transition condition is used to indicate data that needs to be state-transitioned as long as a target index is satisfied, and the fifth data set is a subset of the total data set.
Optionally, the apparatus 300 further comprises:
a sixth obtaining unit, configured to obtain a sixth data set that meets a quasi-preprocessing condition from the fifth data set, where the quasi-preprocessing condition is used to indicate data that needs to be subjected to quasi-preprocessing, the quasi-preprocessing condition includes the target index, and the sixth data set is a subset of the fifth data set;
an adding unit, configured to add an identifier to the data in the sixth data set, where the identifier is used to indicate that the data in the sixth data set is to-be-preprocessed data;
and the first generating unit is used for generating a quasi-preprocessing table, and the quasi-preprocessing table comprises the data in the sixth data set and the identification of each data.
Optionally, the apparatus 300 further comprises:
a seventh obtaining unit, configured to obtain a seventh data set according to the first data set and the system-level logic condition, where data in the seventh data set does not satisfy the system-level logic condition, and the seventh data set is a subset of the first data set;
an eighth obtaining unit, configured to obtain an eighth data set according to the second data set and the application layer logic condition, where data in the eighth data set does not satisfy the application layer logic condition, and the eighth data set is a subset of the second data set;
a ninth obtaining unit, configured to obtain a ninth data set according to the third data set and the data layer logic condition, where data in the ninth data set does not satisfy the data layer logic condition, and the ninth data set is a subset of the third data set;
a second generating unit, configured to generate a prompt information table, where the prompt information table includes the data in the seventh data set, the data in the eighth data set, the data in the ninth data set, and related information of each data, and the related information indicates a reason why the data does not satisfy a corresponding condition.
Optionally, the apparatus 300 further comprises:
and a second conversion unit, configured to, after performing state conversion on data in the fourth data set and marking the data in the fourth data set according to a conversion result, determine that data in a tenth data set does not satisfy any one of the system layer logic condition, the application layer logic condition, or the data layer logic condition, perform state conversion on the data in the tenth data set and mark the data in the tenth data set according to the conversion result, where the tenth data set is a subset of the fourth data set.
It should be noted that, the specific implementation manner and the achieved technical effect of the apparatus 300 can be referred to the related description in the method shown in fig. 1 or fig. 2.
In addition, an embodiment of the present application further provides an electronic device 400, as shown in fig. 4, the electronic device 400 includes a processor 401 and a memory 402:
the memory 402 is used for storing computer programs;
the processor 401 is configured to execute the method provided in fig. 1 or fig. 2 according to the computer program.
In addition, the embodiment of the present application also provides a computer-readable storage medium, which is used for storing a computer program, and the computer program is used for executing the method provided by the embodiment of the present application.
As can be seen from the above description of the embodiments, those skilled in the art can clearly understand that all or part of the steps in the above embodiment methods can be implemented by software plus a general hardware platform. Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a storage medium, such as a read-only memory (ROM)/RAM, a magnetic disk, an optical disk, or the like, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network communication device such as a router) to execute the method according to the embodiments or some parts of the embodiments of the present application.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, system embodiments and device embodiments are substantially similar to method embodiments and are therefore described in a relatively simple manner, where relevant reference may be made to some descriptions of method embodiments. The above-described embodiments of the apparatus and system are merely illustrative, wherein modules described as separate parts may or may not be physically separate, and parts shown as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The above description is only a preferred embodiment of the present application and is not intended to limit the scope of the present application. It should be noted that, for a person skilled in the art, several improvements and modifications can be made without departing from the scope of the present application, and these improvements and modifications should also be considered as the protection scope of the present application.

Claims (14)

1. A method of batch data processing, comprising:
obtaining a second data set according to a first data set and a system layer logic condition, wherein data in the second data set meets the system layer logic condition, and the second data set is a subset of the first data set;
obtaining a third data set according to the second data set and an application layer logic condition, wherein data in the third data set meets the application layer logic condition, and the third data set is a subset of the second data set;
obtaining a fourth data set according to the third data set and a data layer logic condition, wherein data in the fourth data set meets the data layer logic condition, and the fourth data set is a subset of the third data set;
and performing state conversion on the data in the fourth data set, and marking the data in the fourth data set according to a conversion result.
2. The method of claim 1, further comprising:
acquiring the first data set meeting a state transition condition from the total data set, wherein the state transition condition is used for indicating data needing state transition, and the first data set is a subset of the total data set.
3. The method of claim 1, further comprising:
and acquiring a fifth data set meeting the quasi-state transition condition from the total data set, wherein the quasi-state transition condition is used for indicating data needing state transition as long as a target index is met, and the fifth data set is a subset of the total data set.
4. The method of claim 3, further comprising:
acquiring a sixth data set meeting a quasi-preprocessing condition from the fifth data set, wherein the quasi-preprocessing condition is used for indicating data needing to be subjected to quasi-preprocessing, the quasi-preprocessing condition comprises the target index, and the sixth data set is a subset of the fifth data set;
adding an identifier to the data in the sixth data set, wherein the identifier is used for indicating that the data in the sixth data set is to-be-preprocessed data;
and generating a quasi-preprocessing table, wherein the quasi-preprocessing table comprises the data in the sixth data set and the identification of each data.
5. The method according to any one of claims 1-4, further comprising:
obtaining a seventh data set according to the first data set and the system-level logic condition, wherein data in the seventh data set does not satisfy the system-level logic condition, and the seventh data set is a subset of the first data set;
obtaining an eighth data set according to the second data set and the application layer logic condition, wherein data in the eighth data set does not meet the application layer logic condition, and the eighth data set is a subset of the second data set;
obtaining a ninth data set according to the third data set and the data layer logic condition, wherein data in the ninth data set does not satisfy the data layer logic condition, and the ninth data set is a subset of the third data set;
and generating a prompt information table, wherein the prompt information table comprises the data in the seventh data set, the data in the eighth data set, the data in the ninth data set and relevant information of each data, and the relevant information indicates the reason why the data do not meet the corresponding conditions.
6. The method according to any one of claims 1-4, wherein after performing the state transition on the data in the fourth data set and marking the data in the fourth data set according to the transition result, the method further comprises:
determining that data in a tenth data set does not meet any one of the system layer logic condition, the application layer logic condition or the data layer logic condition, performing state conversion on the data in the tenth data set, and marking the data in the tenth data set according to a conversion result, wherein the tenth data set is a subset of the fourth data set.
7. An apparatus for batch data processing, comprising:
a first obtaining unit, configured to obtain a second data set according to a first data set and a system-level logic condition, where data in the second data set satisfies the system-level logic condition, and the second data set is a subset of the first data set;
a second obtaining unit, configured to obtain a third data set according to the second data set and an application layer logic condition, where data in the third data set satisfies the application layer logic condition, and the third data set is a subset of the second data set;
a third obtaining unit, configured to obtain a fourth data set according to the third data set and a data layer logic condition, where data in the fourth data set satisfies the data layer logic condition, and the fourth data set is a subset of the third data set;
and the first conversion unit is used for carrying out state conversion on the data in the fourth data set and marking the data in the fourth data set according to the conversion result.
8. The apparatus of claim 7, further comprising:
a fourth obtaining unit, configured to obtain, from a total data set, the first data set that satisfies a state transition condition, where the state transition condition is used to indicate data that needs to be state-transitioned, and the first data set is a subset of the total data set.
9. The apparatus of claim 7, further comprising:
a fifth obtaining unit, configured to obtain, from the total data set, a fifth data set that satisfies the quasi-state transition condition, where the quasi-state transition condition is used to indicate data that needs to be state-transitioned as long as a target index is satisfied, and the fifth data set is a subset of the total data set.
10. The apparatus of claim 9, further comprising:
a sixth obtaining unit, configured to obtain a sixth data set that meets a quasi-preprocessing condition from the fifth data set, where the quasi-preprocessing condition is used to indicate data that needs to be subjected to quasi-preprocessing, the quasi-preprocessing condition includes the target index, and the sixth data set is a subset of the fifth data set;
an adding unit, configured to add an identifier to the data in the sixth data set, where the identifier is used to indicate that the data in the sixth data set is to-be-preprocessed data;
and the first generating unit is used for generating a quasi-preprocessing table, and the quasi-preprocessing table comprises the data in the sixth data set and the identification of each data.
11. The apparatus according to any one of claims 7-10, further comprising:
a seventh obtaining unit, configured to obtain a seventh data set according to the first data set and the system-level logic condition, where data in the seventh data set does not satisfy the system-level logic condition, and the seventh data set is a subset of the first data set;
an eighth obtaining unit, configured to obtain an eighth data set according to the second data set and the application layer logic condition, where data in the eighth data set does not satisfy the application layer logic condition, and the eighth data set is a subset of the second data set;
a ninth obtaining unit, configured to obtain a ninth data set according to the third data set and the data layer logical condition, where data in the ninth data set does not satisfy the data layer logical condition, and the ninth data set is a subset of the third data set;
a second generating unit, configured to generate a cue information table, where the cue information table includes the data in the seventh data set, the data in the eighth data set, the data in the ninth data set, and related information for each data item, and the related information indicates the reason why the data item does not satisfy the corresponding condition.
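The cue information table of claim 11 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the field names `system_open`, `account_valid`, and `balance`, the three example conditions, the wording of the reason text, and the function name `build_cue_information_table` are hypothetical. The sketch also assumes that the second data set is the part of the first data set that passes the system layer logic condition, and that the third data set is the part of the second that passes the application layer logic condition.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]
# (layer name, data set to check, logic condition for that layer)
Stage = Tuple[str, List[Record], Callable[[Record], bool]]


def build_cue_information_table(stages: List[Stage]) -> List[Dict[str, Any]]:
    table = []
    for layer_name, data_set, condition in stages:
        for record in data_set:
            if not condition(record):
                # Keep the failing record together with the reason it failed.
                table.append({
                    "data": record,
                    "reason": f"{layer_name} logic condition not satisfied",
                })
    return table


# Hypothetical sample data and conditions.
first_data_set = [
    {"id": 1, "system_open": True, "account_valid": True, "balance": 50},
    {"id": 2, "system_open": False, "account_valid": True, "balance": 80},
    {"id": 3, "system_open": True, "account_valid": False, "balance": 20},
    {"id": 4, "system_open": True, "account_valid": True, "balance": -5},
]


def system_ok(r: Record) -> bool:
    return r["system_open"]


def application_ok(r: Record) -> bool:
    return r["account_valid"]


def data_ok(r: Record) -> bool:
    return r["balance"] >= 0


# Assumed layering: each data set is the passing part of the previous one.
second_data_set = [r for r in first_data_set if system_ok(r)]
third_data_set = [r for r in second_data_set if application_ok(r)]

cue_table = build_cue_information_table([
    ("system layer", first_data_set, system_ok),        # yields the seventh data set
    ("application layer", second_data_set, application_ok),  # yields the eighth data set
    ("data layer", third_data_set, data_ok),            # yields the ninth data set
])
```

Each row of `cue_table` pairs a rejected record with the layer whose condition it failed, which is the role the claim assigns to the related information.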
12. The apparatus according to any one of claims 7-10, further comprising:
a second conversion unit, configured to, after state conversion has been performed on the data in the fourth data set and the data in the fourth data set has been marked according to the conversion result, and when it is determined that data in a tenth data set does not satisfy any one of the system layer logic condition, the application layer logic condition, or the data layer logic condition, perform state conversion on the data in the tenth data set and mark the data in the tenth data set according to the conversion result, where the tenth data set is a subset of the fourth data set.
13. An electronic device, comprising a processor and a memory:
the memory is used for storing a computer program;
the processor is adapted to perform the method of any of claims 1-6 in accordance with the computer program.
14. A computer-readable storage medium, characterized in that the computer-readable storage medium is used to store a computer program for performing the method of any of claims 1-6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210345850.XA CN114860823A (en) 2022-04-02 2022-04-02 Batch data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210345850.XA CN114860823A (en) 2022-04-02 2022-04-02 Batch data processing method and device

Publications (1)

Publication Number Publication Date
CN114860823A (en) 2022-08-05

Family

ID=82630394

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210345850.XA Pending CN114860823A (en) 2022-04-02 2022-04-02 Batch data processing method and device

Country Status (1)

Country Link
CN (1) CN114860823A (en)

Similar Documents

Publication Publication Date Title
CN106952158A (en) Solve the problems, such as the bookkeeping methods and equipment of focus account
CN108229806A (en) A kind of method and system for analyzing business risk
WO2019041774A1 (en) Customer information screening method and apparatus, electronic device, and medium
CN111260465A (en) Business processing method, device, server and storage medium
CN111091358B (en) Unified processing method and system for multiple payment channels
CN110084561A (en) Breakpoint follow-up method, electronic device and readable storage medium storing program for executing
CN111046184A (en) Text risk identification method, device, server and storage medium
CN110972086A (en) Short message processing method and device, electronic equipment and computer readable storage medium
CN109039695B (en) Service fault processing method, device and equipment
CN114860823A (en) Batch data processing method and device
CN111242779A (en) Financial data characteristic selection and prediction method, device, equipment and storage medium
CN114971637A (en) Risk early warning method, device, equipment and medium
CN111160011A (en) Organization unit standardization method, device, equipment and storage medium
CN101206734A (en) System and method for extracting time to automatic updating input data based on case
CN113269547B (en) Data processing method, device, electronic equipment and storage medium
CN112907009B (en) Standardized model construction method and device, storage medium and equipment
CN110351116B (en) Abnormal object monitoring method, device, medium and electronic equipment
CN109493208A (en) Processing method, apparatus and system, storage medium, the terminal of collage-credit data
Goel Fraud detection and corporate filings
US20230222579A1 (en) Method and Apparatus for Iterating Credit Scorecard Model, Electronic Device and Storage Medium
CN117271628A (en) Data processing model construction method, platform and storage medium
CN114580889A (en) Operation risk management and control method, device, equipment, medium and program product
CN117493996A (en) Construction method of police situation cascade classification model
CN116226111A (en) Data management method, device, equipment, storage medium and program product
CN116703505A (en) Order information judging method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination