CN112559611A - Data processing method, device, equipment and storage medium - Google Patents

Publication number
CN112559611A
Authority
CN
China
Prior art keywords
data
target
wide
order
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011480092.XA
Other languages
Chinese (zh)
Inventor
张�浩
潘栋恒
张仪林
黄艳
曹睿
高翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Life Insurance Co Ltd China
Original Assignee
China Life Insurance Co Ltd China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Life Insurance Co Ltd China
Priority to CN202011480092.XA
Publication of CN112559611A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue

Abstract

One or more embodiments of this specification provide a data processing method, apparatus, device, and storage medium. Starting from a complex index, all data in the multiple target data tables corresponding to that index are partitioned by order ID, so that order IDs do not overlap across partitions and the order and integrity of the data are preserved. The target column data partitioned by order ID are then placed into a wide table, and the target data in the wide table are analyzed. Because the analysis no longer involves multi-table association, Spark Streaming can process the target data in the wide table quickly, and the resulting data are accurate and timely.

Description

Data processing method, device, equipment and storage medium
Technical Field
One or more embodiments of the present disclosure relate to the field of data processing technology, and in particular to a data processing method, apparatus, device, and storage medium.
Background
A conventional BI-style data analysis system is generally built around indexes and dimensions. Using an ETL tool, the system periodically processes the required indexes and dimensions and loads them from a source database into a target database, and a front-end platform then displays them.
With the rapid development of the domestic insurance business, competition has become increasingly intense. Business departments demand more and more data and place higher requirements on data processing frequency, data processing dimensions, and data processing accuracy. In terms of frequency, reporting has moved from annual reports ten years ago, through quarterly, monthly, and daily reports, to intraday reports over the past two years (every 1 hour, half hour, 10 minutes, or 5 minutes); scheduled updates can no longer meet current business needs, and key data must be updated fully in real time. In terms of dimensions and accuracy, a simple index that is updated on a schedule or in real time from a single data table or a small number of data tables (e.g., no more than 4) can be implemented with Spark Streaming + Kafka. However, even the association of a small number of data tables produces frequent errors during analysis, so the accuracy of data processing still needs to be improved. For slightly more complex indexes that associate multiple tables (more than 4 data tables), the accuracy of the result data achievable at present also needs to be improved, and the multi-table association analysis is time-consuming, so the result data are not timely. This problem urgently needs to be solved.
Disclosure of Invention
In view of this, one or more embodiments of the present disclosure aim to provide a data processing method, apparatus, device, and storage medium that solve the technical problem in the prior art that the accuracy and real-time performance of complex indexes associated with multiple tables need to be improved.
In view of the above object, one or more embodiments of the present specification provide a data processing method including:
pushing a data table in an original system database to a primary Kafka message queue, and sending a target data table in the primary Kafka message queue to a data mart according to preset table name information;
partitioning and sorting the data of the target data table in the data mart by order ID according to preset order ID information;
reading the data partitioned and sorted by order ID, extracting target column data according to preset column name information to form a wide table, and pushing the wide table to a secondary Kafka message queue;
reading a plurality of wide tables from the secondary Kafka message queue, querying a target wide table among the plurality of wide tables according to preset wide table name information, analyzing target condition data in the target wide table according to preset condition information to form result data, and pushing the result data to a front end for display in real time and/or on a schedule.
Based on the same inventive concept, one or more embodiments of the present specification further provide a data processing apparatus, including:
a primary data extraction module configured to push a data table in an original system database to a primary Kafka message queue, and to send a target data table in the primary Kafka message queue to a data mart according to preset table name information;
a data partitioning module configured to partition and sort the data of the target data table in the data mart by order ID according to preset order ID information;
a secondary data extraction module configured to read the data partitioned and sorted by order ID, extract target column data according to preset column name information to form a wide table, and push the wide table to a secondary Kafka message queue; and
a data analysis pushing module configured to read a plurality of wide tables from the secondary Kafka message queue, analyze target condition data in a target wide table according to preset wide table name information and preset column name information to form result data, and push the result data to a front end for display in real time or on a schedule.
Based on the same inventive concept, one or more embodiments of the present specification further provide an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method when executing the program.
Based on the same inventive concept, one or more embodiments of the present specification also provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the above-described method.
As can be seen from the above, the data processing method, apparatus, device, and storage medium provided in one or more embodiments of the present disclosure start from a complex index and partition all data in the multiple target data tables corresponding to that index by order ID, so that order IDs do not overlap across partitions and the order and integrity of the data are preserved. The target column data partitioned by order ID are then placed into a wide table, and the target data in the wide table are analyzed. Because the analysis no longer involves multi-table association, Spark Streaming can process the target data in the wide table quickly, and the resulting data are accurate and timely.
Drawings
In order to more clearly illustrate one or more embodiments or prior art solutions of the present specification, the drawings that are needed in the description of the embodiments or prior art will be briefly described below, and it is obvious that the drawings in the following description are only one or more embodiments of the present specification, and that other drawings may be obtained by those skilled in the art without inventive effort from these drawings.
FIG. 1 is a schematic flow diagram of a data processing method according to one or more embodiments of the present disclosure;
FIG. 2 is a flow diagram illustrating a method for extracting data according to one or more embodiments of the present disclosure;
FIG. 3 is a flow diagram of a data partitioning method in accordance with one or more embodiments of the present disclosure;
FIG. 4 is a flow diagram illustrating a method for secondary data extraction according to one or more embodiments of the present disclosure;
FIG. 5 is a flow diagram illustrating a data analysis pushing method according to one or more embodiments of the present disclosure;
FIG. 6 is a schematic diagram of a data processing apparatus according to one or more embodiments of the present description;
FIG. 7 is a schematic structural diagram of an electronic device according to one or more embodiments of the present disclosure.
Detailed Description
For the purpose of promoting a better understanding of the objects, aspects and advantages of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
It is to be noted that unless otherwise defined, technical or scientific terms used in one or more embodiments of the present specification should have the ordinary meaning as understood by those of ordinary skill in the art to which this disclosure belongs. The use of "first," "second," and similar terms in one or more embodiments of the specification is not intended to indicate any order, quantity, or importance, but rather is used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", and the like are used merely to indicate relative positional relationships, and when the absolute position of the object being described is changed, the relative positional relationships may also be changed accordingly.
As described in the Background, in the prior art a simple real-time index drawn from a single data table or a small number of data tables (e.g., no more than 4 data tables) can be implemented with Spark Streaming + Kafka: Spark Streaming processes the data tables in Kafka in real time and analyzes the target data to form result data, which are pushed over WebSocket to a front-end browser for real-time display. However, Spark Streaming is limited in the number of data tables it can process simultaneously, generally 1 to 4 data tables. For complex indexes that associate more than 4 data tables, the accuracy of the result data needs to be improved. The specific reason is that, because multiple data tables are involved, the prior art generally partitions by table name for concurrent processing and may use ETL to extract the target data from the multiple data tables.
To solve the problems in the prior art, referring to FIG. 1, one or more embodiments of the present specification provide a data processing method, including:
S101, pushing a data table in an original system database to a primary Kafka message queue, and sending a target data table in the primary Kafka message queue to a data mart according to preset table name information;
S102, partitioning and sorting the data of the target data table in the data mart by order ID according to preset order ID information;
S103, reading the data partitioned and sorted by order ID, extracting target column data according to preset column name information to form a wide table, and pushing the wide table to a secondary Kafka message queue;
S104, reading a plurality of wide tables from the secondary Kafka message queue, querying a target wide table among the plurality of wide tables according to preset wide table name information, analyzing target condition data in the target wide table according to preset condition information to form result data, and pushing the result data to a front end for display in real time and/or on a schedule.
In this embodiment, a plurality of target data tables in the primary Kafka message queue are sent to the data mart through Spark Streaming, and all data in all the target data tables are partitioned and sorted by order ID. The target columns are extracted from the partitioned and sorted data to form a wide table. The wide table is sent to a secondary Kafka message queue through Spark Streaming, the target data in the wide table are analyzed according to preset condition information, and the result data are pushed to a front-end browser for display on a schedule or in real time, which completes the formation and analysis of the wide table.
That is, this embodiment starts from a complex index and partitions all data in the multiple target data tables corresponding to that index by order ID, so that order IDs do not overlap across partitions and the order and integrity of the data are preserved. The target column data partitioned by order ID are then placed into a wide table, and the target data in the wide table are analyzed. Because the analysis no longer involves multi-table association, Spark Streaming can process the target data in the wide table quickly, and the resulting data are accurate and timely.
As an optional embodiment, referring to FIG. 2, pushing the data table in the original system database to the primary Kafka message queue, and sending the target data table in the primary Kafka message queue to the data mart according to preset table name information, includes:
S201, pushing a data table in a source system Oracle database to a primary Kafka message queue by using Spark Streaming;
S202, reading data table information in the primary Kafka message queue;
S203, extracting the target data table with the corresponding table name information from the primary Kafka message queue into a data mart according to preset table name information.
The table name information is the table name of an original data table related to the complex index. It is supplied as Spark Streaming configuration information so that the target data table is extracted automatically in real time.
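Steps S202 and S203 can be pictured with a minimal, framework-free Python sketch. It is an illustration under stated assumptions, not the implementation described in this specification: the message shape (`table` and `row` keys), the table names, and the function `extract_target_tables` are all hypothetical, and a plain list stands in for the primary Kafka message queue.

```python
from typing import Any, Dict, Iterable, List, Set

# Hypothetical message shape: each message read from the primary queue
# names its source table and carries one row of that table.
Message = Dict[str, Any]


def extract_target_tables(
    messages: Iterable[Message], target_table_names: Set[str]
) -> Dict[str, List[Dict[str, Any]]]:
    """Keep only rows whose table name is in the preset configuration."""
    data_mart: Dict[str, List[Dict[str, Any]]] = {}
    for msg in messages:
        table = msg["table"]
        if table in target_table_names:
            # Rows of a configured table are appended in arrival order.
            data_mart.setdefault(table, []).append(msg["row"])
    return data_mart


# A list stands in for the primary Kafka message queue (assumption).
primary_queue = [
    {"table": "policy_volume", "row": {"Order ID": 124468, "Order amount": 5000}},
    {"table": "audit_log", "row": {"event": "login"}},
    {"table": "policy_type", "row": {"Order ID": 124468, "Policy type": "Accident insurance"}},
]

# Preset table name information, here a hard-coded set of hypothetical names.
mart = extract_target_tables(primary_queue, {"policy_volume", "policy_type"})
```

Only the two configured tables reach the stand-in data mart; the unrelated `audit_log` message is dropped.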
As an optional embodiment, referring to FIG. 3, partitioning and sorting the data of the target data table in the data mart by order ID according to preset order ID information includes:
S301, setting a plurality of areas in the data mart according to the order IDs in the data table;
S302, sorting the areas;
S303, storing the data with the same order ID in the data table of the data mart, simultaneously or successively, into the area with the corresponding order ID information.
Further optionally, each area corresponds to one order ID, and the areas do not overlap across order IDs, so that the order and integrity of the data are maintained.
Because the order ID of each record is unique, all data in each target data table are broken up in the order in which the target data tables enter the data mart and placed into the areas set according to order ID.
For example, suppose the target data tables entering the data mart are as follows:
TABLE 1. Marketing department, September 2019: policy volume summary
Input name    Order ID    Order amount    Term of payment
King XX       124468      5000            2019.09.18
Liu XX        124469      6000            2019.09.19
Xi Xi         124470      5000            2019.09.19
TABLE 2. Marketing department 1, September 2019: list of policy types
Input name    Order ID    Policy type
King XX       124468      Accident insurance
Xi Xi         124470      Medical insurance
Table 3 shows how the data are presented after the target data tables have been partitioned and sorted. The data in each target data table are broken up into independent data units. The core of each data unit is a data body, and each data body carries two labels, an order ID and a column name, throughout subsequent transmission. Each data unit can therefore be transmitted independently without interfering with the others, which improves the timeliness and accuracy of data transmission.
In addition, when a new target data table flows into the data mart, the order IDs it contains are first checked against the existing areas. If an area already exists for an order ID, the related data are broken up and stored in that area; if not, a new storage area is set up for the new order ID and the broken-up data are stored there.
TABLE 3 (provided as an image in the original publication; not reproduced here)
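The partitioning described above (steps S301 to S303) can be sketched in plain Python. This is a sketch under assumptions rather than the method as implemented: a dictionary keyed by order ID stands in for the areas of the data mart, each data unit is modelled as a tuple of (order ID, column name, data body), and the function names `scatter` and `partition_by_order_id` are hypothetical. The rows are those of Tables 1 and 2.

```python
from typing import Any, Dict, List, Tuple

# A data unit: the data body plus its two labels (order ID, column name).
DataUnit = Tuple[int, str, Any]


def scatter(rows: List[Dict[str, Any]], id_column: str = "Order ID") -> List[DataUnit]:
    """Break each row into independent data units labelled with order ID and column name."""
    units: List[DataUnit] = []
    for row in rows:
        order_id = row[id_column]
        for column, value in row.items():
            if column != id_column:
                units.append((order_id, column, value))
    return units


def partition_by_order_id(tables: List[List[Dict[str, Any]]]) -> Dict[int, List[DataUnit]]:
    """Place data units into one area per order ID, then sort the areas by order ID."""
    areas: Dict[int, List[DataUnit]] = {}
    for rows in tables:  # tables are handled in the order they enter the data mart
        for unit in scatter(rows):
            # A new area is created the first time an order ID is seen.
            areas.setdefault(unit[0], []).append(unit)
    return dict(sorted(areas.items()))


table_1 = [
    {"Input name": "King XX", "Order ID": 124468, "Order amount": 5000, "Term of payment": "2019.09.18"},
    {"Input name": "Liu XX", "Order ID": 124469, "Order amount": 6000, "Term of payment": "2019.09.19"},
    {"Input name": "Xi Xi", "Order ID": 124470, "Order amount": 5000, "Term of payment": "2019.09.19"},
]
table_2 = [
    {"Input name": "King XX", "Order ID": 124468, "Policy type": "Accident insurance"},
    {"Input name": "Xi Xi", "Order ID": 124470, "Policy type": "Medical insurance"},
]

areas = partition_by_order_id([table_1, table_2])
```

Each area holds only units carrying its own order ID, so data for one order never ends up in another order's area, and a later table simply appends to an existing area or opens a new one.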
As an optional embodiment, referring to FIG. 4, reading the data partitioned and sorted by order ID, extracting the target column data according to the preset column name information to form a wide table, and pushing the wide table to the secondary Kafka message queue includes:
S401, reading the data in the plurality of areas by using Spark Streaming;
S402, extracting target data according to preset column name information;
S403, the target data flowing automatically, through a mapping, into the corresponding position in a wide table according to the order ID information and column name information that the target data carry;
S404, pushing the wide table to a secondary Kafka message queue.
The column name information is the column name of a target column in an original data table related to a complex index. It is supplied as Spark Streaming configuration information so that the target data are extracted automatically in real time.
Before target data flow into the wide table, the wide table is first queried for the order ID carried by the data. If the order ID already exists in the wide table, the relevant information is simply filled into or updated in the cell of the corresponding column for that order ID, according to the column name information; if not, the order ID is inserted first and the cell of the corresponding column is then filled or updated with the relevant information.
For example, the table structure of the wide table is shown in Table 4.
TABLE 4. XX wide table
Order ID    Entry employee number    Input name    Auditor    Order amount
Here, the entry start time, entry end time, entry employee number, and input name are preset column name information. Spark Streaming reads the data in the plurality of areas shown in Table 3 and extracts the data associated with the column names "Input name" and "Order amount". Before the extracted data enter Table 4, the wide table is queried for order IDs 124468, 124469, and 124470. If an order ID exists, its data flow automatically through the mapping into the corresponding position in the wide table as an update; if not, a MERGE INTO statement is used to insert the order ID, after which the data flow automatically through the mapping into the corresponding position in the wide table as an insert.
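The update-or-insert behaviour described above can be illustrated with a small Python sketch. It is only an approximation under assumptions: a dictionary keyed by order ID stands in for the wide table, the dictionary lookup stands in for the query that precedes the MERGE INTO statement, and the function name `upsert_into_wide_table` and the tuple layout of the data units are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple

DataUnit = Tuple[int, str, Any]  # (order ID, column name, data body)
WideTable = Dict[int, Dict[str, Optional[Any]]]

# Preset column name information (assumed here to be two columns of Table 4).
TARGET_COLUMNS = ["Input name", "Order amount"]


def upsert_into_wide_table(
    wide_table: WideTable, units: Iterable[DataUnit], target_columns: List[str]
) -> WideTable:
    """Map each target data unit to the cell addressed by its order ID and column name."""
    for order_id, column, value in units:
        if column not in target_columns:
            continue  # not a target column, so it is not extracted
        row = wide_table.get(order_id)
        if row is None:
            # Order ID not present yet: insert an empty row first.
            row = {c: None for c in target_columns}
            wide_table[order_id] = row
        # Fill or update the cell for this column.
        row[column] = value
    return wide_table


units = [
    (124468, "Input name", "King XX"),
    (124468, "Order amount", 5000),
    (124468, "Policy type", "Accident insurance"),
    (124469, "Input name", "Liu XX"),
    (124469, "Order amount", 6000),
]
wide = upsert_into_wide_table({}, units, TARGET_COLUMNS)
```

Running the function again with a unit for an existing order ID changes only the addressed cell and adds no row, which is the update branch of the description.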
It should be further noted that the wide table may be pushed into the secondary Kafka message queue in real time or at fixed intervals (e.g., every 30 s, 1 min, or 2 min).
As an optional embodiment, referring to FIG. 5, reading the plurality of wide tables from the secondary Kafka message queue, analyzing target condition data in a target wide table according to preset wide table name information and condition information to form result data, and pushing the result data to a front end for display in real time and/or on a schedule includes:
S501, reading a plurality of wide tables from the secondary Kafka message queue by using Spark Streaming, querying a target wide table among the wide tables according to preset wide table name information, and analyzing target condition data in the target wide table according to preset condition information to form result data;
S502, in response to determining that the result data are real-time update result data, sending the real-time update result data directly to a Java REST interface; the Java REST interface receives the real-time update result data and pushes them in real time to a front-end browser for display over a persistent WebSocket connection;
S503, sending all result data to a target database for storage; the front-end browser sets a timed task through a JavaScript function and, at the time points defined by the timed task, sends a request for scheduled update data to the target database, and the target database sends the scheduled update result data corresponding to the request to the front-end browser for display.
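The routing in S502 and S503 can be pictured with the following Python sketch. Everything in it is a stand-in chosen for illustration: the `realtime` flag on each result, the `dispatch` function, and the two callbacks are hypothetical, and plain lists replace the Java REST interface with its WebSocket push and the target database.

```python
from typing import Any, Callable, Dict, Iterable, List

Result = Dict[str, Any]


def dispatch(
    results: Iterable[Result],
    push_realtime: Callable[[Result], None],
    store: Callable[[Result], None],
) -> None:
    """Route result data: real-time results are pushed, and every result is stored."""
    for result in results:
        if result.get("realtime"):
            # S502: real-time update results go straight to the push channel.
            push_realtime(result)
        # S503: all results are stored so that timed requests can read them later.
        store(result)


pushed: List[Result] = []  # stands in for the REST interface and WebSocket push
stored: List[Result] = []  # stands in for the target database

dispatch(
    [
        {"index": "top underwriter", "realtime": True},
        {"index": "monthly summary", "realtime": False},
    ],
    push_realtime=pushed.append,
    store=stored.append,
)
```

In this sketch the browser's timed task would read from `stored`, while only results flagged as real-time appear in `pushed`.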
The wide table name information is the name of the wide table corresponding to a complex index, and the condition information is a condition set according to that index. For example, suppose the complex index is "the top underwriter in September 2019 and that underwriter's total underwritten amount". This index corresponds to the "Input name" and "Order amount" columns of the XX wide table shown in Table 4; that is, the preset wide table name information is "XX wide table", and the preset condition information is "sum the order amounts for each input name and select the maximum sum". The wide table name information and the condition information may be supplied as Spark Streaming configuration information or written directly into the program, so that the target condition data in the target wide table are analyzed automatically in real time.
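The example condition, summing the order amounts for each input name and selecting the maximum sum, can be written out as a short Python sketch. It assumes the wide table is available as a dictionary keyed by order ID with "Input name" and "Order amount" columns; the function name `top_underwriter` is hypothetical, and the three rows are the values from Table 1.

```python
from collections import defaultdict
from typing import Any, Dict, Tuple


def top_underwriter(wide_table: Dict[int, Dict[str, Any]]) -> Tuple[str, int]:
    """Sum order amounts per input name and return the name with the largest total."""
    totals: Dict[str, int] = defaultdict(int)
    for row in wide_table.values():
        totals[row["Input name"]] += row["Order amount"]
    # Pick the name whose summed order amount is the maximum.
    name = max(totals, key=lambda n: totals[n])
    return name, totals[name]


xx_wide_table = {
    124468: {"Input name": "King XX", "Order amount": 5000},
    124469: {"Input name": "Liu XX", "Order amount": 6000},
    124470: {"Input name": "Xi Xi", "Order amount": 5000},
}

result = top_underwriter(xx_wide_table)  # ("Liu XX", 6000)
```

Because every needed column already sits in one wide table, the computation is a single pass over the rows with no association between tables.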
It should be noted that the method of one or more embodiments of the present disclosure may be performed by a single device, such as a computer or server. The method of the embodiment can also be applied to a distributed scene and completed by the mutual cooperation of a plurality of devices. In such a distributed scenario, one of the devices may perform only one or more steps of the method of one or more embodiments of the present disclosure, and the devices may interact with each other to complete the method.
It should be noted that the above description describes certain embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Based on the same inventive concept, one or more embodiments of the present specification further provide a data processing apparatus corresponding to any of the above-described embodiment methods.
Referring to FIG. 6, the data processing apparatus includes:
a primary data extraction module 601 configured to push a data table in an original system database to a primary Kafka message queue, and to send a target data table in the primary Kafka message queue to a data mart according to preset table name information;
a data partitioning module 602 configured to partition and sort the data of the target data table in the data mart by order ID according to preset order ID information;
a secondary data extraction module 603 configured to read the data partitioned and sorted by order ID, extract target column data according to preset column name information to form a wide table, and push the wide table to a secondary Kafka message queue; and
a data analysis pushing module 604 configured to read a plurality of wide tables from the secondary Kafka message queue, analyze target condition data in a target wide table according to preset wide table name information and preset column name information to form result data, and push the result data to a front end for display in real time or on a schedule.
As an optional embodiment, pushing a data table in an original system database to a primary Kafka message queue, and sending a target data table in the primary Kafka message queue to a data mart according to preset table name information, includes:
pushing a data table in a source system Oracle database to a primary Kafka message queue by using Spark Streaming;
reading data table information in the primary Kafka message queue;
and extracting the target data table with the corresponding table name information from the primary Kafka message queue into a data mart according to preset table name information.
As an optional embodiment, partitioning and sorting the data of the target data table in the data mart by order ID according to preset order ID information includes:
setting a plurality of areas in the data mart according to the order IDs in the data table;
sorting the plurality of areas;
and storing the data with the same order ID in the data table of the data mart, simultaneously or successively, into the area with the corresponding order ID information.
As an optional embodiment, each area corresponds to one order ID.
As an optional embodiment, reading the data partitioned and sorted by order ID, extracting target column data according to preset column name information to form a wide table, and pushing the wide table to a secondary Kafka message queue includes:
reading the data in the plurality of areas by using Spark Streaming;
extracting target data according to preset column name information;
the target data flowing automatically, through a mapping, into the corresponding position in the wide table according to the order ID information and column name information that the target data carry;
and pushing the wide table to a secondary Kafka message queue.
As an optional embodiment, reading a plurality of wide tables from the secondary Kafka message queue, analyzing the target condition data in the target wide table according to the wide table name information and the condition information to form result data, and pushing the result data to the front end for display in real time and/or on a schedule includes:
reading a plurality of wide tables from the secondary Kafka message queue by using Spark Streaming, querying a target wide table among the plurality of wide tables according to preset wide table name information, and analyzing target condition data in the target wide table according to preset condition information to form result data;
in response to determining that the result data are real-time update result data, sending the real-time update result data directly to a Java REST interface; the Java REST interface receives the real-time update result data and pushes them in real time to a front-end browser for display over a persistent WebSocket connection;
sending all result data to a target database for storage; the front-end browser sets a timed task through a JavaScript function and, at the time points defined by the timed task, sends a request for scheduled update data to the target database, and the target database sends the scheduled update result data corresponding to the request to the front-end browser for display.
For convenience of description, the above apparatus is described as being divided into modules by function. Of course, when implementing one or more embodiments of the present description, the functions of the modules may be implemented in the same one or more pieces of software and/or hardware.
The apparatus in the foregoing embodiment is used to implement the corresponding data processing method in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Based on the same inventive concept, corresponding to any of the above-mentioned embodiments, one or more embodiments of the present specification further provide an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the data processing method according to any of the above-mentioned embodiments is implemented.
Fig. 7 is a schematic diagram of a more specific hardware structure of the electronic device according to this embodiment. The electronic device may include a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. The processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040 are communicatively connected to one another within the device via the bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs; when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program code is stored in the memory 1020 and called for execution by the processor 1010.
The input/output interface 1030 is used for connecting an input/output module to input and output information. The input/output module may be configured as a component within the device (not shown in the drawings) or may be external to the device to provide the corresponding function. Input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 1040 is used for connecting a communication module (not shown in the drawings) to implement communication between the present device and other devices. The communication module may communicate in a wired manner (such as USB or a network cable) or in a wireless manner (such as a mobile network, Wi-Fi, or Bluetooth).
Bus 1050 includes a path that transfers information between various components of the device, such as processor 1010, memory 1020, input/output interface 1030, and communication interface 1040.
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
The electronic device of the foregoing embodiment is used to implement the corresponding data processing method in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Based on the same inventive concept, corresponding to any of the above-described embodiment methods, one or more embodiments of the present specification further provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the data processing method according to any of the above-described embodiments.
Computer-readable media of the present embodiments, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
The computer instructions stored in the storage medium of the foregoing embodiment are used to enable the computer to execute the data processing method according to any one of the foregoing embodiments, and have the beneficial effects of the corresponding method embodiment, which are not described herein again.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is exemplary only and is not intended to imply that the scope of the disclosure, including the claims, is limited to these examples; within the spirit of the present disclosure, features of the above embodiments or of different embodiments may also be combined, steps may be implemented in any order, and there are many other variations of the different aspects of one or more embodiments of the present specification as described above, which are not provided in detail for the sake of brevity.
In addition, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown in the provided figures, for simplicity of illustration and discussion, and so as not to obscure one or more embodiments of the disclosure. Furthermore, devices may be shown in block diagram form in order to avoid obscuring the understanding of one or more embodiments of the present specification, and this also takes into account the fact that specifics with respect to the implementation of such block diagram devices are highly dependent upon the platform within which the one or more embodiments of the present specification are to be implemented (i.e., such specifics should be well within the purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the disclosure, it should be apparent to one skilled in the art that one or more embodiments of the disclosure can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative rather than restrictive.
While the present disclosure has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of these embodiments will be apparent to those of ordinary skill in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the discussed embodiments.
It is intended that the one or more embodiments of the present specification embrace all such alternatives, modifications and variations as fall within the broad scope of the appended claims. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of one or more embodiments of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (9)

1. A data processing method, comprising:
pushing a data table in an original system database to a primary Kafka message queue, and sending a target data table in the primary Kafka message queue to a data mart according to preset table name information;
sorting the data in the target data table in the data mart according to the order ID partition according to the preset order ID information;
reading the data sorted according to the order ID partitions, extracting target column data according to preset column name information to form a wide table, and pushing the wide table to a secondary Kafka message queue;
reading a plurality of wide tables from the secondary Kafka message queue, querying a target wide table from the plurality of wide tables according to preset wide table name information, analyzing target condition data in the target wide table according to preset condition information to form result data, and pushing the result data to a front end for display in real time and/or at scheduled times.
2. The method according to claim 1, wherein the pushing the data table in the original system database to the primary Kafka message queue, and sending the target data table in the primary Kafka message queue to the data mart according to preset table name information comprises:
pushing a data table in an Oracle source system database to a primary Kafka message queue by using Spark Streaming;
reading data table information in the primary Kafka message queue;
and extracting a target data table with corresponding table name information in the primary Kafka message queue into a data mart according to preset table name information.
3. The method according to claim 1, wherein the sorting the data in the target data table located in the data mart according to the order ID partition according to the preset order ID information comprises:
setting a plurality of regions in the data mart according to the order IDs in the data table;
sorting the plurality of regions;
and storing the data having the same order ID in the data table of the data mart, simultaneously or successively, into the region with the corresponding order ID information.
4. The method of claim 3, wherein one of said regions corresponds to an order ID.
5. The method as claimed in claim 3, wherein the reading the data sorted according to the order ID partitions, extracting target column data according to preset column name information to form a wide table, and pushing the wide table to a secondary Kafka message queue comprises:
reading data in the plurality of regions by using Spark Streaming;
extracting target data according to preset column name information;
mapping the target data automatically into the corresponding position in the wide table according to the order ID information and the column name information carried by the target data;
and pushing the wide table to a secondary Kafka message queue.
6. The method according to claim 1, wherein the reading a plurality of wide tables from the secondary Kafka message queue, analyzing target condition data in a target wide table according to wide table name information and condition information to form result data, and pushing the result data to a front-end display in real time and/or at regular time comprises:
reading a plurality of wide tables from the secondary Kafka message queue by using Spark Streaming, querying a target wide table from the plurality of wide tables according to preset wide table name information, and analyzing target condition data in the target wide table according to preset condition information to form result data;
in response to determining that the result data is real-time update result data, sending the real-time update result data directly to a Java REST interface; the Java REST interface receives the real-time update result data and pushes it to a front-end browser for real-time display over a persistent WebSocket connection;
sending all result data to a target database for storage; the front-end browser sets a timed task through a JavaScript function and, at the time node defined by the timed task, sends a timed-update data request to the target database, and the target database sends the timed-update result data corresponding to the data request to the front-end browser for display.
7. A data processing apparatus, comprising:
the data primary extraction module is configured to push a data table in an original system database to a primary Kafka message queue and send a target data table in the primary Kafka message queue to a data mart according to preset table name information;
the data partitioning module is configured to sort the data in the target data table in the data mart according to the order ID partition according to preset order ID information;
the data secondary extraction module is configured to read the data sorted according to the order ID partitions, extract target column data according to preset column name information to form a wide table, and push the wide table to a secondary Kafka message queue;
and the data analysis pushing module is configured to read a plurality of wide tables from the secondary Kafka message queue, analyze target condition data in a target wide table according to preset wide table name information and preset column name information to form result data, and push the result data to a front end for displaying in real time or at regular time.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 6 when executing the program.
9. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 6.
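As an illustration of the partitioning and wide-table steps recited in claims 3 to 5, the following minimal sketch groups rows by order ID, sorts the resulting regions, and maps the preset columns of each region into one wide row. It is only a sketch under stated assumptions: the function names, the field names `order_id`, `table`, `product`, and `amount`, and the sample rows are hypothetical, and in-memory lists and dictionaries stand in for the Kafka message queues, the data mart, and Spark Streaming.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

Row = Dict[str, object]


def partition_by_order_id(rows: Iterable[Row]) -> Dict[str, List[Row]]:
    """Group rows so that each region holds the rows of exactly one order ID."""
    regions: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        regions[str(row["order_id"])].append(row)
    # Sort the regions by order ID so that later reads see a stable order.
    return dict(sorted(regions.items()))


def build_wide_table(regions: Dict[str, List[Row]], columns: List[str]) -> List[Row]:
    """Map the preset columns of each region into a single wide row per order ID."""
    wide: List[Row] = []
    for order_id, rows in regions.items():
        wide_row: Row = {"order_id": order_id}
        for row in rows:
            for col in columns:
                if col in row:
                    # A later row overwrites an earlier value for the same column.
                    wide_row[col] = row[col]
        wide.append(wide_row)
    return wide


# Hypothetical rows, as if extracted from two source tables into the data mart.
source_rows = [
    {"order_id": "B2", "table": "payment", "amount": 300},
    {"order_id": "A1", "table": "policy", "product": "term-life"},
    {"order_id": "A1", "table": "payment", "amount": 120},
]
regions = partition_by_order_id(source_rows)
wide_table = build_wide_table(regions, columns=["product", "amount"])
```

Here the two rows for order `A1` end up in one region and collapse into a single wide row carrying both `product` and `amount`, while order `B2` yields a wide row with only `amount`.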
CN202011480092.XA 2020-12-15 2020-12-15 Data processing method, device, equipment and storage medium Pending CN112559611A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011480092.XA CN112559611A (en) 2020-12-15 2020-12-15 Data processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011480092.XA CN112559611A (en) 2020-12-15 2020-12-15 Data processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112559611A true CN112559611A (en) 2021-03-26

Family

ID=75063892

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011480092.XA Pending CN112559611A (en) 2020-12-15 2020-12-15 Data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112559611A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109002484A (en) * 2018-06-25 2018-12-14 北京明朝万达科技股份有限公司 A kind of method and system for sequence consumption data
CN109241159A (en) * 2018-08-07 2019-01-18 威富通科技有限公司 A kind of subregion querying method, system and the terminal device of data cube
US20190079964A1 (en) * 2017-09-13 2019-03-14 Coursera Inc. Dynamic state tracking with query serving in an online content platform
CN109960708A (en) * 2019-03-22 2019-07-02 蔷薇智慧科技有限公司 Data processing method, device, electronic equipment and storage medium
CN110019087A (en) * 2017-11-09 2019-07-16 北京京东尚科信息技术有限公司 Data processing method and its system
CN110297866A (en) * 2019-05-20 2019-10-01 平安普惠企业管理有限公司 Method of data synchronization and data synchronization unit based on log analysis
CN110784419A (en) * 2019-10-22 2020-02-11 中国铁道科学研究院集团有限公司电子计算技术研究所 Method and system for visualizing professional data of railway electric affairs

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination