CN110515967B - Spark calculation framework-based data analysis method and electronic equipment - Google Patents

Info

Publication number
CN110515967B
Authority
CN
China
Prior art keywords
service
data
service data
report
configuration information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910817122.2A
Other languages
Chinese (zh)
Other versions
CN110515967A (en
Inventor
张兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wanghai Kangxin Beijing Technology Co ltd
Original Assignee
Wanghai Kangxin Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wanghai Kangxin Beijing Technology Co ltd filed Critical Wanghai Kangxin Beijing Technology Co ltd
Priority to CN201910817122.2A priority Critical patent/CN110515967B/en
Publication of CN110515967A publication Critical patent/CN110515967A/en
Application granted granted Critical
Publication of CN110515967B publication Critical patent/CN110515967B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application relate to the technical field of data processing and disclose a Spark calculation framework-based data analysis method and electronic equipment. The method comprises the following steps: an analysis server based on the Spark calculation framework acquires, by querying a report database, configuration information of at least one item of service data required by a service report system; the analysis server determines the service database corresponding to each item of service data according to the configuration information and acquires the corresponding service data from each service database; and the analysis server performs corresponding data processing on each item of acquired service data and stores each processing result into the report database according to the configuration information, so that the service report system acquires each processing result by querying the report database. The method improves query efficiency and reduces the coupling between the analysis server and the query terminal.

Description

Spark calculation framework-based data analysis method and electronic equipment
Technical Field
The embodiment of the application relates to the technical field of data processing, in particular to a spark computing framework-based data analysis method and electronic equipment.
Background
With the advent of the information age, the volume of accumulated data has grown exponentially. Data such as hospital accounting data (for example, fixed asset purchase and consumption, wages and benefits of medical staff, medicine and equipment purchase and consumption, outpatient income, budgeting and reimbursement) and internet data often come from different business systems, and the data of these business systems may in turn come from multiple databases. A single data table can hold tens of millions to hundreds of millions of records, and the data structure differs from table to table.
Currently, data is generally queried by developing JDBC (Java Database Connectivity) code that executes multiple SQL (Structured Query Language) statements against the different databases, and by developing multiple front-end formulas (similar to Excel formulas) that load the query data and perform calculations on it; the resulting data is displayed in a web page. Although this approach can meet business requirements to a certain extent, its query efficiency is low, and pages load extremely slowly when massive data and complex SQL are involved. In particular, when multiple users query multiple data sets at the same time, the corresponding data may still not be returned after a long wait, or the database and the application server may crash, which results in a very poor user experience.
Disclosure of Invention
The purpose of the embodiments of the present application is to solve at least one of the above technical drawbacks, and to provide the following technical solutions:
in one aspect, a data analysis method based on spark calculation framework is provided, which includes:
an analysis server based on a spark calculation framework acquires configuration information of at least one service data required by a service report system by inquiring a report database;
determining service databases corresponding to the service data according to the configuration information, and respectively acquiring the corresponding service data from the service databases;
and performing corresponding data processing on the acquired business data respectively, and storing each processing result into a report database respectively according to the configuration information, so that the business report system acquires each processing result by inquiring the report database.
In one implementation, the configuration information of at least one service data required by the service reporting system includes at least one of the following items:
at least one service database corresponding to the service data respectively;
at least one query statement corresponding to the service data respectively;
and at least one business report corresponding to the business data respectively.
In one implementation, determining, according to the configuration information, service databases corresponding to the respective service data, and obtaining the respective service data from the respective service databases, includes:
and acquiring corresponding service data from the service database corresponding to each service data according to the query statement corresponding to each service data.
In one implementation manner, the corresponding data processing is performed on each obtained service data, and the method includes at least one of the following steps:
performing data cleaning on each acquired service data to filter redundant service data;
performing set operation of at least one item of intersection operation and union operation on each acquired service data;
performing mathematical operation of at least one of addition, subtraction, multiplication and division on each acquired service data;
and summarizing the acquired business data according to the data types.
In one implementation, storing each processing result in a report database according to the configuration information includes:
determining a business report corresponding to each business data according to the configuration information;
and respectively storing each processing result corresponding to each business data into a corresponding business report, wherein the business report is positioned in a report database.
In one aspect, a spark calculation framework-based data analysis apparatus is provided, including:
the acquisition module is used for acquiring configuration information of at least one service data required by a service report system by inquiring a report database by an analysis server based on a spark calculation frame;
the first processing module is used for determining service databases corresponding to the service data according to the configuration information and respectively acquiring the corresponding service data from the service databases;
and the second processing module is used for respectively carrying out corresponding data processing on each acquired service data and respectively storing each processing result into the report database according to the configuration information, so that the service report system acquires each processing result by inquiring the report database.
In one implementation, the configuration information of at least one service data required by the service reporting system includes at least one of the following items:
at least one service database corresponding to the service data respectively;
at least one query statement corresponding to the service data respectively;
and at least one business report corresponding to the business data respectively.
In an implementation manner, the first processing module is specifically configured to obtain, according to the query statement corresponding to each service data, corresponding service data from a service database corresponding to each service data.
In one implementation, the second processing module is specifically configured to perform at least one of the following:
performing data cleaning on each acquired service data to filter redundant service data;
performing set operation of at least one item of intersection operation and union operation on each acquired service data;
performing mathematical operation of at least one of addition, subtraction, multiplication and division on each acquired service data;
and summarizing the acquired business data according to the data types.
In one implementation, the second processing module includes a determining submodule and a storing submodule;
the determining submodule is used for determining a business report corresponding to each business data according to the configuration information;
and the storage submodule is used for respectively storing each processing result corresponding to each service data into a corresponding service report, and the service report is positioned in the report database.
In one aspect, an electronic device is provided, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the data analysis method based on the spark calculation framework is implemented.
In one aspect, a computer-readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the above-mentioned spark computing framework-based data analysis method.
In the Spark calculation framework-based data analysis method provided by the embodiments of the application, the analysis server based on the Spark calculation framework acquires, by querying the report database, the configuration information of at least one item of service data required by the service report system, determines the service database corresponding to each item of service data according to the configuration information, and acquires the corresponding service data from each service database. The analysis server can therefore query the service databases in parallel according to the configuration information in the report database and acquire all the service data at the same time, which greatly improves query efficiency. Introducing the service database also greatly reduces the coupling between the analysis server and the query terminal, so that an abnormality of the analysis server does not affect the normal use of the other functions of the query terminal. Each item of acquired service data is then processed accordingly, and each processing result is stored in the report database according to the configuration information, so that the query terminal can obtain the corresponding analysis data efficiently by querying the report database. In addition, the analysis server based on the Spark calculation framework supports distributed deployment, clustered deployment and the like, and therefore has good scalability.
Additional aspects and advantages of embodiments of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of embodiments of the present application will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic flowchart of a spark calculation framework-based data analysis method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a data query process based on spark computing framework according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a basic structure of a spark calculation framework-based data analysis apparatus according to an embodiment of the present application;
FIG. 4 is a detailed structural diagram of a data analysis apparatus based on spark computing framework according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes any one of, and all combinations of one or more of, the associated listed items.
To make the objects, technical solutions and advantages of the embodiments of the present application more clear, the embodiments of the present application will be further described in detail with reference to the accompanying drawings.
The spark calculation framework-based data analysis method and the electronic device provided by the embodiment of the application aim to solve the technical problems in the prior art.
The following describes in detail the technical solutions of the embodiments of the present application and how to solve the above technical problems with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
One embodiment of the application provides a data analysis method based on spark computing framework, and the method is executed by a server device. The servers may be individual physical servers, clusters of physical servers, or virtual servers. As shown in fig. 1, the method includes:
Step S110, the spark calculation framework-based analysis server obtains configuration information of at least one service data required by the service reporting system by querying the reporting database.
Specifically, the user may start the business reporting system in a terminal device (e.g., a desktop computer, a notebook computer, etc.) in which the business reporting system is pre-stored or pre-installed, and input a corresponding query condition in the business reporting system to query the needed one or more business data. After inquiring the needed one or more business data, the business reporting system displays the inquired one or more business data.
Specifically, the query condition input by the user in the business reporting system is configuration information configured for one or more business data required by the user. The configuration information of the one or more service data configured by the user in the service reporting system is equivalent to the configuration information of the one or more service data configured by the user received by the service reporting system. And after receiving the configuration information, the business report system stores the configuration information into a report database.
Specifically, the business data is stored in the analysis server based on the spark calculation framework, and the user queries one or more needed business data through the business reporting system, and actually completes the query of the data through the interaction between the business reporting system and the analysis server based on the spark calculation framework. In practical application, the analysis server based on the spark calculation framework may query the report database to obtain the configuration information, and query the required service data according to the configuration information, that is, the analysis server based on the spark calculation framework obtains the configuration information of at least one service data required by the service report system by querying the report database.
Step S120, determining the service databases corresponding to the service data according to the configuration information, and respectively obtaining the corresponding service data from the service databases.
Specifically, after the analysis server based on the Spark calculation framework acquires the configuration information, it can determine from that information the service database corresponding to each of the one or more items of service data. For example, suppose the service data are data A, data B, and data C, where data A is held in service database 1, data B in service database 2, and data C in service database 3. After acquiring the configuration information, the analysis server can determine from it that the service database corresponding to data A is service database 1, the service database corresponding to data B is service database 2, and the service database corresponding to data C is service database 3.
Specifically, after determining the service databases corresponding to the respective service data according to the configuration information, data a may be obtained from the service database 1, data B may be obtained from the service database 2, and data C may be obtained from the service database 3, that is, the respective service data may be obtained from the respective service databases. The process of acquiring the data A, the data B and the data C can be carried out simultaneously, namely, all the service data are inquired in parallel from the service database corresponding to all the service data respectively, so that the inquiry efficiency is greatly improved. In addition, by introducing the service database, the coupling between the analysis server and the terminal equipment is greatly reduced, and when the analysis server is abnormal, the influence on the normal use of other functions of the terminal equipment is effectively avoided.
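The parallel acquisition of data A, data B, and data C described above can be illustrated with a minimal sketch. It does not use Spark: as a stand-in, it assumes three small SQLite files play the role of service databases 1 to 3 and uses a thread pool to issue the configured queries concurrently. The table names, file names, and the helpers `make_demo_db`, `run_query`, and `fetch_all_parallel` are hypothetical and do not come from the application.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def make_demo_db(path, table, rows):
    """Create a small SQLite file standing in for one service database."""
    conn = sqlite3.connect(path)
    with conn:  # commits on success
        conn.execute(f"CREATE TABLE {table} (item TEXT, amount REAL)")
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    conn.close()


def run_query(task):
    """Open a dedicated connection for this task and run its configured SQL."""
    name, db_path, sql = task
    conn = sqlite3.connect(db_path)
    try:
        return name, conn.execute(sql).fetchall()
    finally:
        conn.close()


def fetch_all_parallel(tasks):
    """Query every service database concurrently; returns {data name: rows}."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return dict(pool.map(run_query, tasks))


# Hypothetical demo data: data A, B, C live in three separate databases.
workdir = tempfile.mkdtemp()
tasks = []
for name, table, rows in [
    ("A", "purchases", [("drug", 120.0), ("equipment", 300.0)]),
    ("B", "wages", [("doctor", 900.0)]),
    ("C", "income", [("outpatient", 1500.0)]),
]:
    path = os.path.join(workdir, f"service_db_{name}.sqlite")
    make_demo_db(path, table, rows)
    tasks.append((name, path, f"SELECT item, amount FROM {table} ORDER BY item"))

results = fetch_all_parallel(tasks)
```

Each task opens its own connection, so the three queries do not share state; in a Spark deployment the same role would presumably be played by the framework's own data source readers rather than a thread pool.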
Step S130, performing corresponding data processing on each obtained service data, and storing each processing result into the report database according to the configuration information, so that the service report system obtains each processing result by querying the report database.
Specifically, after the corresponding service data is respectively obtained from each service database, the obtained service data may be respectively subjected to corresponding data processing to obtain corresponding processing results, and then the processing results are respectively stored in the report database according to the configuration information, so that the service report system may obtain the processing results by querying the report database. After the business report system obtains each processing result, each processing result can be displayed to the user.
In summary, in the Spark calculation framework-based data analysis method provided by the embodiments of the application, the analysis server based on the Spark calculation framework acquires, by querying the report database, the configuration information of at least one item of service data required by the service report system, determines the service database corresponding to each item of service data according to the configuration information, and acquires the corresponding service data from each service database. The analysis server can therefore query the service databases in parallel according to the configuration information in the report database and acquire all the service data at the same time, which greatly improves query efficiency. Introducing the service database also greatly reduces the coupling between the analysis server and the query terminal, so that an abnormality of the analysis server does not affect the normal use of the other functions of the query terminal. Each item of acquired service data is then processed accordingly, and each processing result is stored in the report database according to the configuration information, so that the query terminal can obtain the corresponding analysis data efficiently by querying the report database. In addition, the analysis server based on the Spark calculation framework supports distributed deployment, clustered deployment and the like, and therefore has good scalability.
In a possible implementation manner, the configuration information of the at least one business data required by the business reporting system includes, but is not limited to: at least one service database corresponding to the service data respectively; at least one query statement corresponding to the service data respectively; and at least one business report corresponding to the business data respectively.
Specifically, when the configuration information includes the service database corresponding to each item of service data (for example, the service database corresponding to data A is service database 1, the service database corresponding to data B is service database 2, and the service database corresponding to data C is service database 3), the analysis server based on the Spark calculation framework can, according to the configuration information, directly obtain data A from service database 1, data B from service database 2, and data C from service database 3.
Specifically, when the configuration information includes the query statement corresponding to each item of service data (for example, the query statement corresponding to data A is SQL1, the query statement corresponding to data B is SQL2, and the query statement corresponding to data C is SQL3), the analysis server based on the Spark calculation framework can, according to the configuration information, obtain data A from service database 1 using SQL1, data B from service database 2 using SQL2, and data C from service database 3 using SQL3. In other words, in the process of determining the service database corresponding to each item of service data according to the configuration information and acquiring the corresponding service data from each service database, the service data may be acquired from its service database according to its corresponding query statement.
Specifically, when the configuration information includes the service report corresponding to each item of service data (for example, data A corresponds to service report 1, data B corresponds to service report 2, and data C corresponds to service report 3), then after the analysis server based on the Spark calculation framework obtains data A, data B, and data C, data A can be stored in service report 1, data B in service report 2, and data C in service report 3.
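The three kinds of configuration information described above (service database, query statement, and service report for each item of service data) can be pictured as one record per item. The following is only an illustrative sketch: the class name `DataConfig`, its field names, the helper `group_by_database`, and the sample SQL strings are assumptions, not a schema defined by the application.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataConfig:
    """One configuration record for one item of service data (hypothetical layout)."""
    data_name: str    # e.g. "A"
    service_db: str   # service database that holds the data
    query_sql: str    # query statement used to fetch the data
    report_name: str  # service report that receives the processing result


def group_by_database(configs):
    """Group configuration records by the service database they target."""
    grouped = defaultdict(list)
    for cfg in configs:
        grouped[cfg.service_db].append(cfg)
    return dict(grouped)


# Sample records mirroring the data A / B / C example in the text.
configs = [
    DataConfig("A", "service_db_1", "SELECT * FROM purchases", "report_1"),
    DataConfig("B", "service_db_2", "SELECT * FROM wages", "report_2"),
    DataConfig("C", "service_db_3", "SELECT * FROM income", "report_3"),
]

by_db = group_by_database(configs)
report_for = {cfg.data_name: cfg.report_name for cfg in configs}
```

With records of this shape, the analysis server can look up where to read each item of data (`by_db`) and where to write its result (`report_for`) without any per-report code.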
Specifically, after the service database corresponding to each item of service data has been determined according to the configuration information and the corresponding service data has been acquired from each service database, corresponding data processing may be performed on each item of acquired service data. For example: data cleaning may be performed to filter out redundant service data; a set operation of at least one of intersection and union may be performed; a mathematical operation of at least one of addition, subtraction, multiplication, and division may be performed; and the acquired service data may be summarized by data category.
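The four kinds of processing listed above can be sketched on small in-memory row lists. This is a plain-Python illustration under stated assumptions: rows are `(category, amount)` tuples, and "redundant" data is taken to mean exact duplicates and rows with a missing amount, which is one possible reading and not a definition given in the application. All function names are hypothetical.

```python
from collections import defaultdict


def clean(rows):
    """Data cleaning: drop exact duplicates and rows with a missing amount."""
    seen = set()
    kept = []
    for row in rows:
        if row[1] is None or row in seen:
            continue
        seen.add(row)
        kept.append(row)
    return kept


def common_categories(rows_a, rows_b):
    """Set operation: intersection of the categories present in both inputs."""
    return {r[0] for r in rows_a} & {r[0] for r in rows_b}


def all_categories(rows_a, rows_b):
    """Set operation: union of the categories present in either input."""
    return {r[0] for r in rows_a} | {r[0] for r in rows_b}


def summarize(rows):
    """Summarize by category: total amount per category."""
    totals = defaultdict(float)
    for category, amount in rows:
        totals[category] += amount
    return dict(totals)


data_a = clean([("drug", 100.0), ("drug", 100.0),
                ("equipment", None), ("equipment", 250.0)])
data_b = [("drug", 40.0), ("wages", 900.0)]

shared = common_categories(data_a, data_b)
everything = all_categories(data_a, data_b)
totals = summarize(data_a + data_b)
# Mathematical operation: difference between the two sources for one category.
drug_difference = summarize(data_a)["drug"] - summarize(data_b)["drug"]
```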
Specifically, after performing corresponding data processing on each obtained service data, a service report corresponding to each service data may be determined according to the configuration information, and then each processing result corresponding to each service data is stored in a corresponding service report, where the service report is located in a report database, so that the service report system obtains each processing result by querying the report database.
Specifically, fig. 2 shows an application process of the spark calculation framework-based data analysis method implemented in the present application, which specifically includes the following processes:
Step A1, the user configures, in the business report system, configuration information such as the business database, the query statement, and the business report corresponding to one or more required items of business data, and the business report system stores this configuration information into the report database. That is, information such as the database, the report information, and the query statement configured in the business report system is written to the report database.
Step A2, the spark computing framework-based analysis server determines the service databases corresponding to the respective service data according to the configuration information by calling the internal preset analysis program, and acquires the corresponding service data from the respective service databases. The analysis server based on the spark calculation framework may be a single server or a clustered server.
Step A3, the analysis server based on the Spark calculation framework retrieves the configuration information by querying the report database. The retrieved content includes: (1) the business reports to be analyzed; (2) the database and the query statement (SQL) corresponding to each business report.
Step A4, according to the configuration retrieved from the report database, the analysis program of the analysis server executes, for each item of business data of each business report, the corresponding query statement (SQL) in the corresponding business database to acquire the business data. There may be one or more business databases; when there are several, multiple data queries are executed to obtain the corresponding query results.
Step A5, the analysis program of the analysis server performs data analysis on each item of acquired business data according to business requirements to obtain the corresponding analysis results, and stores each analysis result in the report database.
Step A6, the business reporting system displays each analysis result to the report page corresponding to the business reporting system by querying each analysis result in the report database.
The main processing process of the analysis program comprises the following specific steps:
Step B1: retrieve the configuration information. The configuration information of the business data is obtained by querying the report database. It identifies the reports to be analyzed, and its fields include the business database corresponding to the business data, the query statement (SQL) corresponding to the business data, the analysis result report, and the like.
Step B2: access the corresponding business database according to the configuration information, and execute the configured query statement (SQL) in that business database to acquire the corresponding business data.
Step B3: apply logic such as intersection, union, calculation, and grouped summarization to the acquired business data according to business requirements to obtain the analyzed data result.
Step B4: store the analyzed data result into the analysis result report in the report database, which provides the data that the business report system queries.
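Steps B1 to B4 can be traced end to end in a small sketch. It is not the analysis program of the application and does not use Spark: in-memory SQLite databases stand in for the report database and the business databases, and the only analysis performed in step B3 is a grouped sum. The table `report_config`, its columns, the database keys `db1` and `db2`, and the function `run_analysis` are all hypothetical names chosen for illustration. Table names are taken from the configuration and interpolated into SQL, which is acceptable only because the sketch treats the configuration as trusted.

```python
import sqlite3

# Hypothetical stand-ins for two business databases.
business_dbs = {"db1": sqlite3.connect(":memory:"),
                "db2": sqlite3.connect(":memory:")}
business_dbs["db1"].executescript("""
    CREATE TABLE purchases (category TEXT, amount REAL);
    INSERT INTO purchases VALUES ('drug', 100), ('drug', 20), ('equipment', 300);
""")
business_dbs["db2"].executescript("""
    CREATE TABLE income (category TEXT, amount REAL);
    INSERT INTO income VALUES ('outpatient', 1500), ('outpatient', 500);
""")

# Hypothetical report database holding the configuration written in step A1.
report_db = sqlite3.connect(":memory:")
report_db.executescript("""
    CREATE TABLE report_config
        (data_name TEXT, business_db TEXT, query_sql TEXT, result_table TEXT);
    INSERT INTO report_config VALUES
        ('A', 'db1', 'SELECT category, amount FROM purchases', 'purchase_report'),
        ('B', 'db2', 'SELECT category, amount FROM income', 'income_report');
""")


def run_analysis(report_db, business_dbs):
    # B1: retrieve the configuration information from the report database.
    configs = report_db.execute(
        "SELECT data_name, business_db, query_sql, result_table FROM report_config"
    ).fetchall()
    for _data_name, db_name, sql, result_table in configs:
        # B2: run the configured SQL in the corresponding business database.
        rows = business_dbs[db_name].execute(sql).fetchall()
        # B3: analyze the data; here, a grouped sum per category.
        totals = {}
        for category, amount in rows:
            totals[category] = totals.get(category, 0.0) + amount
        # B4: store the result in the analysis result table of the report database.
        with report_db:
            report_db.execute(
                f'CREATE TABLE IF NOT EXISTS "{result_table}" (category TEXT, total REAL)')
            report_db.execute(f'DELETE FROM "{result_table}"')
            report_db.executemany(
                f'INSERT INTO "{result_table}" VALUES (?, ?)', sorted(totals.items()))


run_analysis(report_db, business_dbs)

# What the business report system would read back from the report database.
purchase_rows = report_db.execute(
    "SELECT category, total FROM purchase_report ORDER BY category").fetchall()
income_rows = report_db.execute(
    "SELECT category, total FROM income_report ORDER BY category").fetchall()
```

Because the business report system only reads the result tables, it never touches the business databases directly, which is the decoupling the description relies on.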
Fig. 3 is a schematic structural diagram of a data analysis apparatus based on the Spark calculation framework according to another embodiment of the present application. As shown in fig. 3, the apparatus 30 may include an obtaining module 31, a first processing module 32, and a second processing module 33, where:
the obtaining module 31 is configured to obtain, through an analysis server based on the Spark calculation framework, configuration information of at least one service data required by the service reporting system by querying a report database;
the first processing module 32 is configured to determine, according to the configuration information, the service database corresponding to each service data, and to obtain the corresponding service data from each service database;
the second processing module 33 is configured to perform corresponding data processing on each obtained service data, and to store each processing result in the report database according to the configuration information, so that the service report system obtains each processing result by querying the report database.
With the apparatus provided by this embodiment of the application, the analysis server based on the Spark calculation framework acquires, by querying the report database, the configuration information of at least one service data required by the service report system, determines the service database corresponding to each service data according to the configuration information, and acquires the corresponding service data from each service database. Because the analysis server can query these service databases in parallel according to the configuration information in the report database, it acquires all the service data at the same time, which greatly improves query efficiency. Introducing the service databases also greatly reduces the coupling between the analysis server and the query terminal, so that an abnormality in the analysis server does not affect the normal use of the other functions of the query terminal. The acquired service data are each subjected to the corresponding data processing, and each processing result is stored in the report database according to the configuration information, so the query terminal can obtain the corresponding analysis data by querying the report database, and that data is provided to it efficiently. In addition, the analysis server based on the Spark calculation framework supports distributed deployment, clustered deployment, and the like, and therefore has good scalability.
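The parallel acquisition described above can be illustrated with a small sketch. It is an assumption-laden stand-in: a thread pool from Python's standard library replaces whatever parallelism the Spark-based analysis server actually uses, and fetch_all_parallel, fake_fetch, FAKE_DATABASES, and the configuration keys are hypothetical names, with in-memory lists standing in for real service databases.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all_parallel(configs, fetch, max_workers=4):
    """Call fetch(database, sql) for every config concurrently.

    Returns a dict mapping each configured report name to its fetched rows.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every query first so they can run at the same time ...
        futures = {
            cfg["report"]: pool.submit(fetch, cfg["database"], cfg["sql"])
            for cfg in configs
        }
        # ... then collect the results once each query has finished.
        return {report: future.result() for report, future in futures.items()}


# Hypothetical stand-in for real database access: each "database" is a list of rows.
FAKE_DATABASES = {
    "billing": [("cardiology", 120.0)],
    "pharmacy": [("aspirin", 3)],
}


def fake_fetch(database, sql):
    # A real implementation would execute `sql` against `database`.
    return FAKE_DATABASES[database]


configs = [
    {"report": "dept_totals", "database": "billing", "sql": "SELECT dept, amount FROM charges"},
    {"report": "drug_usage", "database": "pharmacy", "sql": "SELECT drug, qty FROM dispensed"},
]
service_data = fetch_all_parallel(configs, fake_fetch)
```

Each query is independent of the others, which is what makes issuing them concurrently safe in this sketch.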
Specifically, the configuration information of the at least one service data required by the service reporting system includes at least one of the following items:
the service database corresponding to each of the at least one service data;
the query statement corresponding to each of the at least one service data;
and the business report corresponding to each of the at least one business data.
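One way to picture these three configuration items is as a small record per service data. The sketch below is hypothetical: the class ServiceDataConfig, its field names, and the helper group_by_database are illustrative choices, not names taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDataConfig:
    """Configuration for one service data (field names are assumptions)."""
    service_database: str   # which service database holds the data
    query_statement: str    # the SQL used to fetch the data
    service_report: str     # the report that receives the analysis result


def group_by_database(configs):
    """Group configuration records by the service database they target."""
    grouped = {}
    for cfg in configs:
        grouped.setdefault(cfg.service_database, []).append(cfg)
    return grouped


configs = [
    ServiceDataConfig("billing", "SELECT dept, amount FROM charges", "dept_totals"),
    ServiceDataConfig("billing", "SELECT dept, COUNT(*) FROM charges GROUP BY dept", "dept_counts"),
    ServiceDataConfig("pharmacy", "SELECT drug, qty FROM dispensed", "drug_usage"),
]
grouped = group_by_database(configs)
```

Grouping by database is only one possible use; it shows how the analysis server could decide which databases it needs to connect to before running any query.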
Specifically, the first processing module 32 is configured to obtain, according to the query statement corresponding to each service data, the corresponding service data from the service database corresponding to that service data.
Specifically, the second processing module 33 is specifically configured to perform at least one of the following:
performing data cleaning on each acquired service data to filter redundant service data;
performing set operation of at least one item of intersection operation and union operation on each acquired service data;
performing mathematical operation of at least one of addition, subtraction, multiplication and division on each acquired service data;
and summarizing the acquired business data according to the data types.
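The four kinds of processing listed above can be sketched on plain Python lists. This is a minimal illustration under stated assumptions: rows are (category, value) tuples, "redundant" data is read as exact duplicate rows, and "summarizing according to the data types" is read as summing values per category. The function names are hypothetical.

```python
from collections import defaultdict


def clean(rows):
    """Data cleaning: drop exact duplicate rows while preserving order."""
    return list(dict.fromkeys(rows))


def intersect(a, b):
    """Set operation: rows present in both inputs."""
    b_set = set(b)
    return [row for row in clean(a) if row in b_set]


def union(a, b):
    """Set operation: rows present in either input, without duplicates."""
    return clean(list(a) + list(b))


def summarize(rows):
    """Summarization: add up the values of (category, value) rows per category."""
    totals = defaultdict(float)
    for category, value in rows:
        totals[category] += value
    return dict(totals)


# Made-up sample data.
outpatient = [("cardiology", 100.0), ("radiology", 50.0), ("cardiology", 100.0)]
inpatient = [("radiology", 50.0), ("oncology", 50.0)]

cleaned = clean(outpatient)
common = intersect(outpatient, inpatient)
combined = union(outpatient, inpatient)
totals = summarize(combined)
# Mathematical operation: cardiology's share of the combined total (division).
cardiology_share = totals["cardiology"] / sum(totals.values())
```

In a Spark deployment these steps would more likely be expressed as DataFrame or SQL operations; the plain-list version is used here only so the logic is easy to follow.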
Specifically, the second processing module 33 includes a determination submodule 331 and a storage submodule 332, where:
the determining submodule 331 is configured to determine, according to the configuration information, a service report corresponding to each service data;
the storage sub-module 332 is configured to store each processing result corresponding to each service data into a corresponding service report, where the service report is located in the report database.
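The division of labor between the two submodules might look like the following sketch. It assumes that each service report is a table in the report database and that results are (key, value) rows; determine_report, store_result, and the sample configuration are hypothetical, and sqlite3 stands in for the real report database.

```python
import sqlite3


def determine_report(config, data_name):
    """Determining submodule: look up the report configured for one service data."""
    return config[data_name]["service_report"]


def store_result(report_db, report, rows):
    """Storage submodule: append (key, value) result rows to the report table."""
    # The report name comes from trusted configuration in this sketch.
    report_db.execute(f'CREATE TABLE IF NOT EXISTS "{report}" (key TEXT, value REAL)')
    report_db.executemany(f'INSERT INTO "{report}" VALUES (?, ?)', rows)
    report_db.commit()


# Made-up configuration and processing results.
config = {
    "charges": {"service_report": "dept_totals"},
    "visits": {"service_report": "visit_counts"},
}
results = {
    "charges": [("cardiology", 150.0)],
    "visits": [("cardiology", 12.0)],
}

report_db = sqlite3.connect(":memory:")
for data_name, rows in results.items():
    store_result(report_db, determine_report(config, data_name), rows)

stored_tables = sorted(
    name
    for (name,) in report_db.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
)
visit_rows = report_db.execute('SELECT key, value FROM "visit_counts"').fetchall()
```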
It should be noted that the present embodiment is an apparatus embodiment corresponding to the method embodiment described above, and it can be implemented in cooperation with that method embodiment. The related technical details mentioned in the method embodiment remain valid in this embodiment and are not repeated here. Accordingly, the related technical details mentioned in this embodiment can also be applied to the method embodiment described above.
Another embodiment of the present application provides an electronic device. As shown in fig. 5, the electronic device 500 includes a processor 501 and a memory 503, where the processor 501 is coupled to the memory 503, for example via the bus 502. The electronic device 500 may further include a transceiver 504. It should be noted that, in practical applications, the transceiver 504 is not limited to one, and the structure of the electronic device 500 does not constitute a limitation on the embodiments of the present application.
In this embodiment of the application, the processor 501 implements the functions of the obtaining module, the first processing module, and the second processing module shown in fig. 3 and fig. 4.
The processor 501 may be a CPU, a general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with this disclosure. The processor 501 may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Bus 502 may include a path that transfers information between the above components. The bus 502 may be a PCI bus or an EISA bus, etc. The bus 502 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 5, but this is not intended to represent only one bus or type of bus.
The memory 503 may be, but is not limited to, a ROM or other type of static storage device that can store static information and instructions, a RAM or other type of dynamic storage device that can store information and instructions, an EEPROM, a CD-ROM or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
The memory 503 stores the application program code for executing the scheme of the application, and its execution is controlled by the processor 501. The processor 501 executes the application program code stored in the memory 503 to implement the actions of the data analysis apparatus based on the Spark calculation framework provided by the embodiment shown in fig. 3 or fig. 4.
The electronic device provided by this embodiment of the application comprises a memory, a processor, and a computer program that is stored in the memory and can run on the processor. When the processor executes the program, the following is achieved. The analysis server based on the Spark calculation framework acquires, by querying the report database, the configuration information of at least one service data required by the service report system, determines the service database corresponding to each service data according to the configuration information, and acquires the corresponding service data from each service database. Because the analysis server can query these service databases in parallel according to the configuration information in the report database, it acquires all the service data at the same time, which greatly improves query efficiency. Introducing the service databases also greatly reduces the coupling between the analysis server and the query terminal, so that an abnormality in the analysis server does not affect the normal use of the other functions of the query terminal. The acquired service data are each subjected to the corresponding data processing, and each processing result is stored in the report database according to the configuration information, so the query terminal can obtain the corresponding analysis data by querying the report database, and that data is provided to it efficiently. In addition, the analysis server based on the Spark calculation framework supports distributed deployment, clustered deployment, and the like, and therefore has good scalability.
An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the method shown in the above embodiment. Specifically, the analysis server based on the Spark calculation framework acquires, by querying the report database, the configuration information of at least one service data required by the service report system, determines the service database corresponding to each service data according to the configuration information, and acquires the corresponding service data from each service database. Because the analysis server can query these service databases according to the configuration information in the report database, it acquires all the service data at the same time, which greatly improves query efficiency. Introducing the service databases also greatly reduces the coupling between the analysis server and the query terminal, so that an abnormality in the analysis server does not affect the normal use of the other functions of the query terminal. The acquired service data are each subjected to the corresponding data processing, and each processing result is stored in the report database according to the configuration information, so the query terminal can obtain the corresponding analysis data by querying the report database, and that data is provided to it efficiently. In addition, the analysis server based on the Spark calculation framework supports distributed deployment, clustered deployment, and the like, and therefore has good scalability.
The computer-readable storage medium provided by the embodiment of the application is suitable for any embodiment of the method.
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages. These sub-steps or stages are not necessarily performed at the same moment and may be performed at different times. They are also not necessarily performed in sequence, and may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The foregoing describes only some of the embodiments of the present application. It should be noted that those skilled in the art can make several improvements and refinements without departing from the principle of the present application, and these improvements and refinements should also be regarded as falling within the protection scope of the present application.

Claims (10)

1. A data analysis method based on a Spark calculation framework, characterized by comprising the following steps:
acquiring, by an analysis server based on the Spark calculation framework, configuration information of at least one service data required by a service report system by querying a report database, wherein the configuration information is stored in the report database after the service report system receives the configuration information of the at least one service data configured by a user;
determining the service database corresponding to each service data according to the configuration information, and acquiring the corresponding service data from each service database, wherein the service data are prestored in the service database of the analysis server based on the Spark calculation framework;
performing corresponding data processing on each acquired service data, and storing each processing result into the report database according to the configuration information, so that the service report system acquires each processing result by querying the report database;
the performing corresponding data processing on each obtained service data includes at least one of the following:
performing set operation of at least one item of intersection operation and union operation on each acquired service data;
and performing at least one of addition, subtraction, multiplication and division on each acquired service data.
2. The method according to claim 1, wherein the configuration information of the at least one service data required by the service report system comprises at least one of:
the service database corresponding to each of the at least one service data;
the query statement corresponding to each of the at least one service data;
and the service report corresponding to each of the at least one service data.
3. The method according to claim 2, wherein the determining, according to the configuration information, service databases corresponding to the respective service data, and obtaining the respective service data from the respective service databases respectively comprises:
and acquiring corresponding service data from the service database corresponding to each service data according to the query statement corresponding to each service data.
4. The method according to claim 1, wherein the performing the corresponding data processing on each obtained service data further includes at least one of:
performing data cleaning on each acquired service data to filter redundant service data;
and summarizing the acquired business data according to the data types.
5. The method according to claim 2, wherein the storing each processing result into the report database according to the configuration information comprises:
determining a business report corresponding to each business data according to the configuration information;
and respectively storing each processing result corresponding to each business data into a corresponding business report, wherein the business report is positioned in the report database.
6. A data analysis apparatus based on a Spark calculation framework, comprising:
an acquisition module, configured to acquire, through an analysis server based on the Spark calculation framework, configuration information of at least one service data required by a service report system by querying a report database, wherein the configuration information is stored in the report database after the service report system receives the configuration information of the at least one service data configured by a user;
a first processing module, configured to determine the service database corresponding to each service data according to the configuration information, and to acquire the corresponding service data from each service database, wherein the service data are prestored in the service database of the analysis server based on the Spark calculation framework;
a second processing module, configured to perform corresponding data processing on each acquired service data and to store each processing result in the report database according to the configuration information, so that the service report system acquires each processing result by querying the report database;
the second processing module is specifically configured to perform at least one of:
performing set operation of at least one item of intersection operation and union operation on each acquired service data;
and performing at least one of addition, subtraction, multiplication and division on each acquired service data.
7. The apparatus according to claim 6, wherein the configuration information of the at least one service data required by the service report system comprises at least one of:
the service database corresponding to each of the at least one service data;
the query statement corresponding to each of the at least one service data;
and the service report corresponding to each of the at least one service data.
8. The apparatus according to claim 7, wherein the first processing module is specifically configured to obtain, according to the query statement corresponding to each service data, corresponding service data from a service database corresponding to each service data.
9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the data analysis method based on the Spark calculation framework according to any one of claims 1 to 5.
10. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the data analysis method based on the Spark calculation framework according to any one of claims 1 to 5.
CN201910817122.2A 2019-08-30 2019-08-30 Spark calculation framework-based data analysis method and electronic equipment Active CN110515967B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910817122.2A CN110515967B (en) 2019-08-30 2019-08-30 Spark calculation framework-based data analysis method and electronic equipment

Publications (2)

Publication Number Publication Date
CN110515967A CN110515967A (en) 2019-11-29
CN110515967B true CN110515967B (en) 2020-09-08

Family

ID=68629003

Country Status (1)

Country Link
CN (1) CN110515967B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant