CN110515967B - Spark calculation framework-based data analysis method and electronic equipment - Google Patents
Spark calculation framework-based data analysis method and electronic equipment
- Publication number
- CN110515967B (application CN201910817122.2A)
- Authority
- CN
- China
- Prior art keywords
- service
- data
- service data
- report
- configuration information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the application relate to the technical field of data processing and disclose a data analysis method based on the Spark computing framework, and an electronic device. The method comprises the following steps: an analysis server based on the Spark computing framework obtains configuration information of at least one item of service data required by a service report system by querying a report database; the analysis server determines the service database corresponding to each item of service data according to the configuration information, and obtains the corresponding service data from each service database; and the analysis server performs corresponding data processing on each item of obtained service data and stores each processing result in the report database according to the configuration information, so that the service report system obtains each processing result by querying the report database. The method greatly improves query efficiency and reduces the coupling between the analysis server and the query terminal.
Description
Technical Field
Embodiments of the application relate to the technical field of data processing, and in particular to a data analysis method based on the Spark computing framework and to an electronic device.
Background
With the advent of the information age, the volume of accumulated data has grown exponentially. Data such as hospital accounting data (for example, fixed asset purchase and consumption, wages and benefits of medical staff, medicine and equipment purchase and consumption, outpatient income, budgeting and reimbursement) and internet data often come from different business systems, and the data of these business systems may in turn come from multiple databases. A single data table can hold tens of millions or even hundreds of millions of records, and the data structure of each data table is different.
Currently, data is generally queried by developing JDBC (Java Database Connectivity) code that executes multiple SQL (Structured Query Language) statements against the different databases, and by developing multiple front-end formulas (similar to Excel formulas) that load and calculate the queried data; the resulting data is then displayed in a web page. Although this approach can meet business requirements to a certain extent, its query efficiency is low, and pages load extremely slowly when faced with massive data and complex SQL. In particular, when multiple users query multiple data sets at the same time, the corresponding data may not be returned even after a long wait, or the database and the application server may crash, resulting in a very poor user experience.
Disclosure of Invention
The purpose of the embodiments of the present application is to solve at least one of the above technical drawbacks, and to provide the following technical solutions:
In one aspect, a data analysis method based on the Spark computing framework is provided, which includes:
obtaining, by an analysis server based on the Spark computing framework, configuration information of at least one item of service data required by a service report system by querying a report database;
determining the service database corresponding to each item of service data according to the configuration information, and obtaining the corresponding service data from each service database;
and performing corresponding data processing on each item of obtained service data, and storing each processing result in the report database according to the configuration information, so that the service report system obtains each processing result by querying the report database.
In one implementation, the configuration information of at least one service data required by the service reporting system includes at least one of the following items:
at least one service database corresponding to the service data respectively;
at least one query statement corresponding to the service data respectively;
and at least one business report corresponding to the business data respectively.
In one implementation, determining, according to the configuration information, service databases corresponding to the respective service data, and obtaining the respective service data from the respective service databases, includes:
and acquiring corresponding service data from the service database corresponding to each service data according to the query statement corresponding to each service data.
In one implementation manner, the corresponding data processing is performed on each obtained service data, and the method includes at least one of the following steps:
performing data cleaning on each acquired service data to filter redundant service data;
performing set operation of at least one item of intersection operation and union operation on each acquired service data;
performing mathematical operation of at least one of addition, subtraction, multiplication and division on each acquired service data;
and summarizing the acquired business data according to the data types.
In one implementation, storing each processing result in a report database according to the configuration information includes:
determining a business report corresponding to each business data according to the configuration information;
and respectively storing each processing result corresponding to each business data into a corresponding business report, wherein the business report is positioned in a report database.
In one aspect, a spark calculation framework-based data analysis apparatus is provided, including:
an acquisition module, configured to obtain, through an analysis server based on the Spark computing framework, configuration information of at least one item of service data required by a service report system by querying a report database;
a first processing module, configured to determine the service database corresponding to each item of service data according to the configuration information, and to obtain the corresponding service data from each service database;
and a second processing module, configured to perform corresponding data processing on each item of obtained service data and to store each processing result in the report database according to the configuration information, so that the service report system obtains each processing result by querying the report database.
In one implementation, the configuration information of at least one service data required by the service reporting system includes at least one of the following items:
at least one service database corresponding to the service data respectively;
at least one query statement corresponding to the service data respectively;
and at least one business report corresponding to the business data respectively.
In an implementation manner, the first processing module is specifically configured to obtain, according to the query statement corresponding to each service data, corresponding service data from a service database corresponding to each service data.
In one implementation, the second processing module is specifically configured to perform at least one of the following:
performing data cleaning on each acquired service data to filter redundant service data;
performing set operation of at least one item of intersection operation and union operation on each acquired service data;
performing mathematical operation of at least one of addition, subtraction, multiplication and division on each acquired service data;
and summarizing the acquired business data according to the data types.
In one implementation, the second processing module includes a determining submodule and a storing submodule;
the determining submodule is used for determining a business report corresponding to each business data according to the configuration information;
and the storage submodule is used for respectively storing each processing result corresponding to each service data into a corresponding service report, and the service report is positioned in the report database.
In one aspect, an electronic device is provided, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the data analysis method based on the spark calculation framework is implemented.
In one aspect, a computer-readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the above-mentioned spark computing framework-based data analysis method.
In the Spark computing framework-based data analysis method provided by the embodiment of the application, the analysis server based on the Spark computing framework obtains the configuration information of at least one item of service data required by the service report system by querying the report database, determines the service database corresponding to each item of service data according to the configuration information, and obtains the corresponding service data from each service database. The analysis server can therefore query the service databases in parallel, according to the configuration information in the report database, and obtain all of the service data at the same time, which greatly improves query efficiency. Introducing the service database also greatly reduces the coupling between the analysis server and the query terminal, so that an abnormality of the analysis server does not affect the normal use of the other functions of the query terminal. The obtained service data are each subjected to corresponding data processing, and the processing results are stored in the report database according to the configuration information, so that the query terminal can efficiently obtain the corresponding analysis data by querying the report database. In addition, the analysis server based on the Spark computing framework supports distributed deployment, clustered deployment and the like, and therefore scales well.
Additional aspects and advantages of embodiments of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of embodiments of the present application will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic flowchart of a spark calculation framework-based data analysis method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a data query process based on spark computing framework according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a basic structure of a spark calculation framework-based data analysis apparatus according to an embodiment of the present application;
FIG. 4 is a detailed structural diagram of a data analysis apparatus based on spark computing framework according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes any one of, and all combinations of, one or more of the associated listed items.
To make the objects, technical solutions and advantages of the embodiments of the present application more clear, the embodiments of the present application will be further described in detail with reference to the accompanying drawings.
The spark calculation framework-based data analysis method and the electronic device provided by the embodiment of the application aim to solve the technical problems in the prior art.
The following describes in detail the technical solutions of the embodiments of the present application and how to solve the above technical problems with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
One embodiment of the application provides a data analysis method based on the Spark computing framework, and the method is executed by a server device. The server may be an individual physical server, a cluster of physical servers, or a virtual server. As shown in FIG. 1, the method includes:
Step S110, the analysis server based on the Spark computing framework obtains configuration information of at least one item of service data required by the service reporting system by querying the reporting database.
Specifically, the user may start the business reporting system in a terminal device (e.g., a desktop computer, a notebook computer, etc.) in which the business reporting system is pre-stored or pre-installed, and input a corresponding query condition in the business reporting system to query the needed one or more business data. After inquiring the needed one or more business data, the business reporting system displays the inquired one or more business data.
Specifically, the query condition input by the user in the business reporting system is the configuration information for the one or more items of business data that the user requires. In other words, when the user configures this information in the business reporting system, the business reporting system receives the configuration information for the one or more items of business data. After receiving the configuration information, the business reporting system stores it in the report database.
Specifically, the business data is stored in the analysis server based on the Spark computing framework. When the user queries one or more items of needed business data through the business reporting system, the query is actually completed through the interaction between the business reporting system and the analysis server. In practical application, the analysis server may query the report database to obtain the configuration information and then query the required service data according to that information; that is, the analysis server based on the Spark computing framework obtains the configuration information of at least one item of service data required by the service report system by querying the report database.
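The configuration lookup described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it uses Python's standard `sqlite3` module in place of a real report database, and the table name `report_config` and its columns (`data_name`, `service_db`, `query_sql`, `report_table`) are hypothetical names chosen for the example.

```python
import sqlite3
from typing import Dict, List


def load_config(report_db: sqlite3.Connection) -> List[Dict[str, str]]:
    """Return one dict per configured item of service data.

    The table and column names are assumptions made for this sketch.
    """
    cursor = report_db.execute(
        "SELECT data_name, service_db, query_sql, report_table "
        "FROM report_config ORDER BY data_name"
    )
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Demo: pretend the reporting system has already saved the user's configuration.
report_db = sqlite3.connect(":memory:")
report_db.execute(
    "CREATE TABLE report_config ("
    "data_name TEXT PRIMARY KEY, service_db TEXT, "
    "query_sql TEXT, report_table TEXT)"
)
report_db.executemany(
    "INSERT INTO report_config VALUES (?, ?, ?, ?)",
    [
        ("data_b", "service_db_2", "SELECT * FROM wages", "report_2"),
        ("data_a", "service_db_1", "SELECT * FROM purchases", "report_1"),
    ],
)
report_db.commit()

config = load_config(report_db)
for entry in config:
    print(entry["data_name"], "->", entry["service_db"])
```

Keeping the configuration in the report database means the analysis side only needs read access to one well-known table in order to discover what to query.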
Step S120, determining the service databases corresponding to the service data according to the configuration information, and respectively obtaining the corresponding service data from the service databases.
Specifically, after the analysis server based on the Spark computing framework obtains the configuration information, it can determine the service database corresponding to each of the one or more items of service data according to that information. For example, if the service data are data A, data B and data C, held in service database 1, service database 2 and service database 3 respectively, the analysis server determines from the configuration information that data A corresponds to service database 1, data B to service database 2, and data C to service database 3.
Specifically, after the service database corresponding to each item of service data has been determined according to the configuration information, data A may be obtained from service database 1, data B from service database 2, and data C from service database 3; that is, the corresponding service data is obtained from each service database. Data A, data B and data C can be obtained simultaneously: each item of service data is queried in parallel from its own service database, which greatly improves query efficiency. In addition, introducing the service database greatly reduces the coupling between the analysis server and the terminal device, so that an abnormality of the analysis server does not affect the normal use of the other functions of the terminal device.
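The parallel retrieval of data A, B and C can be illustrated with a small sketch. It is an assumption-laden stand-in: a thread pool and three temporary SQLite files play the role of the service databases, and the names `fetch`, `fetch_all_parallel` and the table `t` are hypothetical. In a Spark deployment the per-source reads would more likely go through Spark's JDBC data source; the thread pool here only illustrates the concurrency.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def fetch(db_path, sql):
    """Run one query against one service database."""
    # Each worker opens its own connection, because a sqlite3 connection
    # should not be shared across threads by default.
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


def fetch_all_parallel(jobs):
    """jobs maps a data name to (db_path, sql); returns data name -> rows."""
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        futures = {
            name: pool.submit(fetch, path, sql)
            for name, (path, sql) in jobs.items()
        }
        return {name: future.result() for name, future in futures.items()}


# Demo: three throwaway service databases holding data A, B and C.
workdir = tempfile.mkdtemp()
jobs = {}
for i, name in enumerate(["A", "B", "C"], start=1):
    path = os.path.join(workdir, f"service_db_{i}.sqlite")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE t (name TEXT, amount INTEGER)")
    conn.execute("INSERT INTO t VALUES (?, ?)", (name, i * 10))
    conn.commit()
    conn.close()
    jobs[name] = (path, "SELECT name, amount FROM t")

results = fetch_all_parallel(jobs)
print(results)
```

Because the three queries are independent, the total wait is bounded by the slowest source rather than by the sum of all three.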
Step S130, performing corresponding data processing on each obtained service data, and storing each processing result into the report database according to the configuration information, so that the service report system obtains each processing result by querying the report database.
Specifically, after the corresponding service data is respectively obtained from each service database, the obtained service data may be respectively subjected to corresponding data processing to obtain corresponding processing results, and then the processing results are respectively stored in the report database according to the configuration information, so that the service report system may obtain the processing results by querying the report database. After the business report system obtains each processing result, each processing result can be displayed to the user.
In the Spark computing framework-based data analysis method provided by the embodiment of the application, the analysis server based on the Spark computing framework obtains the configuration information of at least one item of service data required by the service report system by querying the report database, determines the service database corresponding to each item of service data according to the configuration information, and obtains the corresponding service data from each service database. The analysis server can therefore query the service databases in parallel, according to the configuration information in the report database, and obtain all of the service data at the same time, which greatly improves query efficiency. Introducing the service database also greatly reduces the coupling between the analysis server and the query terminal, so that an abnormality of the analysis server does not affect the normal use of the other functions of the query terminal. The obtained service data are each subjected to corresponding data processing, and the processing results are stored in the report database according to the configuration information, so that the query terminal can efficiently obtain the corresponding analysis data by querying the report database. In addition, the analysis server based on the Spark computing framework supports distributed deployment, clustered deployment and the like, and therefore scales well.
In a possible implementation manner, the configuration information of the at least one business data required by the business reporting system includes, but is not limited to: at least one service database corresponding to the service data respectively; at least one query statement corresponding to the service data respectively; and at least one business report corresponding to the business data respectively.
Specifically, when the configuration information includes the service database corresponding to each item of service data (for example, data A corresponds to service database 1, data B to service database 2, and data C to service database 3), the analysis server based on the Spark computing framework can, according to the configuration information, directly obtain data A from service database 1, data B from service database 2, and data C from service database 3.
Specifically, when the configuration information includes the query statement corresponding to each item of business data (for example, the query statement for data A is SQL1, for data B is SQL2, and for data C is SQL3), the analysis server based on the Spark computing framework can, according to the configuration information, directly obtain data A from business database 1 using SQL1, data B from business database 2 using SQL2, and data C from business database 3 using SQL3. In other words, in the process of determining the service database corresponding to each item of service data according to the configuration information and obtaining the corresponding service data from each service database, the service data may be obtained from its service database according to its corresponding query statement.
Specifically, when the configuration information includes the service report corresponding to each item of service data (for example, data A corresponds to service report 1, data B to service report 2, and data C to service report 3), then after the analysis server based on the Spark computing framework obtains data A, data B and data C, it can store data A in service report 1, data B in service report 2, and data C in service report 3.
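One way to picture the three configuration items for data A, B and C is as a small record type. The sketch below is purely illustrative: the class `ServiceDataConfig`, its field names and the `route` helper are hypothetical, and `SQL1` to `SQL3` are the placeholder statement names used in the text rather than real SQL.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ServiceDataConfig:
    """Configuration for one item of service data (hypothetical schema)."""
    data_name: str
    service_db: Optional[str] = None  # where the data is read from
    query_sql: Optional[str] = None   # statement used to read it
    report: Optional[str] = None      # business report that receives the result

    def has_any_item(self) -> bool:
        # Assumption for this sketch: a record is usable only if at least
        # one of the three items has been configured.
        return any(v is not None for v in (self.service_db, self.query_sql, self.report))


CONFIG = {
    c.data_name: c
    for c in (
        ServiceDataConfig("A", "service_db_1", "SQL1", "report_1"),
        ServiceDataConfig("B", "service_db_2", "SQL2", "report_2"),
        ServiceDataConfig("C", "service_db_3", "SQL3", "report_3"),
    )
}


def route(data_name: str) -> Tuple[Optional[str], Optional[str], Optional[str]]:
    """Look up where to read a data item, how to read it, and where to store it."""
    c = CONFIG[data_name]
    return c.service_db, c.query_sql, c.report


print(route("B"))
```

Keeping source, statement and destination in one record lets the analysis program treat every data item uniformly.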
Specifically, after the service database corresponding to each item of service data has been determined according to the configuration information and the corresponding service data has been obtained from each service database, corresponding data processing may be performed on the obtained service data. For example, data cleaning may be performed to filter out redundant service data; a set operation of at least one of intersection and union may be performed; a mathematical operation of at least one of addition, subtraction, multiplication and division may be performed; or the obtained service data may be summarized according to data category.
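The four kinds of processing listed above can be shown on toy data. The sketch below uses plain Python collections rather than Spark, and it makes two assumptions that the text does not state: "redundant" data is taken to mean exact duplicate rows plus rows with a missing value, and each row is a `(category, amount)` pair. The sample figures are invented for illustration.

```python
from collections import defaultdict


def clean(rows):
    """Drop duplicate rows and rows with a missing value, preserving order."""
    seen, out = set(), []
    for row in rows:
        if None in row or row in seen:
            continue
        seen.add(row)
        out.append(row)
    return out


def summarize_by_category(rows):
    """Sum the amount of every row per category."""
    totals = defaultdict(int)
    for category, amount in rows:
        totals[category] += amount
    return dict(totals)


# Invented sample data: (category, amount) pairs from two sources.
data_a = [("drugs", 100), ("drugs", 100), ("equipment", 40), ("wages", None)]
data_b = [("equipment", 40), ("outpatient", 70)]

a, b = clean(data_a), clean(data_b)   # data cleaning
common = set(a) & set(b)              # intersection of the two data sets
combined = set(a) | set(b)            # union of the two data sets
totals = summarize_by_category(sorted(combined))  # summary by category

# Simple arithmetic on the summarized values (addition and subtraction).
spending = totals["drugs"] + totals["equipment"]
net = totals["outpatient"] - spending

print(a, common, totals, net)
```

In an actual Spark job the same steps would typically be expressed as DataFrame or SQL operations; the logic, not the API, is what this sketch is meant to convey.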
Specifically, after performing corresponding data processing on each obtained service data, a service report corresponding to each service data may be determined according to the configuration information, and then each processing result corresponding to each service data is stored in a corresponding service report, where the service report is located in a report database, so that the service report system obtains each processing result by querying the report database.
Specifically, fig. 2 shows an application process of the spark calculation framework-based data analysis method implemented in the present application, which specifically includes the following processes:
Step A1, the user configures, in the business report system, configuration information such as the business database, query statement and business report corresponding to one or more items of needed business data, and the business report system stores the configuration information into the report database. That is, information such as the database, the report information and the query statement configured in the business report system is saved to the report database.
Step A2, the analysis server based on the Spark computing framework calls its internal preset analysis program to determine, according to the configuration information, the service database corresponding to each item of service data and to obtain the corresponding service data from each service database. The analysis server based on the Spark computing framework may be a single server or a cluster of servers.
Step A3, the analysis server based on the Spark computing framework retrieves the configuration information by querying the report database. The retrieved content includes: (1) the business report that needs to be analyzed; (2) the database corresponding to the business report and the corresponding SQL query statement.
Step A4, for each business report configured in the report database, the analysis program of the analysis server based on the Spark computing framework executes the corresponding SQL query statement in the business database corresponding to each item of business data to obtain that data. There may be one or more business databases; where there are several, multiple data queries are executed to obtain the corresponding query results.
Step A5, the analysis program of the analysis server based on the Spark computing framework performs data analysis on each item of obtained service data according to service requirements to obtain the corresponding analysis results, and stores each analysis result in the report database.
Step A6, the business reporting system queries the analysis results in the report database and displays them on the corresponding report page of the business reporting system.
The main processing flow of the analysis program comprises the following steps:
Step B1: retrieve the configuration information. The configuration information of the business data is obtained by querying the report database. It includes the report to be analyzed, whose data fields include the business database corresponding to the business data, the SQL query statement corresponding to the business data, the analysis result report, and the like.
Step B2: access the corresponding service database according to the configuration information, and execute the configured SQL query statement in that service database to obtain the corresponding service data.
Step B3: apply logic such as intersection, union, calculation and grouped summarization to the obtained service data according to service requirements to obtain the analyzed data result.
Step B4: store the analyzed data result into the analysis result report of the report database, providing the data that the business report system queries.
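Steps B1 to B4 can be strung together in a compact sketch. As before, this is an illustration under stated assumptions rather than the patented analysis program: in-memory SQLite databases stand in for the report and service databases, the function `run_analysis`, the table `report_config`, the database name `finance` and the report name `purchase_report` are all hypothetical, and the only analysis performed in B3 is a sum per category.

```python
import sqlite3
from collections import defaultdict


def run_analysis(report_db, service_dbs):
    """Run steps B1-B4 once; service_dbs maps a database name to a connection."""
    # B1: retrieve the configuration from the report database.
    configs = report_db.execute(
        "SELECT service_db, query_sql, report_table FROM report_config"
    ).fetchall()
    for service_db, query_sql, report_table in configs:
        # The table name is interpolated into SQL below, so accept only
        # plain identifiers in this sketch.
        if not report_table.isidentifier():
            raise ValueError(f"unsafe report table name: {report_table!r}")
        # B2: execute the configured SQL in the configured service database.
        rows = service_dbs[service_db].execute(query_sql).fetchall()
        # B3: grouped summarization (sum of amount per category).
        totals = defaultdict(float)
        for category, amount in rows:
            totals[category] += amount
        # B4: store the result in the analysis result report.
        report_db.execute(
            f'CREATE TABLE IF NOT EXISTS "{report_table}" '
            "(category TEXT PRIMARY KEY, total REAL)"
        )
        report_db.executemany(
            f'INSERT OR REPLACE INTO "{report_table}" VALUES (?, ?)',
            sorted(totals.items()),
        )
    report_db.commit()


# Demo: one service database and one configured report.
finance = sqlite3.connect(":memory:")
finance.execute("CREATE TABLE purchases (category TEXT, amount REAL)")
finance.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [("drugs", 100.0), ("drugs", 50.0), ("equipment", 40.0)],
)

report_db = sqlite3.connect(":memory:")
report_db.execute(
    "CREATE TABLE report_config (service_db TEXT, query_sql TEXT, report_table TEXT)"
)
report_db.execute(
    "INSERT INTO report_config VALUES (?, ?, ?)",
    ("finance", "SELECT category, amount FROM purchases", "purchase_report"),
)

run_analysis(report_db, {"finance": finance})

# The reporting system would then read only from the report database.
report_rows = report_db.execute(
    "SELECT category, total FROM purchase_report ORDER BY category"
).fetchall()
print(report_rows)
```

The reporting side never touches the service databases: it reads the precomputed rows from the report database, which is the decoupling the description emphasizes.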
Fig. 3 is a schematic structural diagram of a data analysis apparatus based on a spark calculation framework according to another embodiment of the present application, as shown in fig. 3, the apparatus 30 may include an obtaining module 31, a first processing module 32, and a second processing module 33, where:
an obtaining module 31, configured to obtain configuration information of at least one service data required by the service reporting system by querying a reporting database through an analysis server based on a spark calculation framework;
the first processing module 32 is configured to determine, according to the configuration information, service databases corresponding to the service data, and obtain the corresponding service data from the service databases;
the second processing module 33 is configured to perform corresponding data processing on each obtained service data, and store each processing result into the report database according to the configuration information, so that the service report system obtains each processing result by querying the report database.
In the device provided by the embodiment of the application, the analysis server based on the Spark computing framework obtains the configuration information of at least one item of service data required by the service report system by querying the report database, determines the service database corresponding to each item of service data according to the configuration information, and obtains the corresponding service data from each service database. The analysis server can therefore query the service databases in parallel, according to the configuration information in the report database, and obtain all of the service data at the same time, which greatly improves query efficiency. Introducing the service database also greatly reduces the coupling between the analysis server and the query terminal, so that an abnormality of the analysis server does not affect the normal use of the other functions of the query terminal. The obtained service data are each subjected to corresponding data processing, and the processing results are stored in the report database according to the configuration information, so that the query terminal can efficiently obtain the corresponding analysis data by querying the report database. In addition, the analysis server based on the Spark computing framework supports distributed deployment, clustered deployment and the like, and therefore scales well.
Specifically, the configuration information of at least one service data required by the service reporting system includes at least one of the following items:
a service database corresponding to each of the at least one service data;
a query statement corresponding to each of the at least one service data;
and a service report corresponding to each of the at least one service data.
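One possible in-memory shape for a single configuration entry is sketched below. The class name `ServiceDataConfig`, the field names and the helper `load_configs` are hypothetical; the only rule taken from the text is that an entry carries at least one of the three listed items.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Mapping, Optional


@dataclass(frozen=True)
class ServiceDataConfig:
    """Configuration for one service data item (all field names are illustrative)."""

    data_name: str
    service_db: Optional[str] = None   # which service database holds the data
    query_sql: Optional[str] = None    # query statement used to fetch the data
    report_name: Optional[str] = None  # service report that receives the result

    def __post_init__(self):
        # The text requires at least one of the three items to be present.
        if not any((self.service_db, self.query_sql, self.report_name)):
            raise ValueError(
                f"config for {self.data_name!r} needs a service database, "
                "a query statement, or a service report"
            )


def load_configs(rows: Iterable[Mapping[str, str]]) -> Dict[str, ServiceDataConfig]:
    """Build a lookup from service data name to its configuration entry."""
    return {row["data_name"]: ServiceDataConfig(**row) for row in rows}
```

Keying the lookup by service data name lets later steps (fetching, processing, storing) find the database, the query statement and the target report for each item without re-reading the report database.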
Specifically, the first processing module 32 is specifically configured to obtain, according to the query statement corresponding to each service data, corresponding service data from the service database corresponding to each service data.
Specifically, the second processing module 33 is specifically configured to perform at least one of the following:
performing data cleaning on each acquired service data to filter redundant service data;
performing a set operation, comprising at least one of an intersection operation and a union operation, on the acquired service data;
performing a mathematical operation, comprising at least one of addition, subtraction, multiplication and division, on the acquired service data;
and summarizing the acquired service data by data type.
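The four kinds of processing listed above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the processing defined by the application: records are modelled as hashable tuples, "redundant" data is taken to mean exact duplicates, and all function names are invented for the example. In a Spark job the same steps would more likely be expressed with DataFrame operations.

```python
import operator
from collections import defaultdict


def clean(records):
    """Data cleaning: drop exact duplicate records, keeping first-seen order."""
    seen = set()
    kept = []
    for record in records:
        if record not in seen:
            seen.add(record)
            kept.append(record)
    return kept


def intersect(left, right):
    """Set operation: values present in both inputs, sorted for stable output."""
    return sorted(set(left) & set(right))


def union(left, right):
    """Set operation: values present in either input, sorted for stable output."""
    return sorted(set(left) | set(right))


# Mathematical operations, selected by name.
ARITHMETIC = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,
}


def combine(op_name, x, y):
    """Apply one of the four arithmetic operations to two values."""
    return ARITHMETIC[op_name](x, y)


def summarize_by_type(records):
    """Summarize (data_type, value) pairs into a total per data type."""
    totals = defaultdict(float)
    for data_type, value in records:
        totals[data_type] += value
    return dict(totals)
```

For example, `intersect` applied to the user IDs returned by two service databases yields the users known to both, and `summarize_by_type` turns a list of typed amounts into one total per type.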
Specifically, the second processing module 33 includes a determining submodule 331 and a storage submodule 332, where:
the determining submodule 331 is configured to determine, according to the configuration information, the service report corresponding to each service data;
the storage submodule 332 is configured to store each processing result corresponding to each service data into the corresponding service report, where the service report is located in the report database.
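The division of work between the two submodules can be sketched as two small functions. The sketch assumes that each service report is a table in a SQLite report database and that the configuration has already been loaded into a dictionary mapping each service data name to a report name; the function names, the table layout and the sample names are hypothetical.

```python
import re
import sqlite3

# Table names cannot be bound as SQL parameters, so they are validated first.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def determine_report(report_by_data, data_name):
    """Determining step: look up the service report configured for one service data item."""
    return report_by_data[data_name]


def store_result(report_conn, report_table, data_name, value):
    """Storage step: write one processing result into its service report table."""
    if not _IDENTIFIER.fullmatch(report_table):
        raise ValueError(f"unsafe report table name: {report_table!r}")
    report_conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{report_table}" (data_name TEXT, value REAL)'
    )
    report_conn.execute(
        f'INSERT INTO "{report_table}" (data_name, value) VALUES (?, ?)',
        (data_name, value),
    )
    report_conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    table = determine_report({"orders": "daily_sales"}, "orders")
    store_result(conn, table, "orders", 15.5)
    print(conn.execute('SELECT data_name, value FROM "daily_sales"').fetchall())
    conn.close()
```

Keeping the lookup separate from the write means that a change in how reports are configured affects only the determining step.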
It should be noted that this embodiment is an apparatus embodiment corresponding to the method embodiment described above, and can be implemented in cooperation with it. The technical details described in the method embodiment remain valid in this embodiment and, to reduce repetition, are not described again here. Correspondingly, the technical details described in this embodiment also apply to the method embodiment described above.
Another embodiment of the present application provides an electronic device. As shown in fig. 5, the electronic device 500 includes a processor 501 and a memory 503, where the processor 501 is coupled to the memory 503, for example via the bus 502. The electronic device 500 may further include a transceiver 504. It should be noted that, in practical applications, the number of transceivers 504 is not limited to one, and the structure of the electronic device 500 does not constitute a limitation on the embodiments of the present application.
In this embodiment of the application, the processor 501 is used to implement the functions of the obtaining module, the first processing module, and the second processing module shown in fig. 3 and fig. 4.
The processor 501 may be a CPU, a general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof, and may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of the present application. The processor 501 may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
The memory 503 may be, but is not limited to, a ROM or other type of static storage device that can store static information and instructions, a RAM or other type of dynamic storage device that can store information and instructions, an EEPROM, a CD-ROM or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
The memory 503 is used to store the application program code for executing the scheme of the present application, and execution is controlled by the processor 501. The processor 501 is configured to execute the application program code stored in the memory 503 to implement the actions of the data analysis apparatus based on the spark calculation framework provided by the embodiment shown in fig. 3 or fig. 4.
The electronic device provided by this embodiment of the application comprises a memory, a processor, and a computer program that is stored in the memory and can run on the processor. When the processor executes the program, the following is achieved. The analysis server based on the spark calculation framework obtains, by querying the report database, the configuration information of at least one service data required by the service report system; determines, according to the configuration information, the service database corresponding to each service data; and obtains the corresponding service data from each service database. The analysis server can therefore query the service databases in parallel, according to the configuration information in the report database, and obtain all of the service data at the same time, which greatly improves query efficiency. Introducing the service databases also greatly reduces the coupling between the analysis server and the query terminal, so that an abnormality in the analysis server does not affect the normal use of the other functions of the query terminal. Each obtained service data is then processed accordingly, and each processing result is stored in the report database according to the configuration information, so that the query terminal can efficiently obtain the corresponding analysis data by querying the report database. In addition, the analysis server based on the spark calculation framework supports distributed deployment, clustered deployment and the like, and therefore has good scalability.
An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the method shown in the above embodiment. Specifically, the analysis server based on the spark calculation framework obtains, by querying the report database, the configuration information of at least one service data required by the service report system; determines, according to the configuration information, the service database corresponding to each service data; and obtains the corresponding service data from each service database. The analysis server can therefore obtain all of the service data at the same time by querying the corresponding service databases according to the configuration information in the report database, which greatly improves query efficiency. Introducing the service databases also greatly reduces the coupling between the analysis server and the query terminal, so that an abnormality in the analysis server does not affect the normal use of the other functions of the query terminal. Each obtained service data is then processed accordingly, and each processing result is stored in the report database according to the configuration information, so that the query terminal can efficiently obtain the corresponding analysis data by querying the report database. In addition, the analysis server based on the spark calculation framework supports distributed deployment, clustered deployment and the like, and therefore has good scalability.
The computer-readable storage medium provided by the embodiment of the application is suitable for any embodiment of the method.
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages. These sub-steps or stages are not necessarily performed at the same time and may be performed at different times; nor are they necessarily performed in sequence, and they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The foregoing describes only some embodiments of the present application. It should be noted that those skilled in the art can make several improvements and refinements without departing from the principle of the present application, and these improvements and refinements shall also be regarded as falling within the protection scope of the present application.
Claims (10)
1. A data analysis method based on spark calculation framework is characterized by comprising the following steps:
an analysis server based on spark calculation framework acquires configuration information of at least one service data required by a service report system by inquiring a report database, wherein the configuration information is stored in the report database after the service report system receives the configuration information of at least one service data configured by a user;
determining service databases corresponding to the service data according to the configuration information, and respectively acquiring corresponding service data from the service databases, wherein the service data are prestored in the service database of the analysis server based on the spark calculation framework;
performing corresponding data processing on each acquired service data, and storing each processing result into the report database according to the configuration information, so that the service report system acquires each processing result by querying the report database;
the performing corresponding data processing on each obtained service data includes at least one of the following:
performing set operation of at least one item of intersection operation and union operation on each acquired service data;
and performing at least one of addition, subtraction, multiplication and division on each acquired service data.
2. The method according to claim 1, wherein the configuration information of the at least one business data required by the business reporting system comprises at least one of:
the service database corresponding to the at least one service data respectively;
a query statement corresponding to each of the at least one service data;
and a service report corresponding to each of the at least one service data.
3. The method according to claim 2, wherein the determining, according to the configuration information, service databases corresponding to the respective service data, and obtaining the respective service data from the respective service databases respectively comprises:
and acquiring corresponding service data from the service database corresponding to each service data according to the query statement corresponding to each service data.
4. The method according to claim 1, wherein the performing the corresponding data processing on each obtained service data further includes at least one of:
performing data cleaning on each acquired service data to filter redundant service data;
and summarizing the acquired business data according to the data types.
5. The method according to claim 2, wherein the storing each processing result into the report database according to the configuration information comprises:
determining a business report corresponding to each business data according to the configuration information;
and respectively storing each processing result corresponding to each business data into a corresponding business report, wherein the business report is positioned in the report database.
6. A spark calculation framework-based data analysis apparatus, comprising:
the acquisition module is used for enabling an analysis server based on the spark calculation framework to acquire, by querying a report database, configuration information of at least one service data required by a service report system, wherein the configuration information is stored in the report database after the service report system receives the configuration information of at least one service data configured by a user;
the first processing module is used for determining service databases corresponding to the service data according to the configuration information, and acquiring the corresponding service data from the service databases, wherein the service data is prestored in the service database of the analysis server based on the spark calculation framework;
the second processing module is used for respectively carrying out corresponding data processing on each acquired service data and respectively storing each processing result into the report database according to the configuration information so that the service report system acquires each processing result by inquiring the report database;
the second processing module is specifically configured to perform at least one of:
performing set operation of at least one item of intersection operation and union operation on each acquired service data;
and performing at least one of addition, subtraction, multiplication and division on each acquired service data.
7. The apparatus according to claim 6, wherein the configuration information of the at least one business data required by the business reporting system comprises at least one of:
the service database corresponding to the at least one service data respectively;
a query statement corresponding to each of the at least one service data;
and a service report corresponding to each of the at least one service data.
8. The apparatus according to claim 7, wherein the first processing module is specifically configured to obtain, according to the query statement corresponding to each service data, corresponding service data from a service database corresponding to each service data.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the spark computation framework-based data analysis method of any of claims 1-5 when executing the program.
10. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when the program is executed by a processor, the method for data analysis based on spark computing framework according to any one of claims 1 to 5 is implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910817122.2A CN110515967B (en) | 2019-08-30 | 2019-08-30 | Spark calculation framework-based data analysis method and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910817122.2A CN110515967B (en) | 2019-08-30 | 2019-08-30 | Spark calculation framework-based data analysis method and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110515967A (en) | 2019-11-29 |
CN110515967B (en) | 2020-09-08 |
Family
ID=68629003
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910817122.2A Active CN110515967B (en) | 2019-08-30 | 2019-08-30 | Spark calculation framework-based data analysis method and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110515967B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111273966B (en) * | 2020-02-20 | 2023-08-15 | 浪潮通用软件有限公司 | Welfare data processing method, device and computer readable medium |
CN113869018A (en) * | 2021-10-15 | 2021-12-31 | 创优数字科技(广东)有限公司 | Business report generation method, electronic equipment and storage medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108536778A (en) * | 2018-03-29 | 2018-09-14 | 客如云科技(成都)有限责任公司 | A kind of data application shared platform and method |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102075963B (en) * | 2009-11-25 | 2013-11-06 | 中国移动通信集团贵州有限公司 | A mobile business data acquisition analysis method and a system for the same |
CN102118446A (en) * | 2011-03-09 | 2011-07-06 | 成都四方信息技术有限公司 | WEB-based high-performance intelligent reporting system |
US11328307B2 (en) * | 2015-02-24 | 2022-05-10 | OpSec Online, Ltd. | Brand abuse monitoring system with infringement detection engine and graphical user interface |
CN105574643A (en) * | 2015-11-23 | 2016-05-11 | 江苏瑞中数据股份有限公司 | Real-time data center and big data platform fusion method for power grid |
CN107798037A (en) * | 2017-04-26 | 2018-03-13 | 平安科技(深圳)有限公司 | The acquisition methods and server of user characteristic data |
CN110413610B (en) * | 2019-06-19 | 2023-10-27 | 中国平安财产保险股份有限公司 | Method and system for improving export efficiency of business data report forms and database server |
Also Published As
Publication number | Publication date |
---|---|
CN110515967A (en) | 2019-11-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109086409B (en) | Microservice data processing method and device, electronic equipment and computer readable medium | |
JP2023062126A (en) | Data quality analysis | |
JP5298117B2 (en) | Data merging in distributed computing | |
CN110795455A (en) | Dependency relationship analysis method, electronic device, computer device and readable storage medium | |
US20080222634A1 (en) | Parallel processing for etl processes | |
CN111339073A (en) | Real-time data processing method and device, electronic equipment and readable storage medium | |
CN107273369B (en) | Table data modification method and device | |
US8839198B2 (en) | Automated analysis of composite applications | |
CN105550270B (en) | Data base query method and device | |
CN110515967B (en) | Spark calculation framework-based data analysis method and electronic equipment | |
CN113268500B (en) | Service processing method and device and electronic equipment | |
CN109783498B (en) | Data processing method and device, electronic equipment and storage medium | |
CN111666279B (en) | Query data processing method and device, electronic equipment and computer storage medium | |
CN112214505A (en) | Data synchronization method and device, computer readable storage medium and electronic equipment | |
CN114496140B (en) | Data matching method, device, equipment and medium for query conditions | |
CN109344169B (en) | Data processing method and device | |
CN112199930B (en) | Method and system for automatically generating report according to report configuration | |
CN107515916B (en) | Performance optimization method and device for data query | |
US7848909B2 (en) | Computing prediction results during an unbroken online interactive session | |
CN111045983A (en) | Nuclear power station electronic file management method and device, terminal equipment and medium | |
CN115718754A (en) | Electronic accounting archive data query method and device and electronic equipment | |
CN107688581B (en) | Data model processing method and device | |
CN112307050B (en) | Identification method and device for repeated correlation calculation and computer system | |
US8170973B2 (en) | Satisfying rules through a configuration of list processing methods | |
US20170161359A1 (en) | Pattern-driven data generator |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||