CN110781252A

CN110781252A - Intelligent data analysis visualization method based on big data

Info

Publication number: CN110781252A
Application number: CN201911071160.4A
Authority: CN
Inventors: 吴鑫坤; 张子斌
Original assignee: Anhui Data Hall Technology Co Ltd
Current assignee: Anhui Data Hall Technology Co Ltd
Priority date: 2019-11-05
Filing date: 2019-11-05
Publication date: 2020-02-11
Anticipated expiration: 2039-11-05
Also published as: CN110781252B

Abstract

The invention discloses a big data-based intelligent data analysis visualization method, which comprises the following steps: loading and importing, through a data connection module, various structured and unstructured data accessed through built-in connectors, together with all data in various relational databases and big data storage files, into a built-in SPARK platform. Through a plurality of independent functional modules, the invention flexibly selects the modules and components that different products require in order to analyze the past, monitor the present and predict the future; it connects various databases and distributed databases, supports connection and access to various big data storage platforms, and combines, searches, visualizes and analyzes data. It thereby provides customers with a one-stop storage and management service, helps them cope with rapid data growth and the risk of uncertain storage demand in business systems, and reduces customer risk while meeting the needs of business growth and change.

Description

Intelligent data analysis visualization method based on big data

Technical Field

The invention relates to the technical field of visualization platforms, in particular to a big data-based intelligent data analysis visualization method.

Background

Business intelligence uses data warehouse and data mining technology to store and manage data systematically, analyzes the data with statistical analysis tools, and produces analysis reports such as customer value evaluation, customer satisfaction evaluation, service quality evaluation, marketing effect evaluation and future market demand, thereby providing decision information for the business activities of enterprises.

With the continuous deepening of global economic integration, competition among enterprises has gradually escalated from competition on a single product to comprehensive competition in products, technologies, management, services and other aspects. Data analysis can help enterprises optimize business processes, reduce operating costs, improve customer value and uncover potential business opportunities, so that they can offer more competitive personalized products and services, improve their core competitiveness and secure a leading position in the market. However, with the continuous development of data acquisition and database technology and the large-scale informatization of enterprises, the business data accumulated by enterprises has grown explosively, which brings a series of difficulties to their data analysis work.

Enterprise business now develops very rapidly and management modes change quickly. If report design cannot keep up with business needs, reporting and business become seriously disconnected; slow report design is a common problem at present. The installation and deployment environments of current business intelligence platforms, reporting tools, traditional OLAP products, fixed-function products and design tools are complex. The volume of data inside an enterprise is huge, and years of informatization have accumulated a large amount of redundant data; for the enterprise, the most important task is to distinguish this data and use it reasonably.

Moreover, it is difficult to understand the current condition of an enterprise and predict its future development by analyzing its current performance. Enterprises have accumulated massive numbers of business reports, but these provide no effective support for decision-making: senior decision makers cannot use key performance indicators to guide decisions, report requirements change continuously, report development cycles are long, statistical definitions are inconsistent, report display forms are limited, and user-friendly reports cannot be produced.

To overcome the above drawbacks, the following technical solution is provided.

Disclosure of Invention

The invention aims to provide a big data-based intelligent data analysis visualization method which, by means of a visualization design platform, realizes data applications in decision support, financial analysis, early warning analysis, performance analysis, operation analysis and the like, can mine data quickly, brings out the value of the data, achieves intelligent decision-making and builds the competitive advantage of the enterprise;

the system provides a flexible mode of application and integration: through a plurality of independent functional modules, it flexibly selects the modules and components that different products require in order to analyze the past, monitor the present and predict the future; it connects various databases and distributed databases, supports connection and access to various big data storage platforms, and combines, searches, visualizes and analyzes data; it provides clients with a one-stop storage and management service, helps them cope with rapid data growth and the risk of uncertain storage demand in business systems, and reduces client risk while meeting the needs of business growth and change.

The technical problems to be solved by the invention are as follows:

It is difficult to understand the current condition of an enterprise and predict its future development by analyzing its current performance. Enterprises have accumulated massive numbers of business reports, but these provide no effective support for decision-making: senior decision makers cannot use key performance indicators to guide decisions, report requirements change continuously, report development cycles are long, statistical definitions are inconsistent, report display forms are limited, and user-friendly reports cannot be produced.

The purpose of the invention can be realized by the following technical scheme:

a big data-based intelligent data analysis visualization method comprises the following steps:

step one: a data connection module loads and imports various structured and unstructured data accessed through built-in connectors, together with all data in various relational databases and big data storage files, into a built-in SPARK platform, and the data is transmitted to a big data visualization module;

the data connection module supports a variety of data sources, such as traditional relational databases, columnar databases, cloud services, HADOOP, APIs, unstructured databases, EXCEL and third-party platforms, and supports multiple access modes, such as JDBC, Web Services, HTTP requests and manual import;
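The idea of several access modes feeding one platform can be sketched in a few lines. This is only an illustration under stated assumptions: an in-memory SQLite table stands in for a relational source reached over JDBC, a CSV string stands in for a manually imported file, and the names `load_relational`, `load_csv` and `load_sources` are hypothetical and do not come from the source.

```python
import csv
import io
import sqlite3


def load_relational(conn, query):
    """Read rows from a relational source as a list of dicts."""
    cursor = conn.execute(query)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def load_csv(text):
    """Read rows from delimited text (stand-in for a manual file import)."""
    return list(csv.DictReader(io.StringIO(text)))


def load_sources(loaders):
    """Run every registered loader and tag each record with its source name."""
    records = []
    for name, loader in loaders.items():
        for row in loader():
            records.append({"source": name, **row})
    return records


# Hypothetical demo data: an in-memory SQLite table and a small CSV export.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, sales REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("East", 120.0), ("West", 80.0)])
csv_text = "region,sales\nNorth,50\n"

records = load_sources({
    "orders_db": lambda: load_relational(
        conn, "SELECT region, sales FROM orders ORDER BY region"),
    "manual_csv": lambda: load_csv(csv_text),
})
```

Note that the CSV loader returns every field as text, so a later cleaning step would still need to coerce types before the sources can be analyzed together.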

the SPARK platform comprises a data center parent unit, a data analysis parent unit and a data display parent unit;

the data modeling subunit performs data extraction, data governance, data cleaning and data integration on the data sources, and transmits the results, together with the obtained tables, SQL and EXCEL data, to the data analysis parent unit;
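A minimal sketch of what a cleaning pass over imported records might look like is given here. The source does not specify any cleaning rules, so the three rules used (drop rows with missing required fields, coerce numeric fields, remove exact duplicates) and the function name `clean` are assumptions for illustration only.

```python
def clean(records, required, numeric):
    """Drop incomplete rows, coerce numeric fields, and remove exact duplicates.

    Assumes every numeric field is also listed in `required`.
    """
    seen, cleaned = set(), []
    for record in records:
        # Rule 1 (assumed): skip rows with a missing or empty required field.
        if any(record.get(name) in (None, "") for name in required):
            continue
        row = dict(record)
        # Rule 2 (assumed): numeric fields may arrive as text and are coerced.
        for name in numeric:
            row[name] = float(row[name])
        # Rule 3 (assumed): rows identical after coercion count as redundant.
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(row)
    return cleaned


# Hypothetical raw records merged from two sources.
raw = [
    {"region": "East", "sales": "120"},
    {"region": "East", "sales": 120.0},   # duplicate once coerced
    {"region": "", "sales": "50"},        # missing region
    {"region": "West", "sales": "80.5"},
]
cleaned = clean(raw, required=("region", "sales"), numeric=("sales",))
```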

the data analysis parent unit consists of a self-service query subunit, an image-text report subunit, a data mining subunit and a system management subunit. The self-service query subunit applies dimension switching, index switching, interaction, filtering, ranking calculation, summary calculation, header design, data early warning, data drilling, user-defined index, data linkage and parameter passing operations to the received data and sends the obtained data to the data center parent unit; the image-text report subunit applies user-defined report, Chinese-style report, chart linkage, graph-picture linkage, function formula, data drilling, map display, HTML (hypertext markup language) embedding, data filling, intelligent early warning, secondary development and personalized portal operations to the received data and sends the obtained data to the data center parent unit; the data mining subunit applies data cleaning, automatic model, classification model, regression model, clustering model, recommendation model, association model, statistical model, R integration extension, text mining, deep learning and analysis report operations to the received data and sends the obtained data to the data center parent unit; the system management subunit applies user management, authority management, log management, task management, personalized theme, system integration, message pushing and API opening operations to the received data and sends the obtained data to the data center parent unit;

the data display parent unit consists of a PC end, a mobile phone end, a mobile end and a large screen, all of which are used for displaying the received data;

the SPARK platform supports a data visualization platform, a data interaction platform, a multidimensional data analysis and visualization display platform, a visualization data mining platform, an artificial intelligence platform, a data center data storage platform and a big data storage and relational database storage platform; that is, it spans the analysis layer through the display and decision layer, so that ordinary users, data analysis experts, IT integrators, data mining personnel and others can all carry out big data analysis;

step two: the big data visualization module creates and arranges the received data through table, graph, chart, map and instrument panel components, obtains a high-frequency usage signal through the data analysis application operation, and places the component corresponding to the high-frequency usage signal in its display area, so as to offer the customer the most suitable choice in a professional, simple and flexible way; the big data visualization module comprises a display area, an operation area, an audio area and the like, and transmits the created and arranged data to a report design module;

step three: the report design module generates single tables from the received data, obtains a plurality of complex tables by the result splicing mode, and transmits the complex tables to the exploration association analysis module; the exploration association analysis module performs the association analysis operation on the received complex tables to obtain various associated reports and transmits them to the self-service interactive analysis module;

step four: the self-service interactive analysis module consists of a calculation formula engine unit and a memory data cube unit. The calculation formula engine unit has built-in trigonometric functions, text functions, date and time functions, logic functions, array functions and user-defined functions; its built-in analysis modes comprise year-over-year comparison, period-over-period comparison, proportion of total, cumulative total, average, variance, median and standard deviation; and its built-in sorting modes comprise sorting based on the query result, sorting based on the dimension itself, sorting based on the size of a summary index and sorting based on a formula value, so as to facilitate the display and conversion of data types and results. The memory data cube unit combines dynamic in-memory data cube technology, parallel data processing, columnar database storage, several bitmap index modes and a caching mode that avoids repeated computation, so as to guarantee the accuracy and efficiency of data query and display;

the self-service interactive analysis module aggregates and calculates the received associated reports through the calculation formula engine unit and the memory data cube unit to obtain various data sets, and transmits the data sets to the anomaly early warning module;
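The analysis modes named in step four are standard calculations, and a short sketch may make them concrete. The function names and sample figures are hypothetical, and the choice of population variance and standard deviation (rather than the sample versions) is an assumption, since the source does not say which is intended.

```python
from statistics import mean, median, pstdev, pvariance


def year_over_year(current, same_period_last_year):
    """Growth relative to the same period one year earlier."""
    return (current - same_period_last_year) / same_period_last_year


def period_over_period(current, previous_period):
    """Growth relative to the immediately preceding period."""
    return (current - previous_period) / previous_period


def shares(values):
    """Proportion of the total contributed by each value."""
    total = sum(values)
    return [value / total for value in values]


def cumulative(values):
    """Running total of the values."""
    totals, running = [], 0
    for value in values:
        running += value
        totals.append(running)
    return totals


# Hypothetical monthly sales figures.
monthly_sales = [100, 120, 90, 150]

analysis = {
    "year_over_year": year_over_year(150, 120),
    "period_over_period": period_over_period(150, 90),
    "shares": shares(monthly_sales),
    "cumulative": cumulative(monthly_sales),
    "average": mean(monthly_sales),
    "variance": pvariance(monthly_sales),
    "median": median(monthly_sales),
    "standard_deviation": pstdev(monthly_sales),
}

# Sorting by the size of a summary index (here: total sales per region).
region_totals = {"East": 180, "West": 40, "North": 95}
ranked_regions = sorted(region_totals, key=region_totals.get, reverse=True)
```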

step five: the anomaly early warning module consists of a multi-dimensional analysis unit and a drilling analysis unit. The multi-dimensional analysis unit, built on the underlying big data architecture, provides an SQL (structured query language) query interface and a multi-dimensional analysis mode on top of Hadoop and Spark and performs pre-calculation with data cube technology, so that queries run hundreds to thousands of times faster than before and querying, filtering, transposing, drilling, sorting, summarizing, early warning, charting and similar functions can easily be carried out on TB-level data; the drilling analysis unit drills into the data layer by layer and calculates and analyzes it on the basis of the whole big data architecture to obtain the corresponding underlying data, so as to find the root factors that influence the development result of the data;
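Pre-calculation with a data cube and layer-by-layer drilling can be illustrated on a tiny scale. The sketch below is an assumption-laden toy, not the platform's implementation: it pre-aggregates one measure for every subset of the dimensions in a plain dictionary, so that later queries are lookups rather than scans. The names `build_cube`, `query` and `drill_down` and the sample rows are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Pre-aggregate the measure for every subset of the dimensions."""
    cube = defaultdict(float)
    for row in rows:
        for size in range(len(dimensions) + 1):
            for dims in combinations(dimensions, size):
                cube[(dims, tuple(row[d] for d in dims))] += row[measure]
    return dict(cube)


def query(cube, dimensions, filters):
    """Look up a pre-computed total instead of scanning the rows again."""
    dims = tuple(d for d in dimensions if d in filters)
    return cube[(dims, tuple(filters[d] for d in dims))]


def drill_down(cube, dimensions, filters, next_dim):
    """Break a filtered total down by one further dimension."""
    dims = tuple(d for d in dimensions if d in filters or d == next_dim)
    breakdown = {}
    for (key_dims, key_values), total in cube.items():
        if key_dims != dims:
            continue
        cell = dict(zip(key_dims, key_values))
        if all(cell[d] == value for d, value in filters.items()):
            breakdown[cell[next_dim]] = total
    return breakdown


# Hypothetical detail rows.
DIMENSIONS = ("region", "city")
rows = [
    {"region": "East", "city": "Hefei", "sales": 100.0},
    {"region": "East", "city": "Nanjing", "sales": 60.0},
    {"region": "West", "city": "Chengdu", "sales": 40.0},
    {"region": "East", "city": "Hefei", "sales": 20.0},
]
cube = build_cube(rows, DIMENSIONS, "sales")
grand_total = query(cube, DIMENSIONS, {})
east_total = query(cube, DIMENSIONS, {"region": "East"})
east_by_city = drill_down(cube, DIMENSIONS, {"region": "East"}, "city")
```

Materializing every dimension subset grows exponentially with the number of dimensions, so a production system would normally pre-compute only selected aggregations; the toy version trades that concern for readability.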

the anomaly early warning module maps the received data sets to their regions and, through the multi-dimensional analysis unit and the drilling analysis unit, compares the underlying data of each region with a preset data interval; when the underlying data of a region lies outside the preset data interval, the region is marked with a distinct number and color and sent to the data visualization large-screen platform. The changes in the numeric and color marks display the current data characteristics intuitively and prompt and warn decision makers in time, so that they can adjust the decision scheme as early as possible;
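The interval comparison described above reduces to a simple check per region. In the sketch below, the numeric levels (-1, 0, 1), the color names and the function name `flag_regions` are assumptions; the source only states that out-of-interval regions receive distinct numbers and colors. Regions inside their interval are also given a neutral mark here, which is a presentation choice of this sketch.

```python
def flag_regions(region_values, intervals, default_interval):
    """Mark each region according to where its value lies relative to its preset interval."""
    marks = {}
    for region, value in region_values.items():
        low, high = intervals.get(region, default_interval)
        if value < low:
            marks[region] = {"value": value, "level": -1, "color": "blue"}
        elif value > high:
            marks[region] = {"value": value, "level": 1, "color": "red"}
        else:
            marks[region] = {"value": value, "level": 0, "color": "green"}
    return marks


# Hypothetical underlying data per region and preset intervals.
region_values = {"East": 180.0, "West": 40.0, "North": 95.0}
intervals = {"West": (50.0, 150.0)}
marks = flag_regions(region_values, intervals, default_interval=(60.0, 160.0))

# Only regions outside their interval would be forwarded as warnings.
alerts = [region for region, mark in marks.items() if mark["level"] != 0]
```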

step six: the data visualization large-screen platform displays the received data and creates a corresponding big data visualization large screen from the large-screen templates stored in the platform, so that business personnel can use and understand the data easily.

Further, the specific steps of the data analysis application operation are as follows:

step one: acquire the total number of times the table, graph, chart, map and instrument panel components are selected for creating and arranging data within the first time level, and denote these totals Qi, Wi, Ei, Ri and Ti respectively. When Qi, Wi, Ei, Ri and Ti are greater than the maximum of their respective preset ranges, within their respective preset ranges, or less than the minimum of their respective preset ranges, Qi is assigned the calibration value Q1, Q2 or Q3 respectively, with Q1 > Q2 > Q3; Wi is assigned W1, W2 or W3, with W1 > W2 > W3; Ei is assigned E1, E2 or E3, with E1 > E2 > E3; Ri is assigned R1, R2 or R3, with R1 > R2 > R3; and Ti is assigned T1, T2 or T3, with T1 > T2 > T3; in addition, T1 > R1 > W1 > Q1 > E1, T2 > R2 > W2 > Q2 > E2, and T3 > R3 > W3 > Q3 > E3;

step two: acquire the total working time of the table, graph, chart, map and instrument panel components in creating and arranging data within the first time level, and denote these totals Qj, Wj, Ej, Rj and Tj respectively. When Qj, Wj, Ej, Rj and Tj are greater than the maximum of their respective preset ranges, within their respective preset ranges, or less than the minimum of their respective preset ranges, Qj is assigned the calibration value Q4, Q5 or Q6 respectively, with Q4 > Q5 > Q6; Wj is assigned W4, W5 or W6, with W4 > W5 > W6; Ej is assigned E4, E5 or E6, with E4 > E5 > E6; Rj is assigned R4, R5 or R6, with R4 > R5 > R6; and Tj is assigned T4, T5 or T6, with T4 > T5 > T6; in addition, T4 = R4 = W4 = Q4 = E4, T5 = R5 = W5 = Q5 = E5, and T6 = R6 = W6 = Q6 = E6;

step three: acquire the total switching times of the table, graph, chart, map and instrument panel components in creating and arranging data within the first time level, where the total switching times of a component is the total number of times that component, as the original component, is switched to another component, and denote these totals Qk, Wk, Ek, Rk and Tk respectively. When Qk, Wk, Ek, Rk and Tk are greater than the maximum of their respective preset ranges, within their respective preset ranges, or less than the minimum of their respective preset ranges, Qk is assigned the calibration value Q7, Q8 or Q9 respectively, with Q9 > Q8 > Q7; Wk is assigned W7, W8 or W9, with W9 > W8 > W7; Ek is assigned E7, E8 or E9, with E9 > E8 > E7; Rk is assigned R7, R8 or R9, with R9 > R8 > R7; and Tk is assigned T7, T8 or T9, with T9 > T8 > T7; in addition, T7 > R7 > W7 > Q7 > E7, T8 > R8 > W8 > Q8 > E8, and T9 > R9 > W9 > Q9 > E9; the first time level represents a duration of one week;

step four: weight coefficients q, w and e are assigned to the selection values (Qi, Wi, Ei, Ri and Ti), the working-time values (Qj, Wj, Ej, Rj and Tj) and the switching values (Qk, Wk, Ek, Rk and Tk) respectively, where w > e > q and q + w + e = 3.5412; the usage coefficients Q, W, E, R and T of the table, graph, chart, map and instrument panel components within the first time level are then obtained from the formula U = Ui × q + Uj × w + Uk × e, where U = Q, W, E, R, T, and the component with the largest usage coefficient generates the high-frequency usage signal.
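The four steps above amount to a three-level calibration followed by a weighted sum. The sketch below works through them under several assumptions: the formula is read as U = Ui × q + Uj × w + Uk × e; the individual weights are invented values that merely satisfy w > e > q and q + w + e = 3.5412; the preset ranges are invented; and, for brevity, one set of calibration levels is shared by all components, whereas the text gives each component its own values. All function and variable names are hypothetical.

```python
# Assumed weights: only w > e > q and q + w + e = 3.5412 come from the text.
Q_WEIGHT, W_WEIGHT, E_WEIGHT = 0.8, 1.5412, 1.2

# Assumed preset ranges (low, high) and calibration levels
# (value above range, value within range, value below range).
SELECTION = {"range": (10, 50), "levels": (3, 2, 1)}
WORK_TIME = {"range": (2, 10), "levels": (3, 2, 1)}
# For switching, a count above the range gets the smallest calibration value.
SWITCHING = {"range": (3, 8), "levels": (1, 2, 3)}


def calibrate(value, rule):
    """Map a raw weekly total to one of three calibration values."""
    low, high = rule["range"]
    above, within, below = rule["levels"]
    if value > high:
        return above
    if value < low:
        return below
    return within


def usage_coefficient(ui, uj, uk):
    """U = Ui*q + Uj*w + Uk*e."""
    return ui * Q_WEIGHT + uj * W_WEIGHT + uk * E_WEIGHT


def high_frequency_component(stats):
    """Return the component with the largest usage coefficient, plus all coefficients."""
    coefficients = {}
    for name, (selections, hours, switches) in stats.items():
        ui = calibrate(selections, SELECTION)
        uj = calibrate(hours, WORK_TIME)
        uk = calibrate(switches, SWITCHING)
        coefficients[name] = usage_coefficient(ui, uj, uk)
    return max(coefficients, key=coefficients.get), coefficients


# Hypothetical one-week totals: (selection count, working hours, switch count).
weekly_stats = {
    "table": (60, 12, 2),
    "graph": (30, 5, 5),
    "chart": (5, 1, 10),
    "map": (55, 6, 9),
    "instrument_panel": (20, 11, 4),
}
winner, coefficients = high_frequency_component(weekly_stats)
```

With these invented figures the table component is selected often, used for a long time and rarely switched away from, so it receives the largest coefficient and would be the one placed in the display area.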

Furthermore, the result splicing mode is to select the results of various single tables, arrange them in the same area, and perform formula operations on those results and display them, to obtain a plurality of complex tables.
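One way to picture result splicing is as lining up the results of several single tables on a shared key and then computing formula columns across them. The sketch below takes that reading, which is an interpretation rather than something the text spells out; the function name `splice`, the key column and the profit formula are hypothetical, and it assumes every key appears in every single table.

```python
def splice(single_tables, key, formulas):
    """Combine single-table results on a shared key and add formula columns."""
    combined = {}
    for rows in single_tables:
        for row in rows:
            target = combined.setdefault(row[key], {key: row[key]})
            for column, value in row.items():
                if column != key:
                    target[column] = value
    complex_table = []
    for row in combined.values():
        # Formula columns are computed after all source columns are in place.
        for column, formula in formulas.items():
            row[column] = formula(row)
        complex_table.append(row)
    return complex_table


# Hypothetical single tables.
sales_table = [{"city": "Hefei", "sales": 120.0}, {"city": "Nanjing", "sales": 60.0}]
cost_table = [{"city": "Hefei", "cost": 90.0}, {"city": "Nanjing", "cost": 45.0}]

complex_table = splice(
    [sales_table, cost_table],
    key="city",
    formulas={"profit": lambda row: row["sales"] - row["cost"]},
)
```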

Furthermore, the association analysis operation takes the owner area, owner city, shipper, sales amount, quantity, shipping cost and unit price in the plurality of complex tables as association items and establishes report links to merge and associate them, obtaining various associated reports; that is, different complex tables are associated so that a user can jump from one report to another. This conversion between reports makes pivot analysis from summary data down to detail data convenient, and parameters can be passed between the associated reports to form an analysis flow.
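Report links with parameter passing can be sketched as follows. This is an illustration under assumptions: two reports are linked whenever they share one of the association items as a column, and following a link filters the target report by the value that was selected in the source report. The `Report` class, the function names and the sample rows are hypothetical.

```python
from dataclasses import dataclass, field

# Association items named in the text, written as hypothetical column names.
ASSOCIATION_ITEMS = ["owner_area", "owner_city", "shipper", "sales_amount",
                     "quantity", "shipping_cost", "unit_price"]


@dataclass
class Report:
    name: str
    rows: list
    links: dict = field(default_factory=dict)  # association item -> target report name


def columns(report):
    """Column names of a report, taken from its first row."""
    return set(report.rows[0]) if report.rows else set()


def link_reports(reports, association_items):
    """Link every pair of reports that share an association item."""
    for source in reports.values():
        for target in reports.values():
            if source is target:
                continue
            for item in association_items:
                if item in columns(source) and item in columns(target):
                    source.links.setdefault(item, target.name)


def follow_link(reports, source_name, item, value):
    """Jump to the linked report, passing the selected value as a filter parameter."""
    target = reports[reports[source_name].links[item]]
    return [row for row in target.rows if row[item] == value]


reports = {
    "summary": Report("summary", [
        {"owner_area": "East", "sales_amount": 180.0},
        {"owner_area": "West", "sales_amount": 40.0},
    ]),
    "detail": Report("detail", [
        {"owner_area": "East", "owner_city": "Hefei", "sales_amount": 120.0},
        {"owner_area": "East", "owner_city": "Nanjing", "sales_amount": 60.0},
        {"owner_area": "West", "owner_city": "Chengdu", "sales_amount": 40.0},
    ]),
}
link_reports(reports, ASSOCIATION_ITEMS)

# Drill from the summary row for "East" to the matching detail rows.
east_detail = follow_link(reports, "summary", "owner_area", "East")
```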

The invention has the beneficial effects that:

by means of a visualization design platform, the invention realizes data applications in decision support, financial analysis, early warning analysis, performance analysis, operation analysis and the like, can mine data quickly, brings out the value of the data, achieves intelligent decision-making and builds the competitive advantage of the enterprise;

the system provides a flexible mode of application and integration: through a plurality of independent functional modules, it flexibly selects the modules and components that different products require in order to analyze the past, monitor the present and predict the future; it connects various databases and distributed databases, supports connection and access to various big data storage platforms, and combines, searches, visualizes and analyzes data; it provides clients with a one-stop storage and management service, helps them cope easily with rapid data growth and the risk of uncertain storage demand in business systems, and reduces client risk while meeting the needs of business growth and change;

the invention analyzes, creates and arranges each data source to obtain corresponding processed data and performs the data analysis application operation on it: the total selection times, total working time and total switching times of the table, graph, chart, map and instrument panel components used to create and arrange data are calibrated, weighted and compared to obtain a high-frequency usage signal, and the component corresponding to that signal is placed in the display area, so as to offer the customer the most suitable choice in a professional, simple and flexible way. At the same time, the processed data is generated into single tables, a plurality of complex tables is obtained by the result splicing mode, and the items in the complex tables are merged and associated according to actual requirements to obtain various associated reports; that is, different complex tables are associated so that a user can jump from one report to another. This conversion between reports makes pivot analysis from summary data down to detail data convenient, and parameters can be passed between associated reports to form an analysis flow;

data aggregation and calculation are then carried out through formula analysis and storage distribution to obtain various data sets, and multi-dimensional analysis and drilling analysis are performed on the data sets to find, on the basis of the overall big data architecture, the underlying root factors that influence their development results. The current data characteristics are displayed intuitively through changes in the numeric and color marks, and decision makers are prompted and warned in time so that they can adjust the decision scheme as early as possible. The data is then displayed, and a corresponding big data visualization large screen is created from the large-screen templates stored in the platform, so that business personnel can use and understand the data easily.

Drawings

In order to facilitate understanding for those skilled in the art, the present invention will be further described with reference to the accompanying drawings;

FIG. 1 is a block flow diagram of the present invention;

FIG. 2 is a schematic diagram of the internal elements of the SPARK platform of the present invention;

FIG. 3 is a schematic diagram of the SPARK platform support platform of the present invention.

Detailed Description

As shown in FIGS. 1 to 3, a big data-based intelligent data analysis visualization method includes the following steps:

step two: the big data visualization module creates and arranges the received data through table, graph, chart, map and instrument panel components and performs the data analysis application operation, the specific steps of which are as follows:

step four: weight coefficients q, w and e are first assigned to the selection values (Qi, Wi, Ei, Ri and Ti), the working-time values (Qj, Wj, Ej, Rj and Tj) and the switching values (Qk, Wk, Ek, Rk and Tk) respectively, where w > e > q and q + w + e = 3.5412; the usage coefficients Q, W, E, R and T of the table, graph, chart, map and instrument panel components within the first time level are then obtained from the formula U = Ui × q + Uj × w + Uk × e, where U = Q, W, E, R, T, and the component with the largest usage coefficient generates the high-frequency usage signal;

the big data visualization module comprises a display area, an operation area, an audio area and the like, and transmits the created and arranged data to a report design module;

step three: the report design module generates single tables from the received data and, by the result splicing mode, selects the results of various single tables, arranges them in the same area, and performs formula operations on those results and displays them to obtain a plurality of complex tables, which it transmits to the exploration association analysis module. The exploration association analysis module performs the association analysis operation on the received complex tables: it takes the owner area, owner city, shipper, sales amount, quantity, shipping cost and unit price in the complex tables as association items and establishes report links to merge and associate them, obtaining various associated reports; that is, different complex tables are associated so that a user can jump from one report to another. This conversion between reports makes pivot analysis from summary data down to detail data convenient, and parameters can be passed between the associated reports to form an analysis flow; the associated reports are transmitted to the self-service interactive analysis module;

In operation, the big data-based intelligent data analysis visualization method analyzes, creates and arranges each data source to obtain corresponding processed data and performs the data analysis application operation on it: the total selection times, total working time and total switching times of the table, graph, chart, map and instrument panel components used to create and arrange data are calibrated, weighted and compared to obtain a high-frequency usage signal, and the component corresponding to that signal is placed in the display area, so as to offer the customer the most suitable choice in a professional, simple and flexible way. At the same time, the processed data is generated into single tables, a plurality of complex tables is obtained by the result splicing mode, and the items in the complex tables are merged and associated according to actual requirements to obtain various associated reports; that is, different complex tables are associated so that a user can jump from one report to another. This conversion between reports makes pivot analysis from summary data down to detail data convenient, and parameters can be passed between the associated reports to form an analysis flow.

The foregoing is merely exemplary and illustrative of the present invention and various modifications, additions and substitutions may be made by those skilled in the art to the specific embodiments described without departing from the scope of the invention as defined in the following claims.

Claims

1. A data intelligent analysis visualization method based on big data is characterized by comprising the following steps:

step two: the big data visualization module creates and arranges the received data through table, graph, chart, map and instrument panel components, obtains a high-frequency usage signal through the data analysis application operation, places the component corresponding to the high-frequency usage signal in a display area of the big data visualization module, and transmits the created and arranged data to the report design module;

step four: the self-service interactive analysis module consists of a calculation formula engine unit and a memory data cube unit; the calculation formula engine unit has built-in trigonometric functions, text functions, date and time functions, logic functions, array functions and user-defined functions, its built-in analysis modes comprise year-over-year comparison, period-over-period comparison, proportion of total, cumulative total, average, variance, median and standard deviation, and its built-in sorting modes comprise sorting based on the query result, sorting based on the dimension itself, sorting based on the size of a summary index and sorting based on a formula value; the memory data cube unit combines dynamic in-memory data cube technology, parallel data processing, columnar database storage, several bitmap index modes and a caching mode that avoids repeated computation;

step five: the anomaly early warning module consists of a multi-dimensional analysis unit and a drilling analysis unit, wherein the multi-dimensional analysis unit provides an SQL (structured query language) query interface and a multi-dimensional analysis mode above Hadoop and Spark on the basis of a bottom big data architecture and performs pre-calculation by combining a data cube technology; the drilling analysis unit is used for drilling data layer by layer and calculating and analyzing the data on the basis of the whole big data architecture to obtain corresponding bottom data;

the anomaly early warning module maps the received data sets to their regions, compares the underlying data of each region with a preset data interval through the multi-dimensional analysis unit and the drilling analysis unit, marks the region with a distinct number and color when its underlying data lies outside the preset data interval, and sends the region to the data visualization large-screen platform;

step six: the data visualization large-screen platform displays the received data and creates a corresponding big data visualization large screen from the large-screen templates stored in the platform.

2. The intelligent analysis and visualization method for big data according to claim 1, wherein the specific steps of the data analysis application operation are as follows:

step three: acquire the total switching times of the table, graph, chart, map and instrument panel components in creating and arranging data within the first time level, and denote these totals Qk, Wk, Ek, Rk and Tk respectively; when Qk, Wk, Ek, Rk and Tk are greater than the maximum of their respective preset ranges, within their respective preset ranges, or less than the minimum of their respective preset ranges, Qk is assigned the calibration value Q7, Q8 or Q9 respectively, with Q9 > Q8 > Q7; Wk is assigned W7, W8 or W9, with W9 > W8 > W7; Ek is assigned E7, E8 or E9, with E9 > E8 > E7; Rk is assigned R7, R8 or R9, with R9 > R8 > R7; and Tk is assigned T7, T8 or T9, with T9 > T8 > T7; the first time level represents a duration of one week;

3. The intelligent data analysis visualization method based on big data as claimed in claim 1, wherein the result splicing mode is to select results of various single tables to be placed in the same area, and to perform formula operation and display based on the results to obtain multiple complex tables.

4. The intelligent data analysis visualization method based on big data as claimed in claim 1, wherein the correlation analysis operation is to use the owner area, owner city, shipper, sales amount, quantity, shipping cost and unit price in multiple complex tables as correlation items, and establish report links to perform merging and correlation to obtain various correlation reports.