CN112765254B - Solution method and system for realizing data visualization and data integration ETL - Google Patents


Publication number: CN112765254B
Authority: CN (China)
Prior art keywords: data, analyzed, target, service data, server
Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN202110374939.4A
Other languages: Chinese (zh)
Other versions: CN112765254A (en)
Inventors: 金震, 张京日, 徐伟
Current Assignee: Beijing SunwayWorld Science and Technology Co Ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Beijing SunwayWorld Science and Technology Co Ltd
Application filed by Beijing SunwayWorld Science and Technology Co Ltd
Priority to CN202110374939.4A
Publication of CN112765254A
Application granted
Publication of CN112765254B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G06F16/26 Visual data mining; Browsing structured data

Abstract

The invention discloses a solution method and system for realizing data visualization and data integration ETL. The method comprises: acquiring the business data to be analyzed uploaded by a target user and creating a data connection according to that data; confirming whether integration processing needs to be performed on the business data to be analyzed and, if so, performing ETL processing on it to obtain processed business data to be analyzed; replacing the business data to be analyzed in the data connection with the processed business data to be analyzed; creating a data analysis model according to the data connection and drawing a data report based on the processed business data to be analyzed using the created model; and sharing the data report and publishing it to a cockpit. The resulting data report is more diversified and accurate, so enterprise leaders and employees can make decisions quickly and intuitively from it, which improves their user experience.

Description

Solution method and system for realizing data visualization and data integration ETL
Technical Field
The invention relates to the technical field of data analysis, in particular to a solution method and a system for realizing data visualization and data integration ETL.
Background
At present, many enterprises still rely on past personal experience rather than data when making decisions, which causes many problems in actual decision making. In countries with a well-developed data analysis industry, 90% of market and business decisions are determined by data analysis research. Letting the data speak, and the importance of quantitative analysis, are gradually becoming key considerations in scientific research, enterprise operation, decision making and the like. More and more people are aware of the importance of data analysis to economic development.
Data analysis supports the in-depth development and use of data. Its purpose is to bring more business value to enterprises, help them avoid or reduce losses caused by risk, improve data quality, and solve their problems. Prior-art data analysis methods perform association modeling on existing data and then draw a data report for staff to consult, but they have the following disadvantage: large and complex data to be analyzed can only undergo simple association processing and cannot be processed more deeply, so the resulting data report is too simple to serve as a useful reference, which seriously degrades the experience of the staff.
Disclosure of Invention
In view of the above problems, the invention provides a solution method and system for implementing data visualization and data integration ETL. It addresses the problem described in the Background: large and complex data to be analyzed can only undergo simple association processing and cannot be processed more deeply, so the resulting data report is too simple to serve as a useful reference, which seriously degrades the experience of the staff.
A solution for realizing data visualization and data integration ETL comprises the following steps:
acquiring to-be-analyzed business data uploaded by a target user, and establishing data connection according to the to-be-analyzed business data;
determining whether integration processing needs to be carried out on the service data to be analyzed, if so, carrying out ETL processing on the service data to be analyzed to obtain processed service data to be analyzed;
replacing the service data to be analyzed in the data connection with the processed service data to be analyzed;
creating a data analysis model according to the data connection, and drawing a data report based on the processed business data to be analyzed by using the created data analysis model;
and sharing the data report and publishing it to the cockpit.
Preferably, the acquiring the service data to be analyzed uploaded by the target user, and creating a data connection according to the service data to be analyzed includes:
receiving service data to be analyzed uploaded by the target user, and preprocessing the service data to be analyzed to obtain preprocessed service data to be analyzed;
determining the data type of the preprocessed service data to be analyzed;
and matching database tables according to the data types to obtain matched database tables corresponding to the data types, and constructing data connection between the matched database tables.
Preferably, the determining whether integration processing needs to be performed on the service data to be analyzed, and if so, performing ETL processing on the service data to be analyzed to obtain processed service data to be analyzed includes:
sending, to the terminal where the target user is located, a request asking whether integration processing is needed, and receiving a feedback instruction sent by the terminal;
when the feedback instruction confirms that integration processing is required, displaying a plurality of ETL processing components on the terminal, and acquiring the target component selected by the target user;
selecting a proper target data connection based on the target component, inputting the service data to be analyzed into the target data connection in a specific format for ETL processing, and obtaining the processed service data to be analyzed;
and outputting the processed service data to be analyzed in the specific format.
Preferably, the plurality of components includes: a set component, an aggregation component, an expression component, a connection component, a filtering component, a ranking component, and a desensitization component;
the specific format includes: table format and Excel format.
Preferably, the creating a data analysis model according to the data connection, and drawing a data report based on the processed service data to be analyzed by using the created data analysis model includes:
importing the processed business data to be analyzed in the data connection, and detecting the processed business data to be analyzed to obtain detection information;
acquiring attribute factors in the detection information, and mapping the attribute factors to a preset initial data model to obtain a target data analysis model corresponding to the processed to-be-analyzed service data;
performing data analysis on the processed business data to be analyzed by using the target data analysis model to obtain an analysis result;
and creating a report center, and drawing a data report based on the processed business data to be analyzed in the report center according to the analysis result.
Preferably, the sharing and publishing the data report to the cockpit includes:
generating an exclusive sharing link for the data report, and setting the access permission of the exclusive sharing link so that anyone can access it;
transmitting the exclusive sharing link to the terminal where the target user is located;
connecting the data report to the cockpit, and importing the data report into the cockpit once the connection is established;
and after the import is finished, detecting the integrity of the imported data report: when the integrity is 100%, no subsequent operation is needed, and when the integrity is less than 100%, importing the data report into the cockpit again to replace the previously imported data report.
Preferably, the method further comprises:
storing the data report as a spare template and storing the spare template in a template library;
and encrypting the template library, acquiring an encryption key, and forwarding the encryption key to the mobile phone terminal of the user with the template calling authority.
Preferably, the importing the processed service data to be analyzed in the data connection, detecting the processed service data to be analyzed, and obtaining detection information includes:
analyzing the processed service data to be analyzed to obtain target attribute information contained in the processed service data;
calculating the current matching degree of the target attribute information and preset attribute information; screening out first attribute sub-information with the current matching degree being greater than or equal to a preset matching degree;
judging abnormal information of second attribute sub-information with the current matching degree smaller than the preset matching degree to obtain an abnormal information sequence corresponding to the second attribute sub-information;
generating a specific detection signal, and analyzing the specific detection signal to obtain detection characteristic information of the specific detection signal;
generating a first characteristic information linked list according to the detection characteristic information, and combining the detection characteristic information with the same attribute in the characteristic information linked list to obtain a second characteristic information linked list;
generating a test data packet according to the second characteristic information linked list;
sending the test data packet to the abnormal attribute sub-information corresponding to the abnormal information sequence to obtain response information fed back by the abnormal attribute sub-information;
determining a target detection condition for the processed business data to be analyzed according to the response information;
acquiring target detection parameters of the target detection conditions, and replacing the current detection parameters with the target detection parameters;
and detecting the processed service data to be analyzed by using the target detection condition to obtain the detection information.
Preferably, before performing ETL processing on the service data to be analyzed, the method further includes:
detecting the CPU utilization rate and the memory utilization rate of the server that is to perform ETL processing on the service data to be analyzed;
acquiring a plurality of working nodes of the server, and determining, among the plurality of working nodes, the target working nodes required for ETL processing;
determining the current work task of each target work node;
calculating a load balancing index of the server before ETL processing is carried out on the service data to be analyzed according to the current work task of each target work node and the CPU utilization rate and the memory utilization rate of the server:
where k is the load balancing index of the server before ETL processing is performed on the service data to be analyzed, A is the CPU utilization rate of the server, and B is the memory utilization rate of the server. The symbols for the remaining quantities are missing from this text; in order, they denote the current service rate of the server, the number of target working nodes, the data volume currently in use for the current work task of the i-th target working node, the maximum data volume that can be allocated to the i-th target working node, the expected data volume in use at the i-th target working node when its working stability is highest, and the performance index of the i-th target working node;
calculating a target time length for ETL processing on the service data to be analyzed by using the current load balancing index according to the load balancing index before ETL processing is carried out on the service data to be analyzed by the server:
where the quantities defined for this formula (their symbols are missing from this text) are, in order: the target duration for performing ETL processing on the service data to be analyzed under the current load balancing index; a preset expected duration; the target data volume allocated to the i-th target working node for ETL processing of the service data to be analyzed; the current remaining data volume of the i-th target working node; and the working efficiency of the i-th target working node of the server under the load balancing index;
confirming whether the target duration is greater than or equal to a preset time limit; if so, optimizing the task processes of the working nodes on the server to free up data capacity on the target working nodes; otherwise, no subsequent operation is needed;
and after the server is optimized, carrying out ETL processing on the service data to be analyzed by utilizing the server.
A solution system for implementing data visualization and data integration ETL, the system comprising:
a creating module, configured to acquire to-be-analyzed business data uploaded by a target user and create a data connection according to the to-be-analyzed business data;
a processing module, configured to determine whether integration processing needs to be performed on the service data to be analyzed and, if so, perform ETL processing on the service data to be analyzed to obtain processed service data to be analyzed;
a replacing module, configured to replace the service data to be analyzed in the data connection with the processed service data to be analyzed;
a drawing module, configured to create a data analysis model according to the data connection and draw a data report based on the processed business data to be analyzed using the created data analysis model;
and a publishing module, configured to share the data report and publish it to the cockpit.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention.
FIG. 1 is a flowchart illustrating a solution for implementing data visualization and data integration ETL according to the present invention;
FIG. 2 is another flowchart of a solution for implementing data visualization and data integration ETL according to the present invention;
FIG. 3 is a flowchart illustrating a solution for implementing data visualization and data integration ETL according to the present invention;
FIG. 4 is a screenshot of an embodiment of a solution for implementing data visualization and data integration ETL according to the present invention;
FIG. 5 is a schematic structural diagram of a solution system for implementing data visualization and data integration ETL according to the present invention.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
As described in the Background, many enterprises still decide from personal experience rather than data, and prior-art data analysis methods can only perform simple association processing on large and complex data to be analyzed, so the resulting data report is too simple to serve as a useful reference. In order to solve this problem, the present embodiment discloses a solution method for implementing data visualization and data integration ETL.
A solution for implementing data visualization and data integration ETL, as shown in fig. 1, includes the following steps:
step S101, acquiring to-be-analyzed service data uploaded by a target user, and creating data connection according to the to-be-analyzed service data;
step S102, confirming whether integration processing needs to be carried out on the service data to be analyzed, if so, carrying out ETL processing on the service data to be analyzed to obtain processed service data to be analyzed;
step S103, replacing the service data to be analyzed in the data connection with the processed service data to be analyzed;
step S104, creating a data analysis model according to the data connection, and drawing a data report based on the processed business data to be analyzed by using the created data analysis model;
and step S105, sharing the data report and releasing the data report to the cockpit.
The working principle of the above technical scheme is as follows. The business data to be analyzed uploaded by a target user is acquired, and a data connection is created according to it. The system then confirms whether integration processing is needed; if so, ETL processing is performed on the business data to be analyzed to obtain processed business data, which replaces the original business data in the data connection. A data analysis model is created according to the data connection and is used to draw a data report based on the processed business data. Finally, the data report is shared and published to the cockpit.
The beneficial effects of the above technical scheme are as follows. Performing ETL processing on the service data to be analyzed makes it possible to create a data integration ETL process that handles very complex service data and provides richer data forms for data modeling. The output of one ETL process can be used again as the input of another, so processing can be repeated as the business requires. The final data report is therefore more diversified and accurate, staff can make decisions quickly and intuitively from it, and their user experience improves. This solves the prior-art problem that large and complex data to be analyzed can only undergo simple association processing and cannot be processed more deeply, which leaves the resulting data report too simple to serve as a useful reference and seriously degrades the experience of the staff.
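The flow of steps S101 to S105 can be illustrated with a minimal sketch. It is not the patented implementation: the names `DataConnection`, `Cockpit` and `run_pipeline`, the row format, and the trivial "sum per region" analysis model are all assumptions introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Rows = List[Dict]


@dataclass
class DataConnection:
    """Hypothetical holder for the data behind a data connection (S101)."""
    rows: Rows


@dataclass
class Cockpit:
    """Hypothetical cockpit that collects published reports (S105)."""
    reports: List[Dict] = field(default_factory=list)


def run_pipeline(uploaded: Rows,
                 etl: Optional[Callable[[Rows], Rows]],
                 cockpit: Cockpit) -> Dict:
    # S101: create a data connection from the uploaded data.
    conn = DataConnection(rows=uploaded)
    # S102: run ETL only when integration processing was requested.
    if etl is not None:
        processed = etl(conn.rows)
        # S103: the processed data replaces the original data in the connection.
        conn.rows = processed
    # S104: stand-in analysis model (assumption): total amount per region.
    totals: Dict[str, float] = {}
    for row in conn.rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]
    report = {"title": "Amount by region", "data": totals}
    # S105: publish the report to the cockpit.
    cockpit.reports.append(report)
    return report
```

Passing `etl=None` models the case where the user declines integration processing, so the report is drawn from the uploaded data as is.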
In one embodiment, the acquiring the service data to be analyzed uploaded by the target user and creating a data connection according to the service data to be analyzed includes:
receiving service data to be analyzed uploaded by the target user, and preprocessing the service data to be analyzed to obtain preprocessed service data to be analyzed;
determining the data type of the preprocessed service data to be analyzed;
and matching database tables according to the data types to obtain matched database tables corresponding to the data types, and constructing data connection between the matched database tables.
The beneficial effects of the above technical scheme are: determining the data type of the business data to be analyzed allows the matching of database tables for that data type to be carried out quickly, which improves working efficiency. Furthermore, the constructed data connection allows the data in the matched database tables to be called for data analysis, which provides analysis samples, improves analysis accuracy, and ensures the accuracy of the final data result.
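One possible reading of the type determination and table matching steps is sketched below. The patent does not specify how data types are determined or how tables are matched, so the three type labels, the `TABLE_SCHEMAS` catalog and the rule "a table matches when its schema covers every inferred type" are assumptions made for this example.

```python
from datetime import datetime


def infer_type(values):
    """Classify a column of string values as 'number', 'date' or 'text'."""
    def is_number(v):
        try:
            float(v)
            return True
        except ValueError:
            return False

    def is_date(v):
        try:
            datetime.strptime(v, "%Y-%m-%d")
            return True
        except ValueError:
            return False

    if all(is_number(v) for v in values):
        return "number"
    if all(is_date(v) for v in values):
        return "date"
    return "text"


# Hypothetical catalog of database tables and their column types.
TABLE_SCHEMAS = {
    "sales": {"order_date": "date", "amount": "number"},
    "customers": {"name": "text", "city": "text"},
}


def match_tables(columns):
    """Return the tables whose schema covers every inferred column type.

    columns maps a column name to its list of string values.
    """
    inferred = {infer_type(vals) for vals in columns.values()}
    return [name for name, schema in TABLE_SCHEMAS.items()
            if inferred <= set(schema.values())]
```

For example, an upload with one date column and one numeric column would match only the hypothetical `sales` table, and the data connection would then be built over the matched tables.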
In an embodiment, as shown in fig. 2, the determining whether integration processing needs to be performed on the service data to be analyzed, and if so, performing ETL processing on the service data to be analyzed to obtain processed service data to be analyzed includes:
step S201, sending, to the terminal where the target user is located, a request asking whether integration processing is needed, and receiving a feedback instruction sent by the terminal;
step S202, when the feedback instruction confirms that integration processing is required, displaying a plurality of ETL processing components on the terminal, and acquiring the target component selected by the target user;
step S203, selecting a proper target data connection based on the target component, inputting the service data to be analyzed into the target data connection in a specific format for ETL processing, and obtaining processed service data to be analyzed;
and step S204, outputting the processed service data to be analyzed in the specific format.
The beneficial effects of the above technical scheme are: performing ETL processing with the target component selected by the target user lets users choose different processing components according to their own needs, which improves practicability. Furthermore, subsequent data analysis modeling and data integration ETL switch seamlessly between each other, and very complex data processing problems can be readily solved with a data integration ETL process.
In one embodiment, the plurality of components includes: a set component, an aggregation component, an expression component, a connection component, a filtering component, a ranking component, and a desensitization component;
the specific format includes: table format and Excel format.
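As an illustration of how such components might be chained, the sketch below models four of the listed components (filtering, aggregation, ranking, desensitization) as functions from a list of rows to a list of rows. The function names, the row format and the masking rule are assumptions, not the components of the actual product. Because every component consumes and produces the same table-like format, the output of one ETL run can be fed back in as the input of another.

```python
def filter_component(predicate):
    """Keep only the rows for which predicate(row) is true."""
    return lambda rows: [r for r in rows if predicate(r)]


def aggregation_component(key, value):
    """Sum the `value` column for each distinct `key`."""
    def run(rows):
        totals = {}
        for r in rows:
            totals[r[key]] = totals.get(r[key], 0) + r[value]
        return [{key: k, value: v} for k, v in totals.items()]
    return run


def ranking_component(value):
    """Sort rows by `value` in descending order and add a 1-based rank."""
    def run(rows):
        ordered = sorted(rows, key=lambda r: r[value], reverse=True)
        return [{**r, "rank": i} for i, r in enumerate(ordered, start=1)]
    return run


def desensitization_component(field):
    """Mask all but the first character of a string field."""
    def run(rows):
        return [{**r, field: r[field][:1] + "*" * (len(r[field]) - 1)}
                for r in rows]
    return run


def run_etl(rows, components):
    """Apply the selected components in order; each output feeds the next."""
    for component in components:
        rows = component(rows)
    return rows
```

A user selecting filter, aggregation, ranking and desensitization in that order would correspond to `run_etl(rows, [filter_component(...), aggregation_component(...), ranking_component(...), desensitization_component(...)])`.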
In an embodiment, as shown in fig. 3, the creating a data analysis model according to the data connection, and drawing a data report based on the processed business data to be analyzed by using the created data analysis model includes:
step S301, importing the processed service data to be analyzed in the data connection, and detecting the processed service data to be analyzed to obtain detection information;
step S302, obtaining an attribute factor in the detection information, and mapping the attribute factor to a preset initial data model to obtain a target data analysis model corresponding to the processed to-be-analyzed service data;
step S303, carrying out data analysis on the processed business data to be analyzed by using the target data analysis model to obtain an analysis result;
and step S304, creating a report center, and drawing a data report based on the processed business data to be analyzed in the report center according to the analysis result.
The beneficial effects of the above technical scheme are: mapping the acquired attribute factors to the initial data model yields a target data analysis model specific to the service data to be analyzed. Analyzing the data with this model produces an accurate analysis result specific to that data, removes the influence of interfering data, and further improves the accuracy of the data report.
In one embodiment, the sharing and publishing the data report into the cockpit includes:
generating an exclusive sharing link for the data report, and setting the access permission of the exclusive sharing link so that anyone can access it;
transmitting the exclusive sharing link to the terminal where the target user is located;
connecting the data report to the cockpit, and importing the data report into the cockpit once the connection is established;
and after the import is finished, detecting the integrity of the imported data report: when the integrity is 100%, no subsequent operation is needed, and when the integrity is less than 100%, importing the data report into the cockpit again to replace the previously imported data report.
The beneficial effects of the above technical scheme are: everyone can view the data report, which improves the user experience, and the integrity of the imported data report is guaranteed, so the samples that staff refer to are more accurate and they can make reasonable decisions.
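The integrity check and re-import can be sketched as follows. The text does not define how integrity is measured, so this example assumes it is the percentage of report rows present in the imported copy; the `import_fn` callback, the dictionary used as a cockpit, and the cap on retries are likewise assumptions.

```python
def integrity(original_rows, imported_rows):
    """Percentage of original rows that are present in the imported copy."""
    if not original_rows:
        return 100.0
    present = sum(1 for r in original_rows if r in imported_rows)
    return 100.0 * present / len(original_rows)


def publish(report_rows, cockpit, import_fn, max_attempts=3):
    """Import the report, re-importing while integrity is below 100%.

    Returns the number of import attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        # Each import replaces the previously imported report.
        cockpit["report"] = import_fn(report_rows)
        if integrity(report_rows, cockpit["report"]) == 100.0:
            return attempt
    raise RuntimeError("imported report is still incomplete")
```

The retry cap is a design choice added here so that a persistently failing import cannot loop forever; the source only states that the report is imported again.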
In one embodiment, the method further comprises:
storing the data report as a spare template and storing the spare template in a template library;
and encrypting the template library, acquiring an encryption key, and forwarding the encryption key to the mobile phone terminal of the user with the template calling authority.
The beneficial effects of the above technical scheme are: a subsequent user can select a template when creating a report center and thus draw a polished report very quickly, which improves working efficiency; meanwhile, setting the encryption key ensures the confidentiality of the templates.
In an embodiment, the importing the processed service data to be analyzed in the data connection, detecting the processed service data to be analyzed, and obtaining detection information includes:
analyzing the processed service data to be analyzed to obtain target attribute information contained in the processed service data;
calculating the current matching degree of the target attribute information and preset attribute information; screening out first attribute sub-information with the current matching degree being greater than or equal to a preset matching degree;
judging abnormal information of second attribute sub-information with the current matching degree smaller than the preset matching degree to obtain an abnormal information sequence corresponding to the second attribute sub-information;
generating a specific detection signal, and analyzing the specific detection signal to obtain detection characteristic information of the specific detection signal;
generating a first characteristic information linked list according to the detection characteristic information, and combining the detection characteristic information with the same attribute in the characteristic information linked list to obtain a second characteristic information linked list;
generating a test data packet according to the second characteristic information linked list;
sending the test data packet to the abnormal attribute sub-information corresponding to the abnormal information sequence to obtain response information fed back by the abnormal attribute sub-information;
determining a target detection condition for the processed business data to be analyzed according to the response information;
acquiring target detection parameters of the target detection conditions, and replacing the current detection parameters with the target detection parameters;
and detecting the processed service data to be analyzed by using the target detection condition to obtain the detection information.
The beneficial effects of the above technical scheme are: determining the target detection condition for the processed business data to be analyzed allows that data to be detected in a more appropriate and reasonable way, which ensures the accuracy of the detection information and provides a sound data basis for subsequent modeling and data report generation. Furthermore, determining the detection condition from the abnormal information sequence in the processed business data makes it possible to accurately select a detection condition that does not affect the abnormal information sequence, so that the processed business data can still be checked in its entirety, which further improves the practicability of the data.
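The screening step that splits attribute information by matching degree can be sketched as below. The patent does not say how the matching degree is computed; this example assumes a string similarity ratio from Python's standard `difflib.SequenceMatcher` against the closest preset attribute, and the threshold of 0.8 is an arbitrary illustrative value.

```python
from difflib import SequenceMatcher


def matching_degree(attribute, preset_attributes):
    """Similarity (0.0 to 1.0) between an attribute and its closest preset."""
    return max(SequenceMatcher(None, attribute, p).ratio()
               for p in preset_attributes)


def screen(attributes, preset_attributes, threshold=0.8):
    """Split attributes into (first, second) sub-information lists.

    first:  matching degree >= threshold
    second: matching degree <  threshold (candidates for anomaly checks)
    """
    first, second = [], []
    for attribute in attributes:
        degree = matching_degree(attribute, preset_attributes)
        (first if degree >= threshold else second).append(attribute)
    return first, second
```

Only the second list would go on to the abnormal information determination and test data packet steps described above.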
In one embodiment, before subjecting the service data to be analyzed to ETL processing, the method further includes:
detecting the CPU utilization rate and the memory utilization rate of a server which is to perform ETL processing on the service data to be analyzed;
acquiring a plurality of working nodes of the server, and determining a target working node required to be used in ETL processing in the plurality of working nodes;
determining the current work task of each target work node;
calculating a load balancing index of the server before ETL processing is carried out on the service data to be analyzed according to the current work task of each target work node and the CPU utilization rate and the memory utilization rate of the server:
wherein, in the formula (an image in the original that is not reproduced in this text), k is the load balancing index of the server before ETL processing is performed on the service data to be analyzed, A is the CPU utilization rate of the server, B is the memory utilization rate of the server, and the remaining symbols denote, in order: the current service rate of the server; the number of target working nodes; the current usage data volume corresponding to the current work task of the ith target working node; the maximum allocated data volume of the ith target working node; the expected usage data volume of the ith target working node when working stability is highest; and the performance index of the ith target working node;
calculating a target time length for ETL processing on the service data to be analyzed by using the current load balancing index according to the load balancing index before ETL processing is carried out on the service data to be analyzed by the server:
wherein, in the formula (an image in the original that is not reproduced in this text), the symbols denote, in order: the target duration for performing ETL processing on the service data to be analyzed under the current load balancing index; a preset expected duration; the target data volume allocated to the ith target working node when it performs ETL processing on the service data to be analyzed; the current remaining data volume of the ith target working node; and the working efficiency of the ith target working node of the server under the load balancing index;
confirming whether the target duration is greater than or equal to a preset time limit; if so, optimizing the task processes of the working nodes on the server to free up the data volume in use on the target working nodes; otherwise, no further operation is needed;
and after the server is optimized, carrying out ETL processing on the service data to be analyzed by utilizing the server.
The beneficial effects of the above technical solution are as follows: calculating the load balancing index of the server makes it possible to judge how busy the server is before ETL processing is performed on the service data to be analyzed, and hence whether the server can perform that ETL processing in its current state. When it can, calculating the target processing duration makes it possible to evaluate in advance whether the time required to process the service data to be analyzed under that load balancing index is acceptable. If it is not, the server is optimized, which safeguards the efficiency of the ETL processing, spares the target user an overly long wait, and improves both working efficiency and user experience.
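The decision flow of this embodiment can be sketched as follows. Because the two formulas are not reproduced in this text, the sketch deliberately does not implement them: the load balancing index and the target duration are passed in as callables, and every class, function, and parameter name is a hypothetical illustration rather than the patented computation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class NodeState:
    current_usage: float   # data volume in use by the node's current work task
    max_allocation: float  # maximum data volume the node may be allocated


@dataclass
class ServerState:
    cpu_utilization: float     # A in the text
    memory_utilization: float  # B in the text
    nodes: List[NodeState]     # target working nodes only


def etl_precheck(
    server: ServerState,
    load_index: Callable[[ServerState], float],              # supplies k
    target_duration: Callable[[ServerState, float], float],  # supplies the target duration
    time_limit: float,
    optimize: Callable[[ServerState], None],
) -> bool:
    """Optimize the server before ETL if the estimated duration reaches the limit.

    Returns True when optimization was triggered, False otherwise.
    """
    k = load_index(server)
    duration = target_duration(server, k)
    if duration >= time_limit:
        optimize(server)
        return True
    return False


# Usage with PLACEHOLDER callables; these constants are NOT the patent's formulas.
server = ServerState(cpu_utilization=0.5, memory_utilization=0.7,
                     nodes=[NodeState(current_usage=10.0, max_allocation=100.0)])
optimized = []
triggered = etl_precheck(server, lambda s: 0.6, lambda s, k: 120.0, 100.0, optimized.append)
print(triggered)  # True, because 120.0 >= 100.0
```

Injecting the formulas keeps the control flow testable on its own; in either branch, ETL processing would then be carried out on the server.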
In one embodiment, the method further comprises: converting a user expression into a final query statement segment for the target database by using a preset expression parser, and configuring, in the created data connection, a conversion from standard functions to the data source dialect.
The beneficial effects of the above technical solution are as follows: with the preset expression parser, expressions that different users write in different ways to express the same intent can be quickly converted into the same query statement segment, so that the method suits more users and both practicability and user experience are improved. Furthermore, the conversion from standard functions to the data source dialect effectively extends the service data reachable through the data connection to data sources that use their own dialects, so that a data report can still be obtained.
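A conversion from standard functions to a data source dialect can be pictured as a per-dialect lookup table. The sketch below is an assumption about one possible shape of such a table, not the parser described in this embodiment: the standard function names `STRLEN` and `NOW` and the helper `to_dialect` are hypothetical, while the target functions (`CHAR_LENGTH` and `NOW` in MySQL, `LENGTH` and `SYSDATE` in Oracle, `LEN` and `GETDATE` in SQL Server) are existing built-ins that are only roughly equivalent to one another.

```python
# Per-dialect templates for two hypothetical "standard" functions.
DIALECT_FUNCTIONS = {
    "mysql":     {"STRLEN": "CHAR_LENGTH({0})", "NOW": "NOW()"},
    "oracle":    {"STRLEN": "LENGTH({0})",      "NOW": "SYSDATE"},
    "sqlserver": {"STRLEN": "LEN({0})",         "NOW": "GETDATE()"},
}


def to_dialect(function: str, args: list, dialect: str) -> str:
    """Render one standard function call as a query fragment for the given dialect.

    Raises KeyError for an unknown dialect or function. Arguments are inserted
    verbatim, so callers must pass trusted identifiers only.
    """
    template = DIALECT_FUNCTIONS[dialect][function]
    return template.format(*args)


print(to_dialect("STRLEN", ["customer_name"], "mysql"))      # CHAR_LENGTH(customer_name)
print(to_dialect("STRLEN", ["customer_name"], "sqlserver"))  # LEN(customer_name)
print(to_dialect("NOW", [], "oracle"))                       # SYSDATE
```

Keeping the table inside the data connection, as the embodiment suggests, means the same user expression can be rendered for whichever data source the connection points at.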
In one embodiment, as shown in fig. 4, the method includes the following. Data connections of different types (including Excel, MySQL, Oracle, SQL Server, RESTful API, etc.) are created, and data connections of different types can be freely combined and associated when a model is created. The report center can draw reports from different models, and the report chart component supports more than 36 chart types. A drawn data report can be shared and published to the cockpit: sharing means sharing the report link, so that anyone who has the link can view the report without logging in, and publishing places the report in the cockpit, a central place for viewing data analysis results, so as to provide data support for decision making. A drawn data report can also be saved as a template, and the template can be selected when a report is created in the report center, so that a polished report can be drawn very quickly. A data integration ETL process can be created to process very complex business data and output the processing result to a database table or to an Excel-type data connection, which provides more complex data forms for data modeling; the output of an ETL process can in turn be used as the input of an ETL process, so the data can be processed repeatedly according to business needs.
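The idea that the output of one ETL process can serve as the input of another can be sketched with a few composable steps. The component names echo the filtering, aggregation, ranking, and desensitization components named in the claims, but the implementations, the row layout, and the `run_pipeline` helper are hypothetical simplifications that work on in-memory rows rather than on real data connections.

```python
from collections import defaultdict


def filter_rows(rows, predicate):
    """Filtering component: keep rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


def aggregate_sum(rows, key, value):
    """Aggregation component: sum `value` per distinct `key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return [{key: k, value: v} for k, v in totals.items()]


def sort_rows(rows, field, descending=False):
    """Ranking component: order rows by `field`."""
    return sorted(rows, key=lambda row: row[field], reverse=descending)


def desensitize(rows, field, keep=3):
    """Desensitization component: mask all but the first `keep` characters."""
    return [{**row, field: row[field][:keep] + "*" * (len(row[field]) - keep)}
            for row in rows]


def run_pipeline(rows, steps):
    """Apply each step to the output of the previous one."""
    for step in steps:
        rows = step(rows)
    return rows


orders = [
    {"region": "north", "amount": 120.0},
    {"region": "south", "amount": 80.0},
    {"region": "north", "amount": 30.0},
    {"region": "south", "amount": -5.0},
]

# First ETL process: drop invalid rows, then total the amount per region.
first_pass = run_pipeline(orders, [
    lambda rows: filter_rows(rows, lambda row: row["amount"] > 0),
    lambda rows: aggregate_sum(rows, "region", "amount"),
])

# Second ETL process: takes the first result as its input.
second_pass = run_pipeline(first_pass, [lambda rows: sort_rows(rows, "amount")])
print(second_pass)
# [{'region': 'south', 'amount': 80.0}, {'region': 'north', 'amount': 150.0}]

print(desensitize([{"phone": "13800001111"}], "phone"))
# [{'phone': '138********'}]
```

Each step takes rows and returns rows, which is what makes repeated processing straightforward: any result can be fed back in as a new input.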
The embodiment also discloses a solution system for implementing data visualization and data integration ETL, as shown in fig. 5, the system includes:
a creating module 501, configured to acquire service data to be analyzed uploaded by a target user, and create a data connection according to the service data to be analyzed;
a processing module 502, configured to determine whether integration processing needs to be performed on the to-be-analyzed service data, and if yes, perform ETL processing on the to-be-analyzed service data to obtain processed to-be-analyzed service data;
a replacing module 503, configured to replace the to-be-analyzed service data in the data connection with the processed to-be-analyzed service data;
a drawing module 504, configured to create a data analysis model according to the data connection, and draw a data report based on the processed to-be-analyzed service data by using the created data analysis model;
and the publishing module 505 is used for sharing the data report and publishing the data report to the cockpit.
The working principle and the advantageous effects of the above technical solution have been explained in the method claims, and are not described herein again.
It will be understood by those skilled in the art that the terms "first" and "second" in the present invention merely distinguish different stages of application.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (9)

1. A solution for realizing data visualization and data integration ETL is characterized by comprising the following steps:
acquiring to-be-analyzed business data uploaded by a target user, and establishing data connection according to the to-be-analyzed business data;
determining whether integration processing needs to be carried out on the service data to be analyzed, if so, carrying out ETL processing on the service data to be analyzed to obtain processed service data to be analyzed;
replacing the service data to be analyzed in the data connection with the processed service data to be analyzed;
creating a data analysis model according to the data connection, and drawing a data report based on the processed business data to be analyzed by using the created data analysis model;
sharing and publishing the data report to a cockpit;
before performing ETL processing on the service data to be analyzed, the method further includes:
detecting the CPU utilization rate and the memory utilization rate of a server which is to carry out ETL processing on the service data to be analyzed;
acquiring a plurality of working nodes of the server, and determining a target working node required to be used in ETL processing in the plurality of working nodes;
determining the current work task of each target work node;
calculating a load balancing index of the server before ETL processing is carried out on the service data to be analyzed according to the current work task of each target work node and the CPU utilization rate and the memory utilization rate of the server:
[Formula image not reproduced in this text]
wherein k is the load balancing index of the server before ETL processing is performed on the service data to be analyzed, A is the CPU utilization rate of the server, B is the memory utilization rate of the server, and the remaining symbols, which appear only as images in the original, denote in order: the current service rate of the server; the number of target working nodes; the current usage data volume corresponding to the current work task of the ith target working node; the maximum allocated data volume of the ith target working node; the expected usage data volume of the ith target working node when working stability is highest; and the performance index of the ith target working node;
calculating a target time length for ETL processing on the service data to be analyzed by using the current load balancing index according to the load balancing index before ETL processing is carried out on the service data to be analyzed by the server:
[Formula image not reproduced in this text]
wherein the symbols, which appear only as images in the original, denote in order: the target duration for performing ETL processing on the service data to be analyzed under the current load balancing index; a preset expected duration; the target data volume allocated to the ith target working node when it performs ETL processing on the service data to be analyzed; the current remaining data volume of the ith target working node; and the working efficiency of the ith target working node of the server under the load balancing index;
confirming whether the target duration is greater than or equal to a preset time limit; if so, optimizing the task processes of the working nodes on the server to free up the data volume in use on the target working nodes; otherwise, no further operation is needed;
and after the server is optimized, carrying out ETL processing on the service data to be analyzed by utilizing the server.
2. The solution for implementing data visualization and data integration ETL according to claim 1, wherein the acquiring the to-be-analyzed service data uploaded by the target user and creating a data connection according to the to-be-analyzed service data comprises:
receiving service data to be analyzed uploaded by the target user, and preprocessing the service data to be analyzed to obtain preprocessed service data to be analyzed;
determining the data type of the preprocessed service data to be analyzed;
and matching database tables according to the data types to obtain matched database tables corresponding to the data types, and constructing data connection between the matched database tables.
3. The method according to claim 1, wherein the determining whether the business data to be analyzed needs to be integrated, and if so, performing ETL processing on the business data to be analyzed to obtain processed business data to be analyzed includes:
sending a request for judging whether integration processing is needed to be carried out to a terminal where the target user is located, and receiving a feedback instruction sent by the terminal;
when the feedback instruction confirms that integration processing is required, displaying a plurality of components processed by ETL to the terminal, and acquiring a target component selected by a target user;
selecting a proper target data connection based on the target component, inputting the service data to be analyzed into the target data connection in a specific format for ETL processing, and obtaining the processed service data to be analyzed;
and outputting the processed service data to be analyzed in the specific format.
4. The solution for implementing data visualization and data integration ETL as claimed in claim 3, wherein said plurality of components comprises: a set component, an aggregation component, an expression component, a connection component, a filtering component, a ranking component, and a desensitization component;
the specific format includes: table format and Excel format.
5. The solution for implementing data visualization and data integration ETL according to claim 1, wherein the creating a data analysis model according to the data connection and using the created data analysis model to draw a data report based on the processed business data to be analyzed comprises:
importing the processed business data to be analyzed in the data connection, and detecting the processed business data to be analyzed to obtain detection information;
acquiring attribute factors in the detection information, and mapping the attribute factors to a preset initial data model to obtain a target data analysis model corresponding to the processed to-be-analyzed service data;
performing data analysis on the processed business data to be analyzed by using the target data analysis model to obtain an analysis result;
and creating a report center, and drawing a data report based on the processed business data to be analyzed in the report center according to the analysis result.
6. The ETL solution for data visualization and data integration according to claim 1, wherein the sharing and publishing the data report to the cockpit comprises:
generating an exclusive sharing link of the data report, and setting the access right of the exclusive sharing link as anyone;
transmitting the exclusive sharing link to a terminal where a target user is located;
connecting the data report to the cockpit, and importing the data report into the cockpit once the connection is established;
and after the import is finished, detecting the integrity of the imported data report; when the integrity is 100%, no subsequent operation is needed, and when the integrity is less than 100%, importing the data report into the cockpit again to replace the previously imported data report.
7. The solution for implementing data visualization and data integration ETL according to claim 1, wherein the method further comprises:
storing the data report as a spare template and storing the spare template in a template library;
and encrypting the template library, acquiring an encryption key, and forwarding the encryption key to the mobile phone terminal of the user with the template calling authority.
8. The method according to claim 5, wherein the step of importing the processed service data to be analyzed in the data connection, detecting the processed service data to be analyzed to obtain detection information includes:
analyzing the processed service data to be analyzed to obtain target attribute information contained in the processed service data;
calculating the current matching degree of the target attribute information and preset attribute information; screening out first attribute sub-information with the current matching degree being greater than or equal to a preset matching degree;
judging abnormal information of second attribute sub-information with the current matching degree smaller than the preset matching degree to obtain an abnormal information sequence corresponding to the second attribute sub-information;
generating a specific detection signal, and analyzing the specific detection signal to obtain detection characteristic information of the specific detection signal;
generating a first characteristic information linked list according to the detection characteristic information, and combining the detection characteristic information with the same attribute in the characteristic information linked list to obtain a second characteristic information linked list;
generating a test data packet according to the second characteristic information linked list;
sending the test data packet to the abnormal attribute sub-information corresponding to the abnormal information sequence to obtain response information fed back by the abnormal attribute sub-information;
determining a target detection condition for the processed business data to be analyzed according to the response information;
acquiring target detection parameters of the target detection conditions, and replacing the current detection parameters with the target detection parameters;
and detecting the processed service data to be analyzed by using the target detection condition to obtain the detection information.
9. A solution system for realizing data visualization and data integration ETL is characterized by comprising:
the creating module is used for acquiring to-be-analyzed business data uploaded by a target user and creating data connection according to the to-be-analyzed business data;
the processing module is used for determining whether integration processing needs to be carried out on the service data to be analyzed, and if so, ETL processing is carried out on the service data to be analyzed to obtain the processed service data to be analyzed;
the replacing module is used for replacing the service data to be analyzed in the data connection with the processed service data to be analyzed;
the drawing module is used for creating a data analysis model according to the data connection and drawing a data report based on the processed business data to be analyzed by using the created data analysis model;
the publishing module is used for sharing the data report and publishing the data report to a cockpit;
before the processing module performs ETL processing on the service data to be analyzed, the processing module is further configured to:
detecting the CPU utilization rate and the memory utilization rate of a server which is to carry out ETL processing on the service data to be analyzed;
acquiring a plurality of working nodes of the server, and determining a target working node required to be used in ETL processing in the plurality of working nodes;
determining the current work task of each target work node;
calculating a load balancing index of the server before ETL processing is carried out on the service data to be analyzed according to the current work task of each target work node and the CPU utilization rate and the memory utilization rate of the server:
[Formula image not reproduced in this text]
wherein k is the load balancing index of the server before ETL processing is performed on the service data to be analyzed, A is the CPU utilization rate of the server, B is the memory utilization rate of the server, and the remaining symbols, which appear only as images in the original, denote in order: the current service rate of the server; the number of target working nodes; the current usage data volume corresponding to the current work task of the ith target working node; the maximum allocated data volume of the ith target working node; the expected usage data volume of the ith target working node when working stability is highest; and the performance index of the ith target working node;
calculating a target time length for ETL processing on the service data to be analyzed by using the current load balancing index according to the load balancing index before ETL processing is carried out on the service data to be analyzed by the server:
Figure 92569DEST_PATH_IMAGE010
wherein the content of the first and second substances,
Figure 796082DEST_PATH_IMAGE011
expressed as a target duration for performing ETL processing on the service data to be analyzed by using the current load balancing index,
Figure 747858DEST_PATH_IMAGE012
expressed as a preset desired length of time,
Figure 486007DEST_PATH_IMAGE013
the target data quantity allocated when the ith target working node performs ETL processing on the service data to be analyzed is represented,
Figure 130615DEST_PATH_IMAGE014
expressed as the current remaining data volume of the ith target worker node,
Figure 321425DEST_PATH_IMAGE015
the working efficiency of the ith target working node of the server under the load balancing index is represented;
confirming whether the target duration is greater than or equal to a preset time limit; if so, optimizing the task processes of the working nodes on the server to free up the data volume in use on the target working nodes; otherwise, no further operation is needed;
and after the server is optimized, carrying out ETL processing on the service data to be analyzed by utilizing the server.
CN202110374939.4A 2021-04-08 2021-04-08 Solution method and system for realizing data visualization and data integration ETL Active CN112765254B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110374939.4A CN112765254B (en) 2021-04-08 2021-04-08 Solution method and system for realizing data visualization and data integration ETL

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110374939.4A CN112765254B (en) 2021-04-08 2021-04-08 Solution method and system for realizing data visualization and data integration ETL

Publications (2)

Publication Number Publication Date
CN112765254A CN112765254A (en) 2021-05-07
CN112765254B true CN112765254B (en) 2021-07-20

Family

ID=75691242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110374939.4A Active CN112765254B (en) 2021-04-08 2021-04-08 Solution method and system for realizing data visualization and data integration ETL

Country Status (1)

Country Link
CN (1) CN112765254B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113297190B (en) * 2021-05-11 2022-07-12 浪潮卓数大数据产业发展有限公司 Visualization method, device and medium based on data comprehensive analysis

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101446897A (en) * 2008-11-26 2009-06-03 重庆邮电大学 Resource management system based on net system business structure platform
CN101826100A (en) * 2010-03-16 2010-09-08 中国测绘科学研究院 Automatic integrated system and method of wide area network (WAN)-oriented multisource emergency information
CN104850623A (en) * 2015-05-19 2015-08-19 杭州迅涵科技有限公司 Dynamic extension method and system for multidimensional data analysis model
CN106156350A (en) * 2016-07-25 2016-11-23 恒安嘉新(北京)科技有限公司 The big data analysing method of a kind of visualization and system
CN109634767A (en) * 2018-12-06 2019-04-16 北京字节跳动网络技术有限公司 Method and apparatus for detection information
CN112085241A (en) * 2019-06-12 2020-12-15 江苏汇环环保科技有限公司 Environment big data analysis and decision platform based on machine learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100573457C (en) * 2007-12-29 2009-12-23 中国建设银行股份有限公司 A kind of finance data is realized ETL method for processing and system
US10860602B2 (en) * 2018-06-29 2020-12-08 Lucid Software, Inc. Autolayout of visualizations based on contract maps

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
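Title
"某商业智能管理系统的设计与实现" (Design and Implementation of a Business Intelligence Management System); 金震 (Jin Zhen); 《中国优秀硕士学位论文全文数据库》 (China Master's Theses Full-text Database); 2017-03-15; page 21, paragraphs 2-3, Fig. 4-3 *
金震 (Jin Zhen). "某商业智能管理系统的设计与实现" (Design and Implementation of a Business Intelligence Management System). 《中国优秀硕士学位论文全文数据库》 (China Master's Theses Full-text Database). 2017, page 21. *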

Also Published As

Publication number Publication date
CN112765254A (en) 2021-05-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant