CN112347092A - Method and device for generating data analysis billboard and computer equipment - Google Patents

Info

Publication number
CN112347092A
Authority
CN
China
Prior art keywords
data
index
dimension
database
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011224649.3A
Other languages
Chinese (zh)
Other versions
CN112347092B (en)
Inventor
张健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd
Priority to CN202011224649.3A
Publication of CN112347092A
Application granted
Publication of CN112347092B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to big data technology and discloses a method for generating a data analysis billboard, comprising: determining, according to the service indexes selected by the current user, the database to which the data corresponding to each service index belongs; calling a data synchronization script to acquire the data corresponding to each service index from each database, forming an aggregated source database; cleaning the data in the aggregated source database according to the correlation coefficient between the data and each service index to obtain a target database; acquiring, from a configuration library, the index dimension corresponding to the service index selected by the current user; and forming, from the data of the target database, a billboard graph corresponding to the index dimension. By developing a data synchronization script, related data are acquired and aggregated from different source databases; the data are then cleaned according to their correlation with the service indexes to obtain target data; and, based on the index dimensions selected by the user, the target data are formed into billboard graphs corresponding to the dimension items, so that the dimension data are displayed visually.

Description

Method and device for generating data analysis billboard and computer equipment
Technical Field
The present application relates to the field of big data, and in particular, to a method, an apparatus, and a computer device for generating a data analysis billboard.
Background
The human resources (HR) department does not yet have a complete system for obtaining the index data it is concerned with, such as the workforce distribution map, the cadre allocation distribution map, the organizational performance distribution map, the training budget distribution map, the salary level distribution map, the staff training distribution map and the staff turnover rate. Analysis can therefore rely only on experience and cannot guide the decision-making of senior leaders. For indexes that require historical information, data statistics are essentially impossible when the related data volume is large: the statistics cycle is long, the response is slow, considerable manpower is occupied, and the statistical results are inaccurate.
Disclosure of Invention
The main purpose of this application is to solve the technical problem that data in the human resources field are scattered, so that data statistics and display cannot be realized.
The application provides a method for generating a data analysis billboard, which comprises the following steps:
determining, according to the service indexes selected by the current user, the database to which the data corresponding to each service index belongs;
calling a data synchronization script, and acquiring the data corresponding to each service index from each database to form an aggregated source database;
cleaning the data in the aggregated source database according to the correlation coefficient between the data and each service index, to obtain a target database;
acquiring, from a configuration library, the index dimension corresponding to the service index selected by the current user;
and forming, according to the index dimension, the data of the target database into a billboard graph corresponding to the index dimension.
Preferably, the step of cleaning the data in the aggregated source database according to the correlation coefficient between the data and each service index to obtain the target database includes:
acquiring data corresponding to a specified index, wherein the specified index is any one of the service indexes;
performing ranking calculation on the data corresponding to the specified index through a ranking calculation component to obtain the correlation coefficient between each piece of data and the specified index;
forming a data queue corresponding to the specified index, ordered by correlation coefficient from high to low;
storing the data in the data queue whose correlation coefficient is larger than a preset threshold into a target database to form target data corresponding to the specified index;
and generating target data corresponding to each service index according to the generation process of the target data corresponding to the specified index, to obtain the target database after the aggregated source database is cleaned.
Preferably, the step of performing ranking calculation on the data corresponding to the specified index through a ranking calculation component to obtain the correlation coefficient between the data and the specified index includes:
extracting an index feature vector of the specified index and the data feature vectors respectively corresponding to the data related to the specified index;
calculating the cosine distance between the index feature vector and each data feature vector according to a specified calculation formula, wherein the specified calculation formula is
[Formula image: Figure BDA0002763248360000021]
where m represents the index feature vector, n represents a data feature vector, P represents the total number of data feature vectors and is a positive integer, θ represents the angle between the index feature vector and a data feature vector, i represents the serial number of a data feature vector, and n_i represents the i-th data feature vector;
and taking the cosine distance as the correlation coefficient between the data and the specified index.
Preferably, after the step of generating target data corresponding to each service index according to the generation process of the target data corresponding to the specified index to obtain the target database after cleaning the aggregated source database, the method includes:
pushing the target data corresponding to each service index to the corresponding memory grid node;
calling the pre-configured computing components in each memory grid node in parallel, and starting multiple threads to analyze the target data corresponding to each service index according to dimension items, to obtain the dimension data corresponding to each dimension item;
and summarizing the dimensional data corresponding to each service index returned by each memory grid node to obtain a dimensional database.
Preferably, the step of forming, according to the index dimension, the data of the target database into a billboard graph corresponding to the index dimension includes:
enabling multiple threads to process the data of each index dimension in the dimension database to obtain the attribute value corresponding to each index dimension;
acquiring the configuration attributes of the corresponding timing task on a timing scheduling platform at the current moment, wherein the configuration attributes include a designated chart mode for displaying the attribute value corresponding to each index dimension;
and forming a billboard graph in the designated chart mode according to the configuration attributes of the timing task and the attribute value corresponding to each index dimension.
Preferably, after the step of enabling multithreading to process the data of each index dimension in the dimension database to obtain the attribute value corresponding to each index dimension, the method includes:
judging whether a comprehensive index corresponding to each index dimension exists or not, wherein the comprehensive index is obtained by combining a specified number of index dimensions;
inputting the attribute values corresponding to the index dimensions into a calculation component for comprehensive calculation to obtain comprehensive attribute values corresponding to the comprehensive indexes;
and forming a billboard graph from the comprehensive attribute values in the designated chart mode.
Preferably, before the step of obtaining the index dimension corresponding to the service index selected by the current user in the configuration library, the method includes:
acquiring a service index currently selected by a user and a setting weight corresponding to each service index;
inputting the currently selected service index into a configuration library;
and associating each service index one-to-one with its setting weight in the configuration library to form the configured index dimensions, and storing the configured index dimensions in the configuration library.
The present application further provides a device for generating a data analysis billboard, comprising:
a determining module, used for determining, according to the service indexes selected by the current user, the database to which the data corresponding to each service index belongs;
a calling module, used for calling a data synchronization script and acquiring the data corresponding to each service index from each database to form an aggregated source database;
a cleaning module, used for cleaning the data in the aggregated source database according to the correlation coefficient between the data and each service index, to obtain a target database;
a first acquisition module, used for acquiring, from a configuration library, the index dimension corresponding to the service index selected by the current user;
and a forming module, used for forming, according to the index dimension, the data of the target database into a billboard graph corresponding to the index dimension.
The present application further provides a computer device comprising a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the above method when executing the computer program.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method as described above.
According to the present application, a data synchronization script is developed to acquire and aggregate related data from different source databases; the data are then cleaned according to their correlation with the service indexes to obtain target data; and, based on the index dimensions selected by the user, the target data are formed into billboard graphs corresponding to the dimension items, so that the dimension data are displayed visually.
Drawings
FIG. 1 is a schematic flow chart illustrating a method for generating a data analysis billboard according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of an apparatus for generating a data analysis billboard according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an internal structure of a computer device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Referring to FIG. 1, the method for generating a data analysis billboard of this embodiment includes:
S1: determining, according to the service indexes selected by the current user, the database to which the data corresponding to each service index belongs;
S2: calling a data synchronization script, and acquiring the data corresponding to each service index from each database to form an aggregated source database;
S3: cleaning the data in the aggregated source database according to the correlation coefficient between the data and each service index, to obtain a target database;
S4: acquiring, from a configuration library, the index dimension corresponding to the service index selected by the current user;
S5: forming, according to the index dimension, the data of the target database into a billboard graph corresponding to the index dimension.
In the embodiment of the application, the service indexes are determined by the data topics required by the business. Data related to the service indexes are extracted from different databases through data mining and gathered into an aggregated source database, so that the data can be analyzed centrally after being combed. During data analysis, an efficient queue is formed according to the correlation coefficient between the data and the service index, and a target database is obtained through a data filtering program and a data input program. Data corresponding to the specified dimensions of the service index are then extracted from the target database in a memory grid according to dynamic rules to form dimension data, and the dimension data are displayed visually through a chart function to form a billboard graph that the business can use directly. The embodiment of the present application takes the summarization of human resource data as an example. Organization team assessment data needs the support of a personnel support management system; training budget data needs the support of a training budget system and of related financial data; training course data needs data support from the Zhiniao (知鸟) system; recruitment data needs data support from a human resource system; and so on. Because the data sources differ, the data synchronization modes also differ, and the data scattered across these systems are integrated by developing a Sqoop script and a button script. The Sqoop script is used to acquire the required original data from a Hive database, with the related original data acquired through a Spark program. The button script is mainly used to synchronously acquire the required data from relational databases; once developed, it is run by a Java program to acquire the data corresponding to the service indexes.
The personnel information data and personnel architecture data required for human resource data analysis are synchronized from the associated systems into a personnel management system through the button script. The training indexes, recruitment indexes, financial indexes, course indexes and other indexes required for human resource data analysis are synchronized from the associated systems into the personnel management system through the Hadoop script to form the aggregated source database. The scattered data are thereby gathered in one place, which facilitates centralized analysis.
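Step S1 can be pictured as a lookup from each selected service index to the source systems that hold its data. The following Python sketch is illustrative only: the mapping `INDEX_SOURCES`, the function `plan_sync`, and the label "financial system" are hypothetical names introduced here, and the actual synchronization scripts are not modeled.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Hypothetical mapping from service index to the source systems holding its data.
# The system names paraphrase the examples in the text; "financial system" is an
# assumed label for the "related financial data".
INDEX_SOURCES: Dict[str, List[str]] = {
    "organization team assessment": ["personnel support management system"],
    "training budget": ["training budget system", "financial system"],
    "recruitment": ["human resource system"],
}


def plan_sync(selected_indexes: Iterable[str]) -> Dict[str, List[str]]:
    """Group the selected service indexes by the source system to be synchronized."""
    plan: Dict[str, List[str]] = defaultdict(list)
    for index in selected_indexes:
        if index not in INDEX_SOURCES:
            raise KeyError(f"no source system registered for index: {index}")
        for source in INDEX_SOURCES[index]:
            plan[source].append(index)
    return dict(plan)


if __name__ == "__main__":
    for source, indexes in plan_sync(["training budget", "recruitment"]).items():
        print(source, "->", indexes)
```

Grouping by source system means that each synchronization script would need to run only once per system, however many indexes depend on it.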
According to the embodiment of the application, a data processing center is developed, and the synchronized data are cleaned by the data processing center to generate the target database. The data cleaning process is customized according to the data topic required by the business, and the data topic is determined by the selected service indexes. The service data required by the current data topic are monitored, correlation analysis is performed on the acquired service data, and the data that meet the specified correlation requirement are written into the corresponding target database.
According to the present application, the data in the target database are further processed according to the current data analysis dimensions to obtain a dimension database, and the dimension data are displayed as an intuitive billboard graph in the display mode determined on a dynamic configuration page. Different business requirements correspond to different service indexes, and different index dimensions of data analysis cause the displayed billboard graph to change with the service indexes, so that business requirements are better met. By developing a data synchronization script, related data are acquired from different source databases and the scattered data are aggregated; the data are then cleaned according to their correlation with the service indexes to obtain target data; and, based on the index dimensions selected by the user, a billboard graph corresponding to the dimension data is formed and the dimension data are displayed visually, which improves the convenience, intuitiveness and timeliness of displaying data analysis results.
Further, step S3 of cleaning the data in the aggregated source database according to the correlation coefficient between the data and each service index to obtain a target database includes:
S31: acquiring data corresponding to a specified index, wherein the specified index is any one of the service indexes;
S32: performing ranking calculation on the data corresponding to the specified index through a ranking calculation component to obtain the correlation coefficient between each piece of data and the specified index;
S33: forming a data queue corresponding to the specified index, ordered by correlation coefficient from high to low;
S34: storing the data in the data queue whose correlation coefficient is larger than a preset threshold into a target database to form target data corresponding to the specified index;
S35: generating target data corresponding to each service index according to the generation process of the target data corresponding to the specified index, to obtain the target database after the aggregated source database is cleaned.
According to the embodiment of the application, a data queue is formed according to the correlation coefficient between the data and the service index, and the data are cleaned according to preset data cleaning conditions. For example, data that do not meet the correlation threshold are cleaned out, and data whose correlation coefficient is larger than the preset threshold are stored in the target database for subsequent business data analysis. With the efficient data queue, cleaning of data on the order of tens of millions of records in the aggregated source database is completed within 1 hour, yielding the target database.
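Steps S33 and S34 can be illustrated with a minimal Python sketch. It assumes that each record already carries a correlation coefficient; the record layout `(record_id, coefficient)` and the function name `clean_by_correlation` are hypothetical and not taken from the application.

```python
from typing import Iterable, List, Tuple

Record = Tuple[str, float]  # (record id, correlation coefficient), an assumed layout


def clean_by_correlation(scored: Iterable[Record], threshold: float) -> List[Record]:
    """Order records by coefficient, high to low, and keep those above the threshold."""
    # S33: form the data queue in descending order of correlation coefficient.
    queue = sorted(scored, key=lambda record: record[1], reverse=True)
    target: List[Record] = []
    # S34: because the queue is sorted, scanning can stop at the first record
    # that does not exceed the threshold.
    for record in queue:
        if record[1] <= threshold:
            break
        target.append(record)
    return target


if __name__ == "__main__":
    print(clean_by_correlation([("a", 0.2), ("b", 0.9), ("c", 0.6)], threshold=0.5))
```

Sorting first lets the scan stop early, which is one plausible reading of why the queue is described as efficient.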
Further, step S32 of performing ranking calculation on the data corresponding to the specified index through a ranking calculation component to obtain the correlation coefficient between the data and the specified index includes:
S321: extracting an index feature vector of the specified index and the data feature vectors respectively corresponding to the data related to the specified index;
S322: calculating the cosine distance between the index feature vector and each data feature vector according to a specified calculation formula, wherein the specified calculation formula is
[Formula image: Figure BDA0002763248360000061]
where m represents the index feature vector, n represents a data feature vector, P represents the total number of data feature vectors and is a positive integer, θ represents the angle between the index feature vector and a data feature vector, i represents the serial number of a data feature vector, and n_i represents the i-th data feature vector;
S323: taking the cosine distance as the correlation coefficient between the data and the specified index.
In the data cleaning process of the embodiment of the application, the efficient queue is formed according to the correlation coefficient between the data and the service index. The correlation coefficient is evaluated through the cosine distance between the feature vector of the data and the feature vector of the service index; the amount of computation is small, and the accuracy meets the business requirement.
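Because the formula image is not reproduced in this text, the following Python sketch assumes the standard definition of the cosine of the angle between two vectors, cos θ = (m · n_i) / (|m| |n_i|), computed for each of the P data feature vectors. Since data with a larger coefficient are kept, the "cosine distance" is read here as this cosine value. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(m: Sequence[float], n: Sequence[float]) -> float:
    """Cosine of the angle between m and n; returns 0.0 if either vector is all zeros."""
    if len(m) != len(n):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(m, n))
    norm_m = math.sqrt(sum(a * a for a in m))
    norm_n = math.sqrt(sum(b * b for b in n))
    if norm_m == 0.0 or norm_n == 0.0:
        return 0.0  # assumed convention for zero vectors
    return dot / (norm_m * norm_n)


def correlation_coefficients(index_vec: Sequence[float],
                             data_vecs: Sequence[Sequence[float]]) -> List[float]:
    """One coefficient per data feature vector n_i, i = 1..P."""
    return [cosine_similarity(index_vec, n_i) for n_i in data_vecs]


if __name__ == "__main__":
    print(correlation_coefficients([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]))
```

How the feature vectors themselves are extracted in step S321 is not specified in the text and is left outside the sketch.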
Further, after step S35 of generating target data corresponding to each service index according to the generation process of the target data corresponding to the specified index to obtain the target database after the aggregated source database is cleaned, the method includes:
S351: pushing the target data corresponding to each service index to the corresponding memory grid node;
S352: calling the pre-configured computing components in each memory grid node in parallel, and starting multiple threads to analyze the target data corresponding to each service index according to dimension items, to obtain the dimension data corresponding to each dimension item;
S353: summarizing the dimension data corresponding to each service index returned by each memory grid node to obtain a dimension database.
According to the present application, asynchronous tasks are developed: a background timing task is developed for the data corresponding to each dimension item, and multithreading is used to further extract the data corresponding to each dimension item from the target database and generate displayable dimension data. By developing the computing function of the memory grid, the data in the target database are pushed into the memory grid, and a distributed computing program analyzes the data by dimension item to obtain the dimension data corresponding to each dimension item. For example, 16 memory grid nodes are configured in the embodiment of the application to accelerate the generation of dimension data; when organization data of 45 dimensions are processed, data analysis of approximately 700 dimensions can be completed within 30 minutes, yielding the dimension data corresponding to each of those dimensions.
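The fan-out and summarization pattern of steps S351 to S353 can be sketched in Python with a thread pool standing in for the memory grid nodes. This is an illustration under assumptions: `analyze_index` is a hypothetical stand-in for the pre-configured computing component, and it simply counts rows per value of one dimension item.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def analyze_index(rows: List[dict], dimension_item: str) -> Dict[str, int]:
    """Stand-in computing component: count rows per value of one dimension item."""
    counts: Dict[str, int] = defaultdict(int)
    for row in rows:
        counts[row[dimension_item]] += 1
    return dict(counts)


def build_dimension_database(target_data: Dict[str, List[dict]],
                             dimension_item: str,
                             workers: int = 4) -> Dict[str, Dict[str, int]]:
    """Analyze each service index's target data in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # S351/S352: one task per service index, executed concurrently.
        futures = {
            index: pool.submit(analyze_index, rows, dimension_item)
            for index, rows in target_data.items()
        }
        # S353: summarize the dimension data returned by every task.
        return {index: future.result() for index, future in futures.items()}


if __name__ == "__main__":
    data = {
        "training": [{"dept": "A"}, {"dept": "A"}, {"dept": "B"}],
        "recruitment": [{"dept": "B"}],
    }
    print(build_dimension_database(data, "dept"))
```

A real memory grid would distribute the tasks across machines; the thread pool only mirrors the parallel-call-then-summarize structure.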
Further, step S5 of forming, according to the index dimension, the data of the target database into a billboard graph corresponding to the index dimension includes:
S51: enabling multiple threads to process the data of each index dimension in the dimension database to obtain the attribute value corresponding to each index dimension;
S52: acquiring the configuration attributes of the corresponding timing task on a timing scheduling platform at the current moment, wherein the configuration attributes include a designated chart mode for displaying the attribute value corresponding to each index dimension;
S53: forming a billboard graph in the designated chart mode according to the configuration attributes of the timing task and the attribute value corresponding to each index dimension.
The embodiment of the application supports dynamic setting of timing tasks and their configuration attributes by developing a service dynamic configuration system and a timing scheduling platform. The timing tasks are determined according to the service scenario; all timing tasks are written into a task table and completed by an asynchronous program, and the names of the timing tasks are then registered on the timing scheduling platform. By dynamically configuring the configuration attributes of a timing task, such as execution time, running time and task type, the user determines the business operation actually performed, which may be a real-time operation or a timed operation. The data of the target database cannot directly support index analysis and display, and must be converted into attribute values of the dimension data for display. The calculation of an attribute value depends on the specific dimension item. For example, if the dimension item is the training course coverage rate corresponding to a training index, the attribute value is obtained by dividing the number of people who participated in the training course by the total number of people. The designated chart modes include, but are not limited to, pie charts, histograms and line charts. The service indexes of concern to the business that are stored in the configuration library are analyzed, the attribute value corresponding to each service index is obtained by a program, and the attribute values are output to a page in forms such as a pie chart, histogram or line chart; the display form can be dynamically configured according to each user's preference.
A front-end page is developed to display the dimension data as trend graphs, pie charts, line charts and the like, and to provide a download function, so that the dimension data obtained through analysis can be displayed and reported directly.
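The training course coverage rate example and the choice of chart mode can be sketched in Python as follows. The function names and the dictionary layout of the chart specification are hypothetical; actual page rendering is not modeled.

```python
from typing import Dict, Union

ALLOWED_CHART_MODES = {"pie", "histogram", "line"}  # the modes named in the text


def coverage_rate(participants: int, headcount: int) -> float:
    """Attribute value for the training course coverage rate dimension item."""
    if headcount <= 0:
        raise ValueError("headcount must be positive")
    return participants / headcount


def chart_spec(dimension: str, value: float, chart_mode: str = "pie") -> Dict[str, Union[str, float]]:
    """Bundle an attribute value with the chart mode taken from the task configuration."""
    if chart_mode not in ALLOWED_CHART_MODES:
        raise ValueError(f"unsupported chart mode: {chart_mode}")
    return {"dimension": dimension, "value": value, "chart": chart_mode}


if __name__ == "__main__":
    rate = coverage_rate(participants=45, headcount=60)
    print(chart_spec("training course coverage rate", rate, chart_mode="line"))
```

Keeping the chart mode as data rather than code mirrors the text's point that the display form is configured per user instead of being fixed in the program.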
Further, after the step S51 of enabling multithreading to process the data of each index dimension in the dimension database to obtain the attribute value corresponding to each index dimension, the method includes:
S511: judging whether a comprehensive index corresponding to each index dimension exists, wherein the comprehensive index is obtained by combining a specified number of index dimensions;
S512: if so, inputting the attribute values corresponding to the index dimensions into a calculation component for comprehensive calculation to obtain the comprehensive attribute value corresponding to the comprehensive index;
S513: forming a billboard graph from the comprehensive attribute values in the designated chart mode.
In the embodiment of the application, for a comprehensive index formed from several basic index dimensions, the calculation component analyzes and calculates the dimension data again to obtain the comprehensive attribute value corresponding to the comprehensive index. The calculation functions in the calculation component include, but are not limited to, weighted accumulation and proportional calculation, which widens the application range of the data analysis.
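One possible form of the weighted calculation is a weighted average of the attribute values of the constituent index dimensions. The Python sketch below assumes that form; the text does not specify the exact calculation, and the function name `comprehensive_value` is hypothetical.

```python
from typing import Dict


def comprehensive_value(attribute_values: Dict[str, float],
                        weights: Dict[str, float]) -> float:
    """Weighted average of the attribute values of the combined index dimensions."""
    missing = set(attribute_values) - set(weights)
    if missing:
        raise KeyError(f"no weight configured for: {sorted(missing)}")
    total_weight = sum(weights[name] for name in attribute_values)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    weighted_sum = sum(value * weights[name] for name, value in attribute_values.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    values = {"training coverage": 0.8, "turnover rate": 0.4}
    print(comprehensive_value(values, {"training coverage": 3.0, "turnover rate": 1.0}))
```

Dividing by the total weight keeps the result on the same scale as the inputs; plain weighted accumulation would omit that division.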
Further, before the step S4 of obtaining an index dimension corresponding to the service index selected by the current user in the configuration library, the method includes:
S41: acquiring the service indexes currently selected by the user and the setting weight corresponding to each service index;
S42: inputting the currently selected service indexes into a configuration library;
S43: associating each service index one-to-one with its setting weight in the configuration library to form the configured index dimensions, and storing the configured index dimensions in the configuration library.
According to the embodiment of the application, the configuration library is formed from the service indexes reported by the service lines, by analyzing the index sources recorded during data synchronization. For example, the service indexes fed back by the service lines number nearly 200; they are input into a configuration library, the service indexes of primary concern are obtained through program ranking, and a weight is set for each service index to form the final configuration library. Business personnel can dynamically adjust the weights of the service indexes through a page to determine the index dimensions corresponding to the selected service indexes during data analysis, thereby setting the processing target for subsequent data processing and data display.
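A configuration library of this kind can be pictured as a mapping from service index to weight that supports ranking and adjustment. The Python sketch below is a minimal in-memory illustration; the class `ConfigLibrary` and its method names are hypothetical, and persistence and the configuration page are not modeled.

```python
from typing import Dict, List


class ConfigLibrary:
    """In-memory stand-in for the configuration library of weighted service indexes."""

    def __init__(self) -> None:
        self._weights: Dict[str, float] = {}

    def register(self, index: str, weight: float) -> None:
        """S42/S43: store a service index together with its setting weight."""
        if weight < 0:
            raise ValueError("weight must be non-negative")
        self._weights[index] = weight

    def set_weight(self, index: str, weight: float) -> None:
        """Dynamic adjustment of the weight of an already registered index."""
        if index not in self._weights:
            raise KeyError(index)
        self.register(index, weight)

    def top(self, k: int) -> List[str]:
        """Service indexes of primary concern: the k indexes with the highest weight."""
        ranked = sorted(self._weights.items(), key=lambda item: item[1], reverse=True)
        return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    library = ConfigLibrary()
    library.register("turnover rate", 0.5)
    library.register("training coverage", 0.3)
    library.set_weight("training coverage", 0.9)
    print(library.top(1))
```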
According to the present application, for the incremental data newly added each day, the target data are obtained through daily data cleaning and data processing. Because the service data in the synchronized aggregated source database cannot be used directly by the corresponding service systems, service index analysis and service logic calculation can be supported only after the data are cleaned.
Referring to fig. 2, an apparatus for generating a data analysis billboard according to an embodiment of the present application includes:
a determining module 1, configured to determine, according to the service indexes selected by the current user, the database to which the data corresponding to each service index belongs;
a calling module 2, configured to call a data synchronization script and acquire the data corresponding to each service index from each database, to form a collection source database;
a cleaning module 3, configured to clean the data in the collection source database according to the correlation coefficient with each service index, to obtain a target database;
a first obtaining module 4, configured to obtain the index dimension corresponding to the service index selected by the current user in the configuration library;
and a forming module 5, configured to form the data of the target database into a billboard view corresponding to the index dimension according to the index dimension.
Further, the cleaning module 3 includes:
a first acquisition unit, configured to acquire data corresponding to a specified index, where the specified index is any one of the service indexes;
a first calculation unit, configured to perform a ranking calculation on the data corresponding to the specified index through a ranking calculation component, to obtain the correlation coefficient between each piece of data and the specified index;
a first forming unit, configured to form a data queue corresponding to the specified index in descending order of correlation coefficient;
a second forming unit, configured to store the data in the data queue whose correlation coefficient is greater than a preset threshold into the target database, to form the target data corresponding to the specified index;
and a generating unit, configured to generate the target data corresponding to each service index according to the generation process of the target data corresponding to the specified index, to obtain the target database after the collection source database is cleaned.
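The cleaning flow handled by these units (score, queue from high to low, keep what exceeds the threshold, repeat per service index) can be sketched roughly as below. The scoring function is passed in as a parameter because the application computes it elsewhere; the function names, the record layout, and the threshold value are assumptions for illustration.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


def clean_for_index(records: List[Record],
                    score: Callable[[Record], float],
                    threshold: float) -> List[Record]:
    """Return the target data for one specified index."""
    # Compute a correlation coefficient for every record.
    scored = [(score(record), record) for record in records]
    # Form the data queue from high to low (stable for equal coefficients).
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep only records whose coefficient is strictly above the threshold.
    return [record for coefficient, record in scored if coefficient > threshold]


def clean_source(source: Dict[str, List[Record]],
                 score_for: Callable[[str, Record], float],
                 threshold: float) -> Dict[str, List[Record]]:
    """Repeat the same process for every service index."""
    return {
        index_name: clean_for_index(
            records, lambda r, name=index_name: score_for(name, r), threshold)
        for index_name, records in source.items()
    }
```

The returned mapping stands in for the target database, keyed by service index.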
Further, the first calculation unit includes:
an extraction subunit, configured to extract the index feature vector of the specified index and the data feature vectors respectively corresponding to the data related to the specified index;
a calculating subunit, configured to calculate the cosine distance between the index feature vector and each data feature vector according to a specified calculation formula, where the specified calculation formula is
Figure BDA0002763248360000101
where m represents the index feature vector, n represents a data feature vector, n_i represents the i-th data feature vector, P represents the total number of data feature vectors and is a positive integer, θ represents the included angle between the index feature vector and a data feature vector, and i represents the serial number of the data feature vector;
and a subunit, configured to take the cosine distance as the correlation coefficient between the data and the specified index.
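The formula itself is an image that is not reproduced in this text. As a hedged sketch, the code below assumes it is the standard cosine of the included angle, cos θ = (m · n_i) / (|m| |n_i|) for i = 1, ..., P, which matches the symbol definitions given above. The function names are hypothetical, and returning 0.0 for a zero-length vector is a choice made only for this example.

```python
import math
from typing import List, Sequence


def cosine(m: Sequence[float], n: Sequence[float]) -> float:
    """Cosine of the included angle between two feature vectors."""
    if len(m) != len(n):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(m, n))
    norm_m = math.sqrt(sum(a * a for a in m))
    norm_n = math.sqrt(sum(b * b for b in n))
    if norm_m == 0.0 or norm_n == 0.0:
        return 0.0  # assumption: treat a zero vector as uncorrelated
    return dot / (norm_m * norm_n)


def correlation_coefficients(m: Sequence[float],
                             data_vectors: Sequence[Sequence[float]]) -> List[float]:
    """One coefficient per data feature vector n_i, for i = 1..P."""
    return [cosine(m, n_i) for n_i in data_vectors]
```

Under this assumption a larger value means the data are more closely aligned with the specified index, which is consistent with keeping only data whose coefficient exceeds the preset threshold.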
Further, the cleaning module 3 includes:
a pushing unit, configured to push the target data corresponding to each service index to the corresponding memory grid node;
a calling unit, configured to call the pre-configured computing components in each memory grid node in parallel, and start multiple threads to analyze the target data corresponding to each service index according to dimension items, to obtain the dimension data respectively corresponding to each dimension item;
and a summarizing unit, configured to summarize the dimension data corresponding to each service index returned by each memory grid node, to obtain a dimension database.
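The push, parallel analysis, and summarizing steps can be approximated on a single machine as follows. This is only a sketch: worker threads from Python's `concurrent.futures.ThreadPoolExecutor` stand in for the memory grid nodes, grouping rows by one field stands in for the pre-configured computing component, and the names `analyze_by_dimension`, `build_dimension_database`, and `dimension_key` are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

Row = Dict[str, object]


def analyze_by_dimension(rows: List[Row], dimension_key: str) -> Dict[str, List[Row]]:
    """Stand-in computing component: group rows by their dimension item."""
    grouped: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        grouped[str(row[dimension_key])].append(row)
    return dict(grouped)


def build_dimension_database(target_data: Dict[str, List[Row]],
                             dimension_key: str,
                             max_workers: int = 4) -> Dict[str, Dict[str, List[Row]]]:
    """Analyze each service index in its own task, then summarize the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # "Push" the target data of each service index to a worker.
        futures = {
            index_name: pool.submit(analyze_by_dimension, rows, dimension_key)
            for index_name, rows in target_data.items()
        }
        # Summarize the dimension data returned by every worker.
        return {index_name: future.result()
                for index_name, future in futures.items()}
```

In a real deployment the workers would be separate grid nodes holding the data in memory; the summarizing step would look the same.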
Further, the forming module 5 includes:
a starting unit, configured to start multiple threads to process the data of each index dimension in the dimension database, to obtain the attribute value corresponding to each index dimension;
a second obtaining unit, configured to obtain the configuration attributes of the corresponding timing task on the timing scheduling platform at the current moment, where the configuration attributes include a designated chart mode for displaying the attribute values respectively corresponding to the index dimensions;
and a third forming unit, configured to form the billboard view in the designated chart mode according to the configuration attributes of the timing task and the attribute values respectively corresponding to the index dimensions.
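A rough sketch of these three units follows. It assumes that the attribute value of an index dimension is the sum of its numeric data and that the timing task's configuration attributes arrive as a dictionary with a `chart_mode` key; both are assumptions for illustration, as are the function names. The output is a plain dictionary that a charting front end could render.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def attribute_values(dimension_db: Dict[str, List[float]]) -> Dict[str, float]:
    """Process each index dimension in a worker thread (here: sum its data)."""
    with ThreadPoolExecutor() as pool:
        totals = pool.map(sum, dimension_db.values())
        return dict(zip(dimension_db.keys(), totals))


def form_billboard_view(values: Dict[str, float],
                        task_config: Dict[str, str]) -> Dict[str, object]:
    """Combine attribute values with the designated chart mode."""
    return {
        "chart_mode": task_config["chart_mode"],
        "series": [{"dimension": name, "value": value}
                   for name, value in values.items()],
    }
```

For example, calling `form_billboard_view(attribute_values(db), {"chart_mode": "bar"})` yields one series entry per index dimension, tagged with the chart mode read from the task configuration.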
Further, the forming module 5 includes:
a judging unit, configured to judge whether a comprehensive index corresponding to the index dimensions exists, where the comprehensive index is obtained by combining a specified number of index dimensions;
a second calculation unit, configured to input the attribute values corresponding to the index dimensions into a calculation component for comprehensive calculation, to obtain the comprehensive attribute value corresponding to the comprehensive index;
and a fourth forming unit, configured to form the billboard view from the comprehensive attribute value in the designated chart mode.
Further, the apparatus for generating a data analysis billboard includes:
a second acquisition module, configured to acquire the service indexes currently selected by the user and the set weight corresponding to each service index;
an input module, configured to input the currently selected service indexes into the configuration library;
and an association module, configured to associate each service index one-to-one with its set weight in the configuration library to form the configured index dimensions, and to store the configured index dimensions in the configuration library.
For explanations of the apparatus embodiments, refer to the corresponding method embodiments; they are not repeated here.
Referring to fig. 3, an embodiment of the present application further provides a computer device, which may be a server and whose internal structure may be as shown in fig. 3. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores all the data required by the process of generating the data analysis billboard. The network interface of the computer device communicates with an external terminal through a network connection. The computer program, when executed by the processor, implements a method of generating a data analysis billboard.
The method for generating the data analysis billboard executed by the processor includes: determining, according to the service indexes selected by the current user, the database to which the data corresponding to each service index belongs; calling a data synchronization script and acquiring the data corresponding to each service index from each database, to form a collection source database; cleaning the data in the collection source database according to the correlation coefficient with each service index, to obtain a target database; obtaining the index dimension corresponding to the service index selected by the current user in a configuration library; and forming the data of the target database into a billboard view corresponding to the index dimension according to the index dimension.
With the above computer device, a data synchronization script is developed to acquire and collect the related data from different source databases; the data are then cleaned according to their correlation with the service indexes to obtain the target data; and the target data are formed into billboard views corresponding to the dimension items according to the index dimensions selected by the user, so that the dimension data are displayed visually.
In an embodiment, the step in which the processor cleans the data in the collection source database according to the correlation coefficient with each service index to obtain the target database includes: acquiring data corresponding to a specified index, where the specified index is any one of the service indexes; performing a ranking calculation on the data corresponding to the specified index through a ranking calculation component, to obtain the correlation coefficient between each piece of data and the specified index; forming a data queue corresponding to the specified index in descending order of correlation coefficient; storing the data in the data queue whose correlation coefficient is greater than a preset threshold into the target database, to form the target data corresponding to the specified index; and generating the target data corresponding to each service index according to the generation process of the target data corresponding to the specified index, to obtain the target database after the collection source database is cleaned.
In an embodiment, the step in which the processor performs a ranking calculation on the data related to the specified index through a ranking calculation component to obtain the correlation coefficient between the data and the specified index includes: extracting the index feature vector of the specified index and the data feature vectors respectively corresponding to the data related to the specified index; and calculating the cosine distance between the index feature vector and each data feature vector according to a specified calculation formula, where the specified calculation formula is
Figure BDA0002763248360000121
where m represents the index feature vector, n represents a data feature vector, n_i represents the i-th data feature vector, P represents the total number of data feature vectors and is a positive integer, θ represents the included angle between the index feature vector and a data feature vector, and i represents the serial number of the data feature vector; and taking the cosine distance as the correlation coefficient between the data and the specified index.
In an embodiment, after the step in which the processor generates the target data corresponding to each service index according to the generation process of the target data corresponding to the specified index to obtain the target database after the collection source database is cleaned, the method includes: pushing the target data corresponding to each service index to the corresponding memory grid node; calling the pre-configured computing components in each memory grid node in parallel, and starting multiple threads to analyze the target data corresponding to each service index according to dimension items, to obtain the dimension data respectively corresponding to each dimension item; and summarizing the dimension data corresponding to each service index returned by each memory grid node, to obtain a dimension database.
In an embodiment, the step in which the processor forms the data of the target database into a billboard view corresponding to the index dimension according to the index dimension includes: starting multiple threads to process the data of each index dimension in the dimension database, to obtain the attribute value corresponding to each index dimension; obtaining the configuration attributes of the corresponding timing task on the timing scheduling platform at the current moment, where the configuration attributes include a designated chart mode for displaying the attribute values respectively corresponding to the index dimensions; and forming the billboard view in the designated chart mode according to the configuration attributes of the timing task and the attribute values respectively corresponding to the index dimensions.
In an embodiment, after the step in which the processor starts multiple threads to process the data of each index dimension in the dimension database to obtain the attribute value corresponding to each index dimension, the method includes: judging whether a comprehensive index corresponding to the index dimensions exists, where the comprehensive index is obtained by combining a specified number of index dimensions; inputting the attribute values corresponding to the index dimensions into a calculation component for comprehensive calculation, to obtain the comprehensive attribute value corresponding to the comprehensive index; and forming the billboard view from the comprehensive attribute value in the designated chart mode.
In an embodiment, before the step in which the processor obtains the index dimension corresponding to the service index selected by the current user in the configuration library, the method includes: acquiring the service indexes currently selected by the user and the set weight corresponding to each service index; inputting the currently selected service indexes into the configuration library; and associating each service index one-to-one with its set weight in the configuration library to form the configured index dimensions, and storing the configured index dimensions in the configuration library.
Those skilled in the art will appreciate that the architecture shown in fig. 3 is only a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects may be applied.
An embodiment of the present application further provides a computer-readable storage medium having a computer program stored thereon; when executed by a processor, the computer program implements a method for generating a data analysis billboard, including: determining, according to the service indexes selected by the current user, the database to which the data corresponding to each service index belongs; calling a data synchronization script and acquiring the data corresponding to each service index from each database, to form a collection source database; cleaning the data in the collection source database according to the correlation coefficient with each service index, to obtain a target database; obtaining the index dimension corresponding to the service index selected by the current user in a configuration library; and forming the data of the target database into a billboard view corresponding to the index dimension according to the index dimension.
With the above computer-readable storage medium, a data synchronization script is developed to acquire and collect the related data from different source databases; the data are then cleaned according to their correlation with the service indexes to obtain the target data; and the target data are formed into billboard views corresponding to the dimension items according to the index dimensions selected by the user, so that the dimension data are displayed visually.
In an embodiment, the step in which the processor cleans the data in the collection source database according to the correlation coefficient with each service index to obtain the target database includes: acquiring data corresponding to a specified index, where the specified index is any one of the service indexes; performing a ranking calculation on the data corresponding to the specified index through a ranking calculation component, to obtain the correlation coefficient between each piece of data and the specified index; forming a data queue corresponding to the specified index in descending order of correlation coefficient; storing the data in the data queue whose correlation coefficient is greater than a preset threshold into the target database, to form the target data corresponding to the specified index; and generating the target data corresponding to each service index according to the generation process of the target data corresponding to the specified index, to obtain the target database after the collection source database is cleaned.
In an embodiment, the step in which the processor performs a ranking calculation on the data related to the specified index through a ranking calculation component to obtain the correlation coefficient between the data and the specified index includes: extracting the index feature vector of the specified index and the data feature vectors respectively corresponding to the data related to the specified index; and calculating the cosine distance between the index feature vector and each data feature vector according to a specified calculation formula, where the specified calculation formula is
Figure BDA0002763248360000141
where m represents the index feature vector, n represents a data feature vector, n_i represents the i-th data feature vector, P represents the total number of data feature vectors and is a positive integer, θ represents the included angle between the index feature vector and a data feature vector, and i represents the serial number of the data feature vector; and taking the cosine distance as the correlation coefficient between the data and the specified index.
In an embodiment, after the step in which the processor generates the target data corresponding to each service index according to the generation process of the target data corresponding to the specified index to obtain the target database after the collection source database is cleaned, the method includes: pushing the target data corresponding to each service index to the corresponding memory grid node; calling the pre-configured computing components in each memory grid node in parallel, and starting multiple threads to analyze the target data corresponding to each service index according to dimension items, to obtain the dimension data respectively corresponding to each dimension item; and summarizing the dimension data corresponding to each service index returned by each memory grid node, to obtain a dimension database.
In an embodiment, the step in which the processor forms the data of the target database into a billboard view corresponding to the index dimension according to the index dimension includes: starting multiple threads to process the data of each index dimension in the dimension database, to obtain the attribute value corresponding to each index dimension; obtaining the configuration attributes of the corresponding timing task on the timing scheduling platform at the current moment, where the configuration attributes include a designated chart mode for displaying the attribute values respectively corresponding to the index dimensions; and forming the billboard view in the designated chart mode according to the configuration attributes of the timing task and the attribute values respectively corresponding to the index dimensions.
In an embodiment, after the step in which the processor starts multiple threads to process the data of each index dimension in the dimension database to obtain the attribute value corresponding to each index dimension, the method includes: judging whether a comprehensive index corresponding to the index dimensions exists, where the comprehensive index is obtained by combining a specified number of index dimensions; inputting the attribute values corresponding to the index dimensions into a calculation component for comprehensive calculation, to obtain the comprehensive attribute value corresponding to the comprehensive index; and forming the billboard view from the comprehensive attribute value in the designated chart mode.
In an embodiment, before the step in which the processor obtains the index dimension corresponding to the service index selected by the current user in the configuration library, the method includes: acquiring the service indexes currently selected by the user and the set weight corresponding to each service index; inputting the currently selected service indexes into the configuration library; and associating each service index one-to-one with its set weight in the configuration library to form the configured index dimensions, and storing the configured index dimensions in the configuration library.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the above method embodiments. Any reference to memory, storage, database, or other medium provided herein and used in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, apparatus, article, or method that includes the element.
The above description is only a preferred embodiment of the present application, and not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings of the present application, or which are directly or indirectly applied to other related technical fields, are also included in the scope of the present application.

Claims (10)

1. A method of generating a data analysis billboard, comprising:
determining, according to the service indexes selected by the current user, the database to which the data corresponding to each service index belongs;
calling a data synchronization script, and acquiring the data corresponding to each service index from each database to form a collection source database;
cleaning the data in the collection source database according to the correlation coefficient with each service index, to obtain a target database;
obtaining the index dimension corresponding to the service index selected by the current user in a configuration library;
and forming the data of the target database into a billboard view corresponding to the index dimension according to the index dimension.
2. The method of claim 1, wherein the step of cleaning the data in the collection source database according to the correlation coefficient with each service index to obtain the target database comprises:
acquiring data corresponding to a specified index, wherein the specified index is any one of the service indexes;
performing a ranking calculation on the data corresponding to the specified index through a ranking calculation component, to obtain the correlation coefficient between each piece of data and the specified index;
forming a data queue corresponding to the specified index in descending order of correlation coefficient;
storing the data in the data queue whose correlation coefficient is greater than a preset threshold into the target database, to form the target data corresponding to the specified index;
and generating the target data corresponding to each service index according to the generation process of the target data corresponding to the specified index, to obtain the target database after the collection source database is cleaned.
3. The method of claim 2, wherein the step of performing a ranking calculation on the data related to the specified index by a ranking calculation component to obtain a correlation coefficient between the data and the specified index comprises:
extracting an index feature vector of the specified index and data feature vectors respectively corresponding to data related to the specified index;
calculating the cosine distance between the index feature vector and each data feature vector according to a specified calculation formula, wherein the specified calculation formula is
Figure FDA0002763248350000021
wherein m represents the index feature vector, n represents a data feature vector, n_i represents the i-th data feature vector, P represents the total number of data feature vectors and is a positive integer, θ represents the included angle between the index feature vector and a data feature vector, and i represents the serial number of the data feature vector;
and taking the cosine distance as the correlation coefficient between the data and the specified index.
4. The method for generating a data analysis billboard according to claim 2, wherein, after the step of generating the target data corresponding to each service index according to the generation process of the target data corresponding to the specified index to obtain the target database after the collection source database is cleaned, the method comprises:
pushing the target data corresponding to each service index to the corresponding memory grid node;
calling the pre-configured computing components in each memory grid node in parallel, and starting multiple threads to analyze the target data corresponding to each service index according to dimension items, to obtain the dimension data respectively corresponding to each dimension item;
and summarizing the dimension data corresponding to each service index returned by each memory grid node, to obtain a dimension database.
5. The method of generating a data analysis billboard of claim 1, wherein the step of forming a billboard view corresponding to the index dimension from the data of the target database according to the index dimension comprises:
enabling multiple threads to process data of each index dimension in the dimension database to obtain attribute values corresponding to each index dimension;
acquiring configuration attributes of a corresponding timing task on a timing scheduling platform at the current moment, wherein the configuration attributes comprise a designated chart mode for displaying attribute values corresponding to the index dimensions respectively;
and forming the billboard view in the designated chart mode according to the configuration attributes of the timing task and the attribute values respectively corresponding to the index dimensions.
6. The method of claim 5, wherein the step of enabling multithreading to process data of each index dimension in the dimension database to obtain the attribute value corresponding to each index dimension is followed by the step of:
judging whether a comprehensive index corresponding to each index dimension exists or not, wherein the comprehensive index is obtained by combining a specified number of index dimensions;
inputting the attribute values corresponding to the index dimensions into a calculation component for comprehensive calculation to obtain comprehensive attribute values corresponding to the comprehensive indexes;
and forming the billboard view from the comprehensive attribute value in the designated chart mode.
7. The method of claim 1, wherein, before the step of obtaining the index dimension corresponding to the service index selected by the current user in the configuration library, the method comprises:
acquiring the service indexes currently selected by the user and the set weight corresponding to each service index;
inputting the currently selected service indexes into the configuration library;
and associating each service index one-to-one with its set weight in the configuration library to form the configured index dimensions, and storing the configured index dimensions in the configuration library.
8. An apparatus for generating a data analysis billboard, comprising:
a determining module, configured to determine, according to the service indexes selected by the current user, the database to which the data corresponding to each service index belongs;
a calling module, configured to call a data synchronization script and acquire the data corresponding to each service index from each database, to form a collection source database;
a cleaning module, configured to clean the data in the collection source database according to the correlation coefficient with each service index, to obtain a target database;
a first obtaining module, configured to obtain the index dimension corresponding to the service index selected by the current user in the configuration library;
and a forming module, configured to form the data of the target database into a billboard view corresponding to the index dimension according to the index dimension.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202011224649.3A 2020-11-05 2020-11-05 Method, device and computer equipment for generating data analysis billboard Active CN112347092B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011224649.3A CN112347092B (en) 2020-11-05 2020-11-05 Method, device and computer equipment for generating data analysis billboard

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011224649.3A CN112347092B (en) 2020-11-05 2020-11-05 Method, device and computer equipment for generating data analysis billboard

Publications (2)

Publication Number Publication Date
CN112347092A true CN112347092A (en) 2021-02-09
CN112347092B CN112347092B (en) 2023-07-18

Family

ID=74428829

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011224649.3A Active CN112347092B (en) 2020-11-05 2020-11-05 Method, device and computer equipment for generating data analysis billboard

Country Status (1)

Country Link
CN (1) CN112347092B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115759019A (en) * 2022-11-15 2023-03-07 广州天维信息技术股份有限公司 Business data calculation method and device, storage medium and computer equipment
CN116383299A (en) * 2023-03-31 2023-07-04 国任财产保险股份有限公司 Data display system based on distributed database

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110078108A1 (en) * 2009-09-29 2011-03-31 Oracle International Corporation Agentless data collection
CN110069519A (en) * 2018-08-23 2019-07-30 平安科技(深圳)有限公司 Data information management method, apparatus, computer equipment and storage medium
CN110109978A (en) * 2019-05-16 2019-08-09 深圳前海微众银行股份有限公司 Data analysing method, device, server and readable storage medium storing program for executing based on index
CN111368089A (en) * 2018-12-25 2020-07-03 中国移动通信集团浙江有限公司 Service processing method and device based on knowledge graph

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115759019A (en) * 2022-11-15 2023-03-07 广州天维信息技术股份有限公司 Business data calculation method and device, storage medium and computer equipment
CN115759019B (en) * 2022-11-15 2023-10-20 广州天维信息技术股份有限公司 Service data calculation method, device, storage medium and computer equipment
CN116383299A (en) * 2023-03-31 2023-07-04 国任财产保险股份有限公司 Data display system based on distributed database

Also Published As

Publication number Publication date
CN112347092B (en) 2023-07-18

Similar Documents

Publication Publication Date Title
JP4980395B2 (en) Data analysis system and method
US10074079B2 (en) Systems and methods for automated analysis, screening and reporting of group performance
CN111209731B (en) Assessment quantization value acquisition method, device, equipment and readable storage medium
Eslami et al. Estimating most productive scale size with imprecise-chance constrained input–output orientation model in data envelopment analysis
CN112347092A (en) Method and device for generating data analysis billboard and computer equipment
CN113393060A (en) Task allocation method and device, electronic equipment and storage medium
CN113268403B (en) Time series analysis and prediction method, device, equipment and storage medium
CN112990646A (en) Performance assessment and evaluation method for workers
CN115641019A (en) Index anomaly analysis method and device, computer equipment and storage medium
JP2023029604A (en) Apparatus and method for processing patent information, and program
US9875443B2 (en) Unified attractiveness prediction framework based on content impact factor
CN116596284B (en) Travel decision management method and system based on customer requirements
EP3764310A1 (en) Prediction task assistance device and prediction task assistance method
CN112631889A (en) Portrayal method, device and equipment for application system and readable storage medium
Herbst Methods and benchmarks for auto-scaling mechanisms in elastic cloud environments
Dong et al. Recalculating the agricultural labor force in china
JP2022531480A (en) Visit prediction
EP3283932A1 (en) Requirements determination
Chen et al. A Web‐based distributed system for hurricane occurrence projection
US11893069B2 (en) Platform, method, and system for a search engine of time series data
CN115018473A (en) Service processing method, device, storage medium and equipment
CN114997813A (en) Flow chart generation method, device, equipment and storage medium
CN113128739B (en) Prediction method of user touch time, prediction model training method and related devices
CN114610308A (en) Application function layout adjusting method and device, electronic equipment and storage medium
Avdeenko et al. Modeling information space for decision-making in the interaction of higher education system with regional labor market

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant