CN114254918A - Index data calculation method and device, readable medium and electronic equipment - Google Patents

Index data calculation method and device, readable medium and electronic equipment

Publication number
CN114254918A
Authority
CN
China
Prior art keywords
index
calculated
data
current
calculation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111563635.9A
Other languages
Chinese (zh)
Inventor
张兴
李远照
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Securities Co Ltd
Original Assignee
Ping An Securities Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Securities Co Ltd filed Critical Ping An Securities Co Ltd
Priority to CN202111563635.9A priority Critical patent/CN114254918A/en
Publication of CN114254918A publication Critical patent/CN114254918A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/215Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2291User-Defined Types; Storage management thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Educational Administration (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Software Systems (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a method, a device, a readable medium and electronic equipment for calculating index data, wherein the method comprises the following steps: preprocessing the original data to store the original data in a form of a set data structure; classifying the indexes to be calculated, and generating a reusable component for each type of indexes to be calculated; and acquiring original data required by the index to be calculated from the set data structure based on the reusable component, and calculating to obtain a calculation result of the index to be calculated. According to the technical scheme provided by the embodiment of the application, the original data can be acquired more conveniently and rapidly in the index calculation process, the access of repeated data is avoided, the index calculation efficiency is improved, the memory occupation in the index calculation process is reduced, and the calculation resources are saved.

Description

Index data calculation method and device, readable medium and electronic equipment
Technical Field
The application belongs to the technical field of computers, and particularly relates to a method and a device for calculating index data, a readable medium and electronic equipment.
Background
With the development of computer technology, data on the internet has become increasingly abundant and big data analysis increasingly popular. Index calculation is an important part of big data analysis and an important application of business intelligence in the internet era. In a traditional index calculation method, thousands of data records are fetched from a database and accumulated each time index data is calculated. In a computer, an object must be constructed for every calculation, and each object occupies a certain amount of memory. A calculation mode that creates a new object for every calculation, as the traditional index calculation method does, therefore consumes a large amount of memory and wastes computing resources.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present application and therefore may include information that does not constitute prior art known to a person of ordinary skill in the art.
Disclosure of Invention
The application aims to provide a method and a device for calculating index data, a readable medium and electronic equipment, so as to solve the problem that computer memory resources are wasted in the index calculation process in the related art.
Other features and advantages of the present application will be apparent from the following detailed description, or may be learned by practice of the application.
According to an aspect of an embodiment of the present application, there is provided a method for calculating index data, including:
preprocessing original data to store the original data in a form of a set data structure;
classifying the indexes to be calculated, and generating a reusable component for each type of indexes to be calculated;
and acquiring target data required by the index to be calculated from the original data of the set data structure based on the reusable component, and calculating to obtain a calculation result of the index to be calculated.
According to an aspect of an embodiment of the present application, there is provided an index data calculation apparatus including:
the data preprocessing module is used for preprocessing original data so as to store the original data in a set data structure form;
the index classification module is used for classifying the indexes to be calculated and generating reusable components aiming at each type of indexes to be calculated;
and the index calculation module is used for acquiring target data required by the index to be calculated from the original data of the set data structure based on the reusable component to calculate so as to obtain a calculation result of the index to be calculated.
In one embodiment of the present application, the data preprocessing module includes:
the arrangement sequence determining unit is used for traversing all the indexes to be calculated and determining the arrangement sequence of the index periods of the indexes to be calculated, wherein indexes with smaller index periods are arranged first and indexes with larger index periods are arranged later;
and the data storage unit is used for sequentially storing the original data corresponding to each index to be calculated in a form of a set data structure according to the arrangement sequence.
In one embodiment of the present application, the set data structure is a key-value pair; the data storage unit is specifically configured to:
storing the name field of the original data as a key, storing the numerical value of the original data as a value, and forming a key-value pair based on the key and the value so as to store the original data in the form of the key-value pair.
In an embodiment of the present application, the index classification module is specifically configured to:
if the index to be calculated has a first identifier, determining the index to be calculated as a first type of index, and generating a first reusable component aiming at the first type of index;
if the index to be calculated has a second identifier, determining the index to be calculated as a second type of index, and generating a second reusable component aiming at the second type of index;
and if the index to be calculated does not have the identifier, determining the index to be calculated as a third type of index, and generating a third reusable component aiming at the third type of index.
In one embodiment of the present application, the index calculation module includes:
the object generation unit is used for determining the current reusable component according to the type of the current index to be calculated;
the data writing unit is used for acquiring current target data required by the current index to be calculated from the set data structure and writing the current target data into the current reusable component;
and the index calculation unit is used for performing index calculation on the basis of the current reusable component written in the current target data to obtain a calculation result of the current index to be calculated.
In an embodiment of the present application, the index calculating unit is specifically configured to:
if the current index to be calculated is a first-type index or a second-type index, performing incremental calculation on current target data written into the current reusable component;
and if the current index to be calculated is the third-class index, performing full calculation on the current target data written into the current reusable component.
In an embodiment of the present application, the index calculation module further includes:
and the data clearing unit is used for deleting the data in the current reusable component if the data exists in the current reusable component.
According to an aspect of the embodiments of the present application, there is provided a computer-readable medium on which a computer program is stored, the computer program, when executed by a processor, implementing a method of calculating index data as in the above technical solutions.
According to an aspect of an embodiment of the present application, there is provided an electronic apparatus including: a processor; and a memory for storing executable instructions of the processor; wherein, the processor executes the executable instructions to make the electronic device execute the calculation method of the index data in the above technical scheme.
According to an aspect of embodiments herein, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions, so that the computer device executes the method for calculating the index data according to the above technical solution.
According to the technical scheme provided by the embodiment of the application, the original data are stored in a set data structure form by preprocessing the original data, so that the original data can be acquired more conveniently and quickly in the index calculation process, access of repeated data is avoided, and the index calculation efficiency is improved; and moreover, the indexes to be calculated are classified, and the reusable components are generated for each type of indexes to be calculated, so that the memory occupation in the index calculation process is reduced, and the calculation resources are saved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application. It is obvious that the drawings in the following description are only some embodiments of the application, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
Fig. 1 schematically shows a block diagram of an exemplary system architecture to which the solution of the present application applies.
Fig. 2 schematically shows a flowchart of a calculation method of index data according to an embodiment of the present application.
Fig. 3 schematically shows a flowchart of a method for calculating index data according to an embodiment of the present application.
Fig. 4 schematically shows a block diagram of a computing device for index data provided in an embodiment of the present application.
FIG. 5 schematically illustrates a block diagram of a computer system suitable for use in implementing an electronic device of an embodiment of the present application.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the application. One skilled in the relevant art will recognize, however, that the subject matter of the present application can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the application.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
Fig. 1 schematically shows a block diagram of an exemplary system architecture to which the solution of the present application applies.
As shown in fig. 1, system architecture 100 may include a terminal device 110, a network 120, and a server 130. The terminal device 110 may include various electronic devices such as a smart phone, a tablet computer, a notebook computer, and a desktop computer. The server 130 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services. Network 120 may be a communication medium of various connection types capable of providing a communication link between terminal device 110 and server 130, such as a wired communication link or a wireless communication link.
The system architecture in the embodiments of the present application may have any number of terminal devices, networks, and servers, according to implementation needs. For example, the server 130 may be a server group composed of a plurality of server devices. In addition, the technical solution provided in the embodiment of the present application may be applied to the terminal device 110, or may be applied to the server 130, or may be implemented by both the terminal device 110 and the server 130, which is not particularly limited in this application.
For example, the method for calculating the index data provided in the embodiment of the present application is implemented by the server 130. Specifically, the server 130 preprocesses the raw data to store the raw data in a form of a set data structure, wherein the raw data can be collected by the terminal device 110 and stored in the database. The server 130 then classifies the metrics to be calculated and generates reusable components for each class of metrics to be calculated. Finally, the server 130 obtains the original data required by the index to be calculated from the set data structure based on the reusable component, and calculates the original data to obtain the calculation result of the index to be calculated.
In an embodiment of the present application, after obtaining the calculation result of the index to be calculated, the server 130 may send the calculation result to the terminal device 110, so that a user can obtain the calculation result of the index through the terminal device 110.
In an embodiment of the present application, the method for calculating the index data provided in the embodiment of the present application may also be implemented by the terminal device 110. Specifically, the terminal device 110 collects raw data and preprocesses the raw data to store the raw data in a form of a set data structure. Then, the terminal device 110 classifies the to-be-calculated indexes and generates a reusable component for each type of to-be-calculated indexes. Finally, the terminal device 110 obtains the original data required by the index to be calculated from the set data structure based on the reusable component, calculates the calculation result of the index to be calculated, and displays the calculation result on the graphical user interface, so that the user can know the calculation result.
The calculation of the index data provided in the present application will be described in detail with reference to the embodiments.
Fig. 2 schematically shows a flowchart of a method for calculating index data according to an embodiment of the present application, which may be implemented by a terminal device, such as the terminal device 110 shown in fig. 1; the method may also be implemented by a server, such as server 130 shown in FIG. 1. As shown in fig. 2, the method for calculating index data provided in the embodiment of the present application includes steps 210 to 230, which are specifically as follows:
step 210, preprocessing the original data to store the original data in a form of a set data structure.
Specifically, the raw data is the data required for index calculation; it is provided by different data sources and is usually stored in a database. A data source refers to an entity that collects original data. For example, a company includes a plurality of departments, and each department can collect data, so each department of the company acts as a data source of the original data. The original data collected by the departments is stored uniformly in a database so that it can be conveniently searched and used.
Original data provided by different data sources may be stored in different formats, so preprocessing the original data in effect unifies its format. In the embodiment of the application, the original data is preprocessed according to the set data structure, so that the original data is stored in the form of the set data structure.
In an embodiment of the present application, the preprocessing is performed on the original data according to a set data structure, that is, the original data is stored according to a set logical relationship, for example, according to the sequence of data generation time, according to the hierarchical relationship between data sources corresponding to the data, and the like.
In an embodiment of the present application, before the raw data is preprocessed, the method further includes acquiring the corresponding raw data according to the index to be calculated. An index is a target, specification or standard that is expected to be achieved, and serves as a measurement parameter. Different indexes require different original data. For example, if the index to be calculated is the quarterly profit of a certain department of the company, the acquired original data is the data corresponding to that department; and if the index to be calculated is the quarterly profit of the whole company, the acquired original data is the data corresponding to all departments of the company.
In one embodiment of the present application, the indicator generally has a time period, denoted as an indicator period, which specifies a time range of the raw data that needs to be acquired. For example, if the index to be calculated is the gross quarter profit amount of the company, the index period is one quarter (3 months); if the index to be calculated is the total annual profit of the company, the index period is one year; and if the index to be calculated is the profit amount of the company on the same day, the index period is one day. In general, the index period may be hours, days, weeks, months, quarters, years, etc., and in some particular instances, may be minutes or other specified periods.
In one embodiment of the present application, preprocessing raw data may include: traversing all indexes to be calculated, and determining the arrangement sequence of the index period of each index to be calculated; and storing the original data corresponding to each index to be calculated in a set data structure form in sequence according to the arrangement sequence.
Specifically, a user may need to calculate multiple indexes. Here the user refers to the entity for which indexes are calculated, which may be an individual, an organization, an enterprise, a department of an enterprise, and the like. For example, the indexes to be calculated for one company may number in the hundreds of thousands.
In the embodiment of the application, traversing all the indexes to be calculated may mean traversing all the indexes that currently need to be calculated, or traversing all the indexes to be calculated of the current user. Determining the arrangement sequence of the index periods means ordering the indexes by the size of their index periods, with smaller index periods placed first and larger index periods placed later. The original data corresponding to each index to be calculated is then stored in the set data structure in that order; that is, the original data for indexes with small index periods is stored first, and the original data for indexes with large index periods is stored afterwards.
In one embodiment of the present application, the index period refers to the time range that the index calculation needs to span, which may be a day, week, month, year, etc.; for example, if the index is the company's profit for one year, the index period is one year. The minimum period that can contain all the indexes to be calculated is equal to the maximum of the index periods of all the indexes to be calculated.
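The ordering step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class `PeriodOrderedStore`, the `Index` fields and the use of `java.time.Duration` for the index period are all assumptions made for the example, and each index is reduced to a single raw value.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; the class and field names are hypothetical.
final class PeriodOrderedStore {

    /** An index to be calculated, with its index period and one raw value. */
    static final class Index {
        final String name;
        final Duration period;
        final double rawValue;

        Index(String name, Duration period, double rawValue) {
            this.name = name;
            this.period = period;
            this.rawValue = rawValue;
        }
    }

    /** Stores raw data keyed by index name, smallest index period first. */
    static Map<String, Double> store(List<Index> indexes) {
        List<Index> sorted = new ArrayList<>(indexes);     // leave the input untouched
        sorted.sort(Comparator.comparing((Index i) -> i.period));
        Map<String, Double> data = new LinkedHashMap<>();  // preserves insertion order
        for (Index i : sorted) {
            data.put(i.name, i.rawValue);
        }
        return data;
    }
}
```

A `LinkedHashMap` is used here only because it keeps entries in insertion order, so iterating the result visits the small-period data before the large-period data.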
In one embodiment of the present application, the set data structure is a Map data structure. The Map data structure is a dictionary data structure that stores data as key-value pairs, where the key corresponds to the name of the data and the value is the specific numerical value of the data, so the Map data structure is also a key-value pair structure. The process of storing the original data in the form of the set data structure is then as follows: storing the name field of the original data as a key, storing the numerical value of the original data as a value, and forming a key-value pair based on the key and the value so that the original data is stored in the form of the key-value pair. For example, if the raw data is "company profit: 3 million yuan", then "company profit" is the "key" and "3 million yuan" is the "value", thereby forming a key-value pair.
In one embodiment of the present application, the "key" in the Map data structure is not limited to a character string; values of various types can serve as the "key". In other words, the Map provides key-value pairs in "value-value" form, which makes it a more complete hash structure.
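As a small illustration of the key-value storage just described, the sketch below parses text records into a `Map`. The `"name: value"` record format, the class `RawDataMap` and the choice of `Double` for values are assumptions made for the example; the source does not specify how raw records are encoded.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; the "name: value" record format is an assumption.
final class RawDataMap {

    /** Turns records such as "company profit: 3000000" into key-value pairs. */
    static Map<String, Double> fromRecords(String... records) {
        Map<String, Double> map = new HashMap<>();
        for (String record : records) {
            int sep = record.indexOf(':');
            if (sep < 0) {
                continue; // no separator: skip the malformed record
            }
            String key = record.substring(0, sep).trim();                        // name field
            double value = Double.parseDouble(record.substring(sep + 1).trim()); // numerical value
            map.put(key, value);
        }
        return map;
    }
}
```

For instance, `RawDataMap.fromRecords("company profit: 3000000")` yields a map whose key `"company profit"` maps to the value 3000000.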
In an embodiment of the present application, when storing original data in a set data structure, the original data is first divided according to a user name or a user id, and then the original data corresponding to a user forms a Map data structure, where the Map data structure of a certain user may include multiple pieces of Map data, that is, multiple key value pairs.
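One possible way to partition the original data per user is sketched below, assuming each raw row carries a user id, a name field and a numeric value. The `Row` type and the class `PerUserStore` are hypothetical names introduced only for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch; Row and PerUserStore are hypothetical names.
final class PerUserStore {

    /** One raw data row: the owning user, the name field and the numeric value. */
    static final class Row {
        final String userId;
        final String name;
        final double value;

        Row(String userId, String name, double value) {
            this.userId = userId;
            this.name = name;
            this.value = value;
        }
    }

    /** Groups rows by user id; each user gets a Map holding that user's key-value pairs. */
    static Map<String, Map<String, Double>> partition(List<Row> rows) {
        Map<String, Map<String, Double>> byUser = new HashMap<>();
        for (Row row : rows) {
            byUser.computeIfAbsent(row.userId, id -> new HashMap<>())
                  .put(row.name, row.value);
        }
        return byUser;
    }
}
```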
In one embodiment of the present application, when an index has other attributes such as a time attribute or an organization attribute, the index is referred to as an extended index, which is similar to a multidimensional index. When the index to be calculated is an extended index, the original data corresponding to the index forms a multi-level nested Map structure. For example, the extended index "quarterly profit of the company" is the index "company profit" plus the time attribute "quarter". As another example, the extended index "last year's business profit of the Shanghai business department" is the index "business profit" plus the time attribute "last year" and the organization attribute "Shanghai business department". Because extended indexes usually involve uncertainty and are difficult to manage, the embodiment of the application constructs them as a nested Map structure so that indexes are extended automatically; that is, the extended indexes can be obtained automatically from the nested Map structure without manual management, which makes index management more convenient.
In one embodiment of the present application, the final data structure formed from the extended indexes and the general indexes (indexes other than the extended indexes) is in the form of Map < string >, which means that, for each user, all data required by each index is stored in the corresponding Map. The maximum index period in the Map is in fact the minimum time period that can contain all the indexes.
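The nested Map idea can be sketched as below. The three-level nesting (index name, then time attribute, then organization attribute), the class `ExtendedIndexMap` and the `/`-joined naming of extended indexes are assumptions chosen for illustration; the source does not fix the nesting depth or order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch; the nesting order and all names are assumptions.
final class ExtendedIndexMap {

    // index name -> time attribute -> organization attribute -> value
    private final Map<String, Map<String, Map<String, Double>>> root = new TreeMap<>();

    /** Stores a value, creating the intermediate Map levels on demand. */
    void put(String index, String time, String organization, double value) {
        root.computeIfAbsent(index, k -> new TreeMap<>())
            .computeIfAbsent(time, k -> new TreeMap<>())
            .put(organization, value);
    }

    /** Enumerates every extended index implied by the nesting, in sorted order. */
    List<String> extendedIndexes() {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Map<String, Map<String, Double>>> byIndex : root.entrySet()) {
            for (Map.Entry<String, Map<String, Double>> byTime : byIndex.getValue().entrySet()) {
                for (String organization : byTime.getValue().keySet()) {
                    names.add(byIndex.getKey() + "/" + byTime.getKey() + "/" + organization);
                }
            }
        }
        return names;
    }
}
```

Because the extended index names are derived by walking the structure, adding data for a new attribute combination makes the corresponding extended index appear without any separate registration step, which is one reading of the "automatic extension" described above.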
And step 220, classifying the indexes to be calculated, and generating a reusable component for each type of indexes to be calculated.
Specifically, the indexes to be calculated may be classified into different categories according to different classification rules, for example, the indexes to be calculated may be classified according to an index period, and the indexes to be calculated in the same index period may be classified into the same category. A component refers to an object that can be instantiated, assembled, and managed, and a reusable component means that the component is reusable and not disposable. The reusable component is generated for each type of index to be calculated, which means that one type of index to be calculated corresponds to one reusable component.
In an embodiment of the present application, when setting the index to be calculated, a user sets an identifier for the index to be calculated, so as to distinguish the type of the index to be calculated, and the process of classifying the index to be calculated and generating the reusable component includes: if the index to be calculated has the first identification, determining the index to be calculated as a first type of index, and generating a first reusable component aiming at the first type of index; if the index to be calculated has a second identifier, determining the index to be calculated as a second type of index, and generating a second reusable component aiming at the second type of index; and if the index to be calculated does not have the identifier, determining the index to be calculated as a third type of index, and generating a third reusable component aiming at the third type of index.
Specifically, the indexes to be calculated are divided into three categories: the index management method comprises the following steps of a first index, a second index and a third index, wherein the first index has a first identifier, the second index has a second identifier, and the third index does not set an identifier. Furthermore, the first type of index corresponds to a first recyclable component, the second type of index corresponds to a second recyclable component, and the third type of index corresponds to a third recyclable component.
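The three-way classification and the one-component-per-type mapping might look like the sketch below. The identifier strings `"ACC"` and `"FIRST_TIME"`, the `Component` stand-in and the class `IndexClassifier` are hypothetical; the source only states that a first identifier, a second identifier, or no identifier is present. Treating an unrecognized identifier as the third type is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch; identifier values and class names are hypothetical.
final class IndexClassifier {

    enum Type { FIRST, SECOND, THIRD }

    /** Minimal stand-in for a reusable component: a buffer that target data is written into. */
    static final class Component {
        final List<Double> data = new ArrayList<>();
    }

    static final String FIRST_ID = "ACC";         // assumed first identifier
    static final String SECOND_ID = "FIRST_TIME"; // assumed second identifier

    private final Map<Type, Component> components = new EnumMap<>(Type.class);

    /** Maps an identifier (possibly null) to one of the three index types. */
    static Type classify(String identifier) {
        if (FIRST_ID.equals(identifier)) {
            return Type.FIRST;
        }
        if (SECOND_ID.equals(identifier)) {
            return Type.SECOND;
        }
        return Type.THIRD; // no identifier; unknown identifiers also land here by assumption
    }

    /** Returns the single component for the index's type, creating it on first use. */
    Component componentFor(String identifier) {
        return components.computeIfAbsent(classify(identifier), t -> new Component());
    }
}
```

Calling `componentFor` twice with identifiers of the same type returns the same `Component` instance, so at most three components exist regardless of how many indexes are classified.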
In an embodiment of the present application, the indexes to be calculated are classified according to the calculation mode of the indexes, that is, the category of the indexes to be calculated reflects the calculation mode of the indexes to be calculated on a certain program. The first type of indicator is a full accumulation indicator. The full-accumulated index is an index that requires the addition of all historical data, and is generally the sum of all values from the user's account to the present, for example, the total consumption of each client from the registration to the present is calculated. The second type of index is a first-time index, which is an index calculated by data of some first-time behavior of the user, and generally refers to a date when a common index of each time period first exceeds a set value from the account opening to the past of the user, for example, a time when each client first consumes more than a specified amount is calculated. The common index is an index other than the full-accumulation index and the first-time index.
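To make the two special index kinds concrete, the sketch below keeps a running total for a full-accumulation index and remembers the first date on which a value exceeded a threshold for a first-time index. Updating both one observation at a time is one plausible reading of an incremental calculation, not something the source spells out; all class and method names are hypothetical, and observations are assumed to arrive in chronological order.

```java
import java.time.LocalDate;

// Illustrative sketch; names are hypothetical and inputs are assumed chronological.
final class IncrementalIndexes {

    /** Full-accumulation index: running total of every value seen so far. */
    static final class FullAccumulation {
        private double total;

        void add(double value) {
            total += value; // only the new value is processed, not the whole history
        }

        double total() {
            return total;
        }
    }

    /** First-time index: the date on which a value first exceeded the threshold. */
    static final class FirstTime {
        private final double threshold;
        private LocalDate firstDate; // stays null until the threshold is first exceeded

        FirstTime(double threshold) {
            this.threshold = threshold;
        }

        void observe(LocalDate date, double value) {
            if (firstDate == null && value > threshold) {
                firstDate = date; // later observations never overwrite the first hit
            }
        }

        LocalDate firstDate() {
            return firstDate;
        }
    }
}
```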
In one embodiment of the present application, the reusable component is a JavaBean component, also known as a Bean object. Bean objects are Java classes written according to a certain specification. Generally, a Bean object occupies a certain memory space in a computer, and during the execution of a computer program, a correlation calculation is implemented by writing data (also referred to as writing the Bean object) into the memory space occupied by the Bean object. After the calculation is completed, the memory space occupied by the Bean object is not managed, and a new Bean object is constructed to occupy the memory space again in the next calculation. Therefore, when the calculation amount is large, the memory space is wasted. After the calculation is completed, the data written in the memory space occupied by the reusable Bean object is deleted, the new data is flushed and written, and the related calculation of the new data is realized, so that the repeated use of the Bean object is realized, and the effect of saving memory resources is achieved.
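The clear-then-rewrite reuse pattern can be illustrated with the sketch below. `MetricBean` is a simplified, hypothetical Bean-style class (it omits the getters, setters and serialization a full JavaBean would typically have), and summation stands in for whatever calculation an index actually needs.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified Bean-style class; a hypothetical stand-in, not the source's actual component. */
final class MetricBean {
    private final List<Double> values = new ArrayList<>();

    void clear() {
        values.clear(); // delete the data left over from the previous calculation
    }

    void write(double value) {
        values.add(value);
    }

    double sum() {
        double s = 0;
        for (double v : values) {
            s += v;
        }
        return s;
    }
}

final class BeanReuse {

    /** Computes one sum per batch while constructing only a single bean. */
    static double[] sums(List<double[]> batches) {
        MetricBean bean = new MetricBean(); // constructed once, outside the loop
        double[] results = new double[batches.size()];
        for (int i = 0; i < batches.size(); i++) {
            bean.clear();                   // reuse instead of `new MetricBean()`
            for (double v : batches.get(i)) {
                bean.write(v);
            }
            results[i] = bean.sum();
        }
        return results;
    }
}
```

The contrast with the conventional approach is the position of the constructor call: moving `new MetricBean()` inside the loop would allocate one bean per calculation.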
In an embodiment of the present application, since reusable components are generated for the classified indexes to be calculated, in a UDF (User Defined Function), the three different types of indexes correspond to three Bean objects.
Step 230, acquiring target data required by the index to be calculated from the original data of the set data structure based on the reusable component, and calculating to obtain a calculation result of the index to be calculated.
Specifically, the index data calculation is performed based on the reusable component, that is, target data required by the index to be calculated is written into the reusable component, and then the index calculation is performed based on the reusable component in which the data is written, so that a calculation result of the index to be calculated is obtained.
According to the technical scheme provided by the embodiment of the application, the original data are stored in a set data structure form by preprocessing the original data, so that the original data can be acquired more conveniently and quickly in the index calculation process, access of repeated data is avoided, and the index calculation efficiency is improved; and moreover, the indexes to be calculated are classified, and the reusable components are generated for each type of indexes to be calculated, so that the memory occupation in the index calculation process is reduced, and the calculation resources are saved.
In one embodiment of the present application, there are multiple indexes to be calculated, of different types, and the calculation of the index data based on the reusable component may be: determining a current reusable component according to the type of the current index to be calculated; acquiring current target data required by the current index to be calculated from the set data structure, and writing the current target data into the current reusable component; and performing index calculation based on the current reusable component into which the current target data has been written, to obtain a calculation result of the current index to be calculated.
Specifically, the current reusable component is first determined according to the type of the current index to be calculated; for example, if the current index to be calculated is a first-type index, the component is determined to be the first reusable component. The current target data is then retrieved from the set data structure and written into the current reusable component. The current target data must be acquired according to the index period of the current index to be calculated, that is, the generation time of the acquired current target data falls within the index period of the current index to be calculated. For example, if the current index to be calculated is the company's total sales in 2020, the acquired current target data is the data between January 1, 2020 and December 31, 2020. Finally, the calculation is performed based on the current reusable component into which the current target data has been written, to obtain the calculation result of the current index to be calculated.
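Selecting target data by index period, as in the 2020 total sales example above, can be sketched as follows. This is only an illustration: the class `PeriodSelectionSketch`, its method names, and the choice of a date-keyed `TreeMap` as the set data structure are assumptions, not details taken from the application.

```java
import java.time.LocalDate;
import java.util.NavigableMap;
import java.util.TreeMap;

class PeriodSelectionSketch {
    // Returns only the records generated inside [start, end], both inclusive.
    static NavigableMap<LocalDate, Double> selectTargetData(
            NavigableMap<LocalDate, Double> rawData, LocalDate start, LocalDate end) {
        return rawData.subMap(start, true, end, true);
    }

    // Adds up the selected values, e.g. the total sales of the period.
    static double total(NavigableMap<LocalDate, Double> data) {
        double sum = 0.0;
        for (double v : data.values()) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        NavigableMap<LocalDate, Double> raw = new TreeMap<>();
        raw.put(LocalDate.of(2019, 12, 31), 5.0);
        raw.put(LocalDate.of(2020, 1, 1), 10.0);
        raw.put(LocalDate.of(2020, 12, 31), 20.0);
        raw.put(LocalDate.of(2021, 1, 1), 40.0);
        NavigableMap<LocalDate, Double> target =
                selectTargetData(raw, LocalDate.of(2020, 1, 1), LocalDate.of(2020, 12, 31));
        System.out.println(total(target));
    }
}
```

Because `subMap` returns a view rather than a copy, the selection itself does not duplicate the original data.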
In one embodiment of the present application, before writing the current target data into the current reusable component, it is further required to determine whether there is data in the current reusable component, and if there is data in the current reusable component, the current target data is written into the current reusable component after deleting the data in the current reusable component.
In one embodiment of the present application, after obtaining the calculation result of the index to be calculated currently, the current target data in the current reusable component is deleted, so as to write the original data of the index to be calculated next.
In an embodiment of the present application, the index calculation is performed according to a type of an index to be calculated, specifically: if the current index to be calculated is the first type index or the second type index, incremental calculation is carried out on the current target data written into the current reusable component; and if the current index to be calculated is the third-class index, performing full calculation on the current target data written into the current reusable component.
Specifically, incremental calculation means adding new data to the historical result to obtain the calculation result, whereas full calculation means accumulating all data to obtain the calculation result. In the embodiment of the application, the first type of index is a full-accumulation index, the second type of index is a first-time value index, and the third type of index is a common index. Incremental calculation is adopted for both the full-accumulation index and the first-time value index. For the full-accumulation index, the current data is added to the historical index result obtained by the previous calculation to obtain the current full-accumulation index. For the first-time value index, if an index calculation result meeting the condition already exists, no further calculation is needed; if no such result exists, the current data is accumulated into the historical index result obtained by the previous calculation through incremental calculation, and once the calculation result meets the condition, the first-time value index no longer needs to be calculated. Compared with acquiring all the data every time to calculate the first-time value index, incremental calculation effectively reduces the amount of data to be processed and improves calculation efficiency. For the common index, the calculation result is obtained through full calculation.
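The two incremental update rules can be sketched as below. The class and method names are hypothetical, and the sketch assumes that the first-time value index compares the current day's common-index value against a threshold; the application does not spell out the exact comparison.

```java
import java.time.LocalDate;

class IncrementalCalcSketch {
    // Full-accumulation index: previous result plus the current value.
    static double updateFullAccumulation(double previousTotal, double currentValue) {
        return previousTotal + currentValue;
    }

    // First-time value index: once a date has been recorded it never changes.
    // previousFirstDate is null while the condition has not yet been met.
    static LocalDate updateFirstTime(LocalDate previousFirstDate, LocalDate today,
                                     double currentValue, double threshold) {
        if (previousFirstDate != null) {
            return previousFirstDate; // already satisfied, nothing to recompute
        }
        return currentValue > threshold ? today : null;
    }
}
```

For example, `updateFullAccumulation(100.0, 25.0)` yields `125.0` without rereading any earlier data, and a call to `updateFirstTime` with a non-null previous date returns that date unchanged.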
In an embodiment of the present application, whether to adopt incremental calculation or full calculation may also be determined according to whether the index to be calculated has a corresponding historical index result. An index to be calculated that has a historical index result is usually a full-accumulation index or a first-time value index, and its calculation result can be obtained by incremental calculation. For a common index, or for a full-accumulation index or first-time value index without a historical index result, all the original data corresponding to the index needs to be acquired for calculation, that is, the calculation result is obtained by full calculation.
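Choosing the calculation mode from the presence of a historical result can be sketched as follows for an accumulating index. The class `ModeSelectionSketch`, the in-memory `history` map, and the convention that the newest value is the last element of the series are all assumptions for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ModeSelectionSketch {
    // Hypothetical store of historical index results, keyed by index name.
    private final Map<String, Double> history = new HashMap<>();

    // series: all original values of the index in chronological order,
    // the last element being the newest one.
    double calculateTotal(String indexName, List<Double> series) {
        if (series.isEmpty()) {
            throw new IllegalArgumentException("no data for " + indexName);
        }
        Double previous = history.get(indexName);
        double result;
        if (previous != null) {
            // Incremental mode: only the newest value is read.
            result = previous + series.get(series.size() - 1);
        } else {
            // Full mode: no historical result, so every value is accumulated.
            result = 0.0;
            for (double v : series) {
                result += v;
            }
        }
        history.put(indexName, result);
        return result;
    }
}
```

The first call for a given index name runs in full mode and stores its result; later calls for the same name run in incremental mode.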
In an embodiment of the present application, the indexes to be calculated may be calculated in order of priority. For example, the full-accumulation index has the highest calculation priority, the common index has the second highest, and the first-time value index has the lowest. Because both the full-accumulation index and the first-time value index adopt incremental calculation, some first-time value indexes may already be covered in the course of calculating the full-accumulation indexes; similarly, some first-time value indexes may be covered in the course of calculating the common indexes. Therefore, the full-accumulation indexes are calculated first and the common indexes second, so that part of the first-time value indexes are obtained along the way, and the first-time value indexes calculated last only need to include those not covered by the full-accumulation indexes and the common indexes. Index calculation efficiency is thereby improved.
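The priority order in this example can be expressed as a simple sort. The enum `IndexKind`, the class `PendingIndex`, and the numeric priority values are hypothetical names and values chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Lower number = calculated earlier, following the example order in the text.
enum IndexKind {
    FULL_ACCUMULATION(0), COMMON(1), FIRST_TIME(2);

    final int priority;

    IndexKind(int priority) {
        this.priority = priority;
    }
}

class PendingIndex {
    final String name;
    final IndexKind kind;

    PendingIndex(String name, IndexKind kind) {
        this.name = name;
        this.kind = kind;
    }
}

class PriorityOrderSketch {
    // Returns a new list sorted by priority; the input list is left untouched.
    static List<PendingIndex> inCalculationOrder(List<PendingIndex> indexes) {
        List<PendingIndex> ordered = new ArrayList<>(indexes);
        ordered.sort(Comparator.comparingInt((PendingIndex p) -> p.kind.priority));
        return ordered;
    }
}
```

`List.sort` is stable, so indexes of the same kind keep their original relative order.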
In an embodiment of the present application, when obtaining the original data required by an index, a time window may be set according to the index period of the index to be calculated, and the original data is obtained by sliding the time window over the Map data structure. Since the Map structure data covers the minimum time period containing all indexes, all indexes of one user can be calculated by traversing the Map structure once, without repeated data access. It follows that the time complexity of the calculation is proportional to the size of the Map structure.
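Building a Map that covers only the minimum time span containing every index period can be sketched as follows. The class `PreprocessSketch`, the expression of periods in days, and the date-keyed map are assumptions made for illustration; the application describes a Map keyed by strings.

```java
import java.time.LocalDate;
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;

class PreprocessSketch {
    // The smallest window that contains every index period is the largest period.
    static int largestPeriodDays(Collection<Integer> periodsInDays) {
        int max = 0;
        for (int p : periodsInDays) {
            max = Math.max(max, p);
        }
        return max;
    }

    // Keeps only the records of the last windowDays days, counting asOf itself.
    static TreeMap<LocalDate, Double> buildWindow(
            Map<LocalDate, Double> raw, LocalDate asOf, int windowDays) {
        LocalDate earliest = asOf.minusDays(windowDays - 1L);
        TreeMap<LocalDate, Double> window = new TreeMap<>();
        for (Map.Entry<LocalDate, Double> e : raw.entrySet()) {
            LocalDate d = e.getKey();
            if (!d.isBefore(earliest) && !d.isAfter(asOf)) {
                window.put(d, e.getValue());
            }
        }
        return window;
    }
}
```

With periods of 30, 90 and 365 days, for instance, the window is 365 days long, and records older than that are never loaded into the Map.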
In one embodiment of the application, because the original data is stored in the Map structure according to the index period, when the Map structure is traversed to perform index calculation, on the one hand, when a small-period index is calculated, part of the data of the large-period index is calculated as well; on the other hand, when a large-period index is calculated, the small-period index is calculated along the way. The calculation efficiency of the index data is thereby improved.
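Calculating small-period and large-period indexes in a single traversal can be sketched as below. The class `SinglePassWindowSketch`, the expression of periods in days, and the use of a sum as the aggregation are assumptions for illustration only.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NavigableMap;

class SinglePassWindowSketch {
    // periodsInDays maps an index name to the length of its period in days,
    // counted back from asOf (asOf itself is day 0 and is included).
    static Map<String, Double> sums(NavigableMap<LocalDate, Double> data,
                                    Map<String, Integer> periodsInDays,
                                    LocalDate asOf) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (String name : periodsInDays.keySet()) {
            result.put(name, 0.0);
        }
        // Each record is visited once and contributes to every period it falls in.
        for (Map.Entry<LocalDate, Double> e : data.entrySet()) {
            long ageInDays = ChronoUnit.DAYS.between(e.getKey(), asOf);
            if (ageInDays < 0) {
                continue; // record is later than asOf
            }
            for (Map.Entry<String, Integer> p : periodsInDays.entrySet()) {
                if (ageInDays < p.getValue()) {
                    result.merge(p.getKey(), e.getValue(), Double::sum);
                }
            }
        }
        return result;
    }
}
```

The work done is the number of records times the number of indexes, so when the indexes are far fewer than the records, the cost is dominated by the size of the Map, which matches the complexity claim made in the text.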
Fig. 3 schematically shows a flowchart of a method for calculating index data according to an embodiment of the present application. As shown in fig. 3, the method includes:
S310, acquiring original data. The original data is provided by different data sources, where a data source is an object that collects original data. For example, a company includes a plurality of departments, each of which can collect data, so each department of the company is equivalent to a data source of the original data. The original data collected by the departments is stored uniformly in a database, so this step obtains the original data from the database.
S320, preprocessing the original data to generate Map structure data. The indexes to be calculated may form a list, and the index periods may span multiple time periods, such as 1m, 3m, 1y, etc. All the indexes are traversed to calculate the minimum time window that contains all the index periods, namely the maximum index period among all the index periods. The original data falling within that time window is then placed in the Map structure. If the index is an extension index, a plurality of nested Maps are constructed, as in the multi-dimensional case. The final data structure is Map < string, double > >, that is, for each user, all the historical data required by each index is in the corresponding Map.
S330, generating Java internal Bean objects for the indexes to be calculated. First, the indexes to be calculated are classified, for example into full-accumulation indexes, first-time value indexes and common indexes, and then a Bean object is generated for each type of index.
S340, judging whether the index to be calculated has a historical index result: if a historical index result exists, an incremental calculation mode is adopted; if no historical index result exists (namely the historical index is empty), a full calculation mode is adopted. The aggregation algorithms of the indexes are: avg (mean), sum, min (minimum), max (maximum), and tavg (trading-day mean). The full-accumulation index is the sum of all values from the opening of the client's account to the present, and the first-time value index is the date on which the common index of a time window first exceeds the set value between the opening of the client's account and the present. By definition, the full-accumulation index can be calculated incrementally by adding today's result to yesterday's index value; for the first-time value index, once the date on which the set value was first exceeded has been calculated, that date will not change afterwards. This special handling through incremental calculation effectively reduces the scanning and calculation of data. The common index and other indexes without historical results are obtained by full calculation.
S350, calculating according to the calculation mode of the index to be calculated to obtain an index calculation result. During the calculation, the indexes are calculated in the following order: full-accumulation indexes first, common indexes second, and first-time value indexes last. Finally, each index Bean of each client is calculated, and the calculation order is likewise optimized. Since the Map structure data covers the minimum time window containing all indexes, all indexes can be calculated by traversing the Map structure; that is, when the number of indexes is far smaller than the amount of historical data, the time complexity of the calculation is proportional only to the size of the Map structure, and no repeated data access is needed.
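The aggregation algorithms named in step S340 can be sketched as an enumeration. Only avg, sum, min and max are shown; tavg is omitted because the application does not define how it is computed. The enum name `Aggregation` and its `apply` method are hypothetical.

```java
// Hypothetical enumeration of the aggregation algorithms named in S340.
enum Aggregation {
    AVG, SUM, MIN, MAX;

    // Applies this aggregation to a non-empty array of values.
    double apply(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values to aggregate");
        }
        double sum = 0.0;
        double min = values[0];
        double max = values[0];
        for (double v : values) {
            sum += v;
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        switch (this) {
            case AVG:
                return sum / values.length;
            case SUM:
                return sum;
            case MIN:
                return min;
            default:
                return max;
        }
    }
}
```

For the values 1, 2, 3 and 6, `Aggregation.AVG.apply(...)` gives 3.0 and `Aggregation.SUM.apply(...)` gives 12.0.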
It should be noted that although the various steps of the methods in this application are depicted in the drawings in a particular order, this does not require or imply that these steps must be performed in this particular order, or that all of the shown steps must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions, etc.
Embodiments of the apparatus of the present application are described below, which may be used to perform the method for calculating the index data in the above-described embodiments of the present application. Fig. 4 schematically shows a block diagram of a computing device for index data provided in an embodiment of the present application. As shown in fig. 4, the index data calculation device includes:
a data preprocessing module 410, configured to preprocess original data to store the original data in a form of a set data structure;
the index classification module 420 is configured to classify the indexes to be calculated, and generate a reusable component for each type of the indexes to be calculated;
and an index calculation module 430, configured to obtain target data required by the index to be calculated from the original data of the set data structure based on the reusable component, and calculate to obtain a calculation result of the index to be calculated.
In one embodiment of the present application, the data preprocessing module 410 includes:
the arrangement sequence determining unit is used for traversing all the indexes to be calculated and determining the arrangement sequence of the index period of each index to be calculated, wherein the arrangement with the small index period is arranged at the front, and the arrangement with the large index period is arranged at the back;
and the data storage unit is used for sequentially storing the original data corresponding to each index to be calculated in a form of a set data structure according to the arrangement sequence.
In one embodiment of the present application, the setting data structure is a key-value pair; the data storage unit is specifically configured to:
storing the name field of the original data as a key, storing the numerical value of the original data as a value, and forming a key-value pair based on the key and the value so as to store the original data in the form of the key-value pair.
In an embodiment of the present application, the index classification module 420 is specifically configured to:
if the index to be calculated has a first identifier, determining the index to be calculated as a first type of index, and generating a first reusable component aiming at the first type of index;
if the index to be calculated has a second identifier, determining the index to be calculated as a second type of index, and generating a second reusable component aiming at the second type of index;
and if the index to be calculated does not have the identifier, determining the index to be calculated as a third type of index, and generating a third reusable component aiming at the third type of index.
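The classification rule above, with one reusable component per class, can be sketched as follows. The identifier strings `"ACC"` and `"FIRST"`, the enum `IndexCategory`, and the classes `CategoryComponent` and `ClassificationSketch` are hypothetical; the application does not disclose what the identifiers look like. The sketch also treats any unrecognized identifier as a third-type index, which is an assumption.

```java
import java.util.EnumMap;
import java.util.Map;

enum IndexCategory { FIRST_TYPE, SECOND_TYPE, THIRD_TYPE }

// Stand-in for the reusable component generated for one category.
class CategoryComponent {
    final IndexCategory category;

    CategoryComponent(IndexCategory category) {
        this.category = category;
    }
}

class ClassificationSketch {
    // Hypothetical identifier values.
    static final String FIRST_IDENTIFIER = "ACC";
    static final String SECOND_IDENTIFIER = "FIRST";

    private final Map<IndexCategory, CategoryComponent> components =
            new EnumMap<>(IndexCategory.class);

    // A null or unrecognized identifier is treated as "no identifier".
    static IndexCategory classify(String identifier) {
        if (FIRST_IDENTIFIER.equals(identifier)) {
            return IndexCategory.FIRST_TYPE;
        }
        if (SECOND_IDENTIFIER.equals(identifier)) {
            return IndexCategory.SECOND_TYPE;
        }
        return IndexCategory.THIRD_TYPE;
    }

    // Creates the component for a category on first use and reuses it afterwards.
    CategoryComponent componentFor(String identifier) {
        return components.computeIfAbsent(classify(identifier), CategoryComponent::new);
    }
}
```

However many indexes are classified, at most three components are created, one per category.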
In one embodiment of the present application, the metric calculation module 430 includes:
the object generation unit is used for determining the current reusable component according to the type of the current index to be calculated;
the data writing unit is used for acquiring current target data required by the current index to be calculated from the set data structure and writing the current target data into the current reusable component;
and the index calculation unit is used for performing index calculation on the basis of the current reusable component written in the current target data to obtain a calculation result of the current index to be calculated.
In an embodiment of the present application, the index calculating unit is specifically configured to:
if the current index to be calculated is a first-type index or a second-type index, performing incremental calculation on current target data written into the current reusable component;
and if the current index to be calculated is the third-class index, performing full calculation on the current target data written into the current reusable component.
In an embodiment of the present application, the index calculation module 430 further includes:
and the data clearing unit is used for deleting the data in the current reusable component if the data exists in the current reusable component.
The specific details of the calculation apparatus for index data provided in the embodiments of the present application have been described in detail in the corresponding method embodiments, and are not repeated herein.
Fig. 5 schematically shows a block diagram of a computer system of an electronic device for implementing an embodiment of the present application.
It should be noted that the computer system 500 of the electronic device shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 5, the computer system 500 includes a Central Processing Unit (CPU) 501 that can perform various appropriate actions and processes according to a program stored in a Read-Only Memory (ROM) 502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. In the random access memory 503, various programs and data necessary for system operation are also stored. The central processor 501, the read only memory 502 and the random access memory 503 are connected to each other via a bus 504. An input/output interface 505 (I/O interface) is also connected to the bus 504.
The following components are connected to the input/output interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a local area network card, a modem, or the like. The communication section 509 performs communication processing via a network such as the internet. A drive 510 is also connected to the input/output interface 505 as necessary. A removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 510 as necessary, so that a computer program read out therefrom is installed into the storage section 508 as necessary.
In particular, according to embodiments of the present application, the processes described in the various method flowcharts may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 509, and/or installed from the removable medium 511. The computer program, when executed by the central processor 501, performs various functions defined in the system of the present application.
It should be noted that the computer readable medium shown in the embodiments of the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), a flash memory, an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable signal medium, by contrast, may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any suitable medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It should be noted that although several modules or units of the device for performing actions are mentioned in the above detailed description, such a division is not mandatory. Indeed, according to embodiments of the application, the features and functionality of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present application can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which can be a personal computer, a server, a touch terminal, or a network device, etc.) to execute the method according to the embodiments of the present application.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (10)

1. A method for calculating index data, comprising:
preprocessing original data to store the original data in a form of a set data structure;
classifying the indexes to be calculated, and generating a reusable component for each type of indexes to be calculated;
and acquiring target data required by the index to be calculated from the original data of the set data structure based on the reusable component, and calculating to obtain a calculation result of the index to be calculated.
2. The index data calculation method according to claim 1, wherein preprocessing raw data to store the raw data in a set data structure includes:
traversing all indexes to be calculated, and determining the arrangement sequence of the index periods of the indexes to be calculated, wherein the indexes with small periods are arranged in the front, and the indexes with large periods are arranged in the back;
and sequentially storing the original data corresponding to each index to be calculated in a form of a set data structure according to the arrangement sequence.
3. The index data calculation method according to claim 2, wherein the setting data structure is a key-value pair; storing the original data in a set data structure form, including:
storing the name field of the original data as a key, storing the numerical value of the original data as a value, and forming a key-value pair based on the key and the value so as to store the original data in the form of the key-value pair.
4. The method according to claim 1, wherein the classifying the to-be-calculated indexes and generating the reusable component for each type of the to-be-calculated indexes comprises:
if the index to be calculated has a first identifier, determining the index to be calculated as a first type of index, and generating a first reusable component aiming at the first type of index;
if the index to be calculated has a second identifier, determining the index to be calculated as a second type of index, and generating a second reusable component aiming at the second type of index;
and if the index to be calculated does not have the identifier, determining the index to be calculated as a third type of index, and generating a third reusable component aiming at the third type of index.
5. The method for calculating index data according to claim 1, wherein the obtaining target data required by the index to be calculated from the raw data of the set data structure based on the reusable component to calculate the calculation result of the index to be calculated comprises:
determining a current reusable component according to the type of the current index to be calculated;
acquiring current target data required by the current index to be calculated from the original data of the set data structure, and writing the current target data into the current reusable component;
and performing index calculation based on the current reusable component written in the current target data to obtain a calculation result of the current index to be calculated.
6. The method of claim 5, wherein the performing a metric calculation based on the current reusable component written to the current target data comprises:
if the current index to be calculated is a first-type index or a second-type index, performing incremental calculation on current target data written into the current reusable component;
and if the current index to be calculated is the third-class index, performing full calculation on the current target data written into the current reusable component.
7. A method of calculating metric data in accordance with claim 5, wherein prior to writing the current target data to the current reusable component, the method further comprises:
and if the data exists in the current reusable component, deleting the data in the current reusable component.
8. An apparatus for calculating index data, comprising:
the data preprocessing module is used for preprocessing original data so as to store the original data in a set data structure form;
the index classification module is used for classifying the indexes to be calculated and generating reusable components aiming at each type of indexes to be calculated;
and the index calculation module is used for acquiring target data required by the index to be calculated from the original data of the set data structure based on the reusable component to calculate so as to obtain a calculation result of the index to be calculated.
9. A computer-readable medium on which a computer program is stored which, when executed by a processor, implements the index data calculation method of any one of claims 1 to 7.
10. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein execution of the executable instructions by the processor causes the electronic device to perform the method of calculating metric data of any of claims 1-7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111563635.9A CN114254918A (en) 2021-12-20 2021-12-20 Index data calculation method and device, readable medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111563635.9A CN114254918A (en) 2021-12-20 2021-12-20 Index data calculation method and device, readable medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN114254918A true CN114254918A (en) 2022-03-29

Family

ID=80793180

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111563635.9A Pending CN114254918A (en) 2021-12-20 2021-12-20 Index data calculation method and device, readable medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN114254918A (en)

Similar Documents

Publication Publication Date Title
US8978034B1 (en) System for dynamic batching at varying granularities using micro-batching to achieve both near real-time and batch processing characteristics
US20200050968A1 (en) Interactive interfaces for machine learning model evaluations
KR102033971B1 (en) Data quality analysis
CN107247811B (en) SQL statement performance optimization method and device based on Oracle database
CN111241123A (en) View data query method, device, server and storage medium
CN114416891B (en) Method, system, apparatus and medium for data processing in a knowledge graph
Hansen et al. An empirical study of software architectures’ effect on product quality
CN115408381A (en) Data processing method and related equipment
CN110737673B (en) Data processing method and system
CN112631889A (en) Portrayal method, device and equipment for application system and readable storage medium
CN114254918A (en) Index data calculation method and device, readable medium and electronic equipment
CN110414813B (en) Index curve construction method, device and equipment
CN113934894A (en) Data display method based on index tree and terminal equipment
CN113377604A (en) Data processing method, device, equipment and storage medium
CN114840531B (en) Data model reconstruction method, device, equipment and medium based on blood edge relation
CN113987372B (en) Hot data acquisition method, device and equipment of domain business object model
CN113312410B (en) Data map construction method, data query method and terminal equipment
CN114579619B (en) Data query method and device, electronic equipment and storage medium
CN117038002B (en) Method and device for generating observation variable in drug evaluation research
CN117608994A (en) Data collection analysis method and system for application operation and maintenance
CN113076317A (en) Data processing method, device and equipment based on big data and readable storage medium
CN115186027A (en) Task quantity data display method, device, equipment and medium
Winant et al. METHODS PROTOCOL FOR THE UNITED STATES MORTALITY COUNTY DATABASE
CN117273782A (en) Crowd circling method and device and computing equipment
CN113989003A (en) Data processing method, device, equipment and storage medium of standard statistical report

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination