CN114912815A - Index automatic definition method, system and storage medium based on big data wide table - Google Patents


Info

Publication number
CN114912815A
Authority
CN
China
Prior art keywords
index
data
indexes
basic
dimension
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210568574.3A
Other languages
Chinese (zh)
Inventor
龚连平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Railway Lianchuang Technology Development Co ltd
Original Assignee
Hunan Railway Lianchuang Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Railway Lianchuang Technology Development Co ltd filed Critical Hunan Railway Lianchuang Technology Development Co ltd
Priority to CN202210568574.3A priority Critical patent/CN114912815A/en
Publication of CN114912815A publication Critical patent/CN114912815A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393Score-carding, benchmarking or key performance indicator [KPI] analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Educational Administration (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • General Business, Economics & Management (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an automatic index definition method, system and storage medium based on a big data wide table. The method comprises the following steps: receiving an input basic data set, wherein the basic data set comprises basic indexes and the relevant factors corresponding to each basic index, the basic indexes comprise sending amount, turnover amount and income, and the relevant factors comprise data type, data name, data source, data dimension, data caliber, data version and data value; identifying the basic indexes in the input basic data set and their corresponding relevant factors; and obtaining custom derivative indexes from the basic indexes and the relevant factors. The invention aims to simplify index definition, support flexible use by users, and lay a foundation for subsequent flexible analysis.

Description

Index automatic definition method, system and storage medium based on big data wide table
Technical Field
The invention relates to the field of big data processing, in particular to an index automatic definition method, system and storage medium based on a big data wide table.
Background
Business Intelligence (BI) analyzes data by means of modern data warehouse technology, online analytical processing technology, data mining and data presentation technology in order to realize business value. At present, traditional BI technology focuses on extracting, transforming and loading basic indexes, while index-oriented definition, calculation and storage serve only to analyze the business. Defining indexes, however, demands considerable expertise in computing and big data and is not readily accepted by enterprise managers, so a data analysis and processing approach that non-specialists can use is urgently needed.
The invention with the prior publication number of CN112633761A provides a method, a device, equipment and a storage medium for querying index data, which introduce a real-time analysis database and a cache database, invoke an index calculation engine to query index data in the real-time analysis database and the cache database according to an index query request, and perform standardized processing on the index data to generate target aggregated index data, thereby solving the problem that real-time index data cannot be queried.
Disclosure of Invention
The main purpose of the invention is to provide an automatic index definition method, system and storage medium based on a big data wide table, in order to solve the technical problem that existing index definition requires a high level of specialist knowledge.
In order to achieve the above purpose, the present invention provides an index automatic definition method based on a big data wide table, the method includes the following steps:
receiving an input basic data set, wherein the basic data set comprises basic indexes and relevant factors corresponding to the basic indexes, the basic indexes comprise sending amount, turnover amount and income, and the relevant factors comprise data types, data names, data sources, data dimensions, data calibers, data versions and data values;
identifying basic indexes in the input basic data set and corresponding relevant factors thereof;
and obtaining the user-defined derivative indexes according to the basic indexes and the relevant factors.
Optionally, the step of obtaining a customized derivative index according to each basic index and each relevant factor includes:
and automatically defining to obtain a plurality of derived indexes according to the forward extension of each relevant factor, wherein the forward extension comprises the extension of data dimensionality, the extension of a data source and the extension of index coding.
Optionally, the step of automatically defining to obtain a plurality of derived indexes according to the forward expansion of each relevant factor includes:
freely combining the data dimensions in the relevant factors into various dimension combinations to obtain the derivative indexes of the various dimension combinations.
Optionally, the step of freely combining into various dimensional combinations according to the data dimensions in the relevant factors to obtain the derivative indexes of the various dimensional combinations includes:
identifying a data source corresponding to each basic index;
freely combining the data sources and the data dimensions to obtain an initial derivative index;
judging whether the target objects in the initial derivation indexes are repeated in each dimension combination;
if yes, determining the dimension priority of the target object;
if not, directly generating the derived index of the target object.
Optionally, the step of determining the dimension priority of the target object includes:
determining the finest granularity of each initial derivative index containing the target object, and retaining the dimension combination with the finest granularity as the derivative index of the target object.
Optionally, the step of obtaining a custom derivative index according to the basic index and each relevant factor includes:
and performing index calculation according to the basic indexes and the related factors included in the respective defined derivative indexes to obtain the index values of the corresponding user-defined derivative indexes.
Optionally, after the step of obtaining the customized derivative index according to the basic index and each relevant factor, the method includes:
automatically distributing corresponding timestamps according to basic indexes in the user-defined derived indexes;
and storing the corresponding user-defined derivative indexes in a column type storage mode according to the time stamps.
In addition, in order to achieve the above object, the present invention further provides an automatic index definition system based on a big data wide table, including a memory, a processor, and an automatic index definition program based on a big data wide table that is stored in the memory and executable on the processor, where the program, when executed by the processor, implements the steps of the automatic index definition method based on a big data wide table according to any one of the above.
In addition, to achieve the above object, the present invention further provides a storage medium on which an automatic index definition program based on a big data wide table is stored, where the program, when executed by a processor, implements the steps of the automatic index definition method based on a big data wide table according to any one of the above.
The invention provides an automatic index definition method based on a big data wide table, which comprises: receiving an input basic data set, wherein the basic data set comprises basic indexes and the relevant factors corresponding to each basic index, the basic indexes comprise sending amount, turnover amount and income, and the relevant factors comprise data type, data name, data source, data dimension, data caliber, data version and data value; identifying the basic indexes in the input basic data set and their corresponding relevant factors; and obtaining custom derivative indexes from the basic indexes and the relevant factors. Indexes therefore no longer need to be defined manually: when a basic index is entered, the system automatically generates indexes for the various dimension combinations from the dimensions of the basic index, which avoids tedious manual operation and the risk of omissions in a manual definition process. At the same time, the single-dimension indexes and the finest-granularity index of the multi-dimension combination are all defined and calculated automatically in advance, which supports flexible use by users and lays a foundation for subsequent flexible analysis.
Drawings
FIG. 1 is a schematic structural diagram of an automatic index definition system based on a big data wide table according to an embodiment of the present invention;
FIG. 2 is a flowchart of an embodiment of the automatic index definition method based on a big data wide table according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
FIG. 1 is a schematic structural diagram of an automatic index definition system based on a big data wide table according to an embodiment of the present invention.
As shown in FIG. 1, the system may include: a processor 1001, such as a CPU, a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 enables communication between these components. The user interface 1003 may include an infrared receiving module for receiving control commands triggered by a user through a remote controller; optionally, the user interface 1003 may further include a standard wired interface or a wireless interface. The network interface 1004 may optionally include a standard wired interface or a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in FIG. 1 does not limit the automatic index definition system based on a big data wide table; the system may include more or fewer components than those shown, may combine some components, or may arrange the components differently.
The specific embodiments of the automatic index definition system based on a big data wide table are substantially the same as the following embodiments of the automatic index definition method based on a big data wide table, and are not described again here.
FIG. 2 is a schematic flow diagram of the first embodiment of the automatic index definition method based on a big data wide table according to the present invention. The method includes:
step S10, receiving an input basic data set;
step S20, identifying the basic indexes in the input basic data set and their corresponding relevant factors;
step S30, obtaining the custom derivative indexes according to each basic index and each relevant factor.
Specifically, the basic data set in this embodiment is a big data wide table. The recorded data are a set of indivisible concepts that express atomic quantitative attributes of business entities. These comprise basic indexes, such as sending amount, turnover amount and income, and the relevant factors corresponding to each basic index: data type, data name, data source, data dimension, data caliber, data version and data value. Taking big data in the transportation field as an example: the data type classifies data from a business perspective, such as passenger transport or freight transport; the data dimension is a grouping of the characteristics of things or phenomena, such as the time dimension (day, month, year) or the space dimension (province, city); the data caliber is the statistical logic standard applied to the statistics, such as the all-enterprise caliber (all enterprises are counted) or the holding caliber (non-holding enterprises are not counted); the data version is a combination of data calibers and data dimensions under which statistics are produced, such as the release version (the data calibers in force at the time, for example the holding caliber for last year and the all-inclusive caliber for this year) or the analysis version (any combination of data calibers and data dimensions, for example the all-inclusive caliber for both last year and this year).
Further, in practice, traffic statistics cannot be analyzed through the basic indexes alone. For example, an index such as "passenger sending amount at a certain gate within a certain time range" may need to be analyzed. This index is generated by combining the basic index "sending amount" with the relevant factors "a certain gate", "a certain time" and "passenger transport"; that is, it is a derivative index, the kind of index that traffic statistics actually need to analyze. For example, wide table A contains the basic index "sending amount", for which the following relevant factors are identified: the data dimensions "province" and "city" and the data type "passenger transport". Dimension expansion yields three dimension combinations, "province", "city" and "province_city", so three derivative indexes are defined automatically: province passenger sending amount, city passenger sending amount, and province_city passenger sending amount.
In this embodiment, an input basic data set is received, wherein the basic data set comprises basic indexes and the relevant factors corresponding to each basic index, the basic indexes comprise sending amount, turnover amount and income, and the relevant factors comprise data type, data name, data source, data dimension, data caliber, data version and data value; the basic indexes in the input basic data set and their corresponding relevant factors are identified; and custom derivative indexes are obtained from the basic indexes and the relevant factors. Indexes therefore no longer need to be defined manually: when a basic index is entered, the system automatically generates indexes for the various dimension combinations from the dimensions of the basic index, which avoids tedious manual operation and the risk of omissions in a manual definition process.
Further, the step of step S30 includes:
and automatically defining to obtain a plurality of derived indexes according to the forward extension of each relevant factor, wherein the forward extension comprises the extension of data dimensionality, the extension of a data source and the extension of index coding.
In the first case, the expansion process for the data dimension is as follows:
the data dimensions in the relevant factors are freely combined into various dimension combinations to obtain the derivative indexes of the various dimension combinations.
Specifically, the data dimensions are first freely combined into dimension combinations, and indexes for the various dimension combinations are then generated automatically. For example, if the data dimensions of a basic index are dimension 1, dimension 2 and dimension 3, they can be combined into 3 single dimensions (dimension 1, dimension 2 and dimension 3) and 1 dimension combination (dimension 1_dimension 2_dimension 3), so that 4 derivative indexes are defined automatically: the "dimension 1" index, the "dimension 2" index, the "dimension 3" index, and the "dimension 1_dimension 2_dimension 3" index.
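The dimension expansion described above can be sketched as follows. This is a minimal illustration only, and it assumes, following the worked example rather than any stated rule, that the expansion yields every single dimension plus one combination of all dimensions; the function name `expand_dimensions` and the underscore separator are hypothetical.

```python
from typing import List


def expand_dimensions(dimensions: List[str], sep: str = "_") -> List[str]:
    """Return every single dimension plus one combination of all dimensions.

    Assumption: only the full combination is generated, as in the example
    in the text, not every possible subset of dimensions.
    """
    combos = list(dimensions)
    # A combined entry only adds information when there are at least two dimensions.
    if len(dimensions) > 1:
        combos.append(sep.join(dimensions))
    return combos


combos = expand_dimensions(["dimension 1", "dimension 2", "dimension 3"])
for name in combos:
    print(f'"{name}" index')
```

With three input dimensions this yields the four derivative indexes of the example; with the two dimensions "province" and "city" it yields three.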
In the second case, the extension flow for the data source is as follows:
identifying a data source corresponding to each basic index;
and obtaining an initial derivative index according to the free combination of each data source and the data dimension.
Specifically, suppose the basic indexes come from index wide table A, index wide table B and index wide table C, where dimension 1, dimension 2 and dimension 3 come from index wide table A, dimension 1 and dimension 4 come from index wide table B, and dimension 5 comes from index wide table C. Freely combining the data dimensions within each data source automatically defines 7 derivative indexes: the dimension 1 index, the dimension 2 index, the dimension 3 index, the dimension 1_dimension 2_dimension 3 index, the dimension 4 index, the dimension 1_dimension 4 index and the dimension 5 index.
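A small sketch of the data source expansion, under the same assumption as the example (singles plus one full combination per source). The dictionary layout and the names `SOURCES` and `expand_sources` are hypothetical; the point is that the per-source expansion produces 8 candidates, of which "dimension 1" occurs twice, leaving the 7 distinct derivative indexes listed above.

```python
from typing import Dict, List, Tuple

# Hypothetical description of which dimensions each wide table provides.
SOURCES: Dict[str, List[str]] = {
    "A": ["dimension 1", "dimension 2", "dimension 3"],
    "B": ["dimension 1", "dimension 4"],
    "C": ["dimension 5"],
}


def expand_sources(sources: Dict[str, List[str]], sep: str = "_") -> List[Tuple[str, str]]:
    """Return (wide table, dimension combination) candidates for every source."""
    candidates = []
    for table, dims in sources.items():
        combos = list(dims)
        if len(dims) > 1:
            combos.append(sep.join(dims))  # one full combination per source
        candidates.extend((table, combo) for combo in combos)
    return candidates


candidates = expand_sources(SOURCES)
# Preserve first-seen order while dropping repeated combination names.
distinct = list(dict.fromkeys(combo for _, combo in candidates))
print(len(candidates), len(distinct))
```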
Further, when dimension combinations are expanded across data sources, it must be judged whether a target object in the initial derivative indexes is repeated across the dimension combinations.
If the target object is repeated, the dimension priority of the target object is determined as follows: the finest granularity of each initial derivative index containing the target object is determined, and the dimension combination with the finest granularity is retained as the source of the derivative index of the target object. In the example above, wide table A and wide table B both contain the single dimension "dimension 1". The system automatically detects the dimension combinations that contain "dimension 1" (dimension 1_dimension 2_dimension 3 and dimension 1_dimension 4), compares these two multi-dimensional combinations, determines the priority from the finest granularity and the generation time, and retains the single dimension from the combination with the higher priority. The finest granularity of dimension 1_dimension 2_dimension 3 is 3 and that of dimension 1_dimension 4 is 2, so the single dimension "dimension 1" from data source wide table A is retained.
If the target object is not repeated across the dimension combinations of the initial derivative indexes, the derivative index of the target object is generated directly.
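The priority rule can be sketched as below. The sketch assumes that the "finest granularity" of a source is simply the number of dimensions in its full combination (3 for wide table A, 2 for wide table B, as in the example), and that ties keep the first source seen; the text also mentions generation time as a criterion, which is not modelled here. All names are hypothetical.

```python
from typing import Dict, List


def resolve_duplicates(sources: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each single dimension to the wide table that should supply it."""
    owner: Dict[str, str] = {}
    for table, dims in sources.items():
        for dim in dims:
            current = owner.get(dim)
            # Granularity is taken as the number of dimensions in the source's
            # full combination; a strictly finer source replaces the current one.
            if current is None or len(dims) > len(sources[current]):
                owner[dim] = table
    return owner


sources = {
    "A": ["dimension 1", "dimension 2", "dimension 3"],
    "B": ["dimension 1", "dimension 4"],
    "C": ["dimension 5"],
}
owner = resolve_duplicates(sources)
print(owner["dimension 1"])
```

Because the rule compares granularities rather than relying on iteration order, listing wide table B before wide table A gives the same owner for "dimension 1".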
In the third case, the extension case for index coding is as follows:
the index code is also configured automatically from the relevant factors of the basic index. For example, if the basic index "sending amount" is coded FSL, the index "daily sending amount" is coded FSL_D and the index "monthly sending amount" is coded FSL_M.
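As an illustration of the coding rule, the sketch below appends a time suffix to the base code. Only the suffixes D (day) and M (month) appear in the text; the mapping table and the function name `index_code` are hypothetical, and other granularities are deliberately left undefined.

```python
# Suffixes taken from the example in the text; no others are assumed.
TIME_SUFFIX = {"day": "D", "month": "M"}


def index_code(base_code: str, time_granularity: str) -> str:
    """Build a derived index code such as FSL_D from a base code and a time granularity."""
    return f"{base_code}_{TIME_SUFFIX[time_granularity]}"


daily_code = index_code("FSL", "day")
monthly_code = index_code("FSL", "month")
print(daily_code, monthly_code)
```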
Further, the step S30 further includes performing index calculation on the generated derived index, including:
and performing index calculation according to the basic indexes and the related factors included in the respective defined derivative indexes to obtain the index values of the corresponding user-defined derivative indexes.
Specifically, the index calculation is divided into: basic calculation, composite calculation and user-defined calculation.
For basic calculation, the current index can be computed by aggregating the basic information according to the relevant factors, automatically and without reference to other indexes; everything in the calculation formula can be obtained from the current basic information, so the index is calculated directly and automatically. For example, the index "provincial passenger sending amount" is calculated by aggregating the basic information "sending amount" over the data dimension "province" with the data type "passenger transport"; the index name "provincial passenger sending amount", the data dimension "province", the data type "passenger transport" and the index value "FSL" are recorded.
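A basic calculation of this kind amounts to a filtered group-and-sum. The following sketch shows one way to express it in plain Python; the row layout, the field names and the sample figures are made up for illustration and do not come from the patent.

```python
from collections import defaultdict
from typing import Dict, List


def basic_calculation(rows: List[dict], group_by: str, value_field: str,
                      filters: Dict[str, str]) -> Dict[str, float]:
    """Sum value_field per group_by value over the rows that match every filter."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        if all(row[key] == wanted for key, wanted in filters.items()):
            totals[row[group_by]] += row[value_field]
    return dict(totals)


# Illustrative rows only.
rows = [
    {"province": "P1", "transport_class": "passenger transport", "sending_amount": 100},
    {"province": "P1", "transport_class": "passenger transport", "sending_amount": 50},
    {"province": "P1", "transport_class": "freight transport", "sending_amount": 70},
    {"province": "P2", "transport_class": "passenger transport", "sending_amount": 30},
]
provincial = basic_calculation(rows, "province", "sending_amount",
                               {"transport_class": "passenger transport"})
print(provincial)
```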
For composite calculation, the current index is calculated from one or more other indexes using the four arithmetic operations. For example, the index "provincial average daily loaded cars" is calculated by dividing the index "provincial loaded cars" by the number of days, and the index "provincial average passenger travel distance" is calculated by dividing the provincial passenger turnover index by the index "number of passengers carried in the province". For the index "provincial average daily loaded cars", the index name "provincial average daily loaded cars", the data dimension "province" and the index value "ZCS" may be recorded; for the index "provincial average passenger travel distance", the index name "provincial average passenger travel distance", the data dimension "province", the data type "passenger transport" and the index value "PJYC" may be recorded.
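A composite calculation combines already computed index values with an arithmetic operator. The sketch below is one possible shape for this, keyed by province; the operator table, the function name and the numbers are illustrative assumptions.

```python
import operator
from typing import Dict

# The four arithmetic operations mentioned in the text.
OPERATIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def composite_calculation(left: Dict[str, float], op: str,
                          right: Dict[str, float]) -> Dict[str, float]:
    """Apply an arithmetic operation to two indexes that share the same dimension keys."""
    return {key: OPERATIONS[op](left[key], right[key]) for key in left if key in right}


# Illustrative values: cars loaded in a 30-day month, per province.
loaded_cars = {"P1": 3000, "P2": 1500}
days = {"P1": 30, "P2": 30}
daily_loaded_cars = composite_calculation(loaded_cars, "/", days)
print(daily_loaded_cars)
```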
For custom calculation, the current index is calculated by connecting to a relational database and running custom SQL. For example, the custom index "city monthly sending amount" can be calculated with the custom SQL "select ny, city, sum(rs) from wide table A group by ny, city", and the index name "city monthly sending amount", the data dimensions "city" and "month", and the index value "FSL" are recorded according to the custom SQL.
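To make the custom SQL concrete, the sketch below runs an equivalent query against an in-memory SQLite database using Python's standard `sqlite3` module. The table name `wide_table_a`, the interpretation of `ny` as a year-month value and `rs` as a head count, and the sample rows are assumptions made for the illustration; the patent does not name a database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table wide_table_a (ny text, city text, rs integer)")
# Illustrative rows only.
conn.executemany(
    "insert into wide_table_a (ny, city, rs) values (?, ?, ?)",
    [("202201", "X", 10), ("202201", "X", 5), ("202201", "Y", 7), ("202202", "X", 3)],
)

# Same shape as the custom SQL in the text; "order by" is added only to make
# the output order deterministic.
custom_sql = "select ny, city, sum(rs) from wide_table_a group by ny, city order by ny, city"
city_month_rows = conn.execute(custom_sql).fetchall()
conn.close()
print(city_month_rows)
```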
Further, after the step S30, the method further includes: storing the indexes;
Specifically, a timestamp is automatically assigned according to the basic index in each custom derivative index, and the custom derivative indexes are stored in columnar form according to the timestamps. Data are stored in column-based logical storage units, and the data of one column are stored contiguously on the storage medium. In actual use, a query accesses only the columns it involves, which greatly reduces the disk I/O of the system; and because the values within a column share the same data type and have similar characteristics, flexible use by users is supported and a foundation is laid for subsequent flexible analysis. In addition, different compression algorithms can be chosen according to the data characteristics, which reduces storage space to some extent; in general, the data stored for the defined indexes need only 20% of the storage space of the original wide table.
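The idea that a columnar layout lets a query touch only the columns it needs can be illustrated with a toy in-memory structure. This is not a storage engine and not the patent's implementation; the class `ColumnStore`, its methods and the sample records are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


class ColumnStore:
    """Toy column-oriented store: one contiguous Python list per column."""

    def __init__(self, columns: Sequence[str]) -> None:
        self.columns: Dict[str, List[object]] = {name: [] for name in columns}

    def append(self, record: dict) -> None:
        # Every column receives exactly one value per record (None if absent).
        for name, values in self.columns.items():
            values.append(record.get(name))

    def select(self, names: Sequence[str]) -> List[Tuple[object, ...]]:
        # Only the requested columns are read; the others are never touched.
        return list(zip(*(self.columns[name] for name in names)))


store = ColumnStore(["timestamp", "index_name", "month", "sending_amount"])
store.append({"timestamp": 1, "index_name": "FSL_M", "month": "2022-01", "sending_amount": 120})
store.append({"timestamp": 2, "index_name": "FSL_M", "month": "2022-02", "sending_amount": 95})
selected = store.select(["month", "sending_amount"])
print(selected)
```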
Further, to make the data storage in this scheme clearer, the implementation of the invention is described in detail using the creation, calculation and storage of the index "passenger sending volume_national railway_month_enterprise_station" as an example. The basic data set is wide table A, in which the basic index is the sending amount and the relevant factors include year-month, ticket class, seat class, transport class, province name, city, line, enterprise, station section, station and train number.
Firstly, the basic index "sending amount" is entered and its relevant factors are identified:
1) data type: "passenger transport", obtained from the transport class;
2) data name: the sending amount, with the code "FSL";
3) data source: wide table A;
4) data dimension: the time dimension is "year-month", and the space dimensions are "ticket class", "seat class", "transport class", "province name", "city", "line", "enterprise", "station section", "station" and "train number";
5) data caliber: distinguished by the enterprise attribute; enterprises belonging to the state are recorded under the national railway caliber, and all enterprises are recorded under the all-inclusive (full-content) caliber;
6) data version: recorded as "release version" by default;
7) data value: the value of the sending amount.
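The seven relevant factors identified above can be held in a simple record. The sketch below uses a Python dataclass whose field names are hypothetical; the values are those of the worked example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BasicIndex:
    """One basic index together with its relevant factors (field names are illustrative)."""
    data_type: str
    data_name: str
    code: str
    data_source: str
    time_dimensions: List[str]
    space_dimensions: List[str]
    data_calibers: List[str]
    data_version: str = "release version"   # default stated in the text
    data_value_field: str = "sending amount"


sending_amount = BasicIndex(
    data_type="passenger transport",
    data_name="sending amount",
    code="FSL",
    data_source="wide table A",
    time_dimensions=["year-month"],
    space_dimensions=["ticket class", "seat class", "transport class", "province name",
                      "city", "line", "enterprise", "station section", "station",
                      "train number"],
    data_calibers=["national railway", "full-content"],
)
print(sending_amount.code, len(sending_amount.space_dimensions))
```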
Secondly, automatically defining indexes:
2.1 Extension of the data dimension: from the space dimensions, the dimension combination "ticket class_seat class_transport class_province name_city_line_enterprise_station section_station_train number" and the single dimensions "ticket class", "seat class", "transport class", "province name", "city", "line", "enterprise", "station section", "station" and "train number" are generated automatically.
2.2 Definition of the derivative indexes: the following indexes can be defined automatically from the data dimensions:
"passenger sending volume_national railway/full-content_month/year_ticket class_seat class_transport class_province name_city_line_enterprise_station section_station_train number"
"passenger sending volume_national railway/full-content_month/year_ticket class"
"passenger sending volume_national railway/full-content_month/year_seat class"
"passenger sending volume_national railway/full-content_month/year_transport class"
"passenger sending volume_national railway/full-content_month/year_province name"
"passenger sending volume_national railway/full-content_month/year_city"
"passenger sending volume_national railway/full-content_month/year_line"
"passenger sending volume_national railway/full-content_month/year_enterprise"
"passenger sending volume_national railway/full-content_month/year_station section"
"passenger sending volume_national railway/full-content_month/year_station"
"passenger sending volume_national railway/full-content_month/year_train number"
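The naming pattern above can be generated mechanically. The sketch below assumes that "national railway/full-content" and "month/year" in the list are shorthand for alternatives, so that each of the 11 dimension combinations yields one concrete index name per caliber and time granularity; this reading, the function name and the separator are assumptions.

```python
from itertools import product
from typing import List

SPACE_DIMS = ["ticket class", "seat class", "transport class", "province name", "city",
              "line", "enterprise", "station section", "station", "train number"]


def derived_index_names(prefix: str, calibers: List[str], time_grains: List[str],
                        space_dims: List[str], sep: str = "_") -> List[str]:
    """Build one name per (dimension combination, caliber, time granularity)."""
    combos = list(space_dims)
    if len(space_dims) > 1:
        combos.insert(0, sep.join(space_dims))  # full combination first, then singles
    return [sep.join([prefix, caliber, grain, combo])
            for combo in combos
            for caliber, grain in product(calibers, time_grains)]


names = derived_index_names("passenger sending volume",
                            ["national railway", "full-content"],
                            ["month", "year"], SPACE_DIMS)
print(len(names))
print(names[0])
```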
Thirdly, the derivative indexes are calculated by automatic index calculation: the data dimensions are grouped in a group by clause, the data caliber and data type are applied as conditions in a where clause, and the SQL is generated automatically. For example, the SQL for the derivative index "passenger sending volume_national railway_month_ticket class_seat class_transport class_province name_city_line_enterprise_station section_station_train number" is as follows:
select month, ticket_class, seat_class, transport_class, province_name, city, line, enterprise, station_section, station, train_number,
sum(sending_amount)
from wide_table_A
where transport_class = 'passenger transport'
and enterprise_attribute = 'national'
group by month, ticket_class, seat_class, transport_class, province_name, city, line, enterprise, station_section, station, train_number
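The automatic SQL generation described in this step can be sketched as a small query builder: dimensions go into the select list and the group by clause, while caliber and data type become where conditions. The builder below and its test run against SQLite are illustrative only; the table name, column names and sample rows are hypothetical, and a reduced set of dimensions is used to keep the example short.

```python
import sqlite3
from typing import Dict, List, Tuple


def build_index_sql(table: str, dimensions: List[str], value_column: str,
                    conditions: Dict[str, str]) -> Tuple[str, List[str]]:
    """Return (sql, parameters): dimensions are grouped, conditions are filtered.

    Identifiers are interpolated directly, so they must come from trusted
    metadata; condition values are passed as bound parameters.
    """
    columns = ", ".join(dimensions)
    where = ""
    if conditions:
        where = " where " + " and ".join(f"{column} = ?" for column in conditions)
    sql = f"select {columns}, sum({value_column}) from {table}{where} group by {columns}"
    return sql, list(conditions.values())


conn = sqlite3.connect(":memory:")
conn.execute("create table wide_table_a (month text, station text, transport_class text, "
             "enterprise_attribute text, sending_amount integer)")
conn.executemany("insert into wide_table_a values (?, ?, ?, ?, ?)", [
    ("2022-01", "S1", "passenger transport", "national", 10),
    ("2022-01", "S1", "passenger transport", "national", 5),
    ("2022-01", "S2", "passenger transport", "local", 7),
    ("2022-01", "S2", "freight transport", "national", 9),
    ("2022-02", "S1", "passenger transport", "national", 4),
])
sql, params = build_index_sql("wide_table_a", ["month", "station"], "sending_amount",
                              {"transport_class": "passenger transport",
                               "enterprise_attribute": "national"})
result = sorted(conn.execute(sql, params).fetchall())
conn.close()
print(sql)
print(result)
```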
Fourthly, the derivative indexes are stored: a timestamp is automatically assigned to the basic information "sending amount", and the related derivative indexes are stored in the big data wide table. For example, the data of "passenger sending volume_national railway_month_ticket class_seat class_transport class_province name_city_line_enterprise_station section_station_train number" are recorded in the following fields: "timestamp", "index name", "data caliber", "data type", "month", "ticket class", "seat class", "transport class", "province name", "city", "line", "enterprise", "station section", "station", "train number" and "sending amount" (where the field "sending amount" is of numeric type). As another example, "passenger sending volume_full-content_year_train number" is recorded in the fields "timestamp", "index name", "data caliber", "data type", "year", "train number" and "sending amount" (where the field "sending amount" is of numeric type).
In addition, an embodiment of the present invention further provides a storage medium storing an automatic index definition program based on a big data wide table, where the program, when executed by a processor, implements the steps of the above automatic index definition method based on a big data wide table.
The specific embodiments of the readable storage medium of the present invention are substantially the same as the embodiments of the automatic index definition method based on a big data wide table, and are not described again here.
In addition, the invention also provides an index automatic definition system based on the big data wide table, which comprises the following components:
the basic index entry module, which is used for receiving and identifying an entered basic data set, wherein the basic data set comprises basic indexes and relevant factors corresponding to the basic indexes, the basic indexes comprise definitions of data values, and the relevant factors comprise data types, data names, data wide tables, data dimensions, data caliber, data versions and data values;
the index building module, which is used for automatically generating various dimension combinations (single-dimension and multi-dimension) according to the basic indexes and their corresponding relevant factors, and for judging by priority which indexes of the various dimension combinations to generate;
the index calculation module, which is used for automatically calculating the indexes of the various dimension combinations according to the calculation mode of the indexes;
and the index storage module, which is used for automatically generating indexes according to the basic indexes and storing the indexes of the various dimension combinations in a columnar storage mode.
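The generation of single-dimension and multi-dimension combinations by the index building module can be sketched as follows. This is a minimal illustration using the standard library: the function names and the underscore naming scheme are assumptions, and the priority judgment mentioned above is not modeled here.

```python
from itertools import combinations


def dimension_combinations(dimensions):
    """Yield every non-empty combination, single-dimension ones first."""
    for size in range(1, len(dimensions) + 1):
        yield from combinations(dimensions, size)


def derived_index_names(measure_label, caliber, dimensions):
    """Name one candidate derived index per dimension combination."""
    return [
        "_".join([measure_label, caliber, *combo])
        for combo in dimension_combinations(dimensions)
    ]


names = derived_index_names("transmission_amount", "all", ["year", "month", "station"])
```

With n dimensions this enumeration produces 2^n - 1 candidates, so the eleven dimensions of the passenger example would give 2047 combinations before any priority-based filtering.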
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (9)

1. An index automatic definition method based on a big data wide table, characterized by comprising the following steps:
receiving an input basic data set, wherein the basic data set comprises basic indexes and relevant factors corresponding to the basic indexes, the basic indexes comprise sending amount, turnover amount and income, and the relevant factors comprise data types, data names, data sources, data dimensions, data caliber, data versions and data values;
identifying the basic indexes in the input basic data set and their corresponding relevant factors;
and obtaining custom derived indexes according to the basic indexes and the relevant factors.
2. The method according to claim 1, wherein the step of obtaining custom derived indexes according to the basic indexes and the relevant factors comprises:
automatically defining a plurality of derived indexes according to the forward extension of each relevant factor, wherein the forward extension comprises the extension of data dimensions, the extension of data sources and the extension of index coding.
3. The method according to claim 2, wherein the step of automatically defining a plurality of derived indexes according to the forward extension of each relevant factor comprises:
freely combining the data dimensions in the relevant factors into various dimension combinations to obtain the derived indexes of the various dimension combinations.
4. The method according to claim 3, wherein the step of freely combining the data dimensions in the relevant factors into various dimension combinations to obtain the derived indexes of the various dimension combinations comprises:
identifying the data source corresponding to each basic index;
freely combining the data sources and the data dimensions to obtain initial derived indexes;
judging whether a target object in the initial derived indexes is repeated across the dimension combinations;
if yes, determining the dimension priority of the target object;
if not, directly generating the derived index of the target object.
5. The method according to claim 4, wherein the step of determining the dimension priority of the target object comprises:
determining the finest granularity of each initial derived index containing the target object, and keeping the dimension combination whose finest granularity is the smallest as the derived index of the target object.
6. The method according to any one of claims 1 to 5, wherein the step of obtaining custom derived indexes according to the basic indexes and the relevant factors comprises:
performing index calculation according to the basic indexes and the relevant factors included in each custom derived index to obtain the index value of the corresponding custom derived index.
7. The method according to claim 6, wherein the step of obtaining custom derived indexes according to the basic indexes and the relevant factors is followed by:
automatically assigning corresponding timestamps according to the basic indexes in the custom derived indexes;
and storing the corresponding custom derived indexes in a columnar storage mode according to the timestamps.
8. An index automatic definition system based on a big data wide table, comprising a memory, a processor and an index automatic definition program based on the big data wide table that is stored in the memory and can run on the processor, wherein the program, when executed by the processor, implements the steps of the index automatic definition method based on the big data wide table according to any one of claims 1 to 7.
9. A storage medium, characterized in that the storage medium stores an index automatic definition program based on a big data wide table, and the program, when executed by a processor, implements the steps of the index automatic definition method based on the big data wide table according to any one of claims 1 to 7.
CN202210568574.3A 2022-05-24 2022-05-24 Index automatic definition method, system and storage medium based on big data wide table Pending CN114912815A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210568574.3A CN114912815A (en) 2022-05-24 2022-05-24 Index automatic definition method, system and storage medium based on big data wide table

Publications (1)

Publication Number Publication Date
CN114912815A true CN114912815A (en) 2022-08-16

Family

ID=82769557

Country Status (1)

Country Link
CN (1) CN114912815A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106557498A (en) * 2015-09-25 2017-04-05 北京国双科技有限公司 Date storage method and device and data query method and apparatus
CN113298354A (en) * 2021-04-28 2021-08-24 上海淇玥信息技术有限公司 Automatic generation method and device of business derivative index and electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
数据产品小LEE: "一文帮你更好地理解指标", pages 1 - 7, Retrieved from the Internet <URL:https://www.woshipm.com/data-analysis/4364653.html> *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115718571A (en) * 2022-11-23 2023-02-28 深圳计算科学研究院 Data management method and device based on multi-dimensional features
CN115718571B (en) * 2022-11-23 2023-08-22 深圳计算科学研究院 Data management method and device based on multidimensional features

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Zheng Shenggang

Inventor after: Yang Xuebin

Inventor after: Gong Lianping

Inventor after: Fang Xingjun

Inventor after: Xie Moyi

Inventor after: Wu Renjian

Inventor before: Gong Lianping