CN111125159A - Data statistics method and device - Google Patents

Info

Publication number
CN111125159A
Authority
CN
China
Prior art keywords
index
dimension
dimension group
node
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911347191.8A
Other languages
Chinese (zh)
Inventor
胡维达
张靖南
朱健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Asiainfo Technologies China Inc
Original Assignee
Asiainfo Technologies China Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Asiainfo Technologies China Inc
Priority to CN201911347191.8A
Publication of CN111125159A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution

Abstract

The invention discloses a data statistics method and a data statistics device. The data statistics method comprises the following steps: acquiring a preset dimension group and a preset index group corresponding to a target data source; obtaining a dimension group tree according to the preset dimension group, wherein each node in the dimension group tree is a dimension group, the root node is the preset dimension group, each child node is a proper subset of its parent node and contains one dimension fewer than its parent node, and each lowest-layer node contains only one dimension; counting each index in the preset index group in the target data source according to all dimensions in the preset dimension group to obtain statistical data of each index of the root node; and, for each node other than the root node, counting each index in the preset index group in the statistical data of the node's parent node according to all dimensions of the node to obtain a statistical result of each index of the node. The invention can effectively improve the response speed of the system when querying data while reducing the burden on the system.

Description

Data statistics method and device
Technical Field
The present invention relates to the field of data analysis, and in particular, to a method and an apparatus for data statistics.
Background
With the progress of the times and the development of science and technology, people understand and change the world through data. In the present era of information explosion, databases are needed to store large amounts of data. People can query the information they need from a database, but most of the data stored in database tables is summarized at the finest-grained dimensions. Querying summary information at coarser-grained dimensions requires the database to aggregate in real time, so the database responds slowly, and overly frequent queries place a heavy burden on the system.
Disclosure of Invention
In view of the above, the present invention provides a data statistics method and device that can effectively improve the response speed of the system when data is queried while reducing the burden on the system.
In order to achieve the above object, the present invention provides the following technical solutions:
A first aspect of the invention discloses a data statistics method, comprising:
acquiring a preset dimension group and a preset index group corresponding to a target data source;
obtaining a dimension group tree according to the preset dimension group, wherein each node in the dimension group tree is a dimension group, a root node in the dimension group tree is the preset dimension group, child nodes in the dimension group tree are proper subsets of parent nodes of the child nodes, the number of dimensions contained in the child nodes in the dimension group tree is 1 less than that contained in the parent nodes of the child nodes, the dimensions in any two nodes with the same level are not completely the same, and the lowest node in the dimension group tree is a dimension group containing only one dimension;
counting each index in the preset index group in the target data source according to all dimensions in the preset dimension group to obtain statistical data of each index of the root node;
for each node in the dimension group tree other than the root node:
counting each index in the preset index group in the statistical data of the node's parent node according to all dimensions of the node, to obtain a statistical result of each index of the node.
Optionally, the method further includes:
obtaining a query statement sent by a query party, wherein the query statement comprises a target dimension and a target index, searching a dimension group formed by each target dimension in the query statement from the dimension group tree, obtaining a statistical result of the target index from a statistical result of each index corresponding to the searched dimension group, and sending the statistical result of the target index serving as the query result to the query party.
Optionally, the method further includes:
and creating an index corresponding to the data generation date according to the data generation date of the data source, and storing each statistical result obtained according to the target data source into the index corresponding to the data generation date of the target data source.
Optionally, the method further includes:
and encapsulating an application program interface for providing a query function, wherein the application program interface is used for receiving the query statement and sending a query result.
Optionally, the data types of the target data source are: one of maintenance data, transportation data, production data, and business data.
The second aspect of the invention discloses a data statistical device,
the device comprises: a data acquisition unit, a dimension group tree acquisition unit, a first statistic unit and a second statistic unit,
the data acquisition unit is used for acquiring a preset dimension group and a preset index group corresponding to the target data source;
the dimension group tree obtaining unit is configured to obtain a dimension group tree according to the preset dimension group, where each node in the dimension group tree is a dimension group, the root node in the dimension group tree is the preset dimension group, a child node in the dimension group tree is a proper subset of its parent node, the number of dimensions contained in a child node is 1 less than the number of dimensions contained in its parent node, the dimensions in any two nodes at the same level are not completely the same, and the lowest node in the dimension group tree is a dimension group containing only one dimension;
the first statistical unit is configured to perform statistics on each index in the preset index group in the target data source according to all dimensions in the preset dimension group, so as to obtain statistical data of each index of the root node;
the second statistical unit is configured to, for each node in the dimension group tree other than the root node:
count each index in the preset index group in the statistical data of the node's parent node according to all dimensions of the node, to obtain a statistical result of each index of the node.
Optionally, the apparatus further comprises a query unit,
the query unit is configured to obtain a query statement sent by a query party, where the query statement includes a target dimension and a target index, search a dimension group formed by each target dimension in the query statement from the dimension group tree, obtain a statistical result of the target index from a statistical result of each index corresponding to the found dimension group, and send the statistical result of the target index as a query result to the query party.
Optionally, the apparatus further comprises an index generation unit,
and the index generating unit is used for creating an index corresponding to the data generation date according to the data generation date of the data source and storing each statistical result obtained according to the target data source into the index corresponding to the data generation date of the target data source.
Optionally, the apparatus further comprises an encapsulation unit,
the encapsulation unit is configured to encapsulate an application program interface that provides a query function, the application program interface being used to receive a query statement and send a query result.
Optionally, the data types of the target data source are: one of maintenance data, transportation data, production data, and business data.
The invention discloses a data statistics method and device. The method comprises: acquiring a preset dimension group and a preset index group corresponding to a target data source; obtaining a dimension group tree according to the preset dimension group, wherein each node in the dimension group tree is a dimension group, the root node in the dimension group tree is the preset dimension group, child nodes in the dimension group tree are proper subsets of their parent nodes, the number of dimensions contained in a child node is 1 less than that contained in its parent node, the dimensions in any two nodes at the same level are not completely the same, and the lowest node in the dimension group tree is a dimension group containing only one dimension; counting each index in the preset index group in the target data source according to all dimensions in the preset dimension group to obtain statistical data of each index of the root node; and, for each node in the dimension group tree other than the root node, counting each index in the preset index group in the statistical data of the node's parent node according to all dimensions of the node to obtain the statistical result of each index of the node. According to the invention, every combination of the required dimensions and the index statistical result corresponding to each combination can be obtained in advance, so no real-time calculation is needed when the relevant personnel query a target dimension and a target index. This effectively improves the response speed of the system when querying data and reduces the burden on the system.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
Fig. 1 is a schematic flow chart of a data statistics method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a dimension group tree according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a data statistics apparatus according to an embodiment of the present invention.
Detailed Description
The invention discloses a data statistical method and a device, and a person skilled in the art can appropriately improve technical implementation details by taking the contents of the text as reference. It is expressly intended that all such similar substitutes and modifications which would be obvious to one skilled in the art are deemed to be included in the invention. While the methods and applications of this invention have been described in terms of preferred embodiments, it will be apparent to those of ordinary skill in the art that variations and modifications in the methods and applications described herein, as well as other suitable variations and combinations, may be made to implement and use the techniques of this invention without departing from the spirit and scope of the invention.
With the progress of the times and the development of science and technology, people understand and change the world through data. In the present era of information explosion, databases are needed to store large amounts of data. People can query the information they need from a database, but most of the data stored in database tables is summarized at the finest-grained dimensions. Querying summary information at coarser-grained dimensions requires the database to aggregate in real time, so the database responds slowly, and overly frequent queries place a heavy burden on the system.
Besides the slow response and the system burden, users must also know which specific table in the database holds the data they want before they can retrieve it, which is inconvenient and places higher demands on the user.
Therefore, the data statistics method and device provided by the invention can effectively improve system response speed when querying data, reduce the system burden, and lower the threshold for use.
As shown in fig. 1, the present invention provides a data statistics method, including:
Step S101: acquire a preset dimension group and a preset index group corresponding to the target data source.
It should be noted that a dimension is generally a feature of a phenomenon, such as region, gender, or version. A dimension group is a combination of the desired dimensions and contains at least one dimension. An index (that is, a metric or indicator) is generally a unit and method for measuring the degree of development of something; it usually has to be calculated and is obtained under certain conditions, such as yield, speed, or growth rate. An index group is a combination of indexes and contains at least one index.
The preset dimension group may be formed by all or some of the dimensions in the target data source, and the preset index group may be formed by all or some of the indexes in the target data source. The indexes in the preset index group may also be derived statistically from some or all of the indexes in the target data source. For example, if the target data source has the index height, an index in the preset index group may be average height, which is computed from the height values in the target data source.
Optionally, the data type of the target data source may be one of maintenance data, transportation data, production data, and business data. For example, the maintenance data may be the maintenance data of each vehicle model of a certain brand in each region of the country in a certain year; the transportation data may be the weight of each vegetable variety transported from one province to another province in a certain month; the production data may be the daily production capacity and daily power consumption of a daily chemical manufacturing plant in a certain season; and the business data may be the number of calls handled by a certain communication group on a certain day.
Optionally, if the preset dimension group contains a dimension that is not present in the target data source, the dimension is added to the target data source, and the data of each index corresponding to that dimension is NULL, that is, empty.
Step S102: obtain a dimension group tree according to the preset dimension group. Each node in the dimension group tree is a dimension group, the root node in the dimension group tree is the preset dimension group, child nodes in the dimension group tree are proper subsets of their parent nodes, the number of dimensions contained in a child node is 1 less than that contained in its parent node, the dimensions in any two nodes at the same level are not completely the same, and the lowest node in the dimension group tree is a dimension group containing only one dimension.
It should be noted that the dimensions of any two nodes at the same level are not completely the same. During tree generation, several nodes with identical dimensions may be produced at the same level; these nodes are deduplicated and only one of them is retained, which guarantees that no two nodes at any level contain exactly the same dimensions.
Optionally, in a specific embodiment, Fig. 2 shows the structure of the dimension group tree. If the target data source is automobile maintenance data, the obtained preset dimension group corresponding to the target data source may be (automobile brand, automobile type, origin, engine type).
Further, the dimension group can be used as a root dimension group, and a dimension group tree is generated.
The first-layer dimension groups are obtained from the root dimension group (automobile brand, automobile type, origin, engine type). They are (automobile brand, automobile type, origin), (automobile brand, automobile type, engine type), (automobile brand, origin, engine type), and (automobile type, origin, engine type).
Because no two nodes at the same level may contain exactly the same dimensions, some nodes generate fewer child nodes or none at all. The second-layer dimension groups (automobile brand, automobile type), (automobile brand, origin), and (automobile type, origin) are obtained from the first-layer dimension group (automobile brand, automobile type, origin); the second-layer dimension groups (automobile brand, engine type) and (automobile type, engine type) are obtained from the first-layer dimension group (automobile brand, automobile type, engine type); and the second-layer dimension group (origin, engine type) is obtained from the first-layer dimension group (automobile brand, origin, engine type).
The third-layer dimension groups (automobile brand) and (automobile type) are generated from the second-layer dimension group (automobile brand, automobile type); the third-layer dimension group (origin) is generated from the second-layer dimension group (automobile brand, origin); and the third-layer dimension group (engine type) is generated from the second-layer dimension group (automobile brand, engine type).
Each generated third-layer dimension group contains exactly one dimension.
It should be noted that, if the dimensions in the acquired root dimension group are all dimensions of the target data source, the root dimension group may be referred to as a finest-grained dimension group, and the third-layer dimension group is referred to as a coarsest-grained dimension group.
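The tree construction of step S102 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the patent's implementation: the function name `build_dimension_group_tree` is hypothetical, and keeping the first parent that produces a duplicated child is one possible deduplication rule (the text only requires that one node be retained). The dimension names come from the Fig. 2 example.

```python
from itertools import combinations


def build_dimension_group_tree(root):
    """Map every dimension group (a tuple) to its parent group.

    Assumption: when two parents at the same level would produce the same
    child, the child is attached to the first parent that produced it.
    """
    root = tuple(root)
    parent_of = {root: None}          # the root node has no parent
    level = [root]
    while len(level[0]) > 1:          # stop once groups hold a single dimension
        next_level = []
        for parent in level:
            # Each child drops exactly one dimension from its parent.
            for child in combinations(parent, len(parent) - 1):
                if child not in parent_of:   # deduplicate within the level
                    parent_of[child] = parent
                    next_level.append(child)
        level = next_level
    return parent_of


TREE = build_dimension_group_tree(
    ("automobile brand", "automobile type", "origin", "engine type")
)
```

With this rule the sketch reproduces the parent and child relationships listed in the example, for instance (origin) under (automobile brand, origin).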
Step S103: count each index in the preset index group in the target data source according to all dimensions in the preset dimension group, to obtain the statistical data of each index of the root node.
Optionally, in a specific embodiment, the dimensions in the target data source are computer brand, computer type, CPU type, and graphics card type, and the index is sales volume, as shown in Table 1:
TABLE 1 Computer sales data sheet
Computer brand | Computer type | CPU type | Graphics card type | Sales volume
A | Notebook computer | I7 | 1080 | 1000
A | Desktop computer | I7 | 960 | 1500
B | Notebook computer | I7 | 1080 | 2000
B | Desktop computer | I5 | 960 | 2500
The preset dimension group is set to (computer type, CPU type, graphics card type); the computer brand in the target data source is not used as a dimension in the preset dimension group. If the target index is sales volume, the sales volume is counted by computer type, CPU type, and graphics card type to obtain the statistical data shown in Table 2:
TABLE 2 Computer sales data sheet
Computer type | CPU type | Graphics card type | Sales volume
Notebook computer | I7 | 1080 | 3000
Desktop computer | I7 | 960 | 1500
Desktop computer | I5 | 960 | 2500
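The root-node statistics of step S103 can be illustrated with the Table 1 data. The sketch below assumes that "counting" the sales volume index means summing it, which is consistent with the values in Table 2; the names `TABLE_1`, `aggregate_root`, and `TABLE_2` are hypothetical.

```python
from collections import defaultdict

# Rows of Table 1:
# (computer brand, computer type, CPU type, graphics card type, sales volume)
TABLE_1 = [
    ("A", "Notebook computer", "I7", "1080", 1000),
    ("A", "Desktop computer", "I7", "960", 1500),
    ("B", "Notebook computer", "I7", "1080", 2000),
    ("B", "Desktop computer", "I5", "960", 2500),
]


def aggregate_root(rows):
    """Sum sales volume per (computer type, CPU type, graphics card type).

    The computer brand is ignored because it is not part of the preset
    dimension group in this example.
    """
    totals = defaultdict(int)
    for _brand, kind, cpu, gpu, sales in rows:
        totals[(kind, cpu, gpu)] += sales
    return dict(totals)


TABLE_2 = aggregate_root(TABLE_1)
```

The two notebook rows for brands A and B collapse into one row with sales volume 3000, matching Table 2.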
Step S104: for each node in the dimension group tree other than the root node:
count each index in the preset index group in the statistical data of the node's parent node according to all dimensions of the node, to obtain a statistical result of each index of the node.
For dimension groups other than the finest-grained dimension group, this embodiment does not compute the index statistics from the target data source; it computes them directly from the statistical result of the parent node. Because the indexes have already been counted for the parent node, the statistical result of the dimension group can be obtained from the parent's result with only simple calculation or processing. This greatly reduces the amount of querying and computation and further reduces the operating burden on the system.
For ease of understanding, step S104 is illustrated below:
Let the target data source be an automobile yield and maintenance record table. The table includes the automobile brand, automobile type, origin, engine type, tire type, yield, and a record of each maintenance.
The preset dimension group is (automobile brand, automobile type, origin, engine type).
The index set is (yield, maintenance times).
The data shown in table 3 can be obtained by step S103.
Wherein (yield, maintenance frequency) is a statistical result, and the corresponding dimension group is (automobile brand, automobile type, origin, engine type).
Optionally, in a specific embodiment, the preset dimension group and the statistical result of the preset dimension group are as shown in table 3:
TABLE 3
[Table 3 is reproduced as images in the original publication: Figure BDA0002333702730000071, Figure BDA0002333702730000081]
As shown in fig. 2, the first-tier dimension sets may be obtained from a root dimension set (automobile brand, automobile type, origin, engine type), where the first-tier dimension sets are (automobile brand, automobile type, origin), (automobile brand, automobile type, engine type), (automobile brand, origin, engine type), and (automobile type, origin, engine type), respectively. The statistics of the first layer dimension groups can be further obtained according to the statistics of table 3. For example: from the statistics of table 3, statistics of the (car brand, car type, origin) dimension groups as shown in table 4 can be obtained:
TABLE 4
Automobile brand | Automobile type | Origin | Yield (vehicles) | Maintenance count
Brand A automobile | Sedan | Germany | 10000 | 10000
Brand A automobile | SUV | China | 20000 | 20000
Brand A automobile | Off-road vehicle | Germany | 30000 | 30000
Brand B automobile | Sedan | China | 10000 | 10000
Brand B automobile | SUV | Germany | 20000 | 20000
Brand B automobile | Off-road vehicle | China | 30000 | 30000
No two rows of Table 3 share the same combination of automobile brand, automobile type, and origin. Therefore, when computing the statistical result for the dimension group (automobile brand, automobile type, origin), the statistical result of Table 3 can be used directly as the statistical result of Table 4; the indexes for this first-layer dimension group do not need to be counted again from the target data source, which reduces the system burden.
The other first-layer dimension groups and their statistical results are obtained in the same way.
As shown in fig. 2, the second-layer dimension group (car brand, car type), (car brand, origin) can be obtained from the first-layer dimension group (car brand, car type, origin), and the statistical results of the second-layer dimension groups can be further obtained according to the statistical results in table 4. For example: from the statistics of table 4, statistics of the (brand of car, origin) dimension groups as shown in table 5 can be obtained:
TABLE 5
[Table 5 is reproduced as images in the original publication: Figure BDA0002333702730000082, Figure BDA0002333702730000091]
Because the indexes corresponding to (automobile brand, automobile type, origin) have already been counted in Table 4, computing the statistical result for the dimension group (automobile brand, origin) requires only simple calculation on the index statistics in Table 4. There is no need to count again over the large amount of data in the target data source, so fewer query and aggregation passes over the data source are needed and the system burden is greatly reduced.
The other second-layer dimension groups and their statistical results are obtained in the same way.
As shown in Fig. 2, the third-layer dimension group (origin) can be obtained from the second-layer dimension group (automobile brand, origin), and the statistical result of each third-layer dimension group can be further obtained from the statistical results in Table 5. For example, from the statistics of Table 5, the statistics of the (origin) dimension group shown in Table 6 can be obtained:
TABLE 6
Origin | Yield (vehicles) | Maintenance count
Germany | 60000 | 60000
China | 60000 | 60000
The other third-layer dimension groups and their statistical results are obtained in the same way.
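The roll-up from a parent node's statistics in step S104 can be sketched as follows, starting from the Table 4 values. It assumes both indexes (yield and maintenance count) are aggregated by summation; `roll_up` and the variable names are hypothetical. The intermediate (automobile brand, origin) values are computed by the sketch from Table 4 and are not copied from Table 5, which is only available as an image.

```python
# Table 4: (automobile brand, automobile type, origin) -> (yield, maintenance count)
DIMS_4 = ("automobile brand", "automobile type", "origin")
TABLE_4 = {
    ("Brand A", "Sedan", "Germany"): (10000, 10000),
    ("Brand A", "SUV", "China"): (20000, 20000),
    ("Brand A", "Off-road vehicle", "Germany"): (30000, 30000),
    ("Brand B", "Sedan", "China"): (10000, 10000),
    ("Brand B", "SUV", "Germany"): (20000, 20000),
    ("Brand B", "Off-road vehicle", "China"): (30000, 30000),
}


def roll_up(parent_dims, parent_stats, child_dims):
    """Aggregate a parent's statistics down to a child dimension group.

    Only the parent's already-aggregated rows are read; the raw data source
    is never touched. Assumption: every index is additive (summed).
    """
    positions = [parent_dims.index(d) for d in child_dims]
    child_stats = {}
    for key, values in parent_stats.items():
        child_key = tuple(key[p] for p in positions)
        previous = child_stats.get(child_key)
        if previous is None:
            child_stats[child_key] = tuple(values)
        else:
            child_stats[child_key] = tuple(a + b for a, b in zip(previous, values))
    return child_stats


# Second layer from the first layer, then third layer from the second layer.
BRAND_ORIGIN = roll_up(DIMS_4, TABLE_4, ("automobile brand", "origin"))
ORIGIN_ONLY = roll_up(("automobile brand", "origin"), BRAND_ORIGIN, ("origin",))
```

Under these assumptions the final roll-up yields 60000 vehicles and 60000 maintenance events for each origin, which agrees with Table 6. Indexes that are not additive, such as an average, would need a different combination rule than the summation used here.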
Optionally, the dimension group and the statistical result corresponding to the dimension group are stored.
Optionally, the method further includes:
obtaining a query statement sent by a querying party, where the query statement includes target dimensions and a target index; searching the dimension group tree for the dimension group formed by the target dimensions in the query statement; obtaining the statistical result of the target index from the statistical results of the indexes corresponding to the found dimension group; and sending the statistical result of the target index to the querying party as the query result.
Each target dimension in the query statement indicates a query condition of the current query statement; for example, each target dimension may be a query keyword of the current query statement. Each node in the dimension group tree corresponds to at least one dimension, and each dimension is generally a feature of a phenomenon, such as region or gender. When searching the dimension group tree, each target dimension in the query statement is compared with the dimensions of each node, and the dimension group of the node whose dimensions are consistent with or similar to the target dimensions is taken as the dimension group formed by the target dimensions in the query statement. Each dimension corresponds to at least one index, and each index has at least one statistical result. Therefore, after the dimension group is found, the statistical results of the indexes corresponding to that dimension group are obtained, and the statistical result of the target index is then extracted from them.
For example, if the querying party wants to query the yield of automobiles whose origin is China, the querying party selects the target dimension to query, origin, and inputs "China"; then selects the index to query, yield, and inputs "yield". The returned result is origin: China, yield: 60000.
Only the statistical result may be fed back as the query result. Alternatively, the statistical result and the corresponding dimension group may be returned together as the query result.
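The lookup described above can be sketched as a dictionary search over precomputed results. Everything named here is an assumption made for illustration: the `STORE` layout, the `query` function, and the use of exact matching on dimension names (the text also allows "similar" dimensions, which this sketch does not attempt). The values are those of Table 6.

```python
# Precomputed statistics keyed by the set of dimensions in each group.
# Only the (origin) node from Table 6 is stored in this small example.
STORE = {
    frozenset({"origin"}): {
        "dimensions": ("origin",),
        "rows": {
            ("Germany",): {"yield": 60000, "maintenance count": 60000},
            ("China",): {"yield": 60000, "maintenance count": 60000},
        },
    },
}


def query(store, target_dimensions, target_index):
    """Return the precomputed value of target_index, or None if not found.

    target_dimensions maps each dimension name to the requested value,
    for example {"origin": "China"}.
    """
    node = store.get(frozenset(target_dimensions))   # find the dimension group
    if node is None:
        return None
    wanted = tuple(target_dimensions[d] for d in node["dimensions"])
    row = node["rows"].get(wanted)                   # find the matching row
    return None if row is None else row.get(target_index)


RESULT = query(STORE, {"origin": "China"}, "yield")
```

No aggregation happens at query time; the call only reads a value that was computed in advance, which is the source of the response-speed benefit the text describes.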
Optionally, the method further includes:
creating an index corresponding to the data generation date according to the data generation date of the data source, storing each statistical result obtained according to the target data source into the index corresponding to the data generation date of the target data source, and dividing each statistical result corresponding to the target data source under different data generation dates in an index mode so as to realize automatic division of each statistical result. Wherein the index corresponding to the date of data generation can be obtained by, but not limited to, a hashing algorithm, and the index corresponding to different data generation dates is different. The statistical results obtained by the target data source correspond to respective data generation dates, and after the indexes of the data generation dates of the statistical results are calculated, the statistical results can be stored in the corresponding indexes.
For example, statistical results A, B, C, D, and E are obtained from target data source 1, target data source 2, and target data source 3. The data generation dates of statistical results A and B are the same, so A and B are stored under the index of that date; the data generation dates of statistical results C, D, and E are the same, so C, D, and E are stored under the index of that date.
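Date-based storage of this kind might look like the sketch below. The text says only that the index may be obtained by a hash algorithm, without naming one, so the use of SHA-256, the `stats_` prefix, the function names, and the two example dates are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def index_name_for(date_string):
    """Derive a storage index name from a data generation date.

    Assumption: SHA-256 is used as the hash; the same date always yields
    the same name.
    """
    digest = hashlib.sha256(date_string.encode("utf-8")).hexdigest()
    return "stats_" + digest[:12]


def store_by_date(results):
    """Group (data generation date, statistical result) pairs by date index."""
    indexes = defaultdict(list)
    for date_string, result in results:
        indexes[index_name_for(date_string)].append(result)
    return dict(indexes)


# Hypothetical dates: A and B share one date, C, D and E share another.
STORED = store_by_date([
    ("2019-12-01", "A"),
    ("2019-12-01", "B"),
    ("2019-12-02", "C"),
    ("2019-12-02", "D"),
    ("2019-12-02", "E"),
])
```

Results with the same data generation date land under the same index name, so the partitioning by date happens automatically as results are stored.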
Optionally, the method further includes:
creating an index, and storing in the index all dimension combinations in the dimension group tree that correspond to one or more indexes (metrics), so that dimension combinations sharing the same metric are associated with the index entry of that metric. The index may be created by, but is not limited to, a hash algorithm. For example, if the metric recorded in the index is yield, the combinations of the dimensions automobile brand, automobile type, origin, and engine type in the dimension group tree are stored in the index entry whose metric is yield.
It should be noted that, the query staff can know which data can be queried and which dimensions are combined by the index, so that the use threshold is reduced.
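A catalog of this kind can be sketched as a mapping from each metric to the dimension combinations for which results exist. The names `all_dimension_groups` and `build_catalog` are hypothetical, and a plain dictionary stands in for the hash-based index the text mentions. The sketch relies on the fact that, after deduplication, the nodes of the dimension group tree are exactly the non-empty subsets of the root dimension group.

```python
from itertools import combinations


def all_dimension_groups(root):
    """List every node of the dimension group tree, largest groups first."""
    return [
        group
        for size in range(len(root), 0, -1)
        for group in combinations(root, size)
    ]


def build_catalog(root, metrics):
    """Map each metric to the dimension combinations it can be queried by."""
    groups = all_dimension_groups(root)
    return {metric: list(groups) for metric in metrics}


CATALOG = build_catalog(
    ("automobile brand", "automobile type", "origin", "engine type"),
    ("yield", "maintenance count"),
)
```

A querying party could inspect `CATALOG["yield"]` to see which dimension combinations are available before issuing a query.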
Optionally, the method further includes:
and encapsulating an application program interface for providing a query function, wherein the application program interface is used for receiving the query statement and sending a query result.
In this embodiment, one way to encapsulate an application program interface for providing query functionality includes:
the obtained statistical results are combined, by abstraction, with the actions (or functions) that the machine executes during a query into a single whole; that is, the statistical results are bundled with the source code that operates on them to form an application program interface having the query function.
Encapsulating the application program interface that provides the query function hides the operations executed by the machine during a query and leaves only a simple program interface; the querying party does not need to know how the query is carried out and can obtain the result merely by inputting a dimension combination. When a unified application program interface is provided for querying data on different platforms, the interface and its operation are the same on every platform, so the querying party's personnel only need to be trained once on that application program interface and do not need separate training for each platform, which lowers the barrier to use. The platform may be an operating system.
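As a non-limiting illustration, the encapsulation described above may be sketched as a thin facade over the precomputed statistical results. The class name `StatisticsQueryAPI`, its `query` method, the row layout and the sample figures are assumptions made for this sketch, not part of the disclosed method.

```python
class StatisticsQueryAPI:
    """Facade that hides how precomputed statistical results are looked up."""

    def __init__(self, precomputed):
        # precomputed maps a tuple of dimension names to a list of result rows.
        self._precomputed = {
            frozenset(dims): rows for dims, rows in precomputed.items()
        }

    def query(self, target_dimensions, target_indicator):
        """Return the precomputed rows for one dimension combination and indicator."""
        group = frozenset(target_dimensions)
        rows = self._precomputed.get(group)
        if rows is None:
            raise KeyError(f"no precomputed result for dimensions {sorted(group)}")
        result = []
        for row in rows:
            if target_indicator not in row:
                raise KeyError(f"indicator {target_indicator!r} was not precomputed")
            projected = {dim: row[dim] for dim in target_dimensions}
            projected[target_indicator] = row[target_indicator]
            result.append(projected)
        return result


# Hypothetical precomputed results for two dimension groups.
precomputed = {
    ("brand",): [
        {"brand": "X", "yield": 30, "cost": 9},
    ],
    ("brand", "type"): [
        {"brand": "X", "type": "sedan", "yield": 10, "cost": 4},
        {"brand": "X", "type": "suv", "yield": 20, "cost": 5},
    ],
}
api = StatisticsQueryAPI(precomputed)
result = api.query(["type", "brand"], "yield")
```

The caller supplies only the dimension combination and the indicator; the order of the dimensions does not matter, and no aggregation is performed at query time.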
The data statistics method disclosed by the embodiment of the invention comprises: acquiring a preset dimension group and a preset index group corresponding to a target data source; obtaining a dimension group tree according to the preset dimension group, wherein each node in the dimension group tree is a dimension group, the root node of the dimension group tree is the preset dimension group, each child node in the dimension group tree is a proper subset of its parent node, the number of dimensions contained in a child node is 1 less than the number contained in its parent node, the dimensions in any two nodes at the same level are not completely the same, and the lowest node in the dimension group tree is a dimension group containing only one dimension; counting each index of the preset index group in the target data source according to all dimensions in the preset dimension group to obtain statistical data of each index of the root node; and, for each node in the dimension group tree other than the root node, counting each index of the preset index group in the statistical data of the parent node of the node according to all dimensions of the node to obtain the statistical result of each index of the node. According to the invention, every dimension combination of the required dimensions and the index statistical result corresponding to each dimension combination can be obtained in advance, so that no real-time calculation is needed when relevant personnel query a target dimension and a target index; this effectively improves the response speed of the system when querying data and reduces the burden on the system.
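As a non-limiting illustration, one way to construct a dimension group tree satisfying the constraints summarized above is a level-by-level expansion in which each child drops exactly one dimension of its parent and a dimension group already present at a level is not created again. The names `DimensionGroupNode`, `build_dimension_group_tree` and `levels` are hypothetical, and the rule for choosing which parent receives a shared subset (the first one encountered) is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class DimensionGroupNode:
    dimensions: frozenset
    children: list = field(default_factory=list)


def build_dimension_group_tree(preset_dimensions):
    """Build the tree top-down; each level holds groups with one dimension fewer."""
    if not preset_dimensions:
        raise ValueError("the preset dimension group must not be empty")
    root = DimensionGroupNode(frozenset(preset_dimensions))
    level = [root]
    while len(level[0].dimensions) > 1:
        seen = set()          # dimension groups already created at the next level
        next_level = []
        for parent in level:
            for dim in sorted(parent.dimensions):
                child_dims = parent.dimensions - {dim}
                if child_dims in seen:
                    continue  # keep nodes of the same level distinct
                seen.add(child_dims)
                child = DimensionGroupNode(child_dims)
                parent.children.append(child)
                next_level.append(child)
        level = next_level
    return root


def levels(root):
    """Return the dimension groups of the tree, one list per level."""
    out, level = [], [root]
    while level:
        out.append([sorted(node.dimensions) for node in level])
        level = [child for node in level for child in node.children]
    return out


tree = build_dimension_group_tree(["A", "B", "C"])
level_view = levels(tree)
```

For the preset dimension group {A, B, C} this yields three levels holding one, three and three nodes respectively. Because a subset shared by several parents is attached to only one of them, some intermediate nodes may have no children in this sketch.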
Based on the data statistics method disclosed above, the present invention also discloses a data statistics device. As shown in Fig. 3, the device comprises: a data acquisition unit 301, a dimension group tree acquisition unit 302, a first statistical unit 303 and a second statistical unit 304, wherein:
the data acquisition unit 301 is configured to acquire a preset dimension group and a preset index group corresponding to a target data source;
the dimension group tree acquisition unit 302 is configured to obtain a dimension group tree according to the preset dimension group, where each node in the dimension group tree is a dimension group, the root node of the dimension group tree is the preset dimension group, each child node in the dimension group tree is a proper subset of its parent node, the number of dimensions contained in a child node is 1 less than the number of dimensions contained in its parent node, the dimensions in any two nodes at the same level are not completely the same, and the lowest node in the dimension group tree is a dimension group containing only one dimension;
the first statistical unit 303 is configured to count each index of the preset index group in the target data source according to all dimensions in the preset dimension group, so as to obtain statistical data of each index of the root node;
the second statistical unit 304 is configured to, for each node in the dimension group tree other than the root node:
count each index of the preset index group in the statistical data of the parent node of the node according to all dimensions of the node, so as to obtain a statistical result of each index of the node.
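As a non-limiting illustration, the division of work between the first statistical unit and the second statistical unit may be sketched with a single grouping function applied first to the source rows and then to the parent node's statistics. The function name `aggregate`, the sample rows and the use of summation are assumptions of this sketch; summation only suits additive indexes such as counts or totals, and an index such as an average would need its sum and count carried separately.

```python
from collections import defaultdict


def aggregate(rows, dimensions, indicators):
    """Group rows by the given dimensions and sum each indicator."""
    totals = defaultdict(lambda: dict.fromkeys(indicators, 0))
    for row in rows:
        key = tuple(row[dim] for dim in dimensions)
        for indicator in indicators:
            totals[key][indicator] += row[indicator]
    return [{**dict(zip(dimensions, key)), **sums} for key, sums in totals.items()]


# Hypothetical rows of a target data source.
source = [
    {"brand": "X", "type": "sedan", "yield": 4},
    {"brand": "X", "type": "sedan", "yield": 6},
    {"brand": "X", "type": "suv", "yield": 20},
    {"brand": "Y", "type": "sedan", "yield": 12},
]

# First statistical unit: root node, computed from the target data source.
root_stats = aggregate(source, ["brand", "type"], ["yield"])

# Second statistical unit: child nodes, computed from the parent's statistics
# rather than from the (usually much larger) data source.
brand_stats = aggregate(root_stats, ["brand"], ["yield"])
type_stats = aggregate(root_stats, ["type"], ["yield"])
```

For additive indexes the child statistics computed from the parent match those that would be computed directly from the source, while reading fewer rows.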
Optionally, the data type of the target data source is one of maintenance data, transportation data, production data, and business data.
Optionally, the apparatus further comprises: a query unit,
the query unit being configured to obtain a query statement sent by a querying party, wherein the query statement comprises a target dimension and a target index, to search the dimension group tree for the dimension group formed by the target dimensions in the query statement, to obtain the statistical result of the target index from the statistical results of the indexes corresponding to the found dimension group, and to send the statistical result of the target index to the querying party as the query result.
Optionally, the apparatus further comprises: an index generation unit,
the index generation unit being configured to create an index corresponding to a data generation date according to the data generation date of the data source, and to store each statistical result obtained from the target data source in the index corresponding to the data generation date of the target data source.
Optionally, the apparatus further comprises: an encapsulation unit,
the encapsulation unit being configured to encapsulate an application program interface that provides a query function, the application program interface being used for receiving the query statement and sending the query result.
The data statistics device disclosed by the embodiment of the invention comprises: a data acquisition unit 301, configured to acquire a preset dimension group and a preset index group corresponding to a target data source; a dimension group tree acquisition unit 302, configured to obtain a dimension group tree according to the preset dimension group, where each node in the dimension group tree is a dimension group, the root node of the dimension group tree is the preset dimension group, each child node in the dimension group tree is a proper subset of its parent node, the number of dimensions contained in a child node is 1 less than the number of dimensions contained in its parent node, the dimensions in any two nodes at the same level are not completely the same, and the lowest node in the dimension group tree is a dimension group containing only one dimension; a first statistical unit 303, configured to count each index of the preset index group in the target data source according to all dimensions in the preset dimension group, so as to obtain statistical data of each index of the root node; and a second statistical unit 304, configured to, for each node in the dimension group tree other than the root node, count each index of the preset index group in the statistical data of the parent node of the node according to all dimensions of the node, so as to obtain the statistical result of each index of the node. According to the invention, every dimension combination of the required dimensions and the index statistical result corresponding to each dimension combination can be obtained in advance, so that no real-time calculation is needed when relevant personnel query a target dimension and a target index; this effectively improves the response speed of the system when querying data and reduces the burden on the system.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A method of data statistics, comprising:
acquiring a preset dimension group and a preset index group corresponding to a target data source;
obtaining a dimension group tree according to the preset dimension group, wherein each node in the dimension group tree is a dimension group, a root node in the dimension group tree is the preset dimension group, child nodes in the dimension group tree are proper subsets of parent nodes of the child nodes, the number of dimensions contained in the child nodes in the dimension group tree is 1 less than that contained in the parent nodes of the child nodes, the dimensions in any two nodes with the same level are not completely the same, and the lowest node in the dimension group tree is a dimension group containing only one dimension;
counting each index in the preset index group in the target data source according to all dimensions in the preset dimension group to obtain statistical data of each index of the root node;
for each node in the dimension group tree other than the root node: counting each index in the preset index group in the statistical data of the parent node of the node according to all dimensions of the node to obtain a statistical result of each index of the node.
2. The method of claim 1, further comprising:
obtaining a query statement sent by a query party, wherein the query statement comprises a target dimension and a target index, searching a dimension group formed by each target dimension in the query statement from the dimension group tree, obtaining a statistical result of the target index from a statistical result of each index corresponding to the searched dimension group, and sending the statistical result of the target index serving as the query result to the query party.
3. The method of claim 1, further comprising:
and creating an index corresponding to the data generation date according to the data generation date of the data source, and storing each statistical result obtained according to the target data source into the index corresponding to the data generation date of the target data source.
4. The method according to any one of claims 1-3, further comprising:
and encapsulating an application program interface for providing a query function, wherein the application program interface is used for receiving the query statement and sending a query result.
5. The method of claim 1, wherein the target data source has data types of: one of maintenance data, transportation data, production data, and business data.
6. An apparatus for data statistics, the apparatus comprising: a data acquisition unit, a dimension group tree acquisition unit, a first statistical unit and a second statistical unit,
the data acquisition unit is configured to acquire a preset dimension group and a preset index group corresponding to a target data source;
the dimension group tree acquisition unit is configured to obtain a dimension group tree according to the preset dimension group, where each node in the dimension group tree is a dimension group, a root node in the dimension group tree is the preset dimension group, a child node in the dimension group tree is a proper subset of a parent node of the child node, the number of dimensions included in the child node is 1 less than the number of dimensions included in the parent node of the child node, the dimensions in any two nodes at the same level are not completely the same, and a lowest node in the dimension group tree is a dimension group including only one dimension;
the first statistical unit is configured to count each index in the preset index group in the target data source according to all dimensions in the preset dimension group, so as to obtain statistical data of each index of the root node;
the second statistical unit is configured to, for each node in the dimension group tree other than the root node:
count each index in the preset index group in the statistical data of the parent node of the node according to all dimensions of the node, so as to obtain a statistical result of each index of the node.
7. The apparatus of claim 6, further comprising: a query unit,
the query unit is configured to obtain a query statement sent by a querying party, where the query statement includes a target dimension and a target index, search the dimension group tree for the dimension group formed by the target dimensions in the query statement, obtain a statistical result of the target index from the statistical results of the indexes corresponding to the found dimension group, and send the statistical result of the target index to the querying party as a query result.
8. The apparatus of claim 6, further comprising: an index generation unit,
the index generation unit is configured to create an index corresponding to a data generation date according to the data generation date of the data source, and to store each statistical result obtained from the target data source in the index corresponding to the data generation date of the target data source.
9. The apparatus according to any one of claims 6-8, further comprising: an encapsulation unit,
the encapsulation unit is configured to encapsulate an application program interface that provides a query function, the application program interface being used for receiving a query statement and sending a query result.
10. The apparatus of claim 6, wherein the data types of the target data source are: one of maintenance data, transportation data, production data, and business data.
CN201911347191.8A 2019-12-24 2019-12-24 Data statistics method and device Pending CN111125159A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911347191.8A CN111125159A (en) 2019-12-24 2019-12-24 Data statistics method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911347191.8A CN111125159A (en) 2019-12-24 2019-12-24 Data statistics method and device

Publications (1)

Publication Number Publication Date
CN111125159A true CN111125159A (en) 2020-05-08

Family

ID=70501771

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911347191.8A Pending CN111125159A (en) 2019-12-24 2019-12-24 Data statistics method and device

Country Status (1)

Country Link
CN (1) CN111125159A (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1858743A (en) * 2006-03-10 2006-11-08 华为技术有限公司 Information searching method and device in relation ship data bank
WO2014177118A1 (en) * 2013-04-28 2014-11-06 浙江核新同花顺网络信息股份有限公司 Query selection method and system
CN104484392A (en) * 2014-12-11 2015-04-01 北京国双科技有限公司 Method and device for generating database query statement
CN104968008A (en) * 2015-01-21 2015-10-07 深圳市腾讯计算机系统有限公司 Access scheduling method, apparatus and system
CN105550241A (en) * 2015-12-07 2016-05-04 珠海多玩信息技术有限公司 Multidimensional database query method and apparatus
CN109213829A (en) * 2017-06-30 2019-01-15 北京国双科技有限公司 Data query method and device
CN109656968A (en) * 2018-11-15 2019-04-19 中国建设银行股份有限公司 Data query method, apparatus and storage medium under distributed environment
CN109710610A (en) * 2018-12-17 2019-05-03 北京三快在线科技有限公司 Data processing method, device and calculating equipment
CN110046287A (en) * 2019-03-19 2019-07-23 厦门市美亚柏科信息股份有限公司 A kind of the data query method, apparatus and storage medium unrelated with type of database
CN110175184A (en) * 2019-04-30 2019-08-27 阿里巴巴集团控股有限公司 A kind of lower drill method, system and the electronic equipment of data dimension
CN110515953A (en) * 2019-08-29 2019-11-29 百度在线网络技术(北京)有限公司 Querying method, device, equipment and the storage medium of data

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116150436A (en) * 2023-04-14 2023-05-23 北京锐服信科技有限公司 Data display method and system based on node tree
CN116150436B (en) * 2023-04-14 2023-08-08 北京锐服信科技有限公司 Data display method and system based on node tree

Similar Documents

Publication Publication Date Title
CN110618983B (en) JSON document structure-based industrial big data multidimensional analysis and visualization method
CN106484875B (en) MOLAP-based data processing method and device
US8463739B2 (en) Systems and methods for generating multi-population statistical measures using middleware
Stefanoni et al. Estimating the cardinality of conjunctive queries over RDF data using graph summarisation
CN104598647B (en) A kind of tree graph search and the method for matching article
US11036685B2 (en) System and method for compressing data in a database
CN104123346B (en) A kind of structured data search method
US9342566B2 (en) Systems and methods for searching data structures of a database
CN101404032B (en) Video retrieval method and system based on contents
CN103123649B (en) A kind of message searching method based on microblog and system
CN107103032B (en) Mass data paging query method for avoiding global sequencing in distributed environment
CN110543517A (en) Method, device and medium for realizing complex query of mass data based on elastic search
CN111506621B (en) Data statistical method and device
CN104778540A (en) BOM (bill of material) management method and management system for building material equipment manufacturing
CN102693299A (en) System and method for parallel video copy detection
CN102306176A (en) On-line analytical processing (OLAP) keyword query method based on intrinsic characteristic of data warehouse
CN102270232A (en) Semantic data query system with optimized storage
CN103123650A (en) Extensible markup language (XML) data bank full-text indexing method based on integer mapping
CN106874367A (en) A kind of sampling distribution formula clustering method based on public sentiment platform
CN103049555A (en) Dynamic hierarchical integrated data accessing method capable of guaranteeing semantic correctness
CN101477555A (en) Fast retrieval and generation display method for task tree based on SQL database
CN111125159A (en) Data statistics method and device
CN104536957A (en) Retrieval method and system for rural land circulation information
CN106021423A (en) Group division-based meta-search engine personalized result recommendation method
Sheng et al. Dynamic top-k range reporting in external memory

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination