CN112817969A - Data query method, system, electronic device and storage medium - Google Patents
Data query method, system, electronic device and storage medium
- Publication number
- CN112817969A (application number CN202110049723.0A)
- Authority
- CN
- China
- Prior art keywords
- data
- value
- dimension
- record table
- stored
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/221—Column-oriented storage; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
Embodiments of the invention provide a data query method, a data query system, an electronic device and a storage medium. The data query method comprises: acquiring data to be stored, wherein the data to be stored comprises a storage date, a dimension value and a deduplication value; updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value, so that the field value addressed by the first row key value and the first column value in the dimension record table is the latest date on which the dimension value was hit, and the field value addressed by the second row key value and the second column value in the data record table is the number of hits of the dimension value; and, when a query request is obtained, obtaining target hit counts from the data record table according to the query request and generating target query data from the target hit counts. By using a non-relational database with a storage structure comprising the dimension record table and the data record table, the embodiments can improve query efficiency and meet the timeliness requirement for acquiring variables.
Description
Technical Field
The embodiment of the invention relates to the technical field of data processing, and in particular to a data query method, a data query system, an electronic device and a storage medium.
Background
To control or reduce risk, a financial institution often performs risk control evaluation through a risk control system. When making a decision, the risk control system needs to obtain statistical variables in real time; such a variable may relate to, for example, the time at which a customer purchased a certain product, the product, and the operation type.
In a specific implementation, the variables may need to be deduplicated according to the risk control requirements. For example, to count how many customers bought a certain product within the last 7 days, the number of times customers performed a certain operation on that product in the last 7 days must be obtained and deduplicated by identification number, so that the operation data of those customers can be obtained. Because the variable must be obtained accurately and the decision must be made efficiently, the data storage structure and the way statistics are queried are the difficult parts of processing such variables.
Disclosure of Invention
In view of the above problems, embodiments of the present invention are proposed to provide a data query method and a corresponding data query system, electronic device, storage medium that overcome or at least partially solve the above problems.
In order to solve the above problem, an embodiment of the present invention discloses a data query method, which is applied to a non-relational database, where the non-relational database includes a dimension record table and a data record table, the dimension record table includes a first row key value and a first column value, the first row key value is a dimension value, the first column value is a deduplication value, the data record table includes a second row key value and a second column value, the second row key value is a storage date, and the second column value is a dimension value; the method comprises the following steps:
acquiring data to be stored, wherein the data to be stored comprises a storage date, a dimension value and a deduplication value;
updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value, so that the field value addressed by the first row key value and the first column value in the dimension record table is the latest date on which the dimension value was hit, and the field value addressed by the second row key value and the second column value in the data record table is the number of hits of the dimension value;
when a query request is obtained, obtaining target hit counts from the data record table according to the query request, and generating target query data according to the target hit counts.
Optionally, the updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value includes:
judging whether a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table;
if no first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table, creating the first row key value corresponding to the dimension value in the dimension record table, and writing the storage date as the field value addressed by the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored;
judging whether a second row key value corresponding to the storage date of the data to be stored exists in the data record table;
if no second row key value corresponding to the storage date of the data to be stored exists in the data record table, creating the second row key value corresponding to the storage date in the data record table, writing the number of hits as the field value addressed by the second row key value that is the storage date of the data to be stored and the second column value that is the dimension value of the data to be stored, and ending the operation.
Optionally, the method further comprises:
if a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table, acquiring the original date addressed by the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored;
judging whether the original date is the same as the storage date;
if the original date is the same as the storage date, ending the operation;
if the original date is not the same as the storage date, modifying the original date to the storage date.
Optionally, the method further comprises:
decrementing, in the data record table, the number of hits addressed by the second row key value that is the original date and the second column value that is the dimension value of the data to be stored.
Optionally, the method further comprises:
if a second row key value corresponding to the storage date of the data to be stored exists in the data record table, incrementing the number of hits addressed by the second row key value that is the storage date of the data to be stored and the second column value that is the dimension value of the data to be stored.
Optionally, when the query request is obtained, obtaining the number of target hits from the data record table according to the query request, and generating target query data according to the number of target hits, includes:
when a query request is acquired, determining a query time period and a target dimension value;
acquiring, from the data record table, the target hit counts addressed by the second row key values that fall within the query time period and the second column value that is the target dimension value;
taking the sum of the target hit counts as the target query data.
Optionally, the dimension value includes a channel and an operation type, and the deduplication value includes a customer identification.
The embodiment of the invention also discloses a data query system, which is applied to a non-relational database, wherein the non-relational database comprises a dimension record table and a data record table, the dimension record table comprises a first row key value and a first column value, the first row key value is a dimension value, the first column value is a deduplication value, the data record table comprises a second row key value and a second column value, the second row key value is a storage date, and the second column value is a dimension value; the system comprises:
a to-be-stored data acquisition module, configured to acquire data to be stored, wherein the data to be stored comprises a storage date, a dimension value and a deduplication value;
a record table updating module, configured to update the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value, so that the field value addressed by the first row key value and the first column value in the dimension record table is the latest date on which the dimension value was hit, and the field value addressed by the second row key value and the second column value in the data record table is the number of hits of the dimension value;
and a data query module, configured to, when a query request is obtained, obtain target hit counts from the data record table according to the query request and generate target query data according to the target hit counts.
Optionally, the record table updating module is configured to: judge whether a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table; if no such first row key value exists, create the first row key value corresponding to the dimension value in the dimension record table, and write the storage date as the field value addressed by the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored; judge whether a second row key value corresponding to the storage date of the data to be stored exists in the data record table; and if no such second row key value exists, create the second row key value corresponding to the storage date in the data record table, write the number of hits as the field value addressed by the second row key value that is the storage date of the data to be stored and the second column value that is the dimension value of the data to be stored, and end the operation.
Optionally, the record table updating module is configured to: if a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table, obtain the original date addressed by the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored; judge whether the original date is the same as the storage date; if the original date is the same as the storage date, end the operation; and if the original date is not the same as the storage date, modify the original date to the storage date.
Optionally, the record table updating module is configured to decrement, in the data record table, the number of hits addressed by the second row key value that is the original date and the second column value that is the dimension value of the data to be stored.
Optionally, the record table updating module is configured to, if a second row key value corresponding to the storage date of the data to be stored exists in the data record table, increment the number of hits addressed by the second row key value that is the storage date of the data to be stored and the second column value that is the dimension value of the data to be stored.
Optionally, the data query module is configured to: determine a query time period and a target dimension value when a query request is obtained; acquire, from the data record table, the target hit counts addressed by the second row key values that fall within the query time period and the second column value that is the target dimension value; and take the sum of the target hit counts as the target query data.
Optionally, the dimension value includes a channel and an operation type, and the deduplication value includes a customer identification.
The embodiment of the invention discloses an electronic device, which comprises a processor, a memory and a computer program that is stored in the memory and can run on the processor, wherein when the computer program is executed by the processor, the steps of the data query method are implemented.
The embodiment of the invention discloses a computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, and when the computer program is executed by a processor, the steps of the data query method are realized.
The embodiment of the invention has the following advantages:
In the embodiment of the invention, data to be stored is obtained, and the dimension record table and the data record table are updated according to the storage date, the dimension value and the deduplication value of the data to be stored, so that the field value addressed by the first row key value and the first column value in the dimension record table is the latest date on which the dimension value was hit, and the field value addressed by the second row key value and the second column value in the data record table is the number of hits of the dimension value. When a query request is obtained, target hit counts are obtained from the data record table according to the query request, and target query data are generated according to the target hit counts. By using a non-relational database with a storage structure comprising the dimension record table and the data record table, the embodiment can improve query efficiency and meet the timeliness requirement for acquiring variables.
Drawings
FIG. 1 is a flow chart of the steps of an embodiment of a data query method of the present invention;
FIG. 2 is a flow diagram of the updating of a dimension record table and a data record table in accordance with the present invention;
FIG. 3 is a flow diagram of a data query of the present invention;
FIG. 4 is a block diagram of an embodiment of a data query system of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
First, technical terms related to the embodiments of the present invention are described.
HBase database: HBase is a NoSQL (Not only SQL, non-relational) database with a column-oriented storage structure. A table in HBase can contain billions of rows and millions of columns of data, and queries over massive data complete within one hundred milliseconds.
MySQL database: MySQL is a relational database that stores data in rows and columns in the form of two-dimensional tables, which makes it easy for users to understand the meaning of the fields in a table and the relationships between tables.
SQL: Structured Query Language, a programming language for querying relational databases.
In a related scheme, to handle deduplication variables, data may be stored in a relational database such as MySQL, in a data table that contains the query-condition fields, such as a time field, a product field and an operation type field, and also a deduplication field, such as a customer identification number field. Each row in the table represents one operation by a customer, and the operation type indicates which operation the customer performed.
Given this table design, the query for a deduplication variable has a fixed SQL form. For example, for the variable "number of times customers performed a certain operation on a certain product in the last 7 days, deduplicated by identification number", the SQL is as follows:
select count(distinct identification_number) from table
where time >= now() - 7days and time <= now()
and product = 'certain product'
and operation = 'certain operation'
When the data table holds a large amount of data, indexes are often created on the query-condition fields and the deduplication field to improve the execution efficiency of the SQL. This scheme is feasible, and meets the query efficiency requirement, when concurrency is low or only a small amount of data satisfies the condition.
However, under highly concurrent queries where a large amount of data satisfies the condition, for example tens of millions of rows in the table, hundreds of thousands of rows satisfying the condition, and a query concurrency of 10 TPS (Transactions Per Second), the indexes become ineffective, queries slow down, and the timeliness requirement for acquiring the variable cannot be met.
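As an illustration of the fixed-form deduplicating query in the related scheme, the following minimal sketch runs a `COUNT(DISTINCT ...)` query against an in-memory SQLite database (SQLite is used here only because it ships with Python; the source describes MySQL). The table name `operations`, its column names, the index name and all sample rows are hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per customer operation (names are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE operations (op_date TEXT, product TEXT, operation TEXT, id_number TEXT)"
)
conn.executemany(
    "INSERT INTO operations VALUES (?, ?, ?, ?)",
    [
        ("2020-02-09", "P1", "buy", "id-001"),
        ("2020-02-10", "P1", "buy", "id-001"),   # same customer again
        ("2020-02-12", "P1", "buy", "id-002"),
        ("2020-02-01", "P1", "buy", "id-003"),   # outside the 7-day window
        ("2020-02-12", "P1", "sell", "id-004"),  # different operation
    ],
)
# Index on the query-condition fields and the deduplication field.
conn.execute(
    "CREATE INDEX idx_ops ON operations (product, operation, op_date, id_number)"
)

def distinct_customers(conn, product, operation, start, end):
    """Count distinct customers who performed `operation` on `product` in [start, end]."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT id_number) FROM operations "
        "WHERE op_date >= ? AND op_date <= ? AND product = ? AND operation = ?",
        (start, end, product, operation),
    ).fetchone()
    return row[0]

count = distinct_customers(conn, "P1", "buy", "2020-02-09", "2020-02-15")
```

Every such query has to scan and deduplicate all matching rows at read time, which is the cost the text attributes to this scheme under high concurrency.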
In view of the above problem, the embodiment of the present invention changes the database and the storage structure so that the fixed-form query can be answered efficiently. Specifically, the non-relational database HBase is selected, and two data tables are created in HBase: one is a dimension record table, and the other is a data record table.
For convenience of description, the following notation is used:
Dimension record table: t_dimension
Rowkey design: dimension D
Field name: deduplication field R
Field value: stores the time T at which the dimension was last hit
Data record table: t_data
Rowkey design: date, format: yyyyMMdd
Field name: variable name V_dimension D
Field value: stores the number of hits C
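The layout above can be pictured as two row-key-to-columns maps. The following minimal sketch models them with nested Python dictionaries rather than a real HBase client; the helper names `put_cell` and `get_cell`, the customer identifier and the sample values are assumptions made for illustration only.

```python
# In-memory stand-ins for the two tables (not an HBase API):
# t_dimension: rowkey = dimension D   -> {deduplication field R: last hit date T}
# t_data:      rowkey = date yyyyMMdd -> {dimension D: number of hits C}
t_dimension = {}
t_data = {}

def put_cell(table, rowkey, column, value):
    """Write one cell, creating the row if it does not exist yet."""
    table.setdefault(rowkey, {})[column] = value

def get_cell(table, rowkey, column, default=None):
    """Read one cell, returning `default` when the row or column is absent."""
    return table.get(rowkey, {}).get(column, default)

# Hypothetical example: customer "id-001" last hit dimension "CA_H2" on 20200210.
put_cell(t_dimension, "CA_H2", "id-001", "20200210")
put_cell(t_data, "20200210", "CA_H2", 1)
```

With this shape, the deduplication state lives in `t_dimension`, while `t_data` holds only small per-day counters, so a query never has to touch individual customer records.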
Subsequently, a query instruction can be issued against the dimension record table and the data record table of this storage structure to obtain the target query data, which improves query efficiency and meets the timeliness requirement for acquiring variables.
Referring to FIG. 1, a flowchart of the steps of an embodiment of a data query method according to the present invention is shown. The method is applied to a non-relational database that includes a dimension record table and a data record table, and may specifically include the following steps:
In the embodiment of the invention, a non-relational database is used. A non-relational database has advantages such as highly concurrent reading and writing of data, reading and writing of massive data, and high scalability.
The non-relational database comprises two data tables, namely a dimension record table (t_dimension) and a data record table (t_data). Each data table comprises row keys (RowKey) and columns (Column); the row keys hold row key values, and the columns hold column values. Specifically, the dimension record table includes a first row key value and a first column value, where the first row key value is a dimension value and the first column value is a deduplication value; the data record table includes a second row key value and a second column value, where the second row key value is a storage date and the second column value is a dimension value.
The data A to be stored may include a storage date T, a dimension value D, a deduplication value R, and the like. In a preferred embodiment of the present invention, the dimension value includes a channel and an operation type, and the deduplication value includes a customer identification. For example, for a financial institution, the channel may refer to the way in which a customer purchases a financial product, such as the APP (application) used for the purchase, the banking institution through which the purchase is made, or the place of purchase (for example, domestic or foreign); the operation type refers to the operation performed on the financial product, such as buying, selling, converting, subscribing or redeeming; and the customer identification may refer to the customer's identification number or another number.
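A record to be stored can be sketched as a small value type. The class name `Record`, its field names and the customer identifier below are hypothetical; joining channel and operation type with an underscore is an assumption inferred from keys such as `CA_H2` and `CB_H2` that appear in this description.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    """One piece of data to be stored (field names are illustrative)."""
    storage_date: str  # T, formatted as yyyyMMdd
    channel: str       # first part of the dimension value
    operation: str     # second part of the dimension value
    customer_id: str   # R, the deduplication value

    @property
    def dimension(self) -> str:
        # Assumed encoding of the dimension value D: "<channel>_<operation>".
        return f"{self.channel}_{self.operation}"

record = Record("20200215", "CA", "H2", "id-001")
```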
In the embodiment of the present invention, the data to be stored can be acquired every day, and the dimension record table and the data record table are then updated based on the data to be stored, so that the field value addressed by the first row key value and the first column value in the dimension record table is the latest date on which the dimension value was hit, and the field value addressed by the second row key value and the second column value in the data record table is the number of hits of the dimension value. If there is no new data to be stored on a certain day, only the recorded date may be modified, and the data in the dimension record table and the data record table remain unchanged.
In the embodiment of the invention, if a query request for the non-relational database is received, the target hit counts can be obtained from the data record table according to the query request, and the target query data is then calculated from the target hit counts and fed back to the relevant personnel, who can make a risk control decision based on the target query data.
In order that those skilled in the art will better understand the embodiments of the present invention, specific examples are given below. Assume that the data recorded during 2020.02.01-2020.02.15 are as follows:
Table 1:
The dimension record table t_dimension formed from the data recorded in Table 1 above is as follows.
Cells with shading are newly added columns, and cells with shading and a border are modified columns.
Table 2:
The field values in Table 2 above are the latest dates on which the dimension values were hit.
The data record table t_data formed from the recorded data and the dimension record table is shown in Table 3 below.
Cells with shading are incremented columns, and cells with shading and a border are decremented columns.
Table 3:
The field values in Table 3 above are the numbers of hits of the dimension values.
In the data query method, data to be stored is obtained, and the dimension record table and the data record table are updated according to the storage date, the dimension value and the deduplication value of the data to be stored, so that the field value addressed by the first row key value and the first column value in the dimension record table is the latest date on which the dimension value was hit, and the field value addressed by the second row key value and the second column value in the data record table is the number of hits of the dimension value. When a query request is obtained, target hit counts are obtained from the data record table according to the query request, and target query data are generated according to the target hit counts. By using a non-relational database with a storage structure comprising the dimension record table and the data record table, the embodiment can improve query efficiency and meet the timeliness requirement for acquiring variables.
In a preferred embodiment of the present invention, the step 102 of updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value includes:
judging whether a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table;
if no first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table, creating the first row key value corresponding to the dimension value in the dimension record table, and writing the storage date as the field value addressed by the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored;
judging whether a second row key value corresponding to the storage date of the data to be stored exists in the data record table;
if no second row key value corresponding to the storage date of the data to be stored exists in the data record table, creating the second row key value corresponding to the storage date in the data record table, writing the number of hits as the field value addressed by the second row key value that is the storage date of the data to be stored and the second column value that is the dimension value of the data to be stored, and ending the operation.
In the embodiment of the present invention, assume that the data A to be stored includes a storage date T1, a dimension value D1 and a deduplication value R1. It is judged whether a row key value corresponding to the dimension value D1 exists in the dimension record table. If not, the row key value D1 is created in the dimension record table (for example, if the dimension record table has no row key value CB_H2, the row key value CB_H2 may be created), and the storage date T1 is written as the field value addressed by the row key value D1 and the column value R1.
It is then judged whether a row key value corresponding to the storage date T1 exists in the data record table. If not, the row key value T1 is created in the data record table (for example, if the data record table has no row key value 20200202, the row key value 20200202 may be created), the number of hits (the initial number of hits may be 1) is written as the field value addressed by the row key value T1 and the column value D1, and the operation ends.
In a preferred embodiment of the present invention, the step 102, updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value, further includes:
if a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table, acquiring the original date addressed by the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored;
judging whether the original date is the same as the storage date;
if the original date is the same as the storage date, ending the operation;
if the original date is not the same as the storage date, modifying the original date to the storage date.
In the embodiment of the present invention, if a row key value corresponding to the dimension value D1 of the data to be stored exists in the dimension record table, the original date T0 addressed by the row key value D1 and the column value R1 is acquired, and it is judged whether the original date T0 is the same as the storage date T1. If they are the same, the operation ends; if they are not the same, the original date T0 is modified to the storage date T1.
In a preferred embodiment of the present invention, the step 102, updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value, further includes:
decrementing, in the data record table, the number of hits addressed by the second row key value that is the original date and the second column value that is the dimension value of the data to be stored.
In the embodiment of the present invention, the number of hits addressed in the data record table by the row key value T0 (the original date) and the column value D1 (the dimension value of the data to be stored) is decremented; for example, if the original number of hits is 1, the number of hits after the decrement is 0.
In a preferred embodiment of the present invention, the step 102, updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value, further includes:
if a second row key value corresponding to the storage date of the data to be stored exists in the data record table, incrementing the number of hits addressed by the second row key value that is the storage date of the data to be stored and the second column value that is the dimension value of the data to be stored.
In the embodiment of the present invention, if a row key value corresponding to the storage date T1 of the data to be stored exists in the data record table, the number of hits addressed by the row key value T1 and the column value D1 is incremented; for example, if the original number of hits is 0, the number of hits after the increment is 1.
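The update branches described for step 102 can be combined into one routine. The following is a minimal in-memory sketch using nested dictionaries instead of HBase; the function name `update_tables`, the customer identifiers and the sample records are hypothetical. It also assumes that a deduplication column missing from an existing dimension row is handled like a first hit, which the text does not state explicitly.

```python
def update_tables(t_dimension, t_data, storage_date, dimension, dedup):
    """Apply one record (T1, D1, R1) to both tables."""
    row = t_dimension.setdefault(dimension, {})  # create Rowkey D1 if absent
    original_date = row.get(dedup)               # T0, or None on a first hit
    if original_date == storage_date:
        return                                   # already counted for this date
    row[dedup] = storage_date                    # record the latest hit date
    if original_date is not None:
        # Move the hit: decrement the counter stored under the original date.
        t_data[original_date][dimension] -= 1
    day = t_data.setdefault(storage_date, {})    # create Rowkey T1 if absent
    day[dimension] = day.get(dimension, 0) + 1   # initial count is 1

t_dimension, t_data = {}, {}
for rec in [
    ("20200209", "CA_H2", "id-001"),
    ("20200210", "CA_H2", "id-001"),  # same customer, later date: hit moves
    ("20200210", "CA_H2", "id-001"),  # same customer, same date: ignored
    ("20200212", "CA_H2", "id-002"),
]:
    update_tables(t_dimension, t_data, *rec)
```

Because each customer contributes to exactly one date counter per dimension (the latest date), summing the counters over a date range that extends to the present counts each customer at most once.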
In a preferred embodiment of the present invention, step 103, when a query request is acquired, acquiring the number of target hits from the data record table according to the query request and generating the target query data according to the number of target hits, includes:
when a query request is acquired, determining a query time period and a target dimension value;
acquiring, from the data record table, the numbers of target hits corresponding to the second row key values within the query time period and the second column value that is the target dimension value;
and taking the sum of the numbers of target hits as the target query data.
In the embodiment of the invention, the required data can be quickly found according to the non-relational database and the storage structure comprising the dimension record table and the data record table.
For example, suppose that on 2020.02.15 a query is made for the variable "number of customers, counted by ID number, who performed operation H2 through channel CA in the last 7 days". The rows of Table 3 whose Rowkey is in the range 20200209 to 20200215 are read, the numbers of hits in Column CA_H2 form the set {0, 1, 0, 1}, and these values are added to obtain the variable value 2, which indicates that 2 customers performed H2 operations through the CA channel from 20200209 to 20200215.
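The range query in this example can be sketched as follows. The nested dictionary standing in for the data record table, the function name `query_hits`, and the particular dates on which the four values sit are assumptions; the text only gives the set {0, 1, 0, 1} for the range 20200209 to 20200215.

```python
from datetime import date, timedelta
from typing import Dict

# Hypothetical in-memory stand-in for the data record table (Table 3):
# row key = date as YYYYMMDD, column = dimension value, field = number of hits.
DataTable = Dict[str, Dict[str, int]]


def query_hits(t_data: DataTable, end: date, days: int, column: str) -> int:
    """Sum the hits of one column over the last `days` days ending at `end`."""
    total = 0
    for offset in range(days):
        rowkey = (end - timedelta(days=offset)).strftime("%Y%m%d")
        # Dates without a row, or rows without the column, contribute 0.
        total += t_data.get(rowkey, {}).get(column, 0)
    return total


# Assumed placement of the four values within the 7-day window.
table_3 = {
    "20200209": {"CA_H2": 0},
    "20200211": {"CA_H2": 1},
    "20200213": {"CA_H2": 0},
    "20200215": {"CA_H2": 1},
}
variable_value = query_hits(table_3, date(2020, 2, 15), 7, "CA_H2")  # 2
```

Because each customer contributes to exactly one date row per dimension value (the latest date on which that customer hit it), a plain sum over the date range already yields a deduplicated count.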
In order to make those skilled in the art better understand the embodiment of the present invention, a specific example is used for description below, and referring to fig. 2, a flowchart for updating a dimension record table and a data record table according to the embodiment of the present invention is shown, which specifically includes the following steps:
step 206, operating the dimension record table t_dimension: creating a row with rowkey D1 and storing a field R1 with a field value of T1;
step 209, operating the data record table t_data: creating a row with rowkey T1 and storing a field D1 with a field value of 1.
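Steps 206 and 209 can be sketched as below. Only these two steps of the flowchart appear in the text, so the sketch covers just the branch in which neither row exists yet; the dictionaries and the function name `store_first_hit` are hypothetical stand-ins for the two tables.

```python
from typing import Dict

# Hypothetical in-memory stand-ins for the two record tables.
DimensionTable = Dict[str, Dict[str, str]]  # rowkey D -> {R: latest date}
DataTable = Dict[str, Dict[str, int]]       # rowkey T -> {D: number of hits}


def store_first_hit(t_dimension: DimensionTable, t_data: DataTable,
                    d1: str, r1: str, t1: str) -> None:
    """Create both rows for a first hit (assumes neither row exists yet)."""
    # Step 206: create the row with rowkey D1 and store field R1 = T1.
    t_dimension[d1] = {r1: t1}
    # Step 209: create the row with rowkey T1 and store field D1 = 1.
    t_data[t1] = {d1: 1}
```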
Referring to fig. 3, a flowchart of data query according to an embodiment of the present invention is shown, which specifically includes the following steps:
and step 303, summing the field values C1-Cx to obtain a target query result.
In the embodiment of the invention, a non-relational database is used as the storage medium, and the deduplication statistical variables are stored through the dimension record table t_dimension and the data record table t_data, so that queries are efficient and the timeliness requirement for acquiring the variables is met.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Referring to fig. 4, a block diagram of a data query system according to an embodiment of the present invention is shown. The system is applied to a non-relational database, where the non-relational database includes a dimension record table and a data record table, the dimension record table includes a first row key value and a first column value, the first row key value is a dimension value, the first column value is a deduplication value, the data record table includes a second row key value and a second column value, the second row key value is a storage date, and the second column value is a dimension value; the embodiment of the invention can specifically comprise the following modules:
a to-be-stored data obtaining module 401, configured to obtain to-be-stored data, where the to-be-stored data includes a storage date, a dimension value, and a deduplication value;
a record table updating module 402, configured to update the dimension record table and the data record table according to the storage date, the dimension value, and the deduplication value, so that the field value at the first row key value and the first column value in the dimension record table is the latest date on which the dimension value was hit, and the field value at the second row key value and the second column value in the data record table is the number of times the dimension value was hit;
the data query module 403 is configured to, when a query request is obtained, obtain target hit times from the data record table according to the query request, and generate target query data according to the target hit times.
In a preferred embodiment of the present invention, the record table updating module 402 is configured to determine whether a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table; if no such first row key value exists, create the first row key value corresponding to the dimension value in the dimension record table, and write the storage date as the field value at the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored; determine whether a second row key value corresponding to the storage date of the data to be stored exists in the data record table; and if no such second row key value exists, create the second row key value corresponding to the storage date in the data record table, write the number of hits as the field value at the second row key value that is the storage date of the data to be stored and the second column value that is the dimension value of the data to be stored, and end the operation.
In a preferred embodiment of the present invention, the record table updating module 402 is configured to, if a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table, acquire the original date corresponding to the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored; determine whether the original date is the same as the storage date; if the original date is the same as the storage date, end the operation; and if the original date is not the same as the storage date, change the original date to the storage date.
In a preferred embodiment of the present invention, the record table updating module 402 is configured to decrement, in the data record table, the number of hits corresponding to the second row key value that is the original date and the second column value that is the dimension value of the data to be stored.
In a preferred embodiment of the present invention, the record table updating module 402 is configured to, if a second row key value corresponding to the storage date of the data to be stored exists in the data record table, increment the number of hits corresponding to the second row key value that is the storage date of the data to be stored and the second column value that is the dimension value of the data to be stored.
In a preferred embodiment of the present invention, the data query module 403 is configured to determine a query time period and a target dimension value when the query request is acquired; acquire, from the data record table, the numbers of target hits corresponding to the second row key values within the query time period and the second column value that is the target dimension value; and take the sum of the numbers of target hits as the target query data.
In a preferred embodiment of the present invention, the dimension value comprises a channel and an operation type, and the deduplication value comprises a customer identification.
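As an illustration of how one record might map onto the two tables, the sketch below builds the dimension value from a channel and an operation type and uses the customer identification as the deduplication value. The `Record` type, its field names, and the underscore-joined format (modelled on the column name CA_H2 used in the query example) are assumptions, not details confirmed by the text.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """Hypothetical shape of one piece of data to be stored."""
    storage_date: str   # T, e.g. "20200215"
    channel: str        # e.g. "CA"
    operation: str      # e.g. "H2"
    customer_id: str    # deduplication value R


def dimension_value(rec: Record) -> str:
    """Join channel and operation type into the dimension value D."""
    return f"{rec.channel}_{rec.operation}"


rec = Record("20200215", "CA", "H2", "id001")
d1 = dimension_value(rec)  # "CA_H2"
```

Under this layout, `d1` serves as the row key in the dimension record table and as the column in the data record table, while `rec.customer_id` is the column in the dimension record table.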
In the embodiment of the invention, data to be stored is acquired, and the dimension record table and the data record table are updated according to the storage date, the dimension value and the deduplication value of the data to be stored, so that the field value at the first row key value and the first column value in the dimension record table is the latest date on which the dimension value was hit, and the field value at the second row key value and the second column value in the data record table is the number of times the dimension value was hit. When a query request is acquired, the number of target hits is acquired from the data record table according to the query request, and the target query data is generated according to the number of target hits. In the embodiment of the invention, the non-relational database and the storage structure comprising the dimension record table and the data record table improve query efficiency and meet the timeliness requirement for acquiring the variables.
Since the system embodiment is basically similar to the method embodiment, it is described briefly; for relevant details, refer to the corresponding description of the method embodiment.
The embodiment of the invention discloses electronic equipment, which comprises a processor, a memory and a computer program which is stored on the memory and can run on the processor, wherein when the computer program is executed by the processor, the steps of the data query method embodiment are realized.
The embodiment of the invention discloses a computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, and when the computer program is executed by a processor, the steps of the embodiment of the data query method are realized.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create a system for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction system which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The data query method, the data query system, the electronic device and the storage medium provided by the invention are introduced in detail, and specific examples are applied in the text to explain the principle and the implementation of the invention, and the description of the above embodiments is only used to help understand the method and the core idea of the invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.
Claims (10)
1. A data query method, applied to a non-relational database, wherein the non-relational database comprises a dimension record table and a data record table, the dimension record table comprises a first row key value and a first column value, the first row key value is a dimension value, the first column value is a deduplication value, the data record table comprises a second row key value and a second column value, the second row key value is a storage date, and the second column value is a dimension value; the method comprises the following steps:
acquiring data to be stored, wherein the data to be stored comprises a storage date, a dimension value and a deduplication value;
updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value, so that the field value at the first row key value and the first column value in the dimension record table is the latest date on which the dimension value was hit, and the field value at the second row key value and the second column value in the data record table is the number of times the dimension value was hit;
when a query request is obtained, target hit times are obtained from the data record table according to the query request, and target query data are generated according to the target hit times.
2. The method of claim 1, wherein updating the dimension record table and the data record table according to the storage date, the dimension value, and the deduplication value comprises:
determining whether a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table;
if the first row key value corresponding to the dimension value of the data to be stored does not exist in the dimension record table, creating the first row key value corresponding to the dimension value in the dimension record table, and writing the storage date as the field value at the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored;
determining whether a second row key value corresponding to the storage date of the data to be stored exists in the data record table;
if the second row key value corresponding to the storage date of the data to be stored does not exist in the data record table, creating the second row key value corresponding to the storage date of the data to be stored in the data record table, writing the number of hits as the field value at the second row key value that is the storage date of the data to be stored and the second column value that is the dimension value of the data to be stored, and ending the operation.
3. The method of claim 2, further comprising:
if a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table, acquiring the original date corresponding to the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored;
determining whether the original date is the same as the storage date;
if the original date is the same as the storage date, ending the operation;
and if the original date is not the same as the storage date, modifying the original date into the storage date.
4. The method of claim 3, further comprising:
decrementing, in the data record table, the number of hits corresponding to the second row key value that is the original date and the second column value that is the dimension value of the data to be stored.
5. The method of claim 2, further comprising:
if a second row key value corresponding to the storage date of the data to be stored exists in the data record table, incrementing the number of hits corresponding to the second row key value that is the storage date of the data to be stored and the second column value that is the dimension value of the data to be stored.
6. The method according to claim 1, wherein when obtaining the query request, obtaining the number of target hits from the data record table according to the query request, and generating target query data according to the number of target hits, includes:
when a query request is acquired, determining a query time period and a target dimension value;
acquiring, from the data record table, the numbers of target hits corresponding to the second row key values within the query time period and the second column value that is the target dimension value;
and taking the sum of the numbers of target hits as the target query data.
7. The method of claim 1, wherein the dimension value comprises a channel and an operation type, and wherein the deduplication value comprises a customer identification.
8. A data query system, applied to a non-relational database, wherein the non-relational database comprises a dimension record table and a data record table, the dimension record table comprises a first row key value and a first column value, the first row key value is a dimension value, the first column value is a deduplication value, the data record table comprises a second row key value and a second column value, the second row key value is a storage date, and the second column value is a dimension value; the system comprises:
a to-be-stored data acquisition module, configured to acquire data to be stored, wherein the data to be stored comprises a storage date, a dimension value and a deduplication value;
a record table updating module, configured to update the dimension record table and the data record table according to the storage date, the dimension value, and the deduplication value, so that the field value at the first row key value and the first column value in the dimension record table is the latest date on which the dimension value was hit, and the field value at the second row key value and the second column value in the data record table is the number of times the dimension value was hit;
and the data query module is used for acquiring target hit times from the data record table according to the query request when the query request is acquired, and generating target query data according to the target hit times.
9. An electronic device comprising a processor, a memory and a computer program stored on the memory and capable of running on the processor, the computer program, when executed by the processor, implementing the steps of the data query method of any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the data query method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110049723.0A CN112817969B (en) | 2021-01-14 | 2021-01-14 | Data query method, system, electronic device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112817969A (en) | 2021-05-18 |
CN112817969B (en) | 2023-04-14 |
Family
ID=75869524
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110049723.0A Active CN112817969B (en) | 2021-01-14 | 2021-01-14 | Data query method, system, electronic device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112817969B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115658728A (en) * | 2022-11-16 | 2023-01-31 | 荣耀终端有限公司 | Query method, electronic device and storage medium |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103488704A (en) * | 2013-09-06 | 2014-01-01 | 乐视致新电子科技(天津)有限公司 | Method and device for storing data |
CN104239567A (en) * | 2014-09-28 | 2014-12-24 | 北京国双科技有限公司 | Method and device for processing dimension in data warehouse |
CN104298760A (en) * | 2014-10-23 | 2015-01-21 | 北京京东尚科信息技术有限公司 | Data processing method and data processing device applied to data warehouse |
CN107273482A (en) * | 2017-06-12 | 2017-10-20 | 北京市天元网络技术股份有限公司 | Alarm data storage method and device based on HBase |
CN108256088A (en) * | 2018-01-23 | 2018-07-06 | 清华大学 | A kind of storage method and system of the time series data based on key value database |
CN108255838A (en) * | 2016-12-28 | 2018-07-06 | 航天信息股份有限公司 | A kind of method and system for establishing the intermediate data warehouse for big data analysis |
CN108427684A (en) * | 2017-02-14 | 2018-08-21 | 华为技术有限公司 | Data query method, apparatus and computing device |
US10275400B1 (en) * | 2018-04-11 | 2019-04-30 | Xanadu Big Data, Llc | Systems and methods for forming a fault-tolerant federated distributed database |
CN109902130A (en) * | 2019-01-31 | 2019-06-18 | 北京明略软件系统有限公司 | A kind of date storage method, data query method and apparatus, storage medium |
CN110362549A (en) * | 2019-06-17 | 2019-10-22 | 平安普惠企业管理有限公司 | Log memory search method, electronic device and computer equipment |
Non-Patent Citations (4)
Title |
---|
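NIKOS NTARMOS et al.: "Rank join queries in NoSQL databases", Proceedings of the VLDB Endowment (PVLDB) * |
He Xin: "Research on storage optimization based on the distributed file system HDFS", China Master's Theses Full-text Database, Information Science and Technology * |
Xu Xiaolin et al.: "Internet attack behavior monitoring system and key technologies", National Computer Network and Information Security Administration Center * |
Chen Xilin et al.: "HBase storage structure design for microblog information analysis", Netinfo Security * |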
Also Published As
Publication number | Publication date |
---|---|
CN112817969B (en) | 2023-04-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8862566B2 (en) | Systems and methods for intelligent parallel searching | |
KR20090075885A (en) | Managing storage of individually accessible data units | |
CN103853802B (en) | Device and method for indexing digital content | |
JP6586184B2 (en) | Data analysis support device and data analysis support method | |
US20150095334A1 (en) | Data analysis support system | |
US20170161308A1 (en) | Metadump Spatial Database System | |
CN105095247A (en) | Symbolic data analysis method and system | |
US7653663B1 (en) | Guaranteeing the authenticity of the data stored in the archive storage | |
CN112817969B (en) | Data query method, system, electronic device and storage medium | |
CN105095436A (en) | Automatic modeling method for data of data sources | |
CN108073595B (en) | Method and device for realizing data updating and snapshot in OLAP database | |
Mao et al. | A dynamic feature generation system for automated metadata extraction in preservation of digital materials | |
AU2018345147B2 (en) | Database processing device, group map file production method, and recording medium | |
CN103226610A (en) | Method and device for querying database table | |
US20070282804A1 (en) | Apparatus and method for extracting database information from a report | |
KR102345410B1 (en) | Big data intelligent collecting method and device | |
CN115630070A (en) | Information pushing method, computer-readable storage medium and electronic device | |
CN110941952A (en) | Method and device for perfecting audit analysis model | |
US20220222271A1 (en) | Applying changes in a target database system | |
Thasal et al. | Information retrieval and de-duplication for tourism recommender system | |
KR101024494B1 (en) | Extraction method of modified data using meta data | |
JP6402600B2 (en) | Database apparatus, data management method, and program | |
JP6702425B2 (en) | Aggregation program, aggregation device, and aggregation method | |
Shen | A performance comparison of NoSQL and SQL databases for different scales of ecommerce systems | |
Dan et al. | Mining for insights in the search engine query stream |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||