CN112817969A - Data query method, system, electronic device and storage medium - Google Patents

Data query method, system, electronic device and storage medium

Publication number
CN112817969A
Authority
CN
China
Prior art keywords
data
value
dimension
record table
stored
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110049723.0A
Other languages
Chinese (zh)
Other versions
CN112817969B (en)
Inventor
张莎
何建芳
姜辰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inner Mongolia Mengshang Consumer Finance Co ltd
Original Assignee
Inner Mongolia Mengshang Consumer Finance Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inner Mongolia Mengshang Consumer Finance Co ltd
Priority to CN202110049723.0A
Publication of CN112817969A
Application granted
Publication of CN112817969B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G06F16/221 Column-oriented storage; Management thereof
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

Embodiments of the invention provide a data query method, a data query system, an electronic device and a storage medium. The data query method comprises: acquiring data to be stored, wherein the data to be stored comprises a storage date, a dimension value and a deduplication value; updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value, so that the field value addressed by the first row key value and the first column value in the dimension record table is the latest date on which the dimension value was hit, and the field value addressed by the second row key value and the second column value in the data record table is the number of hits of the dimension value; and, when a query request is obtained, obtaining a target number of hits from the data record table according to the query request and generating target query data according to the target number of hits. By using a non-relational database and a storage structure comprising the dimension record table and the data record table, the embodiments can improve query efficiency and meet the timeliness requirement for acquiring variables.

Description

Data query method, system, electronic device and storage medium
Technical Field
The embodiment of the invention relates to the technical field of data processing, and in particular to a data query method, a data query system, an electronic device and a storage medium.
Background
To control or reduce risk, a financial institution often performs a risk control evaluation through a risk control system. When making a decision, the risk control system needs to obtain statistical variables in real time; such a variable may refer to, for example, the time at which a customer purchased a certain product, the product, the operation type, and the like.
In a specific implementation, the variables may need to be deduplicated according to the risk control requirements. For example, to count which customers bought a certain product within 7 days, the number of times a certain operation was performed on the product by customers in the last 7 days must be obtained and deduplicated by identification number, so that the operation is counted once per customer. Because variables must be obtained accurately and decisions must be made efficiently, the data storage structure and the statistical query method become the difficult points in processing such variables.
Disclosure of Invention
In view of the above problems, embodiments of the present invention are proposed to provide a data query method and a corresponding data query system, electronic device, storage medium that overcome or at least partially solve the above problems.
In order to solve the above problem, an embodiment of the present invention discloses a data query method, which is applied to a non-relational database, where the non-relational database includes a dimension record table and a data record table, the dimension record table includes a first row key value and a first column value, the first row key value is a dimension value, the first column value is a deduplication value, the data record table includes a second row key value and a second column value, the second row key value is a storage date, and the second column value is a dimension value; the method comprises the following steps:
acquiring data to be stored, wherein the data to be stored comprises a storage date, a dimension value and a deduplication value;
updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value, so that the field value addressed by the first row key value and the first column value in the dimension record table is the latest date on which the dimension value was hit, and the field value addressed by the second row key value and the second column value in the data record table is the number of hits of the dimension value;
when a query request is obtained, obtaining a target number of hits from the data record table according to the query request, and generating target query data according to the target number of hits.
Optionally, the updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value includes:
judging whether a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table;
if no first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table, creating the first row key value corresponding to the dimension value in the dimension record table, and writing the storage date as the field value addressed by the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored;
judging whether a second row key value corresponding to the storage date of the data to be stored exists in the data record table;
if no second row key value corresponding to the storage date of the data to be stored exists in the data record table, creating the second row key value corresponding to the storage date in the data record table, writing the number of hits as the field value addressed by the second row key value that is the storage date of the data to be stored and the second column value that is the dimension value of the data to be stored, and ending the operation.
Optionally, the method further comprises:
if a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table, acquiring the original date addressed by the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored;
judging whether the original date is the same as the storage date or not;
if the original date is the same as the storage date, ending the operation;
and if the original date is not the same as the storage date, modifying the original date into the storage date.
Optionally, the method further comprises:
decrementing, in the data record table, the number of hits addressed by the second row key value that is the original date and the second column value that is the dimension value of the data to be stored.
Optionally, the method further comprises:
if a second row key value corresponding to the storage date of the data to be stored exists in the data record table, incrementing the number of hits addressed by the second row key value that is the storage date of the data to be stored and the second column value that is the dimension value of the data to be stored.
Optionally, the obtaining, when a query request is obtained, a target number of hits from the data record table according to the query request, and generating target query data according to the target number of hits, includes:
when a query request is obtained, determining a query time period and a target dimension value;
acquiring, from the data record table, the target numbers of hits addressed by the second row key values that fall within the query time period and the second column value that is the target dimension value;
and taking the sum of the target numbers of hits as the target query data.
Optionally, the dimension value includes a channel and an operation type, and the deduplication value includes a customer identification.
The embodiment of the invention also discloses a data query system, which is applied to a non-relational database, wherein the non-relational database comprises a dimension record table and a data record table, the dimension record table comprises a first row key value and a first column value, the first row key value is a dimension value, the first column value is a deduplication value, the data record table comprises a second row key value and a second column value, the second row key value is a storage date, and the second column value is a dimension value; the system comprises:
a to-be-stored data acquisition module, configured to acquire data to be stored, wherein the data to be stored comprises a storage date, a dimension value and a deduplication value;
a record table updating module, configured to update the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value, so that the field value addressed by the first row key value and the first column value in the dimension record table is the latest date on which the dimension value was hit, and the field value addressed by the second row key value and the second column value in the data record table is the number of hits of the dimension value;
and a data query module, configured to, when a query request is obtained, obtain a target number of hits from the data record table according to the query request and generate target query data according to the target number of hits.
Optionally, the record table updating module is configured to: judge whether a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table; if not, create the first row key value corresponding to the dimension value in the dimension record table, and write the storage date as the field value addressed by the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored; judge whether a second row key value corresponding to the storage date of the data to be stored exists in the data record table; and if not, create the second row key value corresponding to the storage date in the data record table, write the number of hits as the field value addressed by the second row key value that is the storage date and the second column value that is the dimension value of the data to be stored, and end the operation.
Optionally, the record table updating module is configured to: if a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table, acquire the original date addressed by the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored; judge whether the original date is the same as the storage date; if the original date is the same as the storage date, end the operation; and if the original date is not the same as the storage date, modify the original date to the storage date.
Optionally, the record table updating module is configured to decrement, in the data record table, the number of hits addressed by the second row key value that is the original date and the second column value that is the dimension value of the data to be stored.
Optionally, the record table updating module is configured to, if a second row key value corresponding to the storage date of the data to be stored exists in the data record table, increment the number of hits addressed by the second row key value that is the storage date of the data to be stored and the second column value that is the dimension value of the data to be stored.
Optionally, the data query module is configured to: determine a query time period and a target dimension value when a query request is obtained; acquire, from the data record table, the target numbers of hits addressed by the second row key values that fall within the query time period and the second column value that is the target dimension value; and take the sum of the target numbers of hits as the target query data.
Optionally, the dimension value includes a channel and an operation type, and the deduplication value includes a customer identification.
The embodiment of the invention also discloses an electronic device comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the steps of the data query method are implemented when the computer program is executed by the processor.
The embodiment of the invention discloses a computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, and when the computer program is executed by a processor, the steps of the data query method are realized.
The embodiment of the invention has the following advantages:
In the embodiment of the invention, data to be stored is acquired, and the dimension record table and the data record table are updated according to the storage date, the dimension value and the deduplication value of the data to be stored, so that the field value addressed by the first row key value and the first column value in the dimension record table is the latest date on which the dimension value was hit, and the field value addressed by the second row key value and the second column value in the data record table is the number of hits of the dimension value. When a query request is obtained, the target number of hits is obtained from the data record table according to the query request, and target query data is generated according to the target number of hits. By using a non-relational database and a storage structure comprising the dimension record table and the data record table, the embodiment can improve query efficiency and meet the timeliness requirement for acquiring variables.
Drawings
FIG. 1 is a flow chart of the steps of an embodiment of a data query method of the present invention;
FIG. 2 is a flow diagram of the updating of a dimension record table and a data record table in accordance with the present invention;
FIG. 3 is a flow diagram of a data query of the present invention;
FIG. 4 is a block diagram of an embodiment of a data query system of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
First, technical terms related to the embodiments of the present invention are described.
HBase database: HBase is a NoSQL (Not only SQL, non-relational) database with a column-oriented storage structure. A table in HBase can contain billions of rows and millions of columns, and queries over massive data complete within one hundred milliseconds.
MySQL database: MySQL is a relational database that stores data in rows and columns as two-dimensional tables, which makes it easy for users to understand the meaning of the fields in a table and the relationships between tables.
SQL: Structured Query Language, a programming language for querying relational databases.
In a related scheme, to solve the problem of deduplicated variables, data may be stored in a relational database such as MySQL, where a data table contains the query-condition fields, such as a time field, a product field and an operation type field, and also a deduplication field, such as a customer identification number field. Each row in the table represents one operation by a customer, and the operation type indicates which operation the customer performed.
With this table design, the query for a deduplicated variable has a fixed SQL form. For example, for the number of times a certain operation was performed on a certain product by customers in the last 7 days, deduplicated by identification number, the SQL is as follows:
select count(distinct identification_number) from table_name
where time >= now() - interval 7 day and time <= now()
and product = 'certain product'
and operation = 'certain operation'
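As a rough illustration of this relational baseline, the following sketch runs an equivalent deduplicated count against an in-memory SQLite database. The table name `operation_record`, its column names and the sample rows are hypothetical, and SQLite stands in for MySQL only to keep the example self-contained; the window bounds are passed as parameters instead of being computed from the current time.

```python
import sqlite3

def count_distinct_customers(conn, product, operation, start, end):
    """Count distinct ID numbers for one product/operation within [start, end]."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT id_number) FROM operation_record "
        "WHERE op_date >= ? AND op_date <= ? AND product = ? AND operation = ?",
        (start, end, product, operation),
    ).fetchone()
    return row[0]

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per customer operation.
conn.execute(
    "CREATE TABLE operation_record "
    "(op_date TEXT, product TEXT, operation TEXT, id_number TEXT)"
)
conn.executemany(
    "INSERT INTO operation_record VALUES (?, ?, ?, ?)",
    [
        ("2020-02-01", "P1", "buy", "ID001"),
        ("2020-02-03", "P1", "buy", "ID001"),   # same customer again
        ("2020-02-03", "P1", "buy", "ID002"),
        ("2020-02-04", "P1", "sell", "ID003"),  # different operation
        ("2020-01-20", "P1", "buy", "ID004"),   # outside the window
    ],
)
result = count_distinct_customers(conn, "P1", "buy", "2020-02-01", "2020-02-07")
print(result)
```

Every call scans or index-probes all matching rows and deduplicates them at query time, which is the cost the scheme described below tries to avoid.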
When the data table holds a large amount of data, indexes are often created on the query-condition fields and the deduplication field to improve SQL execution efficiency. This scheme is feasible, and meets the query efficiency requirement, when concurrency is low or few rows satisfy the conditions.
However, under highly concurrent queries with a large number of matching rows, for example tens of millions of rows in the table, hundreds of thousands of rows satisfying the conditions, and a query concurrency of 10 TPS (Transactions Per Second), the SQL indexes become ineffective, queries slow down, and the timeliness requirement for acquiring variables cannot be met.
To address this problem, the embodiment of the present invention changes the database and the storage structure so that the fixed-form query can be answered efficiently. Specifically, HBase, a non-relational database, is selected, and two data tables are created in HBase: a dimension record table and a data record table.
For convenience of description, the following description is expressed symbolically:
Dimension record table: t_dimension
Rowkey design: dimension D
Field name: deduplication field R
Field value: stores the time T of the latest hit on the dimension
Data record table: t_data
Rowkey design: date, format: yyyyMMdd
Field name: variable name V_dimension D
Field value: stores the number of hits C
Subsequently, the embodiment of the invention can input the query instruction to query the target query data based on the dimension record table and the data record table of the storage structure, thereby improving the query efficiency and meeting the timeliness of acquiring variables based on the storage structure.
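The two-table layout can be pictured with plain nested dictionaries. This is only a minimal in-memory sketch, not HBase client code: the helper names `date_rowkey` and `data_column`, the variable name `V1` and the customer identifier `ID001` are hypothetical, and joining V and D with an underscore is an assumption about how the "V_dimension D" field name is formed.

```python
from datetime import date

def date_rowkey(d: date) -> str:
    """Format a date as the yyyyMMdd row key used by t_data."""
    return d.strftime("%Y%m%d")

def data_column(variable: str, dimension: str) -> str:
    """Build a t_data field name from variable name V and dimension D (assumed form)."""
    return f"{variable}_{dimension}"

# t_dimension: row key = dimension D, field name = deduplication value R,
# field value = date T of the latest hit.
t_dimension = {"CB_H2": {"ID001": "20200202"}}

# t_data: row key = date, field name = V_D, field value = number of hits C.
t_data = {date_rowkey(date(2020, 2, 2)): {data_column("V1", "CB_H2"): 1}}

print(t_dimension)
print(t_data)
```

Because t_data is keyed by date, a query over a time window only needs to read one row per day and one column per row, regardless of how many raw operations were recorded.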
Referring to fig. 1, a flowchart illustrating steps of an embodiment of a data query method according to the present invention is shown, and is applied to a non-relational database, where the non-relational database includes a dimension record table and a data record table, and the method according to the embodiment of the present invention may specifically include the following steps:
Step 101, acquiring data to be stored, wherein the data to be stored comprises a storage date, a dimension value and a deduplication value.
The embodiment of the invention uses a non-relational database, which offers advantages such as highly concurrent reads and writes, reads and writes over massive data, and high scalability.
The non-relational database comprises two data tables, namely a dimension record table (t_dimension) and a data record table (t_data). Each data table comprises row keys (RowKey) and columns (Column); the row keys hold row key values and the columns hold column values. Specifically, the dimension record table includes a first row key value and a first column value, the first row key value is a dimension value, and the first column value is a deduplication value; the data record table includes a second row key value and a second column value, the second row key value is a storage date, and the second column value is a dimension value.
The data A to be stored may include a storage date T, a dimension value D, a deduplication value R, and the like. In a preferred embodiment of the present invention, the dimension value includes a channel and an operation type, and the deduplication value includes a customer identification. For example, for a financial institution, the channel may refer to the means by which a customer purchases a financial product, such as the APP (application) used, the banking institution involved, or the place of purchase (e.g., domestic or foreign); the operation type refers to an operation performed on the financial product, such as buying, selling, converting, subscribing or redeeming; and the customer identification may refer to a customer's identification number or another identifying number.
Step 102, updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value, so that the field value addressed by the first row key value and the first column value in the dimension record table is the latest date on which the dimension value was hit, and the field value addressed by the second row key value and the second column value in the data record table is the number of hits of the dimension value.
In the embodiment of the present invention, data to be stored can be acquired every day, and the dimension record table and the data record table are then updated based on that data, so that the two tables hold the latest hit date and the number of hits respectively, as described in step 102. Of course, if there is no new data to be stored on a certain day, only the date of the record may be modified, and the data in the dimension record table and the data record table remain unchanged.
Step 103, when a query request is obtained, obtaining target hit times from the data record table according to the query request, and generating target query data according to the target hit times.
In the embodiment of the invention, when a query request for the non-relational database is received, the target number of hits can be obtained from the data record table according to the query request. The target query data is then calculated from the target number of hits and fed back to the relevant personnel, who can make a risk control decision based on it.
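The query step can be sketched as a sum over the date rows in the requested period. This is a minimal in-memory sketch under assumptions: `t_data` is modelled as a nested dictionary keyed by yyyyMMdd strings, the function name `query_hits` is hypothetical, and the dimension value itself is used as the column name. `CB_H1` and the counts are made-up sample data.

```python
from datetime import date, timedelta

def query_hits(t_data, dimension, start, end):
    """Sum the hit counts for `dimension` over every date row key in [start, end]."""
    total = 0
    day = start
    while day <= end:
        row = t_data.get(day.strftime("%Y%m%d"), {})  # missing date row counts as 0
        total += row.get(dimension, 0)                # missing column counts as 0
        day += timedelta(days=1)
    return total

t_data = {
    "20200201": {"CB_H2": 2},
    "20200203": {"CB_H2": 1, "CB_H1": 4},
    "20200210": {"CB_H2": 5},  # outside the queried week
}
total = query_hits(t_data, "CB_H2", date(2020, 2, 1), date(2020, 2, 7))
print(total)
```

For a 7-day window this reads at most seven cells, so the cost depends on the length of the window and not on the number of recorded operations.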
In order that those skilled in the art will better understand the embodiments of the present invention, specific examples are given below. Assume that the data recorded during 2020.02.01-2020.02.15 are as follows:
Table 1:
[Table 1 is reproduced as an image in the original publication.]
The dimension record table t_dimension formed from the data recorded in Table 1 is as follows. Shaded cells are newly added columns, and shaded cells with a border are modified columns.
Table 2:
[Table 2 is reproduced as an image in the original publication.]
In Table 2, each field value is the latest date on which the dimension value was hit.
The data record table t_data formed from the recorded data and the dimension record table is shown in Table 3 below. Shaded cells are incremented columns, and shaded cells with a border are decremented columns.
Table 3:
[Table 3 is reproduced as an image in the original publication.]
In Table 3, each field value is the number of hits of the dimension value.
In the data query method above, data to be stored is acquired, and the dimension record table and the data record table are updated according to the storage date, the dimension value and the deduplication value of the data to be stored, so that the field value addressed by the first row key value and the first column value in the dimension record table is the latest date on which the dimension value was hit, and the field value addressed by the second row key value and the second column value in the data record table is the number of hits of the dimension value. When a query request is obtained, the target number of hits is obtained from the data record table according to the query request, and target query data is generated according to the target number of hits. By using a non-relational database and a storage structure comprising the dimension record table and the data record table, the embodiment can improve query efficiency and meet the timeliness requirement for acquiring variables.
In a preferred embodiment of the present invention, the step 102 of updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value includes:
judging whether a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table;
if no first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table, creating the first row key value corresponding to the dimension value in the dimension record table, and writing the storage date as the field value addressed by the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored;
judging whether a second row key value corresponding to the storage date of the data to be stored exists in the data record table;
if no second row key value corresponding to the storage date of the data to be stored exists in the data record table, creating the second row key value corresponding to the storage date in the data record table, writing the number of hits as the field value addressed by the second row key value that is the storage date of the data to be stored and the second column value that is the dimension value of the data to be stored, and ending the operation.
In the embodiment of the present invention, assume that the data A to be stored includes a storage date T1, a dimension value D1 and a deduplication value R1. It is judged whether a row key value corresponding to the dimension value D1 exists in the dimension record table. If not, the row key value D1 is created in the dimension record table (for example, if the row key value CB_H2 does not exist in the dimension record table, the row key value CB_H2 may be created), and the storage date T1 is written as the field value addressed by the row key value D1 and the column value R1.
It is then judged whether a row key value corresponding to the storage date T1 exists in the data record table. If not, the row key value for T1 is created in the data record table (for example, if the data record table has no row key value 20200202, the row key value 20200202 may be created), the number of hits (which may initially be 1) is written as the field value addressed by the row key value T1 and the column value D1, and the operation ends.
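The branch for a dimension value that has no row yet can be sketched as follows. This is an in-memory illustration with nested dictionaries standing in for the two tables; the function name `store_first_hit` and the customer identifiers are hypothetical. The case where the date row already exists is handled by incrementing, following the increment rule the source states for an existing second row key value.

```python
def store_first_hit(t_dimension, t_data, store_date, dimension, dedup):
    """Handle the branch where `dimension` has no row in t_dimension yet.

    Returns True if this branch applied, False if the row already existed.
    """
    if dimension in t_dimension:
        return False
    # Create the row key D1 and record T1 under column R1.
    t_dimension[dimension] = {dedup: store_date}
    if store_date not in t_data:
        # No row for this date yet: create it with an initial count of 1.
        t_data[store_date] = {dimension: 1}
    else:
        # Date row already exists: increment the count for this dimension.
        t_data[store_date][dimension] = t_data[store_date].get(dimension, 0) + 1
    return True

t_dimension, t_data = {}, {}
applied = store_first_hit(t_dimension, t_data, "20200202", "CB_H2", "ID001")
print(applied, t_dimension, t_data)
```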
In a preferred embodiment of the present invention, the step 102, updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value, further includes:
if a first row of key values corresponding to the dimension values of the data to be stored exists in the dimension record table, acquiring the first row of key values which are the dimension values of the data to be stored and an original date which is corresponding to the first column value of the de-duplication value of the data to be stored;
judging whether the original date is the same as the storage date or not;
if the original date is the same as the storage date, ending the operation;
and if the original date is not the same as the storage date, modifying the original date into the storage date.
In the embodiment of the present invention, if a row key value corresponding to the dimension value D1 of the data to be stored exists in the dimension record table, the original date T0 stored at the row key value D1 and the column value R1 (the deduplication value of the data to be stored) is acquired, and it is determined whether the original date T0 is the same as the storage date T1. If they are the same, the operation ends; if they are not, the original date T0 is modified to the storage date T1.
In a preferred embodiment of the present invention, the step 102, updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value, further includes:
and decreasing the hit times corresponding to a second row key value of the original date and a second column value of the dimension value of the data to be stored in the data record table.
In the embodiment of the present invention, the number of hits stored in the data record table at the row key value of the original date T0 and the second column value D1 (the dimension value of the data to be stored) is decremented; for example, if the original number of hits is 1, the number of hits after the decrement is 0.
In a preferred embodiment of the present invention, the step 102, updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value, further includes:
if a second row key value corresponding to the storage date of the data to be stored exists in the data record table, increasing the number of hits corresponding to the second row key value which is the storage date of the data to be stored and a second column value which is the dimension value of the data to be stored.
In the embodiment of the present invention, if a row key value corresponding to the storage date T1 of the data to be stored exists in the data record table, the number of hits stored at the row key value T1 and the column value D1 (the dimension value of the data to be stored) is incremented; for example, if the original number of hits is 0, the number of hits after the increment is 1.
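Taken together, the decrement on the original date and the increment on the storage date move one hit from one row of the data record table to another, so that a given deduplication value is counted on at most one date per dimension. The following is a minimal sketch of that bookkeeping, assuming the data record table is modelled as a nested dictionary and that a missing row or field is treated as a count of 0 before incrementing; the function name `move_hit` is hypothetical.

```python
def move_hit(t_data, dimension, original_date, storage_date):
    """Move one hit for `dimension` from original_date to storage_date."""
    # Decrement the number of hits stored under the original date (e.g. 1 -> 0).
    t_data[original_date][dimension] -= 1
    # Increment the number of hits under the storage date, creating the cell if needed.
    row = t_data.setdefault(storage_date, {})
    row[dimension] = row.get(dimension, 0) + 1


t_data = {"20200201": {"CA_H2": 1}, "20200202": {"CA_H2": 0}}
move_hit(t_data, "CA_H2", "20200201", "20200202")
print(t_data)  # {'20200201': {'CA_H2': 0}, '20200202': {'CA_H2': 1}}
```

The total over both rows is unchanged by the move, which is what keeps the later sum over a date range free of duplicates.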
In a preferred embodiment of the present invention, the step 103, when a query request is acquired, acquiring the number of target hits from the data record table according to the query request, and generating target query data according to the number of target hits, includes:
when a query request is acquired, determining a query time period and a target dimension value;
acquiring, from the data record table, the target hit times corresponding to the second row key values within the query time period and the second column value that is the target dimension value;
and taking the sum of the target hit times as target query data.
In the embodiment of the invention, the non-relational database and the storage structure comprising the dimension record table and the data record table allow the required data to be found quickly.
For example, suppose that on 2020.02.15 a query is made for the variable "number of customers, deduplicated by ID number, who performed operation H2 through the CA channel in the last 7 days". The rows of Table 3 whose Rowkey falls in the range 20200209 to 20200215 are read, the hit times in Column CA_H2 form the set {0, 1, 0, 1}, and adding these values gives the variable value 2, which indicates that 2 customers performed the H2 operation through the CA channel from 20200209 to 20200215.
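Written as a sum over the rows found in that range, the same example reads as follows. The symbols V (the variable value) and C_k (the hit times stored in column CA_H2 of the row whose Rowkey is k) are introduced here only for illustration.

```latex
V = \sum_{k} C_k = 0 + 1 + 0 + 1 = 2, \qquad 20200209 \le k \le 20200215
```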
In order to make those skilled in the art better understand the embodiment of the present invention, a specific example is used for description below, and referring to fig. 2, a flowchart for updating a dimension record table and a data record table according to the embodiment of the present invention is shown, which specifically includes the following steps:
step 201, acquiring data A to be stored, wherein the data A comprises a storage date T1, a dimension value D1 and a de-duplication value R1, and a variable (field value) of the de-duplication value R1 is V;
step 202, judging whether the dimension record table t_dimension contains data with Rowkey = D1; if yes, go to step 203, otherwise go to step 206;
step 203, judging whether the value T0 of the field R1 in the row with Rowkey = D1 in the dimension record table t_dimension satisfies T0 = T1; if not, go to step 204; if so, the operation ends, since the original date is the same as the storage date;
step 204, operating the dimension record table t_dimension: in the row with Rowkey = D1, modifying the field value of the field R1 from the original T0 to T1;
step 205, operating the data record table t_data: in the row with Rowkey = T0, modifying the field value of the field D1 from the original C0 to C0 - 1;
step 206, operating the dimension record table t_dimension: creating the row with Rowkey = D1, and storing the field R1 with a field value of T1;
step 207, judging whether data with Rowkey = T1 exists in the data record table t_data; if yes, go to step 208, otherwise go to step 209;
step 208, operating the data record table t_data: in the row with Rowkey = T1, modifying the field value of the field D1 from the original C1 to C1 + 1;
step 209, operating the data record table t_data: creating the row with Rowkey = T1, and storing the field D1 with a field value of 1.
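The flow of steps 201 to 209 can be sketched end to end as follows. This is an illustrative in-memory model, not a database client: the class name `RecordTables` and the sample identifiers are hypothetical. Two points are assumptions that the steps above do not spell out: after step 205 the flow is taken to continue at step 207, and a deduplication value R1 that is new to an existing row D1 is treated as having no original date, so that nothing is decremented for it.

```python
class RecordTables:
    """In-memory stand-in for t_dimension and t_data (an assumption for illustration)."""

    def __init__(self):
        self.t_dimension = {}  # Rowkey D -> {field R: latest date T}
        self.t_data = {}       # Rowkey T -> {field D: number of hits C}

    def store(self, t1, d1, r1):
        # Step 202: does t_dimension contain a row with Rowkey = D1?
        if d1 in self.t_dimension:
            t0 = self.t_dimension[d1].get(r1)
            # Step 203: the recorded date already equals T1, so the operation ends.
            if t0 == t1:
                return
            # Step 204: modify the recorded date from T0 to T1.
            self.t_dimension[d1][r1] = t1
            # Step 205: C0 -> C0 - 1 in the row of the original date.
            if t0 is not None:  # assumption: R1 may be new for an existing D1
                self.t_data[t0][d1] -= 1
        else:
            # Step 206: create row D1 and store field R1 with value T1.
            self.t_dimension[d1] = {r1: t1}
        # Step 207: does t_data contain a row with Rowkey = T1?
        if t1 in self.t_data:
            # Step 208: C1 -> C1 + 1 (a missing field counts as 0, an assumption).
            self.t_data[t1][d1] = self.t_data[t1].get(d1, 0) + 1
        else:
            # Step 209: create row T1 and store field D1 with value 1.
            self.t_data[t1] = {d1: 1}


tables = RecordTables()
tables.store("20200201", "CA_H2", "ID001")
tables.store("20200201", "CA_H2", "ID001")  # same date again: no change
tables.store("20200202", "CA_H2", "ID001")  # later date: the hit moves
tables.store("20200202", "CA_H2", "ID002")  # second customer on the same date
print(tables.t_data)  # {'20200201': {'CA_H2': 0}, '20200202': {'CA_H2': 2}}
```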
Referring to fig. 3, a flowchart of data query according to an embodiment of the present invention is shown, which specifically includes the following steps:
step 301, receiving a query for the variable of the last x days whose dimension is D1, deduplicated by the R1 field;
step 302, querying the data record table t_data: with the current date being T0, acquiring the x rows of data whose Rowkey ranges from T0 - x + 1 to T0, and acquiring the field values C1 to Cx in the column D1;
and step 303, summing the field values C1 to Cx to obtain the target query result.
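Steps 301 to 303 amount to generating the x row keys from T0 - x + 1 to T0 and summing one column over them. The following is a minimal sketch under stated assumptions: row keys are dates formatted as YYYYMMDD (as in the 20200202 example), the table is a nested dictionary, a missing row or field contributes 0, and the function name `query_recent` and the sample rows are invented for illustration.

```python
from datetime import date, timedelta


def query_recent(t_data, today, x, dimension):
    """Sum the number of hits of `dimension` over the last x days ending at `today`."""
    total = 0
    for offset in range(x):
        # Row keys run from T0 - x + 1 to T0, formatted as YYYYMMDD.
        rowkey = (today - timedelta(days=offset)).strftime("%Y%m%d")
        # A missing row or field contributes 0 (assumption).
        total += t_data.get(rowkey, {}).get(dimension, 0)
    return total


# Sample rows are invented for illustration only.
t_data = {
    "20200208": {"CA_H2": 5},              # outside the 7-day window
    "20200210": {"CA_H2": 1},
    "20200215": {"CA_H2": 1, "CB_H2": 3},  # CB_H2 is a different dimension
}
print(query_recent(t_data, date(2020, 2, 15), 7, "CA_H2"))  # 2
```

Because deduplication is done at write time, the query reads at most x cells and performs no distinct-count over raw records.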
In the embodiment of the invention, a non-relational database is used as the storage medium, and the deduplication statistical variables are stored in the dimension record table t_dimension and the data record table t_data, so that queries are efficient and the timeliness requirement for acquiring the variables is met.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combinations of acts, but those skilled in the art will recognize that the present invention is not limited by the described order of acts, as some steps may be performed in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments and that the acts involved are not necessarily required by the embodiments of the invention.
Referring to fig. 4, a block diagram of a data query system according to an embodiment of the present invention is shown. The system is applied to a non-relational database that includes a dimension record table and a data record table. The dimension record table includes a first row key value and a first column value, where the first row key value is a dimension value and the first column value is a deduplication value; the data record table includes a second row key value and a second column value, where the second row key value is a storage date and the second column value is a dimension value. The embodiment of the invention may specifically include the following modules:
a to-be-stored data obtaining module 401, configured to obtain to-be-stored data, where the to-be-stored data includes a storage date, a dimension value, and a deduplication value;
a record table updating module 402, configured to update the dimension record table and the data record table according to the storage date, the dimension value, and the deduplication value, so that the field values of the first row key value and the first column value in the dimension record table are the latest date of the dimension value hit, and the field values of the second row key value and the second column value in the data record table are the number of times of the dimension value hit;
the data query module 403 is configured to, when a query request is obtained, obtain target hit times from the data record table according to the query request, and generate target query data according to the target hit times.
In a preferred embodiment of the present invention, the record table updating module 402 is configured to: determine whether a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table; if not, create the first row key value corresponding to the dimension value in the dimension record table, and write the storage date as the field value at the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored; determine whether a second row key value corresponding to the storage date of the data to be stored exists in the data record table; and if not, create the second row key value corresponding to the storage date in the data record table, write the number of hits at the second row key value corresponding to the storage date and the second column value that is the dimension value of the data to be stored, and end the operation.
In a preferred embodiment of the present invention, the record table updating module 402 is configured to, if a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table, obtain an original date corresponding to the first row key value which is the dimension value of the data to be stored and the first column value which is the deduplication value of the data to be stored; judging whether the original date is the same as the storage date or not; if the original date is the same as the storage date, ending the operation; and if the original date is not the same as the storage date, modifying the original date into the storage date.
In a preferred embodiment of the present invention, the record table updating module 402 is configured to decrement the hit times corresponding to the second row key value that is the original date and the second column value that is the dimension value of the data to be stored in the data record table.
In a preferred embodiment of the present invention, the record table updating module 402 is configured to, if a second row key value corresponding to the storage date of the data to be stored exists in the data record table, increment the number of hits corresponding to the second row key value that is the storage date of the data to be stored and a second column value that is the dimension value of the data to be stored.
In a preferred embodiment of the present invention, the data query module 403 is configured to: determine a query time period and a target dimension value when the query request is obtained; acquire, from the data record table, the target hit times corresponding to the second row key values within the query time period and the second column value that is the target dimension value; and take the sum of the target hit times as the target query data.
In a preferred embodiment of the present invention, the dimension value comprises a channel and an operation type, and the deduplication value comprises a customer identification.
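Keys such as CA_H2 and CB_H2 in the examples suggest that the channel and the operation type are joined into a single dimension value. The following is a minimal sketch of that mapping; the underscore separator, the function name `dimension_key`, and the field names of the sample event are assumptions made for illustration.

```python
def dimension_key(channel, operation_type):
    """Join channel and operation type into one dimension value, e.g. CA_H2."""
    return f"{channel}_{operation_type}"


# Hypothetical incoming event; the field names are illustrative only.
event = {"channel": "CA", "operation": "H2", "customer_id": "ID001", "date": "20200215"}
dimension = dimension_key(event["channel"], event["operation"])
dedup_value = event["customer_id"]  # the customer identification is the deduplication value
print(dimension, dedup_value)  # CA_H2 ID001
```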
In the embodiment of the invention, data to be stored is obtained, a dimension record table and a data record table are updated according to the storage date, the dimension value and the deduplication value of the data to be stored, so that the field values of a first row of key values and a first column of values in the dimension record table are the latest date of dimension value hit, and the field values of a second row of key values and a second column of values in the data record table are the number of times of hit of the dimension value hit, when a query request is obtained, the number of times of target hit is obtained from the data record table according to the query request, and target query data are generated according to the number of times of target hit. In the embodiment of the invention, through the non-relational database and the storage structure comprising the dimension record table and the data record table, the query efficiency can be improved, and the timeliness of acquiring the variables is met.
Since the system embodiment is basically similar to the method embodiment, its description is relatively brief; for relevant details, refer to the corresponding description of the method embodiment.
The embodiment of the invention discloses electronic equipment, which comprises a processor, a memory and a computer program which is stored on the memory and can run on the processor, wherein when the computer program is executed by the processor, the steps of the data query method embodiment are realized.
The embodiment of the invention discloses a computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, and when the computer program is executed by a processor, the steps of the embodiment of the data query method are realized.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create a system for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction system which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The data query method, the data query system, the electronic device and the storage medium provided by the invention are introduced in detail, and specific examples are applied in the text to explain the principle and the implementation of the invention, and the description of the above embodiments is only used to help understand the method and the core idea of the invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A data query method is applied to a non-relational database, the non-relational database comprises a dimension record table and a data record table, the dimension record table comprises a first row key value and a first column value, the first row key value is a dimension value, the first column value is a deduplication value, the data record table comprises a second row key value and a second column value, the second row key value is a storage date, and the second column value is a dimension value; the method comprises the following steps:
acquiring data to be stored, wherein the data to be stored comprises a storage date, a dimension value and a deduplication value;
updating the dimension record table and the data record table according to the storage date, the dimension value and the deduplication value, so that the field values of the first row key value and the first column value in the dimension record table are the latest date of the dimension value hit, and the field values of the second row key value and the second column value in the data record table are the number of hit times of the dimension value hit;
when a query request is obtained, target hit times are obtained from the data record table according to the query request, and target query data are generated according to the target hit times.
2. The method of claim 1, wherein updating the dimension record table and the data record table according to the storage date, the dimension value, and the deduplication value comprises:
judging whether a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table;
if the first row key value corresponding to the dimension value of the data to be stored does not exist in the dimension record table, creating the first row key value corresponding to the dimension value in the dimension record table, and writing the storage date as the field value at the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored;
judging whether a second row key value corresponding to the storage date of the data to be stored exists in the data record table;
if the second row key value corresponding to the storage date of the data to be stored does not exist in the data record table, creating the second row key value corresponding to the storage date of the data to be stored in the data record table, writing the number of hits at the second row key value corresponding to the storage date of the data to be stored and the second column value that is the dimension value of the data to be stored, and ending the operation.
3. The method of claim 2, further comprising:
if a first row key value corresponding to the dimension value of the data to be stored exists in the dimension record table, acquiring the original date stored at the first row key value that is the dimension value of the data to be stored and the first column value that is the deduplication value of the data to be stored;
judging whether the original date is the same as the storage date or not;
if the original date is the same as the storage date, ending the operation;
and if the original date is not the same as the storage date, modifying the original date into the storage date.
4. The method of claim 3, further comprising:
and decreasing the hit times corresponding to a second row key value of the original date and a second column value of the dimension value of the data to be stored in the data record table.
5. The method of claim 2, further comprising:
if a second row key value corresponding to the storage date of the data to be stored exists in the data record table, increasing the number of hits corresponding to the second row key value which is the storage date of the data to be stored and a second column value which is the dimension value of the data to be stored.
6. The method according to claim 1, wherein when obtaining the query request, obtaining the number of target hits from the data record table according to the query request, and generating target query data according to the number of target hits, includes:
when a query request is acquired, determining a query time period and a target dimension value;
acquiring, from the data record table, the target hit times corresponding to the second row key values within the query time period and the second column value that is the target dimension value;
and taking the sum of the target hit times as target query data.
7. The method of claim 1, wherein the dimension values comprise channel and operation type, and wherein the de-duplication values comprise customer identification.
8. A data query system is applied to a non-relational database, the non-relational database comprises a dimension record table and a data record table, the dimension record table comprises a first row key value and a first column value, the first row key value is a dimension value, the first column value is a deduplication value, the data record table comprises a second row key value and a second column value, the second row key value is a storage date, and the second column value is a dimension value; the system comprises:
a to-be-stored data acquisition module, configured to acquire data to be stored, wherein the data to be stored comprises a storage date, a dimension value and a deduplication value;
a record table updating module, configured to update the dimension record table and the data record table according to the storage date, the dimension value, and the deduplication value, so that a field value of the first row key value and the first column value in the dimension record table is a latest date of the dimension value hit, and a field value of the second row key value and the second column value in the data record table is a number of times of the dimension value hit;
and the data query module is used for acquiring target hit times from the data record table according to the query request when the query request is acquired, and generating target query data according to the target hit times.
9. An electronic device comprising a processor, a memory and a computer program stored on the memory and capable of running on the processor, the computer program, when executed by the processor, implementing the steps of the data query method of any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the data query method according to any one of claims 1 to 7.
CN202110049723.0A 2021-01-14 2021-01-14 Data query method, system, electronic device and storage medium Active CN112817969B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110049723.0A CN112817969B (en) 2021-01-14 2021-01-14 Data query method, system, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110049723.0A CN112817969B (en) 2021-01-14 2021-01-14 Data query method, system, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN112817969A true CN112817969A (en) 2021-05-18
CN112817969B CN112817969B (en) 2023-04-14

Family

ID=75869524

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110049723.0A Active CN112817969B (en) 2021-01-14 2021-01-14 Data query method, system, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN112817969B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115658728A (en) * 2022-11-16 2023-01-31 荣耀终端有限公司 Query method, electronic device and storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103488704A (en) * 2013-09-06 2014-01-01 乐视致新电子科技(天津)有限公司 Method and device for storing data
CN104239567A (en) * 2014-09-28 2014-12-24 北京国双科技有限公司 Method and device for processing dimension in data warehouse
CN104298760A (en) * 2014-10-23 2015-01-21 北京京东尚科信息技术有限公司 Data processing method and data processing device applied to data warehouse
CN107273482A (en) * 2017-06-12 2017-10-20 北京市天元网络技术股份有限公司 Alarm data storage method and device based on HBase
CN108256088A (en) * 2018-01-23 2018-07-06 清华大学 A kind of storage method and system of the time series data based on key value database
CN108255838A (en) * 2016-12-28 2018-07-06 航天信息股份有限公司 A kind of method and system for establishing the intermediate data warehouse for big data analysis
CN108427684A (en) * 2017-02-14 2018-08-21 华为技术有限公司 Data query method, apparatus and computing device
US10275400B1 (en) * 2018-04-11 2019-04-30 Xanadu Big Data, Llc Systems and methods for forming a fault-tolerant federated distributed database
CN109902130A (en) * 2019-01-31 2019-06-18 北京明略软件系统有限公司 A kind of date storage method, data query method and apparatus, storage medium
CN110362549A (en) * 2019-06-17 2019-10-22 平安普惠企业管理有限公司 Log memory search method, electronic device and computer equipment

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
NIKOS NTARMOS 等: "Rank join queries in NoSQL databases", 《PROCEEDINGS OF THE VLDB ENDOWMENT (PVLDB)》 *
何鑫: "基于分布式文件系统HDFS的存储优化研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
徐小琳 等: "互联网攻击行为监测系统及关键技术", 《国家计算机网络与信息安全管理中心》 *
陈希林等: "针对微博信息分析的HBase存储结构设计", 《信息网络安全》 *

Also Published As

Publication number Publication date
CN112817969B (en) 2023-04-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant