CN114238258A - Database data processing method and device, computer equipment and storage medium - Google Patents


Publication number
CN114238258A
Authority
CN
China
Prior art keywords
data
database
statistical
information
log
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111452515.1A
Other languages
Chinese (zh)
Other versions
CN114238258B (en)
Inventor
赵勇
王金虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qichacha Technology Co ltd
Original Assignee
Qichacha Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qichacha Technology Co ltd filed Critical Qichacha Technology Co ltd
Priority to CN202111452515.1A priority Critical patent/CN114238258B/en
Publication of CN114238258A publication Critical patent/CN114238258A/en
Application granted granted Critical
Publication of CN114238258B publication Critical patent/CN114238258B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2379 Updates performed during online database operations; commit processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The present disclosure relates to a database data processing method, apparatus, computer device, storage medium, and computer program product. The method comprises: collecting log data in a first database; pushing the log data to a second database, wherein the second database comprises statistical data tables respectively corresponding to different dimensions of different objects; storing the log data into the corresponding statistical data table according to the object information and the dimension information in the log data, and determining the current data; calculating a new statistical value corresponding to the statistical data table according to the historical statistical value and the data state change information, the numerical value increase and decrease information, and the data increase and decrease information of the current data; and saving the new statistical value to a third database. The method reduces the number of data cleaning personnel required, and because the process is carried out by a computer program, it is less error-prone than manual processing by many data cleaning personnel.

Description

Database data processing method and device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of electrical data processing technologies, and in particular, to a database data processing method, an apparatus, a computer device, a storage medium, and a computer program product.
Background
With the development of big data technology, statistical techniques for dimension-associated data have emerged. Data associated with an object or primary key may be divided into a plurality of dimensions. Establishing information dimensions facilitates the classification, statistics, and effective use of the associated information. When computing statistics on the information included in each dimension, values such as the quantity, the number of occurrences, and the number of records of specific content (for example, the number of core personnel, the number of lists, and the number of pieces of patent information) may be collectively referred to as a count.
In existing count calculation methods, because the data of each dimension has different content characteristics and statistical requirements, the data cleaning personnel responsible for each dimension generally write their own refresh logic to calculate the count value. As a result, a large number of data cleaning personnel must participate in count calculation, which is inefficient and error-prone.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a database data processing method, an apparatus, a computer device, a computer readable storage medium, and a computer program product, which can efficiently and accurately perform count calculation.
In a first aspect, the present disclosure provides a database data processing method. The method comprises the following steps:
collecting log data in a first database, wherein the log data comprise object associated data, and the object associated data comprise dimension associated data;
pushing the log data to a second database, wherein the second database comprises statistical data tables respectively corresponding to different dimensions of different objects;
storing the log data into a corresponding statistical data table according to the object information and the dimension information in the log data, and determining the current data;
calculating a new statistical value corresponding to the statistical data table according to the historical statistical value and the data state change information, the numerical value increase and decrease information and the data increase and decrease information of the current data;
saving the new statistical value to a third database.
In one embodiment, the second database is an open source distributed relational database, and the third database is a distributed document storage database.
In one embodiment, the second database further comprises a data refresh table, and the method further comprises:
determining a target object and a target dimension based on the log data;
storing the target object and the target dimension to a data refresh table;
and determining a statistical data table needing to be subjected to statistical value calculation according to the data refreshing table.
In one embodiment, the statistical data tables include classified statistical data tables and non-classified statistical data tables.
In one embodiment, the pushing the log data to the second database includes:
pushing the log data to the second database in the form of a message queue.
In a second aspect, the present disclosure also provides a database data processing apparatus. The device comprises:
the data acquisition module is used for acquiring log data in a first database, wherein the log data comprises object associated data, and the object associated data comprises dimension associated data;
the data pushing module is used for pushing the log data to a second database, and the second database comprises statistical data tables respectively corresponding to different dimensions of different objects;
the data storage module is used for storing the log data into a corresponding statistical data table according to the object information and the dimension information in the log data and determining the current data;
the calculation module is used for calculating a new statistical value corresponding to the statistical data table according to a historical statistical value and the data state change information, the numerical value increase and decrease information and the data increase and decrease information of the current data;
and the data summarization module is used for saving the new statistical value to a third database.
In a third aspect, the present disclosure also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the following steps when executing the computer program:
collecting log data in a first database, wherein the log data comprise object associated data, and the object associated data comprise dimension associated data;
pushing the log data to a second database, wherein the second database comprises statistical data tables respectively corresponding to different dimensions of different objects;
storing the log data into a corresponding statistical data table according to the object information and the dimension information in the log data, and determining the current data;
calculating a new statistical value corresponding to the statistical data table according to the historical statistical value and the data state change information, the numerical value increase and decrease information and the data increase and decrease information of the current data;
saving the new statistical value to a third database.
In a fourth aspect, the present disclosure also provides a computer-readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the steps of:
collecting log data in a first database, wherein the log data comprise object associated data, and the object associated data comprise dimension associated data;
pushing the log data to a second database, wherein the second database comprises statistical data tables respectively corresponding to different dimensions of different objects;
storing the log data into a corresponding statistical data table according to the object information and the dimension information in the log data, and determining the current data;
calculating a new statistical value corresponding to the statistical data table according to the historical statistical value and the data state change information, the numerical value increase and decrease information and the data increase and decrease information of the current data;
saving the new statistical value to a third database.
In a fifth aspect, the present disclosure also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, performs the steps of:
collecting log data in a first database, wherein the log data comprise object associated data, and the object associated data comprise dimension associated data;
pushing the log data to a second database, wherein the second database comprises statistical data tables respectively corresponding to different dimensions of different objects;
storing the log data into a corresponding statistical data table according to the object information and the dimension information in the log data, and determining the current data;
calculating a new statistical value corresponding to the statistical data table according to the historical statistical value and the data state change information, the numerical value increase and decrease information and the data increase and decrease information of the current data;
saving the new statistical value to a third database.
According to the above database data processing method, apparatus, computer device, storage medium, and computer program product, the statistical value is calculated from the log data rather than directly from the associated data of each dimension. The statistical value can therefore be calculated without writing refresh logic for the associated data of each dimension, which reduces the number of data cleaning personnel required. Because the process is carried out by a computer program, it is also less error-prone than manual processing by many data cleaning personnel.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a diagram of an application environment of a database data processing method in one embodiment;
FIG. 2 is a schematic flow chart diagram illustrating a database data processing method according to one embodiment;
FIG. 3 is a schematic flow chart diagram illustrating a database data processing method according to another embodiment;
FIG. 4 is a block diagram showing the structure of a database data processing apparatus according to an embodiment;
FIG. 5 is a block diagram showing the structure of a database data processing apparatus according to another embodiment;
FIG. 6 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present disclosure more clearly understood, the present disclosure is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the disclosure and are not intended to limit the disclosure.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The database data processing method provided by the embodiment of the disclosure can be applied to the application environment shown in fig. 1. Wherein the data storage system may store data that the server 102 needs to process. The data storage system may be integrated on the server 102, or may be located on the cloud or other network server. The server 102 may include one or more data acquisition terminals that acquire log data in a first database, the log data including object association data, the object association data including dimension association data. The server 102 pushes the log data to a second database comprising statistical data tables corresponding to different dimensions of different objects, respectively. The server 102 stores the log data into a corresponding statistical data table according to the object information and the dimension information in the log data, and determines the current data. The server 102 calculates a new statistical value corresponding to the statistical data table based on the historical statistical value and the data state change information, the numerical value increase/decrease information, and the data increase/decrease information of the current data. Server 102 saves the new statistical value to a third database. The server 102 may be implemented as a stand-alone server or as a server cluster comprised of multiple servers.
In one embodiment, as shown in fig. 2, a database data processing method is provided, which is described by taking the application environment in fig. 1 as an example, and includes the following steps:
S202, collecting log data in a first database, wherein the log data comprise object associated data, and the object associated data comprise dimension associated data.
The log data may refer to data for recording modified content of the database. The object association data may refer to data having an association relation with an object. Dimension association data may refer to data associated with the presence of an object in one or more information dimensions.
In particular, a dimension may refer to an information dimension. The first database may broadly refer to one or more databases storing dimension-related data. The log data may be a binlog log. The first database may include one or more service tables, and each service table may have a corresponding binlog log. Log data, which may include all binlog logs, in the first database is collected. The log data includes a modification record of the object association data. The object may be an individual or an organization, such as a company boss or a company. The dimension association data may be dimension association data of an individual, such as dimension association data of a company boss. The dimension association data may be dimension association data of an organization, such as dimension association data of a certain company. An object may have one or more associated information dimensions, and an associated information dimension may relate to one or more statistics. A particular piece of information may become dimension related information for one or more objects. A particular piece of information may affect the calculation of one or more statistical values. Dimensions of different objects may have the same dimension name.
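As an illustration of the kind of information such a log entry carries, the following is a minimal Python sketch. The `LogRecord` class, its field names, and the `parse_change` helper are hypothetical and are not taken from the disclosure or from any binlog library; they only show one way that object information, dimension information, and the row change could be represented.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class LogRecord:
    """One row-change entry derived from a binlog-style log (hypothetical shape)."""
    object_id: str                    # the object (individual or organization) the row relates to
    dimension: str                    # the information dimension, e.g. "patents"
    op: str                           # "insert", "update", or "delete"
    before: Optional[Dict[str, Any]]  # row image before the change (None for an insert)
    after: Optional[Dict[str, Any]]   # row image after the change (None for a delete)


def parse_change(raw: Dict[str, Any]) -> LogRecord:
    """Convert a raw change dictionary into a LogRecord, validating the operation."""
    op = raw["op"]
    if op not in ("insert", "update", "delete"):
        raise ValueError(f"unsupported operation: {op}")
    return LogRecord(
        object_id=raw["object_id"],
        dimension=raw["dimension"],
        op=op,
        before=raw.get("before"),
        after=raw.get("after"),
    )


# Example: a new patent record is added for a hypothetical company.
record = parse_change({
    "op": "insert",
    "object_id": "company_A",
    "dimension": "patents",
    "after": {"id": 1, "status": "valid"},
})
```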
S204, pushing the log data to a second database, wherein the second database comprises statistical data tables respectively corresponding to different dimensions of different objects.
In particular, the statistical data table may be referred to as a count table. Each dimension of each object may have a corresponding statistical data table. The statistical data tables may be of different types, and different types of statistical data tables can have different table structures according to actual needs. The log data is pushed to the second database. Log data with all fields may be pushed to the second database.
S206, according to the object information and the dimension information in the log data, storing the log data into a corresponding statistical data table, and determining the current data.
Specifically, the log data is consumed, and the log data is stored into the statistical data table of the corresponding dimension of the corresponding object according to the object information and the dimension information in the log data.
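The routing in S206 can be sketched as follows. This is a minimal in-memory illustration, assuming each log entry is a dictionary with hypothetical `object_id` and `dimension` keys and that a statistical data table can be modeled as a list keyed by the (object, dimension) pair; a real deployment would write to tables in the second database instead.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Iterable, List, Tuple

# Hypothetical stand-in for the second database:
# one "statistical data table" (a list of log entries) per (object, dimension) pair.
TableKey = Tuple[str, str]
StatTables = DefaultDict[TableKey, List[Dict]]


def store_log_data(tables: StatTables, records: Iterable[Dict]) -> StatTables:
    """Append each log entry to the table selected by its object and dimension."""
    for rec in records:
        key = (rec["object_id"], rec["dimension"])
        tables[key].append(rec)
    return tables


tables: StatTables = defaultdict(list)
store_log_data(tables, [
    {"object_id": "company_A", "dimension": "patents", "op": "insert"},
    {"object_id": "company_A", "dimension": "staff", "op": "update"},
    {"object_id": "company_B", "dimension": "patents", "op": "delete"},
    {"object_id": "company_A", "dimension": "patents", "op": "insert"},
])
```

Here `tables[("company_A", "patents")]` ends up holding two entries, while the other two tables hold one entry each.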
S208, calculating a new statistical value corresponding to the statistical data table according to the historical statistical value and the data state change information, the numerical value increase and decrease information and the data increase and decrease information of the current data.
Wherein, the historical statistic value may refer to a numerical value of the statistic value before recalculation. The data increase/decrease information may be information for increasing or decreasing the number of pieces of data.
Specifically, the historical statistical value is the latest value of the statistic before recalculation. The log data in the statistical data tables is consumed, and the change amount of the statistical value for each statistical data table is determined according to the data state change information, the numerical value increase and decrease information, and the data increase and decrease information of the current data. The change amount is added to the corresponding historical statistical value to obtain the new statistical value for each statistical data table. For example, when the data state of a piece of dimension-associated data is modified from valid to invalid, the change amount may be determined as -1, and adding -1 to the corresponding historical statistical value (e.g., count_X) gives the new statistical value count_X - 1. For example, when a new piece of dimension-associated data is added to the statistical data table, the change amount may be determined as 1, and adding 1 to the corresponding historical statistical value (e.g., count_Y) gives the new statistical value count_Y + 1. For example, when the statistical data table contains numerical value increase and decrease information indicating that the number of staff has increased by two, the change amount may be determined as 2, and adding 2 to the corresponding historical statistical value (e.g., count_Z) gives the new statistical value count_Z + 2. The historical statistical values may be stored in the third database.
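The three kinds of change described above can be sketched with a single rule: a row's contribution to the count is computed before and after the change, and the difference is the change amount. This is a minimal Python sketch under stated assumptions: the `status` field, the `value_field` parameter, and the function names are hypothetical and are not specified in the disclosure.

```python
from typing import Iterable, Optional, Tuple

Row = Optional[dict]  # None means the row does not exist (before an insert, after a delete)


def contribution(row: Row, value_field: Optional[str] = None) -> int:
    """How much one row contributes to the statistic.

    A missing or invalid row contributes 0. A valid row contributes 1,
    or the integer stored in `value_field` when a numeric field is counted.
    """
    if row is None or row.get("status", "valid") != "valid":
        return 0
    if value_field is None:
        return 1
    return int(row.get(value_field, 0))


def change_amount(before: Row, after: Row, value_field: Optional[str] = None) -> int:
    """Change amount caused by one row change (after minus before)."""
    return contribution(after, value_field) - contribution(before, value_field)


def new_statistic(historical: int,
                  changes: Iterable[Tuple[Row, Row]],
                  value_field: Optional[str] = None) -> int:
    """Add the summed change amounts to the historical statistical value."""
    return historical + sum(change_amount(b, a, value_field) for b, a in changes)


# Data state change: valid -> invalid gives a change amount of -1.
count_x = new_statistic(10, [({"status": "valid"}, {"status": "invalid"})])
# Data increase: a newly added row gives a change amount of +1.
count_y = new_statistic(10, [(None, {"status": "valid"})])
# Numerical value increase: staff number grows from 3 to 5, a change amount of +2.
count_z = new_statistic(10, [({"staff": 3}, {"staff": 5})], value_field="staff")
```

With a historical value of 10 in each case, the three examples yield 9, 11, and 12 respectively, mirroring count_X - 1, count_Y + 1, and count_Z + 2 in the text.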
S210, storing the new statistical value into a third database.
Specifically, the calculated statistical values are collected and stored in a third database, so that the statistical values can be conveniently inquired and used.
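A minimal sketch of this save step is shown below, using a plain dictionary as a stand-in for the third database. The document layout (one document per object, with a `counts` mapping from dimension to value) and the `save_statistic` name are assumptions made for illustration only; the disclosure does not specify a storage schema.

```python
from typing import Dict

# Hypothetical stand-in for the third database: one document per object.
DocumentStore = Dict[str, dict]


def save_statistic(store: DocumentStore, object_id: str, dimension: str, value: int) -> dict:
    """Insert or update the statistical value for one dimension of one object."""
    doc = store.setdefault(object_id, {"_id": object_id, "counts": {}})
    doc["counts"][dimension] = value
    return doc


store: DocumentStore = {}
save_statistic(store, "company_A", "patents", 11)
save_statistic(store, "company_A", "staff", 12)
save_statistic(store, "company_A", "patents", 10)  # a later recalculation overwrites the old value
```

Keeping all dimension counts of an object in one document means a single lookup by object returns every statistical value for that object.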
In the above database data processing method, the statistical value is calculated from the log data rather than directly from the associated data of each dimension. The statistical value can therefore be calculated without writing refresh logic for the associated data of each dimension, which reduces the number of data cleaning personnel required. Because the process is carried out by a computer program, it is also less error-prone than manual processing by many data cleaning personnel.
In one embodiment, the second database is an open source distributed relational database and the third database is a distributed document storage database.
Specifically, the second database is an open-source distributed relational database, such as a TiDB database (TiDB is an open-source distributed HTAP database, where HTAP stands for Hybrid Transactional and Analytical Processing). The third database is a distributed document storage database, and may be, for example, a MongoDB database (MongoDB is a distributed document storage database written in C++).
In this embodiment, the open-source distributed relational database is used as the second database, so that the beneficial effects of better meeting the large-scale data processing requirements and better storing the statistical data table can be achieved. By using the distributed document storage database as the third database, the advantageous effect of facilitating the query and use of the statistical value data can be achieved.
In one embodiment, the second database further includes a data refresh table, and calculating the new statistical value corresponding to the statistical data table according to the log data in the statistical data table and the historical statistical value includes:
S302, determining a target object and a target dimension based on the log data.
S304, storing the target object and the target dimension into the data refresh table.
S306, determining a statistical data table needing to be subjected to statistical value calculation according to the data refreshing table.
Specifically, an object included in the log data is determined as a target object. An information dimension contained in the log data is determined as a target dimension, wherein the target dimension is associated with the target object. The target object and the target dimension are stored in the data refresh table. According to the data refresh table, each statistical data table whose data has changed is determined as a target statistical data table. This indicates which statistical values need to be calculated, so that only the statistical values corresponding to the target statistical data tables are calculated.
In this embodiment, the statistical data table that needs to be subjected to statistical value calculation is screened, so that the beneficial effects of reducing the calculation amount and accelerating the calculation speed of the statistical value can be achieved.
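The screening in S302 to S306 can be sketched as follows. This is a minimal illustration in which the data refresh table is modeled as a set of (object, dimension) pairs; the key names and function names are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

TableKey = Tuple[str, str]  # (target object, target dimension)


def build_refresh_table(records: Iterable[dict]) -> Set[TableKey]:
    """S302/S304: collect the target object and target dimension of every log entry."""
    return {(rec["object_id"], rec["dimension"]) for rec in records}


def select_target_tables(all_tables: Dict[TableKey, List[dict]],
                         refresh_table: Set[TableKey]) -> Dict[TableKey, List[dict]]:
    """S306: keep only the statistical data tables listed in the refresh table."""
    return {key: rows for key, rows in all_tables.items() if key in refresh_table}


all_tables = {
    ("company_A", "patents"): [{"id": 1}],
    ("company_A", "staff"): [{"id": 2}],
    ("company_B", "patents"): [{"id": 3}],
}
new_logs = [
    {"object_id": "company_A", "dimension": "patents"},
    {"object_id": "company_A", "dimension": "patents"},
]
refresh_table = build_refresh_table(new_logs)
targets = select_target_tables(all_tables, refresh_table)
```

Only the table for ("company_A", "patents") is selected, so the statistical values of the other two tables are not recalculated.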
In one embodiment, the statistical data tables include classified statistical data tables and non-classified statistical data tables.
A classified statistical data table may refer to a statistical data table that further classifies the stored information. A non-classified statistical data table may refer to a statistical data table that does not further classify the stored information.
Specifically, the statistical data tables include classified statistical data tables and non-classified statistical data tables. For a classified statistical data table, each sub-category may correspond to its own statistical value. The corresponding dimension-associated data is stored using a statistical data table of a suitable type according to actual needs.
In this embodiment, the statistical data table includes a classified statistical data table and a non-classified statistical data table, so that the beneficial effects of storing dimension associated data more clearly, facilitating data consumption, and facilitating statistical value calculation can be achieved.
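The difference between the two table types can be sketched as follows: a non-classified table yields a single statistical value, while a classified table yields one statistical value per sub-category. The `category_field` parameter, the `"total"` key, and the sample patent types are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List, Optional


def count_rows(rows: List[dict], category_field: Optional[str] = None) -> Dict[str, int]:
    """Count rows in a statistical data table.

    Without a category field the table is treated as non-classified and a single
    total is returned. With a category field the table is treated as classified
    and one count per sub-category is returned.
    """
    if category_field is None:
        return {"total": len(rows)}
    return dict(Counter(row[category_field] for row in rows))


patent_rows = [
    {"id": 1, "type": "invention"},
    {"id": 2, "type": "design"},
    {"id": 3, "type": "invention"},
]
non_classified = count_rows(patent_rows)
classified = count_rows(patent_rows, category_field="type")
```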
In one embodiment, said pushing said log data to a second database comprises:
and pushing the log data to a second database in a message queue mode.
Specifically, the log data is pushed to the second database in the form of a message queue. The message queue may be a Kafka message queue (Kafka is the name of a message queue system).
In this embodiment, the log data is pushed to the second database in the form of a message queue, so that the beneficial effects of improving the data pushing stability and efficiency and meeting the large-scale data processing requirements can be achieved.
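The producer and consumer roles implied by this embodiment can be sketched as follows. To keep the example self-contained, Python's standard-library `queue.Queue` is used as an in-process stand-in for a message queue; it is not Kafka and does not use any Kafka client API. The function names and the JSON message encoding are assumptions.

```python
import json
import queue
from typing import Iterable, List


def push_logs(q: "queue.Queue[str]", records: Iterable[dict]) -> None:
    """Producer side: serialize each log entry and put it on the queue."""
    for rec in records:
        q.put(json.dumps(rec))


def consume_logs(q: "queue.Queue[str]") -> List[dict]:
    """Consumer side: drain the queue and decode each message in arrival order."""
    consumed = []
    while True:
        try:
            message = q.get_nowait()
        except queue.Empty:
            break
        consumed.append(json.loads(message))
    return consumed


log_queue: "queue.Queue[str]" = queue.Queue()
push_logs(log_queue, [
    {"object_id": "company_A", "dimension": "patents", "op": "insert"},
    {"object_id": "company_B", "dimension": "staff", "op": "update"},
])
received = consume_logs(log_queue)
```

Decoupling the producer from the consumer in this way lets the side that writes to the second database process entries at its own pace.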
It should be understood that, although the steps in the flowcharts of the embodiments described above are displayed sequentially as indicated by the arrows, the steps are not necessarily performed in that sequence. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and described, and may be performed in other orders. Moreover, at least some of the steps in the flowcharts of the embodiments described above may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times. These sub-steps or stages are also not necessarily performed sequentially, and may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the present disclosure further provides a database data processing apparatus for implementing the above-mentioned database data processing method. The implementation scheme for solving the problem provided by the device is similar to the implementation scheme described in the above method, so the specific limitations in one or more embodiments of the database data processing device provided below may refer to the limitations on the database data processing method in the above description, and are not described herein again.
In one embodiment, as shown in fig. 4, there is provided a database data processing apparatus including: a data collection module 402, a data pushing module 404, a data storage module 406, a calculation module 408 and a data summarization module 410, wherein:
A data collection module 402, configured to collect log data in a first database, where the log data includes object association data, and the object association data includes dimension association data.
A data pushing module 404, configured to push the log data to a second database, where the second database includes statistical data tables respectively corresponding to different dimensions of different objects.
A data storage module 406, configured to store the log data into a corresponding statistical data table according to the object information and the dimension information in the log data, and determine current data.
The calculating module 408 is configured to calculate a new statistical value corresponding to the statistical data table according to the historical statistical value and the data state change information, the numerical value increase/decrease information, and the data increase/decrease information of the current data.
A data summarization module 410, configured to save the new statistical value to a third database.
In one embodiment, as shown in fig. 5, the database data processing apparatus includes: a target determination module 502, a refresh table module 504, and a screening module 506, wherein:
A target determination module 502, configured to determine a target object and a target dimension based on the log data.
A refresh table module 504, configured to store the target object and the target dimension to a data refresh table.
A screening module 506, configured to determine a statistical data table that needs to perform statistical value calculation according to the data refresh table.
In one embodiment, the data pushing module 404 is configured to push the log data to the second database in the form of a message queue.
The modules in the database data processing apparatus can be implemented wholly or partially by software, hardware, or a combination thereof. The modules can be embedded in, or independent of, a processor in the computer device in hardware form, or can be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure may be as shown in fig. 6. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used for storing dimension association data and related processing data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements a database data processing method.
Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of part of the structure related to the solution of the present disclosure and does not limit the computer devices to which the solution applies; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
collecting log data in a first database, wherein the log data comprise object associated data, and the object associated data comprise dimension associated data;
pushing the log data to a second database, wherein the second database comprises statistical data tables respectively corresponding to different dimensions of different objects;
storing the log data into a corresponding statistical data table according to the object information and the dimension information in the log data, and determining the current data;
calculating a new statistical value corresponding to the statistical data table according to the historical statistical value and the data state change information, the numerical value increase and decrease information and the data increase and decrease information of the current data;
saving the new statistical value to a third database.
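The calculation step can be illustrated with an incremental update, in which the new statistical value is derived from the historical value plus the changes carried by the current data rather than by rescanning the whole table. The three statistics (`count`, `total`, `active`) and the change kinds below are assumptions chosen to mirror the data increase and decrease information, the numerical value increase and decrease information, and the data state change information; this passage does not give the actual formulas.

```python
def update_statistics(history, changes):
    """Return new statistics computed from historical values and a list of changes."""
    stats = dict(history)  # leave the historical record untouched
    for ch in changes:
        kind = ch["kind"]
        if kind == "row_added":        # data increase
            stats["count"] += 1
            stats["total"] += ch["value"]
            if ch["active"]:
                stats["active"] += 1
        elif kind == "row_removed":    # data decrease
            stats["count"] -= 1
            stats["total"] -= ch["value"]
            if ch["active"]:
                stats["active"] -= 1
        elif kind == "value_changed":  # numerical increase or decrease
            stats["total"] += ch["new_value"] - ch["old_value"]
        elif kind == "state_changed":  # data state change
            stats["active"] += 1 if ch["now_active"] else -1
        else:
            raise ValueError(f"unknown change kind: {kind}")
    return stats


history = {"count": 10, "total": 500, "active": 7}
changes = [
    {"kind": "row_added", "value": 40, "active": True},
    {"kind": "value_changed", "old_value": 40, "new_value": 25},
    {"kind": "state_changed", "now_active": False},
    {"kind": "row_removed", "value": 100, "active": False},
]
new_stats = update_statistics(history, changes)
print(new_stats)  # {'count': 10, 'total': 425, 'active': 7}
```

Because each change adjusts the running values directly, the cost of an update depends on the number of changes, not on the size of the statistical data table.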
In one embodiment, the processor, when executing the computer program, further performs the steps of:
determining a target object and a target dimension based on the log data;
storing the target object and the target dimension to a data refresh table;
determining, according to the data refresh table, the statistical data table for which a statistical value needs to be calculated.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
collecting log data in a first database, wherein the log data comprise object associated data, and the object associated data comprise dimension associated data;
pushing the log data to a second database, wherein the second database comprises statistical data tables respectively corresponding to different dimensions of different objects;
storing the log data into a corresponding statistical data table according to the object information and the dimension information in the log data, and determining the current data;
calculating a new statistical value corresponding to the statistical data table according to the historical statistical value and the data state change information, the numerical value increase and decrease information and the data increase and decrease information of the current data;
saving the new statistical value to a third database.
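The final saving step can be sketched as an upsert into a document-style store, in line with the claim that the third database may be a distributed document storage database. Here a plain dictionary stands in for that database, and the document layout (one document per object, one entry per dimension) is an assumption made for illustration.

```python
def save_statistic(store, object_id, dimension, stats):
    """Insert or overwrite the statistics of one dimension in the object's document."""
    doc = store.setdefault(object_id, {"_id": object_id, "dimensions": {}})
    doc["dimensions"][dimension] = dict(stats)  # copy so later edits do not leak in
    return doc


store = {}  # in-memory stand-in for the third database
save_statistic(store, "company_1", "lawsuit", {"count": 10, "total": 425})
save_statistic(store, "company_1", "patent", {"count": 3, "total": 0})
save_statistic(store, "company_1", "lawsuit", {"count": 11, "total": 465})
```

Saving the same object and dimension again replaces the earlier value, so readers of the store always see the latest statistical value.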
In one embodiment, the computer program when executed by the processor further performs the steps of:
determining a target object and a target dimension based on the log data;
storing the target object and the target dimension to a data refresh table;
determining, according to the data refresh table, the statistical data table for which a statistical value needs to be calculated.
In one embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, performs the steps of:
collecting log data in a first database, wherein the log data comprise object associated data, and the object associated data comprise dimension associated data;
pushing the log data to a second database, wherein the second database comprises statistical data tables respectively corresponding to different dimensions of different objects;
storing the log data into a corresponding statistical data table according to the object information and the dimension information in the log data, and determining the current data;
calculating a new statistical value corresponding to the statistical data table according to the historical statistical value and the data state change information, the numerical value increase and decrease information and the data increase and decrease information of the current data;
saving the new statistical value to a third database.
In one embodiment, the computer program when executed by the processor further performs the steps of:
determining a target object and a target dimension based on the log data;
storing the target object and the target dimension to a data refresh table;
determining, according to the data refresh table, the statistical data table for which a statistical value needs to be calculated.
It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present disclosure are information and data that are authorized by the user or sufficiently authorized by each party.
It will be understood by those skilled in the art that all or part of the processes of the methods in the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, databases, or other media used in the embodiments provided by the present disclosure may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetoresistive Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory can include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided by the present disclosure may include at least one of relational and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database. The processors referred to in the embodiments provided by the present disclosure may be, without limitation, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, or data processing logic devices based on quantum computing.
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, any combination that involves no contradiction should be considered to fall within the scope of the present specification.
The above embodiments express only several implementations of the present disclosure. They are described in relative detail, but this should not be construed as limiting the scope of the present disclosure. It should be noted that those skilled in the art can make various changes and modifications without departing from the concept of the present disclosure, and these changes and modifications all fall within the scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the appended claims.

Claims (10)

1. A database data processing method, the method comprising:
collecting log data in a first database, wherein the log data comprise object associated data, and the object associated data comprise dimension associated data;
pushing the log data to a second database, wherein the second database comprises statistical data tables respectively corresponding to different dimensions of different objects;
storing the log data into a corresponding statistical data table according to the object information and the dimension information in the log data, and determining the current data;
calculating a new statistical value corresponding to the statistical data table according to the historical statistical value and the data state change information, the numerical value increase and decrease information and the data increase and decrease information of the current data;
saving the new statistical value to a third database.
2. The method of claim 1, wherein the second database further comprises a data refresh table for:
determining a target object and a target dimension based on the log data;
storing the target object and the target dimension to a data refresh table;
determining, according to the data refresh table, the statistical data table for which a statistical value needs to be calculated.
3. The method of claim 1, wherein the second database is an open source distributed relational database and the third database is a distributed document storage database.
4. The method of claim 1, wherein the statistics table comprises a sorted statistics table and a non-sorted statistics table.
5. The method of claim 1, wherein pushing the log data to a second database comprises:
pushing the log data to the second database by means of a message queue.
6. A statistical apparatus for correlating dimensional data, the apparatus comprising:
the data acquisition module is used for acquiring log data in a first database, wherein the log data comprises object associated data, and the object associated data comprises dimension associated data;
the data pushing module is used for pushing the log data to a second database, and the second database comprises statistical data tables respectively corresponding to different dimensions of different objects;
the data storage module is used for storing the log data into a corresponding statistical data table according to the object information and the dimension information in the log data and determining the current data;
the calculation module is used for calculating a new statistical value corresponding to the statistical data table according to a historical statistical value and the data state change information, the numerical value increase and decrease information and the data increase and decrease information of the current data;
and the data summarization module is used for storing the new statistical value into a third database.
7. The apparatus of claim 6, further comprising:
a target determination module for determining a target object and a target dimension based on the log data;
a refresh table module for storing the target object and the target dimension to a data refresh table;
and a screening module for determining, according to the data refresh table, the statistical data table for which a statistical value needs to be calculated.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 5.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 5.
10. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 5 when executed by a processor.
CN202111452515.1A 2021-11-30 2021-11-30 Database data processing method, device, computer equipment and storage medium Active CN114238258B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111452515.1A CN114238258B (en) 2021-11-30 2021-11-30 Database data processing method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114238258A (en) 2022-03-25
CN114238258B (en) 2024-02-20

Family

ID=80752521

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111452515.1A Active CN114238258B (en) 2021-11-30 2021-11-30 Database data processing method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114238258B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060095405A1 (en) * 2004-10-29 2006-05-04 International Business Machines Corporation Mirroring database statistics
CN108985981A (en) * 2018-06-28 2018-12-11 北京奇虎科技有限公司 Data processing system and method
CN109753531A (en) * 2018-12-26 2019-05-14 深圳市麦谷科技有限公司 A kind of big data statistical method, system, computer equipment and storage medium
CN109933505A (en) * 2019-03-14 2019-06-25 深圳市珍爱捷云信息技术有限公司 Log processing method, device, computer equipment and storage medium
US10437470B1 (en) * 2015-06-22 2019-10-08 Amazon Technologies, Inc. Disk space manager
CN111008244A (en) * 2019-11-22 2020-04-14 厦门安胜网络科技有限公司 Database synchronization and analysis method and system
CN111897867A (en) * 2020-08-17 2020-11-06 杭州安恒信息技术股份有限公司 Database log statistical method, system and related device
CN113656377A (en) * 2021-08-23 2021-11-16 深圳市万睿智能科技有限公司 Automatic matching method and device for data migration, computer equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
徐娟娟; 朱成亮: "NOSQL在WEB日志分析中的应用" (Application of NoSQL in Web Log Analysis), 中国新技术新产品, no. 10 *

Also Published As

Publication number Publication date
CN114238258B (en) 2024-02-20

Similar Documents

Publication Publication Date Title
AU2017202873B2 (en) Efficient query processing using histograms in a columnar database
WO2017096892A1 (en) Index construction method, search method, and corresponding device, apparatus, and computer storage medium
CN105653609A (en) Memory-based data processing method and device
CN105630934A (en) Data statistic method and system
CN114757602B (en) Supply side electric power carbon emission risk early warning method and device and computer equipment
CN117033424A (en) Query optimization method and device for slow SQL (structured query language) statement and computer equipment
CN109446167A (en) A kind of storage of daily record data, extracting method and device
CN110019017B (en) High-energy physical file storage method based on access characteristics
CN113901279A (en) Graph database retrieval method and device
US8548980B2 (en) Accelerating queries based on exact knowledge of specific rows satisfying local conditions
CN110851758B (en) Webpage visitor quantity counting method and device
CN116401238A (en) Deviation monitoring method, apparatus, device, storage medium and program product
CN116611914A (en) Salary prediction method and device based on grouping statistics
CN114238258B (en) Database data processing method, device, computer equipment and storage medium
CN115858471A (en) Service data change recording method, device, computer equipment and medium
Purdilă et al. Single‐scan: a fast star‐join query processing algorithm
US10558647B1 (en) High performance data aggregations
CN116932779B (en) Knowledge graph data processing method and device
CN117978859A (en) Information pushing method and related equipment
CN117312283A (en) Database and table data verification method and device, computer equipment and storage medium
CN116204549A (en) Data query method, apparatus, computer device, storage medium, and program product
CN116051152A (en) Business product generation method, device, computer program product and storage medium
EP2657862B1 (en) Parallel set aggregation
CN114756654A (en) Dynamic place name and address matching method and device, computer equipment and storage medium
CN116049190A (en) Kafka-based data processing method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: No. 8 Huizhi Street, Suzhou Industrial Park, Suzhou Area, China (Jiangsu) Pilot Free Trade Zone, Suzhou City, Jiangsu Province, 215000

Applicant after: Qichacha Technology Co.,Ltd.

Address before: Room 503, 5 / F, C1 building, 88 Dongchang Road, Suzhou Industrial Park, 215000, Jiangsu Province

Applicant before: Qicha Technology Co.,Ltd.

GR01 Patent grant