CN113449017A - Historical behavior data processing method and storage medium - Google Patents
- Publication number
- CN113449017A
- Authority
- CN
- China
- Prior art keywords
- data
- time
- original
- processing method
- historical behavior
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Fuzzy Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to a historical behavior data processing method and a storage medium. The method analyzes data on visitors' historical behavior paths: by analyzing visitors' behavior habits in the early, middle, and later stages of a visit, it determines the traffic acquisition channels in which to increase investment and guides content editing and competitor product research and analysis. The method comprises the following stages: first, raw data are stored and data analysis is performed in three steps: (1) a conventional BI stage; (2) data mining; (3) predictive analysis of the data. The technical problem to be solved by the invention is to overcome the shortcomings of the prior art by comprehensively analyzing and storing behavior habits in the early, middle, and later stages of a visit, thereby determining the traffic acquisition channels in which to increase investment and guiding content editing and competitor product research and analysis.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to a historical behavior data processing method and a storage medium.
Background
After years of accumulation, most medium-sized and large enterprises and public institutions have built fairly complete basic information systems such as CRM, ERP, and OA. However, the large volumes of data scattered across independent databases are incomprehensible to business personnel. What business personnel need is information: an abstraction of the data that they can understand and benefit from. How to convert data into information that business personnel (including managers) can fully grasp and use to support decision-making is the main problem that business intelligence (BI) addresses.
How can the data in a database be translated into the information that business personnel need? The most common answer is a reporting system. Put simply, a reporting system can already be called BI; it is the low-end implementation of BI. Most foreign enterprises have now moved to mid-range BI, known as data analysis, and some have begun to move to high-end BI, known as data mining. With the rise of smartphones, enterprises engaged in live-stream selling and online shopping urgently need to analyze customer behavior data, yet most of them remain at the reporting stage.
Disclosure of Invention
The technical problem to be solved by the invention is to overcome the shortcomings of the prior art and provide a historical behavior data processing method and a storage medium that comprehensively analyze and store behavior habits in the early, middle, and later stages of a visit, thereby determining the traffic acquisition channels in which to increase investment and guiding content editing and competitor product research and analysis.
To solve the above technical problem, the invention provides the following technical solution: a historical behavior data processing method and a storage medium that analyze data on a visitor's historical behavior path and, by analyzing the visitor's behavior habits in the early, middle, and later stages of a visit, determine the traffic acquisition channels in which to increase investment and guide content editing and competitor product research and analysis, wherein the processing method comprises the following stages:
first, storing the raw data and performing data analysis in three steps:
(1) a conventional BI stage; (2) data mining; (3) predictive analysis of the data;
a data warehouse is established by the traditional method, and, on the basis of that warehouse, analysis results are generated for fixed business report models to obtain decision results;
during data mining, features of the data are extracted from the raw data from the outset, building on the results computed by the fixed business models;
building on the results computed by the fixed business report models, the storage engine stores the data as raw data, meeting the needs of the BI stage as well as those of future data mining and predictive analysis;
second, providing real-time multidimensional query: once the data are stored as raw data, aggregation results are stored and processed so that users can query incremental data quickly;
third, meeting the quick-response requirement: visual event-tracking points are provided at data ingestion, and files and MR data are collected in a single-process ingestion mode, so that the path from data generation to data display and query results responds quickly;
fourth, searching and analyzing the data: the data are searched on the basis of the original multidimensional data to mine new value from them.
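By way of illustration only, the second stage (storing aggregation results so that incremental data can be queried quickly) might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: the class name `IncrementalAggregate`, the record fields `channel` and `duration`, and the choice of count and sum as the stored aggregates are all hypothetical.

```python
from collections import defaultdict


class IncrementalAggregate:
    """Keeps raw detail records plus a stored per-dimension aggregate (hypothetical design)."""

    def __init__(self, dimension, measure):
        self.dimension = dimension        # field to group by, e.g. "channel" (assumed name)
        self.measure = measure            # numeric field to aggregate, e.g. "duration" (assumed name)
        self.raw = []                     # raw detail records, kept unchanged
        self.count = defaultdict(int)     # stored aggregation result: number of records
        self.total = defaultdict(float)   # stored aggregation result: sum of the measure

    def ingest(self, records):
        """Append an increment and fold it into the stored aggregates."""
        for rec in records:
            self.raw.append(rec)
            key = rec[self.dimension]
            self.count[key] += 1
            self.total[key] += rec[self.measure]

    def query(self, key):
        """Answer from the stored aggregates without rescanning the raw data."""
        return {"count": self.count[key], "sum": self.total[key]}
```

Each call to `ingest` touches only the new records, so a query issued after an increment arrives does not need to recompute over the full history.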
Furthermore, behavior habit data are stored using a life-cycle concept: an operation type (add, modify, or delete), an effective time, and an expiration time are added on top of the master data;
adding data: the record is inserted into both the master data and the raw data; in the raw data, the operation type is set to add, the effective time to the current time, and the expiration time to null;
modifying data: the record in the master data is modified; the expiration time of the corresponding record in the raw data is set to the current time; and the newly modified master record is inserted into the raw data as a new record, with the operation type set to modify, the effective time set to the current time, and the expiration time set to null;
deleting data: if no other data in the master data are associated with the record, the record is deleted; the expiration time of the corresponding record in the raw data table is set to the current time; and one additional record is inserted into the raw data, with the operation type set to delete, the effective time set to the current time, and the expiration time set to null.
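The add, modify, and delete rules above might be sketched as follows, purely as an illustration. The class `LifecycleStore`, its in-memory dictionaries, and the field names `op`, `effective`, and `expired` are assumptions made for this sketch, and the check for associated data before deletion is omitted.

```python
from datetime import datetime, timezone


class LifecycleStore:
    """Master data holds current records; raw data holds every version (hypothetical sketch)."""

    def __init__(self):
        self.master = {}   # key -> current field values
        self.raw = []      # append-only version history

    def _append(self, key, values, op, now):
        # A new raw row always starts with a null expiration time.
        self.raw.append({"key": key, "values": values, "op": op,
                         "effective": now, "expired": None})

    def _expire_current(self, key, now):
        # Close the raw row that is currently open for this key.
        for row in self.raw:
            if row["key"] == key and row["expired"] is None:
                row["expired"] = now

    def add(self, key, values):
        now = datetime.now(timezone.utc)
        self.master[key] = dict(values)
        self._append(key, dict(values), "add", now)

    def modify(self, key, values):
        now = datetime.now(timezone.utc)
        self.master[key].update(values)
        self._expire_current(key, now)
        self._append(key, dict(self.master[key]), "modify", now)

    def delete(self, key):
        # Association check omitted: a real system would refuse if other data referenced the key.
        now = datetime.now(timezone.utc)
        values = self.master.pop(key)
        self._expire_current(key, now)
        self._append(key, values, "delete", now)
```

Because the raw data are only appended to or stamped with an expiration time, every earlier version of a record remains available for later queries.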
Furthermore, the real-time multidimensional query is built on search-engine indexing technology, and its filtering supports time filtering, text filtering, and numerical filtering.
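The three filter types might be combined as in the following sketch. It is illustrative only: it scans an in-memory list rather than using a search-engine index, and the function names and record fields are hypothetical.

```python
from datetime import datetime


def time_filter(field, start, end):
    """Keep records whose timestamp falls in the half-open range [start, end)."""
    return lambda rec: start <= rec[field] < end


def text_filter(field, needle):
    """Keep records whose text field contains the needle, ignoring case."""
    return lambda rec: needle.lower() in rec[field].lower()


def numeric_filter(field, low, high):
    """Keep records whose numeric field lies in the closed range [low, high]."""
    return lambda rec: low <= rec[field] <= high


def apply_filters(records, *filters):
    """Return the records that satisfy every filter."""
    return [rec for rec in records if all(f(rec) for f in filters)]


# Example data; field names are assumptions for this sketch.
visits = [
    {"at": datetime(2021, 7, 1), "page": "Home", "seconds": 12},
    {"at": datetime(2021, 7, 2), "page": "Product detail", "seconds": 95},
    {"at": datetime(2021, 8, 1), "page": "Product list", "seconds": 40},
]
july_product_views = apply_filters(
    visits,
    time_filter("at", datetime(2021, 7, 1), datetime(2021, 8, 1)),
    text_filter("page", "product"),
    numeric_filter("seconds", 30, 120),
)
```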
Furthermore, the real-time multidimensional query supports single indicators, each defined as one aggregation calculation over a given dimension, and composite indicators, each defined by combining indicators in an expression of the four arithmetic operations. In normal operation, queries read only the currently valid data directly from the master data; if the raw data or the full update history of a record is needed, the query reads the raw data alone or reads all data including the raw data.
Furthermore, the operation types also include an operation that recovers data from the raw data.
Furthermore, the storage medium stores a computer program that, when executed, implements the stages of the historical behavior data processing method.
Furthermore, the storage medium includes a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or another medium capable of storing program code.
The invention has the following advantages: because the data are stored as raw detail records, no pre-calculation is needed, and the path from data generation to data display and query results can be completed within 5 seconds. Any cross-analysis of the data can be performed through the interface, which makes it easy to understand how the data are distributed.
Detailed Description
The present invention will be described in further detail with reference to examples.
A historical behavior data processing method and a storage medium analyze data on a visitor's historical behavior path and, by analyzing the visitor's behavior habits in the early, middle, and later stages of a visit, determine the traffic acquisition channels in which to increase investment and guide content editing and competitor product research and analysis, wherein the processing method comprises the following stages:
first, storing the raw data and performing data analysis in three steps:
(1) a conventional BI stage; (2) data mining; (3) predictive analysis of the data;
a data warehouse is established by the traditional method, and, on the basis of that warehouse, analysis results are generated for fixed business report models to obtain decision results;
during data mining, features of the data are extracted from the raw data from the outset, building on the results computed by the fixed business models;
building on the results computed by the fixed business report models, the storage engine stores the data as raw data, meeting the needs of the BI stage as well as those of future data mining and predictive analysis;
second, providing real-time multidimensional query: once the data are stored as raw data, aggregation results are stored and processed so that users can query incremental data quickly;
third, meeting the quick-response requirement: visual event-tracking points are provided at data ingestion, and files and MR data are collected in a single-process ingestion mode, so that the path from data generation to data display and query results responds quickly;
fourth, searching and analyzing the data: the data are searched on the basis of the original multidimensional data to mine new value from them.
The business requirement of real-time response is met through flexible indicator definition. The indicator definition module offers several kinds of indicators. One is the single indicator, that is, one aggregation calculation over a given dimension, which can be completed simply and quickly through the interface. Another is the composite indicator, which requires the four arithmetic operations and can likewise be defined through the interface. More complex indicators must be defined across multiple dimensions; they can be defined quickly with expressions, and once an indicator is defined, its results are shown directly in the interface as charts for data analysis.
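As an illustration only, a single indicator and a composite indicator might be computed over raw detail records as follows. The function names, the record fields `channel` and `amount`, and the example indicator (average amount per visit) are assumptions made for this sketch.

```python
import operator
from collections import defaultdict


def single_indicator(records, dimension, measure, agg):
    """One aggregation calculation over one dimension, e.g. sum of amount per channel."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[dimension]].append(rec[measure])
    return {key: agg(values) for key, values in groups.items()}


def composite_indicator(left, right, op):
    """Combine two single indicators with an arithmetic operation, key by key."""
    return {key: op(left[key], right[key]) for key in left.keys() & right.keys()}


# Raw detail records; field names are hypothetical.
orders = [
    {"channel": "search", "amount": 10.0},
    {"channel": "search", "amount": 30.0},
    {"channel": "ad", "amount": 5.0},
]
revenue = single_indicator(orders, "channel", "amount", sum)
visits = single_indicator(orders, "channel", "amount", len)
average_amount = composite_indicator(revenue, visits, operator.truediv)
```

Because both indicators are computed from the detail records on demand, a new composite indicator needs only a new expression, not a new pre-computed table.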
Behavior habit data are stored using a life-cycle concept: an operation type (add, modify, or delete), an effective time, and an expiration time are added on top of the master data;
adding data: the record is inserted into both the master data and the raw data; in the raw data, the operation type is set to add, the effective time to the current time, and the expiration time to null;
modifying data: the record in the master data is modified; the expiration time of the corresponding record in the raw data is set to the current time; and the newly modified master record is inserted into the raw data as a new record, with the operation type set to modify, the effective time set to the current time, and the expiration time set to null;
deleting data: if no other data in the master data are associated with the record, the record is deleted; the expiration time of the corresponding record in the raw data table is set to the current time; and one additional record is inserted into the raw data, with the operation type set to delete, the effective time set to the current time, and the expiration time set to null.
Advanced queries such as user grouping, user funnel queries, and user retention queries are supported, as is filtering on multiple conditions such as date ranges, numerical ranges, geographic coordinate ranges, and exact string matching. Multiple aggregation methods are also supported, such as statistics, grouping, and re-aggregation of aggregated results, all of which arise frequently in business requirements.
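A user funnel query of the kind mentioned above might be sketched as follows. This is a simplified illustration: the event tuple layout `(user, timestamp, action)` and the step names are hypothetical, and no time window between steps is enforced.

```python
def funnel(events, steps):
    """Count the users who reach each funnel step in order.

    events: iterable of (user, timestamp, action) tuples (assumed layout).
    steps:  ordered list of action names that make up the funnel.
    """
    progress = {}  # user -> number of funnel steps completed so far
    for user, _, action in sorted(events, key=lambda e: e[1]):
        reached = progress.get(user, 0)
        # Advance only when the event matches the user's next expected step.
        if reached < len(steps) and action == steps[reached]:
            progress[user] = reached + 1
    return [sum(1 for r in progress.values() if r > i) for i in range(len(steps))]


events = [
    ("a", 1, "view"), ("a", 2, "cart"), ("a", 3, "pay"),
    ("b", 1, "view"), ("b", 2, "pay"),
    ("c", 1, "cart"),
]
counts = funnel(events, ["view", "cart", "pay"])
```

In this example, user `b` skips the cart step and user `c` never views a product, so only user `a` completes the funnel.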
The real-time multidimensional query is built on search-engine indexing technology, and its filtering supports time filtering, text filtering, and numerical filtering. The query supports single indicators, each defined as one aggregation calculation over a given dimension, and composite indicators, each defined by combining indicators in an expression of the four arithmetic operations. In normal operation, queries read only the currently valid data directly from the master data; if the raw data or the full update history of a record is needed, the query reads the raw data alone or reads all data including the raw data. The operation types also include an operation that recovers data from the raw data.
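Querying the full update history of one record and recovering it from the raw data might look like the sketch below. It is illustrative only: the row layout (`key`, `values`, `op`, `effective`, `expired`), the use of integer timestamps, and the `recover` operation label are assumptions, since the text does not specify how recovery is recorded.

```python
def history(raw, key):
    """Return every raw version of one record, oldest first."""
    return sorted((r for r in raw if r["key"] == key), key=lambda r: r["effective"])


def recover(master, raw, key, now):
    """Restore the latest non-deleted version of a record into the master data."""
    versions = [r for r in history(raw, key) if r["op"] != "delete"]
    if not versions:
        raise KeyError(key)
    restored = dict(versions[-1]["values"])
    # Close whichever raw row is still open, then log the recovery as a new row.
    for row in raw:
        if row["key"] == key and row["expired"] is None:
            row["expired"] = now
    master[key] = restored
    raw.append({"key": key, "values": restored, "op": "recover",
                "effective": now, "expired": None})
    return restored


# A record that was added, modified, and then deleted (hypothetical data).
raw = [
    {"key": "u1", "values": {"channel": "search"}, "op": "add", "effective": 1, "expired": 2},
    {"key": "u1", "values": {"channel": "ad"}, "op": "modify", "effective": 2, "expired": 3},
    {"key": "u1", "values": {"channel": "ad"}, "op": "delete", "effective": 3, "expired": None},
]
master = {}
restored = recover(master, raw, "u1", now=4)
```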
Indicators can be customized arbitrarily on the platform: because the data are stored as raw detail records, no pre-calculation is needed, and indicators can be defined with expressions directly through the interface.
Dimensions can be filtered freely: dragging data in the interface is enough to complete a cross-analysis.
The storage medium stores a computer program that, when executed, implements the stages of the historical behavior data processing method. The storage medium includes a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or another medium capable of storing program code.
Although the invention has been described in detail hereinabove with respect to a general description and specific embodiments thereof, it will be apparent to those skilled in the art that modifications or improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.
Claims (7)
1. A historical behavior data processing method and a storage medium, characterized in that: data on a visitor's historical behavior path are analyzed, and the visitor's behavior habits in the early, middle, and later stages of a visit are analyzed to determine the traffic acquisition channels in which to increase investment and to guide content editing and competitor product research and analysis, wherein the processing method comprises the following stages:
first, storing the raw data and performing data analysis in three steps:
(1) a conventional BI stage; (2) data mining; (3) predictive analysis of the data;
a data warehouse is established by the traditional method, and, on the basis of that warehouse, analysis results are generated for fixed business report models to obtain decision results;
during data mining, features of the data are extracted from the raw data from the outset, building on the results computed by the fixed business models;
building on the results computed by the fixed business report models, the storage engine stores the data as raw data, meeting the needs of the BI stage as well as those of future data mining and predictive analysis;
second, providing real-time multidimensional query: once the data are stored as raw data, aggregation results are stored and processed so that users can query incremental data quickly;
third, meeting the quick-response requirement: visual event-tracking points are provided at data ingestion, and files and MR data are collected in a single-process ingestion mode, so that the path from data generation to data display and query results responds quickly;
fourth, searching and analyzing the data: the data are searched on the basis of the original multidimensional data to mine new value from them.
2. The historical behavior data processing method according to claim 1, wherein: behavior habit data are stored using a life-cycle concept, in which an operation type (add, modify, or delete), an effective time, and an expiration time are added on top of the master data;
adding data: the record is inserted into both the master data and the raw data; in the raw data, the operation type is set to add, the effective time to the current time, and the expiration time to null;
modifying data: the record in the master data is modified; the expiration time of the corresponding record in the raw data is set to the current time; and the newly modified master record is inserted into the raw data as a new record, with the operation type set to modify, the effective time set to the current time, and the expiration time set to null;
deleting data: if no other data in the master data are associated with the record, the record is deleted; the expiration time of the corresponding record in the raw data table is set to the current time; and one additional record is inserted into the raw data, with the operation type set to delete, the effective time set to the current time, and the expiration time set to null.
3. The historical behavior data processing method according to claim 2, wherein: the real-time multidimensional query is built on search-engine indexing technology, and its filtering supports time filtering, text filtering, and numerical filtering.
4. The historical behavior data processing method according to claim 3, wherein: the real-time multidimensional query supports single indicators, each defined as one aggregation calculation over a given dimension, and composite indicators, each defined by combining indicators in an expression of the four arithmetic operations; in normal operation, queries read only the currently valid data directly from the master data; and if the raw data or the full update history of a record is needed, the query reads the raw data alone or reads all data including the raw data.
5. The historical behavior data processing method according to claim 4, wherein: the operation types also include an operation that recovers data from the raw data.
6. A storage medium for use with the historical behavior data processing method of any one of claims 1 to 5, characterized in that: the storage medium stores a computer program that, when executed, implements the stages of the historical behavior data processing method.
7. The storage medium according to claim 6, wherein: the storage medium includes a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or another medium capable of storing program code.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110799892.6A CN113449017A (en) | 2021-07-15 | 2021-07-15 | Historical behavior data processing method and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113449017A (en) | 2021-09-28 |
Family
ID=77816251
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110799892.6A Pending CN113449017A (en) | 2021-07-15 | 2021-07-15 | Historical behavior data processing method and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113449017A (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110781252A (en) * | 2019-11-05 | 2020-02-11 | 安徽数据堂科技有限公司 | Intelligent data analysis visualization method based on big data |
CN111881204A (en) * | 2020-07-24 | 2020-11-03 | 海南中金德航科技股份有限公司 | Big data visualization platform |
CN112148810A (en) * | 2020-11-10 | 2020-12-29 | 南京智数云信息科技有限公司 | User portrait analysis system supporting custom label |
Non-Patent Citations (2)
Title |
---|
诸葛本不亮: "Historical data solution", CSDN * |
青云QINGCLOUD: "Storage and analysis of massive real-time user behavior data", SEGMENTFAULT * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210928 |