CN112559504A - Data cleaning method and device based on data heat and storage medium - Google Patents

Publication number
CN112559504A
Authority
CN
China
Prior art keywords
data
heat
information
preset
cleaning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011448046.1A
Other languages
Chinese (zh)
Inventor
严敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Si Tech Information Technology Co Ltd
Original Assignee
Beijing Si Tech Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Si Tech Information Technology Co Ltd
Priority to CN202011448046.1A
Publication of CN112559504A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data cleaning method, a device and a storage medium based on data heat, wherein the method comprises the following steps: collecting data heat information from a target data platform; analyzing a plurality of data types in the data heat information respectively to obtain the heat of each data type; performing heat evaluation on each data type heat according to a preset heat weight to obtain evaluation information of the data type heat; determining the type of the data to be deleted according to a preset cleaning strategy and the evaluation information, and cleaning the data corresponding to the type of the data to be deleted. According to the invention, the collected data heat information can be analyzed to obtain the data type heat, the heat evaluation information of the data type heat is given based on the data heat evaluation model, the data type is automatically cleaned according to the evaluation information, manual processing is not required, and the cleaning efficiency and accuracy are improved.

Description

Data cleaning method and device based on data heat and storage medium
Technical Field
The invention mainly relates to the technical field of data cleaning, in particular to a data cleaning method and device based on data heat and a storage medium.
Background
With the development of big data platforms, big data centers such as large-scale data warehouses and data lakes have become increasingly common. These data centers accumulate data continuously, which puts pressure on storage and performance. Keeping a data center running efficiently and raising the value of the data it holds pose new challenges for data operation and maintenance, and require that low-value, low-heat data be cleaned in a timely manner.
Disclosure of Invention
The invention provides a data cleaning method and device based on data heat and a storage medium, aiming at the defects of the prior art.
The technical solution adopted to solve the above technical problem is as follows: a data cleaning method based on data heat, comprising the following steps:
collecting data heat information from a target data platform;
analyzing a plurality of data types in the data heat information respectively to obtain the heat of each data type;
performing heat evaluation on each data type heat according to a preset heat weight to obtain evaluation information of the data type heat;
determining the type of the data to be deleted according to a preset cleaning strategy and the evaluation information, and cleaning the data corresponding to the type of the data to be deleted.
Another technical solution of the present invention for solving the above technical problems is as follows: a data cleansing apparatus based on data heat, comprising:
the acquisition module is used for acquiring data heat information from the target data platform;
the analysis module is used for respectively analyzing a plurality of data types in the data heat information to obtain the heat of each data type;
the evaluation module is used for carrying out heat evaluation on each data type heat according to a preset heat weight to obtain evaluation information of the data type heat;
and the cleaning module is used for determining the type of the data to be deleted according to a preset cleaning strategy and the evaluation information and cleaning the data corresponding to the type of the data to be deleted.
Another technical solution of the present invention for solving the above technical problems is as follows: a data-heat-based data cleansing apparatus comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the computer program, when executed by the processor, implementing the data-heat-based data cleansing method as described above.
Another technical solution of the present invention for solving the above technical problems is as follows: a computer-readable storage medium, storing a computer program which, when executed by a processor, implements a data cleansing method based on data heat as described above.
The invention has the beneficial effects that: the collected data heat information can be analyzed to obtain the data type heat, the heat evaluation information of the data type heat is given based on the data heat evaluation model, the data type is automatically cleaned according to the evaluation information, manual processing is not needed, and cleaning efficiency and accuracy are improved.
Drawings
FIG. 1 is a schematic flow chart illustrating a data cleaning method based on data heat according to an embodiment of the present invention;
fig. 2 is a schematic functional block diagram of a data cleaning apparatus based on data heat according to an embodiment of the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
Fig. 1 is a schematic flow chart of a data cleaning method based on data heat according to an embodiment of the present invention.
As shown in fig. 1, a data cleaning method based on data heat includes the following steps:
collecting data heat information from a target data platform;
analyzing a plurality of data types in the data heat information respectively to obtain the heat of each data type;
performing heat evaluation on each data type heat according to a preset heat weight to obtain evaluation information of the data type heat;
determining the type of the data to be deleted according to a preset cleaning strategy and the evaluation information, and cleaning the data corresponding to the type of the data to be deleted.
In the embodiment, the collected data heat information can be analyzed to obtain the data type heat, the heat evaluation information of the data type heat is given based on the data heat evaluation model, the data type is automatically cleaned according to the evaluation information, manual processing is not needed, and cleaning efficiency and accuracy are improved.
Optionally, as an embodiment of the present invention, the data heat information includes a database log, a metadata access log, data entitlement information, and data association information;
the process of analyzing the plurality of data types in the data heat information respectively to obtain the heat of each data type comprises the following steps:
extracting table operation statements related to data access operation from the database log through a preset analysis data processing script and analyzing the table operation statements to obtain table information, obtaining table information access times according to the table information, and obtaining data access heat according to the table information access times;
analyzing the metadata access log through a preset analysis data processing script to obtain browsing model metadata information, obtaining table metadata access times from the browsing model metadata information, and obtaining metadata structure query heat according to the table metadata access times;
analyzing the data entitlement information through a preset analysis data processing script to obtain table entitlement information, obtaining table entitlement access times from the table entitlement information, and obtaining data entitlement heat according to the table entitlement access times;
analyzing the data association information through a preset analysis data processing script to obtain association table information, analyzing the association table information recursively by utilizing a preset association script to obtain a plurality of source tables and the target tables associated with them, continuing the recursion until no further association exists between source tables and target tables, and obtaining association heat according to the plurality of associated source tables and target tables.
Specifically, the preset analysis data processing script includes the parsing keywords select, update, and insert, and the database log, the metadata access log, the data entitlement information, and the data association information are parsed by matching these keywords.
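As an illustration only, the keyword-based parsing of a database log could be sketched as follows. The regular expressions, the log line layout, and the function name `count_table_accesses` are assumptions made for this sketch; the source names only the keywords select, update, and insert and does not specify a log format.

```python
import re
from collections import Counter

# Hypothetical keyword patterns; the source only names select, update and insert.
TABLE_PATTERNS = [
    re.compile(r"\bfrom\s+([A-Za-z_][\w.]*)", re.IGNORECASE),           # select ... from <table>
    re.compile(r"\bupdate\s+([A-Za-z_][\w.]*)", re.IGNORECASE),         # update <table>
    re.compile(r"\binsert\s+into\s+([A-Za-z_][\w.]*)", re.IGNORECASE),  # insert into <table>
]


def count_table_accesses(log_lines):
    """Count how often each table name appears in select/update/insert statements."""
    counts = Counter()
    for line in log_lines:
        for pattern in TABLE_PATTERNS:
            for table in pattern.findall(line):
                counts[table.lower()] += 1
    return counts


# Made-up log lines, used only to exercise the function.
sample_log = [
    "2020-12-01 10:00:01 SELECT id, name FROM dw.orders WHERE id = 7",
    "2020-12-01 10:00:05 UPDATE dw.orders SET name = 'x' WHERE id = 7",
    "2020-12-01 10:01:00 INSERT INTO dw.users (id) VALUES (1)",
    "2020-12-01 10:02:00 checkpoint complete",
]
access_counts = count_table_accesses(sample_log)
```

The resulting per-table counts play the role of the table information access times from which the data access heat is derived. A production parser would need a real SQL tokenizer, since regular expressions miss joins, subqueries, and quoted identifiers.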
Specifically, the preset association script matches statements of the form insert ... select and create table ... as select; the table following select is the source table, and the table following insert or create table is the target table.
It should be understood that the step of parsing source tables and their associated target tables from the association table information with the preset association script is applied recursively: the tables influenced by each target table, and the tables it depends on, are parsed in turn, and the recursion ends when no further association is found.
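The recursive association parsing described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the two regular expressions, the helper names `parse_edges` and `downstream_tables`, and the idea of using the set of downstream tables as input to the association heat are all hypothetical; the source does not give the concrete script.

```python
import re

# Hypothetical patterns for the two statement forms named in the text.
INSERT_SELECT = re.compile(
    r"insert\s+into\s+([\w.]+).*?\bselect\b.*?\bfrom\s+([\w.]+)",
    re.IGNORECASE | re.DOTALL,
)
CREATE_AS_SELECT = re.compile(
    r"create\s+table\s+([\w.]+)\s+as\s+select\b.*?\bfrom\s+([\w.]+)",
    re.IGNORECASE | re.DOTALL,
)


def parse_edges(statements):
    """Return a dict mapping each source table to the set of its target tables."""
    edges = {}
    for sql in statements:
        for pattern in (INSERT_SELECT, CREATE_AS_SELECT):
            match = pattern.search(sql)
            if match:
                target, source = match.group(1).lower(), match.group(2).lower()
                edges.setdefault(source, set()).add(target)
    return edges


def downstream_tables(table, edges, seen=None):
    """Recursively collect every table that depends on `table`.

    The recursion stops when a table has no further targets; the `seen`
    set also guards against cycles in the association data.
    """
    if seen is None:
        seen = set()
    for target in edges.get(table, ()):
        if target not in seen:
            seen.add(target)
            downstream_tables(target, edges, seen)
    return seen


# Made-up statements forming the chain a -> b -> c -> d.
statements = [
    "INSERT INTO b SELECT * FROM a",
    "CREATE TABLE c AS SELECT x FROM b",
    "INSERT INTO d SELECT x FROM c",
]
edges = parse_edges(statements)
```

With these statements, `downstream_tables("a", edges)` walks the whole chain, while a leaf table such as `d` yields an empty set. Only the first table after `from` is captured, so multi-table joins would need a fuller parser.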
In the above embodiment, several different kinds of data can be analyzed to obtain the corresponding heat of each.
Optionally, as an embodiment of the present invention, the process of performing heat evaluation on the heat of each data type according to a preset heat weight to obtain evaluation information includes:
ranking the heat of each data type according to a preset heat weight to obtain heat ranking information, scoring the heat of each data type according to the heat ranking information, and taking the heat and the score of each data type, together with the corresponding data type, as the evaluation information, wherein the preset heat weight assigns 35% to the data access heat, 30% to the association heat, 10% to the metadata structure query heat, 10% to the data entitlement heat, and 15% to the activity proportion; the activity proportion is an additional weight granted to the data type with the highest access count among all data types within the current preset time period.
Specifically, the "current preset time period" may be set to, for example, the last three months. The number of accesses to each data type within those three months is counted, and the data type with the highest count receives the additional weight; that is, the weight of that data type is the sum of its own proportion and the additional weight.
Specifically, a data heat evaluation model can be established from the preset heat weights. The model evaluates five dimensions: data access heat (database access count), metadata structure query heat (metadata access count), data entitlement heat (number of grants), association heat (heat score of the associated models), and recent access information (activity proportion). Each dimension is converted to a percentage and multiplied by its proportion to calculate the score. The heat evaluation model reflects two ideas: a model that has recently been granted and accessed frequently has higher heat, and a model that influences higher-value models is itself more valuable. The association heat score is calculated from the scores of the affected downstream models.
It should be understood that the weights of the data access heat, metadata structure query heat, data entitlement heat, and association heat may be adjusted manually.
In the above embodiment, the data heat evaluation model ranks each heat by weight and assigns scores according to the ranking, yielding the evaluation information, which can then be used to process data types of different heat.
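A weighted score along these lines might be computed as in the sketch below. Several points are assumptions rather than statements from the source: the scored items are taken to be tables, "converted into percentages" is read as dividing each dimension by its maximum across tables, and the 15% activity proportion is added only to the table with the most accesses in the recent window. The names `WEIGHTS`, `ACTIVITY_WEIGHT`, and `heat_scores` are hypothetical.

```python
# Weights from the text; kept in a dict so they can be adjusted manually.
WEIGHTS = {"access": 0.35, "association": 0.30, "metadata": 0.10, "entitlement": 0.10}
ACTIVITY_WEIGHT = 0.15  # additional weight for the most recently active table


def heat_scores(raw, recent_access):
    """Score each table on a 0..100 scale.

    raw: {table: {dimension: count}}
    recent_access: {table: access count within the preset time period}
    """
    # Normalise each dimension by its maximum (an assumption of this sketch);
    # `or 1` avoids division by zero when a dimension is zero everywhere.
    maxima = {dim: max(row[dim] for row in raw.values()) or 1 for dim in WEIGHTS}
    most_active = max(recent_access, key=recent_access.get)
    scores = {}
    for table, row in raw.items():
        score = sum(WEIGHTS[dim] * row[dim] / maxima[dim] for dim in WEIGHTS)
        if table == most_active:
            score += ACTIVITY_WEIGHT
        scores[table] = round(100 * score, 2)
    return scores


# Made-up counts for three tables.
raw = {
    "orders": {"access": 200, "association": 4, "metadata": 10, "entitlement": 5},
    "users": {"access": 100, "association": 2, "metadata": 5, "entitlement": 5},
    "tmp_export": {"access": 0, "association": 0, "metadata": 0, "entitlement": 0},
}
recent = {"orders": 50, "users": 20, "tmp_export": 0}
scores = heat_scores(raw, recent)
```

Here `orders` leads every dimension and is also the most active table, so it reaches the full 100 points; `users` scores 17.5 + 15 + 5 + 10 = 47.5; `tmp_export` scores 0.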
Optionally, as an embodiment of the present invention, the preset cleaning strategy includes a number n of cleaning items and a cleaning cycle;
the process of determining the type of the data to be deleted according to the preset cleaning strategy and the evaluation information comprises the following steps:
in each cleaning cycle, determining the n lowest-scoring data types as the data types to be deleted according to the number n of cleaning items and the evaluation information.
Specifically, the cleaning cycle may be set to a fixed time point, for example 1 a.m., so that data is cleaned once each time that point is reached. Alternatively, the cleaning cycle may be set to an interval, for example 10 hours, so that data is cleaned every 10 hours.
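The selection step and the interval-based cycle can be illustrated with a short sketch. The function names `tables_to_delete` and `is_due`, and the sample scores, are hypothetical; only the "n lowest-scoring" rule and the 10-hour example come from the text.

```python
import heapq
from datetime import datetime, timedelta


def tables_to_delete(scores, n):
    """Return the n lowest-scoring tables, lowest first."""
    return heapq.nsmallest(n, scores, key=scores.get)


def is_due(last_run, now, period=timedelta(hours=10)):
    """True when at least one cleaning period has elapsed since the last run."""
    return now - last_run >= period


# Made-up evaluation scores.
scores = {"orders": 100.0, "users": 47.5, "tmp_export": 0.0, "old_log": 3.2}
candidates = tables_to_delete(scores, 2)
```

With n = 2 the candidates are `tmp_export` and `old_log`. A fixed time point such as 1 a.m. would more typically be handled by an external scheduler than by an elapsed-time check like `is_due`.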
The method also comprises the following steps before the cleaning treatment:
exporting the data types to be cleaned to form files, and backing up the files.
Specifically, the preset cleaning strategy not only includes the number n of cleaning items and the cleaning period, but also includes a scoring rule, a backup rule and the like, and can meet the application requirements of various scenes.
In this embodiment, low-heat data types are cleaned regularly according to the cleaning cycle so that valuable data types are retained, without manual participation, which improves efficiency. The data to be cleaned is backed up before cleaning, which prevents data loss caused by misoperation.
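The backup-before-clean ordering can be sketched as below. This is not the patent's implementation: an in-memory SQLite database stands in for the target data platform, CSV is an assumed export format, and `backup_then_drop` is a hypothetical helper. The point illustrated is only that the table is dropped after its contents have been written to a file.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def backup_then_drop(conn, table, backup_dir):
    """Export `table` to a CSV file, then drop it only after the export is written."""
    # Table names cannot be bound as SQL parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = conn.execute(f"SELECT * FROM {quoted}")
    header = [col[0] for col in cursor.description]
    path = Path(backup_dir) / f"{table}.csv"
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(cursor)  # any exception here prevents the drop below
    conn.execute(f"DROP TABLE {quoted}")
    conn.commit()
    return path


# Stand-in platform: an in-memory SQLite database with one low-heat table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tmp_export (id INTEGER, name TEXT)")
conn.execute("INSERT INTO tmp_export VALUES (1, 'a'), (2, 'b')")
backup_dir = tempfile.mkdtemp()
backup_path = backup_then_drop(conn, "tmp_export", backup_dir)
```

Because the export runs first and raises on failure, a table is never dropped without a backup file on disk, which matches the stated goal of preventing data loss from misoperation.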
Fig. 2 is a schematic functional block diagram of a data cleaning apparatus based on data heat according to an embodiment of the present invention.
Optionally, as another embodiment of the present invention, as shown in fig. 2, a data cleansing apparatus based on data heat includes:
the acquisition module is used for acquiring data heat information from the target data platform;
the analysis module is used for respectively analyzing a plurality of data types in the data heat information to obtain the heat of each data type;
the evaluation module is used for carrying out heat evaluation on each data type heat according to a preset heat weight to obtain evaluation information of the data type heat;
and the cleaning module is used for determining the type of the data to be deleted according to a preset cleaning strategy and the evaluation information and cleaning the data corresponding to the type of the data to be deleted.
Optionally, as an embodiment of the present invention, the data heat information includes a database log, a metadata access log, data entitlement information, and data association information;
the analysis module is specifically configured to:
extracting associated table operation statements from the database log through a preset analysis data processing script and analyzing the table operation statements to obtain table information, obtaining table information access times according to the table information, and obtaining data access heat according to the table information access times;
analyzing the metadata access log through a preset analysis data processing script to obtain browsing model metadata information, obtaining table metadata access times from the browsing model metadata information, and obtaining metadata structure query heat according to the table metadata access times;
analyzing the data entitlement information through a preset analysis data processing script to obtain table entitlement information, obtaining table entitlement access times from the table entitlement information, and obtaining data entitlement heat according to the table entitlement access times;
analyzing the data association information through a preset analysis data processing script to obtain association table information, analyzing the association table information recursively by utilizing a preset association script to obtain a plurality of source tables and the target tables associated with them, continuing the recursion until no further association exists between source tables and target tables, and obtaining association heat according to the plurality of associated source tables and target tables.
Optionally, as an embodiment of the present invention, the preset cleaning strategy includes a number n of cleaning items and a cleaning cycle;
the process of cleaning all data types according to the preset cleaning strategy and the evaluation information comprises the following steps:
and cleaning the data types corresponding to the n data types with lower scores according to each cleaning cycle, the cleaning item number n and the evaluation information.
Optionally, as an embodiment of the present invention, before performing the cleaning process, the method further includes:
exporting the n data types to be cleaned to form files, and carrying out backup processing on the files.
Optionally, as another embodiment of the present invention, a data cleansing apparatus based on data heat includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the data cleansing method based on data heat as described above is implemented.
Alternatively, as another embodiment of the present invention, a computer-readable storage medium stores a computer program which, when executed by a processor, implements the data cleansing method based on data heat as described above.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A data cleaning method based on data heat is characterized by comprising the following steps:
collecting data heat information from a target data platform;
analyzing a plurality of data types in the data heat information respectively to obtain the heat of each data type;
performing heat evaluation on each data type heat according to a preset heat weight to obtain evaluation information of the data type heat;
determining the type of the data to be deleted according to a preset cleaning strategy and the evaluation information, and cleaning the data corresponding to the type of the data to be deleted.
2. The data cleaning method according to claim 1, wherein the data heat information includes a database log, a metadata access log, data entitlement information, and data association information;
the process of analyzing the plurality of data types in the data heat information respectively to obtain the heat of each data type comprises the following steps:
extracting table operation statements related to data access operation from the database log through a preset analysis data processing script, analyzing the table operation statements to obtain table information, obtaining table information access times according to the table information, and obtaining data access heat according to the table information access times;
analyzing the metadata access log through a preset analysis data processing script to obtain browsing model metadata information, obtaining table metadata access times from the browsing model metadata information, and obtaining metadata structure query heat according to the table metadata access times;
analyzing the data entitlement information through a preset analysis data processing script to obtain table entitlement information, obtaining table entitlement access times from the table entitlement information, and obtaining data entitlement heat according to the table entitlement access times;
analyzing the data association information through a preset analysis data processing script to obtain association table information, analyzing the association table information through a recursion method by utilizing a preset association script to obtain a plurality of source tables and target tables associated with the source tables, stopping recursion until the source tables and the target tables do not have an association relation, and obtaining association heat according to the plurality of associated source tables and target tables.
3. The data cleaning method according to claim 2, wherein the obtaining of the evaluation information by performing heat evaluation on each of the data types according to a preset heat weight comprises:
ranking the heat of each data type according to a preset heat weight to obtain heat ranking information, scoring the heat of each data type according to the heat ranking information, and taking the heat and the score of each data type and the corresponding data type as evaluation information, wherein the preset heat weight is that the data access heat proportion is 35%, the association heat proportion is 30%, the query heat proportion of a metadata structure is 10%, the data weighted heat proportion is 10% and the activity proportion is 15%; and the active ratio is an additional weight ratio of the data type with the highest access frequency from each data type in the current preset time period.
4. The data cleaning method according to claim 3, wherein the preset cleaning policy includes a number n of cleaning items and a cleaning period;
the process of determining the type of the data to be deleted according to the preset cleaning strategy and the evaluation information comprises the following steps:
and determining n data types with lower grades as the data types to be deleted according to the number n of the cleaning items and the evaluation information in each cleaning period.
5. The data cleansing method according to claim 4, characterized by further comprising, before said performing cleansing processing, the steps of:
exporting the data types to be cleaned to form files, and backing up the files.
6. A data cleaning device based on data heat degree is characterized by comprising:
the acquisition module is used for acquiring data heat information from the target data platform;
the analysis module is used for respectively analyzing a plurality of data types in the data heat information to obtain the heat of each data type;
the evaluation module is used for carrying out heat evaluation on each data type heat according to a preset heat weight to obtain evaluation information of the data type heat;
and the cleaning module is used for determining the type of the data to be deleted according to a preset cleaning strategy and the evaluation information and cleaning the data corresponding to the type of the data to be deleted.
7. The data cleaning device of claim 6, wherein the data heat information comprises a database log, a metadata access log, data entitlement information and data association information;
the analysis module is specifically configured to:
extracting table operation statements related to data access operation from the database log through a preset analysis data processing script, analyzing the table operation statements to obtain table information, obtaining table information access times according to the table information, and obtaining data access heat according to the table information access times;
analyzing the metadata access log through a preset analysis data processing script to obtain browsing model metadata information, obtaining table metadata access times from the browsing model metadata information, and obtaining metadata structure query heat according to the table metadata access times;
analyzing the data entitlement information through a preset analysis data processing script to obtain table entitlement information, obtaining table entitlement access times from the table entitlement information, and obtaining data entitlement heat according to the table entitlement access times;
analyzing the data association information through a preset analysis data processing script to obtain association table information, analyzing the association table information through a recursion method by utilizing a preset association script to obtain a plurality of source tables and target tables associated with the source tables, stopping recursion until the source tables and the target tables do not have an association relation, and obtaining association heat according to the plurality of associated source tables and target tables.
8. The data cleaning apparatus of claim 7, wherein the parsing module is specifically configured to:
ranking the heat of each data type according to a preset heat weight to obtain heat ranking information, scoring the heat of each data type according to the heat ranking information, and taking the heat and the score of each data type and the corresponding data type as evaluation information, wherein the preset heat weight is that the data access heat proportion is 35%, the association heat proportion is 30%, the query heat proportion of a metadata structure is 10%, the data weighted heat proportion is 10% and the activity proportion is 15%; and the active ratio is an additional weight ratio of the data type with the highest access frequency from each data type in the current preset time period.
9. A data-heat-based data cleansing apparatus comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that when the computer program is executed by the processor, the data cleansing method according to any one of claims 1 to 5 is implemented.
10. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, implements the data cleaning method based on data heat according to any one of claims 1 to 5.
CN202011448046.1A 2020-12-09 2020-12-09 Data cleaning method and device based on data heat and storage medium Pending CN112559504A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011448046.1A CN112559504A (en) 2020-12-09 2020-12-09 Data cleaning method and device based on data heat and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011448046.1A CN112559504A (en) 2020-12-09 2020-12-09 Data cleaning method and device based on data heat and storage medium

Publications (1)

Publication Number Publication Date
CN112559504A true CN112559504A (en) 2021-03-26

Family

ID=75061326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011448046.1A Pending CN112559504A (en) 2020-12-09 2020-12-09 Data cleaning method and device based on data heat and storage medium

Country Status (1)

Country Link
CN (1) CN112559504A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120290950A1 (en) * 2011-05-12 2012-11-15 Jeffrey A. Rapaport Social-topical adaptive networking (stan) system allowing for group based contextual transaction offers and acceptances and hot topic watchdogging
CN106294595A (en) * 2016-07-29 2017-01-04 海尔优家智能科技(北京)有限公司 A kind of document storage, search method and device
CN108388555A (en) * 2018-02-01 2018-08-10 口碑(上海)信息技术有限公司 Commodity De-weight method based on category of employment and device
CN110209345A (en) * 2018-12-27 2019-09-06 中兴通讯股份有限公司 The method and device of data storage
CN109918575A (en) * 2019-03-29 2019-06-21 阿里巴巴集团控股有限公司 A kind of superseded method and apparatus of the data applied to search system
CN110807009A (en) * 2019-11-06 2020-02-18 湖南快乐阳光互动娱乐传媒有限公司 File processing method and device
CN111104300A (en) * 2019-11-30 2020-05-05 浪潮电子信息产业股份有限公司 Statistical method, system and device for directory access heat
CN111913954A (en) * 2020-06-20 2020-11-10 杭州城市大数据运营有限公司 Intelligent data standard catalog generation method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
EMILIA KACPRZAK 等: "Characterising dataset search—An analysis of search logs and data requests", 《JOURNAL OF WEB SEMANTICS》, vol. 55, 19 November 2018 (2018-11-19), pages 37 - 55 *
黄恺翔: "云存储环境下副本管理策略研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》, no. 06, 15 June 2013 (2013-06-15), pages 137 - 31 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113268477A (en) * 2021-06-07 2021-08-17 中国联合网络通信集团有限公司 Data table cleaning method and device and server
CN113268477B (en) * 2021-06-07 2023-06-23 中国联合网络通信集团有限公司 Data table cleaning method and device and server
CN113792084A (en) * 2021-08-12 2021-12-14 北京中交兴路信息科技有限公司 Data heat analysis method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination