TW201805839A - Data processing method, device and system

Publication number
TW201805839A
Authority
TW
Taiwan
Prior art keywords
data
index
search engine
keyword
keywords
Application number
TW106119497A
Other languages
Chinese (zh)
Inventor
譚純
Original Assignee
阿里巴巴集團服務有限公司
Application filed by 阿里巴巴集團服務有限公司

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G06F16/3335 Syntactic pre-processing, e.g. stopword elimination, stemming
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Abstract

Provided are a data processing method, device and system. The method comprises: a query terminal receiving a query request of a user, wherein the query request comprises search keywords; acquiring the dimension keywords, index keywords and time granularity keywords among the search keywords, and acquiring first data corresponding to dimension features matching the dimension keywords, second data corresponding to index features matching the index keywords, and third data corresponding to time granularity features matching the time granularity keywords; and the query terminal determining, according to the first data, the second data and the third data, target data to be fed back to the user. In the present embodiment, the user only needs to input the search keywords once, without traversing each data outlet to search for data; a search engine database can then find the data related to the search keywords across all the data outlets, thereby improving the efficiency of searching for data.

Description

Data processing method, device, and system

The present invention relates to Internet technology, and in particular to a data processing method, device, and system.

With the rapid development of the Internet, data volumes have grown explosively. Companies that hold massive data assets now store very large amounts of data. Such companies generally present that data to all of their employees through four kinds of data outlets. A data outlet is a storage space that holds data, or a software application that can generate data, and that can serve as a data source for a database. The four kinds of data outlets are: data applications (for example Alibaba's Taobao Shengyijing (淘寶生意經) and Baidu's Baidu Index), reports (for example a company's payroll report), knowledge base platforms (for example Baidu's Baidu Baike), and cluster physical tables (for example the personal information of a company's users).

A non-technical employee of such a company generally has to search the four kinds of data outlets one after another to obtain the required data. For example, a non-technical employee who needs "the transaction amount of home improvement products on a given day" must search the company's data applications, reports, knowledge base platforms, and cluster physical tables in turn until that figure is found.

In practice, each kind of data outlet presents a large volume of data, so a non-technical employee who searches the outlets one by one will inevitably look up data inefficiently.

The present invention provides a data processing method, device, and system to improve the efficiency of searching for data.

In one aspect, the present invention provides a data processing system comprising a query terminal and a search engine database. The query terminal is configured to receive a query request from a user, the query request including search keywords; the query terminal obtains the dimension keyword, index keyword, and time granularity keyword among the search keywords, and sends the dimension keyword, the index keyword, and the time granularity keyword to the search engine database. The search engine database stores in advance the data in data outlets and feature information of the data, the feature information including at least one of the following: a dimension feature, an index feature, and a time granularity feature. The search engine database is configured to obtain first data corresponding to a dimension feature that matches the dimension keyword, second data corresponding to an index feature that matches the index keyword, and third data corresponding to a time granularity feature that matches the time granularity keyword, and to send the first data, the second data, and the third data to the query terminal. The query terminal is further configured to determine, according to the first data, the second data, and the third data, target data to be fed back to the user, and to display the target data to the user.

In another aspect, the present invention provides a data processing method, comprising: a query terminal receives a query request from a user, the query request including search keywords; the query terminal obtains the dimension keyword, index keyword, and time granularity keyword among the search keywords; the query terminal sends the dimension keyword, the index keyword, and the time granularity keyword to a search engine database, so that the search engine database obtains first data corresponding to a dimension feature that matches the dimension keyword, second data corresponding to an index feature that matches the index keyword, and third data corresponding to a time granularity feature that matches the time granularity keyword, where the search engine database stores in advance the data in data outlets and feature information of the data, the feature information including at least one of the following: a dimension feature, an index feature, and a time granularity feature; the query terminal receives the first data, the second data, and the third data sent by the search engine database; and the query terminal determines, according to the first data, the second data, and the third data, target data to be fed back to the user.

In another aspect, the present invention provides a data processing method, comprising: a query terminal receives a query request from a user, the query request including search keywords; the query terminal obtains at least two types of keywords from the search keywords; the query terminal sends the at least two types of keywords to a search engine database, so that the search engine database obtains source data corresponding to each of the at least two types of keywords; the query terminal receives the source data sent by the search engine database; and the query terminal determines, according to the source data, target data to be fed back to the user.

In another aspect, the present invention provides a data processing method, comprising: a search engine database receives a dimension keyword, an index keyword, and a time granularity keyword sent by a query terminal, these keywords having been obtained by the query terminal from the search keywords included in a query request received from a user; the search engine database stores in advance the data in data outlets and feature information of the data, the feature information including at least one of the following: a dimension feature, an index feature, and a time granularity feature; the search engine database obtains first data corresponding to a dimension feature that matches the dimension keyword, second data corresponding to an index feature that matches the index keyword, and third data corresponding to a time granularity feature that matches the time granularity keyword; and the search engine database sends the first data, the second data, and the third data to the query terminal, so that the query terminal determines, according to the first data, the second data, and the third data, target data to be fed back to the user.

In yet another aspect, the present invention provides a data processing method, comprising: a search engine database obtains first data from a data application, together with the dimension feature, index feature, and time granularity feature of the first data; the search engine database obtains second data from each of a report, a knowledge base platform, and a cluster physical table, together with the dimension feature of the second data; the search engine database stores the first data together with its dimension feature, index feature, and time granularity feature; and the search engine database stores the second data together with its dimension feature.
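The ingestion method above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patent's implementation: the `Record` layout, the `ingest` function, and the outlet names (`"data_application"`, `"report"`, and so on) are hypothetical. The one rule taken from the text is that data from a data application keeps all three features, while data from the other outlets keeps only the dimension feature.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Record:
    """One stored item plus its feature information (hypothetical layout)."""
    outlet: str
    payload: str
    dimensions: List[str] = field(default_factory=list)
    indexes: List[str] = field(default_factory=list)
    time_granularity: Optional[str] = None


# Assumption: only this outlet type carries index and time granularity features.
FULL_FEATURE_OUTLETS = {"data_application"}


def ingest(store, outlet, payload, dimensions, indexes=(), time_granularity=None):
    """Store a piece of data together with the features its outlet type supports."""
    if outlet in FULL_FEATURE_OUTLETS:
        record = Record(outlet, payload, list(dimensions), list(indexes), time_granularity)
    else:
        # Reports, knowledge base platforms, cluster physical tables: dimension only.
        record = Record(outlet, payload, list(dimensions))
    store.append(record)
    return record


store = []
ingest(store, "data_application", "traffic log", ["user type"], ["Pv", "Uv"], "last day")
ingest(store, "report", "payroll report", ["department"])
```

Keeping the feature lists on the record itself means a later lookup can filter on any one feature without consulting the original outlet.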

In another aspect, the present invention provides a query terminal, comprising a receiving unit, a processing unit, and a sending unit. The receiving unit is configured to receive a query request from a user, the query request including search keywords. The processing unit is coupled to the receiving unit and is configured to obtain the dimension keyword, index keyword, and time granularity keyword among the search keywords. The sending unit is coupled to the processing unit and is configured to send the dimension keyword, the index keyword, and the time granularity keyword to a search engine database, so that the search engine database obtains first data corresponding to a dimension feature that matches the dimension keyword, second data corresponding to an index feature that matches the index keyword, and third data corresponding to a time granularity feature that matches the time granularity keyword. The search engine database stores in advance the data in data outlets and feature information of the data; the data outlets include at least one of the following: a data application, a report, a knowledge base platform, and a cluster physical table; and the feature information includes at least one of the following: a dimension feature, an index feature, and a time granularity feature. The receiving unit is further configured to receive the first data, the second data, and the third data sent by the search engine database. The processing unit is further configured to determine, according to the first data, the second data, and the third data, target data to be fed back to the user.

In a further aspect, the present invention provides a search engine database, comprising a receiver, a memory, a processor, and a transmitter. The receiver is configured to receive a dimension keyword, an index keyword, and a time granularity keyword sent by a query terminal, these keywords having been obtained by the query terminal from the search keywords included in a query request received from a user. The memory is configured to store the data in data outlets and feature information of the data; the data outlets include at least one of the following: a data application, a report, a knowledge base platform, and a cluster physical table; and the feature information includes at least one of the following: a dimension feature, an index feature, and a time granularity feature. The processor is coupled to the receiver and the memory and is configured to obtain first data corresponding to a dimension feature that matches the dimension keyword, second data corresponding to an index feature that matches the index keyword, and third data corresponding to a time granularity feature that matches the time granularity keyword. The transmitter is coupled to the processor and is configured to send the first data, the second data, and the third data to the query terminal, so that the query terminal determines, according to the first data, the second data, and the third data, target data to be fed back to the user.

In the present invention, data from data applications, reports, knowledge base platforms, and cluster physical tables is collected in advance into a search engine database, and at least one of a dimension feature, an index feature, and a time granularity feature is added to each collected piece of data. When the search engine receives the search keywords entered by a user, it first splits them to obtain a dimension keyword, an index keyword, and a time granularity keyword. It then looks up, in the pre-built search engine database, the data that matches the dimension keyword, the index keyword, and the time granularity keyword respectively, and displays the matching data to the user. The user does not need to traverse each data outlet to look up data: entering the search keywords once is enough for the search engine database to find the related data across all data outlets, which improves the efficiency of searching for data.

1‧‧‧Query terminal
2‧‧‧Search engine database
10‧‧‧User
11‧‧‧Query terminal
12‧‧‧Semantic recognition module
13‧‧‧Search engine database
14‧‧‧Sorter
15‧‧‧Data application
16‧‧‧Report
17‧‧‧Knowledge base platform
18‧‧‧Cluster physical table
19‧‧‧Syntax parser
1900‧‧‧Query terminal
1922‧‧‧Processing component
1926‧‧‧Power component
1932‧‧‧Memory
1950‧‧‧Network interface
1958‧‧‧Input/output interface

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a schematic diagram of an optional application scenario of the present invention;
FIG. 2 is a schematic structural diagram of the data processing system provided by an embodiment of the present invention;
FIG. 3 is a flowchart of the data processing method provided by Embodiment 1 of the present invention;
FIG. 4 is a flowchart of the data processing method provided by Embodiment 2 of the present invention;
FIG. 5 is a flowchart of the data processing method provided by Embodiment 3 of the present invention;
FIG. 6 is a flowchart of the data processing method provided by Embodiment 4 of the present invention;
FIG. 7 is a flowchart of the data processing method provided by Embodiment 5 of the present invention;
FIG. 8 is a flowchart of the data processing method provided by Embodiment 6 of the present invention;
FIG. 9 is a flowchart of the data processing method provided by Embodiment 7 of the present invention;
FIG. 10 is a flowchart of the data processing method provided by Embodiment 8 of the present invention;
FIG. 11 is a flowchart of the data processing method provided by Embodiment 9 of the present invention;
FIG. 12 is a schematic structural diagram of the query terminal provided by Embodiment 1 of the present invention;
FIG. 13 is a schematic structural diagram of the query terminal provided by Embodiment 2 of the present invention;
FIG. 14 is a schematic structural diagram of the query terminal provided by Embodiment 3 of the present invention;
FIG. 15 is a schematic structural diagram of the search engine database provided by an embodiment of the present invention.

Exemplary embodiments are described in detail here, and examples of them are shown in the drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention. Rather, they are merely examples of devices and methods consistent with some aspects of the present invention as detailed in the claims.

In the prior art, when a non-technical employee of a company needs "the transaction amount of home improvement products on a given day", the employee must search the company's data applications, reports, knowledge base platforms, and cluster physical tables in turn until the figure is found, which makes data lookup inefficient. To address this problem, this application proposes a data processing method, whose specific process is described below with reference to FIG. 1.

As shown in FIG. 1, a user 10 queries data through a query terminal 11. The user 10 may be a non-technical employee of a company or a consumer. The query terminal 11 may be a terminal device within the company to which the user 10 belongs, or a device such as the user 10's personal computer or notebook computer. A search engine is installed on the query terminal 11, and the user 10 can use the keyboard of the query terminal 11 to enter search keywords in the search box of the search engine, for example "transaction amount of the home improvement category in the last day". A semantic recognition module 12 splits the search keywords into a dimension keyword, an index keyword, and a time granularity keyword as used in the big data field: here the dimension keyword is "home improvement category", the index keyword is "transaction amount", and the time granularity keyword is "last day". How the semantic recognition module 12 performs this split is described in detail in the embodiments below.
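To make the three keyword types concrete, the split can be illustrated with a simple vocabulary lookup. The text defers the actual method of the semantic recognition module 12 to later embodiments, so this sketch is a stand-in technique and should not be read as that method; the vocabularies and the `split_keywords` function are hypothetical.

```python
# Hypothetical vocabularies; a real system would load these from a maintained dictionary.
DIMENSION_TERMS = {"home improvement category", "user type"}
INDEX_TERMS = {"transaction amount", "page views"}
TIME_TERMS = {"last day", "last week"}


def split_keywords(query):
    """Pick out one dimension, one index and one time granularity keyword by substring match."""
    text = query.lower()

    def longest_hit(vocabulary):
        hits = [term for term in vocabulary if term in text]
        # Prefer the longest matching term; None when nothing in the vocabulary matches.
        return max(hits, key=len) if hits else None

    return {
        "dimension": longest_hit(DIMENSION_TERMS),
        "index": longest_hit(INDEX_TERMS),
        "time_granularity": longest_hit(TIME_TERMS),
    }


result = split_keywords("transaction amount of the home improvement category in the last day")
```

A dictionary lookup like this only recognizes terms it already knows, which is one reason a dedicated semantic recognition step is described separately.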

The semantic recognition module 12 sends the dimension keyword "home improvement category", the index keyword "transaction amount", and the time granularity keyword "last day" to a search engine database 13. The data sources of the search engine database 13 include a data application 15, a report 16, a knowledge base platform 17, and a cluster physical table 18. The data application 15 may specifically be a data product, such as Alibaba's Taobao Shengyijing (淘寶生意經) or Baidu's Baidu Index. A data product is a web product delivered as web pages; it differs from an ordinary web product mainly in that it carries a large amount of data and must interact frequently with a back-end data source, which is the component that stores the data the data application 15 operates on. In this embodiment, the data in the data application 15 and the report 16 can be stored in the search engine database 13 by way of a syntax parser 19. Taking the data application 15 as an example: because it is developed with a Software Development Kit (SDK), its data can be fetched into the syntax parser 19 through the SDK. The syntax parser 19 can parse a Structured Query Language (SQL) statement to extract its dimension features, index features, time granularity feature, and the name of the table it reads. For example, given the following SQL statement:

    SELECT stat_date AS 日期
          ,user_type AS 使用者類型
          ,se_lpv_pc_1d_001 AS Pv
          ,se_uv_pc_1d_001 AS Uv
    FROM tbbi.ads_tb_log_1d
    WHERE ds='20151026'

the syntax parser 19 determines that the dimension feature is "使用者類型" (user type), the index features are "Pv" and "Uv", the time granularity feature is "last day", and the table read is "tbbi.ads_tb_log_1d". In this way, the syntax parser 19 can extract the dimension features, index features, and time granularity features of each piece of data in the data application 15 and the report 16.
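The extraction from the SQL example above can be approximated with a few regular expressions. The text does not say how the syntax parser 19 tells dimensions from indexes or where "last day" comes from, so the rules below are assumptions made only for illustration: a column whose name contains a token such as `_1d_` is treated as an index, a column whose name contains `date` is skipped, and the `_1d` suffix of the table name is mapped to "last day" through a hypothetical `GRANULARITY` table.

```python
import re

SQL = """
SELECT stat_date AS 日期, user_type AS 使用者類型,
       se_lpv_pc_1d_001 AS Pv, se_uv_pc_1d_001 AS Uv
FROM tbbi.ads_tb_log_1d
WHERE ds='20151026'
"""

# Hypothetical mapping from a table-name suffix to a time granularity label.
GRANULARITY = {"1d": "last day"}


def parse_features(sql):
    """Extract dimension aliases, index aliases, time granularity and table name."""
    select_part = re.search(r"SELECT\s+(.*?)\s+FROM\s", sql, re.S | re.I).group(1)
    table = re.search(r"FROM\s+([\w.]+)", sql, re.I).group(1)

    dimensions, indexes = [], []
    for item in select_part.split(","):
        column, alias = re.split(r"\s+AS\s+", item.strip(), flags=re.I)
        if re.search(r"_\d+[dwm]_", column):   # assumed naming pattern for metrics
            indexes.append(alias)
        elif "date" not in column:             # assumed: date columns are not dimensions
            dimensions.append(alias)

    suffix = re.search(r"_(\d+[dwm])$", table)
    granularity = GRANULARITY.get(suffix.group(1)) if suffix else None
    return {
        "dimensions": dimensions,
        "indexes": indexes,
        "time_granularity": granularity,
        "table": table,
    }


features = parse_features(SQL)
```

Regular expressions handle this single flat SELECT; nested queries, expressions without aliases, or joins would need a real SQL parser.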

The syntax parser 19 sends the parsed data to the search engine database 13, which stores not only the data itself but also the dimension features, index features, and time granularity features of the data. In addition, the search engine database 13 can store the data in the knowledge base platform 17 and the cluster physical table 18. The storage process is as follows: each piece of data in the knowledge base platform 17 and the cluster physical table 18 is split, the dimension feature of each split piece of data is extracted, and each piece of data in the knowledge base platform 17 and the cluster physical table 18 is stored in the search engine database 13 together with its dimension feature. Every piece of data stored in the search engine database 13 therefore has at least a dimension feature.

When the search engine database 13 receives the dimension keyword "home improvement category", the index keyword "transaction amount", and the time granularity keyword "last day" from the semantic recognition module 12, it looks up the data that matches the dimension keyword, the data that matches the index keyword, and the data that matches the time granularity keyword, and sends the matching data it finds to a sorter 14. If the search engine database 13 finds only one piece of matching data, the sorter 14 sends it to the query terminal 11, which displays it. If the search engine database 13 finds multiple pieces of matching data, the sorter 14 sorts them according to a preset algorithm and sends the sorted data to the query terminal 11, which displays them in sorted order. In this embodiment, the preset algorithm used by the sorter 14 includes at least one of the following: the PageRank algorithm, the CUS-distance algorithm, the document topic generation model (Latent Dirichlet Allocation, LDA) algorithm, the Breadth First Search (BFS) algorithm, and so on.
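The lookup and ordering steps can be sketched as follows. The ranking here is a deliberately simple stand-in, counting how many of the three keywords a record matches, and is not one of the preset algorithms named in the text (PageRank, LDA, BFS, and so on). The record layout and the function names `find_matches` and `rank` are hypothetical.

```python
def find_matches(records, dimension, index, time_granularity):
    """Return (hit_count, record) pairs for records matching at least one keyword."""
    matches = []
    for record in records:
        hits = sum([
            dimension is not None and dimension in record.get("dimensions", []),
            index is not None and index in record.get("indexes", []),
            time_granularity is not None
            and time_granularity == record.get("time_granularity"),
        ])
        if hits:
            matches.append((hits, record))
    return matches


def rank(matches):
    """A single match is returned as is; several are ordered by hit count, highest first."""
    if len(matches) == 1:
        return [matches[0][1]]
    ordered = sorted(matches, key=lambda pair: pair[0], reverse=True)
    return [record for _, record in ordered]


records = [
    {"name": "A", "dimensions": ["home improvement category"]},
    {"name": "B", "dimensions": ["home improvement category"],
     "indexes": ["transaction amount"], "time_granularity": "last day"},
    {"name": "C", "dimensions": ["apparel category"],
     "indexes": ["page views"], "time_granularity": "last week"},
]
ordered = rank(find_matches(records, "home improvement category",
                            "transaction amount", "last day"))
```

In this usage, record B matches all three keywords and is placed ahead of record A, which matches only the dimension keyword; record C matches nothing and is left out.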

In this embodiment, data from data applications, reports, knowledge base platforms, and cluster physical tables is collected in advance into the search engine database, and at least one of a dimension feature, an index feature, and a time granularity feature is added to each collected piece of data. When the search engine receives the search keywords entered by a user, it first splits them to obtain a dimension keyword, an index keyword, and a time granularity keyword. It then looks up, in the pre-built search engine database, the data that matches the dimension keyword, the index keyword, and the time granularity keyword respectively, and displays the matching data to the user. The user does not need to traverse each data outlet to look up data: entering the search keywords once is enough for the search engine database to find the related data across all data outlets, which improves the efficiency of searching for data.

FIG. 2 is a schematic structural diagram of the data processing system provided by an embodiment of the present invention. As shown in FIG. 2, the data processing system includes a query terminal 1 and a search engine database 2. The query terminal 1 is configured to receive a query request from a user, the query request including search keywords; the query terminal obtains the dimension keyword, index keyword, and time granularity keyword among the search keywords, and sends the dimension keyword, the index keyword, and the time granularity keyword to the search engine database.

As shown in FIG. 1, the query terminal 11 receives a query request from the user 10. The request can be made in several ways; for example, the user 10 enters text or speech into the search engine on the query terminal 11, the text or speech containing the keyword the user 10 wants to search for. As shown in FIG. 1, the semantic recognition module 12 and the sorter 14 may be modules of the query terminal 11. The semantic recognition module 12 splits the search keyword into a dimension keyword, a metric keyword, and a time-granularity keyword as these terms are used in the big data field. Specifically, the dimension keyword is "home improvement category", the metric keyword is "transaction amount", and the time-granularity keyword is "last day". The semantic recognition module 12 also sends the dimension keyword "home improvement category", the metric keyword "transaction amount", and the time-granularity keyword "last day" to the search engine database 2.

The search engine database 2 stores in advance the data from the data outlets together with feature information of that data, the feature information including at least one of the following: a dimension feature, a metric feature, and a time-granularity feature.

Optionally, in this embodiment, the data outlets include data applications, reports, knowledge base platforms, and cluster physical tables. The search engine database 13 stores the data from the data applications, reports, knowledge base platforms, and cluster physical tables, together with the feature information of each piece of data. Each piece of data from a data application has a dimension feature, a metric feature, and a time-granularity feature, while the data from reports, knowledge base platforms, and cluster physical tables all have a dimension feature.
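A minimal sketch of how one stored piece of data and its feature information might be represented. The class name `IndexedRecord`, its field names, and the outlet labels are hypothetical; the source only states that each piece of data carries at least one of the three features.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IndexedRecord:
    """One piece of data in the search engine database (hypothetical layout)."""
    source: str                              # data outlet the data came from
    content: str                             # the data itself, reduced to a label here
    dimension: Optional[str] = None          # dimension feature
    metric: Optional[str] = None             # metric feature
    time_granularity: Optional[str] = None   # time-granularity feature

    def __post_init__(self):
        # The description requires at least one of the three features.
        if not (self.dimension or self.metric or self.time_granularity):
            raise ValueError("a record needs at least one feature")


records = [
    # Data from a data application carries all three features.
    IndexedRecord("data_application", "daily home improvement transactions",
                  "home improvement category", "transaction amount", "last day"),
    # Data from a report, knowledge base platform, or cluster physical table
    # carries a dimension feature.
    IndexedRecord("report", "home improvement category report",
                  dimension="home improvement category"),
]
```

Keeping the features as separate optional fields mirrors the text: a report record can be stored with only its dimension feature, while a data application record fills in all three.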

The search engine database 2 is configured to obtain first data corresponding to the dimension feature that matches the dimension keyword, second data corresponding to the metric feature that matches the metric keyword, and third data corresponding to the time-granularity feature that matches the time-granularity keyword, and to send the first data, the second data, and the third data to the query terminal.

When the search engine database 13 receives the dimension keyword "home improvement category", the metric keyword "transaction amount", and the time-granularity keyword "last day" sent by the semantic recognition module 12, it can separately look up the data matching the dimension keyword "home improvement category", the data matching the metric keyword "transaction amount", and the data matching the time-granularity keyword "last day". The search engine database 13 can match the dimension keyword "home improvement category" identified by the semantic recognition module 12 against the dimension features of its stored data to obtain the first data corresponding to the matching dimension feature. The first data may consist of multiple pieces of data and may originate from the data application 15, the report 16, the knowledge base platform 17, or the cluster physical table 18.

In addition, the search engine database 13 can match the metric keyword "transaction amount" identified by the semantic recognition module 12 against the metric features of its stored data to obtain the second data corresponding to the matching metric feature. The second data may be multiple pieces of data originating from the data application 15.

Furthermore, the search engine database 13 can match the time-granularity keyword "last day" identified by the semantic recognition module 12 against the time-granularity features of its stored data to obtain the third data corresponding to the matching time-granularity feature. The third data may be multiple pieces of data originating from the data application 15.

The search engine database 13 sends the first data, the second data, and the third data it has obtained to the query terminal 11, specifically to the sorter 14 in the query terminal 11.

The query terminal 1 is further configured to determine, from the first data, the second data, and the third data, the target data to be returned to the user, and to display the target data to the user.

If the search engine database 13 finds only one piece of matching data, that is, the first data, the second data, and the third data are the same data, the sorter 14 sends that matching data to the display of the query terminal 11, and the display shows it.

If the search engine database 13 finds multiple pieces of matching data, that is, the first data, the second data, and the third data are not the same data, the sorter 14 sorts the matching data according to a preset algorithm and sends the sorted data to the display of the query terminal 11, which shows the matching data in sorted order.

In this embodiment, data from data applications, reports, knowledge base platforms, and cluster physical tables is collected in advance into the search engine database, and at least one of a dimension feature, a metric feature, and a time-granularity feature is added to each collected piece of data. When the search engine receives a search keyword entered by the user, it first splits the search keyword into a dimension keyword, a metric keyword, and a time-granularity keyword. It then looks up, in the pre-built search engine database, the data matching the dimension keyword, the metric keyword, and the time-granularity keyword respectively, and displays the matched data to the user. The user does not need to go through every data outlet to find data: entering the search keyword once lets the search engine database find the related data across all data outlets, which improves the efficiency of finding data.

FIG. 3 is a flowchart of the data processing method provided in Embodiment 1 of the present invention. As shown in FIG. 3, the method includes the following steps:

Step S201: The query terminal receives a query request from a user, the query request including a search keyword.

As shown in FIG. 1, the query terminal 11 receives a query request from the user 10. The request can be made in several ways. For example, the user 10 enters text or speech into the search engine on the query terminal 11, the text or speech containing the keyword the user 10 wants to search for. Alternatively, the search engine on the query terminal 11 provides a drop-down list in which keywords are stored in advance, and the user enters the keyword by selecting it from the list and clicking. As a further alternative, the user 10 previews text on the query terminal 11, selects a keyword from the previewed text, and triggers a search for it by dragging, swiping, or pressing a function key.

The user 10 queries data through the query terminal 11. The user 10 may be a non-technical employee of a company or a consumer. The query terminal 11 may be a terminal device inside the company the user 10 belongs to, or the user's own personal computer, laptop, or similar device. A search engine is installed on the query terminal 11, and the user 10 can type a search keyword into the search box using the keyboard of the query terminal 11, for example "transaction amount of the home improvement category in the last day" (最近一天家裝類目成交金額).

Step S202: The query terminal obtains the dimension keyword, the metric keyword, and the time-granularity keyword in the search keyword.

As shown in FIG. 1, the semantic recognition module 12 and the sorter 14 may be modules of the query terminal 11 or modules of the search engine database 13, and the query terminal 11 and the search engine database 13 may be connected directly or indirectly through other devices. This embodiment takes as its example the case in which the semantic recognition module 12 and the sorter 14 belong to the query terminal 11 and the query terminal 11 is directly connected to the search engine database 13.

The semantic recognition module 12 splits the search keyword into a dimension keyword, a metric keyword, and a time-granularity keyword as these terms are used in the big data field. Specifically, the dimension keyword is "home improvement category", the metric keyword is "transaction amount", and the time-granularity keyword is "last day".

Step S203: The query terminal sends the dimension keyword, the metric keyword, and the time-granularity keyword to the search engine database, so that the search engine database obtains first data corresponding to the dimension feature that matches the dimension keyword, second data corresponding to the metric feature that matches the metric keyword, and third data corresponding to the time-granularity feature that matches the time-granularity keyword.

In this embodiment, the search engine database stores in advance the data from the data outlets together with feature information of that data, the feature information including at least one of the following: a dimension feature, a metric feature, and a time-granularity feature.

The query terminal sends the dimension keyword, the metric keyword, and the time-granularity keyword to the search engine database. The search engine database stores in advance the data from the data outlets together with feature information of that data. The feature information includes at least one of a dimension feature, a metric feature, and a time-granularity feature, and the data outlets include at least one of a data application, a report, a knowledge base platform, and a cluster physical table. Optionally, in this embodiment, the data outlets include data applications, reports, knowledge base platforms, and cluster physical tables, and the search engine database 13 stores the data from all of them.

As shown in FIG. 1, the data sources of the search engine database 13 include the data application 15, the report 16, the knowledge base platform 17, and the cluster physical table 18. The data application 15 may specifically be a data product, such as Alibaba's Taobao Shengyijing (淘寶生意經) or Baidu's Baidu Index. A data product is a web product in the form of web pages; its main difference from an ordinary web product is that it carries a large amount of data and must interact frequently with a back-end data source, namely the device that stores the data the data application 15 operates on. In this embodiment, the data in the data application 15 can be stored in the search engine database 13 by way of the syntax parser 19; specifically, the data in the data application 15 is fetched into the syntax parser 19 through an SDK. The syntax parser 19 can parse, from a piece of Structured Query Language (SQL), the dimension feature, the metric feature, the time-granularity feature, and the name of the table being read. For example, consider the following SQL, in which the aliases 日期 and 使用者類型 mean "date" and "user type":

    SELECT stat_date AS 日期
          ,user_type AS 使用者類型
          ,se_lpv_pc_1d_001 AS Pv
          ,se_uv_pc_1d_001 AS Uv
    FROM tbbi.ads_tb_log_1d
    WHERE ds='20151026'

The syntax parser 19 determines that, for this SQL, the dimension feature is "使用者類型" (user type), the metric features are "Pv" and "Uv", the time-granularity feature is "last day", and the table read is "tbbi.ads_tb_log_1d". In this way, the syntax parser 19 can extract the dimension, metric, and time-granularity features of each piece of data in the data application 15. The syntax parser 19 sends the parsed data to the search engine database 13, which stores not only the data itself but also its dimension, metric, and time-granularity features.
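The source does not say how the syntax parser 19 decides which column is a dimension and which is a metric. The sketch below is only an illustration under loudly stated assumptions: a column whose name contains a period token such as `_1d_` is treated as a metric, that token is mapped to a time-granularity label through the hypothetical table `PERIOD_LABELS`, date columns listed in the hypothetical set `DATE_COLUMNS` are skipped, and every remaining column is treated as a dimension. The function name `parse_sql` is likewise hypothetical, and regular expressions stand in for a real SQL parser.

```python
import re

# Hypothetical conventions, not stated in the source:
PERIOD_LABELS = {"1d": "last day"}   # period token -> time-granularity label
DATE_COLUMNS = {"stat_date", "ds"}   # date columns are neither dimension nor metric

SQL = """
SELECT stat_date AS 日期
      ,user_type AS 使用者類型
      ,se_lpv_pc_1d_001 AS Pv
      ,se_uv_pc_1d_001 AS Uv
FROM tbbi.ads_tb_log_1d
WHERE ds='20151026'
"""


def parse_sql(sql: str) -> dict:
    """Extract features and the table name from a simple SELECT ... FROM statement."""
    select_list = re.search(r"SELECT\s+(.*?)\s+FROM\s", sql, re.I | re.S).group(1)
    table = re.search(r"FROM\s+([\w.]+)", sql, re.I).group(1)
    dimensions, metrics, periods = [], [], []
    for item in select_list.split(","):
        # Every select item is assumed to have the form "<column> AS <alias>".
        column, alias = re.split(r"\s+AS\s+", item.strip(), flags=re.I)
        if column in DATE_COLUMNS:
            continue
        period = re.search(r"_(\d+[dwm])_", column)
        if period:
            metrics.append(alias)
            label = PERIOD_LABELS.get(period.group(1), period.group(1))
            if label not in periods:
                periods.append(label)
        else:
            dimensions.append(alias)
    return {
        "table": table,
        "dimensions": dimensions,
        "metrics": metrics,
        "time_granularity": periods,
    }


result = parse_sql(SQL)
```

With these assumptions, `result` reproduces the features listed in the text for the example SQL. A production parser would need a real SQL grammar and an explicit catalogue of which columns are dimensions and which are metrics.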

In addition, the search engine database 13 can store the data from the report 16, the knowledge base platform 17, and the cluster physical table 18. The storage process is as follows: each piece of data in the report 16, the knowledge base platform 17, and the cluster physical table 18 is split, the dimension feature of each resulting piece of data is extracted, and each piece of data is stored in the search engine database 13 together with its dimension feature. As a result, every piece of data stored in the search engine database 13 has at least a dimension feature.

When the search engine database 13 receives the dimension keyword "home improvement category", the metric keyword "transaction amount", and the time-granularity keyword "last day" sent by the semantic recognition module 12, it can separately look up the data matching the dimension keyword "home improvement category", the data matching the metric keyword "transaction amount", and the data matching the time-granularity keyword "last day".

In this embodiment, the search engine database 13 stores the data from the data application 15 together with the dimension, metric, and time-granularity features of each piece of that data. It also stores the data from the report 16, the knowledge base platform 17, and the cluster physical table 18 together with the dimension feature of each piece of that data. The dimension features of different pieces of data in the search engine database 13 may be the same or different, and the same holds for their metric features and their time-granularity features.

The search engine database 13 in this embodiment can match the dimension keyword "home improvement category" identified by the semantic recognition module 12 against the dimension features of its stored data to obtain the first data corresponding to the matching dimension feature. The first data may consist of multiple pieces of data and may originate from the data application 15, the report 16, the knowledge base platform 17, or the cluster physical table 18.

In addition, the search engine database 13 can match the metric keyword "transaction amount" identified by the semantic recognition module 12 against the metric features of its stored data to obtain the second data corresponding to the matching metric feature. The second data may be multiple pieces of data originating from the data application 15.

Furthermore, the search engine database 13 can match the time-granularity keyword "last day" identified by the semantic recognition module 12 against the time-granularity features of its stored data to obtain the third data corresponding to the matching time-granularity feature. The third data may be multiple pieces of data originating from the data application 15.
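The three lookups described above can be sketched as follows. This is an illustration only: the `Record` class, the sample `DATABASE`, and the `lookup` function are hypothetical, and matching is simplified to exact string equality, which the source does not prescribe.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Record:
    name: str
    source: str                              # data outlet of origin
    dimension: Optional[str] = None
    metric: Optional[str] = None
    time_granularity: Optional[str] = None


# Hypothetical sample content of the search engine database.
DATABASE = [
    Record("home improvement daily transactions", "data_application",
           "home improvement category", "transaction amount", "last day"),
    Record("home improvement weekly visitors", "data_application",
           "home improvement category", "visitors", "last week"),
    Record("home improvement category report", "report",
           dimension="home improvement category"),
    Record("apparel daily transactions", "data_application",
           "apparel category", "transaction amount", "last day"),
]


def lookup(dimension_kw: str, metric_kw: str, time_kw: str,
           database: List[Record] = DATABASE
           ) -> Tuple[List[Record], List[Record], List[Record]]:
    """Return the first, second, and third data for the three keywords."""
    first = [r for r in database if r.dimension == dimension_kw]
    second = [r for r in database if r.metric == metric_kw]
    third = [r for r in database if r.time_granularity == time_kw]
    return first, second, third


first, second, third = lookup("home improvement category",
                              "transaction amount", "last day")
```

Here the first data spans two outlets (the data application and the report), while the second and third data come only from the data application, consistent with the text: only data application records carry metric and time-granularity features.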

Step S204: The query terminal receives the first data, the second data, and the third data sent by the search engine database.

The search engine database 13 sends the first data, the second data, and the third data it has obtained to the query terminal 11, specifically to the sorter 14 in the query terminal 11.

Step S205: The query terminal determines, from the first data, the second data, and the third data, the target data to be returned to the user.

If the search engine database 13 finds only one piece of matching data, that is, the first data, the second data, and the third data are the same data, the sorter 14 sends that matching data to the display of the query terminal 11, and the display shows it.

If the search engine database 13 finds multiple pieces of matching data, that is, the first data, the second data, and the third data are not the same data, the sorter 14 sorts the matching data according to a preset algorithm and sends the sorted data to the display of the query terminal 11, which shows the matching data in sorted order.

In this embodiment, data from data applications, reports, knowledge base platforms, and cluster physical tables is collected in advance into the search engine database, and at least one of a dimension feature, a metric feature, and a time-granularity feature is added to each collected piece of data. When the search engine receives a search keyword entered by the user, it first splits the search keyword into a dimension keyword, a metric keyword, and a time-granularity keyword. It then looks up, in the pre-built search engine database, the data matching the dimension keyword, the metric keyword, and the time-granularity keyword respectively, and displays the matched data to the user. The user does not need to go through every data outlet to find data: entering the search keyword once lets the search engine database find the related data across all data outlets, which improves the efficiency of finding data.

FIG. 4 is a flowchart of the data processing method provided in Embodiment 2 of the present invention. As shown in FIG. 4, building on the embodiment shown in FIG. 3, the way the query terminal obtains the dimension keyword, the metric keyword, and the time-granularity keyword in the search keyword may specifically include the following steps:

Step S301: The query terminal performs word segmentation on the search keyword to obtain multiple target segments.

For example, as described in step S201, the search keyword entered by the user is "transaction amount of the home improvement category in the last day" (最近一天家裝類目成交金額). The query terminal 11 can split the search keyword entered by the user using the TF-IDF algorithm to obtain multiple target segments, namely "last day" (最近一天), "home improvement category" (家裝類目), and "transaction amount" (成交金額).

Step S302: The query terminal looks up each target segment in a preset mapping table, the mapping table including dimension segments, metric segments, and time-granularity segments.

In this embodiment, the query terminal 11 builds the mapping table in advance. The mapping table includes dimension segments, metric segments, and time-granularity segments: the dimension segments are words that have a dimension feature, the metric segments are words that have a metric feature, and the time-granularity segments are words that have a time-granularity feature, and there may be many of each. Using the target segments produced in step S301, the query terminal 11 queries the mapping table and, for each target segment, determines whether the mapping table contains a segment that matches it.

Step S303: The query terminal determines the target segment that matches a dimension segment to be the dimension keyword.

For example, if "home improvement category" among the target segments matches a dimension segment in the mapping table, "home improvement category" is taken as the dimension keyword of the search keyword.

Step S304: The query terminal determines the target segment that matches a metric segment to be the metric keyword.

For example, if "transaction amount" among the target segments matches a metric segment in the mapping table, "transaction amount" is taken as the metric keyword of the search keyword.

Step S305: The query terminal determines the target segment that matches a time-granularity segment to be the time-granularity keyword.

For example, if "last day" among the target segments matches a time-granularity segment in the mapping table, "last day" is taken as the time-granularity keyword of the search keyword.
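Steps S302 to S305 amount to a table lookup per target segment, which can be sketched as below. The contents of `MAPPING_TABLE` and the function name `classify_segments` are hypothetical, and the segmentation of step S301 is assumed to have already produced the list of target segments.

```python
# Hypothetical preset mapping table: segment kind -> known segments of that kind.
MAPPING_TABLE = {
    "dimension": {"home improvement category", "apparel category"},
    "metric": {"transaction amount", "visitors"},
    "time_granularity": {"last day", "last week"},
}


def classify_segments(segments):
    """Assign each target segment to the keyword kind it matches, if any."""
    keywords = {"dimension": None, "metric": None, "time_granularity": None}
    for segment in segments:
        for kind, known_segments in MAPPING_TABLE.items():
            # Keep the first match per kind; segments matching nothing are ignored.
            if segment in known_segments and keywords[kind] is None:
                keywords[kind] = segment
    return keywords


# Target segments as produced by step S301 for the example query.
keywords = classify_segments(
    ["last day", "home improvement category", "transaction amount"]
)
```

Storing each kind as a set keeps the per-segment check to a constant-time membership test, so classification cost grows only with the number of target segments.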

In this embodiment, word segmentation of the search keyword yields multiple target segments, and the dimension keyword, the metric keyword, and the time-granularity keyword among those segments are identified by querying the pre-built mapping table. This improves the efficiency of determining the dimension, metric, and time-granularity keywords in the search keyword.

FIG. 5 is a flowchart of the data processing method provided in Embodiment 3 of the present invention. As shown in FIG. 5, this embodiment can build on any of the foregoing embodiments; taking Embodiment 2 as the basis, the specific steps of the data processing method provided in this embodiment are as follows:

Step S401: The query terminal receives a query request from a user, the query request including a search keyword.

Step S402: The query terminal obtains the dimension keyword, the metric keyword, and the time-granularity keyword in the search keyword.

Step S403: The query terminal sends the dimension keyword, the metric keyword, and the time-granularity keyword to the search engine database, so that the search engine database obtains first data corresponding to the dimension feature that matches the dimension keyword, second data corresponding to the metric feature that matches the metric keyword, and third data corresponding to the time-granularity feature that matches the time-granularity keyword.

Step S404: The query terminal receives the first data, the second data, and the third data sent by the search engine database.

Steps S401 to S404 are the same as steps S201 to S204 respectively, and their details are not repeated here.

Step S405: The query terminal determines whether the first data, the second data, and the third data are the same data. If so, step S406 is performed; otherwise, step S407 is performed.

Step S406: The query terminal determines that same data to be the target data returned to the user.

As shown in FIG. 1, if the search engine database 13 finds only one piece of matching data, that is, the first data, the second data, and the third data are the same data, the sorter 14 sends that matching data to the display of the query terminal 11, and the display shows it.

Step S407: The query terminal sorts the first data, the second data, and the third data, and determines the sorted data to be the target data returned to the user.

If the search engine database 13 finds multiple pieces of matching data, that is, the first data, the second data, and the third data are not the same data, the sorter 14 sorts the matching data according to a preset algorithm and sends the sorted data to the display of the query terminal 11, which shows the matching data in sorted order.
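The branch in steps S405 to S407 can be sketched as follows. The function name `choose_target` and its `rank` parameter are hypothetical; pieces of data are represented by plain strings, and the sorter's preset algorithm is passed in as a callable because the branch itself does not depend on how sorting is done.

```python
from typing import Callable, List


def choose_target(first: List[str], second: List[str], third: List[str],
                  rank: Callable[[List[str]], List[str]]) -> List[str]:
    """Return the target data to display, following steps S405 to S407."""
    # Merge the three result sets, dropping duplicates but keeping first-seen order.
    merged = list(dict.fromkeys([*first, *second, *third]))
    if len(merged) == 1:
        # S406: all three lookups found the same single piece of data.
        return merged
    # S407: several distinct pieces of data, so hand them to the sorter.
    return rank(merged)


single = choose_target(["report A"], ["report A"], ["report A"], rank=sorted)
several = choose_target(["report B", "report A"], ["report A"], ["report C"],
                        rank=sorted)
```

In the usage lines, `sorted` (alphabetical order) merely stands in for the preset sorting algorithm.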

In step S407, the way the query terminal sorts the first data, the second data, and the third data may specifically include the following steps:

Step S51: The query terminal calculates a weight value for each piece of data in the first data, the second data, and the third data.

Specifically, the weight value of each piece of data can be calculated with the PageRank algorithm.

Step S52: The query terminal calculates the similarity between each piece of data in the first data, the second data, and the third data and the search keyword.

Specifically, a cosine (cos) distance algorithm can be used to calculate the similarity between each piece of data and the search keyword entered by the user.
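A minimal sketch of one way to score the similarity in step S52, assuming a cosine measure over word-count vectors. The whitespace tokenization and the function name `cosine_similarity` are assumptions made for illustration; the source does not describe how the text of a piece of data is turned into a vector.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between the word-count vectors of two texts."""
    counts_a, counts_b = Counter(text_a.split()), Counter(text_b.split())
    # Only words present in both texts contribute to the dot product.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


score = cosine_similarity("last day transaction amount", "transaction amount")
```

The result lies between 0 (no words in common) and 1 (identical word distributions), so it can be combined directly with a weight value on a comparable scale.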

Step S53: The query terminal calculates a ranking value for each piece of data from its weight value and its similarity.

Specifically, the sum of a piece of data's weight value and similarity can be used as its ranking value.

Step S54: The query terminal sorts each piece of data in the first data, the second data, and the third data according to its ranking value.

Specifically, the pieces of data in the first data, the second data, and the third data can be sorted in descending order of ranking value.

Optionally, the query terminal determines, from the ranking values, which pieces of data in the first data, the second data, and the third data have a ranking value greater than a first threshold, and sorts only those pieces of data by ranking value.

另外,計算出所述第一資料、所述第二資料和所述第三資料中每個資料的排序值後,可確定出所述第一資料、所述第二資料和所述第三資料中排序值大於第一閾值的資料,並對排序值大於第一閾值的資料,按照所述排序值的大小進行排序。 In addition, after calculating a ranking value of each of the first data, the second data, and the third data, the first data, the second data, and the third data may be determined. The data with a middle ranking value greater than the first threshold is sorted according to the size of the ranking value.
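Steps S51 to S54 and the optional threshold filter can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation described in the patent: the `DataItem` class, its field names, and the function names are hypothetical, and the weight value and similarity are assumed to have been computed already in steps S51 and S52.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataItem:
    name: str          # hypothetical identifier for a piece of data
    weight: float      # weight value from step S51 (assumed precomputed)
    similarity: float  # similarity to the search keyword from step S52


def ranking_value(item: DataItem) -> float:
    # Step S53: ranking value = weight value + similarity.
    return item.weight + item.similarity


def rank_items(items: List[DataItem],
               first_threshold: Optional[float] = None) -> List[DataItem]:
    # Optional filter: keep only items whose ranking value exceeds the threshold.
    if first_threshold is not None:
        items = [i for i in items if ranking_value(i) > first_threshold]
    # Step S54: sort in descending order of ranking value.
    return sorted(items, key=ranking_value, reverse=True)
```

Calling `rank_items(items)` sorts every item, while `rank_items(items, first_threshold=0.5)` corresponds to the optional variant that sorts only the items above the first threshold.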

In this embodiment, the multiple data items that the search engine database finds to match the search keyword are sorted. Sorting is based on the ranking value of each data item, which depends on the data item's weight value and on its similarity to the search keyword: the larger the ranking value, the more strongly the data item is associated with the search keyword. The sorted data items are returned to the user, who can conveniently see the data most relevant to the search keyword, which improves the user experience.

FIG. 6 is a flowchart of a data processing method provided in Embodiment 4 of the present invention. As shown in FIG. 6, on the basis of any of the foregoing embodiments, and taking Embodiment 2 as the basis, the specific steps of the data processing method provided in this embodiment are as follows:

步驟S601、查詢終端接收使用者的查詢請求,所述查詢請求包括檢索關鍵字。 Step S601: The query terminal receives a query request from a user, where the query request includes a search keyword.

Step S602: The query terminal obtains a dimension keyword, an index keyword, and a time-granularity keyword from the search keyword.

Step S603: The query terminal sends the dimension keyword, the index keyword, and the time-granularity keyword to a search engine database, so that the search engine database obtains first data corresponding to a dimension feature matching the dimension keyword, second data corresponding to an index feature matching the index keyword, and third data corresponding to a time-granularity feature matching the time-granularity keyword.

步驟S604、所述查詢終端接收所述搜尋引擎資料庫發送的所述第一資料、所述第二資料和所述第三資料。 Step S604: The query terminal receives the first data, the second data, and the third data sent by the search engine database.

步驟S605、所述查詢終端根據所述第一資料、所述第二資料和所述第三資料,確定回饋給所述使用者的目標資料。 Step S605: The query terminal determines target data to be returned to the user according to the first data, the second data, and the third data.

步驟S601-步驟S605分別與步驟S201-步驟S205一致,具體方法此處不再贅述。 Steps S601 to S605 are consistent with steps S201 to S205, respectively, and specific methods are not described herein again.

步驟S606、所述查詢終端接收所述使用者對所述目標資料的點擊操作。 Step S606: The query terminal receives the user's click operation on the target data.

After step S407, the sorted target data items may be displayed on the query terminal, and the user can click on them through the query terminal to view them. When the user clicks a target data item, the query terminal receives the user's click operation on that target data item.

步驟S607、所述查詢終端根據所述點擊操作建立所述使用者與所述目標資料的關聯關係。 Step S607: The query terminal establishes an association relationship between the user and the target data according to the click operation.

所述關聯關係包括關聯度,所述關聯度標識所述使用者與所述目標資料的關聯程度。 The association relationship includes a degree of association that identifies a degree of association between the user and the target material.

In this embodiment, the query terminal establishes the association relationship between the user and the target data according to the click operation generated when the user clicks a target data item. In addition, the degree of association between the user and each target data item the user has clicked may be calculated according to association rules and collaborative filtering rules. The user may have clicked more than one target data item.

步驟S608、當使用者未輸入所述檢索關鍵字時,所述查詢終端根據所述關聯關係顯示所述目標資料。 Step S608: When the user does not enter the search keyword, the query terminal displays the target data according to the association relationship.

當使用者在查詢終端11未輸入檢索關鍵字時,查詢終端11可根據使用者與其點擊過的目標資料之間的關聯關係顯示該目標資料,即查詢終端11可將使用者點擊過的目標資料顯示給使用者。 When the user does not enter a search keyword in the query terminal 11, the query terminal 11 may display the target data according to the association relationship between the user and the target data that the user clicked, that is, the query terminal 11 may display the target data that the user clicked. Visible to users.

Specifically, the association relationship includes a degree of association, which indicates how strongly the user is associated with the target data. The query terminal displaying the target data according to the association relationship includes: the query terminal displaying the target data whose degree of association is greater than a second threshold.

Optionally, the query terminal displays the target data whose degree of association is greater than the second threshold. The association relationship between the user and each target data item the user has clicked also includes the degree of association between the user and that target data item, so the query terminal 11 may also display the clicked target data items whose degree of association is greater than the second threshold.
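The click-based association and the second-threshold display rule can be sketched as follows. This is a hedged Python illustration only: the patent does not specify how the degree of association is computed beyond mentioning association rules and collaborative filtering rules, so the sketch simply uses the click count as a stand-in. The class name, method names, and identifiers are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


class ClickAssociations:
    """Tracks a per-user degree of association with clicked target data."""

    def __init__(self) -> None:
        # user id -> target data id -> degree of association
        self._degree: Dict[str, Dict[str, float]] = defaultdict(
            lambda: defaultdict(float))

    def record_click(self, user_id: str, target_id: str) -> None:
        # Assumption: each click raises the degree of association by one.
        self._degree[user_id][target_id] += 1.0

    def targets_to_display(self, user_id: str,
                           second_threshold: float) -> List[str]:
        # Used when the user has not entered a search keyword (step S608).
        targets = self._degree.get(user_id, {})
        kept = [(d, t) for t, d in targets.items() if d > second_threshold]
        # Strongest association first; ties broken by identifier.
        kept.sort(key=lambda pair: (-pair[0], pair[1]))
        return [t for _, t in kept]
```

A real system would replace the click counter with whatever association-rule or collaborative-filtering score it actually uses; only the thresholding step mirrors the text above.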

In this embodiment, an association relationship is established between the user and the target data the user has clicked. When the user has not entered a search keyword, the target data the user has clicked can be displayed according to that association relationship, which makes it more convenient for the user to query data.

圖7為本發明實施例五提供的資料處理方法的流程圖,如圖7所示,本實施例提供的資料處理方法的具體步驟如下: FIG. 7 is a flowchart of a data processing method provided by Embodiment 5 of the present invention. As shown in FIG. 7, the specific steps of the data processing method provided by this embodiment are as follows:

步驟S501、查詢終端接收使用者的查詢請求,所述查詢請求包括檢索關鍵字。 Step S501: The query terminal receives a query request from a user, where the query request includes a search keyword.

As shown in FIG. 1, the query terminal 11 receives a query request from the user 10. The query request may take various forms; for example, the user 10 enters text or speech into the search engine on the query terminal 11, and the text or speech includes the keyword that the user 10 wants to search for.

步驟S502、所述查詢終端至少獲取所述檢索關鍵字中的兩類關鍵字。 Step S502: The query terminal obtains at least two types of keywords in the search keywords.

In this embodiment, when the query terminal classifies the search keyword that the user asks to query, it need not be limited to the three categories of dimension keyword, index keyword, and time-granularity keyword, because not every search keyword contains all three categories. Therefore, the semantic recognition module 12 corresponding to the query terminal 11 shown in FIG. 1 may instead split the search keyword into at least two categories of keywords. For example, if the user is a seller and the search keyword is "有客戶評價我的商品嗎" ("Has any customer evaluated my products?"), the verb "evaluate" (評價) and the noun "product" (商品) can be split out.

步驟S503、所述查詢終端將至少兩類關鍵字發送給搜尋引擎資料庫,以使所述搜尋引擎資料庫獲取與所述至少兩類關鍵字分別對應的來源資料。 Step S503: The query terminal sends at least two types of keywords to a search engine database, so that the search engine database obtains source data corresponding to the at least two types of keywords respectively.

The query terminal sends the verb "evaluate" and the noun "product" to the search engine database, which stores the product information of all of the seller's products as well as the evaluation information for each product. Based on "product", the search engine database obtains the product information of all of the seller's products, which specifically includes the name, place of origin, material, and so on; based on "evaluate", it obtains the evaluation information of all the products.

步驟S504、所述查詢終端接收所述搜尋引擎資料庫發送的所述來源資料。 Step S504: The query terminal receives the source data sent by the search engine database.

The search engine database sends the product information and the evaluation information to the query terminal. There may be multiple pieces of product information here, and likewise multiple pieces of evaluation information.

步驟S505、所述查詢終端根據所述來源資料,確定回饋給所述使用者的目標資料。 Step S505: The query terminal determines target data to be returned to the user according to the source data.

The query terminal may, according to the number of pieces of evaluation information for each product, return to the user the product information of the product with the most evaluation information; alternatively, it may return to the user the first few pieces of evaluation information for each product. This embodiment does not limit the specific way in which the query terminal determines the target data returned to the user.
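The two example strategies for step S505 can be sketched as follows. This is a minimal Python illustration with hypothetical function names and data shapes (product information and evaluation information keyed by a product identifier); the embodiment itself leaves the concrete strategy open.

```python
from typing import Dict, List, Optional


def product_with_most_evaluations(
        products: Dict[str, dict],
        evaluations: Dict[str, List[str]]) -> Optional[dict]:
    # Return the product information of the product that has the most
    # pieces of evaluation information, or None if there are no products.
    if not products:
        return None
    best_id = max(products, key=lambda pid: len(evaluations.get(pid, [])))
    return products[best_id]


def first_evaluations(evaluations: Dict[str, List[str]],
                      n: int) -> Dict[str, List[str]]:
    # Alternative strategy: the first n pieces of evaluation information
    # for every product.
    return {pid: items[:n] for pid, items in evaluations.items()}
```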

In this embodiment, the search keyword is classified, and the classification result is not limited to dimension keywords, index keywords, and time-granularity keywords. This makes the classification of search keywords more flexible, makes retrieval based on the search keyword more flexible, and also broadens the retrieval scope.

圖8為本發明實施例六提供的資料處理方法的流程圖,如圖8所示,本實施例提供的資料處理方法的具體步驟如下: FIG. 8 is a flowchart of a data processing method provided in Embodiment 6 of the present invention. As shown in FIG. 8, the specific steps of the data processing method provided in this embodiment are as follows:

Step S701: The search engine database receives a dimension keyword, an index keyword, and a time-granularity keyword sent by the query terminal.

The dimension keyword, the index keyword, and the time-granularity keyword are obtained by the query terminal from the search keyword included in a query request that the query terminal receives from a user.

In this embodiment, the search engine database stores in advance the data in the data outlets together with feature information of the data, the feature information including at least one of the following: a dimension feature, an index feature, and a time-granularity feature.

The data outlets include at least one of the following: a data application, a report, a knowledge base platform, and a cluster physical table.

Step S702: The search engine database obtains first data corresponding to a dimension feature matching the dimension keyword, second data corresponding to an index feature matching the index keyword, and third data corresponding to a time-granularity feature matching the time-granularity keyword.

Step S703: The search engine database sends the first data, the second data, and the third data to the query terminal, so that the query terminal determines, according to the first data, the second data, and the third data, the target data to be returned to the user.

本實施例所述方法的原理與圖3所示實施例方法的原理一致,具體過程此處不再贅述。 The principle of the method in this embodiment is consistent with the principle of the method in the embodiment shown in FIG. 3, and the specific process is not repeated here.

In this embodiment, data from the data application, the report, the knowledge base platform, and the cluster physical table is collected into the search engine database in advance, and at least one of a dimension feature, an index feature, and a time-granularity feature is added to each collected data item. When the search engine receives a search keyword entered by the user, it first splits the search keyword to obtain a dimension keyword, an index keyword, and a time-granularity keyword; it then looks up, in the pre-built search engine database, the data matching the dimension keyword, the index keyword, and the time-granularity keyword respectively, and displays the matching data to the user. The user does not need to go through each data outlet to look for data: by entering the search keyword once, the user lets the search engine database find the data related to that keyword across all data outlets, which improves the efficiency of finding data.
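The lookup performed by the search engine database in steps S701 to S703 can be sketched as follows. This is a small in-memory Python illustration under stated assumptions: the `StoredData` record, the `SearchEngineDatabase` class, and the use of exact string equality as the matching rule are all hypothetical, since the text does not say how a keyword is matched against a feature.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class StoredData:
    name: str                               # hypothetical data identifier
    outlet: str                             # data outlet the item came from
    dimension: Optional[str] = None         # dimension feature, if any
    index: Optional[str] = None             # index feature, if any
    time_granularity: Optional[str] = None  # time-granularity feature, if any


class SearchEngineDatabase:
    def __init__(self, records: List[StoredData]) -> None:
        self._records = list(records)

    def lookup(self, dimension_kw: Optional[str], index_kw: Optional[str],
               time_kw: Optional[str]
               ) -> Tuple[List[StoredData], List[StoredData], List[StoredData]]:
        # Assumption: a keyword matches a feature when the strings are equal.
        first = [r for r in self._records
                 if dimension_kw is not None and r.dimension == dimension_kw]
        second = [r for r in self._records
                  if index_kw is not None and r.index == index_kw]
        third = [r for r in self._records
                 if time_kw is not None and r.time_granularity == time_kw]
        return first, second, third
```

The three returned lists play the role of the first, second, and third data that are sent back to the query terminal.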

圖9為本發明實施例七提供的資料處理方法的流程圖,如圖9所示,本實施例提供的資料處理方法的具體步驟如下: FIG. 9 is a flowchart of a data processing method provided in Embodiment 7 of the present invention. As shown in FIG. 9, specific steps of the data processing method provided in this embodiment are as follows:

步驟S801、所述搜尋引擎資料庫儲存所述資料應用程式、所述報表、所述知識庫平臺以及所述集群物理表中的資料。 Step S801: The search engine database stores data in the data application program, the report, the knowledge base platform, and the cluster physical table.

在圖3所示實施例的基礎上,在接收使用者輸入的檢索關鍵字之前,搜尋引擎資料庫13預先儲存有資料應用程式、報表、知識庫平臺以及集群物理表中的資料。 Based on the embodiment shown in FIG. 3, before receiving the search keywords input by the user, the search engine database 13 previously stores data in data applications, reports, knowledge base platforms, and cluster physical tables.

Specifically, the data in the data application 15 may be stored in the search engine database 13 via the syntax parser 19; more specifically, the data in the data application 15 is fetched into the syntax parser 19 through the SDK. The syntax parser 19 can parse out the dimension feature, the index feature, the time-granularity feature, and the name of the table read from a piece of Structured Query Language (SQL). For example, consider the following SQL:

SELECT stat_date AS 日期,
       user_type AS 使用者類型,
       se_lpv_pc_1d_001 AS Pv,
       se_uv_pc_1d_001 AS Uv
FROM tbbi.ads_tb_log_1d
WHERE ds = '20151026'

The syntax parser 19 can parse out that the dimension feature of this SQL is "使用者類型" (user type), the index features are "Pv" and "Uv", the time-granularity feature is "the most recent day", and the table read is "tbbi.ads_tb_log_1d". In this way, the syntax parser 19 can parse out the dimension feature, index feature, and time-granularity feature of each data item in the data application 15. The syntax parser 19 sends the parsed data to the search engine database 13, which stores not only the data itself but also its dimension feature, index feature, and time-granularity feature.

In addition, the search engine database 13 may also store the data in the report 16, the knowledge base platform 17, and the cluster physical table 18. The storage process is as follows: each data item in the report 16, the knowledge base platform 17, and the cluster physical table 18 is split; the dimension feature of each split data item is extracted; and each data item in the report 16, the knowledge base platform 17, and the cluster physical table 18, together with its dimension feature, is stored in the search engine database 13. As a result, every data item stored in the search engine database 13 has at least a dimension feature.

Step S802: The search engine database receives a dimension keyword, an index keyword, and a time-granularity keyword sent by the query terminal. The dimension keyword, the index keyword, and the time-granularity keyword are obtained by the query terminal from the search keyword included in a query request that the query terminal receives from a user.

In this embodiment, the search engine database stores in advance the data in the data outlets together with feature information of the data, the feature information including at least one of the following: a dimension feature, an index feature, and a time-granularity feature.

The data outlets include at least one of the following: a data application, a report, a knowledge base platform, and a cluster physical table.

Step S803: The search engine database obtains first data corresponding to a dimension feature matching the dimension keyword, second data corresponding to an index feature matching the index keyword, and third data corresponding to a time-granularity feature matching the time-granularity keyword.

Step S804: The search engine database sends the first data, the second data, and the third data to the query terminal, so that the query terminal determines, according to the first data, the second data, and the third data, the target data to be returned to the user.

The principle of the method in steps S802 to S804 is the same as that of steps S701 to S703 and is not repeated here.

In this embodiment, data from the data application, the report, the knowledge base platform, and the cluster physical table is collected into the search engine database in advance, and at least one of a dimension feature, an index feature, and a time-granularity feature is added to each collected data item. When the search engine receives a search keyword entered by the user, it first splits the search keyword to obtain a dimension keyword, an index keyword, and a time-granularity keyword; it then looks up, in the pre-built search engine database, the data matching the dimension keyword, the index keyword, and the time-granularity keyword respectively, and displays the matching data to the user. The user does not need to go through each data outlet to look for data: by entering the search keyword once, the user lets the search engine database find the data related to that keyword across all data outlets, which improves the efficiency of finding data.

FIG. 10 is a flowchart of a data processing method provided in Embodiment 8 of the present invention. As shown in FIG. 10, the storing, by the search engine database, of the data in the data application, the report, the knowledge base platform, and the cluster physical table may specifically include the following steps S901 and S902:

步驟S901、所述搜尋引擎資料庫儲存所述資料應用程式中的資料。 Step S901: The search engine database stores data in the data application.

步驟S901的可以透過如下步驟S11-S13來實現: Step S901 can be implemented through the following steps S11-S13:

步驟S11、所述搜尋引擎資料庫獲取所述資料應用程式訪問資料來源的訪問邏輯。 Step S11: The search engine database acquires the access logic of the data application to access the data source.

所述訪問邏輯包括所述資料應用程式中的資料,所述資料來源儲存有所述資料的產出邏輯。 The access logic includes data in the data application, and the data source stores output logic of the data.

本實施例介紹將所述資料應用程式中的資料儲存到所述搜尋引擎資料庫的方法,且本實施例所述的方法不同於上述實施例所述的透過語法解析器19將資料應用程式15中的資料儲存在搜尋引擎資料庫13的方法。 This embodiment describes a method for storing data in the data application program into the search engine database, and the method described in this embodiment is different from the data application program 15 through the parser 19 described in the above embodiment. The data in the search engine database 13 is stored.

In this embodiment, the data application may specifically take the form of a Web page that needs to interact frequently with a back-end data source; the back-end data source may specifically be a device that stores the operation data of the data application. Because the data application is developed on the basis of the SDK, the SDK has the highest level of operating permission over the data application, so the first access logic, by which the data application accesses the back-end data source, can be captured through the SDK. The first access logic includes fields such as the time at which the data application accessed the back-end data source and the second access logic, by which the user accesses the data application. The user's second access logic can therefore be obtained using existing parsing methods. The second access logic in turn includes field information such as the time at which the user accessed the data application and the data in the data application that the user is currently accessing. That data can therefore likewise be obtained using existing parsing methods.

Suppose the user's second access logic for the data application is:

SELECT stat_date AS 日期,
       user_type AS 使用者類型,
       se_lpv_pc_1d_001 AS Pv,
       se_uv_pc_1d_001 AS Uv
FROM tbbi.ads_tb_log_1d
WHERE ds = '20151026'

By parsing the second access logic above, the data in the data application that the user is currently accessing is found to be "tbbi.ads_tb_log_1d", that is, the information following the FROM keyword.
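Reading the accessed data from the FROM clause can be sketched with a regular expression. This is a deliberately simple Python illustration, not a full SQL parser: the function name is hypothetical, and the pattern only handles a single table name directly after FROM, as in the example above.

```python
import re
from typing import Optional


def accessed_table(access_sql: str) -> Optional[str]:
    # Take the identifier (optionally schema-qualified) that follows FROM.
    match = re.search(r"\bFROM\s+([\w.]+)", access_sql, re.IGNORECASE)
    return match.group(1) if match else None
```

Queries with joins, subqueries, or quoted identifiers would need a real SQL parser rather than this pattern.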

另外,後臺資料來源中儲存有每個資料的產出邏輯,因此,在後臺資料來源中,可直接查找使用者當前所訪問的資料應用程式中的資料的產出邏輯。 In addition, the output logic of each data is stored in the background data source. Therefore, in the background data source, the output logic of the data in the data application currently accessed by the user can be directly found.

步驟S12、所述搜尋引擎資料庫根據所述產出邏輯,確定所述資料應用程式中的資料的特徵資訊。 Step S12: The search engine database determines characteristic information of data in the data application program according to the output logic.

Specifically, the output logic includes aggregation object information of the data, information on the indexes that take part in the computation during aggregation, and time information of the index computation.

Step S12 is specifically implemented as follows: the search engine database determines the aggregation object information of the data to be the dimension feature of the data; the search engine database determines the information on the indexes that take part in the computation during aggregation to be the index feature of the data; and the search engine database determines the time-granularity feature of the data according to the time information of the index computation.

The output logic of the data in the data application that the user is currently accessing is parsed to obtain the aggregation object information of the current data, the information on the indexes that take part in the computation during aggregation, and the time information of the index computation.

In this embodiment, suppose the output logic of the data in the data application that the user is currently accessing is as follows:

SELECT stat_date,
       user_type,
       count(1) AS se_lpv_pc_1d_001,
       count(distinct uid) AS se_uv_pc_1d_001
FROM tbcdm.dwd_tb_log_1d
WHERE ds = '20160119'
GROUP BY user_type, stat_date

By parsing the output logic above, the aggregation object information of the data in the data application that the user is currently accessing is found to be stat_date and user_type, that is, the information following GROUP BY; the information on the indexes that take part in the computation during aggregation is se_lpv_pc_1d_001 and se_uv_pc_1d_001, that is, the information following count(1) and count(distinct uid); and the time information of the index computation is '20160119', that is, the partition following WHERE ds.

The aggregation object information of the current data is determined to be its dimension feature, the information on the indexes that take part in the computation during aggregation is determined to be its index feature, and its time-granularity feature is determined according to the time information of the index computation.

In addition, the time interval represented by the time information of the index computation may be used as the time-granularity feature of the current data. For example, if the time information of the index computation, that is, the partition following WHERE, is "ds='20160119'", the time-granularity feature of the current data is 1; if it is "ds>='20160101' and ds<='20160107'", the time-granularity feature of the current data is 7.
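The feature extraction in step S12 can be sketched as follows. This is a regex-based Python illustration under stated assumptions, not a general SQL parser: the function names are hypothetical, only `count(...)` aggregates and `ds` partition filters using `=`, `>=`, or `<=` are recognized, and the time granularity is taken to be the number of days spanned by the partition dates, which reproduces the values 1 and 7 given above.

```python
import re
from datetime import datetime
from typing import Dict, Optional


def time_granularity_days(where_clause: str) -> Optional[int]:
    # Collect every 8-digit partition date compared against ds.
    dates = re.findall(r"\bds\s*(?:>=|<=|=)\s*'(\d{8})'", where_clause)
    if not dates:
        return None
    parsed = [datetime.strptime(d, "%Y%m%d") for d in dates]
    # Inclusive span in days: one partition gives 1, a week gives 7.
    return (max(parsed) - min(parsed)).days + 1


def parse_output_logic(sql: str) -> Dict[str, object]:
    flat = " ".join(sql.split()).rstrip(";")
    # Aggregation objects: the columns listed after GROUP BY.
    group_by = re.search(r"group by (.+)$", flat, re.IGNORECASE)
    dimensions = ([c.strip() for c in group_by.group(1).split(",")]
                  if group_by else [])
    # Indexes: the aliases that follow count(...), with or without AS.
    indexes = re.findall(r"count\s*\([^)]*\)\s*(?:as\s+)?(\w+)",
                         flat, re.IGNORECASE)
    return {
        "dimension": dimensions,
        "index": indexes,
        "time_granularity": time_granularity_days(flat),
    }
```

Other aggregate functions, or open-ended date ranges, are outside what this sketch handles and would need additional rules.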

步驟S13、所述搜尋引擎資料庫儲存所述資料應用程式中的資料,以及所述資料的特徵資訊。 Step S13: The search engine database stores data in the data application and characteristic information of the data.

Finally, the dimension feature, the index feature, and the time-granularity feature are added to the data in the data application that the user is currently accessing, and the data with these features added is stored in the search engine database.

In this embodiment, each time the user accesses the data application, the dimension feature, index feature, and time-granularity feature of the data currently being accessed are obtained and added to that data, and the data with these features added is then stored in the search engine database. Once the user has accessed all of the data in the data application, all of that data will have been stored in the search engine database, and every such data item in the search engine database will have a dimension feature, an index feature, and a time-granularity feature.

步驟S902、所述搜尋引擎資料庫儲存所述報表、所述知識庫平臺以及所述集群物理表中的資料。 Step S902: The search engine database stores data in the report, the knowledge base platform, and the cluster physical table.

步驟S902的可以透過如下步驟S21-S23來實現: Step S902 can be implemented through the following steps S21-S23:

步驟S21、所述搜尋引擎資料庫分別獲取所述報表、所述知識庫平臺以及所述集群物理表中的資料。 Step S21: The search engine database obtains data in the report, the knowledge base platform, and the cluster physical table, respectively.

In this embodiment, each data item in the report, the knowledge base platform, and the cluster physical table may be split using the TF-IDF algorithm.

步驟S22、所述搜尋引擎資料庫根據預設演算法,確定所述報表、所述知識庫平臺以及所述集群物理表中每個資料的維度特徵。 Step S22: The search engine database determines the dimensional characteristics of each data in the report, the knowledge base platform, and the cluster physical table according to a preset algorithm.

Features are extracted from the split data using the LDA algorithm and a topic model algorithm, and the extracted features are used as the dimension features of the corresponding data.

Step S23: The search engine database stores each data item in the report, the knowledge base platform, and the cluster physical table, together with the dimension feature of the data item.

A dimension feature is added to each data item in the report, the knowledge base platform, and the cluster physical table, and the data with the dimension feature added is stored in the search engine database.
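As a rough illustration of how terms might be scored when deriving dimension features for reports, knowledge base entries, and physical tables, the sketch below computes TF-IDF scores over already tokenized data and keeps the top-scoring terms. It is a simplification under stated assumptions: the function name and input shape are hypothetical, tokenization is assumed to have been done elsewhere, and the LDA and topic model step mentioned in the text is not implemented here.

```python
import math
from collections import Counter
from typing import Dict, List


def top_tfidf_terms(docs: Dict[str, List[str]],
                    top_k: int = 1) -> Dict[str, List[str]]:
    # docs maps a data item name to its list of tokens.
    n_docs = len(docs)
    doc_freq: Counter = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))
    result: Dict[str, List[str]] = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        # TF-IDF: term frequency times log of inverse document frequency.
        scores = {t: (c / len(tokens)) * math.log(n_docs / doc_freq[t])
                  for t, c in counts.items()}
        # Highest score first; ties broken alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result[name] = ranked[:top_k]
    return result
```

The top-scoring terms could then be stored alongside each data item as candidate dimension features; a production system would more likely rely on an established text-mining library.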

In this embodiment, the search engine database stores all of the data in the data application, and each data item stored from the data application is associated with a dimension feature, an index feature, and a time-granularity feature. In addition, the search engine database stores all of the data in the report, the knowledge base platform, and the cluster physical table, and each data item stored from these data outlets is associated with a dimension feature.

FIG. 11 is a flowchart of the data processing method provided in Embodiment 9 of the present invention. As shown in FIG. 11, the data processing method provided in this embodiment may include the following steps:

Step S1001: The search engine database obtains first data from the data application, together with the dimension feature, indicator feature, and time granularity feature of the first data.

In this embodiment, step S1001 can be implemented in either of the following two ways:

First way: the search engine database receives the first data, together with its dimension feature, indicator feature, and time granularity feature, from a syntax parser. The syntax parser is used to collect the first data from the data application and to parse out the dimension feature, indicator feature, and time granularity feature of the first data.

Specifically, data in the data application 15 can be stored in the search engine database 13 by way of the syntax parser 19: the data in the data application 15 is delivered to the syntax parser 19 through the SDK. For a statement in Structured Query Language (SQL), the syntax parser 19 can parse out the dimension feature, the indicator feature, the time granularity feature, and the name of the table that is read. For example, consider the following SQL:

SELECT stat_date AS 日期, user_type AS 使用者類型, se_lpv_pc_1d_001 AS Pv, se_uv_pc_1d_001 AS Uv

FROM tbbi.ads_tb_log_1d

WHERE ds='20151026'

For this SQL, the syntax parser 19 determines that the dimension feature is "使用者類型" (user type), the indicator features are "Pv" and "Uv", the time granularity feature is "the most recent day", and the table read is "tbbi.ads_tb_log_1d". In this way, the syntax parser 19 can parse out the dimension feature, indicator feature, and time granularity feature of each piece of data in the data application 15. The syntax parser 19 sends the parsed data to the search engine database 13, which stores not only the data itself but also its dimension feature, indicator feature, and time granularity feature.
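The parsing rules of the syntax parser are not given in the text, so the following is only a rough sketch of how such an extraction might look. Every rule in it is an assumption made for illustration: columns whose names start with `se_` are treated as indicators, `stat_date` is treated as the date column rather than a dimension, and a `_<N>d` suffix on the table name is read as a window of N days. The function name `parse_query` is hypothetical, and a production parser would use a real SQL grammar rather than regular expressions.

```python
import re


def parse_query(sql):
    """Extract features from a simple SELECT ... FROM ... WHERE statement."""
    select_part = re.search(r"select\s+(.*?)\s+from\s", sql, re.I | re.S).group(1)
    table = re.search(r"from\s+([\w.]+)", sql, re.I).group(1)
    dimensions, indicators = [], []
    for item in select_part.split(","):
        column, alias = re.split(r"\s+as\s+", item.strip(), flags=re.I)
        if column.startswith("se_"):   # assumed naming rule for indicator columns
            indicators.append(alias.strip())
        elif column != "stat_date":    # assumed: the date column is not a dimension
            dimensions.append(alias.strip())
    # Assumed rule: a "_<N>d" suffix on the table name gives the window in days.
    window = re.search(r"_(\d+)d$", table)
    return {
        "dimensions": dimensions,
        "indicators": indicators,
        "granularity_days": int(window.group(1)) if window else None,
        "table": table,
    }


sql = (
    "SELECT stat_date AS 日期, user_type AS 使用者類型, "
    "se_lpv_pc_1d_001 AS Pv, se_uv_pc_1d_001 AS Uv "
    "FROM tbbi.ads_tb_log_1d WHERE ds='20151026'"
)
features = parse_query(sql)
```

Under these assumptions the sketch yields the same dimension, indicators, and table name that the text attributes to the syntax parser for this statement.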

Second way, which includes the following steps S31-S32:

Step S31: The search engine database obtains the access logic by which the data application accesses a data source. The access logic includes the first data in the data application, and the data source stores the output logic of the first data.

In this embodiment, the data application may take the form of a Web page that interacts frequently with a back-end data source; the back-end data source may be a device that stores the operational data of the data application. Because the data application is developed on the SDK, the SDK has the highest operating permission over the data application, so the SDK can be used to capture the first access logic, that is, the data application's access to the back-end data source. The first access logic includes fields such as the time at which the data application accessed the back-end data source and the second access logic, that is, the user's access to the data application. The second access logic can therefore be obtained with existing parsing methods. The second access logic in turn includes field information such as the time at which the user accessed the data application and the data in the data application that the user is currently accessing, so that data can likewise be obtained with existing parsing methods.

Suppose the user's second access logic to the data application is:

SELECT stat_date AS 日期, user_type AS 使用者類型, se_lpv_pc_1d_001 AS Pv, se_uv_pc_1d_001 AS Uv

FROM tbbi.ads_tb_log_1d

WHERE ds='20151026'

Parsing this second access logic shows that the data the user is currently accessing in the data application is "tbbi.ads_tb_log_1d", that is, the information following the FROM keyword.

In addition, the back-end data source stores the output logic of each piece of data, so the output logic of the data the user is currently accessing in the data application can be looked up directly in the back-end data source.

The output logic includes the aggregation object information of the first data, the indicator information that takes part in the computation during aggregation, and the time information of the indicator computation. Specifically, the search engine database takes the aggregation object information of the first data as the dimension feature of the first data; it takes the indicator information that takes part in the computation during aggregation as the indicator feature of the first data; and it determines the time granularity feature of the first data from the time information of the indicator computation.

Step S32: The search engine database determines, according to the output logic, the feature information of the first data in the data application. The feature information includes a dimension feature, an indicator feature, and a time granularity feature.

In this embodiment, suppose the output logic of the data the user is currently accessing in the data application is as follows:

SELECT stat_date, user_type, count(1) se_lpv_pc_1d_001, count(distinct uid) se_uv_pc_1d_001

FROM tbcdm.dwd_tb_log_1d WHERE ds='20160119'

GROUP BY user_type, stat_date

Parsing this output logic yields the following for the data the user is currently accessing in the data application: the aggregation object information is stat_date, user_type, that is, the columns following GROUP BY; the indicator information that takes part in the computation during aggregation is se_lpv_pc_1d_001, se_uv_pc_1d_001, that is, the aliases following count(1) and count(distinct uid); and the time information of the indicator computation is '20160119', that is, the partition value following WHERE ds.

The aggregation object information of the current data is taken as its dimension feature, the indicator information that takes part in the computation during aggregation is taken as its indicator feature, and the time granularity feature of the current data is determined from the time information of the indicator computation.

In addition, the time interval represented by the time information of the indicator computation can be used as the time granularity feature of the current data. For example, if the time information of the indicator computation, that is, the partition condition following WHERE, is "ds='20160119'", the time granularity feature of the current data is 1; if the partition condition following WHERE is "ds>='20160101' and ds<='20160107'", the time granularity feature of the current data is 7.
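The mapping from output logic to features can be sketched as follows. This is an illustration under assumptions, not the parser described in the text: it handles only `count(...)` aggregates followed by an alias, expects a WHERE clause followed by GROUP BY, and interprets the time granularity as the inclusive number of days spanned by the `ds` partition values, which is consistent with the values 1 and 7 given above. The function names are hypothetical.

```python
import re
from datetime import datetime


def granularity_days(where_clause):
    """Inclusive number of days covered by the quoted yyyymmdd values."""
    dates = [
        datetime.strptime(value, "%Y%m%d")
        for value in re.findall(r"'(\d{8})'", where_clause)
    ]
    if not dates:
        return None
    return (max(dates) - min(dates)).days + 1


def parse_output_logic(sql):
    """Map GROUP BY columns, aggregate aliases, and the WHERE range to features."""
    group_by = re.search(r"group\s+by\s+(.+)$", sql, re.I | re.S).group(1)
    where = re.search(r"where\s+(.+?)\s+group\s+by", sql, re.I | re.S).group(1)
    # Alias that follows each count(...) aggregate.
    indicators = re.findall(r"count\s*\([^)]*\)\s*(\w+)", sql, re.I)
    return {
        "dimensions": [column.strip() for column in group_by.split(",")],
        "indicators": indicators,
        "granularity_days": granularity_days(where),
    }


output_logic = (
    "SELECT stat_date, user_type, count(1) se_lpv_pc_1d_001, "
    "count(distinct uid) se_uv_pc_1d_001 "
    "FROM tbcdm.dwd_tb_log_1d WHERE ds='20160119' "
    "GROUP BY user_type, stat_date"
)
parsed = parse_output_logic(output_logic)
week = granularity_days("ds>='20160101' and ds<='20160107'")
```

The dimensions come back in GROUP BY order (user_type, stat_date); the single-day partition gives a granularity of 1 and the one-week range gives 7.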

Step S1002: The search engine database obtains second data from the report, the knowledge base platform, and the cluster physical table, respectively, together with the dimension feature of the second data.

Specifically, the search engine database obtains the second data in the report, the knowledge base platform, and the cluster physical table, respectively, and determines, according to a preset algorithm, the dimension feature of each piece of second data in the report, the knowledge base platform, and the cluster physical table.

In this embodiment, each piece of data in the report, the knowledge base platform, and the cluster physical table can be split into terms using the TF-IDF algorithm. Features are then extracted from the split data using the LDA algorithm and a topic-model algorithm, and the extracted features are used as the dimension features of the corresponding data.

Step S1003: The search engine database stores the first data, together with the dimension feature, indicator feature, and time granularity feature of the first data.

Step S1004: The search engine database stores the second data, together with the dimension feature of the second data.

A dimension feature is added to each piece of data in the report, the knowledge base platform, and the cluster physical table, and the data, with its dimension feature attached, is stored in the search engine database.

In this embodiment, data from the data application, the report, the knowledge base platform, and the cluster physical table is collected into the search engine database in advance, and at least one of a dimension feature, an indicator feature, and a time granularity feature is added to each piece of collected data. When the search engine receives a search keyword entered by a user, it first splits the search keyword to obtain a dimension keyword, an indicator keyword, and a time granularity keyword. It then looks up, in the pre-built search engine database, the data that matches the dimension keyword, the indicator keyword, and the time granularity keyword, respectively, and displays the matched data to the user. The user does not need to go through each data outlet to look for data: entering the search keyword once is enough for the search engine database to find the data related to that keyword across all data outlets, which improves the efficiency of finding data.
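As an illustration of the lookup described above, the following sketch keeps the collected data in memory with its features attached and returns one result list per keyword type. The `Record` class, the `lookup` function, the exact-match rule, and the sample records are all assumptions made for the example; the text does not specify how matching is performed.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One piece of collected data together with its features."""
    name: str
    dimensions: set = field(default_factory=set)
    indicators: set = field(default_factory=set)
    granularity: set = field(default_factory=set)


def lookup(records, dimension_kw, indicator_kw, granularity_kw):
    """Return the data matched by each of the three keyword types."""
    # Assumption: a keyword matches a feature only on exact equality.
    first = [r.name for r in records if dimension_kw in r.dimensions]
    second = [r.name for r in records if indicator_kw in r.indicators]
    third = [r.name for r in records if granularity_kw in r.granularity]
    return first, second, third


records = [
    Record("tbbi.ads_tb_log_1d", {"user_type"}, {"pv", "uv"}, {"1d"}),
    Record("weekly_report", {"region"}, {"pv"}, {"7d"}),
]
first, second, third = lookup(records, "user_type", "pv", "7d")
```

A single call covers every record regardless of which data outlet it came from, which is the point of collecting the data and its features in advance.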

FIG. 12 is a schematic structural diagram of the query terminal provided in Embodiment 1 of the present invention. As shown in FIG. 12, the query terminal includes a receiving unit, a processing unit, and a sending unit.

The receiving unit is configured to receive a query request from a user, the query request including a search keyword.

The processing unit is coupled to the receiving unit and is configured to obtain the dimension keyword, the indicator keyword, and the time granularity keyword in the search keyword.

The sending unit is coupled to the processing unit and is configured to send the dimension keyword, the indicator keyword, and the time granularity keyword to the search engine database, so that the search engine database obtains first data corresponding to the dimension feature that matches the dimension keyword, second data corresponding to the indicator feature that matches the indicator keyword, and third data corresponding to the time granularity feature that matches the time granularity keyword. The search engine database pre-stores the data in the data outlets and the feature information of that data. The data outlets include at least one of the following: a data application, a report, a knowledge base platform, and a cluster physical table. The feature information includes at least one of the following: a dimension feature, an indicator feature, and a time granularity feature.

The receiving unit is further configured to receive the first data, the second data, and the third data sent by the search engine database.

The processing unit is further configured to determine, according to the first data, the second data, and the third data, the target data to be fed back to the user.

In this embodiment, data from the data application, the report, the knowledge base platform, and the cluster physical table is collected into the search engine database in advance, and at least one of a dimension feature, an indicator feature, and a time granularity feature is added to each piece of collected data. When the search engine receives a search keyword entered by a user, it first splits the search keyword to obtain a dimension keyword, an indicator keyword, and a time granularity keyword. It then looks up, in the pre-built search engine database, the data that matches the dimension keyword, the indicator keyword, and the time granularity keyword, respectively, and displays the matched data to the user. The user does not need to go through each data outlet to look for data: entering the search keyword once is enough for the search engine database to find the data related to that keyword across all data outlets, which improves the efficiency of finding data.

On the basis of the embodiment shown in FIG. 12, the processing unit is specifically configured to: perform word segmentation on the search keyword to obtain multiple target segmented words; query a preset mapping table according to each target segmented word, the mapping table including dimension segmented words, indicator segmented words, and time granularity segmented words; determine the target segmented word that matches a dimension segmented word as the dimension keyword; determine the target segmented word that matches an indicator segmented word as the indicator keyword; and determine the target segmented word that matches a time granularity segmented word as the time granularity keyword.
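The mapping-table lookup can be sketched as below. The sketch assumes whitespace splitting in place of a real word segmenter and a mapping table represented as a dictionary from segmented word to keyword type; the function name `classify_keywords` and the table contents are hypothetical.

```python
def classify_keywords(query, mapping):
    """Split a query and sort its words into the three keyword types."""
    # Assumption: whitespace splitting stands in for a real word segmenter.
    tokens = query.split()
    result = {"dimension": [], "indicator": [], "granularity": []}
    for token in tokens:
        kind = mapping.get(token)
        if kind in result:  # words missing from the mapping table are ignored
            result[kind].append(token)
    return result


mapping_table = {
    "user_type": "dimension",
    "pv": "indicator",
    "uv": "indicator",
    "1d": "granularity",
}
keywords = classify_keywords("user_type pv 1d trend", mapping_table)
```

Because classification is a dictionary lookup per segmented word, its cost grows with the length of the query rather than with the size of the stored data.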

Further, the processing unit is specifically configured to determine whether the first data, the second data, and the third data are the same data. If they are the same data, the processing unit determines that data as the target data to be fed back to the user. If they are not the same data, the processing unit sorts the first data, the second data, and the third data, and determines the sorted data as the target data to be fed back to the user.
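A minimal sketch of this decision follows. It assumes that each candidate carries a weight and a similarity to the search keyword, and that the sort value is their product; the combination rule, the function name `choose_target`, and the sample values are assumptions, since the text here does not fix how the sort value is computed.

```python
def choose_target(first, second, third, weight, similarity):
    """Return the single shared result, or all candidates in ranked order."""
    if first == second == third:
        return [first]
    candidates = {first, second, third}
    # Assumed sort value: weight multiplied by similarity, highest first;
    # ties are broken by name so the order is deterministic.
    return sorted(candidates, key=lambda d: (-weight[d] * similarity[d], d))


same = choose_target("t1", "t1", "t1", {}, {})
ranked = choose_target(
    "t1", "t2", "t3",
    weight={"t1": 1.0, "t2": 2.0, "t3": 1.0},
    similarity={"t1": 0.5, "t2": 0.5, "t3": 0.9},
)
```

When all three lookups agree there is nothing to rank, so the weights and similarities are never consulted in that branch.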

In this embodiment, multiple target segmented words are obtained by performing word segmentation on the search keyword, and the dimension keyword, indicator keyword, and time granularity keyword among them are identified by querying a pre-established mapping table. This improves the efficiency of determining the dimension keyword, indicator keyword, and time granularity keyword in the search keyword.

FIG. 13 is a schematic structural diagram of the query terminal provided in Embodiment 2 of the present invention. As shown in FIG. 13, the query terminal further includes a display.

The receiving unit is further configured to receive the user's click operation on the target data.

The processing unit is further configured to establish, according to the click operation, an association between the user and the target data.

The display is coupled to the processing unit. When the user has not entered a search keyword, the display shows the target data associated with the user through the association.

In this embodiment, an association is established between the user and the target data the user has clicked. When the user has not entered a search keyword, the target data the user has clicked can be displayed according to that association, which makes it more convenient for the user to query data.
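The click-based association can be sketched as follows. The sketch assumes that the strength of the association is simply the number of clicks and that only targets above a threshold are shown; the `ClickHistory` class, its method names, and the threshold default are hypothetical.

```python
from collections import defaultdict


class ClickHistory:
    """Keeps, per user, how often each piece of target data was clicked."""

    def __init__(self):
        self._clicks = defaultdict(lambda: defaultdict(int))

    def record_click(self, user, target):
        self._clicks[user][target] += 1

    def suggestions(self, user, threshold=0):
        """Targets to show when the user has not entered a search keyword."""
        counts = self._clicks.get(user, {})
        # Assumption: the click count serves as the degree of association.
        ranked = sorted(counts, key=lambda t: (-counts[t], t))
        return [t for t in ranked if counts[t] > threshold]


history = ClickHistory()
history.record_click("alice", "tbbi.ads_tb_log_1d")
history.record_click("alice", "tbbi.ads_tb_log_1d")
history.record_click("alice", "weekly_report")
default_view = history.suggestions("alice")
frequent_only = history.suggestions("alice", threshold=1)
```

A user with no recorded clicks gets an empty list, so the caller can fall back to whatever the terminal shows by default.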

FIG. 14 is a schematic structural diagram of the query terminal provided in Embodiment 3 of the present invention. Referring to FIG. 14, the query terminal 1900 includes a processing component 1922, which further includes one or more processors, and memory resources represented by a memory 1932 for storing instructions executable by the processing component 1922, such as application programs. The application programs stored in the memory 1932 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 1922 is configured to execute the instructions so as to perform the methods of steps S201-S1004 described above.

The device 1900 may further include a power supply component 1926 configured to perform power management of the device 1900, a wired or wireless network interface 1950 configured to connect the device 1900 to a network, and an input/output (I/O) interface 1958. The device 1900 can operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.

FIG. 15 is a schematic structural diagram of the search engine database provided by an embodiment of the present invention. As shown in FIG. 15, the search engine database includes a receiver, a memory, a processor, and a transmitter.

The receiver is configured to receive the dimension keyword, the indicator keyword, and the time granularity keyword sent by the query terminal. The query terminal obtains these keywords from the search keyword included in a query request that it receives from a user.

The memory is configured to store the data in the data outlets and the feature information of that data. The data outlets include at least one of the following: a data application, a report, a knowledge base platform, and a cluster physical table. The feature information includes at least one of the following: a dimension feature, an indicator feature, and a time granularity feature.

The processor is coupled to the receiver and the memory and is configured to obtain first data corresponding to the dimension feature that matches the dimension keyword, second data corresponding to the indicator feature that matches the indicator keyword, and third data corresponding to the time granularity feature that matches the time granularity keyword.

The transmitter is coupled to the processor and is configured to send the first data, the second data, and the third data to the query terminal, so that the query terminal determines, according to the first data, the second data, and the third data, the target data to be fed back to the user.

In this embodiment, data from the data application, the report, the knowledge base platform, and the cluster physical table is collected into the search engine database in advance, and at least one of a dimension feature, an indicator feature, and a time granularity feature is added to each piece of collected data. When the search engine receives a search keyword entered by a user, it first splits the search keyword to obtain a dimension keyword, an indicator keyword, and a time granularity keyword. It then looks up, in the pre-built search engine database, the data that matches the dimension keyword, the indicator keyword, and the time granularity keyword, respectively, and displays the matched data to the user. The user does not need to go through each data outlet to look for data: entering the search keyword once is enough for the search engine database to find the data related to that keyword across all data outlets, which improves the efficiency of finding data.

On the basis of the embodiment shown in FIG. 15, the processor is specifically configured to: obtain the access logic by which the data application accesses a data source, the access logic including the data in the data application and the data source storing the output logic of that data; determine, according to the output logic, the feature information of the data in the data application; and store the data in the data application, together with its feature information, in the memory.

Alternatively, on the basis of the embodiment shown in FIG. 15, the receiver is further configured to receive data sent by a syntax parser, together with the dimension feature, indicator feature, and time granularity feature of that data. The syntax parser is used to collect the data in the data application and to parse out its dimension feature, indicator feature, and time granularity feature. The processor is further configured to store the data in the data application, together with its dimension feature, indicator feature, and time granularity feature, in the memory.

Alternatively, on the basis of the embodiment shown in FIG. 15, the processor is specifically configured to: obtain the data in the report, the knowledge base platform, and the cluster physical table, respectively; determine, according to a preset algorithm, the dimension feature of each piece of data in the report, the knowledge base platform, and the cluster physical table; and store each piece of that data, together with its dimension feature, in the memory.

In this embodiment, the search engine database stores all the data in the data application, and each piece of data stored from the data application into the search engine database is associated with a dimension feature, an indicator feature, and a time granularity feature. In addition, the search engine database stores all the data in the report, the knowledge base platform, and the cluster physical table, and each piece of data stored from these sources into the search engine database is associated with a dimension feature.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments can still be modified, and that some or all of their technical features can be replaced by equivalents, without such modifications or replacements causing the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (29)

一種資料處理系統,其特徵在於,包括:查詢終端和搜尋引擎資料庫;該查詢終端,用於接收使用者的查詢請求,該查詢請求包括檢索關鍵字;該查詢終端獲取該檢索關鍵字中的維度關鍵字、指標關鍵字和時間細微性關鍵字,並將該維度關鍵字、該指標關鍵字、以及該時間細微性關鍵字發送給該搜尋引擎資料庫;該搜尋引擎資料庫預先儲存有資料出口中的資料,以及該資料的特徵資訊,該特徵資訊包括下述至少一種:維度特徵、指標特徵和時間細微性特徵;該搜尋引擎資料庫,用於獲取與該維度關鍵字匹配的維度特徵對應的第一資料、與該指標關鍵字匹配的指標特徵對應的第二資料、以及與該時間細微性關鍵字匹配的時間細微性特徵對應的第三資料,並將該第一資料、該第二資料和該第三資料發送給該查詢終端;該查詢終端,還用於根據該第一資料、該第二資料和該第三資料,確定回饋給該使用者的目標資料,並將該目標資料顯示給該使用者。 A data processing system, comprising: a query terminal and a search engine database; the query terminal is configured to receive a user's query request, the query request includes a search keyword; and the query terminal obtains the search keyword Dimension keywords, index keywords, and time fineness keywords, and send the dimension keywords, the indicator keywords, and the time fineness keywords to the search engine database; the search engine database stores data in advance The data in the exit, and the characteristic information of the data, the characteristic information includes at least one of the following: dimensional characteristics, index characteristics, and temporal nuance characteristics; the search engine database is used to obtain dimensional characteristics that match the dimensional keywords The corresponding first data, the second data corresponding to the index characteristic matching the index keyword, and the third data corresponding to the temporal nuance characteristic matching the temporal nuance keyword, and the first data, the first data The second data and the third data are sent to the inquiry terminal; the inquiry terminal is further used for The second information and the third information feedback to the user to determine the target data, and displays the target information to the user. 
一種資料處理方法,其特徵在於,包括:查詢終端接收使用者的查詢請求,該查詢請求包括檢索關鍵字; 該查詢終端獲取該檢索關鍵字中的維度關鍵字、指標關鍵字和時間細微性關鍵字;該查詢終端將該維度關鍵字、該指標關鍵字、以及該時間細微性關鍵字發送給搜尋引擎資料庫,以使該搜尋引擎資料庫獲取與該維度關鍵字匹配的維度特徵對應的第一資料、與該指標關鍵字匹配的指標特徵對應的第二資料、以及與該時間細微性關鍵字匹配的時間細微性特徵對應的第三資料,該搜尋引擎資料庫預先儲存有資料出口中的資料,以及該資料的特徵資訊,該特徵資訊包括下述至少一種:維度特徵、指標特徵和時間細微性特徵;該查詢終端接收該搜尋引擎資料庫發送的該第一資料、該第二資料和該第三資料;該查詢終端根據該第一資料、該第二資料和該第三資料,確定回饋給該使用者的目標資料。 A data processing method, comprising: a query terminal receiving a query request from a user, the query request including a search keyword; The query terminal obtains the dimensional keywords, index keywords, and time fineness keywords in the search keywords; the query terminal sends the dimension keywords, the index keywords, and time fineness keywords to the search engine data Database, so that the search engine database obtains the first data corresponding to the dimensional feature matching the dimensional keyword, the second data corresponding to the index feature matching the metric keyword, and the time-specificity keyword matching The third data corresponding to the temporal nuance feature. The search engine database previously stores data in the data outlet and the characteristic information of the data. The characteristic information includes at least one of the following: dimensional features, index features, and temporal nuance features. ; The query terminal receives the first data, the second data, and the third data sent by the search engine database; the query terminal determines to return to the first data according to the first data, the second data, and the third data; User's goal data. 根據申請專利範圍第2項的方法,其中,該資料出口包括下述至少一種:資料應用程式、報表、知識庫平臺以及集群物理表。 The method according to item 2 of the scope of patent application, wherein the data export includes at least one of the following: a data application, a report, a knowledge base platform, and a cluster physical table. 
根據申請專利範圍第3項的方法,其中,該查詢終端獲取該檢索關鍵字中的維度關鍵字、指標關鍵字和時間細微性關鍵字,包括:該查詢終端對該檢索關鍵字進行分詞處理獲得多個目標分詞; 該查詢終端根據各目標分詞查詢預設的映射表,該映射表包括維度分詞、指標分詞和時間細微性分詞;該查詢終端將該多個目標分詞中與該維度分詞匹配的目標分詞確定為該維度關鍵字;該查詢終端將該多個目標分詞中與該指標分詞匹配的目標分詞確定為該指標關鍵字;該查詢終端將該多個目標分詞中與該時間細微性分詞匹配的目標分詞確定為該時間細微性關鍵字。 The method according to item 3 of the scope of patent application, wherein the query terminal obtains the dimensional keywords, index keywords, and time nuance keywords in the search keywords, including: the query terminal performs word segmentation processing on the search keywords to obtain Multiple target segmentation; The query terminal queries a preset mapping table according to each target segmentation, and the mapping table includes dimensional segmentation, index segmentation, and time subtle segmentation; the query terminal determines the target segmentation that matches the dimension segmentation among the plurality of target segmentations as the Dimension keyword; the query terminal determines the target segmentation that matches the index segmentation among the plurality of target segmentations as the index keyword; the query terminal determines the target segmentation that matches the time subtle segmentation among the plurality of target segmentations Keywords for that time nuance. 
The method according to claim 4, wherein the query terminal determining, according to the first data, the second data, and the third data, the target data to be returned to the user comprises: the query terminal determining whether the first data, the second data, and the third data are the same data; if the first data, the second data, and the third data are the same data, the query terminal determining that same data to be the target data returned to the user; and if the first data, the second data, and the third data are not the same data, the query terminal sorting the first data, the second data, and the third data and determining the sorted data to be the target data returned to the user.
The method according to claim 5, wherein the query terminal sorting the first data, the second data, and the third data comprises: the query terminal calculating a weight value for each of the first data, the second data, and the third data; the query terminal calculating a similarity between each of the first data, the second data, and the third data and the search keyword; the query terminal calculating a ranking value for each data according to the weight value and the similarity of that data; and the query terminal sorting the first data, the second data, and the third data according to the ranking value of each data.
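By way of non-limiting illustration only, the same-data check and the weight-and-similarity ranking recited above might be sketched as follows. The claims do not specify how the weight value or the similarity is computed, nor how the two are combined; this sketch assumes a supplied weight, a token-overlap (Jaccard) similarity, and a simple product. All names (`DataItem`, `similarity`, `ranking_value`, `determine_target`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataItem:
    item_id: str
    title: str
    weight: float  # assumed to be supplied, e.g. per data outlet type


def similarity(title: str, search_keyword: str) -> float:
    """Jaccard overlap of lower-cased tokens (an assumed similarity measure)."""
    a = set(title.lower().split())
    b = set(search_keyword.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def ranking_value(item: DataItem, search_keyword: str) -> float:
    # The claims do not fix the combination; a product is assumed here.
    return item.weight * similarity(item.title, search_keyword)


def determine_target(first: DataItem, second: DataItem, third: DataItem,
                     search_keyword: str) -> List[DataItem]:
    """Return the single shared item, or the distinct items sorted by ranking value."""
    if first.item_id == second.item_id == third.item_id:
        return [first]
    # De-duplicating by id before sorting is a choice of this sketch.
    unique = {i.item_id: i for i in (first, second, third)}.values()
    return sorted(unique, key=lambda i: ranking_value(i, search_keyword),
                  reverse=True)
```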
The method according to claim 6, wherein the query terminal sorting the first data, the second data, and the third data according to the ranking value of each data comprises: the query terminal determining, according to the ranking value of each data, which of the first data, the second data, and the third data have a ranking value greater than a first threshold; and the query terminal sorting the data whose ranking value is greater than the first threshold by ranking value.
The method according to any one of claims 2 to 7, further comprising, after the query terminal determines, according to the first data, the second data, and the third data, the target data to be returned to the user: the query terminal receiving a click operation by the user on the target data; the query terminal establishing an association between the user and the target data according to the click operation; and, when the user has not entered a search keyword, the query terminal displaying the target data according to the association.
The method according to claim 8, wherein the association includes a degree of association that indicates how strongly the user is associated with the target data; and the query terminal displaying the target data according to the association comprises: the query terminal displaying the target data whose degree of association is greater than a second threshold.
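By way of non-limiting illustration only, the threshold filtering and the click-based association recited above might be sketched as follows. Using a click count as the degree of association is an assumption of this sketch, as are all names (`filter_and_sort`, `ClickAssociations`); the claims do not state how the degree of association is computed.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def filter_and_sort(scored: List[Tuple[str, float]],
                    first_threshold: float) -> List[str]:
    """Keep items whose ranking value exceeds the threshold, highest first."""
    kept = [(item, value) for item, value in scored if value > first_threshold]
    return [item for item, _ in sorted(kept, key=lambda p: p[1], reverse=True)]


class ClickAssociations:
    """Tracks clicks per (user, item); the count stands in for the degree of association."""

    def __init__(self) -> None:
        self._degree: Dict[str, Dict[str, int]] = defaultdict(
            lambda: defaultdict(int))

    def record_click(self, user: str, item: str) -> None:
        self._degree[user][item] += 1

    def items_to_display(self, user: str, second_threshold: int) -> List[str]:
        """Items to show when the user has entered no search keyword."""
        degrees = self._degree.get(user, {})
        return sorted(i for i, d in degrees.items() if d > second_threshold)
```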
A data processing method, characterized by comprising: a query terminal receiving a query request from a user, the query request including a search keyword; the query terminal obtaining at least two types of keywords from the search keyword; the query terminal sending the at least two types of keywords to a search engine database, so that the search engine database obtains source data respectively corresponding to the at least two types of keywords; the query terminal receiving the source data sent by the search engine database; and the query terminal determining, according to the source data, target data to be returned to the user.
A data processing method, characterized by comprising: a search engine database receiving a dimension keyword, an indicator keyword, and a time granularity keyword sent by a query terminal, the dimension keyword, the indicator keyword, and the time granularity keyword having been obtained by the query terminal from a search keyword included in a query request received from a user; the search engine database storing in advance data from a data outlet together with feature information of the data, the feature information including at least one of the following: a dimension feature, an indicator feature, and a time granularity feature; the search engine database obtaining first data corresponding to a dimension feature that matches the dimension keyword, second data corresponding to an indicator feature that matches the indicator keyword, and third data corresponding to a time granularity feature that matches the time granularity keyword; and the search engine database sending the first data, the second data, and the third data to the query terminal, so that the query terminal determines, according to the first data, the second data, and the third data, target data to be returned to the user.
The method according to claim 11, wherein the data outlet includes at least one of the following: a data application, a report, a knowledge base platform, and a cluster physical table.
The method according to claim 12, further comprising, before the search engine database receives the dimension keyword, the indicator keyword, and the time granularity keyword sent by the query terminal: the search engine database storing the data in the data application, the report, the knowledge base platform, and the cluster physical table.
The method according to claim 13, wherein the search engine database storing the data in the data application comprises: the search engine database obtaining access logic by which the data application accesses a data source, the access logic including the data in the data application, and the data source storing output logic of the data; the search engine database determining feature information of the data in the data application according to the output logic; and the search engine database storing the data in the data application together with the feature information of the data.
The method according to claim 14, wherein the output logic includes aggregation object information of the data, information on the indicators that participate in computation during aggregation, and time information of the indicator computation; and the search engine database determining the feature information of the data in the data application according to the output logic comprises: the search engine database determining the aggregation object information of the data to be the dimension feature of the data; the search engine database determining the information on the indicators that participate in computation during aggregation to be the indicator feature of the data; and the search engine database determining the time granularity feature of the data according to the time information of the indicator computation.
The method according to claim 13, wherein the search engine database storing the data in the report, the knowledge base platform, and the cluster physical table comprises: the search engine database separately obtaining the data in the report, the knowledge base platform, and the cluster physical table; the search engine database determining, according to a preset algorithm, a dimension feature of each piece of data in the report, the knowledge base platform, and the cluster physical table; and the search engine database storing each piece of data in the report, the knowledge base platform, and the cluster physical table together with the dimension feature of that data.
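By way of non-limiting illustration only, deriving feature information from output logic might be sketched as follows. The sketch assumes the output logic has already been parsed into a simple record, and it assumes that the time granularity is inferred from the length of the time window covered by one computation; neither assumption comes from the claims. All names (`OutputLogic`, `FeatureInfo`, `granularity_from_window`, `derive_features`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OutputLogic:
    aggregation_objects: List[str]  # what the data is aggregated by
    indicators: List[str]           # indicators computed during aggregation
    window_days: int                # time span covered by one computation


@dataclass
class FeatureInfo:
    dimension_features: List[str]
    indicator_features: List[str]
    time_granularity_feature: str


def granularity_from_window(window_days: int) -> str:
    """Map a window length to a granularity label (assumed cut-offs)."""
    if window_days <= 1:
        return "day"
    if window_days <= 7:
        return "week"
    if window_days <= 31:
        return "month"
    return "year"


def derive_features(logic: OutputLogic) -> FeatureInfo:
    """Aggregation objects become dimension features; indicators become indicator features."""
    return FeatureInfo(
        dimension_features=list(logic.aggregation_objects),
        indicator_features=list(logic.indicators),
        time_granularity_feature=granularity_from_window(logic.window_days),
    )
```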
A data processing method, characterized by comprising: a search engine database obtaining first data in a data application, together with a dimension feature, an indicator feature, and a time granularity feature of the first data; the search engine database separately obtaining second data in a report, a knowledge base platform, and a cluster physical table, together with a dimension feature of the second data; the search engine database storing the first data together with the dimension feature, the indicator feature, and the time granularity feature of the first data; and the search engine database storing the second data together with the dimension feature of the second data.
The method according to claim 17, wherein the search engine database obtaining the first data in the data application, together with the dimension feature, the indicator feature, and the time granularity feature of the first data, comprises: the search engine database receiving, from a syntax parser, the first data together with the dimension feature, the indicator feature, and the time granularity feature of the first data, the syntax parser being configured to collect the first data in the data application and to parse out the dimension feature, the indicator feature, and the time granularity feature of the first data.
The method according to claim 17, wherein the search engine database obtaining the first data in the data application, together with the dimension feature, the indicator feature, and the time granularity feature of the first data, comprises: the search engine database obtaining access logic by which the data application accesses a data source, the access logic including the first data in the data application, and the data source storing output logic of the first data; and the search engine database determining feature information of the first data in the data application according to the output logic, the feature information including the dimension feature, the indicator feature, and the time granularity feature.
The method according to claim 19, wherein the output logic includes aggregation object information of the first data, information on the indicators that participate in computation during aggregation, and time information of the indicator computation; and the search engine database determining the feature information of the first data in the data application according to the output logic comprises: the search engine database determining the aggregation object information of the first data to be the dimension feature of the first data; the search engine database determining the information on the indicators that participate in computation during aggregation to be the indicator feature of the first data; and the search engine database determining the time granularity feature of the first data according to the time information of the indicator computation.
The method according to any one of claims 17 to 20, wherein the search engine database separately obtaining the second data in the report, the knowledge base platform, and the cluster physical table, together with the dimension feature of the second data, comprises: the search engine database separately obtaining the second data in the report, the knowledge base platform, and the cluster physical table; and the search engine database determining, according to a preset algorithm, the dimension feature of each piece of second data in the report, the knowledge base platform, and the cluster physical table.
A query terminal, characterized by comprising: a receiving unit, a processing unit, and a sending unit; the receiving unit being configured to receive a query request from a user, the query request including a search keyword; the processing unit, coupled to the receiving unit, being configured to obtain a dimension keyword, an indicator keyword, and a time granularity keyword from the search keyword; the sending unit, coupled to the processing unit, being configured to send the dimension keyword, the indicator keyword, and the time granularity keyword to a search engine database, so that the search engine database obtains first data corresponding to a dimension feature that matches the dimension keyword, second data corresponding to an indicator feature that matches the indicator keyword, and third data corresponding to a time granularity feature that matches the time granularity keyword, wherein the search engine database stores in advance data from a data outlet together with feature information of the data, the data outlet including at least one of the following: a data application, a report, a knowledge base platform, and a cluster physical table, and the feature information including at least one of the following: a dimension feature, an indicator feature, and a time granularity feature; the receiving unit being further configured to receive the first data, the second data, and the third data sent by the search engine database; and the processing unit being further configured to determine, according to the first data, the second data, and the third data, target data to be returned to the user.
The query terminal according to claim 22, wherein the processing unit is specifically configured to: perform word segmentation on the search keyword to obtain a plurality of target segments; look up each target segment in a preset mapping table, the mapping table including dimension segments, indicator segments, and time granularity segments; determine a target segment that matches a dimension segment to be the dimension keyword; determine a target segment that matches an indicator segment to be the indicator keyword; and determine a target segment that matches a time granularity segment to be the time granularity keyword.
The query terminal according to claim 23, wherein the processing unit is specifically configured to determine whether the first data, the second data, and the third data are the same data; if the first data, the second data, and the third data are the same data, the processing unit determines that same data to be the target data returned to the user; and if the first data, the second data, and the third data are not the same data, the processing unit sorts the first data, the second data, and the third data and determines the sorted data to be the target data returned to the user.
The query terminal according to claim 24, further comprising a display; wherein the receiving unit is further configured to receive a click operation by the user on the target data; the processing unit is further configured to establish an association between the user and the target data according to the click operation; and the display, coupled to the processing unit, displays the target data associated through the association when the user has not entered a search keyword.
A search engine database, characterized by comprising: a receiver, a memory, a processor, and a transmitter; the receiver being configured to receive a dimension keyword, an indicator keyword, and a time granularity keyword sent by a query terminal, the dimension keyword, the indicator keyword, and the time granularity keyword having been obtained by the query terminal from a search keyword included in a query request received from a user; the memory being configured to store data from a data outlet together with feature information of the data, the data outlet including at least one of the following: a data application, a report, a knowledge base platform, and a cluster physical table, and the feature information including at least one of the following: a dimension feature, an indicator feature, and a time granularity feature; the processor, coupled to the receiver and the memory, being configured to obtain first data corresponding to a dimension feature that matches the dimension keyword, second data corresponding to an indicator feature that matches the indicator keyword, and third data corresponding to a time granularity feature that matches the time granularity keyword; and the transmitter, coupled to the processor, being configured to send the first data, the second data, and the third data to the query terminal, so that the query terminal determines, according to the first data, the second data, and the third data, target data to be returned to the user.
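By way of non-limiting illustration only, storing data with its feature information and retrieving the first, second, and third data by feature match might be sketched as follows. Exact string matching on feature values and an in-memory list are assumptions of this sketch; the claims do not prescribe a matching rule or a storage structure. All names (`StoredData`, `SearchEngineDatabase`, `store`, `query`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class StoredData:
    data_id: str
    # Feature type -> feature values, e.g. {"dimension": {"region"}}.
    features: Dict[str, Set[str]] = field(default_factory=dict)


class SearchEngineDatabase:
    """In-memory stand-in for the claimed memory and processor."""

    def __init__(self) -> None:
        self._items: List[StoredData] = []

    def store(self, item: StoredData) -> None:
        self._items.append(item)

    def _match(self, feature_type: str, keyword: str) -> List[str]:
        # Exact match on feature values (assumed matching rule).
        return [i.data_id for i in self._items
                if keyword in i.features.get(feature_type, set())]

    def query(self, dimension_kw: str, indicator_kw: str,
              granularity_kw: str) -> Dict[str, List[str]]:
        """Return ids of the first, second, and third data for the three keywords."""
        return {
            "first": self._match("dimension", dimension_kw),
            "second": self._match("indicator", indicator_kw),
            "third": self._match("time_granularity", granularity_kw),
        }
```

In this sketch a report that carries only a dimension feature can appear in the first data but never in the second or third, which mirrors the claims' distinction between data applications and the other data outlets.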
The search engine database according to claim 26, wherein the processor is specifically configured to: obtain access logic by which the data application accesses a data source, the access logic including the data in the data application, and the data source storing output logic of the data; determine feature information of the data in the data application according to the output logic; and store the data in the data application together with the feature information of the data in the memory.
The search engine database according to claim 26, wherein the receiver is further configured to receive, from a syntax parser, data together with a dimension feature, an indicator feature, and a time granularity feature of the data, the syntax parser being configured to collect the data in the data application and to parse out the dimension feature, the indicator feature, and the time granularity feature of the data; and the processor is further configured to store the data in the data application together with the dimension feature, the indicator feature, and the time granularity feature of the data in the memory.
The search engine database according to claim 26, wherein the processor is specifically configured to: separately obtain the data in the report, the knowledge base platform, and the cluster physical table; determine, according to a preset algorithm, a dimension feature of each piece of data in the report, the knowledge base platform, and the cluster physical table; and store each piece of data in the report, the knowledge base platform, and the cluster physical table together with the dimension feature of that data in the memory.
TW106119497A 2016-08-11 2017-06-12 Data processing method, device and system TW201805839A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610657498.8A CN107729336B (en) 2016-08-11 2016-08-11 Data processing method, device and system
CN201610657498.8 2016-08-11

Publications (1)

Publication Number Publication Date
TW201805839A true TW201805839A (en) 2018-02-16

Family

ID=61162620

Family Applications (1)

Application Number Title Priority Date Filing Date
TW106119497A TW201805839A (en) 2016-08-11 2017-06-12 Data processing method, device and system

Country Status (3)

Country Link
CN (1) CN107729336B (en)
TW (1) TW201805839A (en)
WO (1) WO2018028443A1 (en)

Also Published As

Publication number Publication date
CN107729336B (en) 2021-07-27
CN107729336A (en) 2018-02-23
WO2018028443A1 (en) 2018-02-15
