CN107958406A - Method, device, and terminal for acquiring query data

Method, device, and terminal for acquiring query data

Info

Publication number
CN107958406A
Authority
CN
China
Prior art keywords
query
inquiry
query features
maximum entropy
classification model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711243962.XA
Other languages
Chinese (zh)
Inventor
李泽中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaodu Information Technology Co Ltd
Original Assignee
Beijing Xiaodu Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaodu Information Technology Co Ltd
Priority to CN201711243962.XA
Publication of CN107958406A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification

Abstract

The invention discloses a method, device, and terminal for acquiring query data. It relates to the field of Internet technology, and its main purpose is to improve the accuracy of acquired query data while reducing unnecessary manual effort. The main technical solution comprises: extracting, from a query keyword, the query features used to obtain a target query result; inputting the query features into a preset query classification model for calculation to obtain the maximum entropy of the query features, wherein the preset query classification model is trained on training samples to obtain the maximum entropy corresponding to each query feature, and a query feature with maximum entropy corresponds to one query category; determining, based on the correspondence between maximum-entropy query features and query categories, the query category corresponding to the maximum entropy of the query features; and returning target query results that belong to the same query category as the query keyword and are relevant to it. The invention is mainly applied to acquiring query data on a food delivery platform.

Description

Method, device, and terminal for acquiring query data
Technical field
The present invention relates to the field of Internet technology, and in particular to a method, device, and terminal for acquiring query data.
Background
With the rapid development of Internet technology and the economy, the Internet increasingly influences people's daily work, study, and life. For example, to make daily life more convenient, food delivery platforms now offer supermarket, fresh produce, and medicine services in addition to traditional catering.
In the client of a food delivery platform, when a user enters a query keyword, either as typed text or as speech converted to text, to search for a target item, the server looks up matching items by that keyword and returns them to the user. At present, the common approach is dictionary-based and uses two dictionaries: a query dictionary and a query fragment dictionary. The two dictionaries share the same format: each contains multiple entries, and each entry consists of a character string and its corresponding query category. Their priorities differ, however: the query dictionary takes precedence over the query fragment dictionary. When a user enters a query keyword, the entries of the query dictionary are matched first; if the keyword exactly matches any string in the query dictionary, the query category of that string is returned as the result. If that match fails, matching continues against the query fragment dictionary; if a sub-fragment of the query exactly matches any string in the query fragment dictionary, the intent category of that string is returned as the result. If the query fragment dictionary also fails to match, a null value is returned.
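The two-stage dictionary lookup described above can be sketched as follows. This is a minimal illustration, not the prior-art implementation: the dictionary contents and the function name `lookup_category` are hypothetical, and keywords are assumed to be plain lowercase strings.

```python
from typing import Optional

# Hypothetical dictionaries: each entry maps a string to a query category.
QUERY_DICT = {"haidilao": "hot pot restaurant"}
FRAGMENT_DICT = {"vitamin": "medicine", "hot pot": "hot pot restaurant"}


def lookup_category(keyword: str) -> Optional[str]:
    """Return the category for a keyword, or None if nothing matches."""
    # 1. The query dictionary has priority and requires an exact match.
    if keyword in QUERY_DICT:
        return QUERY_DICT[keyword]
    # 2. Otherwise look for a fragment-dictionary string inside the keyword.
    for fragment, category in FRAGMENT_DICT.items():
        if fragment in keyword:
            # The first hit wins; other matching entries are silently ignored.
            return category
    # 3. Neither dictionary matched.
    return None
```

A keyword such as "vitamin hot pot" contains two fragment entries with different categories, and this sketch simply returns whichever entry is checked first, which illustrates the category conflict that dictionary-based lookup can produce.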
In the course of implementing the invention, the inventor found that when a target item is looked up using either the query dictionary or the query fragment dictionary of the prior art, one query keyword may match multiple entries in the dictionary. Conflicts between multiple query categories can therefore arise, causing errors in the returned query results. In addition, dictionary-based querying depends heavily on the dictionaries themselves: both the query dictionary and the query fragment dictionary require extensive manual curation, which is costly and error-prone.
Summary of the invention
In view of this, the present invention provides a method, device, and terminal for acquiring query data, whose main purpose is to improve the accuracy of acquired query data while reducing unnecessary manual effort.
In a first aspect, the present invention provides a method for acquiring query data, comprising:
extracting, from a query keyword, the query features used to obtain a target query result;
inputting the query features into a preset query classification model for calculation to obtain the maximum entropy of the query features, wherein the preset query classification model is trained on training samples to obtain the maximum entropy corresponding to each query feature, and a query feature with maximum entropy corresponds to one query category;
determining, based on the correspondence between maximum-entropy query features and query categories, the query category corresponding to the maximum entropy of the query features;
returning the target query results that belong to the same query category as the query keyword and are relevant to it.
Optionally, before the query features are input into the preset query classification model for calculation to obtain the maximum entropy of the query features, the method further comprises:
defining the query features of the preset query classification model as unigrams and/or n-grams;
building the preset query classification model.
Optionally, building the preset query classification model comprises:
obtaining training samples, which are determined from historical query keywords and clicked query data, the clicked query data including a merchant category;
determining the weight of clicks on each merchant category from the training samples and the click counts of the corresponding clicked query data;
determining the correspondence between the training samples and the merchant categories from the maximum click weight among the merchant categories, thereby completing the preset query classification model.
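As a rough sketch of the model-building steps above, the following computes a click weight per merchant category for each training sample and keeps the category with the largest weight. The text does not give the weight formula, so the normalized click share used here is an assumption, as are the function name `build_model` and the input format.

```python
from collections import Counter, defaultdict


def build_model(click_log):
    """Build a feature-to-category mapping from click records.

    click_log: iterable of (feature, merchant_category) pairs, one per click.
    The weight is assumed to be the share of clicks a category receives.
    """
    counts = defaultdict(Counter)
    for feature, category in click_log:
        counts[feature][category] += 1

    model = {}
    for feature, per_category in counts.items():
        total = sum(per_category.values())
        weights = {c: n / total for c, n in per_category.items()}
        # The category with the maximum click weight is tied to the feature.
        best = max(weights, key=weights.get)
        model[feature] = {"weights": weights, "category": best}
    return model
```

For example, three clicks on hot pot merchants and one click on a supermarket for the same feature give the hot pot category a weight of 0.75, so that category is recorded for the feature.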
Optionally, obtaining the training samples comprises:
obtaining, from historical log data, the historical query keywords within a preset time period and the corresponding clicked query data;
determining a merchant ID from the clicked query data, and determining the merchant category from the merchant ID;
storing the mapping among the historical query keyword, the merchant ID, and the merchant category as one row;
extracting the historical query features of the historical query keyword according to the defined unigrams and/or n-grams, each historical query feature serving as one training sample.
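The first three steps above (filtering the log by time window, resolving the merchant ID to a merchant category, and storing one row per record) might look like the sketch below. The log record layout, the `MERCHANT_CATEGORY` table, and the names `Row` and `rows_from_log` are all hypothetical.

```python
from datetime import datetime
from typing import NamedTuple


class Row(NamedTuple):
    keyword: str       # historical query keyword
    merchant_id: str   # merchant ID taken from the clicked query data
    category: str      # merchant category resolved from the merchant ID


# Hypothetical lookup table from merchant ID to merchant category.
MERCHANT_CATEGORY = {"m1": "hot pot restaurant", "m2": "supermarket"}


def rows_from_log(entries, start, end):
    """entries: iterable of (timestamp, keyword, clicked_merchant_id)."""
    rows = []
    for ts, keyword, merchant_id in entries:
        if not (start <= ts < end):
            continue  # outside the preset time period
        category = MERCHANT_CATEGORY.get(merchant_id)
        if category is None:
            continue  # unknown merchant, so no category can be assigned
        rows.append(Row(keyword, merchant_id, category))
    return rows
```

Each resulting row can then be passed to feature extraction to produce the training samples.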
Optionally, extracting, from the query keyword, the query features used to obtain the target query result comprises:
segmenting the query keyword into tokens, each of which can serve as a query feature;
extracting the unigram query features and/or n-gram query features of the query keyword according to the tokens and the definition of the preset query classification model.
Optionally, inputting the query features into the preset query classification model for calculation to obtain the maximum entropy of the query features comprises:
matching the query features against the historical query features in the preset query classification model;
if the match succeeds, obtaining the maximum weight corresponding to the matched historical query feature;
calculating the maximum entropy of the query features from the maximum weight and a preset feature function, the maximum weight being directly proportional to the maximum entropy.
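The text does not state the formula that combines the weight with the feature function. For orientation only, a standard maximum entropy (multinomial logistic) classifier scores a query category $y$ for a set of query features $x$ as shown below; the feature functions $f_i$ and weights $\lambda_i$ are generic textbook notation assumed here, not symbols taken from the patent.

```latex
P(y \mid x) = \frac{\exp\left(\sum_i \lambda_i f_i(x, y)\right)}{\sum_{y'} \exp\left(\sum_i \lambda_i f_i(x, y')\right)}
```

With binary feature functions, a larger weight $\lambda_i$ raises the score of the category it is tied to, which is one possible reading of the statement that a larger weight yields a larger value for the matched category.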
Optionally, the method is applied to a food delivery platform, the food delivery platform containing query data of multiple categories.
In a second aspect, the present invention provides a device for acquiring query data, comprising:
an extraction unit, configured to extract, from a query keyword, the query features used to obtain a target query result;
a processing unit, configured to input the query features obtained by the extraction unit into a preset query classification model for calculation to obtain the maximum entropy of the query features, wherein the preset query classification model is trained on training samples to obtain the maximum entropy corresponding to each query feature, and a query feature with maximum entropy corresponds to one query category;
a determination unit, configured to determine, based on the correspondence between the maximum-entropy query features obtained by the processing unit and query categories, the query category corresponding to the maximum entropy of the query features;
a returning unit, configured to return the target query results that belong to the same query category as the query keyword and are relevant to it.
Optionally, the device further comprises:
a definition unit, configured to define the query features of the preset query classification model as unigrams and/or n-grams before the processing unit inputs the query features into the preset query classification model for calculation to obtain the maximum entropy of the query features;
a construction unit, configured to build the preset query classification model.
Optionally, the construction unit comprises:
an acquisition module, configured to obtain training samples, which are determined from historical query keywords and clicked query data, the clicked query data including a merchant category;
a first determining module, configured to determine the weight of clicks on each merchant category from the training samples obtained by the acquisition module and the click counts of the corresponding clicked query data;
a second determining module, configured to determine the correspondence between the training samples and the merchant categories from the maximum click weight among the merchant categories determined by the first determining module, thereby completing the preset query classification model.
Optionally, the acquisition module comprises:
an acquisition submodule, configured to obtain, from historical log data, the historical query keywords within a preset time period and the corresponding clicked query data;
a determination submodule, configured to determine a merchant ID from the clicked query data obtained by the acquisition submodule, and to determine the merchant category from the merchant ID;
a storage submodule, configured to store the mapping among the historical query keyword, the merchant ID determined by the determination submodule, and the merchant category as one row;
an extraction submodule, configured to extract the historical query features of the historical query keyword according to the defined unigrams and/or n-grams, each historical query feature serving as one training sample.
Optionally, the extraction unit comprises:
a segmentation module, configured to segment the query keyword into tokens, each of which can serve as a query feature;
an extraction module, configured to extract the unigram query features and/or n-gram query features of the query keyword according to the tokens obtained by the segmentation module and the definition of the preset query classification model.
Optionally, the processing unit comprises:
a matching module, configured to match the query features against the historical query features in the preset query classification model;
an acquisition module, configured to obtain, when the matching module finds a match, the maximum weight corresponding to the matched historical query feature;
a computing module, configured to calculate the maximum entropy of the query features from the maximum weight and a preset feature function, the maximum weight being directly proportional to the maximum entropy.
Optionally, the device is applied to a food delivery platform, the food delivery platform containing query data of multiple categories.
In a third aspect, the present invention further provides a terminal for acquiring query data. The terminal comprises a processor and a memory, wherein the memory stores one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the steps of the method for acquiring query data according to any implementation of the first aspect.
In a fourth aspect, the present invention further provides a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the method for acquiring query data according to any implementation of the first aspect.
With the above technical solution, the method, device, and terminal for acquiring query data provided by the present invention extract, from the query keyword, the query features used to obtain the target query result, and input the query features into the preset query classification model for calculation to obtain the maximum entropy of the query features, wherein the preset query classification model is trained on training samples to obtain the maximum entropy corresponding to each query feature, and a query feature with maximum entropy corresponds to one query category. Based on the correspondence between maximum-entropy query features and query categories, the query category corresponding to the maximum entropy of the query features is determined, and the query results that belong to the same query category as the query keyword and are relevant to it are returned. Compared with the prior art, the present invention requires no manual participation, which saves labor cost, and the maximum entropy obtained by training the preset query classification model allows the target query result corresponding to the query keyword to be obtained accurately, improving the accuracy of the returned results.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and practiced according to the content of the specification, and so that the above and other objects, features, and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, identical components are denoted by the same reference numerals. In the drawings:
Fig. 1 shows a flow chart of a data query method provided by an embodiment of the present invention;
Fig. 2 shows a schematic diagram of an APP display interface provided by an embodiment of the present invention;
Fig. 3 shows a flow chart of another data query method provided by an embodiment of the present invention;
Fig. 4 shows a flow chart of building the preset query classification model provided by an embodiment of the present invention;
Fig. 5 shows a block diagram of a data query device provided by an embodiment of the present invention;
Fig. 6 shows a block diagram of another data query device provided by an embodiment of the present invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and its scope conveyed fully to those skilled in the art.
To improve the correctness of the target query results returned for a query intent while reducing labor cost, an embodiment of the present invention provides a data query method. The method can be applied to many kinds of platforms that provide services for at least two data types, for example online shopping malls, food delivery platforms, video playback platforms, and news browsing platforms. For ease of description, the following embodiments use a food delivery platform as the example. The platform includes catering, supermarket, medicine, flower and cake, and fresh fruit services, among others, and different service types correspond to different query categories. It should be noted that using a food delivery platform as the example is not intended to limit the embodiments of the present invention to food delivery platforms. As shown in Fig. 1, the method includes:
101. Extract, from the query keyword, the query features used to obtain the target query result.
A food delivery platform comprises a front end and a back end. The front end is generally the platform's application (APP), through which users browse, click, and obtain query data. The back end is generally the platform's server, which provides functions such as page updates and the return of query data. The method described in the embodiments of the present invention is applied to the back end. The server, acting as the terminal for acquiring query data, may be a standalone server or a relay server that exchanges data with the server; the embodiments of the present invention do not specifically limit the type of terminal that acquires the query data.
Fig. 2 shows a schematic diagram of an APP display interface provided by an embodiment of the present invention. When the user enters the query keyword "Beijing Haidilao" in the APP's input box and taps search, the back-end server receives the query keyword and extracts from it the query features used to obtain the target query result. A query feature is either the full content or part of the content of the query keyword. For example, if the query feature is the full content of the query keyword, one bigram "Beijing-Haidilao" is extracted from the keyword; if the query features are parts of the keyword, two unigram features, "Beijing" and "Haidilao", can be extracted.
In practice, the more query features are extracted from the query keyword, the fewer and the more precise the resulting target query results; the fewer query features are extracted, the more target query results are obtained. For example, if the query feature is the bigram "Beijing-Haidilao", a target query result only needs to contain "Beijing-Haidilao"; if the query features are the two unigrams "Beijing" and "Haidilao", the corresponding target query results are the union of the query results containing "Beijing" and the query results containing "Haidilao".
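The difference between the bigram case and the two-unigram case in the example above can be sketched as follows. The catalog entries and function names are hypothetical, and items are assumed to be space-separated lowercase strings.

```python
# Hypothetical catalog of searchable item titles.
CATALOG = [
    "beijing haidilao hot pot",
    "beijing roast duck",
    "shanghai haidilao hot pot",
]


def match_bigram(first, second, catalog):
    """Keep only items containing the two words as one adjacent phrase."""
    phrase = f"{first} {second}"
    return [item for item in catalog if phrase in item]


def match_unigrams(words, catalog):
    """Keep items containing any of the words (union of per-word results)."""
    return [item for item in catalog if any(w in item.split() for w in words)]
```

With this catalog, the bigram match returns only the first item, while the unigram match returns all three, since each item contains at least one of the two words.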
102. Input the query features into the preset query classification model for calculation to obtain the maximum entropy of the query features.
The preset query classification model is trained on training samples to obtain the maximum entropy corresponding to each query feature, and a query feature with maximum entropy corresponds to one query category. The training samples are determined from historical query keywords and clicked query data. The maximum entropy model (the preset query classification model) is obtained by machine learning on the training samples; its main purpose is to classify query data by category, and it does not require the query features to be independent of one another.
Entropy is a measure of randomness: the larger the entropy, the greater the randomness, and the smaller the entropy, the more ordered the data. In the embodiments of the present invention, after learning from the training samples, one query feature (determined from historical query keywords) may correspond to several categories of query data (determined from the categories of the clicked query data). The entropy differs among these categories, however: the larger the entropy, the higher the probability that query data of that category reflects the user's query intent, and the smaller the entropy, the lower that probability. Therefore, to identify the user's query intent more accurately, the query category corresponding to the maximum entropy can be used as the category of the returned results.
When the preset query classification model is trained, the entropy of each query category corresponding to a query feature is determined. When the model is used, inputting the query features obtained in step 101 yields the entropy of each query category for those features. For example, when the query feature is "Beijing-Haidilao", the entropy of the hot pot restaurant query category is 0.8, that of the supermarket query category is 0.5, and that of the takeout restaurant query category is 0.2, so the maximum entropy of the query feature "Beijing-Haidilao" in the preset query classification model is 0.8. The above is only an illustration; the embodiments of the present invention do not specifically limit the query features or the entropy values.
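Selecting the category with the largest value, as described above, amounts to an argmax over the per-category scores of a feature. The scores and the name `pick_category` below are hypothetical and serve only to illustrate the selection step.

```python
def pick_category(scores):
    """Return (category, score) for the largest score, or None if empty."""
    if not scores:
        return None
    return max(scores.items(), key=lambda kv: kv[1])


# Hypothetical per-category scores for one query feature.
example_scores = {
    "hot pot restaurant": 0.8,
    "supermarket": 0.5,
    "takeout restaurant": 0.2,
}
```

Calling `pick_category(example_scores)` selects the hot pot restaurant category for these scores.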
103. Based on the correspondence between maximum-entropy query features and query categories, determine the query category corresponding to the maximum entropy of the query features.
Besides the correspondence between maximum-entropy query features and query categories, the preset query classification model also records the query category corresponding to each entropy value of a query feature, as shown in Table 1. To speed up queries, once the maximum entropy of the query features has been determined in step 102, the query category can be determined directly, without determining the query category for every entropy value one by one. It should be noted that Table 1 is only an example given for ease of description. In practice, the entropy values of the query features and the way their correspondence with query categories is stored are not limited to a table, and the specific content is not limited to that shown in Table 1.
Table 1
104. Return the target query results that belong to the same query category as the query keyword and are relevant to it.
The database of the back-end server may record, under one query category, information on multiple products, stores, and branches. For example, when the target query category identified for the query keyword "Beijing-Haidilao" is hot pot restaurant, the server can use the user's current geographic location and/or the positive review rate of Haidilao stores to recommend multiple Haidilao-related target query results to the user, such as the merchant nearest to the user's current location together with the specific products offered by that merchant, thereby accurately identifying the user's query intent.
It should be noted that, when returning target query results, the embodiments of the present invention do not return only the results of the query category corresponding to the maximum-entropy query feature; the target query results of other query categories can also be returned. The results of the maximum-entropy category are simply ranked first and those of the other query categories are ranked after them, so as to meet the real query needs of different users. For example, when the query keyword is "Beijing-Haidilao", the top five returned results are Haidilao hot pot restaurants, the sixth and seventh are Haidilao hot pot soup base sold by supermarkets, and the eighth to tenth are Haidilao products from other merchants. The number and ranking of the returned target query results are not specifically limited.
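The ordering described above, with results from the top-scoring category first and other categories after, could be sketched as a stable sort on the category score. The result tuples, scores, and the name `rank_results` are hypothetical.

```python
def rank_results(results, category_scores):
    """Order results so that higher-scoring categories come first.

    results: list of (title, category) pairs.
    Categories missing from category_scores are treated as score 0.0.
    Python's sort is stable, so ties keep their original order.
    """
    return sorted(
        results,
        key=lambda r: category_scores.get(r[1], 0.0),
        reverse=True,
    )
```

For instance, with scores of 0.8 for hot pot restaurants and 0.5 for supermarkets, a hot pot branch is listed before a supermarket soup base, and items from unscored categories come last.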
As the above implementation shows, in the method for acquiring query data provided by the embodiments of the present invention, the terminal that acquires the query data extracts, from the query keyword, the query features used to obtain the target query result, and inputs the query features into the preset query classification model for calculation to obtain the maximum entropy of the query features, wherein the preset query classification model is trained on training samples to obtain the maximum entropy corresponding to each query feature, and a query feature with maximum entropy corresponds to one query category. Based on the correspondence between maximum-entropy query features and query categories, the query category corresponding to the maximum entropy of the query features is determined, and the query results that belong to the same query category as the query keyword and are relevant to it are returned. Compared with the prior art, the embodiments of the present invention require no manual participation, which saves labor cost, and the maximum entropy obtained by training the preset query classification model allows the target query result corresponding to the query keyword to be obtained accurately, improving the accuracy of the returned results.
The method for acquiring query data provided by the embodiments of the present invention is explained in more detail below, in particular its use on a food delivery platform, where the aim is to return accurate target query results for the data the user is looking for and thereby improve the user's query experience. The specific steps are shown in Fig. 3 and include:
201. Define the query features of the preset query classification model as unigrams and/or multi-word terms.
The purpose of defining the query features as unigrams or multi-word terms is to extract the query features used to obtain the target query result. Query features are extracted from the query keyword (or a historical query keyword) and depend on the string length of that keyword. Therefore, to accommodate query keywords of different string lengths, multiple extraction manners need to be defined when defining how query features are extracted.
A unigram is extracted by taking the whole query keyword as a single unit; multi-word terms are obtained by extracting two or more terms from the query keyword. The more terms there are, the more accurate the acquired query data, because every term must match the target query result; however, the number of matching target query results decreases accordingly.
To improve matching accuracy, when the string length of the query keyword allows it to be split into multiple segments, the corresponding query features may include one multi-word query feature and multiple unigram query features, so as to match more merchants. When the string length of the query keyword does not allow it to be split into multiple segments, the corresponding query feature is a single unigram.
202. Extract, according to the query keyword, the query features for obtaining the target query result.
Query features may be extracted in, but not limited to, the following manner: the query keyword is segmented into words, each of which can serve as a query feature; then, according to the segments and the definition of the preset query classification model, the unigram query features and/or multi-word query features of the query keyword are extracted.
For example, if the query keyword is "vitamin", segmenting it with any prior-art method yields the segment "vitamin", which cannot be split further, so the extracted query feature is "vitamin". If the query keyword is "Zhang Liang spicy soup", segmentation yields "Zhang Liang" and "spicy soup"; according to the definition of the preset query classification model, the query features may be one multi-word term "Zhang Liang-spicy soup" and two unigrams "Zhang Liang" and "spicy soup". The embodiment of the present invention does not limit the query keyword or the corresponding query features.
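The extraction in step 202 can be illustrated with a minimal Python sketch. It assumes the word segmenter has already run and that the multi-word feature is formed by joining the segments with a hyphen, as in the example above; the function name `extract_query_features` is hypothetical and not part of the embodiment.

```python
def extract_query_features(segments):
    """Return query features for an already-segmented query keyword.

    Assumption: `segments` comes from any prior-art word segmenter.
    Each segment becomes a unigram feature; if there is more than one
    segment, one combined multi-word feature is placed first.
    """
    segments = [s for s in segments if s]  # drop empty segments
    features = list(segments)              # unigram features
    if len(segments) > 1:
        features.insert(0, "-".join(segments))  # one multi-word feature
    return features
```

A keyword that cannot be split yields a single unigram, while a splittable keyword yields one multi-word feature followed by its unigrams.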
203. Input the query features into the preset query classification model for calculation to obtain the maximum entropy of the query features.
As described in step 102, the preset query classification model is trained on training samples to obtain the maximum entropy corresponding to each query feature, and the query feature with maximum entropy corresponds to one query category. In a specific implementation, after the query features are determined, they are matched against the historical query features in the preset query classification model, that is, the query features used when the model was trained. If the match succeeds, the maximum weight corresponding to the matched historical query feature is obtained, and the maximum entropy of the query features is calculated from that maximum weight and a preset feature function. If the match fails, the corresponding target query result is obtained from a database according to the query features or their similarity. Historical query features are determined from the historical query keywords in historical log data rather than generated at random by some rule, so they are practical and representative. In practice, the historical query features used when the preset query classification model is obtained by machine learning on training samples are comprehensive and commonly used, so the probability that a query feature fails to match any historical query feature in the model is small.
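The matching step can be sketched as a simple lookup. This is only an illustration under stated assumptions: the model is represented as a dictionary from historical query feature to a `(category, max_weight)` pair, and the name `match_features` is hypothetical. Features that are not found would be handled by the database fallback described above, which is not shown.

```python
def match_features(query_features, model):
    """Split query features into matched and unmatched ones.

    Assumption: `model` maps a historical query feature to a
    (category, max_weight) pair; this layout is illustrative only.
    """
    matched = {f: model[f] for f in query_features if f in model}
    unmatched = [f for f in query_features if f not in model]
    return matched, unmatched
```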
The maximum weight corresponding to a historical query feature is the maximum probability with which users, when querying with the historical query keyword, click a returned result (a certain query category). For the same historical query feature, the more often users click the same merchant category, the greater the probability and the weight, and the smaller the corresponding entropy (the entropy value corresponding to the maximum entropy is the smallest).
When calculating the maximum entropy of the query features from the maximum weight and the preset feature function, the following may be used: $\arg\max_{y} \sum_{i} w_i f_i(x, y)$, where $w_i$ denotes the weight of the i-th query feature, $f_i$ denotes the i-th feature function, $x$ denotes a training sample, and $y$ denotes a query category. As the formula shows, the maximum weight is proportional to the maximum entropy; the larger the weight, the smaller the entropy, which indicates a higher probability of clicking a given type of merchant. For a description of feature functions, refer to any explanation in the prior art; the embodiment of the present invention does not repeat it here.
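The decision rule above can be sketched in Python. The sketch assumes the common choice of indicator feature functions, where $f_i(x, y)$ is 1 when feature $i$ is present in the query and paired with category $y$, and 0 otherwise; the source leaves the feature function to the prior art, so this choice and the names `classify` and `weights` are assumptions.

```python
from collections import defaultdict


def classify(query_features, weights):
    """Return the category y maximizing sum_i w_i * f_i(x, y).

    Assumption: `weights` maps (feature, category) -> w_i, and f_i is
    an indicator that the feature occurs in the query for that category.
    Returns None when no query feature is known to the model.
    """
    scores = defaultdict(float)
    for (feature, category), w in weights.items():
        if feature in query_features:
            scores[category] += w
    if not scores:
        return None
    return max(scores, key=scores.get)
```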
204. Based on the correspondence between maximum-entropy query features and query categories, determine the query category corresponding to the maximum entropy of the query features.
The specific manner of determining the query category corresponding to the maximum entropy of the query features has been described in step 103 of the above embodiment and is not repeated here.
205. Return the target query results that belong to the same query category as the query keyword and are relevant to it.
The specific implementation of returning the target query results has been described in step 104 of the above embodiment and is not repeated here.
The embodiment of the present invention uses the machine-learning approach of the preset query classification model to solve the problem of recognizing user intent, without the manual dictionary curation required in the prior art, which saves resources and avoids the loss of accuracy caused by manual intervention. The embodiment of the present invention also provides a method for building the preset query classification model, as shown in Fig. 4, including:
301. Obtain training samples, where the training samples are determined from historical query keywords and click query data, and the click query data includes merchant categories.
In a specific implementation of this step, the training samples are obtained as follows:
(1) Obtain the historical query keywords within a preset time period and the corresponding click query data from historical log data. When a preset query classification model is built by machine learning, the first step is to label the data to be learned. In the embodiment of the present invention, users' click query data serves as the labeled data, which turns the recognition of user intent into the problem of determining the query category. Besides historical query keywords and the corresponding click query data, historical log data also contains information such as user account information, browsing time, click time, and link jumps, so the historical query keywords and the corresponding click query data need to be extracted from the historical log data.
In theory, when learning from the historical query keywords and corresponding click query data in historical log data, the longer the period the log data covers, the more accurate the statistical result, but the greater the amount of calculation. In practice, the historical query keywords and corresponding click query data from 1 or 2 consecutive months of historical log data may be used for learning; the embodiment of the present invention does not limit this.
In the embodiment of the present invention, click query data may be data generated by a user's click to view a merchant or to view a specific commodity on a merchant page, or click data such as following a merchant or adding a commodity to the shopping cart.
(2) Determine the merchant ID according to the click query data, and determine the merchant category according to the merchant ID.
The merchant ID is obtained from the merchant information database according to the click query data and uniquely identifies a merchant on the takeout platform. The merchant category is obtained from the merchant ID; merchant categories include catering, flowers, pharmacies, supermarkets, fresh fruit, and so on.
(3) Store the mapping among the historical query keyword, the merchant ID, and the merchant category, recorded as one row.
A mapping among the historical query keyword, the merchant ID, and the merchant category is established and stored. During training, multiple historical query keywords, merchant IDs, and merchant categories are collected; therefore, to keep them distinct, each mapping is stored as a separate row for later use.
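Sub-steps (1) to (3) can be sketched as follows. The log entry fields (`keyword`, `merchant_id`), the in-memory `merchant_categories` table standing in for the merchant information database, and the function name `build_rows` are all hypothetical; the source does not specify a log format or storage layout.

```python
def build_rows(click_log, merchant_categories):
    """Build (keyword, merchant_id, category) rows from click log entries.

    Assumptions: each log entry is a dict with "keyword" and
    "merchant_id" keys, and `merchant_categories` maps a merchant ID to
    its category. Entries for unknown merchants are skipped.
    """
    rows = []
    for entry in click_log:
        merchant_id = entry["merchant_id"]
        category = merchant_categories.get(merchant_id)
        if category is None:
            continue  # merchant not found in the information table
        rows.append((entry["keyword"], merchant_id, category))  # one row per mapping
    return rows
```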
(4) Extract the historical query features of the historical query keyword according to the defined unigrams and/or multi-word terms, with each historical query feature serving as one training sample.
The method of extracting historical query features is the same as the method of extracting query features in step 202; for details, refer to the description of step 202.
302. Determine the weight of clicking each merchant category according to the training samples and the number of clicks in the corresponding click query data.
For the same training sample (historical query feature), the merchant categories in the corresponding click query data may differ. For example, for one training sample the click query data covers 4 merchants: merchant A is clicked 20 times, merchant B 200 times, merchant C 56 times, and merchant D 24 times. It follows that, under the same query feature, the probability that a user clicks merchant B is relatively high.
In one possible implementation of the embodiment of the present invention, the weight of a historical query feature is calculated by dividing the number of clicks on a given merchant by the total number of merchant clicks under the same training sample. In the example above, the total number of merchant clicks under the training sample is 300, so the weight of clicking merchant B is 200/300 ≈ 0.67, the weight of clicking merchant A is 20/300 ≈ 0.067, and so on; the weights of clicking the different merchants under the same training sample sum to 1.
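The weight calculation in step 302 is a normalization of click counts, which a short Python sketch makes concrete. The function name `click_weights` is hypothetical, and returning an empty dictionary for a sample with no clicks is an assumption, since the source does not discuss that case.

```python
def click_weights(click_counts):
    """Normalize per-merchant click counts into weights that sum to 1.

    `click_counts` maps a merchant (or merchant category) to its number
    of clicks under one training sample.
    """
    total = sum(click_counts.values())
    if total == 0:
        return {}  # assumption: no clicks means no weights
    return {key: count / total for key, count in click_counts.items()}
```

With the counts from the example above (A: 20, B: 200, C: 56, D: 24), merchant B receives the largest weight, 200/300.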
303. Determine the correspondence between the training samples and the merchant categories according to the maximum weight of clicking each merchant category, completing the construction of the preset query classification model.
After the weight of clicking each merchant category is calculated in step 302, the maximum weight is obtained, and a correspondence is established between the merchant category with the maximum weight and the training sample (historical query feature). In practice there are multiple historical query features to be learned; the weight of clicking each merchant category is calculated for each training sample (historical query feature) in turn by the above method and the maximum weight is determined, completing the construction of the preset query classification model.
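Steps 302 and 303 together can be sketched as a single pass over the training samples. The nested-dictionary input layout and the name `build_model` are assumptions made for illustration; the sketch keeps, for each historical query feature, only the merchant category with the maximum weight.

```python
def build_model(click_counts_by_feature):
    """Map each historical query feature to (best_category, max_weight).

    Assumption: `click_counts_by_feature` maps a feature to a dict of
    {merchant_category: click_count}. Features with no clicks are skipped.
    """
    model = {}
    for feature, counts in click_counts_by_feature.items():
        total = sum(counts.values())
        if total == 0:
            continue
        best = max(counts, key=counts.get)             # category clicked most often
        model[feature] = (best, counts[best] / total)  # its normalized weight
    return model
```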
Further, as an implementation of the above method embodiments, another embodiment of the present invention provides an apparatus for acquiring query data. This apparatus embodiment corresponds to the foregoing method embodiments; for ease of reading, it does not repeat the details of the method embodiments one by one, but it should be understood that the apparatus in this embodiment can implement everything in the foregoing method embodiments. As shown in Fig. 5, the apparatus includes:
Extraction unit 41, configured to extract, according to the query keyword, the query features for obtaining the target query result. When query features are extracted from the query keyword, the more query features are extracted, the fewer and more precise the target query results obtained; the fewer query features are extracted, the more target query results are obtained;
Processing unit 42, configured to input the query features obtained by the extraction unit 41 into the preset query classification model for calculation to obtain the maximum entropy of the query features, where the preset query classification model is trained on training samples to obtain the maximum entropy corresponding to each query feature, and the query feature with maximum entropy corresponds to one query category. A maximum entropy model (the preset query classification model) is obtained by machine learning on training samples; its main purpose is to classify query data by category, and it does not require independence between query features. Entropy is a parameter describing randomness: the larger the entropy, the stronger the randomness; the smaller the entropy, the more ordered. In the embodiment of the present invention, through learning from training samples, one query feature (determined from a historical query keyword) may correspond to multiple categories of query data (determined from the categories of the click query data), but the entropy differs between those categories: the larger the entropy, the higher the probability that query data of that category matches the user's query intent, and the smaller the entropy, the lower that probability. Therefore, to improve the accuracy of recognizing the user's query intent, the query category corresponding to the maximum entropy can be used as the category of the returned results;
Determination unit 43, configured to determine the query category corresponding to the maximum entropy of the query features, based on the correspondence between the maximum-entropy query features obtained by the processing unit 42 and the query categories;
Returning unit 44, configured to return the target query results that belong to the same query category as the query keyword and are relevant to it. When returning target query results, the embodiment of the present invention does not return only the target query results of the query category corresponding to the maximum-entropy query feature; target query results of other query categories may also be returned, with the results corresponding to the maximum entropy ranked first and the results of other query categories ranked after them, so as to meet the actual query needs of different users.
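The ranking behavior of the returning unit can be sketched with a stable sort. The result layout (a dict with a `category` key) and the name `rank_results` are hypothetical; the sketch only moves results of the maximum-entropy category to the front and otherwise preserves the incoming order.

```python
def rank_results(results, top_category):
    """Place results of `top_category` first, keeping relative order.

    Assumption: each result is a dict with a "category" key. Python's
    sorted() is stable, and False sorts before True, so results in the
    top category come first without reordering within either group.
    """
    return sorted(results, key=lambda r: r["category"] != top_category)
```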
Further, as shown in Fig. 6, the apparatus further includes:
Definition unit 45, configured to define the query features of the preset query classification model as unigrams and/or multi-word terms before the processing unit 42 inputs the query features into the preset query classification model for calculation to obtain the maximum entropy of the query features. Query features are extracted from the query keyword (or a historical query keyword) and depend on the string length of that keyword; therefore, to accommodate query keywords of different string lengths, multiple extraction manners need to be defined when defining how query features are extracted;
Construction unit 46, configured to build the preset query classification model.
The embodiment of the present invention uses the machine-learning approach of the preset query classification model to solve the problem of recognizing user intent, without the manual dictionary curation required in the prior art, which saves resources and avoids the loss of accuracy caused by manual intervention. As shown in Fig. 6, the construction unit 46 includes:
Acquisition module 461, configured to obtain training samples, where the training samples are determined from historical query keywords and click query data, and the click query data includes merchant categories;
First determining module 462, configured to determine the weight of clicking each merchant category according to the training samples obtained by the acquisition module 461 and the number of clicks in the corresponding click query data;
Second determining module 463, configured to determine the correspondence between the training samples and the merchant categories according to the maximum weight of clicking each merchant category determined by the first determining module 462, completing the construction of the preset query classification model.
Further, as shown in Fig. 6, the acquisition module 461 includes:
Acquisition submodule 4611, configured to obtain the historical query keywords within a preset time period and the corresponding click query data from historical log data;
Determination submodule 4612, configured to determine the merchant ID according to the click query data obtained by the acquisition submodule 4611, and to determine the merchant category according to the merchant ID;
Storage submodule 4613, configured to store the mapping among the historical query keyword, the merchant ID, and the merchant category determined by the determination submodule 4612, recorded as one row;
Extraction submodule 4614, configured to extract the historical query features of the historical query keyword according to the defined unigrams and/or multi-word terms, with each historical query feature serving as one training sample.
Further, as shown in Fig. 6, the extraction unit 41 includes:
Word segmentation module 411, configured to segment the query keyword into words, each of which can serve as a query feature;
Extraction module 412, configured to extract the unigram query features and/or multi-word query features of the query keyword according to the segments obtained by the word segmentation module 411 and the definition of the preset query classification model.
Further, as shown in Fig. 6, the processing unit 42 includes:
Matching module 421, configured to match the query features against the historical query features in the preset query classification model;
Acquisition module 422, configured to obtain the maximum weight corresponding to the matched historical query feature when the matching module 421 matches successfully;
Computing module 423, configured to calculate the maximum entropy of the query features from the maximum weight and a preset feature function, where the maximum weight is proportional to the maximum entropy.
Further, the apparatus is applied to a takeout platform, and the takeout platform contains query data of multiple categories.
Further, the embodiment of the present invention also provides a terminal for acquiring query data. The terminal includes a processor and a memory, where the memory is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the steps of the method for acquiring query data shown in Fig. 1, Fig. 3, or Fig. 4.
Further, the embodiment of the present invention also provides a computer-readable storage medium storing computer instructions that, when executed by a processor, implement the steps of the method for acquiring query data shown in Fig. 1, Fig. 3, or Fig. 4.
As can be seen from the above implementation, with the apparatus and terminal for acquiring query data provided by the embodiment of the present invention, after the terminal extracts the query features for obtaining a target query result from the query keyword, it inputs the query features into a preset query classification model for calculation and obtains the maximum entropy of the query features. The preset query classification model is trained on training samples to obtain the maximum entropy corresponding to each query feature, and the query feature with maximum entropy corresponds to one query category. Based on the correspondence between maximum-entropy query features and query categories, the query category corresponding to the maximum entropy of the query features is determined, and the target query results that belong to the same query category as the query keyword and are relevant to it are returned. Compared with the prior art, the embodiment of the present invention requires no manual participation, which saves labor cost, and the maximum entropy obtained by training the preset query classification model allows the target query result corresponding to the query keyword to be obtained accurately, improving the accuracy of the returned results.
The embodiment of the present invention also discloses the following schemes:
A1. A method for acquiring query data, including:
extracting, according to a query keyword, query features for obtaining a target query result;
inputting the query features into a preset query classification model for calculation to obtain the maximum entropy of the query features, where the preset query classification model is trained on training samples to obtain the maximum entropy corresponding to each query feature, and the query feature with maximum entropy corresponds to one query category;
determining, based on the correspondence between maximum-entropy query features and query categories, the query category corresponding to the maximum entropy of the query features;
returning the target query results that belong to the same query category as the query keyword and are relevant to it.
A2. The method according to A1, where before inputting the query features into the preset query classification model for calculation to obtain the maximum entropy of the query features, the method further includes:
defining the query features of the preset query classification model as unigrams and/or multi-word terms;
building the preset query classification model.
A3. The method according to A2, where building the preset query classification model includes:
obtaining training samples, where the training samples are determined from historical query keywords and click query data, and the click query data includes merchant categories;
determining the weight of clicking each merchant category according to the training samples and the number of clicks in the corresponding click query data;
determining the correspondence between the training samples and the merchant categories according to the maximum weight of clicking each merchant category, completing the construction of the preset query classification model.
A4. The method according to A3, where obtaining training samples includes:
obtaining the historical query keywords within a preset time period and the corresponding click query data from historical log data;
determining the merchant ID according to the click query data, and determining the merchant category according to the merchant ID;
storing the mapping among the historical query keyword, the merchant ID, and the merchant category, recorded as one row;
extracting the historical query features of the historical query keyword according to the defined unigrams and/or multi-word terms, with each historical query feature serving as one training sample.
A5. The method according to A2, where extracting, according to the query keyword, the query features for obtaining the target query result includes:
segmenting the query keyword into words, each of which can serve as a query feature;
extracting the unigram query features and/or multi-word query features of the query keyword according to the segments and the definition of the preset query classification model.
A6. The method according to A3, where inputting the query features into the preset query classification model for calculation to obtain the maximum entropy of the query features includes:
matching the query features against the historical query features in the preset query classification model;
if the match succeeds, obtaining the maximum weight corresponding to the matched historical query feature;
calculating the maximum entropy of the query features from the maximum weight and a preset feature function, where the maximum weight is proportional to the maximum entropy.
A7. The method according to any one of A1 to A6, where the method is applied to a takeout platform, and the takeout platform contains query data of multiple categories.
B8. An apparatus for acquiring query data, including:
an extraction unit, configured to extract, according to a query keyword, query features for obtaining a target query result;
a processing unit, configured to input the query features obtained by the extraction unit into a preset query classification model for calculation to obtain the maximum entropy of the query features, where the preset query classification model is trained on training samples to obtain the maximum entropy corresponding to each query feature, and the query feature with maximum entropy corresponds to one query category;
a determination unit, configured to determine the query category corresponding to the maximum entropy of the query features, based on the correspondence between the maximum-entropy query features obtained by the processing unit and the query categories;
a returning unit, configured to return the target query results that belong to the same query category as the query keyword and are relevant to it.
B9. The apparatus according to B8, further including:
a definition unit, configured to define the query features of the preset query classification model as unigrams and/or multi-word terms before the processing unit inputs the query features into the preset query classification model for calculation to obtain the maximum entropy of the query features;
a construction unit, configured to build the preset query classification model.
B10. The apparatus according to B9, where the construction unit includes:
an acquisition module, configured to obtain training samples, where the training samples are determined from historical query keywords and click query data, and the click query data includes merchant categories;
a first determining module, configured to determine the weight of clicking each merchant category according to the training samples obtained by the acquisition module and the number of clicks in the corresponding click query data;
a second determining module, configured to determine the correspondence between the training samples and the merchant categories according to the maximum weight of clicking each merchant category determined by the first determining module, completing the construction of the preset query classification model.
B11. The apparatus according to B10, where the acquisition module includes:
an acquisition submodule, configured to obtain the historical query keywords within a preset time period and the corresponding click query data from historical log data;
a determination submodule, configured to determine the merchant ID according to the click query data obtained by the acquisition submodule, and to determine the merchant category according to the merchant ID;
a storage submodule, configured to store the mapping among the historical query keyword, the merchant ID, and the merchant category determined by the determination submodule, recorded as one row;
an extraction submodule, configured to extract the historical query features of the historical query keyword according to the defined unigrams and/or multi-word terms, with each historical query feature serving as one training sample.
B12. The apparatus according to B9, where the extraction unit includes:
a word segmentation module, configured to segment the query keyword into words, each of which can serve as a query feature;
an extraction module, configured to extract the unigram query features and/or multi-word query features of the query keyword according to the segments obtained by the word segmentation module and the definition of the preset query classification model.
B13. The apparatus according to B10, where the processing unit includes:
a matching module, configured to match the query features against the historical query features in the preset query classification model;
an acquisition module, configured to obtain the maximum weight corresponding to the matched historical query feature when the matching module matches successfully;
a computing module, configured to calculate the maximum entropy of the query features from the maximum weight and a preset feature function, where the maximum weight is proportional to the maximum entropy.
B14. The apparatus according to any one of B8 to B13, where the apparatus is applied to a takeout platform, and the takeout platform contains query data of multiple categories.
C15. A terminal for acquiring query data, the terminal including a processor and a memory, where the memory is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the steps of the method for acquiring query data according to any one of A1 to A7.
D16. A computer-readable storage medium storing computer instructions that, when executed by a processor, implement the steps of the method for acquiring query data according to any one of A1 to A7.
In the above-described embodiments, the description to each embodiment all emphasizes particularly on different fields, and does not have the portion being described in detail in some embodiment Point, it may refer to the associated description of other embodiment.
It is understood that the correlated characteristic in the above method and device can be referred to mutually.In addition, in above-described embodiment " first ", " second " etc. be to be used to distinguish each embodiment, and do not represent the quality of each embodiment.
It is apparent to those skilled in the art that for convenience and simplicity of description, the system of foregoing description, The specific work process of device and unit, may be referred to the corresponding process in preceding method embodiment, details are not described herein.
Algorithm and display be not inherently related to any certain computer, virtual system or miscellaneous equipment provided herein. Various general-purpose systems can also be used together with teaching based on this.As described above, required by constructing this kind of system Structure be obvious.In addition, the present invention is not also directed to any certain programmed language.It should be understood that it can utilize various Programming language realizes the content of invention described herein, and the description done above to language-specific is to disclose this hair Bright preferred forms.
Numerous specific details are set forth in the specification provided here. It should be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure or description thereof in the above description of exemplary embodiments of the invention. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. Therefore, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components in an embodiment may be combined into one module or unit or component, and may furthermore be divided into a plurality of sub-modules or sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, an equivalent or a similar purpose.
Furthermore, those skilled in the art will appreciate that although some embodiments described herein include some features included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any one of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the method, device and terminal for acquiring query data according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such a signal may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second and third does not indicate any order. These words may be interpreted as names.

Claims (10)

  1. A method for acquiring query data, characterized by comprising:
    extracting, according to a query keyword, query features for obtaining a target query result;
    inputting the query features into a preset query classification model for calculation to obtain the maximum entropy of the query features, wherein the preset query classification model is trained on training examples to obtain the maximum entropy corresponding to each query feature, and a query feature of maximum entropy corresponds to one query category;
    determining, based on the correspondence between query features of maximum entropy and query categories, the query category corresponding to the maximum entropy of the query features;
    returning target query results that belong to the same query category and are relevant to the query keyword.
  2. The method according to claim 1, characterized in that, before inputting the query features into the preset query classification model for calculation to obtain the maximum entropy of the query features, the method further comprises:
    defining the query features of the preset query classification model as unigrams and/or n-grams;
    building the preset query classification model.
  3. The method according to claim 2, characterized in that building the preset query classification model comprises:
    obtaining training examples, the training examples being determined from historical query keywords and clicked query data, the clicked query data including merchant categories;
    determining, from the training examples and the number of clicks on the corresponding clicked query data, the weight of each clicked merchant category;
    determining, according to the maximum of the weights of the clicked merchant categories, the correspondence between the training examples and the merchant categories, thereby completing the building of the preset query classification model.
  4. The method according to claim 3, characterized in that obtaining training examples comprises:
    obtaining, from history log data, the historical query keywords within a preset time period and the corresponding clicked query data;
    determining a merchant identifier (ID) from the clicked query data, and determining the merchant category from the merchant ID;
    storing the mapping among the historical query keyword, the merchant ID and the merchant category as one row;
    extracting historical query features of the historical query keywords according to the defined unigrams and/or n-grams, each historical query feature serving as one training example.
  5. The method according to claim 2, characterized in that extracting, according to a query keyword, query features for obtaining a target query result comprises:
    segmenting the query keyword into words, each segmented word being usable as a query feature;
    extracting unigram query features and/or n-gram query features of the query keyword according to the segmented words and the definition of the preset query classification model.
  6. The method according to claim 3, characterized in that inputting the query features into the preset query classification model for calculation to obtain the maximum entropy of the query features comprises:
    matching the query features against the historical query features in the preset query classification model;
    if the match succeeds, obtaining the maximum weight corresponding to the matched historical query feature;
    calculating the maximum entropy of the query features from the maximum weight and a preset feature function, the maximum weight being directly proportional to the maximum entropy.
  7. The method according to any one of claims 1 to 6, characterized in that the method is applied to a takeout platform, the takeout platform containing query data of multiple categories.
  8. A device for acquiring query data, characterized by comprising:
    an extraction unit, configured to extract, according to a query keyword, query features for obtaining a target query result;
    a processing unit, configured to input the query features obtained by the acquiring unit into a preset query classification model for calculation to obtain the maximum entropy of the query features, wherein the preset query classification model is trained on training examples to obtain the maximum entropy corresponding to each query feature, and a query feature of maximum entropy corresponds to one query category;
    a determination unit, configured to determine, based on the correspondence between the query features of the maximum entropy obtained by the processing unit and query categories, the query category corresponding to the maximum entropy of the query features;
    a returning unit, configured to return target query results that belong to the same query category and are relevant to the query keyword.
  9. A terminal for acquiring query data, characterized in that the terminal comprises a processor and a memory, wherein the memory is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the steps of the method for acquiring query data according to any one of claims 1 to 7.
  10. A computer-readable storage medium having computer instructions stored thereon, characterized in that the computer instructions, when executed by a processor, implement the steps of the method for acquiring query data according to any one of claims 1 to 7.
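
For illustration only, the flow recited in claims 1 to 6 can be sketched in Python as follows. This is a simplified sketch under stated assumptions, not the applicant's implementation: the function names (extract_features, build_weights, classify), the use of click share as the feature weight, and the softmax normalization standing in for the maximum entropy calculation are all hypothetical choices, and the query is assumed to be already segmented into words.

```python
import math
from collections import Counter, defaultdict


def extract_features(tokens, max_n=2):
    """Return unigram and n-gram features (up to max_n words) of a segmented query."""
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


def build_weights(click_log, max_n=2):
    """Build per-feature category weights from historical clicks.

    click_log is an iterable of (tokens, merchant_category, clicks) rows.
    Assumption: the weight of a category for a feature is its share of clicks.
    """
    counts = defaultdict(Counter)
    for tokens, category, clicks in click_log:
        for feat in extract_features(tokens, max_n):
            counts[feat][category] += clicks
    weights = {}
    for feat, per_cat in counts.items():
        total = sum(per_cat.values())
        weights[feat] = {cat: n / total for cat, n in per_cat.items()}
    return weights


def classify(tokens, weights, max_n=2):
    """Score each category by summed feature weights and normalize with softmax.

    Returns (best_category, probabilities), or (None, {}) if no feature matches.
    """
    categories = {cat for per_cat in weights.values() for cat in per_cat}
    scores = {cat: 0.0 for cat in categories}
    matched = False
    for feat in extract_features(tokens, max_n):
        for cat, w in weights.get(feat, {}).items():
            scores[cat] += w
            matched = True
    if not matched:
        return None, {}
    z = sum(math.exp(s) for s in scores.values())
    probs = {cat: math.exp(s) / z for cat, s in scores.items()}
    return max(probs, key=probs.get), probs


# Hypothetical click log: (segmented query, merchant category, click count).
click_log = [
    (["spicy", "hotpot"], "hotpot", 8),
    (["spicy", "noodles"], "noodles", 2),
    (["milk", "tea"], "drinks", 5),
]
weights = build_weights(click_log)
category, probs = classify(["spicy", "hotpot"], weights)
print(category, probs)
```

In this sketch a feature that was clicked mostly under one merchant category pushes the query toward that category, which mirrors the claim that a larger weight yields a larger score; results would then be filtered to the returned category.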
CN201711243962.XA 2017-11-30 2017-11-30 Inquire about acquisition methods, device and the terminal of data Pending CN107958406A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711243962.XA CN107958406A (en) 2017-11-30 2017-11-30 Inquire about acquisition methods, device and the terminal of data

Publications (1)

Publication Number Publication Date
CN107958406A true CN107958406A (en) 2018-04-24

Family

ID=61962766

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711243962.XA Pending CN107958406A (en) 2017-11-30 2017-11-30 Inquire about acquisition methods, device and the terminal of data

Country Status (1)

Country Link
CN (1) CN107958406A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101887436A (en) * 2009-05-12 2010-11-17 阿里巴巴集团控股有限公司 Retrieval method, device and system
CN102033877A (en) * 2009-09-27 2011-04-27 阿里巴巴集团控股有限公司 Search method and device
EP2583190A1 (en) * 2010-06-18 2013-04-24 Alibaba Group Holding Limited Determining and using search term weightings
CN103207881A (en) * 2012-01-17 2013-07-17 阿里巴巴集团控股有限公司 Query method and unit

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
中国索引学会主办等: "《中国索引 第2辑》", 30 September 2017, 复旦大学出版社 *
王爱功等: "《数字图书档案技术》", 31 May 2006, 时代文艺出版社 *
赵欣: ""基于最大熵的中文术语抽取系统的设计与实现"", 《中国优秀硕士学位论文全文数据库(信息科技辑)》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108763200A (en) * 2018-05-15 2018-11-06 达而观信息科技(上海)有限公司 Chinese word cutting method and device
CN109657145A (en) * 2018-12-20 2019-04-19 拉扎斯网络科技(上海)有限公司 Trade company's searching method and device, electronic equipment and computer readable storage medium
CN111858922A (en) * 2019-09-24 2020-10-30 北京嘀嘀无限科技发展有限公司 Service side information query method and device, electronic equipment and storage medium
CN110909231A (en) * 2019-11-15 2020-03-24 上海易点时空网络有限公司 Method and system for inquiring violation data and storage medium
CN111061835A (en) * 2019-12-17 2020-04-24 医渡云(北京)技术有限公司 Query method and device, electronic equipment and computer readable storage medium
CN111061835B (en) * 2019-12-17 2023-09-22 医渡云(北京)技术有限公司 Query method and device, electronic equipment and computer readable storage medium
CN111831867A (en) * 2020-04-02 2020-10-27 北京嘀嘀无限科技发展有限公司 Address query method and device, electronic equipment and computer readable storage medium
CN112035599A (en) * 2020-11-06 2020-12-04 苏宁金融科技(南京)有限公司 Query method and device based on vertical search, computer equipment and storage medium
CN113342983A (en) * 2021-06-30 2021-09-03 中国平安人寿保险股份有限公司 Resume distribution method, device and equipment based on machine learning and storage medium
CN113342983B (en) * 2021-06-30 2023-02-07 中国平安人寿保险股份有限公司 Resume distribution method, device and equipment based on machine learning and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Building N3, building 12, No. 27, Chengzhong Road, Jiancai, Haidian District, Beijing 100096

Applicant after: Beijing Xingxuan Technology Co.,Ltd.

Address before: Room 202, 2 floors, 1-3 floors, No. 11 Shangdi Information Road, Haidian District, Beijing 100085

Applicant before: Beijing Xiaodu Information Technology Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20180424
