CN105930362B - Search target identification method, device and terminal - Google Patents

Search target identification method, device and terminal

Info

Publication number
CN105930362B
Authority
CN
China
Prior art keywords
dictionary
search
template
keyword
multiple target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610224273.3A
Other languages
Chinese (zh)
Other versions
CN105930362A (en
Inventor
汤奇峰
王万宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZAMPLUS ADVERTISING (SHANGHAI) CO Ltd
Original Assignee
ZAMPLUS ADVERTISING (SHANGHAI) CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZAMPLUS ADVERTISING (SHANGHAI) CO Ltd
Priority to CN201610224273.3A
Publication of CN105930362A
Application granted
Publication of CN105930362B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Abstract

A search target identification method, device and terminal. The search target identification method includes: segmenting a search text according to a known dictionary library to obtain multiple target keywords, where the dictionary library includes multiple dictionaries and the target keywords correspond to those dictionaries; according to the dictionaries to which the target keywords correspond, matching at least some of the target keywords against a default dictionary template and converting them to obtain a portmanteau word (that is, a combined term); and forming a search keyword from the portmanteau word and the target keywords that were not matched and converted, the search keyword corresponding to the search target. The dictionary template is a combination of one or more dictionaries in the dictionary library. The technical solution of the invention improves the efficiency of search target identification.

Description

Search target identification method, device and terminal
Technical field
The present invention relates to the field of Internet technology, and in particular to a search target identification method, device and terminal.
Background art
With the rise of the search economy, people pay increasing attention to characteristics such as the performance, technology and daily traffic of the world's major search engines, so the bottlenecks of search engines and the improvement of their performance have gradually become a research hotspot. On the Internet, users habitually search the web with a general-purpose search engine: they enter a piece of text and obtain information relevant to it.
In the prior art, in certain product-search industries, for example in a vertical search engine (Vertical Search Engine) for the automobile industry or the search module of an automobile website, the interface offered to the user is structured. When searching, the user filters step by step through the structured interface and then submits the query to the search engine. The user submits query keywords to the search engine, usually in natural language, such as "MP3". Internally, however, the search engine accepts or recognizes only query strings in a specific format (that is, machine language), for example q=[MP3] OR [MP3 recording pen] AND [MP3 earphone] &filter:customer_id=54321&sorttype:+update_time&count=12. The user's query keywords must therefore be processed to generate a query string that the search engine can recognize internally. At present, in most search engines the front-end system generates templates in advance according to the query-string logic of each application type or business type, and then, in actual use, applies the template relevant to the entered query keywords to generate the search engine query string. In addition, emerging customer service robots can simulate human-machine dialogue and carry out target searches through voice interaction.
However, when using such a vertical search engine, the user has to click through several rounds of filtering before the final search intent is matched, which is inconvenient. Moreover, when a customer service robot searches for a target, search keywords can be combined with little flexibility, which lowers search efficiency and gives a poor user experience.
Summary of the invention
The technical problem solved by the present invention is how to improve the efficiency of search target identification.
To solve the above technical problem, an embodiment of the present invention provides a search target identification method. The method includes: segmenting a search text according to a known dictionary library to obtain multiple target keywords, where the dictionary library includes multiple dictionaries and the target keywords correspond to those dictionaries; according to the dictionaries to which the target keywords correspond, matching at least some of the target keywords against a default dictionary template and converting them to obtain a portmanteau word; and forming at least part of a search keyword from the portmanteau word and the target keywords that were not matched and converted, the search keyword corresponding to the search target. The dictionary template is a combination of one or more dictionaries in the dictionary library.
Optionally, the search target identification method further includes: forming, from the search keyword, a query statement suited to the search engine.
Optionally, the query statement includes one or more of the following: an SQL search statement and a Lucene query statement.
Optionally, matching at least some of the target keywords against the default dictionary template and converting them includes: determining the matching default dictionary template according to the dictionaries to which at least some of the target keywords correspond and the order in which those keywords are arranged; and combining the target keywords that match the default dictionary template to form the portmanteau word.
Optionally, the default dictionary template includes a range template formed from a range dictionary and a unit dictionary, and matching at least some of the target keywords against the default dictionary template and converting them according to the dictionaries to which the target keywords correspond includes: matching the target keywords that correspond to the range dictionary and the unit dictionary against the range template, and combining and converting them according to the range template to form a combination number (a combined numeric value).
Optionally, the dictionary library includes one or more of the following: a unit dictionary, a range dictionary, a color dictionary and a brand dictionary.
Optionally, the brand dictionary further includes brand synonyms or brand pinyin.
To solve the above technical problem, an embodiment of the invention also discloses a search target identification device. The device includes: a word segmentation unit, adapted to segment a search text according to a known dictionary library to obtain multiple target keywords, where the dictionary library includes multiple dictionaries and the target keywords correspond to those dictionaries; a conversion unit, adapted to match at least some of the target keywords against a default dictionary template and convert them, according to the dictionaries to which the target keywords correspond, to obtain a portmanteau word; and a combination unit, adapted to form at least part of a search keyword from the portmanteau word and the target keywords that were not matched and converted, the search keyword corresponding to the search target. The dictionary template is a combination of one or more dictionaries in the dictionary library.
Optionally, the search target identification device further includes: a query statement forming unit, adapted to form, from the search keyword, a query statement suited to the search engine.
Optionally, the query statement includes one or more of the following: an SQL search statement and a Lucene query statement.
Optionally, the conversion unit includes: a determination subunit, adapted to determine the matching default dictionary template according to the dictionaries to which at least some of the target keywords correspond and the order in which those keywords are arranged; and a conversion subunit, adapted to combine the target keywords that match the default dictionary template to form the portmanteau word.
Optionally, the default dictionary template includes a range template formed from a range dictionary and a unit dictionary, and the conversion unit includes: a range conversion subunit, adapted to match the target keywords that correspond to the range dictionary and the unit dictionary against the range template, and to combine and convert them according to the range template to form a combination number.
Optionally, the dictionary library includes one or more of the following: a unit dictionary, a range dictionary, a color dictionary and a brand dictionary.
Optionally, the brand dictionary further includes brand synonyms or brand pinyin.
To solve the above technical problem, an embodiment of the invention also discloses a terminal that includes the search target identification device.
Compared with the prior art, the technical solution of the embodiments of the present invention has the following advantages:
The present invention segments a search text according to a known dictionary library to obtain multiple target keywords, where the dictionary library includes multiple dictionaries and the target keywords correspond to those dictionaries; according to the dictionaries to which the target keywords correspond, it matches at least some of the target keywords against a default dictionary template and converts them to obtain a portmanteau word; and it forms at least part of a search keyword from the portmanteau word and the target keywords that were not matched and converted, the search keyword corresponding to the search target. The dictionary template is a combination of one or more dictionaries in the dictionary library. Because the target keywords obtained by segmentation are matched against the default dictionary template and converted, and the result is combined with the unmatched target keywords to form the search keyword or part of it, the search keyword used in the search expresses the search text more accurately. This improves the accuracy of search target identification and, in turn, the efficiency and accuracy of the search results.
Further, matching at least some of the target keywords against the default dictionary template and converting them includes: determining the matching default dictionary template according to the dictionaries to which at least some of the target keywords correspond and the order in which those keywords are arranged; and combining the target keywords that match the default dictionary template to form the portmanteau word. Determining the default dictionary template from the dictionaries and order of at least some of the target keywords, and forming the portmanteau word from them, amounts to recognizing the intent of the text the user entered. The user's unstructured search is thereby converted into a structured search, which improves the accuracy of the search results and the user experience.
Brief description of the drawings
Fig. 1 is a flow chart of a search target identification method according to an embodiment of the present invention;
Fig. 2 is a flow chart of another search target identification method according to an embodiment of the present invention;
Fig. 3 is a structural schematic diagram of a search target identification device according to an embodiment of the present invention;
Fig. 4 is a structural schematic diagram of another search target identification device according to an embodiment of the present invention.
Detailed description of the embodiments
As described in the background section, in prior-art search systems a user of a vertical search engine has to click through several rounds of filtering before the final search intent is matched, which is inconvenient. Moreover, when a customer service robot searches for a target, search keywords can be combined with little flexibility, which lowers search efficiency and gives a poor user experience.
In the embodiments of the present invention, the target keywords obtained by segmentation are matched against a default dictionary template and converted, and the result is combined with the target keywords that were not matched and converted to form the search keyword. The search keyword used in the search therefore expresses the user's search text more accurately, which improves the accuracy of search target identification and, in turn, the accuracy of the search results. The user's unstructured search can thus be converted into a structured search, improving the user experience.
To make the above purposes, features and advantages of the invention clearer and easier to understand, specific embodiments of the invention are described in detail below with reference to the accompanying drawings.
Fig. 1 is a flow chart of a search target identification method according to an embodiment of the present invention. The method is described in detail below with reference to Fig. 1.
Step S101: segment the search text according to a known dictionary library to obtain multiple target keywords. The dictionary library includes multiple dictionaries, and the target keywords correspond to those dictionaries.
In this embodiment, the search text is a passage of text that the user enters into the search engine in order to search for a target. It may consist of, for example, English letters, pinyin, Chinese characters, symbols, digits, or a combination of these. The dictionary library is known or established in advance. Each dictionary is a database that stores the keywords of one category, so one dictionary corresponds to the keywords of one category and hence to one dictionary category. For the automobile industry, for example, the dictionary categories may include "brand", "model", "color" and "unit", and keywords of the matching type are added in advance to the dictionary of each category. It will be understood that further dictionary categories and corresponding dictionaries, such as "price", can also be defined for the automobile industry.
Note that the predefined dictionary categories can differ from industry to industry. In the cosmetics industry, for example, the known dictionary categories may be "brand", "function", "core word" and "capacity", which differ from the categories in the previous example. The specific names of the dictionary categories are not limited to those in these examples; each category may be given another name, as long as keywords of different categories can be told apart.
Each dictionary stores keywords of a single category. In the automobile dictionary library above, for example, the brand dictionary stores keywords such as "BMW", "Jaguar", "Land Rover" and "Rolls Royce"; the color dictionary stores keywords such as "white", "black" and "silver white"; the model dictionary stores keywords such as "730Li", "330i" and "325i"; and the unit dictionaries for "time", "price", "mileage" and "displacement" store keywords such as "year", "ten thousand" (万), "ten thousand kilometers" (万公里) and "liter", respectively. These are not described in further detail here.
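As a rough illustration of such a dictionary library, the following sketch stores each dictionary as a set of keywords under its category name and builds a reverse index from keyword to category. The variable and function names are hypothetical, and merging all unit words into a single "unit" dictionary is a simplifying assumption rather than something the description prescribes.

```python
# Hypothetical dictionary library for the automobile example:
# dictionary category -> keywords stored in that dictionary.
DICTIONARY_LIBRARY = {
    "brand": {"BMW", "Jaguar", "Land Rover", "Rolls Royce"},
    "color": {"white", "black", "silver white"},
    "model": {"730Li", "330i", "325i"},
    # Assumption: one merged unit dictionary instead of one per quantity.
    "unit": {"year", "万", "万公里", "liter"},
}


def build_keyword_index(library):
    """Invert the library so that each keyword maps to its dictionary category."""
    index = {}
    for category, keywords in library.items():
        for keyword in keywords:
            index[keyword.lower()] = category
    return index


KEYWORD_INDEX = build_keyword_index(DICTIONARY_LIBRARY)


def category_of(word):
    """Return the dictionary category of a word, or None if no dictionary stores it."""
    return KEYWORD_INDEX.get(word.lower())
```

A lookup such as `category_of("Land Rover")` then answers which dictionary a candidate word belongs to, which is the information the later template-matching step relies on.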
In embodiments of the present invention, the dictionary categories of the dictionary library are predefined and known, and the keywords in each dictionary are likewise stored in advance. Different dictionary categories and dictionary libraries can be defined for each industry.
Accordingly, in a specific implementation of step S101, the search text can be segmented according to the known dictionaries to obtain its target keywords. More specifically, segmentation cuts the search text according to the dictionary library so as to obtain the target keywords that match the dictionaries.
For example, suppose the search text is "BMW 3-5万 x5 white", where 万 (wan) is the unit word for ten thousand. The text can be segmented according to the known dictionaries. Assume in this example that the known dictionaries are "brand", "unit", "model", "range" and "color", and that each dictionary stores keywords of its own category. Whenever a keyword in a dictionary appears in the search text, that word is cut out. The "brand" dictionary contains the keyword "BMW", for instance, so "BMW" is cut out of "BMW 3-5万 x5 white", and so on. Assuming every word of the search text can be found in the dictionary library, segmenting "BMW 3-5万 x5 white" yields the target keywords "BMW", "3", "-", "5", "万", "x5" and "white", which correspond to "brand", "number", "range", "number", "unit", "model" and "color" respectively.
As another example, the target keywords obtained may instead be "BMW", "30,000" (3万), "-", "50,000" (5万), "x5" and "white", corresponding to "brand", "price", "range", "price", "model" and "color". Note that in embodiments of the invention the digits "3" and "5" can be recognized automatically by the system, so the "unit" dictionary need only store the unit word 万 ("ten thousand"). After the segmentation system has automatically recognized the digits "3" and "5", if it finds the unit word 万 in the dictionary, it merges each digit with the unit word to obtain the target keywords "30,000" and "50,000".
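The segmentation step can be sketched as follows. This is a minimal illustration that assumes a longest-match scan over the dictionary library plus automatic recognition of digit runs; the description does not fix a particular segmentation algorithm, and the function name and the small library below are hypothetical.

```python
def segment(text, library):
    """Cut text into (word, category) pairs using the dictionary library.

    At each position the longest dictionary keyword is tried first; a run of
    digits not covered by any dictionary is labeled "number"; anything else
    is skipped.
    """
    index = {kw.lower(): cat for cat, kws in library.items() for kw in kws}
    max_len = max(len(kw) for kw in index)
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        for length in range(min(max_len, len(text) - pos), 0, -1):
            candidate = text[pos:pos + length]
            if candidate.lower() in index:
                tokens.append((candidate, index[candidate.lower()]))
                pos += length
                break
        else:  # no dictionary keyword starts at this position
            if text[pos].isdecimal():
                end = pos
                while end < len(text) and text[end].isdecimal():
                    end += 1
                tokens.append((text[pos:end], "number"))
                pos = end
            else:
                pos += 1
    return tokens


# Hypothetical library covering only the words of the running example.
LIBRARY = {
    "brand": {"BMW"},
    "range": {"-"},
    "unit": {"万"},
    "model": {"x5"},
    "color": {"white"},
}

tokens = segment("BMW 3-5万 x5 white", LIBRARY)
```

With this library, `tokens` holds the seven keyword and category pairs listed in the first example above. Trying dictionary keywords before digits keeps a model name such as "x5" or "730Li" from being split into letters and a number.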
Step S102: according to the dictionaries to which the target keywords correspond, match at least some of the target keywords against a default dictionary template and convert them to obtain a portmanteau word.
In this embodiment, the target keywords obtained by segmentation are matched against the default dictionary template, where the dictionary template is a combination of one or more dictionaries in the dictionary library. The default dictionary template is set in advance; it can be set as needed, from experience, or from the results of machine learning on big data. Such a known template contains different dictionary categories. At least some of the target keywords are matched and converted to obtain the portmanteau word.
In a specific implementation, digits and the range dictionary can be combined into the structure "number + range word + number" to build a range template. The range dictionary may include keywords such as "to", "above", "below" and "more than". Digits, the range dictionary and the unit dictionary can likewise be combined into the structure "number + range word + number + price unit word" to build a price range template.
For example, step S101 yields the target keywords "BMW", "3", "-", "5", "万" (ten thousand), "x5" and "white", which correspond to the dictionaries "brand", "number", "range", "number", "unit", "model" and "color" respectively. The keywords "3", "-", "5" and "万", with their dictionary categories "number", "range", "number" and "unit", match the price range template "number + range word + number + price unit word". Processing them with this template finally gives a minimum price of 30000 and a maximum price of 50000. Useful words are extracted according to the combination of word categories. Specifically, "minimum price" and "maximum price" are the range values extracted according to the price range template, from which the price can then be analyzed, so the price range is extracted relatively accurately.
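The conversion performed by the price range template can be illustrated with a short sketch. The multiplier table, the field names `min_price` and `max_price`, and the function name are assumptions made for the example.

```python
# Assumed multipliers for price unit words; 万 stands for ten thousand.
UNIT_MULTIPLIERS = {"万": 10_000}

PRICE_RANGE_TEMPLATE = ["number", "range", "number", "unit"]


def convert_price_range(tokens):
    """Convert four (word, category) pairs that fit the price range template.

    Returns a dict with the minimum and maximum price, or None when the
    categories do not fit the template or the unit word is unknown.
    """
    categories = [category for _, category in tokens]
    if categories != PRICE_RANGE_TEMPLATE:
        return None
    words = [word for word, _ in tokens]
    low, high, unit = words[0], words[2], words[3]
    factor = UNIT_MULTIPLIERS.get(unit)
    if factor is None:
        return None
    return {"min_price": int(low) * factor, "max_price": int(high) * factor}


price = convert_price_range(
    [("3", "number"), ("-", "range"), ("5", "number"), ("万", "unit")]
)
```

Here `price` carries the two range values from the example, 30000 and 50000. Open-ended range words such as "above" or "below" would need their own template and are left out of this sketch.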
With default dictionary templates, the embodiments of the present invention can combine target keywords flexibly, which further improves the efficiency of target search.
Step S103: form at least part of a search keyword from the portmanteau word and the target keywords that were not matched and converted, the search keyword corresponding to the search target.
In this embodiment, to give the user an experience consistent with that of a general-purpose search engine, the search keyword is formed from the portmanteau word and the target keywords that were not matched and converted, so that the unstructured search expressed by the user's text is converted into a structured search. The search keyword is of course not limited to the portmanteau word and the unmatched target keywords; it may also contain other suitable components, such as components that assist the search.
For example, step S102 yields the portmanteau word "minimum price 30000, maximum price 50000", which is combined with the target keywords "BMW", "x5" and "white" that were not matched and converted to form the search keyword "brand: BMW, minimum price: 30000, maximum price: 50000, color: white".
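A minimal sketch of this combining step is shown below. It assumes the search keyword is represented as a mapping from field name to value; the function names and field names are hypothetical. Unlike the example string above, the sketch keeps every unmatched keyword, including the model, which is a choice made for the illustration.

```python
def compose_search_keyword(unmatched_tokens, combined):
    """Merge unmatched (word, category) pairs with the converted fields."""
    fields = {category: word for word, category in unmatched_tokens}
    fields.update(combined)
    return fields


def render(fields):
    """Render the structured search keyword as 'name: value' pairs."""
    return ", ".join(f"{name}: {value}" for name, value in fields.items())


search_keyword = compose_search_keyword(
    [("BMW", "brand"), ("x5", "model"), ("white", "color")],
    {"min_price": 30000, "max_price": 50000},
)
text = render(search_keyword)
```

Keeping the result as a mapping rather than a flat string makes it straightforward to translate into whatever query statement the search engine expects.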
The search keyword obtained by the search target identification method of this embodiment expresses the search text more accurately, which improves the accuracy of search target identification and, in turn, the accuracy of the search results. The method can be adapted to specific application scenarios, which widens its scope of application.
Fig. 2 is a flow chart of another search target identification method according to an embodiment of the present invention. The method is described in detail below with reference to Fig. 2.
Step S201: segment the search text according to a known dictionary library to obtain multiple target keywords. The dictionary library includes multiple dictionaries, and the target keywords correspond to those dictionaries.
In a specific implementation, the dictionary library includes one or more of the following: a unit dictionary, a range dictionary, a color dictionary and a brand dictionary. Specifically, the brand dictionary further includes brand synonyms or brand pinyin, to handle cases where the user writes a brand in pinyin or with a synonym in the search text. For the brand 宝马 (BMW), for example, these include the pinyin "baoma" and the English abbreviation "BMW". These are not described in further detail here.
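One way to support synonyms and pinyin in the brand dictionary is to map every alias to a canonical brand name, as in the sketch below. The alias table and the choice of "BMW" as the canonical form are assumptions for illustration.

```python
# Hypothetical alias table: canonical brand -> synonyms, pinyin and other spellings.
BRAND_ALIASES = {
    "BMW": {"bmw", "baoma", "宝马"},
}

# Reverse index from each alias to its canonical brand.
ALIAS_TO_BRAND = {
    alias.lower(): brand
    for brand, aliases in BRAND_ALIASES.items()
    for alias in aliases
}


def normalize_brand(word):
    """Return the canonical brand for a synonym or pinyin spelling, else None."""
    return ALIAS_TO_BRAND.get(word.lower())
```

After normalization, "baoma", "宝马" and "BMW" all lead to the same brand value in the search keyword.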
In this embodiment, the segmentation method is the same as in step S101; refer to the corresponding embodiment above, which is not repeated here.
Step S202: determine the matching default dictionary template according to the dictionaries to which at least some of the target keywords correspond and the order in which those keywords are arranged.
In a specific implementation, once the target keywords have been determined, the default dictionary template that matches at least some of them is determined from the dictionaries to which those keywords correspond and from their order. In other words, if the dictionary categories of those keywords and their order agree with the categories and order of the dictionary combination in a default dictionary template, those keywords match that template. The default dictionary template is set in advance; it can be set as needed, from experience, or from the results of machine learning on big data.
For example, step S201 segments the user's search text into the target keywords "BMW", "3", "-", "5", "万" (ten thousand), "x5" and "white", which correspond to the dictionaries "brand", "number", "range", "number", "unit", "model" and "color" respectively. For the keywords "3", "-", "5" and "万", with dictionary categories "number", "range", "number" and "unit", the default dictionary templates are searched by this combination of dictionaries and its order, and the price range template "number + range word + number + price unit word" is found. The price range template matches the keywords "3", "-", "5" and "万"; that is, the price range template is the default dictionary template in this case. If no default dictionary template matching at least some of the keywords is found, no portmanteau word is formed.
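Matching by dictionary category and order can be sketched as a search for each template's category sequence inside the category sequence of the segmented keywords. The template table and function name are hypothetical, and returning only the first match is a simplifying assumption.

```python
# Hypothetical default dictionary templates: name -> ordered dictionary categories.
TEMPLATES = {
    "price_range": ("number", "range", "number", "unit"),
    "range": ("number", "range", "number"),
}


def find_template(tokens, templates):
    """Return (template name, start, end) for the first contiguous run of
    keywords whose categories and order equal a template, or None."""
    categories = [category for _, category in tokens]
    for name, pattern in templates.items():
        width = len(pattern)
        for start in range(len(categories) - width + 1):
            if tuple(categories[start:start + width]) == pattern:
                return name, start, start + width
    return None


tokens = [
    ("BMW", "brand"), ("3", "number"), ("-", "range"), ("5", "number"),
    ("万", "unit"), ("x5", "model"), ("white", "color"),
]
match = find_template(tokens, TEMPLATES)
```

Listing the longer price range template before the plain range template means the more specific pattern wins when both fit. When `find_template` returns None, no portmanteau word is formed and all keywords stay as they are.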
Step S203: combine the target keywords that match the default dictionary template to form the portmanteau word.
In a specific implementation, the default dictionary template includes a range template formed from a range dictionary and a unit dictionary, and matching at least some of the target keywords against the default dictionary template and converting them according to the dictionaries to which the target keywords correspond includes: matching the target keywords that correspond to the range dictionary and the unit dictionary against the range template, and combining and converting them according to the range template to form a combination number.
For example, once step S202 has determined that the default dictionary template is the price range template, the keywords "3", "-", "5" and "万" (ten thousand) are combined and converted according to the format of the price range template "number + range word + number + price unit word". This gives the combination number "minimum price 30000, maximum price 50000", which may also be written "minimum price 3w, maximum price 5w". Specifically, "minimum price" and "maximum price" are the range values extracted according to the price range template, from which the price can then be analyzed, so the price range is extracted relatively accurately.
Step S204: form at least part of a search keyword from the portmanteau word and the target keywords that were not matched and converted, the search keyword corresponding to the search target.
For example, step S203 yields the combination number "minimum price 30000, maximum price 50000", which is combined with the target keywords "BMW", "x5" and "white" that were not matched and converted to form the search keyword "brand: BMW, minimum price: 30000, maximum price: 50000, color: white".
When the user's search text contains words that can be combined, the embodiments of the present invention combine them into a portmanteau word and form the search keyword from it together with the target keywords that were not matched and converted. Compared with searching directly after segmentation, this improves the accuracy of the targets found and the user experience.
Step S205: form, from the search keyword, a query statement suited to the search engine.
In a specific implementation, the query statement may include one or more of the following: an SQL search statement and a Lucene query statement. An SQL search statement is written in Structured Query Language (SQL), and a Lucene query statement is used with the full-text search engine Lucene. Lucene is an open-source library for full-text indexing and search; it makes it convenient to add full-text search to a target system or to build a complete full-text search engine on top of it.
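For the Lucene case, the structured search keyword could be rendered in the classic Lucene query parser syntax, which supports fielded terms, quoted phrases, the AND operator and inclusive ranges written as [low TO high]. The sketch below only builds the query string; the field names are hypothetical, and whether a range clause works on a numeric field depends on how the index and query parser are configured.

```python
def to_lucene_query(fields):
    """Build a classic-syntax Lucene query string from the structured fields."""
    clauses = []
    for name in ("brand", "model", "color"):
        if name in fields:
            # Escape backslashes and double quotes inside the quoted phrase.
            value = str(fields[name]).replace("\\", "\\\\").replace('"', '\\"')
            clauses.append(f'{name}:"{value}"')
    if "min_price" in fields and "max_price" in fields:
        low = int(fields["min_price"])
        high = int(fields["max_price"])
        clauses.append(f"price:[{low} TO {high}]")
    return " AND ".join(clauses)


query = to_lucene_query(
    {"brand": "BMW", "min_price": 30000, "max_price": 50000, "color": "white"}
)
```

The resulting string would still have to be passed to a query parser on the Lucene side; that part is outside the scope of this sketch.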
For example, step S204 yields the search keyword "brand: BMW, minimum price: 30000, maximum price: 50000, color: white". From this final structure an SQL search statement can be built: select * from auto where brand = 'BMW' and price >= 30000 and price <= 50000 and color = 'white'. The search text "BMW 3-5万 x5 white" entered by the user (万 meaning ten thousand) has thus been turned into a structured query statement, on which subsequent operations can be based.
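The same statement can be built with bound parameters instead of values pasted into the SQL text, which avoids quoting problems and SQL injection. The sketch below uses Python's built-in sqlite3 module and an in-memory table; the table layout and sample rows are invented for the illustration.

```python
import sqlite3


def build_sql(fields):
    """Return a parameterized SQL statement and its parameters."""
    sql = (
        "SELECT * FROM auto "
        "WHERE brand = ? AND price >= ? AND price <= ? AND color = ?"
    )
    params = (
        fields["brand"],
        fields["min_price"],
        fields["max_price"],
        fields["color"],
    )
    return sql, params


# Hypothetical table and rows, only to show the query running end to end.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE auto (brand TEXT, model TEXT, price INTEGER, color TEXT)")
conn.executemany(
    "INSERT INTO auto VALUES (?, ?, ?, ?)",
    [
        ("BMW", "x5", 45000, "white"),
        ("BMW", "x5", 80000, "white"),
        ("Jaguar", "XF", 40000, "black"),
    ],
)

sql, params = build_sql(
    {"brand": "BMW", "min_price": 30000, "max_price": 50000, "color": "white"}
)
rows = conn.execute(sql, params).fetchall()
```

Only the white BMW priced inside the range is returned; the second BMW is filtered out by the upper price bound.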
The device corresponding to the search target identification method in the embodiments of the present invention is described below.
Fig. 3 is a structural schematic diagram of a search target identification device according to an embodiment of the present invention. The search target identification device shown in Fig. 3 may include: a word segmentation unit 301, a conversion unit 302 and a combination unit 303.
The word segmentation unit 301 is adapted to segment the search text according to a known dictionary library to obtain multiple target keywords; the dictionary library includes multiple dictionaries, and the target keywords correspond to those dictionaries. The conversion unit 302 is adapted to match at least some of the target keywords against a default dictionary template and convert them, according to the dictionaries to which the target keywords correspond, to obtain a portmanteau word. The combination unit 303 is adapted to form at least part of a search keyword from the portmanteau word and the target keywords that were not matched and converted, the search keyword corresponding to the search target.
In a specific implementation, the search text is a piece of text that the user enters into the search engine in order to find the search target. It may consist of, for example, English letters, pinyin, Chinese characters, symbols, numbers, or combinations thereof. The dictionaries used by the word segmentation unit 301 are known or established in advance. Each dictionary is a database that stores the keywords of one category, so each dictionary corresponds to one dictionary category. For example, for the automobile industry, the dictionary categories may include "brand", "model", "color" and "unit", and keywords of the same type are added in advance to the dictionary of each category. It can be understood that other dictionary categories and corresponding dictionaries, such as "price", can also be defined for the automobile industry. The conversion unit 302 works from the multiple target keywords obtained by segmentation and from the preset dictionary template, where a dictionary template is a combination of one or more of the dictionaries. The preset dictionary template is set in advance, for example as needed, from experience, or according to the results of big-data machine learning, and it includes different dictionary categories. At least part of the multiple target keywords is matched and converted to obtain the combined word.
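Dictionary-based segmentation with per-category dictionaries can be sketched as below. The text does not state which segmentation algorithm is used, so this sketch assumes forward maximum matching; the dictionary contents, `DICTIONARIES`, `WORD_TO_CATEGORY` and `segment` are illustrative names introduced here, and ASCII keywords stand in for the Chinese entries a real dictionary would hold.

```python
# Hypothetical dictionaries for the automobile example; entries are illustrative.
DICTIONARIES = {
    "brand": {"bmw", "baoma"},
    "model": {"x5"},
    "color": {"white"},
}

# Reverse index: keyword -> dictionary category.
WORD_TO_CATEGORY = {
    word: category for category, words in DICTIONARIES.items() for word in words
}
MAX_WORD_LEN = max(len(word) for word in WORD_TO_CATEGORY)


def segment(text: str) -> list[tuple[str, str]]:
    """Forward maximum matching: at each position take the longest dictionary word.

    Returns (keyword, dictionary category) pairs in the order they appear.
    """
    text = text.lower()
    result = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for length in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in WORD_TO_CATEGORY:
                result.append((candidate, WORD_TO_CATEGORY[candidate]))
                i += length
                break
        else:
            i += 1  # no dictionary word starts here; skip one character
    return result


print(segment("baomax5white"))
print(segment("BMW x5 white"))
```

Each target keyword comes back tagged with the category of the dictionary it was found in, which is the information the conversion unit needs for template matching.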
The search keywords obtained by the search target identification method of the embodiment of the present invention express the search text more accurately, which improves the accuracy of search target identification and, in turn, the accuracy of the search results.
Fig. 4 is a structural schematic diagram of another search target identification device in an embodiment of the present invention.
The search target identification device shown in Fig. 4 may include: a word segmentation unit 401, a conversion unit 402, a determining subunit 403, a conversion subunit 404, a combination unit 405 and a query statement forming unit 406.
The word segmentation unit 401 is adapted to segment the search text according to known dictionaries to obtain multiple target keywords. Specifically, the dictionaries are of multiple kinds, and the multiple target keywords correspond to those kinds of dictionaries. The dictionaries may include one or more of the following: a unit dictionary, a range dictionary, a color dictionary and a brand dictionary. The brand dictionary further includes brand synonyms or brand pinyin, to handle cases where the user writes pinyin or a synonym in the search text, for example the pinyin "baoma" of the brand "宝马" (BMW) or its English abbreviation "BMW". These are not described in detail here.
In a specific implementation, the conversion unit 402 may include the determining subunit 403 and the conversion subunit 404. The determining subunit 403 is adapted to determine the matching preset dictionary template according to the dictionaries to which at least part of the multiple target keywords correspond and the order in which those keywords are arranged. The conversion subunit 404 is adapted to combine the multiple target keywords that match the preset dictionary template, to form the combined word.
In a specific implementation, after the multiple target keywords are determined, the preset dictionary template that matches at least part of the keywords is determined according to the dictionaries to which those keywords correspond and the order in which they are arranged. In other words, if the dictionary categories of those keywords and their order are consistent with the categories and order of the dictionary combination in a preset dictionary template, those keywords match that template. The preset dictionary template is set in advance, for example as needed, from experience, or according to the results of big-data machine learning.
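Matching by dictionary category and order can be sketched as follows, using the range template (a range dictionary followed by a unit dictionary) described in the claims. This is a minimal sketch under stated assumptions: the template table `PRESET_TEMPLATES`, the unit value table `UNIT_FACTORS`, the keyword spelling "wan" for the unit ten thousand, the "low-high" form of the range keyword, and all function names are hypothetical.

```python
# Hypothetical preset dictionary templates: ordered tuples of dictionary categories.
PRESET_TEMPLATES = {
    "range_template": ("range", "unit"),
}

# Hypothetical unit dictionary values ("wan" stands for the unit ten thousand).
UNIT_FACTORS = {"wan": 10_000}


def combine_range(words: list[str]) -> dict[str, int]:
    """Convert a range keyword plus a unit keyword into a combined number range."""
    range_word, unit_word = words
    low, high = (int(part) for part in range_word.split("-"))
    factor = UNIT_FACTORS[unit_word]
    return {"min": low * factor, "max": high * factor}


# One combining function per template (assumption of this sketch).
COMBINERS = {"range_template": combine_range}


def match_and_convert(tagged: list[tuple[str, str]]):
    """Split tagged keywords into combined words and unmatched keywords.

    A run of keywords matches a template when its dictionary categories,
    in order, equal the template's category sequence.
    """
    combined, remaining = [], []
    i = 0
    while i < len(tagged):
        for name, template in PRESET_TEMPLATES.items():
            window = tagged[i:i + len(template)]
            if tuple(category for _, category in window) == template:
                words = [word for word, _ in window]
                combined.append((name, COMBINERS[name](words)))
                i += len(template)
                break
        else:
            remaining.append(tagged[i])  # keyword not covered by any template
            i += 1
    return combined, remaining


tagged = [("bmw", "brand"), ("3-5", "range"), ("wan", "unit"), ("white", "color")]
combined, remaining = match_and_convert(tagged)
print(combined)
print(remaining)
```

The combined word and the unmatched keywords together are what the combination unit would assemble into the search keywords.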
The combination unit 405 is adapted to form at least part of the search keywords from the combined word and the target keywords that were not matched and converted; the search keywords correspond to the search target. The query statement forming unit 406 is adapted to form the search keywords into a query statement suited to the search engine.
For the specific implementation of this embodiment, reference may be made to the corresponding embodiments above; details are not repeated here.
When the user's search text contains words that can be combined, the embodiment of the present invention combines them into a combined word and forms the search keywords from the combined word and the target keywords that were not matched and converted. Compared with searching directly on the segmentation result, this improves the accuracy of the identified search target and improves the user experience.
An embodiment of the present invention also discloses a terminal that includes the search target identification device. The terminal supports configuration of the search target identification device and executes the search target identification method.
Those of ordinary skill in the art will appreciate that all or part of the steps in the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, and the storage medium may include ROM, RAM, a magnetic disk, an optical disc, and the like.
Although the present invention is disclosed as above, the present invention is not limited thereto. Any person skilled in the art can make various changes or modifications without departing from the spirit and scope of the present invention, so the protection scope of the present invention shall be subject to the scope defined by the claims.

Claims (15)

1. A search target identification method, characterized by comprising:
segmenting a search text according to known dictionaries to obtain multiple target keywords, wherein the dictionaries are of multiple kinds and the multiple target keywords correspond to the multiple kinds of dictionaries;
according to the dictionaries to which the multiple target keywords correspond, matching at least part of the multiple target keywords against a preset dictionary template and converting them, to obtain a combined word;
forming at least part of the search keywords from the combined word and the target keywords that were not matched and converted, the search keywords corresponding to the search target;
wherein the dictionary template is a combination of one or more of the dictionaries.
2. The search target identification method according to claim 1, characterized by further comprising:
forming the search keywords into a query statement suited to a search engine.
3. The search target identification method according to claim 2, characterized in that the query statement includes one or more of the following: an SQL query statement and a Lucene query statement.
4. The search target identification method according to claim 1, characterized in that matching at least part of the multiple target keywords against the preset dictionary template and converting them comprises:
determining the matching preset dictionary template according to the dictionaries to which at least part of the multiple target keywords correspond and the order in which those keywords are arranged;
combining the multiple target keywords that match the preset dictionary template, to form the combined word.
5. The search target identification method according to claim 1, characterized in that the preset dictionary template includes a range template formed by a range dictionary and a unit dictionary, and that matching at least part of the multiple target keywords against the preset dictionary template and converting them, according to the dictionaries to which the multiple target keywords correspond, comprises: matching the multiple target keywords that correspond to the range dictionary and the unit dictionary against the range template, and combining and converting them according to the range template to form a combined number.
6. The search target identification method according to any one of claims 1 to 4, characterized in that the dictionaries include several of the following: a unit dictionary, a range dictionary, a color dictionary and a brand dictionary.
7. The search target identification method according to claim 6, characterized in that the brand dictionary further includes brand synonyms or brand pinyin.
8. A search target identification device, characterized by comprising:
a word segmentation unit, adapted to segment a search text according to known dictionaries to obtain multiple target keywords, wherein the dictionaries are of multiple kinds and the multiple target keywords correspond to the multiple kinds of dictionaries;
a conversion unit, adapted to match at least part of the multiple target keywords against a preset dictionary template, according to the dictionaries to which the multiple target keywords correspond, and to convert them to obtain a combined word;
a combination unit, adapted to form at least part of the search keywords from the combined word and the target keywords that were not matched and converted, the search keywords corresponding to the search target;
wherein the dictionary template is a combination of one or more of the dictionaries.
9. The search target identification device according to claim 8, characterized by further comprising:
a query statement forming unit, adapted to form the search keywords into a query statement suited to a search engine.
10. The search target identification device according to claim 9, characterized in that the query statement includes one or more of the following: an SQL query statement and a Lucene query statement.
11. The search target identification device according to claim 8, characterized in that the conversion unit comprises:
a determining subunit, adapted to determine the matching preset dictionary template according to the dictionaries to which at least part of the multiple target keywords correspond and the order in which those keywords are arranged;
a conversion subunit, adapted to combine the multiple target keywords that match the preset dictionary template, to form the combined word.
12. The search target identification device according to claim 8, characterized in that the preset dictionary template includes a range template formed by a range dictionary and a unit dictionary, and the conversion unit comprises:
a range conversion subunit, adapted to match the multiple target keywords that correspond to the range dictionary and the unit dictionary against the range template, and to combine and convert them according to the range template to form a combined number.
13. The search target identification device according to any one of claims 8 to 11, characterized in that the dictionaries include several of the following: a unit dictionary, a range dictionary, a color dictionary and a brand dictionary.
14. The search target identification device according to claim 13, characterized in that the brand dictionary further includes brand synonyms or brand pinyin.
15. A terminal, characterized by comprising the search target identification device according to any one of claims 8 to 14.
CN201610224273.3A 2016-04-12 2016-04-12 Search for target identification method, device and terminal Active CN105930362B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610224273.3A CN105930362B (en) 2016-04-12 2016-04-12 Search for target identification method, device and terminal

Publications (2)

Publication Number Publication Date
CN105930362A CN105930362A (en) 2016-09-07
CN105930362B true CN105930362B (en) 2019-03-12

Family

ID=56838044

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610224273.3A Active CN105930362B (en) 2016-04-12 2016-04-12 Search for target identification method, device and terminal

Country Status (1)

Country Link
CN (1) CN105930362B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106599206A (en) * 2016-12-15 2017-04-26 北京小米移动软件有限公司 Method and device for searching information
CN107168988B (en) * 2017-03-27 2022-01-28 百度在线网络技术(北京)有限公司 Method, device, equipment and computer storage medium for inquiring lottery ticket information
CN107679121B (en) * 2017-09-20 2020-10-20 晶赞广告(上海)有限公司 Mapping method and device of classification system, storage medium and computing equipment
CN109993592A (en) * 2017-12-29 2019-07-09 北京京东尚科信息技术有限公司 Information-pushing method and device
CN108984600B (en) * 2018-06-04 2020-03-03 百度在线网络技术(北京)有限公司 Interaction processing method and device, computer equipment and readable medium
CN109344398B (en) * 2018-09-10 2024-02-09 北京京东尚科信息技术有限公司 Commodity name processing method and device, computer storage medium and electronic equipment
CN110222194B (en) * 2019-05-21 2022-10-04 深圳壹账通智能科技有限公司 Data chart generation method based on natural language processing and related device
CN110888876A (en) * 2019-10-31 2020-03-17 平安科技(深圳)有限公司 Method and device for generating database script, storage medium and computer equipment
CN111506704B (en) * 2020-04-10 2023-09-12 上海携程商务有限公司 Japanese keyword group generation method and device, electronic equipment and storage medium
CN111523311B (en) * 2020-04-21 2023-10-03 度小满科技(北京)有限公司 Search intention recognition method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4382634B2 (en) * 2004-11-08 2009-12-16 日本電信電話株式会社 Address analysis apparatus, address analysis method, and address analysis program
CN103425714A (en) * 2012-05-25 2013-12-04 北京搜狗信息服务有限公司 Query method and system
CN105138511A (en) * 2015-08-10 2015-12-09 北京思特奇信息技术股份有限公司 Method and system for semantically analyzing search keyword

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Dictionary Chinese Word Segmentation research a method combined with CRFs; Chengliang WANG et al; 5th International Conference on Computer Sciences and Convergence Information Technology; 20110210; pp. 962-965

Also Published As

Publication number Publication date
CN105930362A (en) 2016-09-07

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant