CN104123319B - Method and apparatus for parsing search terms with a map search demand - Google Patents

Method and apparatus for parsing search terms with a map search demand

Info

Publication number
CN104123319B
Authority
CN
China
Prior art keywords
word
search
map
tag
query
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310156743.3A
Other languages
Chinese (zh)
Other versions
CN104123319A (en
Inventor
李扬
孙帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201310156743.3A
Publication of CN104123319A
Application granted
Publication of CN104123319B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29Geographical information databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9537Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Remote Sensing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a method and apparatus for parsing a search term (query) that has a map search demand. The method includes: performing word segmentation on the query input by the user; performing tag mapping on the natural-language words in the query, that is, determining the tag each such word is mapped to according to the similarity between the word and each tag in a tag system, where the tags in the tag system are point of interest (POI) attributes in the map and can hit the corresponding POIs; and determining the search keyword corresponding to the query according to the tag mapping result, so that the map search engine searches for the determined search keyword. The invention can return search results that meet the user's demand even for natural-language queries, without depending on the coverage of a manually built vocabulary.

Description

Method and apparatus for parsing search terms with a map search demand
[technical field]
The present invention relates to the field of information search in computer application technology, and in particular to a method and apparatus for parsing search terms that have a map search demand.
[background technique]
With the rapid development of network technology, information resources on the network keep growing and the volume of information data is expanding rapidly. Search engines have become an important way for people to obtain information, and map search is one of the important search applications, making travel more convenient.
In map search, after the user enters a search term (query) in the input box, the map search engine provides the user with the map information corresponding to the query. For example, when the user enters "KFC", the search engine marks the location information of KFC on the map and presents it to the user. Existing map search usually performs text matching directly on the query without any processing. When the query entered by the user is a named entity such as a place name, building name or merchant name, or a category word such as "express hotel", it is consistent with the way map POIs are described, so the returned search results meet the user's demand well.
However, in many cases the queries users enter are casual and have the characteristics of natural language, for example "what is fun in Beijing" or "where nearby can I learn to be a cook". Traditional text matching can hardly find search results for such queries, because descriptions such as "fun" or "learn to be a cook" do not appear in map POIs. Moreover, even matching against a manually built vocabulary suffers from incomplete coverage and cannot handle natural-language words that the vocabulary does not include.
[summary of the invention]
In view of this, the present invention provides a method and apparatus for parsing search terms that have a map search demand, so that search results meeting the user's demand can also be returned for natural-language queries.
The specific technical solution is as follows:
A method for parsing a search term (query) with a map search demand, the method comprising:
S1. performing word segmentation on the query input by the user;
S2. performing tag mapping on the natural-language words in the query: determining the tag that each word is mapped to according to the similarity between the natural-language word in the query and each tag in a tag system, where the tags in the tag system are point of interest (POI) attributes in the map and can hit the corresponding POIs;
S3. determining the search keyword corresponding to the query according to the tag mapping result, the map search engine searching for the determined search keyword.
According to a preferred embodiment of the present invention, step S1 further includes: removing stop words from the words obtained by segmentation.
According to a preferred embodiment of the present invention, at least one of the following steps S11 and S12 is further included between step S1 and step S2:
S11. performing attribute recognition on the words obtained by segmentation, based on an attribute vocabulary, to determine attribute words;
S12. performing map search mode recognition on the words obtained by segmentation, based on a mode expression table;
In step S2, the words in the query that are neither recognized as attribute words nor recognized as belonging to a map search mode are determined to be the natural-language words.
According to a preferred embodiment of the present invention, the mode expression table is established as follows:
after word segmentation is performed on queries of a known map search mode, the words that hit the attribute vocabulary are filtered out, and the remaining words are determined to be mode words;
the co-occurrence frequency of the mode words is counted, and the mode words are ranked by co-occurrence frequency;
the mode words whose co-occurrence frequency ranking meets a preset requirement are selected to form the mode expressions of the known map search mode.
According to a preferred embodiment of the present invention, the similarity between a natural-language word and a tag in step S2 can be represented by the co-occurrence rate: the higher the co-occurrence rate, the greater the similarity. The co-occurrence rate between a natural-language word x and a tag y is determined as follows:
Count the number of times N1 that x and y co-occur in the same text or the same window in the corpus; count the total number of times N that x co-occurs, in the same text or the same window in the corpus, with each of the tags (including y); the co-occurrence rate between x and y is then N1/N.
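As a rough illustration of the N1/N definition above, the following Python sketch counts co-occurrences at the level of whole texts. The corpus, the tag list and the use of plain substring matching are all assumptions made for this example; the patent does not specify how texts or windows are tokenized.

```python
from collections import Counter

def cooccurrence_rates(word, tags, corpus):
    """Return {tag: N1 / N} for `word`, counting texts that contain both."""
    counts = Counter()
    for text in corpus:
        if word not in text:
            continue
        for tag in tags:
            if tag in text:
                counts[tag] += 1  # N1 for this particular tag
    total = sum(counts.values())  # N: co-occurrences of the word with all tags
    if total == 0:
        return {}
    return {tag: n1 / total for tag, n1 in counts.items()}

# Hypothetical toy corpus (English glosses, not real data).
corpus = [
    "learn cook at a cook training center",
    "cook training helps you learn cook skills",
    "learn cook while eating fast food",
    "fast food near the station",
]
rates = cooccurrence_rates("learn cook", ["cook training", "fast food"], corpus)
```

Here "learn cook" co-occurs with "cook training" in two texts and with "fast food" in one, so the rates are 2/3 and 1/3 respectively.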
According to a preferred embodiment of the present invention, in step S2 the tag whose similarity to the natural-language word in the query meets a preset requirement is determined to be the tag mapped to, where the preset requirement is: the similarity is the highest, or the similarity reaches a preset threshold.
According to a preferred embodiment of the present invention, if an attribute word is recognized, determining the search keyword corresponding to the query according to the tag mapping result in step S3 is:
forming the search keyword corresponding to the query from the tag mapped to and the recognized attribute word.
According to a preferred embodiment of the present invention, if a map search mode is recognized, the map search engine searching for the determined search keyword in step S3 is: the map search engine searches for the determined search keyword according to the recognized map search mode;
otherwise, the map search engine searching for the determined search keyword in step S3 is: the map search engine searches for the determined search keyword according to the default map search mode.
According to a preferred embodiment of the present invention, if the user inputs the query through general web search, the query is determined to have a map search demand when at least one of the following holds: an attribute word is recognized, a map search mode is recognized, or a tag is mapped to. In that case, the search results of the map search engine from step S3 are embedded in the general web search results and ranked in a prominent position among them.
An apparatus for parsing a search term with a map search demand, the apparatus comprising:
a word segmentation unit, configured to perform word segmentation on the query input by the user;
a mapping unit, configured to perform tag mapping on the natural-language words in the query: determining the tag mapped to according to the similarity between the natural-language word in the query and each tag in a tag system, where the tags in the tag system are POI attributes in the map and can hit the corresponding POIs;
a search unit, configured to determine the search keyword corresponding to the query according to the tag mapping result of the mapping unit, and to call the map search engine to search for the determined search keyword.
According to a preferred embodiment of the present invention, the word segmentation unit is further configured to remove stop words from the words obtained by segmentation.
According to a preferred embodiment of the present invention, the apparatus further includes at least one of an attribute recognition unit and a mode recognition unit;
the attribute recognition unit is configured to perform attribute recognition on the words obtained by segmentation, based on an attribute vocabulary, to determine attribute words;
the mode recognition unit is configured to perform map search mode recognition on the words obtained by segmentation, based on a mode expression table;
the mapping unit determines the words in the query that are neither recognized as attribute words nor recognized as belonging to a map search mode to be the natural-language words.
According to a preferred embodiment of the present invention, the apparatus further includes a mode establishing unit, configured to establish the mode expression table by specifically performing the following:
after word segmentation is performed on queries of a known map search mode, the words that hit the attribute vocabulary are filtered out, and the remaining words are determined to be mode words;
the co-occurrence frequency of the mode words is counted, and the mode words are ranked by co-occurrence frequency;
the mode words whose co-occurrence frequency ranking meets a preset requirement are selected to form the mode expressions of the known map search mode.
According to a preferred embodiment of the present invention, the similarity between a natural-language word and a tag used by the mapping unit can be represented by the co-occurrence rate: the higher the co-occurrence rate, the greater the similarity. The co-occurrence rate between a natural-language word x and a tag y is determined as follows:
Count the number of times N1 that x and y co-occur in the same text or the same window in the corpus; count the total number of times N that x co-occurs, in the same text or the same window in the corpus, with each of the tags (including y); the co-occurrence rate between x and y is then N1/N.
According to a preferred embodiment of the present invention, the mapping unit determines the tag whose similarity to the natural-language word in the query meets a preset requirement to be the tag mapped to, where the preset requirement is: the similarity is the highest, or the similarity reaches a preset threshold.
According to a preferred embodiment of the present invention, if the attribute recognition unit recognizes an attribute word, the search unit, when determining the search keyword corresponding to the query according to the tag mapping result, forms the search keyword corresponding to the query from the tag mapped to and the recognized attribute word.
According to a preferred embodiment of the present invention, if the mode recognition unit recognizes a map search mode, the search unit calls the map search engine to search for the determined search keyword according to the recognized map search mode; otherwise, the search unit calls the map search engine to search for the determined search keyword according to the default map search mode.
According to a preferred embodiment of the present invention, if the user inputs the query through general web search, the search unit determines that the query has a map search demand when at least one of the following holds: the attribute recognition unit recognizes an attribute word, the mode recognition unit recognizes a map search mode, or the mapping unit maps to a tag. The search unit then embeds the search results of the map search engine in the general web search results and ranks them in a prominent position among those results.
As can be seen from the above technical solutions, the present invention performs tag mapping on the natural-language words in the query, mapping them to POI attributes in the map, and uses the search keyword formed from the mapped tags to hit the corresponding POIs in the map. Search results that meet the user's demand can therefore also be returned for natural-language queries, without depending on the coverage of a manually built vocabulary.
[Description of the drawings]
Fig. 1 is a flowchart of the method provided by Embodiment 1 of the present invention;
Fig. 2 is an example of the search results of the map search engine provided by Embodiment 1 of the present invention;
Fig. 3 is an example of the general web search results provided by Embodiment 1 of the present invention;
Fig. 4 is a structural diagram of the apparatus provided by Embodiment 2 of the present invention.
[specific embodiment]
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in detail below with reference to the drawings and specific embodiments.
Embodiment 1
Fig. 1 is a flowchart of the method provided by this embodiment of the present invention. As shown in Fig. 1, the method mainly includes the following steps:
Step 101: Perform word segmentation on the query input by the user.
In addition to word segmentation, stop words in the query may also be removed. This embodiment takes the query "where nearby is there a school to learn to be a cook" as an example. Word segmentation of the query yields: "nearby", "where is there", "learn", "cook", "的" (a structural particle) and "school". The stop word "的" is then removed.
Step 102: Based on the attribute vocabulary, perform attribute recognition on the words obtained by segmentation.
The attribute vocabulary used here is established in advance. It may include, but is not limited to, map-related named entities such as place names, organization names, building names, merchant names and brand names, or map-related category words such as "express hotel", "school" and "bank".
The attribute vocabulary may be established by manual addition or by automatic mining. Both are existing, relatively mature techniques and are not described in detail here.
Continuing the example, each word obtained by segmentation is matched against the attribute vocabulary. "School" hits a category word in the attribute vocabulary; none of the other words hit.
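Steps 101 and 102 can be sketched in a few lines of Python. The sketch assumes the query has already been segmented (the patent does not name a segmenter), and the stop-word list, attribute vocabulary and English token glosses are hypothetical stand-ins; "of" stands in for the particle that is removed as a stop word.

```python
STOP_WORDS = {"of"}  # hypothetical stop-word list
ATTRIBUTE_VOCABULARY = {"school", "bank", "express hotel", "KFC"}  # hypothetical

def remove_stop_words(tokens):
    """Step 101 (second half): drop stop words from the segmented query."""
    return [t for t in tokens if t not in STOP_WORDS]

def recognize_attributes(tokens):
    """Step 102: split tokens into attribute words and the remaining words."""
    attribute_words = [t for t in tokens if t in ATTRIBUTE_VOCABULARY]
    remaining = [t for t in tokens if t not in ATTRIBUTE_VOCABULARY]
    return attribute_words, remaining

# Assumed output of word segmentation for the example query.
tokens = ["nearby", "where is there", "learn", "cook", "of", "school"]
words = remove_stop_words(tokens)
attribute_words, remaining = recognize_attributes(words)
```

With these inputs, `attribute_words` is `["school"]` and `remaining` is `["nearby", "where is there", "learn", "cook"]`, which matches the state of the example before step 103.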
Step 103: Based on the mode expression table, perform map search mode recognition on the words not recognized as attribute words.
The mode expression table is likewise established in advance based on common map search modes, and contains the expressions corresponding to each map search mode. When one or more words match an expression in the mode expression table, those words are recognized as belonging to the map search mode that corresponds to the matched expression.
The map search modes covered by the mode expression table may include, but are not limited to, the following:
1) Perimeter search mode. Corresponding expressions may be: "nearby where is there *", "nearby where *", "nearby is there any *", etc.
2) Route query mode. Corresponding expressions may be: "bus from * to *", "subway * to *", "drive from * to *", etc.
3) Place query mode. Corresponding expressions may be: "* is where", "where is *", "location of *", etc.
Similarly, the mode expression table may be established by manual addition or by automatic mining. With automatic mining, it can be obtained by counting word co-occurrence frequencies over the queries of each map search mode. Specifically, the process may be as follows: first, after word segmentation is performed on the queries of a known map search mode, the words that hit the attribute vocabulary are filtered out, and the remaining words are the mode words; next, the co-occurrence frequency of the mode words is counted and the mode words are ranked by it; finally, the mode words whose co-occurrence frequency ranking meets a preset requirement are selected to form the mode expressions of the known map search mode, for example the mode words ranked in the top M by co-occurrence frequency, where M is a preset positive integer. Manual annotation may additionally be applied when the mode expressions are formed.
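The automatic mining procedure can be sketched as follows. This is one possible reading, not the patent's definitive algorithm: it treats the mode words that appear together in one query as a unit, replaces each attribute word with a "*" wildcard, and counts how often each resulting pattern occurs. The sample queries and vocabulary are hypothetical.

```python
from collections import Counter

def mine_mode_expressions(segmented_queries, attribute_vocabulary, top_m):
    """Return the top-M expressions for one known map search mode."""
    counts = Counter()
    for tokens in segmented_queries:
        pattern = []
        for token in tokens:
            if token in attribute_vocabulary:
                # Attribute words are filtered out and become a wildcard slot.
                if not pattern or pattern[-1] != "*":
                    pattern.append("*")
            else:
                pattern.append(token)  # what remains is a mode word
        if any(t != "*" for t in pattern):
            counts[tuple(pattern)] += 1
    return [list(p) for p, _ in counts.most_common(top_m)]

# Hypothetical segmented queries known to be perimeter searches.
queries = [
    ["nearby", "where is there", "school"],
    ["nearby", "where is there", "bank"],
    ["nearby", "is there any", "express hotel"],
    ["nearby", "where is there", "KFC"],
]
vocabulary = {"school", "bank", "express hotel", "KFC"}
expressions = mine_mode_expressions(queries, vocabulary, top_m=1)
```

With `top_m=1`, the most frequent pattern, `["nearby", "where is there", "*"]`, is kept as the expression for this mode.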
Continuing the example, the words not recognized as attribute words after attribute recognition are "nearby", "where is there", "learn" and "cook". These remaining words are matched against the mode expression table. "Nearby" + "where is there" hits the expression "nearby where *", whose corresponding map search mode is the perimeter search mode, so "nearby" + "where is there" is recognized as corresponding to the perimeter search mode.
It should be noted that either one of steps 102 and 103 may be executed, or both may be executed as described above. When both are executed, step 102 may be executed before step 103 as described above, or step 103 may be executed before step 102: map search mode recognition is first performed on the segmented words based on the mode expression table, and attribute recognition is then performed, based on the attribute vocabulary, on the words for which no map search mode was recognized. Steps 102 and 103 may even be executed in parallel: attribute word recognition based on the attribute vocabulary and map search mode recognition based on the mode expression table are both performed on the segmented words, and in step 104 tag mapping is then performed on the words that were neither recognized as attribute words nor recognized as belonging to a map search mode.
Step 104: Determine the tag to which each remaining word is mapped according to the similarity between the remaining word and each tag in the tag system.
The words that remain after the processing above, before this step is carried out, are usually the natural-language words; the preceding process can therefore also be regarded as the process of determining the natural-language words.
The tags in the tag system of the present invention may be POI attributes in the map, that is, words that describe the category, nature and so on of a POI; a tag can hit POIs in the map. For example, the tag "cook training" is an attribute of POIs such as "Haihou Cook Training Center", "Long-Range Cook Vocational and Technical Training School" and "China Cuisine Association Training and Exchange Center", and can hit these POIs. The tag "fast food" is an attribute of POIs such as "KFC", "McDonald's", "Yonghe Soy Milk" and "Burger King", and can hit these POIs.
The similarity between a remaining word and each tag in the tag system may be computed with a measure such as semantic similarity. Preferably, in this embodiment it is represented by the co-occurrence rate: the higher the co-occurrence rate, the greater the similarity. The co-occurrence rate between term1 and tag1 may be determined as follows: count the number of times N1 that term1 and tag1 co-occur in the same text or the same window in the corpus; count the total number of times N that term1 co-occurs with each of the tags in the same text or the same window in the corpus; the co-occurrence rate between term1 and tag1 is then N1/N.
The remaining word is then mapped to the tag whose similarity to it meets a preset requirement, where the preset requirement may be that the similarity is the highest, that the similarity reaches a preset threshold, and so on.
Continuing the example: the remaining word "learn cook" (that is, "learn" + "cook") and the tag "cook training" co-occur 200 times in the same text or the same window in the corpus, and "learn cook" co-occurs with the other tags in the tag system 50 times in total. The co-occurrence rate between the remaining word "learn cook" and the tag "cook training" is therefore 200/(200+50) = 0.8. The co-occurrence rates between "learn cook" and the other tags are calculated in turn in the same way. In the end the co-occurrence rate between "learn cook" and the tag "cook training" is found to be the highest, that is, the similarity between "learn cook" and "cook training" is the highest, so "learn cook" is mapped to the tag "cook training".
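The arithmetic of this example, together with the choice of the mapped tag, can be reproduced with a short Python sketch. The split of the 50 other co-occurrences across two tags is invented for illustration, and combining "highest similarity" with an optional threshold is one possible reading of the preset requirement.

```python
def map_to_tag(cooccurrence_counts, threshold=None):
    """Pick the tag with the highest co-occurrence rate N1 / N.

    Returns (tag, rate), or None if there is no data or the best rate
    is below the optional threshold.
    """
    total = sum(cooccurrence_counts.values())  # N
    if total == 0:
        return None
    rates = {tag: n1 / total for tag, n1 in cooccurrence_counts.items()}
    best = max(rates, key=rates.get)
    if threshold is not None and rates[best] < threshold:
        return None
    return best, rates[best]

# Counts for the word "learn cook": 200 with "cook training" (from the text),
# 50 with all other tags (the 30/20 split below is hypothetical).
counts = {"cook training": 200, "fast food": 30, "express hotel": 20}
result = map_to_tag(counts)
```

Here `result` is the tag "cook training" with a rate of 200/250 = 0.8, matching the figure in the example.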
The similarity between a remaining word and each tag in the tag system may be calculated in real time, or a precomputed result may be looked up. In the latter case, the similarity between words from common natural-language queries and the tags in the tag system is calculated in advance, and when step 104 is reached while parsing a query input by the user, the precomputed similarity result is looked up directly.
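A minimal sketch of the lookup-then-compute choice is shown below. The table contents and the `compute_tag_in_real_time` helper are hypothetical placeholders; falling back to real-time computation when a word is missing from the table is an assumption of this example.

```python
# Hypothetical results computed offline for common natural-language words.
PRECOMPUTED_TAGS = {"learn cook": "cook training"}

def lookup_tag(word, compute_tag):
    """Use the precomputed mapping when present, else compute in real time."""
    if word in PRECOMPUTED_TAGS:
        return PRECOMPUTED_TAGS[word]
    return compute_tag(word)

calls = []

def compute_tag_in_real_time(word):
    """Placeholder for a real-time similarity computation."""
    calls.append(word)
    return None

tag = lookup_tag("learn cook", compute_tag_in_real_time)
```

The precomputed entry is returned without invoking the real-time function, which is the cost saving the paragraph above describes.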
Step 105: Determine the search keyword corresponding to the query according to the attribute recognition result and the tag mapping result, and search for the determined search keyword according to the map search mode recognized in step 103.
In this step, the words that hit the attribute vocabulary (the recognized attribute words) and the tags determined by tag mapping can be combined into a new search keyword. Continuing the example, the word that hits the attribute vocabulary is "school" and the tag determined by tag mapping is "cook training", so the new search keyword is "cook training school". The new search keyword may of course also take other forms, such as "cook training" + "school" or "school" + "cook training". Since the recognized map search mode is the perimeter search mode, "cook training school" is searched for in the perimeter search mode. In other words, after the parsing above, the query "where nearby is there a school to learn to be a cook" input by the user is converted into a perimeter-mode search for "cook training school". This parsing result clearly lets the search results better match the user's search demand. Assuming the user is currently located at Renmin University of China, the results returned can be the results of searching for "cook training school" around Renmin University of China, as shown in Fig. 2.
If no map search mode is recognized in the query input by the user, the search can be carried out in the default map search mode; if no attribute word is recognized, the tag obtained by mapping is used as the search keyword.
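Step 105 and its two fallbacks can be sketched as a small function that assembles the request passed to the map search engine. The request structure, the default mode name and the choice to join keyword parts with a space are assumptions of this sketch; the patent leaves the exact keyword form open.

```python
DEFAULT_MODE = "default search"  # hypothetical name for the default mode

def build_search_request(mapped_tags, attribute_words, mode=None):
    """Combine mapped tags and attribute words into one search keyword."""
    keyword = " ".join(mapped_tags + attribute_words)
    return {"keyword": keyword, "mode": mode or DEFAULT_MODE}

# Running example: tag "cook training", attribute word "school",
# recognized mode "perimeter search".
request = build_search_request(["cook training"], ["school"], "perimeter search")

# No attribute word and no recognized mode: the tag alone is the keyword.
fallback = build_search_request(["cook training"], [])
```

The first call produces the keyword "cook training school" in perimeter search mode; the second shows both fallbacks at once.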
One common application scenario of the parsing method above is that the user inputs the query in the search box of map search. In this scenario the user is in effect assumed to have a map search demand: only a user with a map search demand would input a query in the search box of map search, so the query input by the user is also assumed to have a map search demand.
In another application scenario, the user inputs the query in the search box of general web search. Here it cannot be assumed directly that the user has a map search demand. The process of Embodiment 1 can also be executed in this scenario. If any of the attribute recognition in step 102, the mode recognition in step 103 or the tag mapping in step 104 produces a result, that is, if a word is recognized as an attribute word, or is recognized as belonging to one of the map search modes, or is mapped to a tag in the tag system, the query input by the user can be considered to have a map search demand. The map search engine can then be called to search as described in step 105, and its search results are embedded in the general web search results. The search results of the map search engine can be placed in a prominent position among the general web search results, for example ranked first or highlighted.
For example, the user inputs the query "where nearby is there a school to learn to be a cook" in the search box of general web search. In steps 102, 103 and 104 above, a word is recognized as an attribute word, the perimeter search mode is recognized, and a word is mapped to the tag "cook training". The query is therefore determined to have a map search demand, and the map search engine is called to search in the manner of step 105. At the same time, the general web search engine also searches for the query "where nearby is there a school to learn to be a cook" input by the user and returns search results, which include the search results of the map search engine, ranked first, as shown in Fig. 3.
The above is a detailed description of the method provided by the present invention. The apparatus provided by the present invention is described in detail below with reference to Embodiment 2.
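The decision described for general web search can be sketched as two small functions. The result structures are hypothetical, and placing the map block first is just one of the "prominent position" options the text mentions.

```python
def has_map_search_demand(attribute_words, mode, mapped_tags):
    """True if any of steps 102, 103 or 104 produced a result."""
    return bool(attribute_words) or mode is not None or bool(mapped_tags)

def merge_results(web_results, map_results, map_demand):
    """Embed the map results ahead of the web results when demand is detected."""
    if map_demand and map_results:
        return [{"source": "map", "items": map_results}] + web_results
    return web_results

# Running example: all three signals are present.
demand = has_map_search_demand(["school"], "perimeter search", ["cook training"])
page = merge_results(
    [{"source": "web", "items": ["result A", "result B"]}],  # hypothetical
    ["cook training school near the user"],                  # hypothetical
    demand,
)
```

A query for which none of the three steps produces a result would leave the web results unchanged.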
Embodiment 2
Fig. 4 is a structural diagram of the apparatus provided by Embodiment 2 of the present invention. The apparatus is arranged at the server side of the search engine and interacts with a browser or client: it obtains the query input by the user and returns the search results corresponding to the query to the browser or client. As shown in Fig. 4, the apparatus mainly includes a word segmentation unit 01, a mapping unit 02 and a search unit 03. It may further include at least one of an attribute recognition unit 04 and a mode recognition unit 05.
After the server side of the search engine obtains the query input by the user and sent by the browser, if the query was input by the user in the search box of map search, the query can be considered to have a map search demand. In this case, the word segmentation unit 01 first performs word segmentation on the query input by the user, and may further remove stop words from the segmented words.
The main function of the mapping unit 02 is to perform tag mapping on the natural-language words in the query, so that the mapped words can directly hit map POIs. Specifically, it determines the tag mapped to according to the similarity between the natural-language word in the query and each tag in the tag system, where the tags in the tag system are POI attributes in the map and can hit the corresponding POIs.
The search unit 03 then determines the search keyword corresponding to the query according to the tag mapping result of the mapping unit, and calls the map search engine to search for the determined search keyword.
The attribute recognition unit 04 and the mode recognition unit 05 are arranged between the word segmentation unit 01 and the mapping unit 02, either one alone or both together. The attribute recognition unit 04 is configured to perform attribute recognition on the words obtained by segmentation, based on the attribute vocabulary, to determine attribute words. The mode recognition unit 05 is configured to perform map search mode recognition on the words obtained by segmentation, based on the mode expression table. The mapping unit 02 determines the words in the query that are neither recognized as attribute words nor recognized as belonging to a map search mode to be the natural-language words.
If the attribute recognition unit 04 and the mode recognition unit 05 are both present, they may be arranged serially in either order or in parallel. In the serial case, the attribute recognition unit 04 may first perform attribute recognition on the words segmented by the word segmentation unit 01 to determine attribute words, after which the mode recognition unit 05 performs map search mode recognition, based on the mode expression table, on the words in the query not recognized as attribute words; this is the serial arrangement shown in Fig. 4. Alternatively, the mode recognition unit 05 may first perform map search mode recognition, based on the mode expression table, on the words segmented by the word segmentation unit 01, after which the attribute recognition unit 04 performs attribute recognition, based on the attribute vocabulary, on the words for which the mode recognition unit 05 recognized no map search mode.
The attribute vocabulary may be established by manual addition or by automatic mining. Both are existing, relatively mature techniques and are not described in detail here. The mode expression table is likewise established in advance based on common map search modes, and contains the expressions corresponding to each map search mode. When one or more words match an expression in the mode expression table, those words are recognized as belonging to the map search mode that corresponds to the matched expression. Similarly, the mode expression table may be established by manual addition or by automatic mining. With automatic mining, the apparatus further includes a mode establishing unit 06, configured to establish the mode expression table by specifically performing the following process: after word segmentation is performed on queries of a known map search mode, the words that hit the attribute vocabulary are filtered out, and the remaining words are determined to be mode words; the co-occurrence frequency of the mode words is counted, and the mode words are ranked by co-occurrence frequency; the mode words whose co-occurrence frequency ranking meets a preset requirement are selected to form the mode expressions of the known map search mode.
During tag mapping, the similarity between a natural-language word and a tag used by the mapping unit 02 may be expressed by the co-occurrence rate: the higher the co-occurrence rate, the greater the similarity. The co-occurrence rate between a natural-language word term1 and a tag tag1 is determined as follows: count the number of times N1 that term1 and tag1 co-occur in the same text or the same window in the corpus; count the total number of times N that term1 co-occurs, in the same text or the same window in the corpus, with each of the tags (including tag1); the co-occurrence rate between term1 and tag1 is then N1/N.
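A minimal sketch of the N1/N computation is given below. It assumes that the corpus is represented as a list of token sets, each set standing for one text (or one window), so that a co-occurrence is counted at most once per text. The function name and the toy corpus are hypothetical.

```python
def cooccurrence_rate(term, tag, documents, all_tags):
    """Return N1 / N for the given term and tag.

    N1: number of texts in which term and tag both occur.
    N:  sum, over every tag in all_tags, of the number of texts in which
        term and that tag both occur.
    """
    n1 = sum(1 for doc in documents if term in doc and tag in doc)
    n = sum(1 for doc in documents for t in all_tags if term in doc and t in doc)
    return n1 / n if n else 0.0


corpus = [                                   # hypothetical corpus
    {"eat", "restaurant"},
    {"eat", "restaurant", "hotel"},
    {"eat", "snack bar"},
    {"sleep", "hotel"},
]
tags = ["restaurant", "hotel", "snack bar"]  # hypothetical tag system
rate = cooccurrence_rate("eat", "restaurant", corpus, tags)
```

Here "eat" co-occurs twice with "restaurant", once with "hotel" and once with "snack bar", so N1 = 2 and N = 4.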
The above similarity may be computed in real time or obtained by looking up precomputed results; that is, the similarity between the words in some common natural-language queries and the tags in the tag system is computed in advance, and the precomputed results are looked up directly while the query input by the user is being parsed.
Afterwards, the mapping unit 02 determines, as the tag mapped to, the tag whose similarity with the natural-language word in the query meets a preset requirement, where the preset requirement is: the similarity is the highest, or the similarity reaches a preset threshold.
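The two forms of the preset requirement can be illustrated with the following sketch. It assumes that the similarities for one natural-language word are already available as a dictionary from tag to similarity; the function name and the sample values are hypothetical.

```python
def map_to_tags(similarities, threshold=None):
    """Return the tags mapped to for one natural-language word.

    threshold is None: keep the tag(s) with the highest similarity.
    threshold given:   keep every tag whose similarity reaches it.
    """
    if not similarities:
        return []
    if threshold is None:
        best = max(similarities.values())
        return [t for t, s in similarities.items() if s == best]
    return [t for t, s in similarities.items() if s >= threshold]


sims = {"restaurant": 0.5, "hotel": 0.25, "snack bar": 0.25}  # hypothetical values
best_tags = map_to_tags(sims)
```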
If the attribute recognition unit 04 recognizes an attribute word, the search unit 03, when determining the search keyword corresponding to the query according to the tag mapping result, combines the mapped tag and the recognized attribute word to form the search keyword corresponding to the query.
If the pattern recognition unit 05 recognizes a map search mode, the search unit 03 invokes the map search engine to search for the determined search keyword according to the recognized map search mode; otherwise, the search unit 03 invokes the map search engine to search for the determined search keyword according to the default map search mode.
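The keyword composition and the choice between the recognized and the default map search mode can be sketched as follows. The map search engine is represented by an arbitrary callable taking a keyword and a mode; this interface, the name of the default mode, and the order in which attribute words and tags are joined are all assumptions made for illustration.

```python
DEFAULT_MODE = "general"  # hypothetical name for the default map search mode


def build_keyword(mapped_tags, attribute_words):
    """Combine recognized attribute words and mapped tags into one keyword."""
    return " ".join(list(attribute_words) + list(mapped_tags))


def run_search(map_search, keyword, recognized_mode=None):
    """Call the map search engine with the recognized mode, else the default."""
    mode = recognized_mode if recognized_mode is not None else DEFAULT_MODE
    return map_search(keyword, mode)


def fake_engine(keyword, mode):
    """Stand-in for a real map search engine; it only echoes its inputs."""
    return (keyword, mode)


keyword = build_keyword(["restaurant"], ["Xidan"])
with_mode = run_search(fake_engine, keyword, "nearby")
without_mode = run_search(fake_engine, keyword)
```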
There is also the following application scenario: if the user inputs the query through ordinary general search, it cannot be assumed by default that the query has a map search demand; instead, whether the query input by the user has a map search demand must be judged. If at least one of the following occurs, namely the attribute recognition unit 04 recognizes an attribute word, the pattern recognition unit 05 recognizes a map search mode, or the mapping unit 02 maps to a tag, the search unit 03 can determine that the query has a map search demand. In that case, the search results of the map search engine can be embedded in the general search results and placed at a prominent position in them, for example ranked first or highlighted.
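The judgment described in this scenario reduces to an "at least one of three" test, which may be sketched as below. The result structure used for embedding (a dictionary placed first in the list) is a hypothetical choice that illustrates the "ranked first" example only.

```python
def has_map_demand(attribute_words, recognized_mode, mapped_tags):
    """True if at least one of the three recognition steps succeeded."""
    return bool(attribute_words) or recognized_mode is not None or bool(mapped_tags)


def merge_results(general_results, map_results, map_demand):
    """Embed the map results at the top of the general results when warranted."""
    if map_demand and map_results:
        return [{"source": "map", "items": list(map_results)}] + list(general_results)
    return list(general_results)


demand = has_map_demand([], None, ["restaurant"])
page = merge_results(["web result 1", "web result 2"], ["POI A", "POI B"], demand)
```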
In the several embodiments provided by the present invention, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware, or in hardware plus software functional units.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute some of the steps of the methods described in the embodiments of the present invention. The storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (18)

1. A method for parsing a search term (query) having a map search demand, characterized in that the method comprises:
S1: performing word segmentation on the query input by a user;
S2: performing tag mapping on the natural-language words in the query to obtain a tag mapping result: determining the tag mapped to according to the similarity between the natural-language words in the query and each tag in a tag system; wherein the tags in the tag system are point-of-interest (POI) attributes in the map and can hit the corresponding POIs; and wherein, in step S2, the words in the query that are neither recognized as map-related attribute words nor recognized as a map search mode are determined to be the natural-language words;
S3: determining the search keyword corresponding to the query according to the tag mapping result, the map search engine searching for the determined search keyword.
2. The method according to claim 1, characterized in that step S1 further comprises: removing stop words from the words obtained after word segmentation.
3. The method according to claim 1, characterized in that at least one of the following steps S11 and S12 is further included between step S1 and step S2:
S11: based on an attribute vocabulary, performing attribute recognition on the words obtained after word segmentation to determine map-related attribute words;
S12: based on a mode expression table, performing map search mode recognition on the words obtained after word segmentation.
4. The method according to claim 3, characterized in that the mode expression table is established as follows:
after word segmentation of queries with a known map search mode, filtering out, based on the attribute vocabulary, the words that hit the attribute vocabulary, and determining the remaining words to be mode words;
counting the co-occurrence frequency of the mode words and sorting them by co-occurrence frequency;
selecting the mode words whose co-occurrence frequency rank meets a preset requirement to form the mode expression of the known map search mode.
5. The method according to claim 1, characterized in that the similarity between a natural-language word and a tag in step S2 is expressed by the co-occurrence rate, the higher the co-occurrence rate, the greater the similarity; wherein the co-occurrence rate between a natural-language word x and a tag y is determined as follows:
counting the number of times N1 that x and y co-occur in the same text or the same window in the corpus; counting the total number of times N that x co-occurs, in the same text or the same window in the corpus, with each of the tags including y; and determining the co-occurrence rate between x and y to be N1/N.
6. The method according to claim 1 or 5, characterized in that, in step S2, the tag whose similarity with the natural-language word in the query meets a preset requirement is determined to be the tag mapped to, wherein the preset requirement is: the similarity is the highest, or the similarity reaches a preset threshold.
7. The method according to claim 3, characterized in that, if a map-related attribute word is recognized, determining the search keyword corresponding to the query according to the tag mapping result in step S3 is:
combining the mapped tag and the recognized map-related attribute word to form the search keyword corresponding to the query.
8. The method according to claim 3, characterized in that, if a map search mode is recognized, the map search engine searching for the determined search keyword in step S3 is: the map search engine searching for the determined search keyword according to the recognized map search mode;
otherwise, the map search engine searching for the determined search keyword in step S3 is: the map search engine searching for the determined search keyword according to the default map search mode.
9. The method according to claim 3, 7 or 8, characterized in that, if the user inputs the query through ordinary general search, then if at least one of the following occurs, namely a map-related attribute word is recognized, a map search mode is recognized, or a tag is mapped to, it is determined that the query has a map search demand; the search results of the map search engine in step S3 are embedded in the general search results, and the search results of the map search engine in step S3 are placed at a prominent position in the general search results.
10. A device for parsing a search term (query) having a map search demand, characterized in that the device comprises:
a word segmentation unit, configured to perform word segmentation on the query input by a user;
a mapping unit, configured to perform tag mapping on the natural-language words in the query to obtain a tag mapping result: determining the tag mapped to according to the similarity between the natural-language words in the query and each tag in a tag system; wherein the tags in the tag system are POI attributes in the map and can hit the corresponding POIs; and wherein the mapping unit determines the words in the query that are neither recognized as map-related attribute words nor recognized as a map search mode to be the natural-language words;
a search unit, configured to determine the search keyword corresponding to the query according to the tag mapping result of the mapping unit, and to invoke the map search engine to search for the determined search keyword.
11. The device according to claim 10, characterized in that the word segmentation unit is further configured to remove stop words from the words obtained after word segmentation.
12. The device according to claim 10, characterized in that the device further comprises at least one of an attribute recognition unit and a pattern recognition unit;
the attribute recognition unit is configured to perform, based on an attribute vocabulary, attribute recognition on the words obtained after word segmentation to determine map-related attribute words;
the pattern recognition unit is configured to perform, based on a mode expression table, map search mode recognition on the words obtained after word segmentation.
13. The device according to claim 12, characterized in that the device further comprises a model establishment unit, configured to establish the mode expression table by specifically performing:
after word segmentation of queries with a known map search mode, filtering out, based on the attribute vocabulary, the words that hit the attribute vocabulary, and determining the remaining words to be mode words;
counting the co-occurrence frequency of the mode words and sorting them by co-occurrence frequency;
selecting the mode words whose co-occurrence frequency rank meets a preset requirement to form the mode expression of the known map search mode.
14. The device according to claim 10, characterized in that the similarity between a natural-language word and a tag used by the mapping unit is expressed by the co-occurrence rate, the higher the co-occurrence rate, the greater the similarity; wherein the co-occurrence rate between a natural-language word x and a tag y is determined as follows:
counting the number of times N1 that x and y co-occur in the same text or the same window in the corpus; counting the total number of times N that x co-occurs, in the same text or the same window in the corpus, with each of the tags including y; and determining the co-occurrence rate between x and y to be N1/N.
15. The device according to claim 10 or 14, characterized in that the mapping unit determines the tag whose similarity with the natural-language word in the query meets a preset requirement to be the tag mapped to, wherein the preset requirement is: the similarity is the highest, or the similarity reaches a preset threshold.
16. The device according to claim 12, characterized in that, if the attribute recognition unit recognizes a map-related attribute word, the search unit, when determining the search keyword corresponding to the query according to the tag mapping result, combines the mapped tag and the recognized map-related attribute word to form the search keyword corresponding to the query.
17. The device according to claim 12, characterized in that, if the pattern recognition unit recognizes a map search mode, the search unit invokes the map search engine to search for the determined search keyword according to the recognized map search mode; otherwise, the search unit invokes the map search engine to search for the determined search keyword according to the default map search mode.
18. The device according to claim 12, 16 or 17, characterized in that, if the user inputs the query through ordinary general search, then if at least one of the following occurs, namely the attribute recognition unit recognizes a map-related attribute word, the pattern recognition unit recognizes a map search mode, or the mapping unit maps to a tag, the search unit determines that the query has a map search demand, embeds the search results of the map search engine in the general search results, and places the search results of the map search engine at a prominent position in the general search results.
CN201310156743.3A 2013-04-28 2013-04-28 The method and apparatus that search terms with map demand are parsed Active CN104123319B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310156743.3A CN104123319B (en) 2013-04-28 2013-04-28 The method and apparatus that search terms with map demand are parsed

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310156743.3A CN104123319B (en) 2013-04-28 2013-04-28 The method and apparatus that search terms with map demand are parsed

Publications (2)

Publication Number Publication Date
CN104123319A CN104123319A (en) 2014-10-29
CN104123319B true CN104123319B (en) 2019-08-27

Family

ID=51768731

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310156743.3A Active CN104123319B (en) 2013-04-28 2013-04-28 The method and apparatus that search terms with map demand are parsed

Country Status (1)

Country Link
CN (1) CN104123319B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104462531A (en) * 2014-12-23 2015-03-25 北京奇虎科技有限公司 Method and system for determining whether search term invokes map interface
CN104537041B (en) * 2014-12-23 2018-05-04 北京奇虎科技有限公司 A kind of definite user's query word whether the method and system of invocation map interface
CN110609880A (en) * 2018-06-15 2019-12-24 北京搜狗科技发展有限公司 Information query method and device and electronic equipment
CN109783589B (en) * 2018-12-13 2023-07-25 中国平安人寿保险股份有限公司 Method, device and storage medium for resolving address of electronic map
CN111666292B (en) * 2020-04-24 2023-05-26 百度在线网络技术(北京)有限公司 Similarity model establishment method and device for retrieving geographic position

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101339551A (en) * 2007-07-05 2009-01-07 日电(中国)有限公司 Natural language query demand extension equipment and its method
CN102855251A (en) * 2011-06-30 2013-01-02 北京百度网讯科技有限公司 Method and device for requirement identification
CN102880721A (en) * 2012-10-15 2013-01-16 瑞庭网络技术(上海)有限公司 Implementation method of vertical search engine

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7937402B2 (en) * 2006-07-10 2011-05-03 Nec (China) Co., Ltd. Natural language based location query system, keyword based location query system and a natural language and keyword based location query system

Also Published As

Publication number Publication date
CN104123319A (en) 2014-10-29

Similar Documents

Publication Publication Date Title
CN106156365B (en) A kind of generation method and device of knowledge mapping
CN107515877B (en) Sensitive subject word set generation method and device
CN104636465B (en) Web-page summarization generation method, methods of exhibiting and related device
CN104123319B (en) The method and apparatus that search terms with map demand are parsed
CN104537341B (en) Face picture information getting method and device
CN106844571B (en) Method and device for identifying synonyms and computing equipment
CN106874441A (en) Intelligent answer method and apparatus
CN110516047A (en) The search method and searching system of knowledge mapping based on packaging field
CN109918560A (en) A kind of answering method and device based on search engine
US8949227B2 (en) System and method for matching entities and synonym group organizer used therein
CN104199833B (en) The clustering method and clustering apparatus of a kind of network search words
US20110112995A1 (en) Systems and methods for organizing collective social intelligence information using an organic object data model
CN106970991B (en) Similar application identification method and device, application search recommendation method and server
CN103313248B (en) Method and device for identifying junk information
CN102495892A (en) Webpage information extraction method
CN106649849A (en) Text information base building method and device and searching method, device and system
CN103593412B (en) A kind of answer method and system based on tree structure problem
CN105787134B (en) Intelligent answer method, apparatus and system
CN109829045A (en) A kind of answering method and device
CN107943514A (en) The method for digging and system of core code element in a kind of software document
KR101319413B1 (en) Summary Information Generating System and Method for Review of Product and Service
CN107861944A (en) A kind of text label extracting method and device based on Word2Vec
CN115840812A (en) Method and system for intelligently matching enterprises according to policy text
CN106202312B (en) A kind of interest point search method and system for mobile Internet
CN110750626B (en) Scene-based task-driven multi-turn dialogue method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant