CN106653006B - Searching method and device based on interactive voice - Google Patents

Publication number
CN106653006B
Authority
CN
China
Legal status
Active
Application number
CN201611019821.5A
Other languages
Chinese (zh)
Other versions
CN106653006A (en
Inventor
郎文静
李裕东
朱群燕
石远
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L2015/081 Search algorithms, e.g. Baum-Welch or Viterbi

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a search method and device based on voice interaction. The search method based on voice interaction includes: obtaining a first search term according to voice data provided by a user for searching, and obtaining a plurality of second search terms that form the preceding context of the first search term; generating a plurality of word segmentation combinations according to the first word segmentation result of the first search term and the second word segmentation result of each second search term; selecting, according to the relevance score of each word segmentation combination, a target word segmentation combination from the plurality of word segmentation combinations as a third search term; and searching according to the third search term. The invention allows a search to draw on the context of the voice search, which improves search accuracy and the user's search experience.

Description

Search method and device based on voice interaction
Technical Field
The invention relates to the technical field of internet, in particular to a search method and a search device based on voice interaction.
Background
With the continuous development of internet technology, the cost of keyboard input keeps rising. Voice search builds on strong voice recognition capability and lets a user start a search quickly with a voice command, making search faster, more direct and more intelligent. In the related art, conventional voice search works in a one-instruction-in, one-response-out mode: each search is independent of the previous and the next one, that is, searches in the same retrieval sequence are unrelated.
As a result, when a user wants to ask a related follow-up question or run a supplementary search on the current search term, the user has to restate the subject and give a complete expression of the retrieval requirement. The search cannot draw on the context of the voice search, and search accuracy is low when the spoken query is abbreviated.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, an object of the present invention is to provide a search method based on voice interaction, which can perform a search in combination with context information of a voice search, so as to improve search accuracy and improve user search experience.
Another object of the present invention is to provide a search device based on voice interaction.
It is still another object of the present invention to provide a search apparatus based on voice interaction.
It is another object of the invention to propose a non-transitory computer-readable storage medium.
It is a further object of the invention to propose a computer program product.
To achieve the above object, a search method based on voice interaction according to an embodiment of a first aspect of the present invention includes: acquiring a first search term according to voice data provided by a user for searching, and acquiring a plurality of second search terms that form the preceding context of the first search term; generating a plurality of word segmentation combinations according to the first word segmentation result of the first search term and the second word segmentation result of each second search term; selecting, according to the relevance score of each word segmentation combination, a target word segmentation combination from the plurality of word segmentation combinations as a third search term; and searching according to the third search term.
The search method based on voice interaction provided by the embodiment of the first aspect of the present invention generates a plurality of word segmentation combinations according to the first word segmentation result of the first search term and the second word segmentation result of each second search term; selects, according to the relevance score of each word segmentation combination, a target word segmentation combination from the plurality of word segmentation combinations as a third search term; and searches according to the third search term. The search can therefore draw on the context of the voice search, which improves search accuracy and the user's search experience.
To achieve the above object, a search device based on voice interaction according to an embodiment of a second aspect of the present invention includes: a first acquisition module, configured to acquire a first search term according to voice data provided by a user for searching; a second acquisition module, configured to acquire a plurality of second search terms that form the preceding context of the first search term; a generating module, configured to generate a plurality of word segmentation combinations according to the first word segmentation result of the first search term and the second word segmentation result of each second search term; a selection module, configured to select, according to the relevance score of each word segmentation combination, a target word segmentation combination from the plurality of word segmentation combinations as a third search term; and a searching module, configured to search according to the third search term.
The search device based on voice interaction provided by the embodiment of the second aspect of the present invention generates a plurality of word segmentation combinations according to the first word segmentation result of the first search term and the second word segmentation result of each second search term; selects, according to the relevance score of each word segmentation combination, a target word segmentation combination from the plurality of word segmentation combinations as a third search term; and searches according to the third search term. The search can therefore draw on the context of the voice search, which improves search accuracy and the user's search experience.
In order to achieve the above object, a search device based on voice interaction according to a third aspect of the present invention is characterized by comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
acquire a first search term according to voice data provided by a user for searching, and acquire a plurality of second search terms that form the preceding context of the first search term;
generate a plurality of word segmentation combinations according to the first word segmentation result of the first search term and the second word segmentation result of each second search term;
select, according to the relevance score of each word segmentation combination, a target word segmentation combination from the plurality of word segmentation combinations as a third search term;
and search according to the third search term.
The search device based on voice interaction provided by the embodiment of the third aspect of the present invention generates a plurality of word segmentation combinations according to the first word segmentation result of the first search term and the second word segmentation result of each second search term; selects, according to the relevance score of each word segmentation combination, a target word segmentation combination from the plurality of word segmentation combinations as a third search term; and searches according to the third search term. The search can therefore draw on the context of the voice search, which improves search accuracy and the user's search experience.
To achieve the above object, an embodiment of a fourth aspect of the present invention provides a non-transitory computer-readable storage medium; when instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to perform a search method based on voice interaction, the method including:
acquiring a first search term according to voice data provided by a user for searching, and acquiring a plurality of second search terms that form the preceding context of the first search term;
generating a plurality of word segmentation combinations according to the first word segmentation result of the first search term and the second word segmentation result of each second search term;
selecting, according to the relevance score of each word segmentation combination, a target word segmentation combination from the plurality of word segmentation combinations as a third search term;
and searching according to the third search term.
The non-transitory computer-readable storage medium provided by the embodiment of the fourth aspect of the present invention generates a plurality of word segmentation combinations according to the first word segmentation result of the first search term and the second word segmentation result of each second search term; selects, according to the relevance score of each word segmentation combination, a target word segmentation combination from the plurality of word segmentation combinations as a third search term; and searches according to the third search term. The search can therefore draw on the context of the voice search, which improves search accuracy and the user's search experience.
To achieve the above object, an embodiment of a fifth aspect of the present invention provides a computer program product; when instructions in the computer program product are executed by a processor, a search method based on voice interaction is performed, the method including:
acquiring a first search term according to voice data provided by a user for searching, and acquiring a plurality of second search terms that form the preceding context of the first search term;
generating a plurality of word segmentation combinations according to the first word segmentation result of the first search term and the second word segmentation result of each second search term;
selecting, according to the relevance score of each word segmentation combination, a target word segmentation combination from the plurality of word segmentation combinations as a third search term;
and searching according to the third search term.
The computer program product provided by the embodiment of the fifth aspect of the present invention generates a plurality of word segmentation combinations according to the first word segmentation result of the first search term and the second word segmentation result of each second search term; selects, according to the relevance score of each word segmentation combination, a target word segmentation combination from the plurality of word segmentation combinations as a third search term; and searches according to the third search term. The search can therefore draw on the context of the voice search, which improves search accuracy and the user's search experience.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic flowchart of a search method based on voice interaction according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a search method based on voice interaction according to another embodiment of the present invention;
FIG. 3 is a flowchart illustrating a search method based on voice interaction according to another embodiment of the present invention;
FIG. 4 is a schematic diagram of a voice search interface in an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a search apparatus based on voice interaction according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a search apparatus based on voice interaction according to another embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention. On the contrary, the embodiments of the invention include all changes, modifications and equivalents coming within the spirit and terms of the claims appended hereto.
FIG. 1 is a flowchart illustrating a search method based on voice interaction according to an embodiment of the present invention. In this embodiment, the method is described as being configured in a search apparatus based on voice interaction. The method can be applied to a search engine of an electronic device, where a search engine is a system that receives search information input by a user, collects information related to the search information from the Internet, organizes and processes that information, provides a retrieval service, and displays the related information to the user.
The electronic device is, for example, a Personal Computer (PC), a cloud device or a mobile device, and the mobile device is, for example, a smart phone or a tablet Computer.
Referring to fig. 1, the search method based on voice interaction includes:
S11: acquiring a first search term according to voice data provided by a user for searching, and acquiring a plurality of second search terms that form the preceding context of the first search term.
Conventional voice search works in a one-instruction-in, one-response-out mode. Each search is independent of the previous and the next one; that is, searches in the same retrieval sequence are unrelated. If a user wants to ask a related follow-up question or run a supplementary search on the current search term, the user has to repeat the subject and give a complete expression of the retrieval requirement. The search cannot draw on the context of the voice search, and search accuracy is low when the spoken query is abbreviated.
In the embodiment of the present invention, the search text and the deeper requirement behind the current search are analyzed. If the historical search terms include an initial word segmentation combination containing the first word segmentation result of the first search term and the second word segmentation result of the second search term, the search is performed according to that initial combination, which effectively improves search efficiency. If the historical search terms include no such initial combination, the first segmented words and the second segmented words are combined according to the type information of the first search term to obtain a plurality of word segmentation combinations, the relevance score of each combination is calculated according to a related algorithm, and the combination with the highest score is used as the final search term. The search can thus draw on the context of the voice search, which improves search accuracy and the user's search experience.
In the embodiment of the invention, voice data provided by a user and used for searching can be received, a first text corresponding to the voice data is obtained, and the corresponding first text is used as a first search word.
Optionally, the user may click the microphone button in the search box of the search engine and input the voice data. The voice collection module of the search engine collects the voice data and obtains the corresponding first text, which is used as the first search term.
It will be appreciated that the user may ask a related follow-up question or run a supplementary search on the current search results, so the preceding context of the current first search term may contain more than one item.
For example, the user first inputs the voice data "United States" into the search engine, and the search result page presents information about the United States. When the user then wants to query the population of the United States and inputs "how many people are there", the first search term is "how many people are there" and, given the preceding context, the second search term is "United States". The user may go on to ask "where is the capital"; the first search term is then "where is the capital", and the second search terms are "United States" and "population size".
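As an illustration only, the relationship between the first search term and its preceding context could be modeled as below. The class name `SearchSession` and its `submit` method are hypothetical and do not appear in the patent; the sketch simply treats every earlier query in the session as a second search term.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SearchSession:
    """Hypothetical container for one voice-search session (not from the patent)."""
    history: List[str] = field(default_factory=list)  # earlier search terms, oldest first

    def submit(self, recognized_text: str) -> Tuple[str, List[str]]:
        """Return (first search term, second search terms) for a new query."""
        second_terms = list(self.history)  # preceding context of the new query
        self.history.append(recognized_text)
        return recognized_text, second_terms


session = SearchSession()
session.submit("United States")
first, seconds = session.submit("how many people are there")
```

Here `first` is the newly recognized text and `seconds` holds the earlier queries of the same session.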
S12: generating a plurality of word segmentation combinations according to the first word segmentation result of the first search term and the second word segmentation result of each second search term.
In the embodiment of the present invention, word segmentation may be performed on the first search term and the second search terms by using related techniques to obtain their lexical information, that is, the first word segmentation result and the second word segmentation result.
It should be noted that, for a first search term that has not appeared in the history, word segmentation based on context features can filter out identical first-order segmented words and fuzzy-sound variants that share the same pinyin.
In the embodiment of the present invention, it may be determined whether the historical search terms include an initial word segmentation combination containing the first word segmentation result and the second word segmentation result. If such an initial combination exists, it is used as the generated word segmentation combination. If it does not exist, first feature information of each first segmented word in the first word segmentation result and second feature information of each second segmented word in the second word segmentation result are extracted; the type information of the first search term is determined according to the first feature information and the second feature information; and the first segmented words and the second segmented words are combined according to the type information of the first search term to obtain a plurality of word segmentation combinations.
In some embodiments, referring to fig. 2, step S12 specifically includes:
S21: determining whether the historical search terms include an initial word segmentation combination containing the first word segmentation result and the second word segmentation result; if so, executing S22; otherwise, executing S23.
Optionally, when the historical search terms include such an initial word segmentation combination, it is used directly as the generated word segmentation combination; when they do not, S23 is executed.
S22: using the initial word segmentation combination as the generated word segmentation combination.
Optionally, when the historical search terms include an initial word segmentation combination containing the first word segmentation result and the second word segmentation result, the search is performed according to that combination, which effectively improves search efficiency.
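The check in S21 and S22 can be sketched as follows. This is a minimal illustration that assumes "containing" means that a historical search term includes every segmented word of both results; the patent does not define the matching rule, and the function name `find_initial_combination` is hypothetical.

```python
from typing import Iterable, List, Optional


def find_initial_combination(
    first_words: Iterable[str],
    second_words: Iterable[str],
    history: List[List[str]],
) -> Optional[List[str]]:
    """Return the first historical search term (as a word list) that contains
    every word of both segmentation results, or None if there is none."""
    needed = set(first_words) | set(second_words)
    for past in history:
        if needed <= set(past):  # assumption: containment is set inclusion
            return past          # S22: reuse the historical combination
    return None                  # S23 onwards: build combinations from features


history = [["Beijing", "weather"], ["United States", "how many", "people"]]
hit = find_initial_combination(["how many", "people"], ["United States"], history)
miss = find_initial_combination(["weight"], ["Liu De Hua"], history)
```

A hit short-circuits the remaining steps, which is where the efficiency gain described above comes from.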
S23: extracting first feature information of each first segmented word in the first word segmentation result, and extracting second feature information of each second segmented word in the second word segmentation result.
In an embodiment of the present invention, the first/second feature information includes at least one of the following for the first/second segmented word: entity type, weight, user historical search frequency, click frequency after the user's historical searches, colloquial-generalization penalty weight, and edit distance.
The entity type is the entity information of the first/second segmented word, such as a person name or a place name.
The weight describes the importance of the first/second segmented word within the first/second search term.
The user historical search frequency is used to determine the relevance score of each word segmentation combination in subsequent steps.
The colloquial-generalization penalty weight reflects the normalization of colloquial expressions in the first/second segmented word. Voice data contains many colloquial expressions, for example "I want to search ……" and "find …… what is what's what", so colloquial generalization needs to be normalized. Specifically, the first/second segmented words are normalized against a manually constructed colloquial word list in which each word has a penalty weight; the higher the penalty weight, the more colloquial the word.
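As a rough sketch of the colloquial word list described above, the lookup could look like this. The table contents, the weights, the threshold, and the decision to drop highly colloquial words are all assumptions made for illustration; the patent only states that each word in the list carries a penalty weight.

```python
from typing import Dict, List, Tuple

# Hypothetical hand-built colloquial word list; words and weights are illustrative only.
COLLOQUIAL_PENALTY: Dict[str, float] = {"I want to": 0.9, "search": 0.6, "find": 0.6}


def penalize(words: List[str],
             table: Dict[str, float] = COLLOQUIAL_PENALTY) -> List[Tuple[str, float]]:
    """Attach a penalty weight to each segmented word (0.0 if not in the list)."""
    return [(w, table.get(w, 0.0)) for w in words]


def strip_colloquial(words: List[str],
                     table: Dict[str, float] = COLLOQUIAL_PENALTY,
                     threshold: float = 0.5) -> List[str]:
    """Assumed normalization: drop words whose penalty weight exceeds the threshold."""
    return [w for w, p in penalize(words, table) if p <= threshold]


cleaned = strip_colloquial(["I want to", "search", "Beijing", "weather"])
```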
Edit distance is a common measure of string similarity: the minimum number of single-character edits needed to turn one string into another.
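The patent does not say which variant of edit distance is used. The sketch below assumes the standard Levenshtein distance (insertions, deletions and substitutions each cost 1), computed by dynamic programming with a rolling row.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (assumed variant, unit costs)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]
```

A smaller distance between two segmented words indicates that they are more similar.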
Optionally, the first feature information of each first segmented word in the first word segmentation result and the second feature information of each second segmented word in the second word segmentation result may be extracted so that the type information of the first search term can be determined from them.
S24: determining the type information of the first search term according to the first feature information and the second feature information.
The type information includes: first type information, indicating that the first search term is a supplementary search on the second search term; and second type information, indicating that the first search term is a follow-up question search on the second search term.
In an embodiment of the present invention, the first type information means that the subject semantics are unchanged and the requirements are strongly related; for example, "download in cool day" is the subject word and "download" is the requirement word. The second type information means that the subject semantics change while the requirement semantics remain unchanged.
Optionally, the entity and attribute components of the first search term and the second search terms may be identified with the help of an entity attribute repository. It should be noted that the same meaning can be expressed in different ways in natural language, and this also holds for entity attributes: the same attribute requirement may be phrased differently. For example, differently worded questions about how many people there are all express the same attribute requirement, "people". The common requirement stem therefore needs to be extracted from the different phrasings by a stem extraction technique.
Specifically, the correlation between the entity attribute information of the current first search term and the entity attributes of the plurality of second search terms in the preceding context is determined; that is, according to the first feature information and the second feature information, it is determined whether the first search term changes the attribute or changes the entity.
For example, when the first search term is "that weight?" and the second search term is "Liu De Hua height", entity attribute matching yields the entity "Liu De Hua" and shows that "height" and "weight" are both attributes of "Liu De Hua". The first search term therefore changes the attribute while the subject semantics are unchanged, which is the first type information.
Alternatively, when the first search term is "the Yaohonging?" and the second search term is "Liu De Hua height", entity attribute matching yields the entities "Liu De Hua" and "Yaohonging", whose common attribute is "height". The first search term therefore changes the subject semantics while the requirement semantics are unchanged, which is the second type information.
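The two examples above can be reproduced with a small sketch. The repository contents, the constant names and the decision rule are assumptions for illustration; the patent describes entity attribute matching only in general terms.

```python
from typing import Dict, List, Optional, Set

# Hypothetical entity attribute repository: entity -> known attributes.
ENTITY_ATTRIBUTES: Dict[str, Set[str]] = {
    "Liu De Hua": {"height", "weight"},
    "Yaohonging": {"height", "weight"},
}

SUPPLEMENTARY = "first type (supplementary search)"
FOLLOW_UP = "second type (follow-up question search)"


def classify(first_words: List[str], second_words: List[str],
             repo: Dict[str, Set[str]] = ENTITY_ATTRIBUTES) -> Optional[str]:
    """Return the assumed type of the first search term, or None if undecided."""
    attributes = set().union(*repo.values())
    first_entities = [w for w in first_words if w in repo]
    first_attrs = [w for w in first_words if w in attributes]
    second_entities = [w for w in second_words if w in repo]
    if first_entities and first_entities != second_entities:
        return FOLLOW_UP      # entity changed, requirement carried over
    if first_attrs and second_entities:
        return SUPPLEMENTARY  # entity unchanged, attribute changed
    return None


attribute_change = classify(["weight"], ["Liu De Hua", "height"])
entity_change = classify(["Yaohonging"], ["Liu De Hua", "height"])
```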
S25: combining the first segmented words and the second segmented words according to the type information of the first search term to obtain a plurality of word segmentation combinations.
As an example, when the first search term is of the second type, that is, a follow-up question search on the second search term, for example when the first search term is "What about France?" and the second search term is "Where is the capital of the United States", the resulting word segmentation combinations can be as shown in Table 1.
TABLE 1
In this embodiment, when the historical search terms include an initial word segmentation combination containing the first word segmentation result and the second word segmentation result, that combination is used as the generated word segmentation combination, which effectively improves search efficiency. When no such combination exists, the type information of the first search term is determined according to the first feature information and the second feature information, and the first segmented words and the second segmented words are combined according to that type information to obtain a plurality of word segmentation combinations. This makes it possible to subsequently search with the combination that has the highest relevance score, so that the user can express the retrieval requirement freely, the voice search interaction is more intelligent, search accuracy is improved, and the user's search experience is improved.
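The contents of Table 1 are not reproduced in this text, so the exact combination rule is unknown. Purely as an assumption, the sketch below pairs the first term's segmented words with every non-empty subset of the context words and ignores the type information; `generate_combinations` is a hypothetical name.

```python
from itertools import combinations
from typing import List, Tuple


def generate_combinations(first_words: List[str],
                          second_words: List[str]) -> List[Tuple[str, ...]]:
    """Pair the first term's words with every non-empty subset of the context words."""
    result: List[Tuple[str, ...]] = []
    for size in range(1, len(second_words) + 1):
        for subset in combinations(second_words, size):
            result.append(tuple(first_words) + subset)
    return result


candidates = generate_combinations(["weight"], ["Liu De Hua", "height"])
```

A type-aware version would prune this list, for instance by keeping the context entity for a supplementary search and replacing it for a follow-up question search.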
In some embodiments, after step S12, the method further includes:
S31: determining the frequency of occurrence of each word segmentation combination in the user search history, the first user historical search frequency of the first segmented word corresponding to the combination, and the second user historical search frequency of the second segmented word corresponding to the combination.
Optionally, let the frequency of occurrence of the i-th word segmentation combination in the user search history be $Nxy_i$, the first user historical search frequency of its first segmented word be $Nx_i$, and the second user historical search frequency of its second segmented word be $Ny_i$, where $i = 1, 2, \ldots, M$, $M$ is the number of word segmentation combinations, and $N$ is the total search frequency in the user history.
S32: determining the relevance score according to the frequency of occurrence, the first user historical search frequency and the second user historical search frequency.
In the embodiment of the present invention, the relevance score $\mathrm{Corr}(i)$ may be determined from the frequency of occurrence $Nxy_i$, the first user historical search frequency $Nx_i$ and the second user historical search frequency $Ny_i$ by the following preset formula:
$$\mathrm{Corr}(i) = \log_{10}\!\left(\frac{N}{Nx_i}\right) \cdot \log_{10}\!\left(\frac{N}{Ny_i}\right) \cdot \frac{Nxy_i}{Nx_i + Ny_i - Nxy_i}$$
Determining the relevance score according to the frequency of occurrence, the first user historical search frequency and the second user historical search frequency makes it possible to subsequently search with the word segmentation combination that has the highest relevance score, so that the user can express the retrieval requirement freely, the voice search interaction is more intelligent, search accuracy is improved, and the user's search experience is improved.
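The preset formula translates directly into code. In the sketch below the parameter names are hypothetical, and returning 0.0 for zero or negative counts is an assumption added to avoid division by zero; the patent does not say how such cases are handled.

```python
import math


def relevance_score(n_total: int, n_x: int, n_y: int, n_xy: int) -> float:
    """Corr(i) = log10(N/Nx_i) * log10(N/Ny_i) * Nxy_i / (Nx_i + Ny_i - Nxy_i)."""
    if min(n_total, n_x, n_y) <= 0:
        return 0.0  # assumption: words never searched contribute no relevance
    union = n_x + n_y - n_xy  # searches containing either word
    if union <= 0:
        return 0.0
    return math.log10(n_total / n_x) * math.log10(n_total / n_y) * n_xy / union


# Illustrative counts only, not data from the patent.
score = relevance_score(n_total=1000, n_x=10, n_y=100, n_xy=5)
```

The two logarithmic factors down-weight words that are searched very often, and the last factor is the share of searches containing either word that contain both.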
S13: selecting a target word segmentation combination from the plurality of word segmentation combinations according to the relevance score of each word segmentation combination, and using the target word segmentation combination as a third search term.
In the embodiment of the present invention, the word segmentation combination with the highest relevance score may be selected from the plurality of word segmentation combinations as the target word segmentation combination, and the target word segmentation combination may be used as the third search term.
For example, when the first search term is of the second type (a follow-up question search on the second search term), the first search term is "where is the capital of France?", and the second search term is "where is the capital of the United States", the relevance score of each word segmentation combination in Table 1 is obtained according to the preset formula, as shown in Table 2. (The processing for the first type, in which the first search term is a supplementary search on the second search term, is similar and is not repeated here.)
TABLE 2
The word segmentation combination with the highest relevance score, "where is the capital of France", may be selected from the plurality of word segmentation combinations as the target word segmentation combination and used as the third search term, so that the search is performed according to the third search term.
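Step S13 amounts to taking the maximum over the scored combinations. A minimal sketch follows, assuming the scores are held in a dictionary keyed by combination; the function name and the sample scores in the usage note are hypothetical and are not the values of Table 2.

```python
from typing import Dict, Tuple

Combination = Tuple[str, ...]  # assumed representation: a tuple of participles


def select_target_combination(scores: Dict[Combination, float]) -> Combination:
    """Return the word segmentation combination with the highest relevance score."""
    if not scores:
        raise ValueError("no word segmentation combinations to choose from")
    # On ties, max() keeps the first combination in the dictionary's order.
    return max(scores, key=scores.get)
```

Called with the invented scores `{("usa", "how many people"): 0.42, ("population size", "how many people"): 0.05}`, the function returns `("usa", "how many people")`, which would then be used as the third search term.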
S14: searching according to the third search term.
As an example, refer to fig. 4, which is a schematic diagram of a voice search interface in an embodiment of the present invention. When the voice data input by the user is "USA" (the second search term), the search result page presents information related to the USA. When the user then wants to query the population of the USA, the user only needs to input the voice data "how many people" (the first search term). Combining this with the preceding context, the embodiment automatically identifies that the first search term is of the first type (a supplementary search on the second search term), obtains the third search term "how many people are in the USA", and triggers the search, thereby meeting the user's need.
Similarly, the user can go on to ask "where is the capital" (the first search term). Combining this with the preceding context, in which the second search terms are "USA" and "population size", the embodiment automatically identifies that the first search term is of the first type (a supplementary search on the second search term), obtains the third search term "where is the capital of the USA", and displays the search result.
Further, to accommodate the user's spoken-language expression, the user may also ask a natural follow-up such as "what about France?". Combining this with the preceding context, the embodiment automatically identifies that the first search term is of the second type (a follow-up question search on the second search term), obtains the third search term "where is the capital of France", and displays the search result.
In this embodiment, a plurality of word segmentation combinations are generated according to the first segmentation result of the first search term and the second segmentation result of each second search term; a target word segmentation combination is selected from the plurality of word segmentation combinations according to the relevance score of each combination and used as a third search term; and the search is performed according to the third search term. The search can thus draw on the context of the voice search, which improves search accuracy and the user's search experience.
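The overall flow from the first search term to the final search can be pictured with the sketch below. It is an assumption-laden outline rather than the patented implementation: speech recognition is taken as already done, the four callables (`segment`, `generate`, `score`, `search`) are hypothetical stand-ins for components the text does not specify, joining participles with spaces is an illustrative choice, and falling back to the first search term when no combination is produced is an added assumption.

```python
from typing import Callable, List, Sequence, Tuple

Combination = Tuple[str, ...]  # assumed representation: a tuple of participles


def contextual_search(
    first_query: str,
    context_queries: Sequence[str],
    segment: Callable[[str], Sequence[str]],
    generate: Callable[[Sequence[str], Sequence[str]], List[Combination]],
    score: Callable[[Combination], float],
    search: Callable[[str], str],
) -> str:
    """Combine the first search term with its context, then search the best combination."""
    first_tokens = segment(first_query)
    candidates: List[Combination] = []
    for query in context_queries:  # each second search term from the context
        candidates.extend(generate(first_tokens, segment(query)))
    if not candidates:
        # Assumed fallback: search the first search term unchanged.
        return search(first_query)
    target = max(candidates, key=score)  # highest relevance score wins
    third_query = " ".join(target)       # third search term
    return search(third_query)
```

Passing the components in as callables keeps the sketch independent of any particular segmenter, scoring function, or search backend.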
Fig. 5 is a schematic structural diagram of a search apparatus based on voice interaction according to an embodiment of the present invention. The voice interaction based search apparatus 500 may be implemented by software, hardware, or a combination of both.
Referring to fig. 5, the voice interaction based search apparatus 500 may include a first obtaining module 510, a second obtaining module 520, a generating module 530, a selecting module 540, and a searching module 550, wherein:
The first obtaining module 510 is configured to obtain a first search term according to voice data provided by a user for searching.
Optionally, the first obtaining module 510 is specifically configured to receive voice data provided by the user for searching, obtain a first text corresponding to the voice data, and use the first text as the first search term.
The second obtaining module 520 is configured to obtain a plurality of second search terms that form the preceding context (the above information) of the first search term.
The generating module 530 is configured to generate a plurality of segmentation combinations according to the first segmentation result of the first search term and the second segmentation result of each second search term.
The selecting module 540 is configured to select a target word segmentation combination from the plurality of word segmentation combinations according to the relevance score of each word segmentation combination, and to use the target word segmentation combination as the third search term.
Optionally, the selecting module 540 is specifically configured to select the word segmentation combination with the highest relevance score from the plurality of word segmentation combinations as the target word segmentation combination, and to use the target word segmentation combination as the third search term.
The searching module 550 is configured to perform a search according to the third search term.
In some embodiments, referring to fig. 6, the voice interaction based search apparatus 500 may further include the following.
Optionally, the generating module 530 includes:
The judging sub-module 531 is configured to judge whether an initial segmentation combination containing the first segmentation result and the second segmentation result exists in the historical search terms.
The first processing sub-module 532 is configured to use the initial segmentation combination as the generated segmentation combination when an initial segmentation combination containing the first segmentation result and the second segmentation result exists.
The extracting sub-module 533 is configured to, when there is no initial segmentation combination including the first segmentation result and the second segmentation result, extract first feature information of each first segmentation in the first segmentation result, and extract second feature information of each second segmentation in the second segmentation result.
Optionally, the first/second feature information includes at least one of:
entity type, weight, user historical search frequency, click frequency after user historical searches, penalty weight for pan-colloquial expressions, and edit distance of the first participle/the second participle.
The determining sub-module 534 is configured to determine type information of the first search term according to the first feature information and the second feature information, where the type information includes: first type information, indicating that the first search term is a supplementary search on the second search term, and second type information, indicating that the first search term is a follow-up question (inquiry) search on the second search term.
The second processing sub-module 535 is configured to combine the first participles and the second participles according to the type information of the first search term, so as to obtain multiple word segmentation combinations.
The first determining module 560 is configured to determine the frequency of occurrence of each word segmentation combination in the user search history, the first user historical search frequency of the first participle corresponding to the combination, and the second user historical search frequency of the second participle corresponding to the combination.
The second determining module 570 is configured to determine a relevance score according to the frequency of occurrence, the first user historical search frequency, and the second user historical search frequency.
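The work of the second processing sub-module 535 can be illustrated with the sketch below. The combination rules are assumptions inferred from the fig. 4 examples, not rules stated in the text: a supplementary search appends the first participles to the second ones, and a follow-up question replaces a second participle with a first participle of the same entity type. The names `QueryType`, `Participle`, and `combine`, and the reduced feature set (entity type and weight only), are hypothetical, and the sketch returns a single combination where the text speaks of multiple.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Sequence, Tuple


class QueryType(Enum):
    SUPPLEMENT = 1  # first type: supplementary search on the second search term
    FOLLOW_UP = 2   # second type: follow-up question on the second search term


@dataclass(frozen=True)
class Participle:
    text: str
    entity_type: str = ""  # e.g. "country"; empty when the participle is not an entity
    weight: float = 1.0


def combine(
    first: Sequence[Participle],
    second: Sequence[Participle],
    query_type: QueryType,
) -> List[Tuple[str, ...]]:
    """Combine first and second participles according to the detected type."""
    if query_type is QueryType.SUPPLEMENT:
        # Assumed rule: keep the context and append the new participles.
        return [tuple(p.text for p in list(second) + list(first))]
    # Assumed rule for FOLLOW_UP: swap in first participles of matching entity type.
    replacements = {p.entity_type: p for p in first if p.entity_type}
    merged = [replacements.get(p.entity_type, p) if p.entity_type else p for p in second]
    return [tuple(p.text for p in merged)]
```

With a context of `usa` (entity type `country`) followed by `population`, a follow-up containing `france` (also `country`) yields `("france", "population")`, whereas a supplementary `how many people` after `usa` yields `("usa", "how many people")`.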
It should be noted that the explanation of the embodiment of the search method based on voice interaction in the foregoing embodiments of fig. 1 to fig. 4 is also applicable to the search apparatus 500 based on voice interaction in this embodiment, and the implementation principle thereof is similar, and is not described herein again.
In this embodiment, a plurality of word segmentation combinations are generated according to the first segmentation result of the first search term and the second segmentation result of each second search term; a target word segmentation combination is selected from the plurality of word segmentation combinations according to the relevance score of each combination and used as a third search term; and the search is performed according to the third search term. The search can thus draw on the context of the voice search, which improves search accuracy and the user's search experience.
It should be noted that the terms "first," "second," and the like in the description of the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In addition, in the description of the present invention, "a plurality" means two or more unless otherwise specified.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (12)

1. A search method based on voice interaction is characterized by comprising the following steps:
acquiring a first search term according to voice data for searching provided by a user, and acquiring a plurality of second search terms which are the above information of the first search term;
generating a plurality of word segmentation combinations according to the first word segmentation result of the first search word and the second word segmentation result of each second search word;
selecting a target word segmentation combination from the plurality of word segmentation combinations according to the relevance score of each word segmentation combination, and using the target word segmentation combination as a third search word;
searching according to the third search term;
the generating a plurality of segmentation combinations according to the first segmentation result of the first search term and the second segmentation result of each second search term includes:
judging whether an initial word segmentation combination containing the first word segmentation result and the second word segmentation result exists in the historical search words;
and if the initial segmentation combination comprising the first segmentation result and the second segmentation result exists, taking the initial segmentation combination as the generated segmentation combination.
2. The method of claim 1, wherein the generating a plurality of segmentation combinations according to the first segmentation result of the first search term and the second segmentation result of each second search term further comprises:
if the initial segmentation combination containing the first segmentation result and the second segmentation result does not exist, extracting first characteristic information of each first segmentation in the first segmentation result, and extracting second characteristic information of each second segmentation in the second segmentation result;
determining type information of the first search term according to the first characteristic information and the second characteristic information, wherein the type information comprises: the first search term is first type information of supplementary search of the second search term, and the first search term is second type information of inquiry search of the second search term;
and combining the first participle and the second participle according to the type information of the first search word to obtain the multiple participle combinations.
3. The voice interaction-based search method of claim 2, wherein the first/second feature information comprises at least one of:
entity types, weights, user historical search frequency, click frequency after user historical search, punishment weights of pan-spoken language and editing distance of the first participle/the second participle.
4. The voice interaction-based search method of claim 1, further comprising, before the selecting according to the relevance score of each word segmentation combination:
determining the occurrence frequency of each participle combination in the user search history, the first user history search frequency of a first participle corresponding to the participle combination and the second user history search frequency of a second participle corresponding to the participle combination;
and determining the relevance score according to the frequency of occurrence, the historical search frequency of the first user and the historical search frequency of the second user.
5. The searching method based on voice interaction as claimed in claim 1, wherein the selecting a target word segmentation combination from the plurality of word segmentation combinations and using the target word segmentation combination as the third search word comprises:
and selecting the word segmentation combination with the highest relevance score from the multiple word segmentation combinations as the target word segmentation combination, and using the target word segmentation combination as the third search word.
6. The searching method based on voice interaction as claimed in claim 1, wherein the obtaining the first search term according to the voice data provided by the user for searching comprises:
receiving voice data provided by a user and used for searching, acquiring a first text corresponding to the voice data, and taking the corresponding first text as the first search word.
7. A search apparatus based on voice interaction, comprising:
the first acquisition module is used for acquiring a first search term according to voice data provided by a user and used for searching;
the second acquisition module is used for acquiring a plurality of second search terms of the above information of the first search term;
the generating module is used for generating a plurality of word segmentation combinations according to the first word segmentation result of the first search word and the second word segmentation result of each second search word;
the selection module is used for selecting a target word segmentation combination from the plurality of word segmentation combinations according to the relevance score of each word segmentation combination, and using the target word segmentation combination as a third search word;
the searching module is used for searching according to the third search term;
the generation module comprises:
the judging submodule is used for judging whether an initial word segmentation combination containing the first word segmentation result and the second word segmentation result exists in the historical search words or not;
and the first processing submodule is used for taking the initial segmentation combination as a generated segmentation combination when the initial segmentation combination containing the first segmentation result and the second segmentation result exists.
8. The voice interaction-based search apparatus of claim 7, wherein the generation module further comprises:
the extraction sub-module is used for extracting first characteristic information of each first segmentation in the first segmentation results and extracting second characteristic information of each second segmentation in the second segmentation results when the initial segmentation combination containing the first segmentation results and the second segmentation results does not exist;
a determining submodule, configured to determine type information of the first search term according to the first feature information and the second feature information, where the type information includes: the first search term is first type information of supplementary search of the second search term, and the first search term is second type information of inquiry search of the second search term;
and the second processing submodule is used for combining the first participle and the second participle according to the type information of the first search word to obtain the multiple participle combinations.
9. The voice interaction-based search apparatus of claim 8, wherein the first/second feature information comprises at least one of:
entity types, weights, user historical search frequency, click frequency after user historical search, punishment weights of pan-spoken language and editing distance of the first participle/the second participle.
10. The voice interaction-based search apparatus of claim 7, further comprising:
the first determining module is used for determining the occurrence frequency of each participle combination in the user search history, the first user history search frequency of a first participle corresponding to the participle combination and the second user history search frequency of a second participle corresponding to the participle combination;
and the second determining module is used for determining the relevance score according to the occurrence frequency, the historical search frequency of the first user and the historical search frequency of the second user.
11. The search apparatus based on voice interaction of claim 7, wherein the selection module is specifically configured to:
and selecting the word segmentation combination with the highest relevance score from the multiple word segmentation combinations as the target word segmentation combination, and using the target word segmentation combination as the third search word.
12. The search apparatus based on voice interaction of claim 7, wherein the first obtaining module is specifically configured to:
receiving voice data provided by a user and used for searching, acquiring a first text corresponding to the voice data, and taking the corresponding first text as the first search word.
CN201611019821.5A 2016-11-17 2016-11-17 Searching method and device based on interactive voice Active CN106653006B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611019821.5A CN106653006B (en) 2016-11-17 2016-11-17 Searching method and device based on interactive voice

Publications (2)

Publication Number Publication Date
CN106653006A CN106653006A (en) 2017-05-10
CN106653006B true CN106653006B (en) 2019-11-08

Family

ID=58807746

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611019821.5A Active CN106653006B (en) 2016-11-17 2016-11-17 Searching method and device based on interactive voice

Country Status (1)

Country Link
CN (1) CN106653006B (en)

Also Published As

Publication number Publication date
CN106653006A (en) 2017-05-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant