CN110956958A - Searching method, searching device, terminal equipment and storage medium - Google Patents


Info

Publication number: CN110956958A
Application number: CN201911228496.7A
Authority: CN (China)
Prior art keywords: voice data, user, search, voice
Inventor: 卢甜恬
Applicant and current assignee: Shenzhen Zhuiyi Technology Co Ltd
Other languages: Chinese (zh)
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)

Classifications

    • G10L 15/08: Speech recognition; speech classification or search
    • G10L 15/063: Speech recognition; creation of reference templates; training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/1822: Speech classification or search using natural language modelling; parsing for meaning understanding
    • G10L 15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 15/26: Speech to text systems
    • G10L 17/02: Speaker identification or verification; preprocessing operations, e.g. segment selection; pattern representation or modelling; feature selection or extraction
    • G10L 25/54: Speech or voice analysis techniques specially adapted for comparison or discrimination, for retrieval

Abstract

An embodiment of the present application provides a searching method, a searching apparatus, a terminal device and a storage medium. The method comprises: acquiring voice data of a user during an interaction; if the voice data is used for an information search, judging through a preset algorithm model whether the voice data satisfies a target condition; if it does not, performing correction processing on the voice data once the network state is detected to satisfy a preset state, to obtain corrected voice data; and searching for a search result matching the corrected voice data and outputting the search result. In this way, when the user's voice data does not satisfy the target condition, the voice data is corrected before the search, so the search result matches the corrected voice data, the search is more accurate, and the user experience is improved.

Description

Searching method, searching device, terminal equipment and storage medium
Technical Field
The present application relates to the field of search technologies, and in particular to a searching method, a searching apparatus, a terminal device and a storage medium.
Background
With the continuous development of search engine technology, voice search has gradually been applied to various terminal devices. In one approach, the search speech input by the user undergoes speech recognition and is converted into text; keywords are extracted from the text; a matching search result is retrieved according to the keywords, or a corresponding question-answer result is queried in the database of a question-answer system; and the result is presented to the user as speech, a web page, text and so on. However, when searching by voice, non-standard speech content often produces erroneous search results, making accurate search difficult.
Disclosure of Invention
In view of the above problems, the present application provides a searching method, apparatus, terminal device and storage medium to address them.
In a first aspect, an embodiment of the present application provides a search method, the method comprising: acquiring voice data of a user during an interaction; if the voice data is used for an information search, judging through a preset algorithm model whether the voice data satisfies a target condition; if the voice data does not satisfy the target condition, performing correction processing on the voice data when the network state is detected to satisfy a preset state, to obtain corrected voice data; and searching for a search result matching the corrected voice data and outputting the search result.
Further, performing the correction processing on the voice data includes: acquiring a preset target voice model, where the target voice model is obtained by training on voice feature data of the user or historical voice feature data of the user; and performing the correction processing on the voice data based on the target voice model.
Further, searching for a search result matching the corrected voice data and outputting the search result includes: converting the corrected voice data into text data; and searching for a search result matching the text data and outputting the search result.
Further, performing the correction processing on the voice data further includes: converting the corrected voice data into text data; and correcting the text data by using a target text model to obtain text data with complete semantics.
Further, searching for a search result matching the corrected voice data and outputting the search result includes: searching for a search result matching the corrected text data and outputting the search result.
Further, before performing the correction processing on the voice data, the method includes: acquiring a voiceprint feature of the voice data; judging whether the voiceprint feature is a preset voiceprint feature; and if so, converting the voice data into target voice data.
Further, performing the correction processing on the voice data includes: performing the correction processing on the target voice data.
Further, the target condition is used to represent that the user's search intention can be completely recognized from the voice data.
In a second aspect, an embodiment of the present application provides a search apparatus, including: an acquisition module, configured to acquire voice data of a user during an interaction; a judging module, configured to judge, through a preset algorithm model, whether the voice data satisfies a target condition if the voice data is used for an information search; a processing module, configured to, if the voice data does not satisfy the target condition, perform correction processing on the voice data when detecting that the network state satisfies a preset state, to obtain corrected voice data; and a searching module, configured to search for a search result matching the corrected voice data and output the search result.
Further, the processing module may be specifically configured to acquire a preset target voice model, where the target voice model is obtained by training on voice feature data of the user or historical voice feature data of the user, and to perform the correction processing on the voice data based on the target voice model.
Further, the searching module may be specifically configured to convert the corrected voice data into text data, and to search for a search result matching the text data and output the search result.
Further, the apparatus further includes: a conversion module, configured to convert the corrected voice data into text data; and a correction processing module, configured to correct the text data by using a target text model to obtain text data with complete semantics.
Further, the searching module may be specifically configured to search for a search result matching the corrected text data and output the search result.
Further, the apparatus further includes: a voiceprint feature acquisition module, configured to acquire a voiceprint feature of the voice data; a judging unit, configured to judge whether the voiceprint feature is a preset voiceprint feature; and a processing unit, configured to convert the voice data into target voice data if the voiceprint feature is the preset voiceprint feature.
Further, the processing module may be configured to perform the correction processing on the target voice data.
Further, the target condition is used to represent that the user's search intention can be completely recognized from the voice data.
In a third aspect, an embodiment of the present application provides a terminal device, including: a memory; one or more processors coupled with the memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs being configured to perform the method of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, in which program code is stored, and the program code can be called by a processor to execute the method according to the first aspect.
The embodiments of the present application thus provide a searching method, apparatus, terminal device and storage medium in which voice data of a user during an interaction is acquired; if the voice data is used for an information search, whether it satisfies a target condition is judged through a preset algorithm model; if it does not, the voice data is corrected once the network state is detected to satisfy a preset state, yielding corrected voice data; and a search result matching the corrected voice data is searched for and output. In this way, when the user's voice data does not satisfy the target condition, it is corrected before the search, so the search is more accurate and the user experience is improved.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described here show only some embodiments of the present application; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 shows a schematic diagram of an application environment suitable for the embodiment of the present application.
Fig. 2 shows a flowchart of a search method according to an embodiment of the present application.
Fig. 3 shows a flowchart of a search method according to another embodiment of the present application.
Fig. 4 shows a flowchart of a search method according to yet another embodiment of the present application.
Fig. 5 shows a flowchart of a search method according to still another embodiment of the present application.
Fig. 6 shows a block diagram of a search apparatus according to an embodiment of the present application.
Fig. 7 shows a block diagram of a terminal device for executing a search method according to an embodiment of the present application.
Fig. 8 illustrates a storage unit for storing or carrying program codes for implementing a search method according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings. The described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments derived by a person of ordinary skill in the art from the embodiments given here without creative effort shall fall within the protection scope of the present application.
In recent years, with accelerated breakthroughs in and the wide application of technologies such as the mobile internet, big data, cloud computing and sensors, the development of artificial intelligence has entered a brand-new stage. Intelligent voice search, a key link in the artificial intelligence (AI) industry chain, is among the most mature AI technologies and is developing rapidly in fields such as marketing customer service, smart home, smart in-vehicle systems, smart wearables and smart search, the smartphone intelligent assistant being one example.
As one mode, the smartphone intelligent assistant can recognize the voice input by the user, search for content matching the recognized voice data, and display that content to the user through the phone interface. However, if the user speaks too fast or pronounces unclearly, the phone may be unable to accurately identify the user's search intention, which degrades the user experience.
The inventors found in research that, by combining the user's speaking habits, a customized voice correction strategy can be provided for the user from the user's historical voice data, so that complete voice data is obtained and the search is performed on that basis, improving the accuracy of voice search and the user experience. The searching method, searching apparatus, terminal device and storage medium of the embodiments of the present application are proposed accordingly.
In order to better understand the searching method, the searching apparatus, the terminal device, and the storage medium provided in the embodiments of the present application, an application environment suitable for the embodiments of the present application is described below.
Referring to fig. 1, fig. 1 is a schematic diagram illustrating an application environment suitable for the embodiment of the present application. The search method provided by the embodiment of the present application can be applied to the polymorphic interaction system 100 shown in fig. 1. The polymorphic interaction system 100 includes a terminal device 101 and a server 102, the server 102 being communicatively coupled to the terminal device 101. The server 102 may be a conventional server or a cloud server, and is not limited herein.
The terminal device 101 may be any of various electronic devices that have a display screen and support data input, including but not limited to a smartphone, tablet computer, laptop, desktop computer, wearable electronic device and the like. Specifically, the data input may be voice input based on a voice module provided on the terminal device 101, character input based on a character input module, and so on. The terminal device 101 is provided with a camera, which may be arranged on the side of the terminal device 101 bearing the display screen or, optionally, on the side facing away from it. It should be noted that image data of the user, including posture information, can be collected through the camera to assist in accurately identifying the user's search intention.
A client application may be installed on the terminal device 101, and the user can communicate with the server 102 through it (e.g., an APP or WeChat applet). Specifically, the server 102 runs a corresponding server-side application; the user may register a user account with the server 102 through the client application and communicate with the server 102 on the basis of that account, for example by logging into the account in the client application and inputting text information, voice data, image data and the like. After receiving the information input by the user, the client application sends it to the server 102, so that the server 102 can receive, process and store it; the server 102 may also return corresponding output information to the terminal device 101.
In some embodiments, the means for processing the information input by the user may also be disposed on the terminal device 101, so that the terminal device 101 can interact with the user without relying on establishing communication with the server 102, and in this case, the polymorphic interaction system 100 may only include the terminal device 101.
The above application environments are only examples for facilitating understanding, and it is to be understood that the embodiments of the present application are not limited to the above application environments.
The search method, apparatus, terminal device and storage medium provided by the embodiments of the present application will be described in detail below with specific embodiments.
As shown in fig. 2, a flowchart of a search method provided in an embodiment of the present application is shown. The searching method provided by this embodiment can be applied to a terminal device having a display screen or another image output device; the terminal device may be an electronic device such as a smartphone, tablet computer or wearable smart terminal.
In a specific embodiment, the search method can be applied to the search apparatus 500 shown in fig. 6 and the terminal device 101 shown in fig. 7. The flow shown in fig. 2 will be described in detail below. The above search method may specifically include the steps of:
step S110: and acquiring voice data of the user in the interactive process.
It should be noted that, in the embodiments of the present application, the voice data of the user carries the user's voice features. These may include, for example, the timbre of the user's voice (male and female timbres differ; optionally, the user's gender can be inferred from the timbre), the volume, the pitch, the fundamental frequency, the dialect to which the speech belongs (for example, Mandarin, Sichuan dialect, Henan dialect, Shandong dialect, Shanghai dialect, Cantonese, etc.) and the language (for example, English, German, French, Russian, Korean, Japanese, etc.). The voice data differs between users.
As one way, the voice data in the embodiment of the present application may be voice data input by the user through the voice input function of the terminal device on the human-computer interaction interface; for example, it may be collected through a voice assistant, a voice SDK (Software Development Kit) or a speech recognition engine application installed on the terminal device. The voice data of the user during the interaction may, for example, be that of a user currently interacting with the terminal device through its human-computer interaction interface. Optionally, the voice data may be acquired while the user is making a call through the terminal device.
Alternatively, the voice data may be voice recording information of the user stored in advance. Optionally, the voice data of the same user may include voice data of the user at the same time or different times, or may be voice data of different users, and the like, which is not limited herein.
As one mode, the voice data of the user in the interactive process can be obtained by extracting the features of the voice data and then decoding the extracted voice features by using the acoustic model and the language model obtained by pre-training.
Optionally, the voice data acquired by the terminal device may be stored locally, or sent to the server for storage. Storing the data on the server avoids slowing the terminal device down with redundant stored data.
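By way of illustration only, a minimal sketch of acquiring such voice data on the terminal side, assuming the third-party `speech_recognition` package; the function name `capture_voice_data` and the timeout value are illustrative assumptions, not part of the disclosure.

```python
import speech_recognition as sr

def capture_voice_data(timeout_s: float = 5.0) -> sr.AudioData:
    """Record one utterance from the default microphone."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Calibrate against ambient noise so quiet speech is not lost.
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        # Blocks until speech is detected, then returns the raw audio.
        return recognizer.listen(source, timeout=timeout_s)
```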
Step S120: and judging whether the voice data is used for information search.
It can be understood that not all acquired voice data is used for searching, for example voice data collected while the user is on a call through the terminal device. If the terminal device searched in such a case, resources would be consumed, such as power and operating memory. To avoid this, the acquired voice data can be judged first. As one way, it is judged whether the voice data is used for an information search; if so, subsequent processing is performed; if not, the voice data is discarded directly, or is not added to the list to be searched.
As one approach, it may be detected whether an intelligent-search application or window is open while the voice data is being acquired. Optionally, if one is open, the acquired voice data may be used as voice data for searching; if not, the acquired voice data may be ignored (i.e., discarded).
For example, in a specific application scenario, when a user picks up the phone and speaks, as one implementation the terminal device may read the listening events of applications and detect whether a search-class application is open. If so, it determines that the device is in the search state and recognizes the user's voice to carry out the voice search. If no listening event of a search-class application is captured, it may determine that the device is not in the search state, and the speech is discarded.
As another embodiment, for some search-class applications, when a user speaks in order to search, a reminder or instruction from the application is usually received. Optionally, the terminal device may judge whether a prompt instruction exists within a period around the utterance (before speaking, while speaking, and within a short time after speaking, for example 5, 10 or 20 seconds). If so, it may determine that the device is in the search state; for example, when the user searches a map of a place by voice, a voice prompt may pop up asking the user to input the name of the place (the destination) to be searched. If not, the terminal device may determine that it is not in the search state; for example, during a call the user would not receive such a prompt instruction.
Judging whether the acquired voice data is intended for searching prevents searches on voice data that is not, reducing power consumption and prolonging the standby time of the terminal device.
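A minimal sketch of this search-state check, assuming the terminal exposes the foreground application and the timestamp of the last search prompt; the application names and the 10-second window are illustrative assumptions rather than an API defined by the patent.

```python
from typing import Optional

SEARCH_APPS = {"voice.assistant", "map.search", "web.browser"}  # illustrative
PROMPT_WINDOW_S = 10.0  # e.g. 5-20 s around the utterance, per the text above

def is_for_information_search(foreground_app: str,
                              last_prompt_ts: Optional[float],
                              utterance_ts: float) -> bool:
    """Decide whether an utterance should enter the search pipeline."""
    # A search-class application being open marks the search state.
    if foreground_app in SEARCH_APPS:
        return True
    # A search prompt shortly before/after speaking also marks it.
    return (last_prompt_ts is not None
            and abs(utterance_ts - last_prompt_ts) <= PROMPT_WINDOW_S)
```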
Step S130: and if so, judging whether the voice data meets the target condition.
As one mode, if it is determined that the voice data is for voice search, it may be further determined whether the voice data satisfies the target condition. The target condition can be used for representing the search intention of the user which can be completely recognized according to the voice data.
It can be understood that, when users search through voice data, individual differences mean that some users speak fast or pronounce unclearly, so the spoken voice data is unclear; that is, the user's search intention cannot be completely recognized from the voice data. In this case, if the search is still performed on the voice data, the search result may not match the user's expectation, and power consumption may increase to some extent. As a way of improving this, to increase the reliability of the search result and reduce power consumption, it can be judged whether the voice data used for the search satisfies the target condition. Optionally, if the target condition is satisfied, the voice data may be used in the subsequent voice search; if not, it may not be.
As one way, whether the voice data satisfies the target condition may be judged through a preset algorithm model. The preset algorithm model may be a neural network model trained on sample voice data from a large number of users, for example an RNN (Recurrent Neural Network) or LSTM (Long Short-Term Memory) network; the examples are not exhaustive here. Optionally, if the voice data satisfies the target condition, the user's search intention can be completely recognized from it; if it does not, the search intention cannot be completely recognized.
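For illustration, such a preset algorithm model might be sketched as an LSTM binary classifier in PyTorch; the feature dimensions, the 0.5 threshold and the training procedure are assumptions, since the patent does not fix them.

```python
import torch
import torch.nn as nn

class TargetConditionModel(nn.Module):
    """Scores whether the search intention is fully recognizable."""

    def __init__(self, n_features: int = 40, hidden: int = 128):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, frames, n_features) acoustic feature sequence
        _, (h_n, _) = self.lstm(x)
        return torch.sigmoid(self.head(h_n[-1]))  # probability in (0, 1)

def satisfies_target_condition(model: TargetConditionModel,
                               features: torch.Tensor,
                               threshold: float = 0.5) -> bool:
    """features: a single utterance shaped (1, frames, n_features)."""
    with torch.no_grad():
        return model(features).item() >= threshold
```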
Alternatively, if it is determined that the speech data is not for a voice search, the search process ends. This avoids the power consumption of redundant searches.
Step S140: and if the target condition is met, searching a search result matched with the voice data and outputting the search result.
It is understood that if the voice data satisfies the target condition, a search result matching the voice data may be directly searched and output.
In the embodiments of the present application, the search result matching the voice data includes, but is not limited to, pictures, text, video, audio, animation and any combination of these. Optionally, the search result may be output as a picture, text, voice, ring tone, animation pop-up or other multimedia presentation, or as a combination of different output modes, for example picture plus text, or animation plus ring tone; the combinations are not exhaustively illustrated or limited here.
Optionally, the search result may be displayed and output by a terminal device currently used for searching, or may also be displayed and output by another terminal device, for example, the search result is displayed remotely, which is not limited herein.
Step S150: and if the target condition is not met, detecting whether the network state meets a preset state or not.
It can be understood that if, while the user is searching by voice, the search is interrupted by a poor network signal, the user experience is greatly reduced and the user may have to retry the search repeatedly. To ensure the reliability of the search process and provide a user-friendly experience, when the voice data is judged not to satisfy the target condition it can further be detected whether the current network state of the terminal device satisfies a preset state; this prevents the search from being interrupted or terminated by a sudden network abnormality and saves the power of the terminal device.
The preset state may be the strength of the network signal. As one mode, a network signal threshold may be set, and whether the network state satisfies the preset state may be detected by comparing the network signal of the terminal device with the signal threshold. Optionally, when the current network signal is greater than the threshold, determining that the network state meets a preset state; and if the current signal is not larger than the threshold value, judging that the network state does not meet the preset state.
As an embodiment, on the basis that the network signal is greater than the signal threshold, it may be determined whether the power of the terminal device is sufficient, for example, whether the power reaches a set threshold, and if so, it may be determined that the network status satisfies the preset status; if not, it can be determined that the network status does not satisfy the preset status.
Alternatively, the preset state may be a variation trend of the network signal. As an implementation mode, the variation trend of the network signal intensity can be counted through the terminal equipment. If the network signal is weaker and weaker, the terminal device may be in a network disconnection state soon, and further abnormal interruption of the voice search process may be caused, and in this case, it may be determined that the network state does not satisfy the preset state. Optionally, if the network signal is stronger, it may be determined that the network state satisfies the preset state.
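A sketch combining the three checks described above (signal threshold, battery level, signal trend); the dBm and percentage thresholds are illustrative assumptions, not values fixed by the patent.

```python
def network_state_ok(signal_dbm: float,
                     recent_signals_dbm: list[float],
                     battery_pct: float,
                     signal_floor_dbm: float = -100.0,
                     battery_floor_pct: float = 10.0) -> bool:
    """Return True if a voice search is unlikely to be interrupted."""
    if signal_dbm <= signal_floor_dbm:
        return False  # signal not above the threshold
    if battery_pct < battery_floor_pct:
        return False  # insufficient power, per the embodiment above
    # A strictly weakening signal suggests imminent disconnection.
    weakening = (len(recent_signals_dbm) >= 3 and
                 all(a > b for a, b in
                     zip(recent_signals_dbm, recent_signals_dbm[1:])))
    return not weakening
```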
It should be noted that if the content the user wants to search by voice is stored locally, for example a previously downloaded song being looked up by voice, the search can be performed directly with voice data that satisfies the target condition, without detecting whether the network state satisfies the preset state.
Step S160: and if the preset state is met, correcting the voice data to obtain corrected voice data.
The correction processing refers to correcting voice data from which the user's search intention cannot be completely recognized, and includes semantic correction, semantic filling and the like, as described in the following embodiments; its output is the corrected voice data. Alternatively, the corrected voice data may be understood as voice data from which the user's search intention can be recognized more completely.
For example, in a specific application scenario, suppose a user says "how to zha a tire", intending to search for the tire-removal process. Because the utterance contains the indistinct syllable "zha", the user's search requirement cannot be accurately recognized. Through correction processing, the sentence can be corrected to "how to remove a tire", the search is performed on the corrected sentence, and the user's search intention is recognized more completely.
By performing correction processing on the voice data when the preset state is satisfied, the reliability and accuracy of the search can be increased.
Step S170: searching for a search result matching the corrected voice data and outputting the search result.
As one mode, after correcting the voice data, a search result matching the obtained corrected voice data may be directly searched and output. The form and the output form of the search result may refer to the corresponding description in the step S140, and are not described herein again.
Step S180: and if the preset state is not met, discarding the voice data.
It can be understood that if the network state does not satisfy the preset state, the acquired voice data can be discarded directly, avoiding search failures caused by network interruption and improving the user experience. Optionally, the terminal device may send a prompt asking the user to search again once the network state is normal, so as to avoid wasted power.
In the searching method provided by this embodiment, the voice data of the user during the interaction is acquired; if the voice data is used for an information search, whether it satisfies the target condition is judged through a preset algorithm model; if it does not, the voice data is corrected once the network state is detected to satisfy the preset state, yielding corrected voice data; a search result matching the corrected voice data is then searched for and output. In this way, when the user's voice data does not satisfy the target condition, it is corrected before searching, so the search is more accurate and the user experience is improved.
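Read as pseudocode, the flow of fig. 2 can be sketched as the following orchestration; the helper callables are placeholders for the steps described above, not APIs defined by the patent.

```python
from typing import Callable, Optional

def voice_search(voice_data: bytes,
                 is_search: Callable[[bytes], bool],
                 meets_target: Callable[[bytes], bool],
                 network_ok: Callable[[], bool],
                 correct: Callable[[bytes], bytes],
                 search: Callable[[bytes], str]) -> Optional[str]:
    """Orchestrate steps S120-S180 on already-acquired voice data (S110)."""
    if not is_search(voice_data):     # step S120: not a search, discard
        return None
    if meets_target(voice_data):      # step S130 -> S140: search directly
        return search(voice_data)
    if not network_ok():              # step S150 -> S180: discard
        return None
    corrected = correct(voice_data)   # step S160: correction processing
    return search(corrected)          # step S170: search corrected data
```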
As shown in fig. 3, a flowchart of a method of searching provided in another embodiment of the present application is shown, where the method includes:
step S210: and acquiring voice data of the user in the interactive process.
Step S220: and judging whether the voice data is used for information search.
Step S230: and if so, judging whether the voice data meets the target condition.
Optionally, if the voice data is not for an information search, the search process may be ended.
Step S240: and if the target condition is met, searching a search result matched with the voice data and outputting the search result.
Step S250: and if the target condition is not met, detecting whether the network state meets a preset state or not.
Step S261: and if the preset state is met, acquiring a preset target voice model.
The target voice model in this embodiment of the application is a model pre-trained on the user's voice feature data or the user's historical voice feature data. As one mode, the user's historical voice data may be acquired and input into a machine learning model to obtain the target voice model. Optionally, the machine may learn a personalized target voice model suited to the characteristics of the user's voice data, such as dialect and speaking habits. Optionally, different users correspond to different target voice models.
As one way, in the case that the network state is determined to satisfy the preset state, the preset target voice model may be acquired so that the user's voice data can be corrected on the basis of that model.
Step S262: and correcting the voice data based on the target voice model to obtain corrected voice data.
As one way, after the target voice model is obtained, the voice data may be corrected based on it to obtain corrected voice data. Optionally, the target voice model encodes a large amount of the user's voice data, i.e., voice data matching the user's speaking habits. For example, if a user speaks too fast and often says "electrogram" for "circuit diagram", the target voice model corresponding to that user can automatically correct every "electrogram" the user says into "circuit diagram". Correcting voice data through a pre-customized target voice model matched to the user's voice characteristics speeds up the search while improving its accuracy.
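The habitual-substitution behavior in the "electrogram" example can be illustrated with a toy per-user corrector; a real target voice model would be a trained network, so this lookup is only a stand-in, and all names in it are assumptions.

```python
from collections import Counter, defaultdict

class UserSpeechCorrector:
    """Toy stand-in for the per-user target voice model."""

    def __init__(self) -> None:
        self.counts: dict[str, Counter] = defaultdict(Counter)

    def observe(self, spoken: str, intended: str) -> None:
        """Record one (habitual utterance -> intended phrase) pair."""
        self.counts[spoken][intended] += 1

    def correct(self, spoken: str) -> str:
        """Replace a known habitual utterance with its usual intent."""
        seen = self.counts.get(spoken)
        return seen.most_common(1)[0][0] if seen else spoken

corrector = UserSpeechCorrector()
corrector.observe("electrogram", "circuit diagram")
corrector.observe("electrogram", "circuit diagram")
print(corrector.correct("electrogram"))  # -> circuit diagram
```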
Step S263: the corrected voice data is converted into text data.
Alternatively, the corrected speech data may be converted into text data using an existing speech recognition technique to facilitate searching based on the converted text data.
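A sketch of this conversion step, assuming the `speech_recognition` package and its built-in Google Web Speech backend; the patent leaves the recognition engine open, and the Mandarin language tag is an illustrative choice.

```python
import speech_recognition as sr

def speech_to_text(audio: sr.AudioData) -> str:
    """Convert corrected voice data into text data."""
    recognizer = sr.Recognizer()
    # Any STT engine works here; this backend needs network access.
    return recognizer.recognize_google(audio, language="zh-CN")
```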
Step S264: and searching a search result matched with the text data and outputting the search result.
Optionally, the search result matched with the text data converted from the corrected voice data is searched, so that the search requirement of the user can be met, and the search result meeting the expectation of the user is obtained. For the style of the search result and the form of outputting the search result, reference may be made to the corresponding description in step S140 in the foregoing embodiment, which is not described herein again.
Step S270: and if the preset state is not met, discarding the voice data.
In the searching method provided by this embodiment, the voice data of the user during the interaction is acquired; if the voice data is used for an information search, whether it satisfies the target condition is judged through a preset algorithm model; if it does not, a preset target voice model is acquired once the network state is detected to satisfy the preset state, and the voice data is corrected on the basis of that model to obtain corrected voice data; the corrected voice data is then converted into text data, and a search result matching the text data is searched for and output. In this way, when the user's voice data does not satisfy the target condition, it is corrected through the target voice model and converted into text data, so the search result matching that text is found and output and the search accuracy is improved.
As shown in fig. 4, a flowchart of a method of searching provided in another embodiment of the present application is shown, where the method includes:
step S310: and acquiring voice data of the user in the interactive process.
Step S320: and judging whether the voice data is used for information search.
Step S330: and if so, judging whether the voice data meets the target condition.
Optionally, if the voice data is not for an information search, the search process may be ended.
Step S340: and if the target condition is met, searching a search result matched with the voice data and outputting the search result.
Step S350: and if the target condition is not met, detecting whether the network state meets a preset state or not.
Step S361: and if the preset state is met, acquiring a preset target voice model.
Step S362: and correcting the voice data based on the target voice model to obtain corrected voice data.
Step S363: the corrected voice data is converted into text data.
Step S364: and correcting the text data by adopting a target text model to obtain text data with complete semantics.
It can be understood that correcting the user's voice data can supplement it, for example filling in speech lost because the user spoke too fast or pronounced unclearly, so that the resulting voice data is complete. However, some user voice data may not be recorded in the target voice model, for example utterances the user does not usually say; after such voice data is spoken, grammatical errors and the like may remain. To further improve the accuracy of the search result, the target text model can be used to correct the text data into text with complete semantics. For the target text model, existing speech and language processing techniques may be consulted; details are not repeated here.
For example, in a specific application scenario, suppose the user's voice data is "electrogram, good? search one". The voice correction may turn it into "circuit diagram, good? search one". It can be understood that the corrected speech may still be semantically broken, which can make the search inaccurate. The text data converted from the corrected voice data can therefore be corrected further, optionally into "please search for a circuit diagram for me", and the search performed on this semantically complete text, improving the accuracy of the search.
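A sketch of this text-correction step, assuming Hugging Face `transformers` with a public grammar-correction checkpoint; the checkpoint name and its "grammar:" prompt prefix are assumptions about that third-party model, not the patent's target text model.

```python
from transformers import pipeline

# Illustrative checkpoint; any seq2seq correction model could stand in.
corrector = pipeline("text2text-generation",
                     model="vennify/t5-base-grammar-correction")

def correct_text(text: str) -> str:
    """Rewrite recognized text into a semantically complete query."""
    out = corrector("grammar: " + text, max_new_tokens=64)
    return out[0]["generated_text"]

print(correct_text("circuit diagram good search one"))
```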
Step S365: searching for a search result matched with the corrected text data and outputting the search result.
Referring to the above description, searching for a search result matching the corrected text data conforms to the user's search intention and makes the result more accurate. Optionally, the specific form of the matching search result is not limited: pictures, text, voice, video, ring tones, advertisements, etc., or any combination of these; see the descriptions in the foregoing embodiments. Optionally, the output form of the search result may also refer to the foregoing embodiments and is not repeated here.
Step S370: and if the preset state is not met, discarding the voice data.
In the searching method provided by this embodiment, when the user's voice data does not satisfy the target condition, the voice data is corrected to obtain corrected voice data, the corrected voice data is converted into text data, and the text data is then corrected in turn, so that a search result matching the corrected text data is searched for and output. This further improves the accuracy and reliability of the search and the user experience.
As shown in fig. 5, a flowchart of a method of searching provided by another embodiment of the present application is shown, where the method includes:
step S410: and acquiring voice data of the user in the interactive process.
Step S420: and judging whether the voice data is used for information search.
Step S430: and if so, judging whether the voice data meets the target condition.
Step S440: and if the target condition is met, searching a search result matched with the voice data and outputting the search result.
Step S450: and if the target condition is not met, detecting whether the network state meets a preset state or not.
Step S460: and if the preset state is met, acquiring the voiceprint characteristics of the voice data.
As one approach, suppose the user speaks several languages or dialects; switching to another one may cause recognition errors in the voice search. For example, a Sichuan speaker using Mandarin, or imitating the Northeastern dialect, may produce inaccurate speech recognition because of imperfect pronunciation. It should be noted that the inaccuracy is not that the user cannot pronounce words, but that the user has not mastered the expressions of the dialect being attempted; for example, a Sichuan speaker imitating the Northeastern dialect without a native command of it can easily cause voice-search errors.
To reduce such errors, when it is detected that the user's voice data is for an information search and the network state satisfies the preset state, a voiceprint feature of the user's voice data may be acquired. Optionally, the voiceprint feature may include the frequency and intensity of the sound wave corresponding to the user's voice data and their variation over time, or the intensity and frequency characteristics of the sound wave within a certain period. As one implementation, the voiceprint feature may be obtained by analyzing the user's voice data with a filter or the like, or by any other method of obtaining voiceprint features, which is not limited here.
By acquiring the voiceprint feature of the user's voice data, a search result matching the user's voice characteristics can be identified from the voiceprint even when the user speaks in a different language or dialect, improving the user experience.
Step S470: and judging whether the voiceprint features are preset voiceprint features.
Optionally, the voiceprint feature of the user's voice when searching with the original voice data (i.e., the voice the user habitually uses, or the user's initial voice data) may be obtained and used as the preset voiceprint feature. When the voiceprint feature of voice data spoken in a different language or dialect is obtained, it can be compared with the preset voiceprint feature: if they are the same, the voiceprint feature can be judged to be the preset voiceprint feature; if not, it can be judged not to be.
Judging whether the voiceprint feature is the preset voiceprint feature avoids running voice searches on voice data that does not carry the preset voiceprint, saving resources.
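For illustration, a crude voiceprint check using a time-averaged MFCC vector and cosine similarity; production systems use trained speaker-embedding networks, and the 0.85 threshold is an assumption.

```python
import numpy as np
import librosa

def voiceprint_embedding(wav: np.ndarray, sr: int = 16000) -> np.ndarray:
    """Crude voiceprint: the time-averaged MFCC vector of an utterance."""
    mfcc = librosa.feature.mfcc(y=wav, sr=sr, n_mfcc=20)
    return mfcc.mean(axis=1)

def is_preset_voiceprint(embedding: np.ndarray,
                         enrolled: np.ndarray,
                         threshold: float = 0.85) -> bool:
    """Cosine similarity against the enrolled (preset) voiceprint."""
    cos = float(np.dot(embedding, enrolled) /
                (np.linalg.norm(embedding) * np.linalg.norm(enrolled) + 1e-9))
    return cos >= threshold
```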
Step S471: And if the voiceprint feature is the preset voiceprint feature, converting the voice data into target voice data.
The target voice data may be understood as the user's original voice data; that is, no matter what language or dialect the user uses for the voice search, a piece of target voice data suited to each user can be adapted. Optionally, if a user searches using different voice data, accurate voice search can be achieved simply by converting whatever the user says that differs from the corresponding target voice data into that target voice data.
For example, in one particular application scenario, assume the user's query "where to play on the weekend?" has many different expressions: the user may say "weekend duhukui" in Sichuan dialect, "medium weekend group kneading" in Lanzhou dialect, or "this weekend dry-hakui" in Xinjiang dialect. Optionally, whatever language or dialect the user uses for the voice search, if the voiceprint feature of the voice data is the preset voiceprint feature, the voice data can be converted into the target voice data; here, the user's target voice data may be "where to play on the weekend".
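In toy form, the dialect-to-target conversion in the weekend example reduces to a per-user lookup from dialect variants to the canonical target utterance; the table below reuses the (machine-translated) variants quoted above and is purely illustrative.

```python
# Illustrative per-user mapping; a deployed system would learn this.
DIALECT_TO_TARGET = {
    "weekend duhukui": "where to play on the weekend",                # Sichuan
    "medium weekend group kneading": "where to play on the weekend",  # Lanzhou
    "this weekend dry-hakui": "where to play on the weekend",         # Xinjiang
}

def to_target_voice_data(utterance: str) -> str:
    """Normalize a dialect variant to the user's target voice data."""
    return DIALECT_TO_TARGET.get(utterance, utterance)
```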
Step S472: and carrying out correction processing on the target voice data to obtain corrected voice data.
The specific implementation of the correction processing on the target voice data may refer to the description in the foregoing embodiments, and is not described herein again.
Step S473: searching for a search result matching the corrected voice data and outputting the search result.
Step S474: And if the voiceprint feature is not the preset voiceprint feature, discarding the voice data.
It can be understood that if the voiceprint feature is not the preset voiceprint feature, the voice data may belong to another user or be counterfeit voice data from a bad actor; to avoid false recognition, the voice data can be discarded, improving the security of the voice search.
Step S480: and if the preset state is not met, discarding the voice data.
In the searching method provided by this embodiment, when the user's voice data does not satisfy the target condition and the network state is detected to satisfy the preset state, the voiceprint feature of the voice data is acquired; if the voiceprint feature is the preset voiceprint feature, the voice data is converted into target voice data, and the target voice data is then corrected to obtain corrected voice data, so that a search result matching the corrected voice data is searched for and output. This makes the search more accurate and improves the user experience.
As shown in fig. 6, a block diagram of a searching apparatus 500 provided in this embodiment of the present application is shown. The apparatus 500 runs on a terminal device having a display screen or another audio or image output device; the terminal device may be an electronic device such as a smartphone, tablet computer or wearable smart terminal. The apparatus 500 includes:
an obtaining module 510, configured to obtain voice data of the user during the interaction process.
A determining module 520, configured to determine whether the voice data meets a target condition through a preset algorithm model if the voice data is used for information search.
Wherein the target condition is used for representing the search intention of the user which can be completely recognized according to the voice data.
The processing module 530 is configured to, if the voice data does not meet the target condition, correct the voice data when it is detected that the network state meets a preset state, so as to obtain corrected voice data.
Optionally, the apparatus 500 may further include: a voiceprint feature acquisition module, configured to acquire a voiceprint feature of the voice data; a judging unit, configured to judge whether the voiceprint feature is a preset voiceprint feature; and a processing unit, configured to convert the voice data into target voice data if the voiceprint feature is the preset voiceprint feature.
As one way, the processing module 530 may be specifically configured to perform correction processing on the target voice data.
As a manner, the processing module 530 may be specifically configured to obtain a preset target speech model, where the target speech model is a model obtained through training of speech feature data of a user or historical speech feature data of the user; and carrying out correction processing on the voice data based on the target voice model.
Optionally, the apparatus 500 may further include: a conversion module for converting the corrected voice data into text data; and the correction processing module is used for correcting the text data by adopting a target text model so as to obtain text data with complete semantics.
And a searching module 540, configured to search for a search result matching the corrected voice data and output the search result.
As one way, the search module 540 may be specifically configured to convert the corrected voice data into text data; and searching a search result matched with the text data and outputting the search result.
Optionally, the searching module 540 may be further specifically configured to search for a search result matching the corrected text data and output the search result.
The search apparatus provided in this embodiment acquires the voice data of the user during the interaction; if the voice data is used for an information search, it judges through a preset algorithm model whether the voice data satisfies the target condition; if not, it corrects the voice data once the network state is detected to satisfy the preset state, obtaining corrected voice data, and then searches for and outputs a search result matching the corrected voice data. In this way, when the user's voice data does not satisfy the target condition, it is corrected before the search, so the search is more accurate and the user experience is improved.
The searching device provided by the embodiment of the application is used for realizing the corresponding searching method in the foregoing method embodiments, and has the beneficial effects of the corresponding method embodiments, which are not described herein again.
It can be clearly understood by those skilled in the art that the search apparatus provided in the embodiment of the present application can implement each process in the foregoing method embodiments, and for convenience and simplicity of description, the specific working processes of the apparatus and the module described above may refer to corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present application, the coupling or direct coupling or communication connection between the modules shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or modules may be in an electrical, mechanical or other form.
In addition, each functional module in the embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
Referring to fig. 7, a block diagram of a terminal device 101 according to an embodiment of the present disclosure is shown. The terminal device 101 may be a device capable of running applications, such as a smartphone, tablet computer or e-book reader. The terminal device 101 in the present application may include one or more of the following components: a processor 1012, a memory 1014, and one or more applications, where the one or more applications may be stored in the memory 1014 and configured to be executed by the one or more processors 1012, the one or more programs being configured to perform the methods described in the foregoing method embodiments.
Processor 1012 may include one or more processing cores. The processor 1012 connects the various parts of the terminal device 101 using various interfaces and lines, and performs the functions of the terminal device 101 and processes data by running or executing instructions, programs, code sets or instruction sets stored in the memory 1014 and calling data stored in the memory 1014. Optionally, the processor 1012 may be implemented in hardware using at least one of a Digital Signal Processor (DSP), a Field-Programmable Gate Array (FPGA) and a Programmable Logic Array (PLA). The processor 1012 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem and the like. The CPU mainly handles the operating system, user interface, applications and so on; the GPU renders and draws display content; the modem handles wireless communications. The modem may also be implemented as a separate communication chip rather than being integrated into the processor 1012.
The memory 1014 may include Random Access Memory (RAM) or Read-Only Memory (ROM). The memory 1014 may be used to store instructions, programs, code, code sets or instruction sets. The memory 1014 may include a program storage area and a data storage area: the program storage area may store instructions for implementing the operating system, instructions for implementing at least one function (such as a touch function, a sound playing function or an image playing function) and instructions for implementing the method embodiments described herein; the data storage area may store data created by the terminal device 101 during use (such as a phonebook, audio and video data, and chat logs).
Referring to fig. 8, a block diagram of a computer-readable storage medium according to an embodiment of the present application is shown. The computer-readable storage medium 700 stores program code that can be called by a processor to execute the methods described in the foregoing method embodiments.
The computer-readable storage medium 700 may be an electronic memory such as a flash memory, an EEPROM (Electrically Erasable Programmable Read-Only Memory), an EPROM, a hard disk, or a ROM. Optionally, the computer-readable storage medium 700 includes a non-volatile computer-readable storage medium. The computer-readable storage medium 700 has storage space for program code 710 for performing any of the method steps described above. The program code can be read from or written into one or more computer program products. The program code 710 may, for example, be compressed in a suitable form.
In summary, according to the search method, search apparatus, terminal device, and storage medium provided in the embodiments of the present application, voice data of a user during an interaction process is acquired. If the voice data is used for an information search, whether the voice data satisfies a target condition is determined through a preset algorithm model. If the voice data does not satisfy the target condition, the voice data is corrected to obtain corrected voice data when the network state is detected to satisfy a preset state, and a search result matching the corrected voice data is then retrieved and output. In this way, when the voice data of the user does not satisfy the target condition, the voice data is corrected and a search result matching the corrected voice data is searched for and output, which makes the search more accurate and improves the user experience.
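For illustration only, the overall flow summarized above can be sketched as follows. All helper names here (meets_target_condition, network_satisfies_preset_state, correct_voice, and search) are hypothetical stand-ins for the preset algorithm model, the network-state check, the correction processing, and the search step; they are not APIs from the application.

from typing import Optional

def meets_target_condition(voice_data: str) -> bool:
    # Stand-in for the preset algorithm model: here the target condition
    # is simply that the utterance is non-empty and not truncated.
    return bool(voice_data) and not voice_data.endswith("...")

def network_satisfies_preset_state() -> bool:
    # Stand-in for detecting that the network state meets the preset state.
    return True

def correct_voice(voice_data: str) -> str:
    # Stand-in for the correction processing on the voice data.
    return voice_data.rstrip(".")

def search(query: str) -> str:
    # Stand-in for finding a search result matching the query.
    return "results for: " + query

def handle_voice_search(voice_data: str) -> Optional[str]:
    """One pass of the flow: judge, correct if needed, search, output."""
    if meets_target_condition(voice_data):
        result = search(voice_data)
    elif network_satisfies_preset_state():
        result = search(correct_voice(voice_data))
    else:
        return None  # cannot correct without the required network state
    print(result)  # output the search result
    return result

handle_voice_search("weather in Shenzhen...")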
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A method of searching, the method comprising:
acquiring voice data of a user in an interaction process;
if the voice data is used for an information search, judging, through a preset algorithm model, whether the voice data satisfies a target condition;
if the voice data does not satisfy the target condition, performing correction processing on the voice data to obtain corrected voice data when it is detected that the network state satisfies a preset state;
searching for a search result matching the corrected voice data and outputting the search result.
2. The method of claim 1, wherein the step of performing correction processing on the voice data to obtain corrected voice data comprises:
acquiring a preset target voice model, wherein the target voice model is obtained by training on voice characteristic data of the user or historical voice characteristic data of the user;
and correcting the voice data based on the target voice model to obtain corrected voice data.
3. The method according to claim 2, wherein the step of searching for a search result matching the corrected voice data and outputting the search result comprises:
converting the corrected voice data into text data;
and searching for a search result matching the text data and outputting the search result.
4. The method of claim 2, further comprising:
converting the corrected voice data into text data;
and correcting the text data by using a target text model to obtain semantically complete text data.
5. The method according to claim 4, wherein the step of searching for a search result matching the corrected voice data and outputting the search result comprises:
searching for a search result matching the corrected text data and outputting the search result.
6. The method according to any one of claims 1 to 5, wherein before the step of performing correction processing on the voice data, the method further comprises:
acquiring voiceprint characteristics of the voice data;
judging whether the voiceprint features are preset voiceprint features or not;
if yes, converting the voice data into target voice data;
the step of performing correction processing on the voice data to obtain corrected voice data includes:
and carrying out correction processing on the target voice data to obtain corrected voice data.
7. The method according to any one of claims 1-6, wherein the target condition is used to characterize that a search intention of the user can be completely recognized from the voice data.
8. A search apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring voice data of a user in the interaction process;
the judging module is used for judging, through a preset algorithm model, whether the voice data satisfies the target condition if the voice data is used for an information search;
the processing module is used for, if the voice data does not satisfy the target condition, performing correction processing on the voice data to obtain corrected voice data when it is detected that the network state satisfies a preset state;
and the searching module is used for searching for a search result matching the corrected voice data and outputting the search result.
9. A terminal device, comprising:
a memory;
one or more processors coupled with the memory;
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs being configured to perform the method of any one of claims 1-7.
10. A computer-readable storage medium, having stored thereon program code that can be invoked by a processor to perform the method according to any one of claims 1 to 7.
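For illustration only (this sketch is not part of the claims), the correction path of claims 2 to 6 can be outlined as follows. TargetVoiceModel, TargetTextModel, extract_voiceprint, convert_to_target_voice_data, and speech_to_text are hypothetical stand-ins, not names from the application.

class TargetVoiceModel:
    """Stand-in for the model of claim 2, obtained by training on the
    user's voice characteristic data or historical voice characteristic
    data."""
    def correct(self, voice_data: str) -> str:
        return voice_data.strip()

class TargetTextModel:
    """Stand-in for the text model of claim 4 that yields semantically
    complete text."""
    def correct(self, text: str) -> str:
        return text if text.endswith("?") else text + "?"

def extract_voiceprint(voice_data: str) -> str:
    # Stand-in for extracting the voiceprint features (claim 6).
    return "user-1"

def convert_to_target_voice_data(voice_data: str) -> str:
    # Stand-in for converting the voice data into target voice data.
    return voice_data.lower()

def speech_to_text(voice_data: str) -> str:
    # Stand-in for converting corrected voice data into text (claim 3).
    return voice_data

def corrected_search(voice_data: str, preset_voiceprint: str = "user-1") -> str:
    # Claim 6: compare the voiceprint features with the preset
    # voiceprint features before correction.
    if extract_voiceprint(voice_data) == preset_voiceprint:
        voice_data = convert_to_target_voice_data(voice_data)
    # Claim 2: correct the (possibly converted) voice data based on the
    # target voice model.
    corrected = TargetVoiceModel().correct(voice_data)
    # Claims 3-5: convert to text, repair the text, then search and output.
    text = TargetTextModel().correct(speech_to_text(corrected))
    result = "results for: " + text
    print(result)  # output the search result
    return result

corrected_search("  what is the weather in shenzhen ")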
CN201911228496.7A 2019-12-04 2019-12-04 Searching method, searching device, terminal equipment and storage medium Pending CN110956958A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911228496.7A CN110956958A (en) 2019-12-04 2019-12-04 Searching method, searching device, terminal equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911228496.7A CN110956958A (en) 2019-12-04 2019-12-04 Searching method, searching device, terminal equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110956958A true CN110956958A (en) 2020-04-03

Family

ID=69979737

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911228496.7A Pending CN110956958A (en) 2019-12-04 2019-12-04 Searching method, searching device, terminal equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110956958A (en)

Patent Citations (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102523349A (en) * 2011-12-22 2012-06-27 苏州巴米特信息科技有限公司 Special cellphone voice searching method
CN102722539A (en) * 2012-05-23 2012-10-10 华为技术有限公司 Query method and device based on voice recognition
CN103853736A (en) * 2012-11-29 2014-06-11 北京掌城科技有限公司 Traffic information voice query system and voice processing unit thereof
CN103369398A (en) * 2013-07-01 2013-10-23 安徽广电信息网络股份有限公司 Voice searching method and voice searching system based on television EPG (electronic program guide) information
US20150178268A1 (en) * 2013-12-19 2015-06-25 Abbyy Infopoisk Llc Semantic disambiguation using a statistical analysis
CN106663424A (en) * 2014-03-31 2017-05-10 三菱电机株式会社 Device and method for understanding user intent
CN104008132A (en) * 2014-05-04 2014-08-27 深圳市北科瑞声科技有限公司 Voice map searching method and system
US20160063994A1 (en) * 2014-08-29 2016-03-03 Google Inc. Query Rewrite Corrections
CN106095766A (en) * 2015-04-28 2016-11-09 谷歌公司 Use selectivity again to talk and correct speech recognition
US10049655B1 (en) * 2016-01-05 2018-08-14 Google Llc Biasing voice correction suggestions
CN106056207A (en) * 2016-05-09 2016-10-26 武汉科技大学 Natural language-based robot deep interacting and reasoning method and device
CN106328166A (en) * 2016-08-31 2017-01-11 上海交通大学 Man-machine dialogue anomaly detection system and method
CN106571139A (en) * 2016-11-09 2017-04-19 百度在线网络技术(北京)有限公司 Artificial intelligence based voice search result processing method and device
CN106507244A (en) * 2016-12-23 2017-03-15 深圳先进技术研究院 A kind of central control system
CN106782519A (en) * 2016-12-23 2017-05-31 深圳先进技术研究院 A kind of robot
CN107240398A (en) * 2017-07-04 2017-10-10 科大讯飞股份有限公司 Intelligent sound exchange method and device
CN107357875A (en) * 2017-07-04 2017-11-17 北京奇艺世纪科技有限公司 A kind of voice search method, device and electronic equipment
CN107977183A (en) * 2017-11-16 2018-05-01 百度在线网络技术(北京)有限公司 voice interactive method, device and equipment
CN109710055A (en) * 2017-12-15 2019-05-03 蔚来汽车有限公司 The interaction control method of vehicle intelligent interactive system and vehicle-mounted interactive terminal
CN108597495A (en) * 2018-03-15 2018-09-28 维沃移动通信有限公司 A kind of method and device of processing voice data
CN108600219A (en) * 2018-04-23 2018-09-28 海信(广东)空调有限公司 A kind of sound control method and equipment
CN109545184A (en) * 2018-12-17 2019-03-29 广东小天才科技有限公司 It is a kind of that detection method and electronic equipment are recited based on voice calibration
CN110473521A (en) * 2019-02-26 2019-11-19 北京蓦然认知科技有限公司 A kind of training method of task model, device, equipment
CN110136705A (en) * 2019-04-10 2019-08-16 华为技术有限公司 A kind of method and electronic equipment of human-computer interaction
CN110211577A (en) * 2019-07-19 2019-09-06 宁波方太厨具有限公司 Terminal device and its voice interactive method

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112102831A (en) * 2020-09-15 2020-12-18 海南大学 Cross-data, information and knowledge modal content encoding and decoding method and component
CN112530442A (en) * 2020-11-05 2021-03-19 广东美的厨房电器制造有限公司 Voice interaction method and device
CN112530442B (en) * 2020-11-05 2023-11-17 广东美的厨房电器制造有限公司 Voice interaction method and device
CN112700769A (en) * 2020-12-26 2021-04-23 科大讯飞股份有限公司 Semantic understanding method, device, equipment and computer readable storage medium

Similar Documents

Publication Publication Date Title
CN109493850B (en) Growing type dialogue device
CN113327609B (en) Method and apparatus for speech recognition
US20240021202A1 (en) Method and apparatus for recognizing voice, electronic device and medium
CN110910903B (en) Speech emotion recognition method, device, equipment and computer readable storage medium
CN106407393B (en) information processing method and device for intelligent equipment
CN110956958A (en) Searching method, searching device, terminal equipment and storage medium
CN109492221B (en) Information reply method based on semantic analysis and wearable equipment
CN107844470B (en) Voice data processing method and equipment thereof
WO2020024620A1 (en) Voice information processing method and device, apparatus, and storage medium
CN112232276B (en) Emotion detection method and device based on voice recognition and image recognition
CN110931006A (en) Intelligent question-answering method based on emotion analysis and related equipment
CN108595406B (en) User state reminding method and device, electronic equipment and storage medium
CN112669842A (en) Man-machine conversation control method, device, computer equipment and storage medium
CN110955818A (en) Searching method, searching device, terminal equipment and storage medium
CN112151015A (en) Keyword detection method and device, electronic equipment and storage medium
CN111639529A (en) Speech technology detection method and device based on multi-level logic and computer equipment
CN111210824B (en) Voice information processing method and device, electronic equipment and storage medium
CN111062221A (en) Data processing method, data processing device, electronic equipment and storage medium
CN113611316A (en) Man-machine interaction method, device, equipment and storage medium
CN110781329A (en) Image searching method and device, terminal equipment and storage medium
CN111048068B (en) Voice wake-up method, device and system and electronic equipment
WO2023137920A1 (en) Semantic truncation detection method and apparatus, and device and computer-readable storage medium
CN114299955B (en) Voice interaction method and device, electronic equipment and storage medium
CN114171000A (en) Audio recognition method based on acoustic model and language model
CN111144125B (en) Text information processing method and device, terminal equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200403)