CN116467407A - Voice processing method, device and equipment - Google Patents

Voice processing method, device and equipment

Info

Publication number
CN116467407A
Authority
CN
China
Prior art keywords
voice
questions
similarity
determining
question
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310429483.6A
Other languages
Chinese (zh)
Inventor
张若璇
陈永录
高宏超
李宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3343 Query execution using phonetics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application provide a voice processing method, device and equipment, and relate to the technical field of artificial intelligence. The method includes: acquiring a first voice; acquiring a voice reply relation network, wherein the voice reply relation network includes a plurality of question groups and a reply corresponding to each question group, and the similarity between the questions within a question group is greater than or equal to a first threshold; processing the first voice through a first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice; determining a plurality of candidate questions in the voice reply relation network according to the at least one semantic feature, wherein the similarity between each candidate question and the at least one semantic feature is greater than or equal to a second threshold; and determining, according to the candidate questions, a target reply corresponding to the first voice in the voice reply relation network, and outputting the target reply. The method improves the accuracy of voice processing.

Description

Voice processing method, device and equipment
Technical Field
The embodiment of the application relates to the technical field of artificial intelligence, in particular to a voice processing method, a voice processing device and voice processing equipment.
Background
When handling business, enterprises can use artificial intelligence devices to answer users' questions automatically, which saves human resources and improves working efficiency.
In the related art, a user's spoken question may be processed as follows: after the artificial intelligence device obtains the user's question voice, it performs text extraction on the question voice to obtain at least one keyword corresponding to the question voice. According to the at least one keyword, the question with the highest similarity to the question voice is determined in a database, and the reply corresponding to that question is fed back to the user.
In the above process, the question text with the highest similarity to the question voice is determined only from the at least one keyword corresponding to the question voice. The question text with the highest similarity may nevertheless differ significantly in semantics from the question voice. As a result, the reply text determined from that question text may not be the reply corresponding to the question voice, so the accuracy of voice processing is low.
Disclosure of Invention
The embodiment of the application provides a voice processing method, device and equipment, which are used for solving the problem of low accuracy of voice processing.
In a first aspect, an embodiment of the present application provides a method for processing speech, including:
acquiring a first voice;
acquiring a voice reply relation network, wherein the voice reply relation network includes a plurality of question groups and a reply corresponding to each question group, and the similarity between the questions within a question group is greater than or equal to a first threshold;
processing the first voice through a first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice;
determining a plurality of candidate questions in the voice reply relation network according to the at least one semantic feature, wherein the similarity between each candidate question and the at least one semantic feature is greater than or equal to a second threshold;
and determining, according to the candidate questions, a target reply corresponding to the first voice in the voice reply relation network, and outputting the target reply.
In a second aspect, embodiments of the present application provide a speech processing apparatus, the apparatus including:
a first acquisition module, configured to acquire a first voice;
a second acquisition module, configured to acquire a voice reply relation network, wherein the voice reply relation network includes a plurality of question groups and a reply corresponding to each question group, and the similarity between the questions within a question group is greater than or equal to a first threshold;
a processing module, configured to process the first voice through a first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice;
a first determining module, configured to determine a plurality of candidate questions in the voice reply relation network according to the at least one semantic feature, wherein the similarity between each candidate question and the at least one semantic feature is greater than or equal to a second threshold;
and a second determining module, configured to determine, according to the candidate questions, a target reply corresponding to the first voice in the voice reply relation network, and to output the target reply.
In a third aspect, an embodiment of the present application provides a speech processing device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of the first aspects.
In a fourth aspect, embodiments of the present application provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of the first aspects.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a computer program which, when executed by a processor, implements the method of any of the first aspects.
According to the voice processing method, device and equipment provided by the embodiments of the present application, the first voice and the voice reply relation network are acquired. The first voice is processed through the first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice. The similarity between the at least one semantic feature and each question in the voice reply relation network is determined through a first similarity algorithm. The questions in the voice reply relation network are sorted in descending order of similarity to obtain a plurality of sorted questions. The first K of the sorted questions are determined as a first question set, and the remaining questions are determined as a second question set. A plurality of candidate questions are determined according to the at least one semantic feature, the first question set and the second question set. According to the candidate questions, a target reply corresponding to the first voice is determined in the voice reply relation network, and the target reply is output. In the above process, the first voice is processed through the first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice. The same semantic feature indicates all words with the same or similar semantics, and the plurality of candidate questions are determined through the semantic features corresponding to the keywords. Determining the target reply corresponding to the first voice in the voice reply relation network according to the plurality of candidate questions avoids the situation in which the keywords of the first voice and of a question in the database are the same or similar but their semantics differ greatly, which improves the accuracy of voice processing.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic diagram of an application scenario provided in an embodiment of the present application;
Fig. 2 is a schematic flow chart of a voice processing method according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a process of acquiring a first voice according to an embodiment of the present application;
Fig. 4 is a schematic flow chart of another voice processing method according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a voice reply relation network according to an embodiment of the present application;
Fig. 6 is a schematic diagram of a voice processing procedure according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of a voice processing apparatus according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of another voice processing apparatus according to an embodiment of the present application;
Fig. 9 is a schematic structural diagram of a voice processing device according to an embodiment of the present application.
Specific embodiments thereof have been shown by way of example in the drawings and will herein be described in more detail. These drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but to illustrate the concepts of the present application to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or fully authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards, and provide corresponding operation entries for the user to select authorization or rejection.
It should be noted that the method and apparatus for speech processing of the present application may be used in the field of artificial intelligence, and may also be used in any field other than artificial intelligence, and the application field of the method and apparatus for speech processing of the present application is not limited.
In order to facilitate understanding, an application scenario to which the embodiments of the present application are applicable is described below with reference to fig. 1.
Fig. 1 is a schematic diagram of an application scenario provided in an embodiment of the present application. Referring to Fig. 1, the scenario includes a terminal device 101 and a voice processing device 102. The terminal device 101 may be a mobile phone, a computer, or the like, and the voice processing device 102 may be a server. The user can ask a question through an application program provided by the terminal device 101; the terminal device 101 acquires the user's question voice and sends it to the voice processing device 102. Based on the question voice sent by the terminal device 101, the voice processing device 102 determines the reply corresponding to the question voice in the database and sends the reply to the terminal device 101. The terminal device 101 may display or play the reply so that the user obtains the reply corresponding to the question.
In the related art, a user's spoken question may be processed as follows: after the artificial intelligence device obtains the user's question voice, it performs text extraction on the question voice to obtain at least one keyword corresponding to the question voice. According to the at least one keyword, the question with the highest similarity to the question voice is determined in a database, and the reply corresponding to that question is fed back to the user. In the above process, the question with the highest similarity to the question voice is determined only from the at least one keyword corresponding to the question voice. The question with the highest similarity may nevertheless differ greatly in semantics from the question voice. As a result, the reply determined from that question may not be the reply corresponding to the question voice, which leads to low accuracy of voice processing.
In the embodiments of the present application, a first voice corresponding to a user's question and a voice reply relation network are acquired. The first voice is processed through the first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice. According to the at least one semantic feature, the plurality of candidate questions with the highest similarity to the first voice are determined in the voice reply relation network. According to the candidate questions, a target reply corresponding to the first voice is determined in the voice reply relation network, and the target reply is output. In the above process, the first voice is processed through the first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice. The same semantic feature indicates all words with the same or similar semantics, and the plurality of candidate questions are determined through the semantic features corresponding to the keywords. Determining the target reply corresponding to the first voice in the voice reply relation network according to the plurality of candidate questions avoids the situation in which the keywords of the first voice and of a question in the database are the same or similar but their semantics differ greatly, which improves the accuracy of voice processing.
The method shown in the present application will be described below by way of specific examples. It should be noted that the following embodiments may exist alone or in combination with each other, and for the same or similar content, the description will not be repeated in different embodiments.
Fig. 2 is a flow chart of a voice processing method according to an embodiment of the present application. Referring to fig. 2, the method may include:
S201, acquiring a first voice.
The execution body of the embodiment of the application may be a voice processing device, or may be a voice processing apparatus provided in the voice processing device. The speech processing means may be implemented by software or by a combination of software and hardware. The speech processing device can be a server.
The user can ask a question by following the prompt information displayed by an application program in the terminal device. The audio acquisition device of the terminal device captures the user's question and sends it to the voice processing device. The terminal device may be a mobile phone, a tablet computer, or the like.
Next, the process of acquiring the first voice is described with reference to Fig. 3. Fig. 3 is a schematic diagram of a process of acquiring a first voice according to an embodiment of the present application. Referring to Fig. 3, interfaces 301 and 302 are shown; both are query pages provided by an application in the terminal device. Referring to interface 301, the user opens the query page in the application of the terminal device, and a dialog box is displayed in the query page to prompt the user to perform the corresponding operation. To make an information inquiry, the user presses the talk button in interface 301, and in response to this operation the terminal device starts recording the user's question through its recording device. Referring to interface 302, after recording the user's question, the terminal device sends the first voice corresponding to the question to the voice processing device and displays the text corresponding to the first voice on the query page to indicate to the user that the question has been received.
S202, acquiring a voice reply relation network.
The voice reply relation network includes a plurality of question groups and a reply corresponding to each question group, and the similarity between the questions within a question group is greater than or equal to a first threshold.
The voice reply relation network is established as a knowledge graph from the voices acquired in a historical period and the replies corresponding to those voices, and is stored in a preset storage space of the voice processing device.
The multiple questions in a question group may be questions with the same or similar semantics. For example, the questions in a question group may be as shown in Table 1:
TABLE 1
Question      Question content
Question 1    How much money is there?
Question 2    Querying the current account balance
Question 3    How much money remains in the account?
S203, processing the first voice through the first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice.
The first model may be a word-to-vector (word2vec) model.
The at least one semantic feature corresponding to the at least one keyword in the first voice may be obtained as follows: voice recognition and word segmentation are performed on the first voice to obtain a first text corresponding to the first voice, wherein the first text includes at least one sentence text corresponding to the first voice and keyword labels corresponding to the sentence text, and a keyword label indicates the part-of-speech class of a keyword; the first text is then processed through the first model to obtain the at least one semantic feature corresponding to the at least one keyword in the first voice.
The part-of-speech classes include at least: nouns, verbs, adjectives, prepositions, pronouns, numerals, conjunctions, auxiliary words (particles), time words, state words, locative words, punctuation marks, and the like.
Semantic features may be represented by word vectors. The word vector may be an n-dimensional vector, and each element in the word vector is used to indicate a semantic feature corresponding to a keyword in the first text.
Semantic features are used to indicate multiple keywords of the same or similar semantics. For example, balance, remaining money, etc. may be represented by the same semantic features.
For example, after acquiring the first voice, the voice processing device performs voice recognition on it and obtains the text: "How much money is still in my account?" Word segmentation is then performed on the text to obtain the first text: I (pronoun), in the account, still have (adverb), how much (adjective), money (noun). Each keyword in the first text is input into the first model, which processes the first text to obtain a word vector A representing the at least one semantic feature corresponding to the at least one keyword in the first voice. The word vector may be, for example, A = (a, b, c, d)^T, where each element indicates the semantic feature corresponding to a keyword in the first text.
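As an illustration only, the mapping from a segmented, part-of-speech-tagged first text to semantic features can be sketched as a lookup in an embedding table. The table `TOY_EMBEDDINGS`, its 4-dimensional values, the set `KEYWORD_TAGS` and the function `semantic_features` are hypothetical stand-ins for a trained word2vec model and are not taken from the application.

```python
from typing import Dict, List, Tuple

# Hypothetical 4-dimensional embeddings standing in for a trained word2vec model.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "balance": [0.9, 0.1, 0.0, 0.2],
    "money": [0.8, 0.2, 0.1, 0.2],
    "account": [0.1, 0.9, 0.1, 0.0],
    "repayment": [0.0, 0.1, 0.9, 0.3],
}

# Assumption: only words with these part-of-speech labels are treated as keywords.
KEYWORD_TAGS = {"noun", "verb", "adjective"}


def semantic_features(tagged_words: List[Tuple[str, str]]) -> List[List[float]]:
    """Return one word vector per keyword that the toy model knows."""
    features = []
    for word, tag in tagged_words:
        if tag in KEYWORD_TAGS and word in TOY_EMBEDDINGS:
            features.append(TOY_EMBEDDINGS[word])
    return features


# Segmented first text with part-of-speech labels (illustrative values).
first_text = [
    ("I", "pronoun"),
    ("account", "noun"),
    ("how much", "adjective"),
    ("money", "noun"),
]
features = semantic_features(first_text)
```

In this sketch "balance" and "money" receive nearby vectors, which mirrors the idea that words with the same or similar semantics share a semantic feature; a real system would obtain the vectors from a trained model rather than a hand-written table.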
S204, determining a plurality of candidate questions in the voice reply relation network according to the at least one semantic feature.
The similarity between each candidate question and the at least one semantic feature is greater than or equal to a second threshold.
The plurality of candidate questions may be determined in the voice reply relation network as follows: determining the similarity between the at least one semantic feature and each question in the voice reply relation network through a first similarity algorithm; sorting the questions in the voice reply relation network in descending order of similarity to obtain a plurality of sorted questions; determining the first K of the sorted questions as a first question set, and determining the remaining questions as a second question set, wherein K is an integer greater than or equal to 1; and determining the plurality of candidate questions according to the at least one semantic feature, the first question set, and the second question set.
The first similarity algorithm may be a word centroid distance (Word Centroid Distance, WCD) algorithm.
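A minimal sketch of a word centroid distance follows, assuming each text is represented by a list of word vectors and each word is weighted equally (the usual WCD formulation weights words by their normalized frequency). The mapping from distance to a similarity score in `wcd_similarity` is an assumption made here for illustration; the application does not state how a distance is converted into a similarity.

```python
import math
from typing import List

Vector = List[float]


def centroid(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def word_centroid_distance(doc_a: List[Vector], doc_b: List[Vector]) -> float:
    """Euclidean distance between the centroids of two sets of word vectors."""
    return math.dist(centroid(doc_a), centroid(doc_b))


def wcd_similarity(doc_a: List[Vector], doc_b: List[Vector]) -> float:
    """Map a distance in [0, inf) to a similarity in (0, 1] (assumed mapping)."""
    return 1.0 / (1.0 + word_centroid_distance(doc_a, doc_b))


query = [[1.0, 0.0], [0.0, 1.0]]   # centroid is (0.5, 0.5)
same_topic = [[0.5, 0.5]]          # identical centroid
other_topic = [[3.5, 4.5]]         # centroid offset by (3, 4)
```

Because only one centroid per text is compared, this distance is cheap to compute for every question in the network, which is why it suits a first ranking pass.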
From the first question set and the second question set, the K questions whose similarity is greater than or equal to the second threshold and greater than the similarity of all remaining questions are determined as the plurality of candidate questions.
For example, suppose there are 100 questions in the voice reply relation network and K is 10. The similarity between the at least one semantic feature and each of the 100 questions in the voice reply relation network is determined through the WCD algorithm. The questions in the voice reply relation network are sorted in descending order of the 100 similarities to obtain a plurality of sorted questions. The first 10 of the 100 sorted questions (questions 1 to 10) are determined as the first question set. The remaining 90 of the 100 sorted questions (questions 11 to 100) are determined as the second question set. According to the at least one semantic feature, the first question set and the second question set, 10 questions whose similarity is greater than or equal to the second threshold of 90% are determined as the plurality of candidate questions.
If the similarity of each of questions 1 to 10 is greater than or equal to the similarity of each of questions 11 to 100, and the similarity of each of questions 1 to 10 is greater than or equal to the second threshold, questions 1 to 10 in the first question set are determined as the plurality of candidate questions. If any of questions 11 to 100 has a similarity greater than that of one of questions 1 to 10, the ranking of all the questions is updated until there are 10 questions whose similarity is greater than or equal to the second threshold and greater than that of the remaining 90 questions. These 10 questions are determined as the plurality of candidate questions.
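The ranking and selection described above can be sketched as follows, assuming the similarity of every question has already been computed. The function `select_candidates` and the sample scores are hypothetical. The sketch returns fewer than K questions when fewer than K reach the second threshold, which is an assumption, since the text does not describe that case.

```python
from typing import Dict, List


def select_candidates(
    similarities: Dict[str, float], k: int, second_threshold: float
) -> List[str]:
    """Pick up to k top-ranked questions whose similarity meets the threshold."""
    # Sort questions in descending order of similarity.
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    # First question set: the first k questions. The rest form the second set.
    first_set = ranked[:k]
    # After a full sort, no question in the second set scores higher than a
    # question in the first set, so only the threshold check remains.
    return [question for question, score in first_set if score >= second_threshold]


scores = {"q1": 0.97, "q2": 0.95, "q3": 0.92, "q4": 0.85, "q5": 0.40}
top3 = select_candidates(scores, k=3, second_threshold=0.90)
top4 = select_candidates(scores, k=4, second_threshold=0.90)
```

With these sample scores, `top3` and `top4` both contain q1, q2 and q3, because q4 falls below the 90% threshold even though it ranks fourth.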
S205, determining, according to the candidate questions, a target reply corresponding to the first voice in the voice reply relation network, and outputting the target reply.
For example, the plurality of candidate questions determined by the voice processing device in the voice reply relation network according to the at least one semantic feature may be as shown in Table 2:
TABLE 2
Question      Question content
Question 1    When is my next repayment?
Question 2    What is the repayment cycle?
Question 3    Before which date is a repayment not overdue?
According to the candidate questions, the voice processing device may determine in the voice reply relation network that the target reply corresponding to the first voice is: the repayment date is the 5th of each month. The voice processing device may send the target reply to the terminal device, which may directly play or display the target reply.
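The text does not spell out how the candidate questions are reduced to a single target reply. One possible reading, shown here purely as an assumption, is to take the question group to which most candidate questions belong and return that group's reply. All identifiers and the balance reply text below are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def target_reply(
    candidates: List[str],
    question_to_group: Dict[str, str],
    group_to_reply: Dict[str, str],
) -> str:
    """Return the reply of the question group that most candidates belong to."""
    votes = Counter(question_to_group[question] for question in candidates)
    winning_group, _ = votes.most_common(1)[0]
    return group_to_reply[winning_group]


# Hypothetical identifiers for the questions of Table 2 plus one unrelated question.
question_to_group = {
    "q_next_repayment": "repayment_schedule",
    "q_repayment_cycle": "repayment_schedule",
    "q_overdue_date": "repayment_schedule",
    "q_balance": "balance",
}
group_to_reply = {
    "repayment_schedule": "The repayment date is the 5th of each month.",
    "balance": "reply for balance questions",
}

reply = target_reply(
    ["q_next_repayment", "q_repayment_cycle", "q_overdue_date"],
    question_to_group,
    group_to_reply,
)
```

A majority vote tolerates a stray candidate from another group; other rules, such as taking the group of the single most similar candidate, would fit the description equally well.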
According to the voice processing method provided by the embodiments of the present application, the first voice and the voice reply relation network are acquired. The first voice is processed through the first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice. A plurality of candidate questions are determined in the voice reply relation network according to the at least one semantic feature. According to the candidate questions, a target reply corresponding to the first voice is determined in the voice reply relation network, and the target reply is output. In the above process, the first voice is processed through the first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice. The same semantic feature indicates all words with the same or similar semantics, and the plurality of candidate questions are determined through the semantic features corresponding to the keywords. Determining the target reply corresponding to the first voice in the voice reply relation network according to the plurality of candidate questions avoids the situation in which the keywords of the first voice and of a question in the database are the same or similar but their semantics differ greatly, which improves the accuracy of voice processing.
On the basis of any of the above embodiments, a detailed procedure of the voice processing will be described below with reference to fig. 4.
Fig. 4 is a flow chart of another voice processing method according to an embodiment of the present application. Referring to fig. 4, the method includes:
S401, acquiring a first voice.
It should be noted that, for the execution of S401, reference may be made to S201; details are not repeated here.
S402, acquiring a voice reply relation network.
The voice reply relation network comprises a plurality of question groups and a reply corresponding to each question group, and the similarity between the questions within a question group is greater than or equal to a first threshold.
Before the voice reply relation network is acquired, it can be established according to a plurality of questions acquired in a historical period and the reply corresponding to each question. The voice reply relation network is then stored in a preset storage space of the voice processing device.
The voice reply relation network may be determined by: acquiring a plurality of voice questions and the reply corresponding to each voice question; classifying the plurality of voice questions to obtain a plurality of question groups, wherein the similarity between the questions within a question group is greater than or equal to the first threshold; determining the reply corresponding to each question group; and generating the voice reply relation network according to the plurality of voice questions, the plurality of question groups and the reply corresponding to each question group.
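As an illustrative, non-limiting sketch, the grouping procedure above can be expressed as follows. The names `QuestionGroup`, `build_reply_network` and `jaccard` are hypothetical and do not appear in this application; the greedy grouping strategy, the choice of the first question's reply as the group reply, and the toy token-overlap similarity are assumptions made only so that the sketch runs.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class QuestionGroup:
    """A group of mutually similar questions sharing one reply."""
    questions: List[str] = field(default_factory=list)
    reply: str = ""


def build_reply_network(
    qa_pairs: List[Tuple[str, str]],
    similarity: Callable[[str, str], float],
    first_threshold: float,
) -> List[QuestionGroup]:
    """Greedily place each question in the first group whose members all
    reach first_threshold; otherwise open a new group (assumed strategy)."""
    groups: List[QuestionGroup] = []
    for question, reply in qa_pairs:
        for group in groups:
            if all(similarity(question, q) >= first_threshold for q in group.questions):
                group.questions.append(question)
                break
        else:
            # No existing group is similar enough: start a new group and
            # take this question's reply as the group reply (assumption).
            groups.append(QuestionGroup([question], reply))
    return groups


def jaccard(a: str, b: str) -> float:
    """Toy token-overlap similarity, used only to make the sketch runnable."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)
```

For instance, `build_reply_network(pairs, jaccard, 0.95)` places two questions made of the same words in one group and an unrelated question in a separate group. Any other similarity measure could be passed in place of `jaccard`.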
Next, the structure of the voice reply relation network will be described with reference to fig. 5. Fig. 5 is a schematic structural diagram of a voice reply relation network according to an embodiment of the present application. Referring to fig. 5, a voice reply relation network 501 is included, and the voice reply relation network 501 is stored in a preset storage space of the voice processing device. The voice reply relation network 501 includes 5 question groups, namely question group 1, question group 2, question group 3, question group 4 and question group 5. The replies corresponding to the question groups are reply 1, reply 2, reply 3, reply 4 and reply 5, respectively. Each question group comprises a plurality of questions, and the similarity between the questions within a group is greater than or equal to the first threshold, here 95%.
An association relationship can be established between a plurality of question groups of the same type, and the association relationship indicates that the questions in those question groups belong to the same service type. For example, question group 1 and question group 2 shown in fig. 5 have an association relationship (indicated by a dotted-line box): the service type corresponding to the questions in question group 1 and question group 2 is querying account information. Question group 3 and question group 4 also have an association relationship: the service type corresponding to the questions in question group 3 and question group 4 is querying the repayment deadline.
When determining the association relationships between question groups, service types with multiple dimensions and multiple ranges can be set in advance, and the association relationships between the question groups are established according to these dimensions. For example, for questions related to querying business information, the dimensions corresponding to the service type may be set to include, in order, query account information, query account type, and query account deposit and withdrawal limits. In this way, multi-dimensional and multi-level association relationships between the question groups can be established according to a plurality of questions.
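As a hedged illustration of the multi-level association described above, each question group could be tagged with a path of service-type levels, ordered from coarse to fine, and two groups could be treated as associated at as many levels as their paths share. The path representation, the name `shared_levels` and the sample assignments of groups to paths are assumptions for illustration only.

```python
from typing import Dict, Tuple

# Hypothetical service-type path for a question group, from coarse to fine.
ServicePath = Tuple[str, ...]


def shared_levels(a: ServicePath, b: ServicePath) -> int:
    """Number of leading service-type levels that two question groups share."""
    count = 0
    for level_a, level_b in zip(a, b):
        if level_a != level_b:
            break
        count += 1
    return count


# Sample assignment of question groups to paths (illustrative values only).
group_paths: Dict[str, ServicePath] = {
    "question group 1": ("query account information",),
    "question group 2": ("query account information", "query account type"),
    "question group 3": ("query repayment deadline",),
}
```

Under this sketch, two groups with a non-zero number of shared levels belong to the same top-level service type, and a larger number indicates a finer-grained association.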
S403, processing the first voice through the first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice.
Before the first voice is processed through the first model, the first model can be trained according to the voice acquired in the historical period and the reply corresponding to the voice, so that the accuracy of the output result of the first model is improved.
The first model may be trained by: acquiring a training set, wherein the training set comprises a plurality of second voices and at least one semantic feature corresponding to each second voice; performing voice recognition processing and word segmentation processing on each second voice to obtain a second text corresponding to the second voice and a keyword label corresponding to the second text, wherein the keyword label indicates the part-of-speech classification to which a keyword belongs; and performing the i-th iterative training on the i-th intermediate model through the second voice to obtain an (i+1)-th intermediate model, wherein i takes the values 1, 2, 3, ... in turn until the i-th intermediate model converges or i is greater than or equal to N, at which point the i-th intermediate model is determined to be the first model. N is a preset number of iterations and is an integer greater than 1, and the 1st intermediate model is an initial model.
The (i+1)-th intermediate model can be obtained by: performing feature extraction processing on each second text through the i-th intermediate model to obtain at least one predicted semantic feature corresponding to each second text; determining a loss value according to the at least one predicted semantic feature corresponding to each second text and the at least one semantic feature in the training set; and updating the model parameters of the i-th intermediate model according to the loss value to obtain the (i+1)-th intermediate model, wherein the model parameters comprise the dimension of the word vector and the window size.
The model convergence condition is that the loss value is less than or equal to a preset threshold. That is, the loss value computed from the at least one predicted semantic feature and the corresponding at least one semantic feature in the training set is less than or equal to the preset threshold.
The loss value is determined in the same way as the similarity of the plurality of candidate questions, and details are not repeated here.
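The iteration and stopping rule described above can be sketched as follows, purely as an illustration. The function `train` and its parameter names are hypothetical; the `step` callable stands in for one round of feature extraction, loss computation and parameter update, whose internals are not specified here, and `toy_step` is an arbitrary stand-in used only to make the sketch runnable.

```python
from typing import Callable, Tuple, TypeVar

Model = TypeVar("Model")


def train(
    initial_model: Model,
    step: Callable[[Model], Tuple[Model, float]],
    loss_threshold: float,
    max_iterations: int,
) -> Tuple[Model, int]:
    """Iterate until the loss is <= loss_threshold (convergence) or
    max_iterations (N) rounds have been run, whichever comes first."""
    model = initial_model
    for i in range(1, max_iterations + 1):
        # One round: i-th intermediate model -> (i+1)-th intermediate model.
        model, loss = step(model)
        if loss <= loss_threshold:
            return model, i
    return model, max_iterations


def toy_step(x: float) -> Tuple[float, float]:
    """Toy update: the 'model' is a number moved halfway towards 3.0;
    the returned loss is the distance measured before the update."""
    loss = abs(x - 3.0)
    return x + 0.5 * (3.0 - x), loss
```

Calling `train(0.0, toy_step, 0.1, 100)` stops as soon as the toy loss falls to the threshold, whereas a small `max_iterations` stops the loop at the preset iteration count even if the loss is still above the threshold.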
S404, determining the similarity of at least one semantic feature and each problem in the voice response relation network through a first similarity algorithm.
Because the first similarity algorithm has to determine the similarity between the at least one semantic feature and every question in the voice reply relation network, a similarity algorithm with low time complexity may be used, so as to improve computational efficiency.
S405, sorting the questions in the voice reply relation network in descending order of similarity to obtain a plurality of sorted questions.
For example, the voice processing device determines, through the first similarity algorithm, the similarity between the at least one semantic feature and each of 10 questions in the voice reply relation network. After the 10 questions are sorted in descending order of similarity, the sorted questions are: question 2, question 1, question 4, question 6, question 3, question 9, question 8, question 10, question 5 and question 7.
S406, determining the first K questions in the ordered plurality of questions as a first question set, and determining other questions except the first K questions in the ordered plurality of questions as a second question set.
K is an integer greater than or equal to 1. The value of K may be determined according to the number of questions whose similarity to the at least one semantic feature is greater than or equal to a second threshold.
For example, if the second threshold is 95% and 3 questions have a similarity to the at least one semantic feature greater than or equal to 95%, the value of K may be determined to be 3.
For example, suppose K is 3. Then, according to the 10 sorted questions in the example above, the first question set includes the first 3 questions, namely question 2, question 1 and question 4. The second question set includes the last 7 questions, namely question 6, question 3, question 9, question 8, question 10, question 5 and question 7.
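Steps S404 to S406 can be sketched as follows, as a minimal illustration only. Cosine similarity is used here as a stand-in for the first similarity algorithm, which is not named in this description; the function names, the vector representation of questions and the rule of forcing K to be at least 1 when no question reaches the second threshold are assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cheap similarity used as a stand-in for the first similarity algorithm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_and_split(
    query: Sequence[float],
    questions: Dict[str, Sequence[float]],
    second_threshold: float,
) -> Tuple[List[str], List[str]]:
    """Sort questions by descending similarity, then split them into the
    first K questions and the remaining questions."""
    scored = sorted(
        ((cosine_similarity(query, vec), name) for name, vec in questions.items()),
        key=lambda pair: pair[0],
        reverse=True,
    )
    ranked = [name for _, name in scored]
    # K = number of questions at or above the second threshold, at least 1 (assumption).
    k = max(1, sum(1 for score, _ in scored if score >= second_threshold))
    return ranked[:k], ranked[k:]
```

The first returned list plays the role of the first question set and the second list plays the role of the second question set.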
S407, determining the first similarity of at least one semantic feature and each question in the first question set through a second similarity algorithm.
The second similarity algorithm may be the Word Mover's Distance (WMD) algorithm.
For example, for the first question set in the example above, the first similarity between the at least one semantic feature and each question in the first question set, as determined through the second similarity algorithm, may be as shown in Table 3:
Table 3
S408, determining the second similarity of at least one semantic feature and each question in the second question set through a third similarity algorithm.
The third similarity algorithm may be the Relaxed Word Mover's Distance (RWMD) algorithm.
Through the above similarity algorithms, the word mover's distance between the at least one semantic feature and each question is determined. The similarity between the at least one semantic feature and each question may then be determined based on that word mover's distance.
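As a hedged sketch of the cheaper of the two distances, the relaxed word mover's distance can be computed by letting every word move all of its weight to the nearest word of the other text, in both directions, and taking the larger of the two costs. The embedding table `emb`, the normalised word-count weights and the mapping from distance to similarity via 1 / (1 + d) are assumptions for illustration; this description does not specify how a distance is converted into a similarity. Both inputs are assumed to be non-empty. The full word mover's distance additionally requires solving a transport problem and is not shown.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence

Embeddings = Dict[str, Sequence[float]]


def _one_sided_cost(src: List[str], dst: List[str], emb: Embeddings) -> float:
    """Cost when each source word sends all its weight to its nearest target word."""
    weights = Counter(src)
    total = sum(weights.values())
    targets = set(dst)
    return sum(
        (count / total) * min(math.dist(emb[word], emb[t]) for t in targets)
        for word, count in weights.items()
    )


def rwmd(doc_a: List[str], doc_b: List[str], emb: Embeddings) -> float:
    """Relaxed word mover's distance: the larger of the two one-sided costs."""
    return max(_one_sided_cost(doc_a, doc_b, emb), _one_sided_cost(doc_b, doc_a, emb))


def distance_to_similarity(distance: float) -> float:
    """Assumed mapping from a distance in [0, inf) to a similarity in (0, 1]."""
    return 1.0 / (1.0 + distance)
```

Because each one-sided cost drops one of the transport constraints, the relaxed distance never exceeds the full word mover's distance, which is why it is suited to scoring the larger second question set.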
S409, determining whether a target similarity exists among the plurality of second similarities.
The target similarity is greater than each of the first similarities.
If yes, S410 is performed.
If not, S411 is performed.
S410, determining the target question corresponding to the target similarity, updating the first question set according to the target question, and determining a plurality of candidate questions according to the first question set.
The first question set may be updated according to the target question as follows: determining the similarity between the at least one semantic feature and the target question through the second similarity algorithm; sorting all the questions in the first question set together with the target question in descending order of similarity; and updating the first question set to the first K questions after sorting.
For example, for the second question set in the example above, the second similarity between the at least one semantic feature and each question in the second question set, as determined through the third similarity algorithm, may be as shown in Table 4:
Table 4

| Second question set | Second similarity |
| --- | --- |
| Question 6 | 95.2% |
| Question 3 | 91.5% |
| Question 9 | 90.0% |
| Question 8 | 88.0% |
| Question 10 | 85.0% |
| Question 5 | 83.0% |
| Question 7 | 75.0% |
From the first similarities shown in Table 3 and the second similarities shown in Table 4, it can be determined that the second similarity of question 6 is greater than the similarity of question 4. It can therefore be determined that a target similarity, 95.2%, exists among the plurality of second similarities, and that question 6 is the target question. The similarity between the at least one semantic feature and question 6 is then determined through the second similarity algorithm to be 98.0%. All the questions in the first question set and the target question are sorted in descending order of similarity, giving the order question 6, question 2, question 1, question 4. The first question set is updated to the first 3 questions after sorting, that is, the first question set comprises question 6, question 2 and question 1.
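The check in S409 and the update in S410 can be sketched as follows, for illustration only. The sketch follows the literal definition given for S409, in which a target similarity must exceed every first similarity; the function name `refine_first_set`, the dictionary representation and the `rescore` callable (standing in for the second similarity algorithm) are assumptions, and the numbers used with it are hypothetical rather than taken from the tables of this description.

```python
from typing import Callable, Dict, List


def refine_first_set(
    first_sims: Dict[str, float],
    second_sims: Dict[str, float],
    rescore: Callable[[str], float],
) -> List[str]:
    """Return the (possibly updated) first question set, best question first.

    first_sims:  question -> first similarity (second similarity algorithm)
    second_sims: question -> second similarity (third similarity algorithm)
    rescore:     re-scores a target question with the second similarity algorithm
    """
    k = len(first_sims)
    best_first = max(first_sims.values())
    # A target similarity is greater than each of the first similarities.
    targets = [q for q, s in second_sims.items() if s > best_first]
    if not targets:
        # No target similarity: the first question set is kept unchanged.
        return sorted(first_sims, key=first_sims.get, reverse=True)
    merged = dict(first_sims)
    for question in targets:
        merged[question] = rescore(question)
    # Keep only the first K questions after sorting by similarity.
    return sorted(merged, key=merged.get, reverse=True)[:k]
```

When a target question is promoted, the question with the lowest similarity drops out of the first question set so that its size stays at K.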
After S410, S412 is performed.
S411, determining a plurality of questions to be selected according to the first question set.
The plurality of candidate questions may be determined from the first question set as follows: for each question in the first question set, determining, in the voice reply relation network, the question group to which the question belongs; and if all the questions in the first question set belong to the same question group, determining the questions in the first question set as the plurality of candidate questions.
For example, for the second question set in the example above, the second similarity between the at least one semantic feature and each question in the second question set, as determined through the third similarity algorithm, may be as shown in Table 5:
Table 5

| Second question set | Second similarity |
| --- | --- |
| Question 6 | 93.2% |
| Question 3 | 91.5% |
| Question 9 | 90.0% |
| Question 8 | 88.0% |
| Question 10 | 85.0% |
| Question 5 | 83.0% |
| Question 7 | 75.0% |
From the first similarities shown in Table 3 and the second similarities shown in Table 5, it can be determined that no target similarity exists among the plurality of second similarities. In this case, the question groups to which question 2, question 1 and question 4 in the first question set shown in Table 3 belong are determined in the voice reply relation network. If question 2, question 1 and question 4 all belong to question group 2, then question 2, question 1 and question 4 in the first question set are determined as the plurality of candidate questions.
Because the plurality of candidate questions are determined by combining three similarity algorithms, an algorithm with lower time complexity can be used to screen out questions with low similarity when the amount of computation is large, which reduces computation time and improves efficiency. When the amount of computation is small, a more complex and more accurate algorithm is used, which improves the accuracy of the determined similarity.
S412, determining, according to the candidate questions, a target reply corresponding to the first voice in the voice reply relation network, and outputting the target reply.
For example, according to the example above, the plurality of candidate questions are question 2, question 1 and question 4. In the voice reply relation network, the voice processing device determines that question 2, question 1 and question 4 belong to question group 2, and determines that the target reply corresponding to question group 2 is that the repayment date is April 12. The voice processing device sends the target reply to the terminal device, and the terminal device displays or plays the target reply.
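The group check of S411 and the reply lookup of S412 can be sketched together as follows, as an illustration only. The function name `target_reply` and the two lookup dictionaries are hypothetical. Returning `None` when the candidate questions fall into different question groups is an assumption, since that case is not described here.

```python
from typing import Dict, List, Optional


def target_reply(
    candidates: List[str],
    group_of: Dict[str, str],
    reply_of: Dict[str, str],
) -> Optional[str]:
    """Return the reply of the single question group shared by all candidate
    questions, or None if the candidates span several groups (assumed handling)."""
    groups = {group_of[question] for question in candidates}
    if len(groups) != 1:
        return None
    return reply_of[groups.pop()]
```

Here `group_of` maps each question to its question group and `reply_of` maps each question group to its reply, mirroring the structure of the voice reply relation network.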
According to the voice processing method provided by this embodiment of the application, the first voice and the voice reply relation network are acquired. The first voice is processed through the first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice. The similarity between the at least one semantic feature and each question in the voice reply relation network is determined through the first similarity algorithm. The questions in the voice reply relation network are sorted in descending order of similarity to obtain a plurality of sorted questions. The first K sorted questions are determined as the first question set, and the remaining sorted questions are determined as the second question set. A plurality of candidate questions are determined based on the at least one semantic feature, the first question set and the second question set. A target reply corresponding to the first voice is determined in the voice reply relation network according to the candidate questions, and the target reply is output. In this process, because the first voice is processed through the first model, at least one semantic feature corresponding to at least one keyword in the first voice is obtained, and the same semantic feature indicates all words with the same or similar semantics. The plurality of candidate questions are determined through the semantic features corresponding to the keywords, and the target reply is determined according to these candidate questions. This avoids the situation in which the keywords of the first voice and of a candidate question in the database are the same or similar but their semantics differ considerably, and thereby improves the accuracy of voice processing.
On the basis of any of the above embodiments, a detailed procedure of the voice processing will be exemplified below with reference to fig. 6.
Fig. 6 is a schematic diagram of a voice processing procedure according to an embodiment of the present application. Referring to fig. 6, a terminal device 601 and a voice processing device 602 are included. The terminal device 601 may be a mobile phone, a computer, etc., and the voice processing device 602 may be a server. The voice processing device 602 is provided with a first algorithm, and a preset storage space of the voice processing device 602 stores a voice reply relation network.
The user opens the query page in the application program of the terminal device 601 and performs the corresponding input and selection operations according to the prompt information displayed on the query page. In response to a click operation of the user, the terminal device 601 starts recording the user's question through the recording device. After recording the user's question, the terminal device 601 sends a first voice corresponding to the question to the voice processing device 602, and at the same time displays text or prompt information corresponding to the first voice on the query page to indicate to the user that the question has been received. The first voice may be, for example, "the repayment time of the account for the current period".
The voice processing device 602 performs voice recognition processing and word segmentation processing on the first voice to obtain a first text corresponding to the first voice, where the first text includes: account (noun), current period (noun), of (auxiliary word) and repayment time (time word). The voice processing device 602 processes the first voice through the first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice, denoted $A = (a, b, c)^{T}$, where each element indicates the semantic feature corresponding to one keyword in the first text.
The voice processing device 602 obtains the voice reply relation network from the preset storage space, and determines, through the first similarity algorithm, the similarity between the at least one semantic feature and each question in the voice reply relation network. The voice processing device 602 sorts the questions in the voice reply relation network in descending order of similarity to obtain a plurality of sorted questions: question 11, question 2, question 7, question 3, question 6, question 1, question 5, question 8, question 10, question 4, question 12, question 9 and question 13. Assuming K is 5, the first question set includes the first 5 sorted questions, namely question 11, question 2, question 7, question 3 and question 6. The second question set includes the last 8 sorted questions, namely question 1, question 5, question 8, question 10, question 4, question 12, question 9 and question 13. The first similarity between the at least one semantic feature and each question in the first question set, as determined by the voice processing device 602 through the second similarity algorithm, may be as shown in Table 5:
Table 5

| First question set | First similarity |
| --- | --- |
| Question 11 | 98.2% |
| Question 2 | 97.5% |
| Question 7 | 97.0% |
| Question 3 | 96.0% |
| Question 6 | 95.0% |
The second similarity between the at least one semantic feature and each question in the second question set, as determined by the voice processing device 602 through the third similarity algorithm, may be as shown in Table 6:
Table 6

| Second question set | Second similarity |
| --- | --- |
| Question 1 | 94.2% |
| Question 5 | 94.0% |
| Question 8 | 93.5% |
| Question 10 | 92.0% |
| Question 4 | 90.0% |
| Question 12 | 88.0% |
| Question 9 | 86.3% |
| Question 13 | 80.0% |
Based on the first similarities shown in Table 5 and the second similarities shown in Table 6, the voice processing device 602 may determine that no target similarity exists among the plurality of second similarities. The voice processing device 602 determines that, in the voice reply relation network, question 11, question 2, question 7, question 3 and question 6 in the first question set shown in Table 5 all belong to question group 1. The voice processing device 602 therefore determines question 11, question 2, question 7, question 3 and question 6 in the first question set as the plurality of candidate questions. According to the candidate questions, the voice processing device 602 determines, in the voice reply relation network, that the target reply corresponding to the first voice is that the repayment date for the current period is April 12, and outputs the target reply. The voice processing device 602 sends the target reply to the terminal device 601, and the terminal device 601 displays or plays the target reply through the application program.
According to the voice processing method provided by this embodiment of the application, the first voice and the voice reply relation network are acquired. The first voice is processed through the first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice. The similarity between the at least one semantic feature and each question in the voice reply relation network is determined through the first similarity algorithm. The questions in the voice reply relation network are sorted in descending order of similarity to obtain a plurality of sorted questions. The first K sorted questions are determined as the first question set, and the remaining sorted questions are determined as the second question set. A plurality of candidate questions are determined based on the at least one semantic feature, the first question set and the second question set. A target reply corresponding to the first voice is determined in the voice reply relation network according to the candidate questions, and the target reply is output. In this process, because the first voice is processed through the first model, at least one semantic feature corresponding to at least one keyword in the first voice is obtained, and the same semantic feature indicates all words with the same or similar semantics. The plurality of candidate questions are determined through the semantic features corresponding to the keywords, and the target reply is determined according to these candidate questions. This avoids the situation in which the keywords of the first voice and of a candidate question in the database are the same or similar but their semantics differ considerably, and thereby improves the accuracy of voice processing.
Fig. 7 is a schematic structural diagram of a speech processing device according to an embodiment of the present application. Referring to fig. 7, the voice processing apparatus 10 may include:
a first acquiring module 11, configured to acquire a first voice;
a second acquiring module 12, configured to acquire a voice reply relation network, where the voice reply relation network includes a plurality of question groups and a reply corresponding to each question group, and the similarity between the questions within a question group is greater than or equal to a first threshold;
a processing module 13, configured to process the first voice through a first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice;
a first determining module 14, configured to determine a plurality of candidate questions in the voice reply relation network according to the at least one semantic feature, where the similarity between each candidate question and the at least one semantic feature is greater than or equal to a second threshold;
and a second determining module 15, configured to determine, according to the candidate questions, a target reply corresponding to the first voice in the voice reply relation network, and output the target reply.
In a possible embodiment, the first determining module 14 is specifically configured to:
determining the similarity between the at least one semantic feature and each question in the voice reply relation network through a first similarity algorithm;
sorting the questions in the voice reply relation network in descending order of similarity to obtain a plurality of sorted questions;
determining the first K questions of the sorted plurality of questions as a first question set and the other questions of the plurality of questions except the first K questions as a second question set, wherein K is an integer greater than or equal to 1;
determining the plurality of candidate questions according to the at least one semantic feature, the first question set and the second question set.
In a possible embodiment, the first determining module 14 is specifically configured to:
determining a first similarity of the at least one semantic feature to each question in the first set of questions by a second similarity algorithm;
determining a second similarity of the at least one semantic feature to each question in the second set of questions by a third similarity algorithm;
and determining the plurality of questions to be selected from the first question set and the second question set according to the plurality of first similarities and the plurality of second similarities.
In a possible embodiment, the first determining module 14 is specifically configured to:
determining whether a target similarity exists among the plurality of second similarities, wherein the target similarity is greater than each first similarity;
if yes, determining the target question corresponding to the target similarity, updating the first question set according to the target question, and determining the plurality of candidate questions according to the first question set;
if not, determining the plurality of candidate questions according to the first question set.
In a possible embodiment, the first determining module 14 is specifically configured to:
determining the similarity between the at least one semantic feature and the target question according to the second similarity algorithm;
sorting all the questions in the first question set together with the target question in descending order of similarity;
updating the first question set to the first K questions after sorting.
In a possible embodiment, the first determining module 14 is specifically configured to:
for each question in the first question set, determining, in the voice reply relation network, the question group to which the question belongs;
and if all the questions in the first question set belong to the same question group, determining the questions in the first question set as the plurality of candidate questions.
In one possible embodiment, the processing module 13 is specifically configured to:
performing voice recognition processing and word segmentation processing on the first voice to obtain a first text corresponding to the first voice, wherein the first text comprises at least one sentence text corresponding to the first voice and a keyword label corresponding to the sentence text, and the keyword label is used for indicating part-of-speech classification to which the keyword belongs;
and processing the first text through a first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice.
The voice processing device provided in the embodiment of the present application may execute the technical solution shown in the foregoing method embodiment, and its implementation principle and beneficial effects are similar, and will not be described herein again.
Fig. 8 is a schematic structural diagram of another speech processing device according to an embodiment of the present application. Referring to fig. 8, the speech processing apparatus 10 further includes a generating module 16 based on the embodiment shown in fig. 7.
The generating module 16 is configured to:
acquiring a plurality of voice questions and the reply corresponding to each voice question;
classifying the plurality of voice questions to obtain a plurality of question groups, wherein the similarity between the questions within a question group is greater than or equal to a first threshold;
determining the reply corresponding to each question group;
and generating the voice reply relation network according to the voice questions, the question groups and the replies corresponding to the question groups.
The voice processing device provided in the embodiment of the present application may execute the technical solution shown in the foregoing method embodiment, and its implementation principle and beneficial effects are similar, and will not be described herein again.
Fig. 9 is a schematic structural diagram of a voice processing device according to an embodiment of the present application. Referring to fig. 9, the voice processing device 20 may include a memory 21 and a processor 22. Illustratively, the memory 21 and the processor 22 are interconnected by a bus 23.
The memory 21 is used for storing program instructions;
the processor 22 is configured to execute the program instructions stored in the memory, so as to cause the speech processing device 20 to perform the method shown in the above-described method embodiment.
The voice processing device provided in the embodiment of the present application may execute the technical solution shown in the foregoing method embodiment, and its implementation principle and beneficial effects are similar, and will not be described in detail herein.
Embodiments of the present application provide a computer-readable storage medium having stored therein computer-executable instructions for implementing the above-described method when the computer-executable instructions are executed by a processor.
Embodiments of the present application may also provide a computer program product comprising a computer program which, when executed by a processor, performs the above-described method.
All or part of the steps for implementing the method embodiments described above may be performed by hardware associated with program instructions. The foregoing program may be stored in a readable memory. When executed, the program performs the steps of the method embodiments described above. The aforementioned memory (storage medium) includes: read-only memory (ROM), random access memory (RAM), flash memory, hard disk, solid state disk, magnetic tape, floppy disk, optical disk, and any combination thereof.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processing unit of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processing unit of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the embodiments of the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the embodiments of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to encompass such modifications and variations.

Claims (12)

1. A voice processing method, comprising:
acquiring a first voice;
acquiring a voice reply relation network, wherein the voice reply relation network comprises a plurality of question groups and a reply corresponding to each question group, and the similarity between the questions in each question group is greater than or equal to a first threshold;
processing the first voice through a first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice;
determining a plurality of candidate questions in the voice reply relation network according to the at least one semantic feature, wherein the similarity between each candidate question and the at least one semantic feature is greater than or equal to a second threshold;
and determining, according to the candidate questions, a target reply corresponding to the first voice in the voice reply relation network, and outputting the target reply.
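As an illustration only, the last two steps of claim 1 can be sketched as follows. The claim does not state how the target reply is derived from the candidate questions; the sketch assumes a majority vote over the question groups of the candidates. The function name `select_target_reply` and the mappings `similarities`, `group_of` and `reply_of` are hypothetical and do not appear in the application.

```python
from collections import Counter
from typing import Dict, Optional


def select_target_reply(
    similarities: Dict[str, float],   # question id -> similarity to the semantic feature
    group_of: Dict[str, str],         # question id -> question group id
    reply_of: Dict[str, str],         # question group id -> reply
    second_threshold: float,
) -> Optional[str]:
    """Keep questions at or above the second threshold, then pick a reply.

    Assumption: the reply of the group holding the most candidates is used.
    """
    candidates = [q for q, s in similarities.items() if s >= second_threshold]
    if not candidates:
        return None  # the claim does not cover this case; None is a placeholder
    votes = Counter(group_of[q] for q in candidates)
    best_group = votes.most_common(1)[0][0]
    return reply_of[best_group]
```

In this sketch a question below the second threshold never influences the reply, which matches the threshold condition recited in the claim.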
2. The method of claim 1, wherein determining a plurality of candidate questions in the voice reply relation network according to the at least one semantic feature comprises:
determining the similarity between the at least one semantic feature and each question in the voice reply relation network through a first similarity algorithm;
sorting the questions in the voice reply relation network in descending order of similarity to obtain a plurality of sorted questions;
determining the top K questions of the plurality of sorted questions as a first question set, and the questions other than the top K questions as a second question set, wherein K is an integer greater than or equal to 1;
determining the plurality of candidate questions according to the at least one semantic feature, the first question set and the second question set.
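A minimal sketch of the ranking and split recited in claim 2 is given below. The claim does not name the first similarity algorithm; cosine similarity over feature vectors is assumed here purely for illustration, and `cosine_similarity` and `split_top_k` are hypothetical names.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def split_top_k(
    feature: Vector,
    questions: Dict[str, Vector],  # question id -> question vector (assumed representation)
    k: int,
    similarity: Callable[[Vector, Vector], float] = cosine_similarity,
) -> Tuple[List[str], List[str]]:
    """Sort questions by descending similarity and split into top K and the rest."""
    if k < 1:
        raise ValueError("K must be an integer greater than or equal to 1")
    ranked = sorted(
        questions,
        key=lambda q: similarity(feature, questions[q]),
        reverse=True,
    )
    return ranked[:k], ranked[k:]
```

The first returned list plays the role of the first question set and the second list the role of the second question set.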
3. The method of claim 2, wherein determining the plurality of candidate questions according to the at least one semantic feature, the first question set and the second question set comprises:
determining a first similarity between the at least one semantic feature and each question in the first question set through a second similarity algorithm;
determining a second similarity between the at least one semantic feature and each question in the second question set through a third similarity algorithm;
and determining the plurality of candidate questions in the first question set and the second question set according to the plurality of first similarities and the plurality of second similarities.
4. The method of claim 3, wherein determining the plurality of candidate questions in the first question set and the second question set according to the plurality of first similarities and the plurality of second similarities comprises:
determining whether a target similarity exists among the plurality of second similarities, wherein the target similarity is greater than each first similarity;
if so, determining a target question corresponding to the target similarity, updating the first question set according to the target question, and determining the plurality of candidate questions according to the first question set;
if not, determining the plurality of candidate questions according to the first question set.
5. The method of claim 4, wherein updating the first question set according to the target question comprises:
determining the similarity between the at least one semantic feature and the target question according to the second similarity algorithm;
sorting all the questions in the first question set and the target question in descending order of similarity;
and updating the first question set to the top K questions after sorting.
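The re-scoring and update described in claims 3 to 5 can be sketched as follows, under stated assumptions. The second and third similarity algorithms are not specified in the claims, so they are passed in as opaque scoring callables; `refine_first_set` is a hypothetical name, and handling several target questions at once is a choice of this sketch rather than something the claims require.

```python
from typing import Callable, Dict, List

# Maps a question id to its similarity with the semantic feature.
Scorer = Callable[[str], float]


def refine_first_set(
    first_set: List[str],
    second_set: List[str],
    second_algorithm: Scorer,
    third_algorithm: Scorer,
) -> List[str]:
    """Return the (possibly updated) first question set, keeping its size K."""
    k = len(first_set)  # K >= 1 per claim 2, so first_set is non-empty
    # Claim 3: first similarities for the first set via the second algorithm.
    first_sims: Dict[str, float] = {q: second_algorithm(q) for q in first_set}
    best_first = max(first_sims.values())
    # Claim 4: a target similarity is a second similarity above every first one.
    targets = [q for q in second_set if third_algorithm(q) > best_first]
    if not targets:
        return list(first_set)
    # Claim 5: re-score each target with the second algorithm, merge, keep top K.
    merged = dict(first_sims)
    for q in targets:
        merged[q] = second_algorithm(q)
    ranked = sorted(merged, key=merged.get, reverse=True)
    return ranked[:k]
```

Because a target question is re-scored with the second similarity algorithm before merging, it displaces a member of the first question set only if it also ranks within the top K under that algorithm.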
6. The method of claim 4 or 5, wherein determining the plurality of candidate questions according to the first question set comprises:
for each question in the first question set, determining the question group to which the question belongs in the voice reply relation network;
and if the question groups of all the questions in the first question set are the same, determining the questions in the first question set as the plurality of candidate questions.
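The group check of claim 6 amounts to a short test, sketched below. The claim does not say what happens when the questions fall into different groups, so this sketch simply returns `None` in that case; `candidates_if_same_group` and the `group_of` mapping are hypothetical names.

```python
from typing import Dict, List, Optional


def candidates_if_same_group(
    first_set: List[str],
    group_of: Dict[str, str],  # question id -> question group id
) -> Optional[List[str]]:
    """Return the first question set as candidates if all share one group."""
    groups = {group_of[q] for q in first_set}
    if len(groups) == 1:
        return list(first_set)
    # Not covered by the claim; None is a placeholder for "no decision".
    return None
```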
7. The method of any one of claims 1-6, wherein processing the first voice through a first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice comprises:
performing voice recognition processing and word segmentation processing on the first voice to obtain a first text corresponding to the first voice, wherein the first text comprises at least one sentence text corresponding to the first voice and a keyword label corresponding to the sentence text, and the keyword label is used for indicating the part-of-speech classification to which the keyword belongs;
and processing the first text through the first model to obtain the at least one semantic feature corresponding to the at least one keyword in the first voice.
8. The method of any one of claims 1-7, further comprising, prior to acquiring the voice reply relation network:
acquiring a plurality of voice questions and a reply corresponding to each voice question;
classifying the plurality of voice questions to obtain a plurality of question groups, wherein the similarity between the questions in each question group is greater than or equal to the first threshold;
determining the reply corresponding to each question group;
and generating the voice reply relation network according to the voice questions, the question groups and the replies corresponding to the question groups.
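One possible way to build the network of claim 8 is sketched below. Several points are assumptions made only for illustration: the voice questions are taken to be already transcribed to text, token-level Jaccard similarity stands in for the unspecified similarity measure, grouping is greedy, and the reply of a group is the most frequent reply among its questions. The names `jaccard` and `build_reply_network` are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity of two whitespace-separated strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def build_reply_network(
    qa_pairs: List[Tuple[str, str]],  # (question text, reply)
    first_threshold: float,
    similarity: Callable[[str, str], float] = jaccard,
) -> List[Dict[str, object]]:
    """Greedily group questions so that every pair in a group meets the threshold."""
    groups: List[List[Tuple[str, str]]] = []
    for question, reply in qa_pairs:
        for group in groups:
            # Join the first group whose every member is similar enough.
            if all(similarity(question, q) >= first_threshold for q, _ in group):
                group.append((question, reply))
                break
        else:
            groups.append([(question, reply)])
    network = []
    for group in groups:
        # Assumption: the group reply is the most common reply in the group.
        reply = Counter(r for _, r in group).most_common(1)[0][0]
        network.append({"questions": [q for q, _ in group], "reply": reply})
    return network
```

Requiring every pair inside a group to meet the threshold keeps the sketch consistent with the condition that the similarity between the questions in each question group is greater than or equal to the first threshold.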
9. A voice processing device, the device comprising:
a first acquisition module, configured to acquire a first voice;
a second acquisition module, configured to acquire a voice reply relation network, wherein the voice reply relation network comprises a plurality of question groups and a reply corresponding to each question group, and the similarity between the questions in each question group is greater than or equal to a first threshold;
a processing module, configured to process the first voice through a first model to obtain at least one semantic feature corresponding to at least one keyword in the first voice;
a first determining module, configured to determine a plurality of candidate questions in the voice reply relation network according to the at least one semantic feature, wherein the similarity between each candidate question and the at least one semantic feature is greater than or equal to a second threshold;
and a second determining module, configured to determine, according to the candidate questions, a target reply corresponding to the first voice in the voice reply relation network, and to output the target reply.
10. Voice processing equipment, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 8.
11. A non-transitory computer readable storage medium storing computer instructions, wherein the computer instructions are for causing a computer to perform the method of any one of claims 1 to 8.
12. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1 to 8.
CN202310429483.6A 2023-04-20 2023-04-20 Voice processing method, device and equipment Pending CN116467407A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310429483.6A CN116467407A (en) 2023-04-20 2023-04-20 Voice processing method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310429483.6A CN116467407A (en) 2023-04-20 2023-04-20 Voice processing method, device and equipment

Publications (1)

Publication Number Publication Date
CN116467407A true CN116467407A (en) 2023-07-21

Family

ID=87185034

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310429483.6A Pending CN116467407A (en) 2023-04-20 2023-04-20 Voice processing method, device and equipment

Country Status (1)

Country Link
CN (1) CN116467407A (en)

Similar Documents

Publication Publication Date Title
CN108304437B (en) Automatic question answering method, device and storage medium
CN109299344B (en) Generation method of ranking model, and ranking method, device and equipment of search results
US20180157960A1 (en) Scalable curation system
CN112035599B (en) Query method and device based on vertical search, computer equipment and storage medium
CN111324713B (en) Automatic replying method and device for conversation, storage medium and computer equipment
EP2734938A1 (en) Method and system of classification in a natural language user interface
CN110874401B (en) Information processing method, model training method, device, terminal and computing equipment
CN104834651B (en) Method and device for providing high-frequency question answers
CN111339277A (en) Question-answer interaction method and device based on machine learning
CN113342958B (en) Question-answer matching method, text matching model training method and related equipment
CN110990533A (en) Method and device for determining standard text corresponding to query text
CN113064980A (en) Intelligent question and answer method and device, computer equipment and storage medium
CN114706945A (en) Intention recognition method and device, electronic equipment and storage medium
CN110287284B (en) Semantic matching method, device and equipment
CN110377803B (en) Information processing method and device
CN110929014A (en) Information processing method, information processing device, electronic equipment and storage medium
CN111104422A (en) Training method, device, equipment and storage medium of data recommendation model
CN108959327B (en) Service processing method, device and computer readable storage medium
CN110019750A (en) The method and apparatus that more than two received text problems are presented
CN116467407A (en) Voice processing method, device and equipment
CN116414940A (en) Standard problem determining method and device and related equipment
CN110377721B (en) Automatic question answering method, device, storage medium and electronic equipment
CN109787784B (en) Group recommendation method and device, storage medium and computer equipment
CN112148855A (en) Intelligent customer service problem retrieval method, terminal and storage medium
CN111708862A (en) Text matching method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination