CN116189681B - Intelligent voice interaction system and method - Google Patents

Intelligent voice interaction system and method

Info

Publication number: CN116189681B
Application number: CN202310486481.0A
Authority: CN (China)
Prior art keywords: sound signal, digital sound signal, user, tone, question
Other languages: Chinese (zh)
Other versions: CN116189681A
Inventors: 李广鹏, 周林娜
Assignee (current and original): Beijing Crystal Digital Technology Co., Ltd.
Legal status: Active (granted)

Events:
    • Application filed by Beijing Crystal Digital Technology Co., Ltd., claiming priority to CN202310486481.0A
    • Publication of CN116189681A
    • Application granted; publication of CN116189681B


Classifications

    All classifications fall under G PHYSICS; G10 MUSICAL INSTRUMENTS; ACOUSTICS; G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING:
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 15/18 Speech classification or search using natural language modelling
    • G10L 15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G10L 17/06 Speaker identification or verification: decision making techniques; pattern matching strategies
    • G10L 17/14 Use of phonemic categorisation or speech recognition prior to speaker recognition or verification
    • G10L 2015/225 Feedback of the input speech

Abstract

The invention discloses an intelligent voice interaction system and method in the field of voice interaction, comprising a data acquisition module, a data processing module, a data analysis module, a data center, an execution module and a control center.

Description

Intelligent voice interaction system and method
Technical Field
The invention relates to the technical field of intelligent voice control, in particular to an intelligent voice interaction system and method.
Background
Speech is the most common way for humans to communicate, and it is also the way humans would most like to communicate with computers. Voice communication with computers has therefore become a research hotspot. With the development of technology, intelligent voice systems are increasingly applied across industries; in the exhibition setting, an intelligent voice guide is a device that gives voice-broadcast explanations of indoor exhibits so that visitors can understand them in depth.
An intelligent voice guide has a human-computer interaction function: it can record speech within a certain range, analyze its semantics and communicate. However, current intelligent voice guides lack accurate speech recognition capability in complex environments and are easily disturbed by outside interference, which leads to unclear speech and interfering sounds. This is especially true for voice guides in the exhibition field: because their working environment is noisy and varied, their speech input is very easily disturbed, which degrades the voice interaction function.
In addition, in the special scene of an exhibition, an intelligent voice guide has difficulty recognizing different users from their voice characteristics and cannot provide personalized communication services, so the interaction experience of users at exhibitions is poor.
Disclosure of Invention
In order to solve the above-mentioned shortcomings in the background art, the present invention aims to provide an intelligent voice interaction system and method.
The aim of the invention is achieved by the following technical scheme. In a first aspect, the invention provides an intelligent voice interaction system comprising a data acquisition module, a data processing module, a data analysis module, a data center, an execution module and a control center; the data center comprises a timbre database, a noise database, a general question-answer library and a user question-answer library.

The data acquisition module: collects analog sound signals and sends them to the data processing module for data processing.

The data processing module: converts the analog sound signal into a digital sound signal by analog-to-digital conversion and extracts features from the converted digital sound signal to obtain its characteristic parameters, which comprise the decibel, speed, pitch and timbre of the digital sound signal; it marks these characteristic parameters and sends them to the data analysis module for analysis.

The data analysis module: calculates a first determination parameter from the decibel, speed and pitch of the digital sound signal, sets a standard determination parameter, takes the first derivative of each, and obtains the determination difference value as the absolute difference of the two first derivatives. The determination difference value is compared with a preset difference threshold: if it is greater than or equal to the threshold, the digital sound signal of the collected sound is judged not to meet the control standard and is recorded in the noise database; if it is smaller than the threshold, the digital sound signal is judged to meet the control standard, the control center filters out the signals recorded in the noise database, and the timbre of the filtered digital sound signal is analyzed.

The timbre of the digital sound signal is then matched against the user timbres in the user timbre parameter set stored in the timbre database. If the match succeeds, the user's natural language is parsed with NLP from the digital sound signal, the control center traverses the historical question-answer records of the user question-answer library, information differences are eliminated according to the correlation between the NLP parsing result and the content of those records, a final language processing result is obtained, answer content is generated from it, and the execution module executes the interaction instruction. If the match fails, the user's natural language is parsed with NLP, the control center accesses the general question-answer library and answers from its data, the execution module executes the interaction instruction, a historical question-answer record is created for this user in the user question-answer library, and the question-answer content is entered into it.
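For orientation, the module layout named above can be pictured as a handful of plain data holders. The following sketch only fixes vocabulary for the later examples and is an illustrative assumption, not a structure prescribed by the patent.

```python
# A minimal sketch of the data center and its four stores (assumed layout).
from dataclasses import dataclass, field

@dataclass
class DataCenter:
    timbre_db: dict = field(default_factory=dict)   # user timbre parameter set Y_sbp
    noise_db: list = field(default_factory=list)    # rejected digital sound signals
    general_qa: object = None                       # general question-answer library
    user_qa: dict = field(default_factory=dict)     # per-user historical QA records
```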
Preferably, the data processing performed by the data processing module includes the following steps: convert the analog sound signal into a digital sound signal by analog-to-digital conversion; extract features from the converted digital sound signal to obtain its characteristic parameters, namely the decibel, speed, pitch and timbre of the digital sound signal; mark the characteristic parameters, denoting the decibel of the digital sound signal as F_by, its speed as S_dy, its pitch as G_dy and its timbre as Y_sy, where y is the acquisition index, y = 1, 2, 3, ..., n, and n is the total number of acquisitions; and send the decibel F_by, speed S_dy, pitch G_dy and timbre Y_sy of the digital sound signal to the data analysis module for data analysis.
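The following sketch illustrates how the four marked characteristic parameters might be computed from a digital sound signal. It is a minimal illustration, not the patent's implementation: the frame length, the pitch-search range and the 16-band spectral envelope used as a timbre descriptor are all assumptions.

```python
# A minimal sketch of extracting F_by, S_dy, G_dy, Y_sy from a digital signal.
import numpy as np

def extract_features(x: np.ndarray, sr: int = 16000) -> dict:
    """Return decibel F_b, speed S_d, pitch G_d and timbre Y_s for signal x."""
    # F_b: loudness as dBFS of the RMS amplitude
    rms = np.sqrt(np.mean(x ** 2)) + 1e-12
    f_b = 20.0 * np.log10(rms)

    # S_d: speaking-rate proxy = fraction of 25 ms frames that carry energy
    frame = int(0.025 * sr)
    frames = x[: len(x) // frame * frame].reshape(-1, frame)
    energy = np.sqrt(np.mean(frames ** 2, axis=1))
    s_d = float(np.mean(energy > 0.1 * energy.max()))

    # G_d: pitch estimate via autocorrelation over an assumed 60-400 Hz range
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    lo, hi = sr // 400, sr // 60
    g_d = sr / (lo + int(np.argmax(ac[lo:hi])))

    # Y_s: timbre descriptor = normalised 16-band spectral envelope
    spec = np.abs(np.fft.rfft(x))
    bands = np.array_split(spec, 16)
    y_s = np.array([b.mean() for b in bands])
    y_s = y_s / (np.linalg.norm(y_s) + 1e-12)
    return {"F_b": f_b, "S_d": s_d, "G_d": g_d, "Y_s": y_s}
```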
Preferably, the data analysis performed by the data analysis module includes the following steps. A first determination parameter P_dy is calculated using a preset formula that combines the decibel F_by, speed S_dy and pitch G_dy with F_b0, the standard sound decibel parameter, S_d0, the standard sound speed parameter, and G_d0, the standard sound pitch parameter, where α is the sound decibel influence parameter, β is the sound speed influence parameter, γ is the sound pitch influence parameter, and φ is a preset proportionality coefficient. From the calculated first determination parameter P_dy, its first derivative P_dy1 is obtained; a standard determination parameter P_db is set and differentiated to obtain its first derivative P_db1. The determination difference value Cz = |P_dy1 - P_db1| is then calculated and compared with the preset difference threshold Cz_0. If Cz is greater than or equal to Cz_0, the digital sound signal of the collected sound does not meet the control standard, and the noise database records the digital sound signal. If Cz is smaller than Cz_0, the collected sound meets the control standard; the control center filters out the digital sound signals recorded in the noise database and analyzes the timbre of the filtered digital sound signal. The user timbre parameter set Y_sbp stored in the timbre database is obtained through the data acquisition unit in the data analysis module, and the timbre Y_sy of the digital sound signal is matched against the user timbre parameters in Y_sbp. If the timbre Y_sy matches successfully, the user's natural language is parsed with NLP from the digital sound signal, the control center traverses the historical question-answer records of the user question-answer library, information differences are eliminated according to the correlation between the NLP parsing result and the content of those records, a final language processing result is obtained, answer content is generated from it for the interaction, and the execution module executes the interaction instruction. If the timbre Y_sy fails to match, the user's natural language is parsed with NLP from the digital sound signal, the control center accesses the general question-answer library and calls its data to answer, the execution module executes the interaction instruction, a historical question-answer record is created for this user in the user question-answer library, and the question-answer content is entered into it.
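A minimal sketch of this noise-gating decision follows. The exact expression for P_dy is not reproduced in the text above, so the weighted-deviation form used here, along with the numeric standards, influence parameters and threshold, is an assumption made only for illustration.

```python
# A minimal sketch of the P_dy / Cz gating logic (assumed formula and values).
import numpy as np

F_B0, S_D0, G_D0 = -20.0, 0.6, 180.0          # standard decibel / speed / pitch
ALPHA, BETA, GAMMA, PHI = 0.5, 0.3, 0.2, 1.0  # influence params + scale (assumed)
CZ0 = 0.15                                    # preset difference threshold

def first_determination(f_b: float, s_d: float, g_d: float) -> float:
    """Assumed form of P_dy: scaled, weighted deviations from the standards."""
    return PHI * (ALPHA * abs(f_b - F_B0) / abs(F_B0)
                  + BETA * abs(s_d - S_D0) / S_D0
                  + GAMMA * abs(g_d - G_D0) / G_D0)

def gate(p_dy_series, p_db_series) -> np.ndarray:
    """Compare first derivatives of the measured and standard parameter series."""
    p_dy1 = np.gradient(np.asarray(p_dy_series))  # first derivative of P_dy
    p_db1 = np.gradient(np.asarray(p_db_series))  # first derivative of P_db
    cz = np.abs(p_dy1 - p_db1)                    # determination difference Cz
    # True = meets the control standard; False = route to the noise database
    return cz < CZ0
```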
Preferably, the user timbre parameter set Y_sbp = {Y_sb1, Y_sb2, Y_sb3, ..., Y_sbt}, where p is the user number and t is the total number of users.
Preferably, the user timbre parameter set Y_sbp is acquired as follows: the voice information of a user is recorded through the data acquisition terminal in the control center, the voice information comprising the sound decibel, sound speed and sound pitch; the sound information is combined with a timbre mapping model to obtain and store the user timbre parameters, and all acquired user timbre parameters are integrated to form the user timbre parameter set; the timbre mapping model is trained on an artificial intelligence model.
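A minimal enrollment sketch under these definitions might look as follows; the timbre_model argument stands in for the trained timbre mapping model described next and is a hypothetical interface.

```python
# A minimal sketch of building the user timbre parameter set Y_sbp.
import numpy as np

user_timbre_set: dict = {}   # p -> user timbre parameter vector Y_sbp

def enroll_user(p: int, decibel: float, speed: float, pitch: float,
                timbre_model) -> None:
    """Record a user's sound information and store the mapped timbre params."""
    sound_info = np.array([decibel, speed, pitch], dtype=np.float32)
    user_timbre_set[p] = timbre_model(sound_info)
```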
Preferably, the timbre mapping model is trained on the artificial intelligence model as follows: standard training data, comprising sound information and user timbre parameters, are integrated and acquired through a server; the artificial intelligence model is trained on the standard training data to obtain and store the timbre mapping model. The artificial intelligence model comprises a deep convolutional neural network model and an RBF neural network model.
Preferably, the data acquisition module is configured to acquire the analog sound signal by using a sound pickup.
Preferably, the sound pickup is an analog sound pickup, and is composed of a microphone and an audio amplifying circuit.
In a second aspect, the invention also provides an intelligent voice interaction method, which includes the following steps:

obtaining an analog sound signal and performing analog-to-digital conversion on it to obtain a digital sound signal;

extracting features from the digital sound signal to obtain its characteristic parameters and marking them;

calculating a first determination parameter from the marked characteristic parameters, setting a standard determination parameter, taking the first derivative of each, and computing the determination difference value as the absolute difference of the two first derivatives;

comparing the determination difference value with a set difference threshold: if it is greater than or equal to the threshold, the digital sound signal of the collected sound is judged not to meet the control standard and is recorded in the noise database; if it is smaller than the threshold, the digital sound signal is judged to meet the control standard, the control center filters out the signals recorded in the noise database, and the timbre of the filtered digital sound signal is analyzed;

matching the timbre of the digital sound signal against the user timbres in the user timbre parameter set stored in the timbre database: if the match succeeds, the user's natural language is parsed with NLP, the control center traverses the historical question-answer records of the user question-answer library, information differences are eliminated according to the correlation between the parsing result and the content of those records, a final language processing result is obtained, answer content is generated from it for the interaction, and the execution module executes the interaction instruction; if the match fails, the user's natural language is parsed with NLP, the control center accesses the general question-answer library and calls its data to answer, the execution module executes the interaction instruction, a historical question-answer record is created for this user in the user question-answer library, and the question-answer content is entered into it.
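Put together, the method steps could be orchestrated as in the sketch below. It reuses the hypothetical helpers from the earlier sketches (extract_features, first_determination, gate) and the route_query helper sketched later in the detailed description; analog_to_digital and transcribe are likewise hypothetical placeholders, not functions named by the patent.

```python
# A minimal end-to-end sketch of the method steps (assumed helper functions).
def interact_once(analog_frames, sr, state):
    x = analog_to_digital(analog_frames)             # step 1: A/D conversion
    feats = extract_features(x, sr)                  # step 2: mark F_b, S_d, G_d, Y_s
    p_dy = first_determination(feats["F_b"], feats["S_d"], feats["G_d"])
    state.p_dy_history.append(p_dy)
    # steps 3-4: Cz test (assumes at least two samples in each history)
    ok = bool(gate(state.p_dy_history, state.p_db_history)[-1])
    if not ok:
        state.noise_db.append(x)                     # record noise, no reply
        return None
    # steps 5-6: timbre match, then answer from user or general QA library
    return route_query(feats["Y_s"], transcribe(x), state.user_timbre_set,
                       state.user_qa, state.general_qa)
```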
The invention has the following beneficial effects. In use, the intelligent voice interaction system collects analog sound signals, converts them to digital sound signals by analog-to-digital conversion, extracts features from the converted signals to obtain their characteristic parameters, and marks those parameters. A first determination parameter is calculated from the marked parameters, a standard determination parameter is set, the first derivative of each is taken, and the determination difference value is computed as the absolute difference of the two first derivatives. The difference is compared with the set threshold: at or above the threshold, the digital sound signal of the collected sound is judged not to meet the control standard and is recorded in the noise database; below the threshold, it is judged to meet the control standard, the control center filters out the signals recorded in the noise database, and the timbre of the filtered digital sound signal is analyzed. The timbre is then matched against the user timbres in the user timbre parameter set stored in the timbre database: on a successful match, the user's natural language is parsed with NLP, the control center traverses the historical question-answer records of the user question-answer library, information differences are eliminated via the correlation between the parsing result and those records, answer content is generated from the final language processing result for the interaction, and the execution module executes the interaction instruction; on a failed match, the user's natural language is parsed with NLP, the control center accesses the general question-answer library and calls its data to answer, the execution module executes the interaction instruction, and a historical question-answer record is created for the user, with the question-answer content entered into the user question-answer library.
The invention enables the intelligent voice device to distinguish effective speech from noisy environmental sound; when effective speech is detected, other environmental interference sounds are shielded, improving speech recognition accuracy. The invention can also identify the speaking user from the timbre-database comparison result and eliminate information differences via the correlation with the historical question-answer records in the user database, which avoids the poor interaction experience caused by unclear speech recognition; if no question-answer history exists, a user record is created, the natural language is parsed with NLP, and the general library is accessed for the answer.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to those skilled in the art that other drawings can be obtained according to these drawings without inventive effort.
Fig. 1 is a system architecture diagram of an intelligent voice interaction system according to an embodiment of the present invention.
Fig. 2 is a flowchart of an intelligent voice interaction method according to a second embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
The intelligent voice interaction system shown in fig. 1 comprises a data acquisition module, a data processing module, a data analysis module, a data center, an execution module and a control center, wherein the data center comprises a timbre database, a noise database, a general question-answer library and a user question-answer library.
And a data acquisition module: acquiring analog sound signals and sending the acquired analog sound signals to the data processing module for data processing.
And a data processing module: converting the analog sound signal into a digital sound signal by analog-to-digital conversion, extracting features from the converted digital sound signal to obtain its characteristic parameters, namely the decibel, speed, pitch and timbre of the digital sound signal, marking these characteristic parameters and sending them to the data analysis module for analysis.
And a data analysis module: calculating a first determination parameter from the decibel, speed and pitch of the digital sound signal, setting a standard determination parameter, taking the first derivative of each, and obtaining the determination difference value as the absolute difference of the two first derivatives.
The determination difference value is compared with a preset difference threshold; if it is greater than or equal to the threshold, the digital sound signal of the collected sound is judged not to meet the control standard, and the noise database records the digital sound signal.
If the determination difference value is smaller than the threshold, the digital sound signal of the collected sound is judged to meet the control standard; the control center filters out the digital sound signals recorded in the noise database and analyzes the timbre of the filtered digital sound signal.
The timbre of the digital sound signal is matched against the user timbres in the user timbre parameter set stored in the timbre database. If the match succeeds, the user's natural language is parsed with NLP (Natural Language Processing) from the digital sound signal; the control center traverses the historical question-answer records of the user question-answer library, eliminates information differences according to the correlation between the NLP parsing result and the content of those records, obtains a final language processing result, generates answer content from it for the interaction, and the execution module executes the interaction instruction. If the match fails, the user's natural language is parsed with NLP from the digital sound signal, the control center accesses the general question-answer library and calls its data to answer, the execution module executes the interaction instruction, a historical question-answer record is created for this user in the user question-answer library, and the question-answer content is recorded into it.
In the first embodiment of the invention, during use, analog sound signals are collected and converted into digital sound signals by analog-to-digital conversion; features are extracted from the converted digital sound signals to obtain their characteristic parameters, and the characteristic parameters are marked. A first determination parameter is calculated from the marked characteristic parameters, a standard determination parameter is set, the first derivative of each is taken, and the determination difference value is computed as the absolute difference of the two first derivatives. The difference is compared with the set threshold: if it is greater than or equal to the threshold, the digital sound signal of the collected sound is judged not to meet the control standard, and the noise database records it; if it is smaller than the threshold, the digital sound signal is judged to meet the control standard, and the control center filters out the signals recorded in the noise database and analyzes the timbre of the filtered digital sound signal.
The timbre of the digital sound signal is then matched against the user timbres in the user timbre parameter set stored in the timbre database: if the match succeeds, the user's natural language is parsed with NLP, the control center traverses the historical question-answer records of the user question-answer library, information differences are eliminated via the correlation between the parsing result and those records, a final language processing result is obtained, answer content is generated from it for the interaction, and the execution module executes the interaction instruction; if the match fails, the user's natural language is parsed with NLP, the control center accesses the general question-answer library and calls its data to answer, the execution module executes the interaction instruction, and a historical question-answer record is created for the user, with the question-answer content recorded into the user question-answer library. A sketch of this matching-and-routing step follows.
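The sketch below assumes a cosine-similarity timbre match with an arbitrary threshold and a hypothetical question-answer library interface (general_qa.answer); neither is specified by the patent.

```python
# A minimal sketch of timbre matching and question-answer routing.
import numpy as np

MATCH_THRESHOLD = 0.9   # assumed similarity threshold

def route_query(timbre: np.ndarray, query: str,
                user_timbre_set: dict, user_qa: dict, general_qa) -> str:
    # Find the enrolled user whose timbre vector is most similar
    best_p, best_sim = None, -1.0
    for p, y_sb in user_timbre_set.items():
        sim = float(np.dot(timbre, y_sb) /
                    (np.linalg.norm(timbre) * np.linalg.norm(y_sb) + 1e-12))
        if sim > best_sim:
            best_p, best_sim = p, sim
    if best_sim >= MATCH_THRESHOLD:
        # Match succeeded: disambiguate the parsed query against this user's
        # historical question-answer records, then answer and log.
        history = user_qa.setdefault(best_p, [])
        answer = general_qa.answer(query, context=history)
        history.append((query, answer))
    else:
        # Match failed: answer from the general library and start a new
        # history record for the so-far-unrecognised user.
        answer = general_qa.answer(query)
        user_qa[len(user_qa) + 1] = [(query, answer)]
    return answer
```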
The intelligent voice interaction system provided by the embodiment of the invention enables the intelligent voice device to distinguish effective speech from noisy environmental sound; when effective speech is detected, other environmental interference sounds are shielded, improving speech recognition accuracy. The system can also identify the speaking user from the timbre-database comparison result and eliminate information differences via the correlation with the historical question-answer records in the user database, avoiding the poor interaction experience caused by unclear speech recognition; if no question-answer history exists, a user record is created, the natural language is parsed with NLP, and the general library is accessed for the answer.
It should be further described that, in the first embodiment of the present invention, the data acquisition module acquires the analog sound signal by using a pickup, where the pickup is an analog pickup, and is composed of a microphone and an audio amplifying circuit.
A sound pickup is a sound sensing device that converts an analog audio signal into a digital signal through a digital signal processing system and performs the corresponding digital signal processing. An analog pickup amplifies the sound collected by the microphone with an ordinary analog circuit. Pickups come in three-wire and four-wire variants: in a three-wire pickup, red is generally the power-supply positive, white the audio positive, and black the shared signal and power negative; in a four-wire pickup, red is generally the power positive and white the audio positive, with separate audio-negative and power-negative wires.
After receiving the analog sound signal sent by the data acquisition module, the data processing module processes the data. Specifically, the data processing includes the following steps: convert the analog sound signal into a digital sound signal by analog-to-digital conversion and extract features from the converted digital sound signal to obtain its characteristic parameters, namely the decibel, speed, pitch and timbre of the digital sound signal; mark the characteristic parameters, denoting the decibel of the digital sound signal as F_by, its speed as S_dy, its pitch as G_dy and its timbre as Y_sy, where y is the acquisition index, y = 1, 2, 3, ..., n, and n is the total number of acquisitions.
It should be further explained that, in the intelligent voice interaction system provided in the first embodiment of the present invention, the decibel F_by, speed S_dy, pitch G_dy and timbre Y_sy of the digital sound signal are sent to the data analysis module for data analysis.
Among the characteristic parameters of the digital sound signal, the decibel represents the loudness of the sound; the timbre represents the waveform characteristics that distinguish one sound from another and is used to tell different human voices apart; the pitch represents how high the sound frequency is; and the speed indicates the length of the intervals between utterances.
The decibel F_by, timbre Y_sy, pitch G_dy and speed S_dy of the digital sound signal are then sent to the data analysis module, which performs data analysis after receiving them. Specifically, the analysis proceeds as follows: a first determination parameter P_dy is calculated using a preset formula that combines the decibel F_by, speed S_dy and pitch G_dy with F_b0, the standard sound decibel parameter, S_d0, the standard sound speed parameter, and G_d0, the standard sound pitch parameter, where α is the sound decibel influence parameter, β is the sound speed influence parameter, γ is the sound pitch influence parameter, and φ is a preset proportionality coefficient.

From the calculated first determination parameter P_dy, its first derivative P_dy1 is obtained; a standard determination parameter P_db is set and differentiated to obtain its first derivative P_db1. The determination difference value Cz = |P_dy1 - P_db1| is calculated and compared with the preset difference threshold Cz_0: if Cz is greater than or equal to Cz_0, the digital sound signal of the collected sound does not meet the control standard, and the noise database records it; if Cz is smaller than Cz_0, the collected sound meets the control standard, and the control center filters out the digital sound signals recorded in the noise database and analyzes the timbre of the filtered digital sound signal. The user timbre parameter set Y_sbp stored in the timbre database is obtained through the data acquisition unit in the data analysis module, and the timbre Y_sy of the digital sound signal is matched against the user timbre parameters in Y_sbp: if the timbre Y_sy matches successfully, the user's natural language is parsed with NLP from the digital sound signal, the control center traverses the historical question-answer records of the user question-answer library, information differences are eliminated according to the correlation between the parsing result and the content of those records, a final language processing result is obtained, answer content is generated from it for the interaction, and the execution module executes the interaction instruction; if the timbre Y_sy fails to match, the user's natural language is parsed with NLP, the control center accesses the general question-answer library and calls its data to answer, the execution module executes the interaction instruction, and a historical question-answer record is created for the user, with the question-answer content recorded into the user question-answer library.
It should be noted that the standard sound decibel, pitch and speed parameters are the optimal decibel, pitch and speed values for the control system as a whole, while the sound decibel, pitch and speed influence parameters are the three parameter values that weight the influence of decibel, pitch and speed.
It should be further explained that, in the intelligent voice interaction system provided in the first embodiment of the present invention, the user timbre parameter set Y_sbp = {Y_sb1, Y_sb2, Y_sb3, ..., Y_sbt}, where p is the user number and t is the total number of users.
The user timbre parameter set Y_sbp is acquired as follows: the voice information of the user is recorded through the data acquisition terminal in the control center, the voice information comprising the sound decibel, sound speed and sound pitch.
The sound information is combined with a timbre mapping model to obtain and store the user timbre parameters, and all acquired user timbre parameters are integrated to form the user timbre parameter set; the timbre mapping model is trained on an artificial intelligence model.
It should be further explained that the timbre mapping model is trained on the artificial intelligence model as follows: standard training data, comprising sound information and user timbre parameters, are integrated and acquired through a server; the artificial intelligence model is then trained on the standard training data to obtain and store the timbre mapping model. The artificial intelligence model comprises a deep convolutional neural network model and an RBF neural network model.
It will be appreciated that the range of physical characteristic parameters in the standard training data should be large enough; for example, gender should include male and female, and ages should be evenly distributed over the range of 1 to 120 years.
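A minimal training sketch under these definitions follows. A small fully connected network stands in for the deep convolutional / RBF models named above, and the feature dimensions and hyperparameters are illustrative assumptions only.

```python
# A minimal sketch of training a timbre mapping model on standard training
# data (sound information -> user timbre parameters).
import torch
import torch.nn as nn

def train_timbre_mapping(sound_info: torch.Tensor,      # (N, 3) features
                         timbre_params: torch.Tensor,   # (N, 16) targets
                         epochs: int = 200) -> nn.Module:
    model = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                          nn.Linear(64, 64), nn.ReLU(),
                          nn.Linear(64, 16))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(sound_info), timbre_params)  # regression loss
        loss.backward()
        opt.step()
    return model
```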
It should be further noted that the deep convolutional neural network model is a feedforward neural network (Feedforward Neural Network) with a deep structure that includes convolution operations, and it is one of the representative algorithms of deep learning. A convolutional neural network has feature-learning (representation learning) capability and can classify input information in a translation-invariant way according to its hierarchical structure. Convolution is a linear operation in which a set of weights, arranged as a two-dimensional array called a filter, is multiplied with the input. If a filter is tuned to detect a particular feature type in the input, applying that filter repeatedly across the whole input image can reveal the feature anywhere in the image. The structure comprises an input layer: the input layer of a convolutional neural network can process multidimensional data. The input layer of a one-dimensional convolutional network receives a one-dimensional or two-dimensional array, where a one-dimensional array is usually time or spectrum samples and a two-dimensional array may include multiple channels; the input layer of a two-dimensional convolutional network receives a two-dimensional or three-dimensional array; and the input layer of a three-dimensional convolutional network receives a four-dimensional array. Because convolutional networks are widely used in computer vision, many studies assume three-dimensional input data, i.e. two-dimensional pixel grids plus RGB channels, when introducing the structure. As with other neural network algorithms, the input features of a convolutional network require normalization because learning uses gradient descent; specifically, before the learning data are fed into the network, the input data must be normalized along the channel or time/frequency dimension.
Hidden layer: the hidden layers of a convolutional neural network include three common structures, the convolutional layer, the pooling layer and the fully connected layer, and more modern algorithms may contain complex structures such as Inception modules and residual blocks. In the common architecture, the convolutional and pooling layers are specific to convolutional networks. The convolution kernels in a convolutional layer contain weight coefficients, whereas a pooling layer does not, so the literature may not count the pooling layer as a separate layer. Taking LeNet-5 as an example, the usual order in which the three common structures are built into the hidden layers is: input, convolutional layer, pooling layer, fully connected layer, output.
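The LeNet-style ordering just described (input, convolution, pooling, fully connected, output) can be sketched as a small 1-D convolutional network over audio feature frames; the channel counts, kernel sizes and class count below are illustrative assumptions, not values from the patent.

```python
# A minimal 1-D CNN sketch following the input -> conv -> pool -> fc ordering.
import torch
import torch.nn as nn

class SmallAudioCNN(nn.Module):
    def __init__(self, in_channels: int = 1, n_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 8, kernel_size=5),  # convolutional layer
            nn.ReLU(),
            nn.MaxPool1d(2),                           # pooling layer
            nn.Conv1d(8, 16, kernel_size=5),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.classifier = nn.LazyLinear(n_classes)     # fully connected layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)          # (N, 16, L')
        x = torch.flatten(x, 1)       # flatten for the linear output layer
        return self.classifier(x)
```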
The RBF (Radial Basis Function) neural network model, also called the radial basis function neural network model, is a three-layer feedforward network: the first layer is the input layer composed of signal-source nodes; the second layer is the hidden layer, whose number of hidden units is determined by the needs of the problem and whose transfer function is the non-negative nonlinear RBF; and the third layer is the output layer, a linear combination of the hidden-layer neuron outputs. The basic idea of the RBF network is to use RBFs as the basis of the hidden units to construct the hidden-layer space, so that input vectors can be mapped directly into that space without weighted connections. Once the RBF centers are determined, this mapping is determined as well. The mapping from hidden-layer space to output space is linear, i.e. the network output is a linear weighted sum of the hidden-unit outputs, where the weights are the network's adjustable parameters. The function of the hidden layer is to map vectors from a low dimension to a high dimension, so that a problem that is linearly inseparable in the low dimension can become linearly separable in the high dimension; this is essentially the idea of kernel functions. In this way the network's input-to-output mapping is nonlinear while its output is linear in the adjustable parameters, so the network weights can be solved directly from a system of linear equations, which greatly speeds up learning and avoids the problem of local minima.
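The RBF network's defining property, that the linear output weights can be solved directly from a linear system rather than by gradient descent, is easy to show in a short sketch; the choice of centres (a subset of the data) and the width sigma below are illustrative assumptions.

```python
# A minimal RBF network sketch: Gaussian hidden units + least-squares output.
import numpy as np

class RBFNet:
    def __init__(self, centres: np.ndarray, sigma: float = 1.0):
        self.centres, self.sigma = centres, sigma

    def _hidden(self, X: np.ndarray) -> np.ndarray:
        # Non-negative nonlinear RBF activations, one column per hidden unit
        d2 = ((X[:, None, :] - self.centres[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * self.sigma ** 2))

    def fit(self, X: np.ndarray, y: np.ndarray) -> "RBFNet":
        H = self._hidden(X)
        # Output is a linear combination of hidden outputs, so the weights
        # come straight from a linear least-squares solve, not iteration.
        self.w, *_ = np.linalg.lstsq(H, y, rcond=None)
        return self

    def predict(self, X: np.ndarray) -> np.ndarray:
        return self._hidden(X) @ self.w
```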
Example 2
The second embodiment of the present invention provides an intelligent voice interaction method, as shown in fig. 2, including the following steps: obtain an analog sound signal and perform analog-to-digital conversion on it to obtain a digital sound signal; extract features from the digital sound signal to obtain its characteristic parameters and mark them; calculate a first determination parameter from the marked characteristic parameters, set a standard determination parameter, take the first derivative of each, and compute the determination difference value as the absolute difference of the two first derivatives; compare the determination difference value with the set difference threshold, judging that the digital sound signal of the collected sound does not meet the control standard if the difference is at or above the threshold, in which case the noise database records the signal, and judging that it meets the control standard if the difference is below the threshold, in which case the control center filters out the digital sound signals recorded in the noise database and analyzes the timbre of the filtered digital sound signal; match the timbre of the digital sound signal against the user timbres in the user timbre parameter set stored in the timbre database: if the match succeeds, parse the user's natural language with NLP, let the control center traverse the historical question-answer records of the user question-answer library, eliminate information differences according to the correlation between the parsing result and the content of those records, obtain a final language processing result, generate answer content from it for the interaction, and let the execution module execute the interaction instruction; if the match fails, parse the user's natural language with NLP, let the control center access the general question-answer library and call its data to answer, let the execution module execute the interaction instruction, and create a historical question-answer record for the user, recording the question-answer content into the user question-answer library.
According to the intelligent voice interaction method provided by the second embodiment of the invention, the intelligent voice device can distinguish effective speech from noisy environmental sound; when effective speech is detected, other environmental interference sounds are shielded, improving speech recognition accuracy. The method can also identify the speaking user from the timbre-database comparison result and eliminate information differences via the correlation with the historical question-answer records in the user database, avoiding the poor interaction experience caused by unclear speech recognition; if no question-answer history exists, a user record is created, the natural language is parsed with NLP, and the general library is accessed for the answer. The method thus effectively recognizes whether a sound is effective speech, identifies the speaking user, keeps a history record, and optimizes the human-computer interaction experience.
In the description of the present specification, the descriptions of the terms "one embodiment," "example," "specific example," and the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present disclosure. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The above formulas are all dimensionless formulas used for numerical calculation; they were obtained by acquiring a large amount of data and performing software simulation so as to approximate the actual situation as closely as possible, and the preset parameters and preset thresholds in the formulas are set by those skilled in the art according to the actual situation or obtained by simulation over a large amount of data.
Finally, it should be noted that: the above examples are only specific embodiments of the present invention for illustrating the technical solution of the present invention, but not for limiting the scope of the present invention, and although the present invention has been described in detail with reference to the foregoing examples, it will be understood by those skilled in the art that the present invention is not limited thereto: any person skilled in the art may modify or easily conceive of the technical solution described in the foregoing embodiments, or perform equivalent substitution of some of the technical features, while remaining within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (7)

1. The intelligent voice interaction system is characterized by comprising a data acquisition module, a data processing module, a data analysis module, a data center, an execution module and a control center;
the data center comprises a timbre database, a noise database, a general question-answer library and a user question-answer library;
the data acquisition module is used for: collecting analog sound signals, and sending the collected analog sound signals to the data processing module for data processing;
the data processing module: converting the analog sound signal into a digital sound signal by using analog-to-digital conversion, and extracting features from the converted digital sound signal to obtain characteristic parameters of the digital sound signal, wherein the characteristic parameters of the digital sound signal comprise the decibel, speed, pitch and timbre of the digital sound signal, marking the characteristic parameters of the digital sound signal, and sending them to the data analysis module for analysis;
the data analysis module: calculating a first determination parameter from the decibel, speed and pitch of the digital sound signal by utilizing its characteristic parameters, setting a standard determination parameter, performing first-order derivation on both the first determination parameter and the standard determination parameter, and taking the absolute difference of the two first derivatives to obtain a determination difference value;
comparing the determination difference value with a preset difference threshold, and judging that the digital sound signal of the collected sound does not meet the control standard if the determination difference value is greater than or equal to the difference threshold, the noise database recording the digital sound signal;
if the determination difference value is smaller than the difference threshold, judging that the digital sound signal of the collected sound meets the control standard, the control center filtering out the digital sound signals recorded by the noise database and analyzing the timbre of the filtered digital sound signal;
matching the timbre of the digital sound signal against the user timbres in the user timbre parameter set stored in the timbre database:
if the match succeeds, parsing the user's natural language with NLP from the digital sound signal, the control center traversing the historical question-answer records of the user question-answer library, eliminating information differences according to the correlation between the NLP parsing result and the content of the historical question-answer records of the user question-answer library, obtaining a final language processing result, and generating answer content from the final language processing result for the interaction, the execution module executing the interaction instruction;
if the match fails, parsing the user's natural language with NLP from the digital sound signal, the control center accessing the general question-answer library and calling its data to answer, the execution module executing the interaction instruction, creating a historical question-answer record for the user in the user question-answer library, and entering the question-answer content into the user question-answer library;
the process of the data processing module for data processing comprises the following steps:
converting the analog sound signal into a digital sound signal by using analog-to-digital conversion, and extracting features from the converted digital sound signal to obtain characteristic parameters of the digital sound signal, wherein the characteristic parameters comprise the decibel, speed, pitch and timbre of the digital sound signal;
marking the characteristic parameters of the digital sound signal, denoting the decibel of the digital sound signal as F_by, the speed of the digital sound signal as S_dy, the pitch of the digital sound signal as G_dy and the timbre of the digital sound signal as Y_sy, wherein y is the acquisition index, y = 1, 2, 3, ..., n, and n is the total number of acquisitions;

sending the decibel F_by, speed S_dy, pitch G_dy and timbre Y_sy of the digital sound signal to the data analysis module for data analysis;
the process of the data analysis module for data analysis comprises the following steps:
calculating a first determination parameter P_dy using a preset formula that combines the decibel F_by, speed S_dy and pitch G_dy with F_b0, the standard sound decibel parameter, S_d0, the standard sound speed parameter, and G_d0, the standard sound pitch parameter, wherein α is the sound decibel influence parameter, β is the sound speed influence parameter, γ is the sound pitch influence parameter, and φ is a preset proportionality coefficient;

obtaining, from the calculated first determination parameter P_dy, its first derivative P_dy1, setting a standard determination parameter P_db, and performing first-order derivation on the standard determination parameter P_db to obtain its first derivative P_db1;

calculating the determination difference value as the absolute difference of the first derivative P_dy1 of the first determination parameter and the first derivative P_db1 of the standard determination parameter, Cz = |P_dy1 - P_db1|, and comparing it with the preset difference threshold Cz_0: if Cz is greater than or equal to Cz_0, the digital sound signal of the collected sound does not meet the control standard, and the noise database records the digital sound signal;

if Cz is smaller than Cz_0, the collected sound meets the control standard, and the control center filters out the digital sound signals recorded by the noise database and analyzes the timbre of the filtered digital sound signal;
acquiring a user tone color parameter set Y stored in the tone color database through a data acquisition unit in the data analysis module sbp And the tone Y of the digital sound signal is calculated sy And the tone color parameter set Y of the user sbp Matching the parameters of the user tone color parameters in the digital voice signal, if the tone color Y of the digital voice signal is sy The matching is successful, the user natural language is analyzed by utilizing NLP according to the digital sound signal, the control center traverses the history question-answer record of the user question-answer library, the information difference is eliminated according to the correlation between the result of analyzing the user natural language by utilizing NLP and the content of the history question-answer record of the user question-answer library, the final language processing result is obtained, the answer content is generated according to the final language processing result to interact, and the execution module executes the interaction instruction;
if tone Y of digital sound signal sy And if matching fails, analyzing natural language of the user by utilizing NLP according to the digital sound signal, accessing the general question-answer library by the control center, calling data of the general question-answer library to answer, executing an interactive instruction by the execution module, generating a historical question-answer record of the user question-answer library of the user, and receiving and inputting the question-answer content into the user question-answer library.
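The grant text refers to a formula for the first decision parameter P_dy that is not reproduced here, so the Python sketch below is only an assumed, illustrative form: a weighted absolute deviation of F_by, S_dy, G_dy from the standard parameters F_b0, S_d0, G_d0, weighted by α, β, γ, with the first derivative approximated by a first-order difference over successive collections y. All constants and function names are placeholders, not values from the patent.

```python
import numpy as np

# Assumed standard parameters and influence weights (placeholders,
# not values from the patent).
F_B0, S_D0, G_D0 = 60.0, 4.5, 220.0   # standard decibel, speed, tone
ALPHA, BETA, GAMMA = 0.5, 0.3, 0.2    # influence coefficients

def first_decision_parameter(f_by, s_dy, g_dy):
    """Assumed form of P_dy: weighted deviation from the standard
    sound parameters. The actual patented formula is not reproduced
    in the grant text."""
    return (ALPHA * abs(f_by - F_B0)
            + BETA * abs(s_dy - S_D0)
            + GAMMA * abs(g_dy - G_D0))

def noise_gate(features, p_db1, cz0):
    """Flag each collection y as noise (True) or in-standard speech.

    features : list of (F_by, S_dy, G_dy) tuples, y = 1..n
    p_db1    : first derivative of the standard decision parameter
    cz0      : preset difference threshold Cz_0
    """
    p = np.array([first_decision_parameter(*f) for f in features])
    # First-order difference as a discrete stand-in for the derivative.
    p_dy1 = np.diff(p, prepend=p[0])
    cz = np.abs(p_dy1 - p_db1)
    # Cz >= Cz_0 -> not within the control standard -> record as noise.
    return cz >= cz0

# Example: three collections; the middle one jumps sharply in level.
samples = [(61.0, 4.4, 218.0), (92.0, 4.4, 219.0), (60.5, 4.6, 221.0)]
print(noise_gate(samples, p_db1=0.0, cz0=5.0))  # e.g. [False  True  True]
```

The structural point the sketch preserves is that the gate compares the rate of change of P_dy against that of a standard profile, rather than P_dy itself.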
2. The intelligent voice interaction system according to claim 1, wherein the user timbre parameter set Y_sbp = {Y_sb1, Y_sb2, Y_sb3, ..., Y_sbt}, where p is the user number and t is the total number of users.
3. The intelligent voice interaction system according to claim 2, wherein the user timbre parameter set Y_sbp is acquired as follows:
recording the user's sound information through a data acquisition terminal in the control center, the sound information comprising the sound decibel level, sound speed, and sound tone;
combining the sound information with a timbre mapping model to acquire and store the user timbre parameters, and integrating all acquired user timbre parameters into the user timbre parameter set, wherein the timbre mapping model is trained on the basis of an artificial intelligence model.
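A minimal sketch of this enrollment flow, assuming hypothetical SoundInfo and TimbreMapper types; the recording terminal and storage layer are stubbed out, and the mapper simply echoes the raw features instead of running a trained model.

```python
from dataclasses import dataclass

@dataclass
class SoundInfo:
    decibel: float
    speed: float
    tone: float

class TimbreMapper:
    """Stand-in for the trained timbre mapping model of claim 4."""
    def map(self, info: SoundInfo) -> list[float]:
        # A real model would output a learned timbre embedding;
        # here we just echo the raw features as a vector.
        return [info.decibel, info.speed, info.tone]

def enroll_users(recordings: dict[int, SoundInfo],
                 mapper: TimbreMapper) -> dict[int, list[float]]:
    """Build the user timbre parameter set Y_sb = {Y_sb1, ..., Y_sbt},
    keyed by user number p."""
    return {p: mapper.map(info) for p, info in recordings.items()}

# Two enrolled users with placeholder sound information.
timbre_db = enroll_users(
    {1: SoundInfo(58.0, 4.2, 210.0), 2: SoundInfo(63.0, 5.1, 180.0)},
    TimbreMapper(),
)
```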
4. The intelligent voice interaction system according to claim 3, wherein the timbre mapping model is trained on the basis of the artificial intelligence model as follows:
integrating and acquiring standard training data through a server, the standard training data comprising sound information and user timbre parameters;
training the artificial intelligence model on the standard training data to acquire and store the timbre mapping model, wherein the artificial intelligence model comprises a deep convolutional neural network model and an RBF neural network model.
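Claim 4 only names the model families. As one hedged illustration of the RBF option, the sketch below fits a tiny Gaussian-RBF network by least squares on synthetic (sound information → timbre parameter) pairs; the data, centre count, and kernel width are all assumptions, not the patented training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic standard training data: rows of (decibel, speed, tone);
# the target is a scalar "timbre parameter" (placeholder relationship).
X = rng.uniform([40, 3, 100], [80, 6, 300], size=(200, 3))
y = 0.01 * X[:, 0] + 0.2 * X[:, 1] + 0.002 * X[:, 2]

# RBF layer: Gaussian activations around fixed centres drawn from X.
centres = X[rng.choice(len(X), 20, replace=False)]
width = X.std()  # one shared kernel width, a crude but common choice

def rbf_features(x):
    d2 = ((x[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width ** 2))

# Output weights by least squares (the classic RBF training shortcut).
Phi = rbf_features(X)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)

pred = rbf_features(X[:5]) @ w
print(np.round(pred - y[:5], 3))  # quick sanity check on the fit
```

Fixing the centres and widths and solving only for the output weights is what makes RBF training a linear least-squares problem, which is why it is often paired with heavier models such as deep CNNs.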
5. The intelligent voice interaction system according to claim 1, wherein the data acquisition module is configured to acquire the analog sound signal using a sound pickup.
6. The intelligent voice interaction system according to claim 5, wherein the sound pickup is an analog sound pickup comprising a microphone and an audio amplifier circuit.
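Claims 5 and 6 describe only the hardware pickup; the patent names no software stack. Purely as an illustration, the sketch below records from the default microphone using the third-party sounddevice library (an assumption), whose host audio interface performs the analog-to-digital conversion that yields the digital sound signal.

```python
import numpy as np
import sounddevice as sd  # third-party; pip install sounddevice

FS = 16_000  # sample rate in Hz, a common choice for speech

def capture(seconds: float = 2.0) -> np.ndarray:
    """Record from the default microphone; the audio interface's ADC
    performs the analog-to-digital conversion for us."""
    pcm = sd.rec(int(seconds * FS), samplerate=FS,
                 channels=1, dtype="int16")
    sd.wait()  # block until the recording is complete
    return pcm.ravel()

signal = capture()
# Crude level estimate from the mean absolute sample amplitude.
db = 20 * np.log10(np.abs(signal.astype(np.float64)).mean() + 1e-9)
```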
7. An intelligent voice interaction method is characterized by comprising the following steps:
obtaining an analog sound signal, and performing analog-to-digital conversion on the analog sound signal to obtain a digital sound signal;
extracting features of the digital sound signal to obtain its characteristic parameters, and marking the characteristic parameters of the digital sound signal;
calculating a first decision parameter from the marked characteristic parameters of the digital sound signal, setting a standard decision parameter, taking the first derivative of each, and calculating the absolute value of the difference between the first derivative of the first decision parameter and the first derivative of the standard decision parameter to obtain a decision difference value;
comparing the decision difference value with a preset difference threshold; if the decision difference value is greater than or equal to the difference threshold, the digital sound signal of the collected sound is judged not to meet the control standard, and a noise database records the digital sound signal;
if the decision difference value is less than the difference threshold, the digital sound signal of the collected sound is judged to meet the control standard; a control center filters out the digital sound signals recorded in the noise database and analyzes the timbre of the filtered digital sound signal;
matching the timbre of the digital sound signal against the user timbres in the user timbre parameter set stored in a timbre database:
if the matching succeeds, the user's natural language is parsed from the digital sound signal using NLP, the control center traverses the historical question-answer records of the user question-answer library, information differences are eliminated according to the correlation between the NLP parsing result and the content of those historical records to obtain a final language processing result, answer content is generated from the final language processing result for the interaction, and an execution module executes the interaction instruction;
if the matching fails, the user's natural language is parsed from the digital sound signal using NLP, the control center accesses a general question-answer library and retrieves its data to answer, the execution module executes the interaction instruction, a historical question-answer record of the user question-answer library is generated for the user, and the question-answer content is received and entered into the user question-answer library;
The process of the data processing module for data processing comprises the following steps:
converting the analog sound signal into a digital sound signal by analog-to-digital conversion, and performing feature extraction on the converted digital sound signal to obtain the characteristic parameters of the digital sound signal, wherein the characteristic parameters comprise the decibel level, speed, tone, and timbre of the digital sound signal;
marking the characteristic parameters of the digital sound signal: the decibel level of the digital sound signal is marked F_by, the speed S_dy, the tone G_dy, and the timbre Y_sy, where y is the collection index, y = 1, 2, 3, ..., n, and n is the total number of collections;
sending the decibel level F_by, the speed S_dy, the tone G_dy, and the timbre Y_sy of the digital sound signal to a data analysis module for data analysis;
the process of the data analysis module for data analysis comprises the following steps:
calculating a first decision parameter P_dy by a preset formula, wherein F_b0 is the standard sound decibel parameter, S_d0 is the standard sound speed parameter, and G_d0 is the standard sound tone parameter; α is the sound decibel influence parameter, β is the sound speed influence parameter, and γ is the sound tone influence parameter, each being a preset proportionality coefficient;
taking the first derivative of the calculated first decision parameter P_dy to obtain P_dy1, setting a standard decision parameter P_db, and taking the first derivative of P_db to obtain P_db1;
calculating the absolute value of the difference between the first derivative P_dy1 of the first decision parameter and the first derivative P_db1 of the standard decision parameter, Cz = |P_dy1 − P_db1|, and comparing Cz with a preset difference threshold Cz_0; if Cz ≥ Cz_0, the digital sound signal of the collected sound does not meet the control standard, and the noise database records the digital sound signal;
if Cz < Cz_0, the collected sound meets the control standard; the control center filters out the digital sound signals recorded in the noise database and analyzes the timbre of the filtered digital sound signal;
acquiring, through a data acquisition unit in the data analysis module, the user timbre parameter set Y_sbp stored in the timbre database, and matching the timbre Y_sy of the digital sound signal against the user timbre parameters in Y_sbp; if the timbre Y_sy of the digital sound signal matches successfully, the user's natural language is parsed from the digital sound signal using NLP, the control center traverses the historical question-answer records of the user question-answer library, information differences are eliminated according to the correlation between the NLP parsing result and the content of those historical records to obtain a final language processing result, answer content is generated from the final language processing result for the interaction, and the execution module executes the interaction instruction;
if the timbre Y_sy of the digital sound signal fails to match, the user's natural language is parsed from the digital sound signal using NLP, the control center accesses the general question-answer library and retrieves its data to answer, the execution module executes the interaction instruction, a historical question-answer record of the user question-answer library is generated for the user, and the question-answer content is received and entered into the user question-answer library (the sketch following this claim illustrates this routing).
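To make the two branches of the method concrete, here is one assumed shape for the answer-routing step, with the NLP parser, timbre matcher, and both question-answer libraries stubbed as plain Python objects; none of these interfaces are specified by the patent.

```python
def route_answer(digital_signal, timbre, timbre_db, user_qa, general_qa,
                 parse_nlp, match):
    """Route a parsed utterance to the personal or general QA path."""
    query = parse_nlp(digital_signal)     # NLP analysis of the utterance
    user_id = match(timbre, timbre_db)    # timbre match against Y_sb
    if user_id is not None:
        # Matched: reconcile the parse against this user's QA history.
        history = user_qa.get(user_id, [])
        answer = refine_with_history(query, history)
    else:
        # No match: fall back to the general question-answer library...
        answer = general_qa.get(query, "Sorry, I don't know that yet.")
        # ...and seed a personal history record for the new interaction.
        user_qa.setdefault("new_user", []).append((query, answer))
    return answer

def refine_with_history(query, history):
    # Placeholder for "eliminating information differences" via the
    # correlation between the parse and prior question-answer records.
    for q, a in reversed(history):
        if q in query or query in q:
            return a
    return f"(answer generated for: {query})"
```

The asymmetry mirrors the claims: a recognised timbre gets an answer refined against that user's own history, while an unrecognised speaker is served from the general library and a personal history is started for future turns.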
CN202310486481.0A 2023-05-04 2023-05-04 Intelligent voice interaction system and method Active CN116189681B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310486481.0A CN116189681B (en) 2023-05-04 2023-05-04 Intelligent voice interaction system and method


Publications (2)

Publication Number Publication Date
CN116189681A (en) 2023-05-30
CN116189681B (en) 2023-09-26

Family

ID=86442665

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310486481.0A Active CN116189681B (en) 2023-05-04 2023-05-04 Intelligent voice interaction system and method

Country Status (1)

Country Link
CN (1) CN116189681B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116913277B (en) * 2023-09-06 2023-11-21 Beijing Huilang Times Technology Co., Ltd. Voice interaction service system based on artificial intelligence

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20030040761A (en) * 2001-11-16 2003-05-23 Inventec Corporation System and method that randomly makes question and answer sentences for enhancing user's foreign language speaking and listening abilities
JP2003152860A (en) * 2001-11-08 2003-05-23 Nec Saitama Ltd Voice detection circuit and telephone set
CN1511312A (en) * 2001-04-13 2004-07-07 Dolby Laboratories Licensing Corporation High quality time-scaling and pitch-scaling of audio signals
CN109712628A (en) * 2019-03-15 2019-05-03 Harbin University of Science and Technology Voice denoising method and speech recognition method based on RNN
WO2019174072A1 (en) * 2018-03-12 2019-09-19 Ping An Technology (Shenzhen) Co., Ltd. Intelligent robot based training method and apparatus, computer device and storage medium
CN111400469A (en) * 2020-03-12 2020-07-10 Fayu Technology (Beijing) Co., Ltd. Intelligent generation system and method for voice question answering

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI753576B (en) * 2020-09-21 2022-01-21 Askey Computer Corp. Model constructing method for audio recognition


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Noise-robust algorithm of speech features extraction for automatic speech recognition system; A. N. Yakhnev et al.; 2016 XIX IEEE International Conference on Soft Computing and Measurements (SCM); full text *
Security Control for Multi-Time-Scale CPSs Under DoS Attacks: An Improved Dynamic Event-Triggered Mechanism; L. Ma et al.; IEEE Transactions on Network Science and Engineering; 2022-02-23; full text *
Passive sonar target feature extraction based on linear prediction cepstrum; Liu Geming, Sun Chao, Liu Bing; Applied Acoustics (No. 5); full text *
Application of improved fast independent component analysis in a speech separation system; Chen Guoliang et al.; Computer Applications; full text *
Digital speech signal processing technology and its military applications; Cai Jingping; National Defense Technology (No. 9); full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant