CN112735390B - Intelligent voice terminal equipment with voice recognition function - Google Patents

Intelligent voice terminal equipment with voice recognition function

Info

Publication number
CN112735390B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011564820.5A
Other languages
Chinese (zh)
Other versions
CN112735390A (en
Inventor
刘伟
杨志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi Taide Intelligence Technology Co Ltd
Original Assignee
Jiangxi Taide Intelligence Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangxi Taide Intelligence Technology Co Ltd
Priority to CN202011564820.5A
Publication of CN112735390A
Application granted
Publication of CN112735390B
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/225 Feedback of the input speech

Abstract

The invention discloses an intelligent voice terminal device with a voice recognition function, relates to voice terminals, and belongs to the technical field of intelligent voice recognition. The device comprises a voice acquisition module, a voice processing module, a voice storage module, a voice matching module and a voice recognition module. The voice acquisition module acquires voice information and sends it to the voice processing module; the voice processing module processes the received voice and sends the processed result to the voice storage module. When the central controller detects voice input, it controls the voice acquisition module to acquire the voice and send it to the voice processing module; the voice processing module extracts voice segments and passes them to the voice matching module for matching. If the voice segments match data in the voice storage module, the voice recognition module acquires the complete voice and recognizes it in combination with the knowledge processing module, and the recognized voice is displayed in the recognition display module.

Description

Intelligent voice terminal equipment with voice recognition function
Technical Field
The invention relates to a voice terminal, in particular to an intelligent voice terminal device with a voice recognition function, and belongs to the technical field of intelligent voice recognition.
Background
In general, an intelligent terminal is a type of embedded computer system device, and therefore, the architecture framework of the intelligent terminal is consistent with the architecture of an embedded system; meanwhile, the intelligent terminal is used as an application direction of the embedded system, and the application scene setting is clear, so that the system structure is more clear than that of a common embedded system, the granularity is finer, and the system has certain characteristics.
The architecture of an intelligent terminal is divided into a hardware structure and a software structure. In terms of hardware, an intelligent terminal generally adopts the classic computer architecture, the von Neumann structure: it is composed of five parts, namely an arithmetic unit, a controller, a memory, an input device and an output device, where the arithmetic unit and the controller form the core of the computer, the central processing unit. In the software structure of the intelligent terminal, system software mainly comprises an operating system and middleware. The operating system manages all resources (hardware and software) of the intelligent terminal and is the kernel and foundation of the intelligent terminal system.
Existing intelligent voice terminal equipment can automatically convert voice information spoken by a user into text and display it for the user to confirm. However, no large voice database is established to provide a reference for later user authentication, and no voice filtering is performed after the user confirms, so the recognized voice information may be inaccurate when other noise is present nearby.
Therefore, an intelligent voice terminal device with a voice recognition function is provided.
Disclosure of Invention
The invention aims to provide intelligent voice terminal equipment with a voice recognition function, in order to solve the following problems: existing intelligent voice terminal equipment can automatically convert voice information spoken by a user into text and display it for the user to confirm, but no large voice database is established to provide a reference for later user authentication, no voice filtering is performed after the user confirms, and the recognized voice information may be inaccurate when other noise is present nearby. The invention comprises a voice acquisition module, a voice processing module, a voice storage module, a voice matching module, a voice recognition module, a user authentication module, a central controller, a recognition display module, a voice output module and a knowledge processing module. The voice acquisition module acquires voice information and sends it to the voice processing module; the voice processing module processes the received voice and sends the processing result to the voice storage module. When the central controller detects voice input, it controls the voice acquisition module to acquire the voice and send it to the voice processing module; the voice processing module extracts voice segments and passes them to the voice matching module for matching. If the voice segments match data in the voice storage module, the voice recognition module acquires the complete voice and recognizes it in combination with the knowledge processing module, and the recognized voice is displayed in the recognition display module.
The purpose of the invention can be realized by the following technical scheme:
an intelligent voice terminal device with a voice recognition function comprises a voice acquisition module, a voice processing module, a voice storage module, a voice matching module, a voice recognition module, a user authentication module, a central controller, a recognition display module, a voice output module and a knowledge processing module; the central controller is electrically connected with the voice acquisition module, the voice acquisition module is in wireless communication connection with the voice processing module, the voice processing module is in wireless communication connection with the voice storage module, the voice processing module is in wireless communication connection with the voice matching module, the voice matching module is in wireless communication connection with the voice recognition module, and the voice recognition module and the knowledge processing module are in wireless communication connection with the voice output module and the recognition display module;
the voice acquisition module acquires voice information and sends it to the voice processing module; the voice processing module processes the received voice and sends the processed result to the voice storage module; when the central controller detects voice input, it controls the voice acquisition module to acquire the voice and send it to the voice processing module; the voice processing module extracts voice segments and passes them to the voice matching module for matching; if the voice segments match data in the voice storage module, the voice recognition module acquires the complete voice and recognizes it in combination with the knowledge processing module, and the recognized voice is displayed in the recognition display module.
Specifically, the specific way of storing data by the voice storage module includes the following processes:
the user inputs an account password through the user authentication module and then logs in the intelligent voice terminal equipment through authentication, the central controller marks the user as a storage user, generates an identity code of the storage user, and acquires a voice fragment of the storage user through the voice acquisition module;
the voice acquisition module sends the acquired voice segments of the storage users to the voice processing module, and the voice processing module normalizes the amplitude of the voice segments, corrects frequency response, divides frames, adds windows and detects the starting and ending endpoints;
acquiring the amplitude of the voice segment, the frequency of the voice segment and the overtone interval, and denoting them Ai, F and Ci respectively, where i indexes the frames of the voice segment, i = 1, 2, …, m;
and sending the Ai, the F, the Ci and the identity codes of the stored users to a voice storage module.
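The preprocessing chain named above (amplitude normalization, framing, windowing, endpoint detection) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the Hamming window, the energy-threshold endpoint detector, and all function names and default parameters are hypothetical choices, and frequency-response correction is left out because the text does not say how it is performed.

```python
import math

def normalize_amplitude(samples):
    """Scale samples so the largest absolute value is 1.0 (silence is returned unchanged)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    return [s / peak for s in samples]

def split_frames(samples, frame_len, hop):
    """Cut the signal into overlapping frames; a trailing partial frame is dropped."""
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, hop)]

def hamming(frame):
    """Apply a Hamming window to one frame (window choice is an assumption)."""
    n = len(frame)
    if n == 1:
        return list(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)))
            for k, x in enumerate(frame)]

def detect_endpoints(frames, energy_threshold):
    """Return (first, last) indices of frames whose mean energy exceeds the threshold, or None."""
    active = [i for i, f in enumerate(frames)
              if sum(x * x for x in f) / len(f) > energy_threshold]
    return (active[0], active[-1]) if active else None

def preprocess(samples, frame_len=4, hop=2, energy_threshold=0.01):
    """Normalize, frame, trim to the detected start/end frames, then window each kept frame."""
    normalized = normalize_amplitude(samples)
    frames = split_frames(normalized, frame_len, hop)
    endpoints = detect_endpoints(frames, energy_threshold)
    if endpoints is None:
        return []  # no speech-like activity found
    first, last = endpoints
    return [hamming(f) for f in frames[first:last + 1]]
```

The tiny default frame length is only for demonstration; a real device would more plausibly use frames of tens of milliseconds.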
Specifically, the voice matching module is configured to perform voice matching on a user, and a specific matching process includes the following steps:
the voice acquisition module acquires voice information of a user and sends the voice information of the user to the voice processing module, the voice processing module intercepts voice fragments with the same length, and the voice processing module normalizes the amplitude of the voice fragments, corrects frequency response, divides frames, adds windows and detects a start end point and a tail end point;
acquiring the amplitude of the voice segment, the frequency of the voice segment and the overtone interval, and denoting them Ai', F' and Ci' respectively;
the voice matching degree Pc between the user and each of the stored users is calculated using the formula
[Formula image BDA0002861529070000031: not reproduced in this text]
where a1, a2 and a3 are preset values with a1 > a2 > a3; c is the index of the stored user, c = 1, 2, …, m;
setting a voice matching degree threshold, if the voice matching degree Pc is larger than the voice matching degree threshold, the voice matching module carries out descending order arrangement on the calculated voice matching degree Pc, and the identity code of the user with the largest voice matching degree Pc is sent to the voice recognition module;
if the voice matching degree Pc is not larger than the voice matching degree threshold value, the user is represented as a new user, and the user is reminded to perform user authentication through the voice output module.
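The threshold decision described above can be sketched independently of how Pc is computed. The function below is a hypothetical helper: it assumes the matching degrees have already been calculated and are supplied as a mapping from identity code to Pc.

```python
def select_stored_user(match_degrees, threshold):
    """Pick the stored user whose matching degree Pc is largest and above the threshold.

    match_degrees: dict mapping identity code -> Pc (assumed precomputed).
    Returns the identity code to forward to the voice recognition module,
    or None when no Pc exceeds the threshold (the speaker is treated as a new user).
    """
    above = {code: pc for code, pc in match_degrees.items() if pc > threshold}
    if not above:
        return None
    # Sort in descending order of Pc, as the text describes, and take the first entry.
    ranked = sorted(above.items(), key=lambda item: item[1], reverse=True)
    return ranked[0][0]
```

A return value of None corresponds to the branch in which the voice output module reminds the user to perform user authentication.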
Specifically, after the voice recognition module receives the identity code of the user sent by the voice matching module, complete voice of the user is obtained, character recognition is carried out, and the recognized voice is converted into characters by combining the knowledge processing module and displayed in the recognition display module.
Specifically, after the voice recognition module obtains complete voice, the voice recognition module performs character recognition and sends recognized characters to the knowledge processing module, and common phrases and words of the user are stored in the knowledge processing module.
Specifically, the user authentication module is used for a new user to input personal information for registration and login, and perform user authentication when logging in next time, wherein the personal information comprises name, age and home address, the personal information of the user who successfully registers is stored in the voice storage module, and the user authentication module generates an identity code at the same time.
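A registration and login flow of this kind might look like the sketch below. The text only states that personal information is registered and that an identity code is generated; the use of a random UUID as the identity code, salted PBKDF2 password hashing, and the class and field names are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets
import uuid
from dataclasses import dataclass

@dataclass
class UserRecord:
    identity_code: str
    name: str
    age: int
    home_address: str
    salt: bytes
    password_hash: bytes

def _hash_password(password, salt):
    # Salted PBKDF2 hash; storing hashes instead of plain passwords is an assumption.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

class UserAuthentication:
    def __init__(self):
        self._users = {}  # account name -> UserRecord

    def register(self, account, password, name, age, home_address):
        """Register a new user and return the generated identity code."""
        if account in self._users:
            raise ValueError("account already registered")
        salt = secrets.token_bytes(16)
        record = UserRecord(uuid.uuid4().hex, name, age, home_address,
                            salt, _hash_password(password, salt))
        self._users[account] = record
        return record.identity_code

    def login(self, account, password):
        """Return the identity code on successful authentication, otherwise None."""
        record = self._users.get(account)
        if record is None:
            return None
        candidate = _hash_password(password, record.salt)
        if hmac.compare_digest(candidate, record.password_hash):
            return record.identity_code
        return None
```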
Specifically, the working process of the intelligent voice terminal equipment with the voice recognition function comprises the following steps:
step one: storing the user voice;
the user inputs an account password through the user authentication module and then logs in the intelligent voice terminal equipment through authentication, the central controller marks the user as a storage user and generates an identity code of the user, and a voice fragment of the storage user is obtained through the voice acquisition module; the voice acquisition module sends the acquired voice fragments of the storage user to the voice processing module, and the voice processing module normalizes the amplitude of the voice fragments, corrects frequency response, divides frames, adds windows and detects the starting and ending points;
acquiring the amplitude of the voice fragment, the frequency of the voice fragment and the overtone interval; marking the signals as Ai, F and Ci; wherein i represents the number of frames of the speech segment; sending the Ai, the F, the Ci and the identity codes of the stored users to a voice storage module;
step two: intelligent voice matching;
when a user inputs voice, the voice acquisition module acquires voice information of the user and sends the voice information of the user to the voice processing module, the voice processing module intercepts voice segments with the same length, and the voice processing module normalizes the amplitude of the voice segments, corrects frequency response, divides frames, adds windows and detects start and end points; acquiring the amplitude of the voice segment, the frequency of the voice segment and the overtone interval, and denoting them Ai', F' and Ci' respectively;
the voice matching degree Pc between the user and each stored user is calculated using the formula
[Formula image BDA0002861529070000051: not reproduced in this text]
Setting a voice matching degree threshold, if the voice matching degree Pc is larger than the voice matching degree threshold, the voice matching module carries out descending order arrangement on the calculated voice matching degree Pc, and the identity code of the user with the largest voice matching degree Pc is sent to the voice recognition module;
if the voice matching degree Pc is not larger than the voice matching degree threshold value, the user is represented as a new user, and the user is reminded to perform user authentication through the voice output module;
step three: intelligent voice recognition;
and when the voice recognition module receives the identity code of the user sent by the voice matching module, the complete voice of the user is acquired, character recognition is carried out, and the recognized voice is converted into characters by combining the knowledge processing module and displayed in the recognition display module.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention comprises a voice acquisition module, a voice processing module, a voice storage module, a voice matching module, a voice recognition module, a user authentication module, a central controller, a recognition display module, a voice output module and a knowledge processing module; the central controller is electrically connected with the voice acquisition module, the voice acquisition module is in wireless communication connection with the voice processing module, the voice processing module is in wireless communication connection with the voice storage module, the voice processing module is in wireless communication connection with the voice matching module, the voice matching module is in wireless communication connection with the voice recognition module, and the voice recognition module and the knowledge processing module are in wireless communication connection with the voice output module and the recognition display module; the voice acquisition module acquires voice information and sends it to the voice processing module; the voice processing module processes the received voice and sends the processed result to the voice storage module; when the central controller detects voice input, it controls the voice acquisition module to acquire the voice and send it to the voice processing module; the voice processing module extracts voice segments and passes them to the voice matching module for matching; if the voice segments match data in the voice storage module, the voice recognition module acquires the complete voice and recognizes it in combination with the knowledge processing module, and the recognized voice is displayed in the recognition display module.
2. The voice storage module stores the voice of authenticated users and provides a database for later voice authentication. Specifically, the user inputs an account number and a password through the user authentication module and then logs in the intelligent voice terminal equipment through authentication; the central controller marks the user as a storage user and generates an identity code of the storage user, and the voice acquisition module acquires a voice segment of the storage user; the voice acquisition module sends the acquired voice segments of the storage user to the voice processing module, and the voice processing module normalizes the amplitude of the voice segments, corrects frequency response, divides frames, adds windows and detects the start and end points; the amplitude of the voice segment, the frequency of the voice segment and the overtone interval are acquired and denoted Ai, F and Ci respectively; and Ai, F, Ci and the identity code of the storage user are sent to the voice storage module.
3. The invention is provided with a voice matching module, which performs voice matching on a user. The voice acquisition module acquires voice information of the user and sends it to the voice processing module; the voice processing module intercepts voice segments with the same length, normalizes the amplitude of the voice segments, corrects frequency response, divides frames, adds windows and detects the start and end points; the amplitude of the voice segment, the frequency of the voice segment and the overtone interval are acquired and denoted Ai', F' and Ci' respectively; the voice matching degree Pc between the user and each of the stored users is calculated using the formula
[Formula image BDA0002861529070000071: not reproduced in this text]
A voice matching degree threshold is set. If the voice matching degree Pc is larger than the threshold, the voice matching module arranges the calculated voice matching degrees Pc in descending order, and the identity code of the user with the largest voice matching degree Pc is sent to the voice recognition module; if the voice matching degree Pc is not larger than the threshold, the user is regarded as a new user and is reminded through the voice output module to perform user authentication.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a flowchart of an intelligent voice terminal device with a voice recognition function according to the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the following embodiments, and it should be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, an intelligent voice terminal device with a voice recognition function includes a voice acquisition module, a voice processing module, a voice storage module, a voice matching module, a voice recognition module, a user authentication module, a central controller, a recognition display module, a voice output module, and a knowledge processing module; the central controller is electrically connected with the voice acquisition module, the voice acquisition module is in wireless communication connection with the voice processing module, the voice processing module is in wireless communication connection with the voice storage module, the voice processing module is in wireless communication connection with the voice matching module, the voice matching module is in wireless communication connection with the voice recognition module, and the voice recognition module and the knowledge processing module are in wireless communication connection with the voice output module and the recognition display module;
the voice acquisition module acquires voice information and sends it to the voice processing module; the voice processing module processes the received voice and sends the processed result to the voice storage module; when the central controller detects voice input, it controls the voice acquisition module to acquire the voice and send it to the voice processing module; the voice processing module extracts voice segments and passes them to the voice matching module for matching; if the voice segments match data in the voice storage module, the voice recognition module acquires the complete voice and recognizes it in combination with the knowledge processing module, and the recognized voice is displayed in the recognition display module.
The specific way of storing data by the voice storage module comprises the following processes:
the user inputs an account password through the user authentication module and then logs in the intelligent voice terminal equipment through authentication, the central controller marks the user as a storage user, generates an identity code of the storage user, and acquires a voice fragment of the storage user through the voice acquisition module;
the voice acquisition module sends the acquired voice fragments of the storage user to the voice processing module, and the voice processing module normalizes the amplitude of the voice fragments, corrects frequency response, divides frames, adds windows and detects the starting and ending points;
acquiring the amplitude of the voice segment, the frequency of the voice segment and the overtone interval, and denoting them Ai, F and Ci respectively, where i indexes the frames of the voice segment, i = 1, 2, …, m;
and sending the Ai, the F, the Ci and the identity codes of the stored users to a voice storage module.
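As an illustration of what the stored features could look like, the sketch below derives a per-frame amplitude Ai and a single frequency value F from a segment. The text does not say how these quantities are measured, so peak amplitude per frame and a zero-crossing frequency estimate are assumptions; the overtone interval Ci is omitted because the text gives no definition from which it could be computed. Function names are hypothetical.

```python
def frame_amplitudes(frames):
    """A_i: peak absolute amplitude of each frame (one possible reading of 'amplitude')."""
    return [max(abs(x) for x in frame) for frame in frames]

def estimate_frequency(samples, sample_rate):
    """F: rough frequency estimate from zero crossings.

    Each full cycle of a tone produces two sign changes, so the estimate is
    crossings / (2 * duration in seconds).
    """
    crossings = sum(1 for prev, cur in zip(samples, samples[1:])
                    if (prev < 0) != (cur < 0))
    duration = len(samples) / sample_rate
    return crossings / (2 * duration)
```

Zero-crossing counting is only reliable for clean, tone-like signals; a production system would more likely use an autocorrelation or spectral pitch estimator.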
The voice matching module is used for matching voice of a user, and the specific matching process comprises the following steps:
the voice acquisition module acquires voice information of a user and sends the voice information of the user to the voice processing module, the voice processing module intercepts voice fragments with the same length, and the voice processing module normalizes the amplitude of the voice fragments, corrects frequency response, divides frames, adds windows and detects a start end point and a tail end point;
acquiring the amplitude of the voice segment, the frequency of the voice segment and the overtone interval, and denoting them Ai', F' and Ci' respectively;
the voice matching degree Pc between the user and each of the stored users is calculated using the formula
[Formula image BDA0002861529070000091: not reproduced in this text]
where a1, a2 and a3 are preset values with a1 > a2 > a3; c is the index of the stored user, c = 1, 2, …, m;
setting a voice matching degree threshold, if the voice matching degree Pc is larger than the voice matching degree threshold, the voice matching module carries out descending order arrangement on the calculated voice matching degree Pc, and the identity code of the user with the largest voice matching degree Pc is sent to the voice recognition module;
if the voice matching degree Pc is not larger than the voice matching degree threshold value, the user is represented as a new user, and the user is reminded to perform user authentication through the voice output module.
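The formula for Pc is available only as an image and is not reproduced in this text, so its exact form is unknown here. Purely as an illustrative stand-in, the sketch below combines three similarity terms with weights a1 > a2 > a3. The closeness measure, the assignment of weights to features, the default weight values and the dictionary layout are all assumptions and should not be read as the patented formula.

```python
def closeness(x, y):
    """Map the difference of two values into (0, 1]; 1.0 means identical."""
    return 1.0 / (1.0 + abs(x - y))

def mean_closeness(xs, ys):
    """Average closeness over paired frames; 0.0 when there are no pairs."""
    pairs = list(zip(xs, ys))
    if not pairs:
        return 0.0
    return sum(closeness(x, y) for x, y in pairs) / len(pairs)

def matching_degree(stored, candidate, a1=0.5, a2=0.3, a3=0.2):
    """Hypothetical Pc: weighted sum of amplitude, frequency and overtone similarity.

    stored / candidate: dicts with keys 'A' (per-frame list), 'F' (float),
    'C' (per-frame list). The weights must satisfy a1 > a2 > a3, as in the text.
    """
    if not (a1 > a2 > a3):
        raise ValueError("expected a1 > a2 > a3")
    amplitude_term = mean_closeness(stored["A"], candidate["A"])
    frequency_term = closeness(stored["F"], candidate["F"])
    overtone_term = mean_closeness(stored["C"], candidate["C"])
    return a1 * amplitude_term + a2 * frequency_term + a3 * overtone_term
```

With weights that sum to 1, identical feature sets score approximately 1.0 and the score falls as the features diverge, which is the behavior a threshold comparison needs.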
When the voice recognition module receives the identity code of the user sent by the voice matching module, complete voice of the user is obtained, character recognition is carried out, and the recognized voice is converted into characters by combining the knowledge processing module and displayed in the recognition display module.
After the voice recognition module acquires complete voice, the voice recognition module performs character recognition and sends recognized characters to the knowledge processing module, and commonly used phrases and words of a user are stored in the knowledge processing module.
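One way a store of a user's common phrases could assist recognition is to snap a recognized string to the closest stored phrase. The text does not describe how the knowledge processing module uses its stored phrases, so the sketch below is an assumption throughout: the class name, the per-user phrase store and the similarity cutoff are hypothetical, and the fuzzy matching relies on Python's standard `difflib.get_close_matches`.

```python
import difflib

class KnowledgeProcessor:
    def __init__(self):
        self._phrases = {}  # identity code -> list of that user's common phrases

    def remember(self, identity_code, phrase):
        """Store a phrase for the user, ignoring exact duplicates."""
        phrases = self._phrases.setdefault(identity_code, [])
        if phrase not in phrases:
            phrases.append(phrase)

    def refine(self, identity_code, recognized_text, cutoff=0.8):
        """Return the closest stored phrase if it is similar enough, else the input text."""
        candidates = self._phrases.get(identity_code, [])
        matches = difflib.get_close_matches(recognized_text, candidates,
                                            n=1, cutoff=cutoff)
        return matches[0] if matches else recognized_text
```

A high cutoff keeps the correction conservative, so text that does not resemble any stored phrase passes through unchanged.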
The user authentication module is used for inputting personal information to register and log in by a new user, and performing user authentication when logging in next time, wherein the personal information comprises name, age and home address, the personal information of the user who successfully registers is stored in the voice storage module, and the user authentication module simultaneously generates an identity code.
The working process of the intelligent voice terminal equipment with the voice recognition function comprises the following steps:
step one: storing the user voice;
the user inputs an account password through the user authentication module and then logs in the intelligent voice terminal equipment through authentication, the central controller marks the user as a storage user and generates an identity code of the user, and a voice fragment of the storage user is obtained through the voice acquisition module; the voice acquisition module sends the acquired voice fragments of the storage user to the voice processing module, and the voice processing module normalizes the amplitude of the voice fragments, corrects frequency response, divides frames, adds windows and detects the starting and ending points;
acquiring the amplitude of the voice fragment, the frequency of the voice fragment and the overtone interval; marking the signals as Ai, F and Ci; wherein i represents the number of frames of the speech segment; sending the Ai, the F, the Ci and the identity codes of the stored users to a voice storage module;
step two: intelligent voice matching;
when a user inputs voice, the voice acquisition module acquires voice information of the user and sends the voice information of the user to the voice processing module, the voice processing module intercepts voice segments with the same length, and the voice processing module normalizes the amplitude of the voice segments, corrects frequency response, divides frames, adds windows and detects start and end points; acquiring the amplitude of the voice segment, the frequency of the voice segment and the overtone interval, and denoting them Ai', F' and Ci' respectively;
the voice matching degree Pc between the user and each stored user is calculated using the formula
[Formula image BDA0002861529070000101: not reproduced in this text]
Setting a voice matching degree threshold, if the voice matching degree Pc is larger than the voice matching degree threshold, the voice matching module carries out descending order arrangement on the calculated voice matching degree Pc, and the identity code of the user with the largest voice matching degree Pc is sent to the voice recognition module;
if the voice matching degree Pc is not larger than the voice matching degree threshold value, the user is represented as a new user, and the user is reminded to perform user authentication through the voice output module;
step three: intelligent voice recognition;
and when the voice recognition module receives the identity code of the user sent by the voice matching module, the complete voice of the user is acquired, character recognition is carried out, and the recognized voice is converted into characters by combining the knowledge processing module and displayed in the recognition display module.
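The three steps can be tied together in a small orchestration sketch. The matching function and the recognizer are injected as parameters because the text does not specify their internals; the class name, method names and the shape of the returned dictionaries are hypothetical.

```python
class VoiceTerminal:
    def __init__(self, match_fn, recognize_fn, threshold):
        self.match_fn = match_fn          # (stored_features, features) -> Pc
        self.recognize_fn = recognize_fn  # (identity_code, full_voice) -> text
        self.threshold = threshold
        self.storage = {}                 # identity code -> stored features

    def store_user_voice(self, identity_code, features):
        """Step one: keep the features of an authenticated (storage) user."""
        self.storage[identity_code] = features

    def handle_voice(self, segment_features, full_voice):
        """Steps two and three: match the segment, then recognize the complete voice."""
        scores = {code: self.match_fn(stored, segment_features)
                  for code, stored in self.storage.items()}
        best = max(scores, key=scores.get, default=None)
        if best is None or scores[best] <= self.threshold:
            # No stored user exceeds the threshold: treat the speaker as a new user.
            return {"status": "new_user",
                    "prompt": "Please complete user authentication."}
        return {"status": "recognized",
                "identity_code": best,
                "text": self.recognize_fn(best, full_voice)}
```

Injecting the two functions keeps the control flow testable without committing to any particular feature set or recognition engine.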
When a user inputs an account password and an identity code to log in the intelligent voice terminal equipment, the voice matching module performs authentication matching. When the matching passes, the voice acquisition module acquires the complete voice and sends it to the voice recognition module; the voice recognition module removes other extraneous sounds during recognition and, in combination with the knowledge processing module, outputs in the recognition display module the voice of the user who input the account password and the identity code.
The above formulas are all computed on dimensionless numerical values. The formula was obtained by collecting a large amount of data and performing software simulation so as to approximate the real situation as closely as possible, and the preset parameters in the formula are set by those skilled in the art according to the actual situation.
The working principle of the invention is as follows: the user enters an account password through the user authentication module and, once authenticated, logs in to the intelligent voice terminal equipment; the central controller marks the user as a storage user and generates an identity code for the user, and a voice fragment of the storage user is acquired through the voice acquisition module; the voice acquisition module sends the acquired voice fragment of the storage user to the voice processing module, which normalizes the amplitude of the voice fragment, corrects its frequency response, divides it into frames, applies windowing and detects the start and end endpoints; the amplitude of the voice fragment, the frequency of the voice fragment and the overtone interval are acquired and labeled Ai, F and Ci respectively, where i is the frame index of the voice fragment; Ai, F, Ci and the identity code of the storage user are sent to the voice storage module;
when a user inputs voice, the voice acquisition module acquires the voice information of the user and sends it to the voice processing module; the voice processing module intercepts a voice fragment of the same length, normalizes its amplitude, corrects its frequency response, divides it into frames, applies windowing and detects the start and end endpoints; the amplitude of the voice fragment, the frequency of the voice fragment and the overtone interval are acquired and labeled Ai′, F′ and Ci′ respectively; the voice matching degree Pc between the user and each storage user is calculated by the formula
[Formula image BDA0002861529070000111: expression for the voice matching degree Pc; not reproduced in the text]
A voice matching degree threshold is set. If the voice matching degree Pc is larger than the threshold, the voice matching module sorts the calculated voice matching degrees Pc in descending order and sends the identity code of the storage user with the largest Pc to the voice recognition module; if the voice matching degree Pc is not larger than the threshold, the user is treated as a new user and is reminded through the voice output module to perform user authentication;
and when the voice recognition module receives the identity code of the user sent by the voice matching module, it acquires the complete voice of the user, performs text recognition, converts the recognized voice into text in combination with the knowledge processing module, and displays the text in the recognition display module.
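The preprocessing chain named in the working principle (amplitude normalization, framing, windowing, endpoint detection) can be illustrated with a minimal Python sketch. The description does not specify the frame length, hop size, window type or endpoint criterion, so the 400-sample frame, 160-sample hop, Hamming window and mean-energy threshold below are assumptions, as are all function names. Frequency-response correction is omitted because the description gives no detail about it.

```python
import math


def normalize_amplitude(samples):
    """Scale the samples so that the largest absolute value becomes 1.0."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent input: nothing to scale
    return [s / peak for s in samples]


def split_frames(samples, frame_len, hop):
    """Cut the signal into overlapping frames of frame_len samples."""
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, hop)]


def hamming(frame):
    """Apply a Hamming window (an assumed choice of window)."""
    n = len(frame)
    if n == 1:
        return list(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)))
            for k, x in enumerate(frame)]


def detect_endpoints(frames, energy_threshold):
    """Return (first, last) indices of frames whose mean energy exceeds
    the threshold, or None if no frame does."""
    active = [i for i, f in enumerate(frames)
              if sum(x * x for x in f) / len(f) > energy_threshold]
    if not active:
        return None
    return active[0], active[-1]


def preprocess(samples, frame_len=400, hop=160, energy_threshold=0.01):
    """Normalize, frame, window, then keep only the frames between the
    detected start and end endpoints."""
    normalized = normalize_amplitude(samples)
    frames = [hamming(f) for f in split_frames(normalized, frame_len, hop)]
    endpoints = detect_endpoints(frames, energy_threshold)
    if endpoints is None:
        return []
    first, last = endpoints
    return frames[first:last + 1]
```

Per-frame features such as the amplitude Ai would then be computed on the frames returned by `preprocess`; how the frequency F and the overtone interval Ci are measured is not stated in the description, so the sketch stops before that step.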
In the description herein, references to the description of "one embodiment," "an example," "a specific example" or the like are intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The preferred embodiments of the invention disclosed above are intended to be illustrative only. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best utilize the invention. The invention is limited only by the claims and their full scope and equivalents.

Claims (5)

1. An intelligent voice terminal device with a voice recognition function is characterized by comprising a voice acquisition module, a voice processing module, a voice storage module, a voice matching module, a voice recognition module, a user authentication module, a central controller, a recognition display module, a voice output module and a knowledge processing module; the central controller is electrically connected with the voice acquisition module, the voice acquisition module is in wireless communication connection with the voice processing module, the voice processing module is in wireless communication connection with the voice storage module, the voice processing module is in wireless communication connection with the voice matching module, the voice matching module is in wireless communication connection with the voice recognition module, and the voice recognition module and the knowledge processing module are in wireless communication connection with the voice output module and the recognition display module;
the voice acquisition module is used for acquiring voice information and sending the voice information to the voice processing module, the voice processing module is used for processing the received voice and sending a processing result to the voice storage module, when the central controller detects voice input, the central controller controls the voice acquisition module to acquire the voice and sends the acquired voice to the voice processing module, the voice processing module intercepts a voice fragment and passes it to the voice matching module for matching, and if the voice fragment matches data in the voice storage module, the voice recognition module acquires the complete voice and, in combination with the knowledge processing module, displays the recognized voice in the recognition display module;
the specific way of storing data by the voice storage module comprises the following processes:
the user inputs an account password through the user authentication module and then logs in the intelligent voice terminal equipment through authentication, the central controller marks the user as a storage user, generates an identity code of the storage user, and acquires a voice fragment of the storage user through the voice acquisition module;
the voice acquisition module sends the acquired voice segments of the storage users to the voice processing module, and the voice processing module normalizes the amplitude of the voice segments, corrects frequency response, divides frames, adds windows and detects the starting and ending endpoints;
acquiring the amplitude of the voice fragment, the frequency of the voice fragment and the overtone interval, and labeling them Ai, F and Ci respectively; wherein i is the frame index of the voice fragment, i = 1, 2 … m;
sending the Ai, the F, the Ci and the identity codes of the stored users to a voice storage module;
the voice matching module is used for carrying out voice matching on a user, and the specific matching process comprises the following steps:
the voice acquisition module acquires voice information of a user and sends the voice information of the user to the voice processing module, the voice processing module intercepts voice fragments with the same length, and the voice processing module normalizes the amplitude of the voice fragments, corrects frequency response, divides frames, adds windows and detects a start end point and a tail end point;
acquiring the amplitude of the voice fragment, the frequency of the voice fragment and the overtone interval, and labeling them Ai′, F′ and Ci′ respectively;
the voice matching degree Pc between the user and a plurality of stored users is calculated by using a calculation formula
[Formula image FDA0003937132890000021: expression for the voice matching degree Pc; not reproduced in the text]
wherein a1, a2 and a3 are preset values, and a1 > a2 > a3; c denotes the index of the storage user, c = 1, 2 … m;
setting a voice matching degree threshold, if the voice matching degree Pc is larger than the voice matching degree threshold, the voice matching module carries out descending order arrangement on the calculated voice matching degree Pc, and the identity code of the user with the largest voice matching degree Pc is sent to the voice recognition module;
if the voice matching degree Pc is not larger than the voice matching degree threshold value, the user is represented as a new user, and the user is reminded to perform user authentication through the voice output module.
2. The intelligent voice terminal equipment with the voice recognition function as claimed in claim 1, wherein after the voice recognition module receives the identity code of the user sent by the voice matching module, the complete voice of the user is obtained, character recognition is performed, and the recognized voice is converted into characters by combining the knowledge processing module and displayed in the recognition display module.
3. The intelligent voice terminal device with the voice recognition function as claimed in claim 1, wherein after the voice recognition module obtains complete voice, the voice recognition module performs character recognition and sends recognized characters to the knowledge processing module, and common phrases and words of the user are stored in the knowledge processing module.
4. The intelligent voice terminal device with voice recognition function as claimed in claim 1, wherein the user authentication module is used for a new user to input personal information for registration login, and to perform user authentication at the next login, wherein the personal information includes name, age and home address, and the personal information of the user who successfully registers is stored in the voice storage module, and the user authentication module generates the identity code at the same time.
5. An intelligent voice terminal device with voice recognition function according to claim 1, characterized in that the working process of the intelligent voice terminal device with voice recognition function comprises the following steps:
step one: storing the user voice;
the user inputs an account password through the user authentication module and then logs in the intelligent voice terminal equipment through authentication, the central controller marks the user as a storage user and generates an identity code of the user, and a voice fragment of the storage user is obtained through the voice acquisition module; the voice acquisition module sends the acquired voice fragments of the storage user to the voice processing module, and the voice processing module normalizes the amplitude of the voice fragments, corrects frequency response, divides frames, adds windows and detects the starting and ending points;
acquiring the amplitude of the voice fragment, the frequency of the voice fragment and the overtone interval, and labeling them Ai, F and Ci respectively; wherein i is the frame index of the voice fragment; sending Ai, F, Ci and the identity code of the storage user to the voice storage module;
step two: intelligent voice matching;
when a user inputs voice, the voice acquisition module acquires the voice information of the user and sends it to the voice processing module, the voice processing module intercepts a voice fragment of the same length, normalizes its amplitude, corrects its frequency response, divides it into frames, applies windowing and detects the start and end endpoints; acquiring the amplitude of the voice fragment, the frequency of the voice fragment and the overtone interval, and labeling them Ai′, F′ and Ci′ respectively;
the voice matching degree Pc of the user and the stored user is calculated by the calculation formula
[Formula image FDA0003937132890000041: expression for the voice matching degree Pc; not reproduced in the text]
Setting a voice matching degree threshold, if the voice matching degree Pc is larger than the voice matching degree threshold, the voice matching module carries out descending order arrangement on the calculated voice matching degree Pc, and the identity code of the user with the largest voice matching degree Pc is sent to the voice recognition module;
if the voice matching degree Pc is not larger than the voice matching degree threshold value, the user is represented as a new user, and the user is reminded to perform user authentication through the voice output module;
step three: intelligent voice recognition;
and when the voice recognition module receives the identity code of the user sent by the voice matching module, the complete voice of the user is acquired, character recognition is carried out, and the recognized voice is converted into characters by combining the knowledge processing module and displayed in the recognition display module.
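The decision logic recited in step two (threshold comparison, descending sort, selection of the largest Pc, fallback to the new-user prompt) can be sketched in Python as follows. The formula for Pc exists only as an image and is not reproduced here, so the sketch takes already-computed Pc values as input. The function name, the dictionary layout and the identity codes in the usage lines are hypothetical.

```python
def select_matching_user(match_degrees, threshold):
    """Pick the storage user whose voice matching degree Pc is largest.

    match_degrees maps each storage user's identity code to its Pc value
    (assumed to be computed elsewhere). Returns the identity code of the
    best match, or None when no Pc is larger than the threshold, which
    corresponds to treating the speaker as a new user.
    """
    # Keep only the candidates whose Pc is strictly larger than the threshold.
    candidates = [(pc, code) for code, pc in match_degrees.items()
                  if pc > threshold]
    if not candidates:
        return None
    # Sort in descending order of Pc and take the first entry.
    candidates.sort(key=lambda item: item[0], reverse=True)
    return candidates[0][1]


if __name__ == "__main__":
    scores = {"U001": 0.62, "U002": 0.91, "U003": 0.78}  # hypothetical values
    matched = select_matching_user(scores, threshold=0.8)
    if matched is None:
        print("New user: please perform user authentication")
    else:
        print("Matched identity code:", matched)
```

A `None` result stands in for the branch in which the voice output module reminds the user to perform user authentication; any other result is the identity code that would be forwarded to the voice recognition module.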
CN202011564820.5A 2020-12-25 2020-12-25 Intelligent voice terminal equipment with voice recognition function Active CN112735390B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011564820.5A CN112735390B (en) 2020-12-25 2020-12-25 Intelligent voice terminal equipment with voice recognition function

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011564820.5A CN112735390B (en) 2020-12-25 2020-12-25 Intelligent voice terminal equipment with voice recognition function

Publications (2)

Publication Number Publication Date
CN112735390A CN112735390A (en) 2021-04-30
CN112735390B true CN112735390B (en) 2023-02-28

Family

ID=75616352

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011564820.5A Active CN112735390B (en) 2020-12-25 2020-12-25 Intelligent voice terminal equipment with voice recognition function

Country Status (1)

Country Link
CN (1) CN112735390B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102737634A (en) * 2012-05-29 2012-10-17 百度在线网络技术(北京)有限公司 Authentication method and device based on voice
CN105740686A (en) * 2016-01-28 2016-07-06 百度在线网络技术(北京)有限公司 Application control method and device
CN105895096A (en) * 2016-03-30 2016-08-24 乐视控股(北京)有限公司 Identity identification and voice interaction operating method and device
CN109147773A (en) * 2017-06-16 2019-01-04 上海寒武纪信息科技有限公司 A kind of speech recognition equipment and method
CN111741369A (en) * 2020-07-10 2020-10-02 安徽芯智科技有限公司 Smart television set top box based on voice recognition

Also Published As

Publication number Publication date
CN112735390A (en) 2021-04-30

Similar Documents

Publication Publication Date Title
WO2019095586A1 (en) Meeting minutes generation method, application server, and computer readable storage medium
JP2018081297A (en) Method and device for processing voice data
US20170092276A1 (en) Voiceprint Verification Method And Device
CN105427855A (en) Voice broadcast system and voice broadcast method of intelligent software
CN112148922A (en) Conference recording method, conference recording device, data processing device and readable storage medium
CN107544272A (en) terminal control method, device and storage medium
WO2020238045A1 (en) Intelligent speech recognition method and apparatus, and computer-readable storage medium
CN103366745A (en) Method for protecting terminal equipment based on speech recognition and terminal equipment
CN106896933B (en) method and device for converting voice input into text input and voice input equipment
WO2022048319A1 (en) Switching method and apparatus for multiple user accounts, electronic device, and storage medium
CN110992955A (en) Voice operation method, device, equipment and storage medium of intelligent equipment
CN112397057A (en) Voice processing method, device, equipment and medium based on generation countermeasure network
CN111159987A (en) Data chart drawing method, device, equipment and computer readable storage medium
CN110910898B (en) Voice information processing method and device
CN110600045A (en) Sound conversion method and related product
CN112735390B (en) Intelligent voice terminal equipment with voice recognition function
CN108090044B (en) Contact information identification method and device
CN111833907B (en) Man-machine interaction method, terminal and computer readable storage medium
CN115104151A (en) Offline voice recognition method and device, electronic equipment and readable storage medium
CN108153568B (en) Information processing method and electronic equipment
CN111444377A (en) Voiceprint identification authentication method, device and equipment
CN108010518B (en) Voice acquisition method, system and storage medium of voice interaction equipment
CN109088873A (en) A kind of login system based on recognition of face big data
WO2021184211A1 (en) Risk evaluation method and apparatus, electronic device, and storage medium
CN113643706A (en) Voice recognition method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant